EAAI Journal 2026 Journal Article
A doubly reinforced local search for solving the Quadratic Multiple Knapsack Problem
- Yingsong Nie
- Lingyan Zhang
- Xiaolu Liu
- Lei He
- Tao Guan
- Yan Jin
The Quadratic Multiple Knapsack Problem (QMKP) is a computationally challenging combinatorial optimization problem with important real-world applications in manufacturing and production planning. Traditional methods often struggle to effectively balance intensification and diversification, relying heavily on expert experience to guide the search process. To address these limitations, we propose a doubly reinforced local search algorithm (denoted as DRLS) that integrates two distinct reinforcement learning methods into a multi-neighborhood tabu search framework, each activated at different stages of the search process to improve adaptive decision-making. Specifically, a multi-armed bandit mechanism is incorporated into the neighborhood selection phase to dynamically select promising neighborhood operators in a stateless environments for effective search. In addition, a Q-learning model is employed in item removal phase to state-dependent remove items, enabling the search to escape from local optima. Evaluations on 720 benchmark instances across four datasets demonstrate that DRLS consistently outperforms six state-of-the-art algorithms in both solution quality and runtime efficiency. In particular, DRLS discovers new best-known solutions for over 50% of the instances, highlighting its effectiveness. Additional experiments are presented to gain insight into the role of the reinforcement learning components.