EAAI Journal 2026 Journal Article
A doubly reinforced local search for solving the Quadratic Multiple Knapsack Problem
- Yingsong Nie
- Lingyan Zhang
- Xiaolu Liu
- Lei He
- Tao Guan
- Yan Jin
Author name cluster
Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.
EAAI Journal 2026 Journal Article
EAAI Journal 2025 Journal Article
AAAI Conference 2025 Conference Paper
This paper proposes a dual divide-and-optimize algorithm (DualOpt) for solving the large-scale traveling salesman problem (TSP). DualOpt combines two complementary strategies to improve both solution quality and computational efficiency. The first strategy is a grid-based divide-and-conquer procedure that partitions the TSP into smaller sub-problems, solving them in parallel and iteratively refining the solution by merging nodes and partial routes. The process continues until only one grid remains, yielding a high-quality initial solution. The second strategy involves a path-based divide-and-optimize procedure that further optimizes the solution by dividing it into sub-paths, optimizing each using a neural solver, and merging them back to progressively improve the overall solution. Extensive experiments conducted on two groups of TSP benchmark instances, including randomly generated instances with up to 100,000 nodes and real-world datasets from TSPLIB, demonstrate the effectiveness of DualOpt. The proposed DualOpt achieves highly competitive results compared to 10 state-of-the-art algorithms in the literature. In particular, DualOpt achieves an improvement gap up to 1.40% for the largest instance TSP100K with a remarkable 104x speed-up over the leading heuristic solver LKH3. Additionally, DualOpt demonstrates strong generalization on TSPLIB benchmarks, confirming its capability to tackle diverse real-world TSP applications.
AIJ Journal 2025 Journal Article
IJCAI Conference 2025 Conference Paper
Large Language Models (LLMs) have emerged as a promising technology for solving combinatorial optimization problems. However, their direct application to scheduling problems remains limited due to the inherent complexity of these problems. This paper proposes an LLMs-based neighborhood search method that leverages LLMs to tackle the job shop scheduling problem (JSP) and its variants. The main contributions of this work are threefold. First, we introduce a novel LLMs-guided neighborhood evaluation strategy that guides local search by dynamically adjusting operation weights. Second, we develop a verification evolution (VeEvo) framework to mitigate the hallucination effects of LLMs, enabling the generation of high-quality heuristics for weight updates. Third, we integrate this framework with the weighted neighborhood evaluation strategy to effectively guide the search towards promising regions. Extensive experiments are conducted on 349 benchmark instances across three classical scheduling problems. The results demonstrate that our algorithm significantly outperforms existing state-of-the-art methods. For JSP, our algorithm reduces the average optimality gap from 10. 46% to 1. 35% on Taillard's instances compared to reinforced adaptive staircase curriculum learning. For flexible JSP (FJSP), it reduces the gap from 13. 24% to 0. 05% on Brandimarte's instances compared to deep reinforcement learning methods. Furthermore, for FJSP with sequence dependent setup time, our algorithm updates 9 upper bounds for benchmark instances.
IJCAI Conference 2024 Conference Paper
Existing continual learning literature relies heavily on a strong assumption that tasks arrive with a balanced data stream, which is often unrealistic in real-world applications. In this work, we explore task-imbalanced continual learning (TICL) scenarios where the distribution of task data is non-uniform across the whole learning process. We find that imbalanced tasks significantly challenge the capability of models to control the trade-off between stability and plasticity from the perspective of recent prompt-based continual learning methods. On top of the above finding, we propose Dynamically Anchored Prompting (DAP), a prompt-based method that only maintains a single general prompt to adapt to the shifts within a task stream dynamically. This general prompt is regularized in the prompt space with two specifically designed prompt anchors, called boosting anchor and stabilizing anchor, to balance stability and plasticity in TICL. Remarkably, DAP achieves this balance by only storing a prompt across the data stream, therefore offering a substantial advantage in rehearsal-free CL. Extensive experiments demonstrate that the proposed DAP results in 4. 5% to 15% absolute improvements over state-of-the-art methods on benchmarks under task-imbalanced settings. Our code is available at https: //github. com/chenxing6666/DAP.
IJCAI Conference 2024 Conference Paper
The Traveling Thief Problem (TTP) is a challenging combinatorial optimization problem with broad practical applications. TTP combines two NP-hard problems: the Traveling Salesman Problem (TSP) and Knapsack Problem (KP). While a number of machine learning and deep learning based algorithms have been developed for TSP and KP, there is limited research dedicated to TTP. In this paper, we present the first reinforcement learning based multi-start neighborhood search algorithm, denoted by ReinforceNS, for solving TTP. To accelerate the search, we employ a pre-processing procedure for neighborhood reduction. A TSP routing and an iterated greedy packing are independently utilized to construct a high-quality initial solution, further improved by a reinforcement learning based neighborhood search. Additionally, a post-optimization procedure is devised for continued solution improvement. We conduct extensive experiments on 60 commonly used benchmark instances with 76 to 33810 cities in the literature. The experimental results demonstrate that our proposed ReinforceNS algorithm outperforms three state-of-the-art algorithms in terms of solution quality with the same time limit. In particular, ReinforceNS achieves 12 new results for 18 instances publicly reported in a recent TTP competition. We also perform an additional experiment to validate the effectiveness of the reinforcement learning strategy.
EAAI Journal 2024 Journal Article
AAAI Conference 2023 Conference Paper
We propose an end-to-end learning framework based on hierarchical reinforcement learning, called H-TSP, for addressing the large-scale Traveling Salesman Problem (TSP). The proposed H-TSP constructs a solution of a TSP instance starting from the scratch relying on two components: the upper-level policy chooses a small subset of nodes (up to 200 in our experiment) from all nodes that are to be traversed, while the lower-level policy takes the chosen nodes as input and outputs a tour connecting them to the existing partial route (initially only containing the depot). After jointly training the upper-level and lower-level policies, our approach can directly generate solutions for the given TSP instances without relying on any time-consuming search procedures. To demonstrate effectiveness of the proposed approach, we have conducted extensive experiments on randomly generated TSP instances with different numbers of nodes. We show that H-TSP can achieve comparable results (gap 3.42% vs. 7.32%) as SOTA search-based approaches, and more importantly, we reduce the time consumption up to two orders of magnitude (3.32s vs. 395.85s). To the best of our knowledge, H-TSP is the first end-to-end deep reinforcement learning approach that can scale to TSP instances of up to 10000 nodes. Although there are still gaps to SOTA results with respect to solution quality, we believe that H-TSP will be useful for practical applications, particularly those that are time-sensitive e.g., on-call routing and ride hailing service.
AAAI Conference 2023 Conference Paper
Traveling Salesman Problem (TSP), as a classic routing optimization problem originally arising in the domain of transportation and logistics, has become a critical task in broader domains, such as manufacturing and biology. Recently, Deep Reinforcement Learning (DRL) has been increasingly employed to solve TSP due to its high inference efficiency. Nevertheless, most of existing end-to-end DRL algorithms only perform well on small TSP instances and can hardly generalize to large scale because of the drastically soaring memory consumption and computation time along with the enlarging problem scale. In this paper, we propose a novel end-to-end DRL approach, referred to as Pointerformer, based on multi-pointer Transformer. Particularly, Pointerformer adopts both reversible residual network in the encoder and multi-pointer network in the decoder to effectively contain memory consumption of the encoder-decoder architecture. To further improve the performance of TSP solutions, Pointerformer employs a feature augmentation method to explore the symmetries of TSP at both training and inference stages as well as an enhanced context embedding approach to include more comprehensive context information in the query. Extensive experiments on a randomly generated benchmark and a public benchmark have shown that, while achieving comparative results on most small-scale TSP instances as state-of-the-art DRL approaches do, Pointerformer can also well generalize to large-scale TSPs.
IJCAI Conference 2022 Conference Paper
We address Partial MaxSAT (PMS) and Weighted PMS (WPMS), two practical generalizations of the MaxSAT problem, and propose a local search algorithm called BandMaxSAT, that applies a multi-armed bandit to guide the search direction, for these problems. The bandit in our method is associated with all the soft clauses in the input (W)PMS instance. Each arm corresponds to a soft clause. The bandit model can help BandMaxSAT to select a good direction to escape from local optima by selecting a soft clause to be satisfied in the current step, that is, selecting an arm to be pulled. We further propose an initialization method for (W)PMS that prioritizes both unit and binary clauses when producing the initial solutions. Extensive experiments demonstrate that BandMaxSAT significantly outperforms the state-of-the-art (W)PMS local search algorithm SATLike3. 0. Specifically, the number of instances in which BandMaxSAT obtains better results is about twice that obtained by SATLike3. 0. We further combine BandMaxSAT with the complete solver TT-Open-WBO-Inc. The resulting solver BandMaxSAT-c also outperforms some of the best state-of-the-art complete (W)PMS solvers, including SATLike-c, Loandra and TT-Open-WBO-Inc.
AAAI Conference 2021 Conference Paper
We address the Traveling Salesman Problem (TSP), a famous NP-hard combinatorial optimization problem. And we propose a variable strategy reinforced approach, denoted as VSR-LKH, which combines three reinforcement learning methods (Q-learning, Sarsa and Monte Carlo) with the well-known TSP algorithm, called Lin-Kernighan-Helsgaun (LKH). VSR-LKH replaces the inflexible traversal operation in LKH, and lets the program learn to make choice at each search step by reinforcement learning. Experimental results on 111 TSP benchmarks from the TSPLIB with up to 85, 900 cities demonstrate the excellent performance of the proposed method.
AAAI Conference 2020 Conference Paper
The problem of enumerating all maximal cliques in a graph is a key primitive in a variety of real-world applications such as community detection and so on. However, in practice, communities are rarely formed as cliques due to data noise. Hence, k-plex, a subgraph in which any vertex is adjacent to all but at most k vertices, is introduced as a relaxation of clique. In this paper, we investigate the problem of enumerating all maximal k-plexes and present FaPlexen, an enumeration algorithm which integrates the “pivot” heuristic and new branching schemes. To our best knowledge, for the first time, FaPlexen lists all maximal k-plexes with provably worst-case running time O(n2 γn ) in a graph with n vertices, where γ < 2. Then, we propose another algorithm CommuPlex which non-trivially extends FaPlexen to find all maximal kplexes of prescribed size for community detection in massive real-life networks. We finally carry out experiments on both real and synthetic graphs and demonstrate that our algorithms run much faster than the state-of-the-art algorithms.
AAAI Conference 2018 Short Paper
We present our preliminary work to determine if patient’s vocal acoustic, linguistic, and facial patterns could predict clinical ratings of depression severity, namely Patient Health Questionnaire depression scale (PHQ-8). We proposed a multi-modal fusion model that combines three different modalities: audio, video, and text features. By training over the AVEC2017 dataset, our proposed model outperforms each single-modality prediction model, and surpasses the dataset baseline with a nice margin.
EAAI Journal 2015 Journal Article
YNICL Journal 2015 Journal Article
YNIMG Journal 2014 Journal Article
KER Journal 1998 Journal Article
This document provides the specification of the Process Interchange Format (PIF) version 1.2. The goal of this work is to develop an interchange format to help automatically exchange process descriptions among a wide variety of business process modelling and support systems such as workflow software, flow charting tools, planners, process simulation systems and process repositories. Instead of having to write ad hoc translators for each pair of such systems each system will only need to have a single translator for converting process descriptions in that system into and out of the common PIF format. Then any system will be able to automatically exchange basic process descriptions with any other system. This document describes the PIF-CORE 1.2, i.e. the core set of object types (such as activities, agents and prerequisite relations) that can be used to describe the basic elements of any process. The document also describes a framework for extending the core set of object types to include additional information needed in specific applications. These extended descriptions are exchanged in such a way that the common elements are interpretable by any PIF translator, and the additional elements are interpretable by any translator that knows about the extensions. The PIF format was developed by a working group including representatives from several universities and companies, and has been used for experimental automatic translations among systems developed independently at three of these sites. This document is being distributed in the hopes that other groups will comment upon the interchange format proposed here, and that this format (or future versions of it) may be useful to other groups as well. The PIF Document 1.0 was released in December 1994, and the current document reports the revised PIF that incorporate the feedback received since then.