Arrow Research search

Author name cluster

Feng Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

65 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

Neural Video Compression with Reference Hierarchy

  • Chuanbo Tang
  • Zhuoyuan Li
  • Li Li
  • Dong Liu
  • Feng Wu

Efficient reference structures are essential in video compression, enabling the exploitation of temporal dependencies across frames to reduce redundancy. In this paper, we delve into the inter-frame reference management mechanism in neural video codecs (NVCs). Previous schemes have inherited the reference propagation mechanism with the guidance of predefined reference structure, but the reference modeling across diverse reference sources remains underexplored. Moreover, the mismatch between the reference structure used for motion estimation and motion compensation limits the effectiveness of inter-frame prediction. To address the above limitations, we propose the unified reference hierarchy that integrates a learned hierarchical reference structure into the existing inherent reference propagation mechanism. Specifically, we first propose the hierarchical reference structure (HRS) to manage the multiple temporal contexts in the propagated reference feature, where a hierarchy-aware reference modulation module is integrated to select the most relevant reference features across different quality levels under the guidance of the reference balance loss. In addition, we propose the HRS-guided feature-wise inter-frame prediction that learns the low-rank approximation of the selected reference feature for ensuring the consistency and improving the inter-frame prediction performance. We conduct experiments on a state-of-the-art NVC, DCVC-DC. Experimental results show that our codec achieves an average 26% bitrate saving over H.266/VVC, and a 28.2% bitrate reduction compared to DCVC-DC without increasing the decoding complexity.

AAAI Conference 2026 Conference Paper

Scalable Mixed-Integer Optimization with Neural Constraints via Dual Decomposition

  • Shuli Zeng
  • Sijia Zhang
  • Feng Wu
  • Shaojie Tang
  • Xiangyang Li

Embedding deep neural networks (NNs) into mixed-integer programs (MIPs) is attractive for decision making with learned constraints, yet state-of-the-art monolithic linearisations blow up in size and quickly become intractable. In this paper, we introduce a novel dual-decomposition framework that relaxes the single coupling equality u=x with an augmented Lagrange multiplier and splits the problem into a vanilla MIP and a constrained NN block. Each part is tackled by the solver that suits it best (branch and cut for the MIP subproblem, first-order optimisation for the NN subproblem), so the model remains modular, the number of integer variables never grows with network depth, and the per-iteration cost scales only linearly with the NN size. On the public SurrogateLIB benchmark, our method proves scalable, modular, and adaptable: it runs 120x faster than an exact Big-M formulation on the largest test case; the NN sub-solver can be swapped from a log-barrier interior step to a projected-gradient routine with no code changes; and swapping the MLP for an LSTM backbone still completes the full optimisation in 47s without any bespoke adaptation.

NeurIPS Conference 2025 Conference Paper

Accurate KV Cache Eviction via Anchor Direction Projection for Efficient LLM Inference

  • Zijie Geng
  • Jie Wang
  • Ziqi Liu
  • Feng Ju
  • Yiming Li
  • Xing Li
  • Mingxuan Yuan
  • Jianye Hao

Key-Value (KV) cache eviction, which retains the KV pairs of the most important tokens while discarding less important ones, is a critical technique for optimizing both memory usage and inference latency in large language models (LLMs). However, existing approaches often rely on simple heuristics, such as attention weights, to measure token importance, overlooking the spatial relationships between token value states in the vector space. This often leads to suboptimal token selections and thus performance degradation. To tackle this problem, we propose a novel method, namely **AnDPro** (**An**chor **D**irection **Pro**jection), which introduces a projection-based scoring function to more accurately measure token importance. Specifically, AnDPro operates in the space of value vectors and leverages the projections of these vectors onto an "Anchor Direction" (the direction of the pre-eviction output) to measure token importance and guide more accurate token selection. Experiments on 16 datasets from the LongBench benchmark demonstrate that AnDPro can maintain 96.07% of the full cache accuracy using only 3.44% KV cache budget, reducing KV cache budget size by 46.0% without compromising quality compared to previous state-of-the-art methods.
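As a rough illustration of projection-based scoring, the sketch below ranks cached tokens by the attention-weighted projection of each value vector onto the direction of the pre-eviction output. It is only a minimal sketch under stated assumptions: the function names, the single-head setting, and the exact score formula (attention weight times projection) are hypothetical and are not taken from the paper.

```python
import math


def anchor_projection_scores(attn, values):
    """Score tokens by their weighted projection onto the anchor direction.

    attn:   attention weights for one query, one weight per cached token
    values: value vectors, one list of floats per cached token
    Assumption: score_i = attn_i * <v_i, o / ||o||>, with o the pre-eviction output.
    """
    dim = len(values[0])
    # Pre-eviction output: attention-weighted sum of all value vectors.
    output = [sum(a * v[d] for a, v in zip(attn, values)) for d in range(dim)]
    norm = math.sqrt(sum(x * x for x in output))
    if norm == 0.0:
        return [0.0] * len(values)
    anchor = [x / norm for x in output]  # unit-length anchor direction
    return [
        a * sum(vd * ad for vd, ad in zip(v, anchor))
        for a, v in zip(attn, values)
    ]


def select_tokens(attn, values, budget):
    """Return the indices of the `budget` highest-scoring tokens, in position order."""
    scores = anchor_projection_scores(attn, values)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])


attn = [0.4, 0.35, 0.25]
values = [[1.0, 0.0], [-1.0, 0.0], [0.5, 0.5]]
kept = select_tokens(attn, values, budget=2)
```

In this toy input, token 1 has the second-largest attention weight but its value vector points against the output direction, so the projection score keeps tokens 0 and 2 instead, whereas ranking by attention weight alone would keep tokens 0 and 1.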

NeurIPS Conference 2025 Conference Paper

Amplifying Prominent Representations in Multimodal Learning via Variational Dirichlet Process

  • Tsai Hor Chan
  • Feng Wu
  • Yihang Chen
  • Guosheng Yin
  • Lequan Yu

Developing effective multimodal fusion approaches has become increasingly essential in many real-world scenarios, such as health care and finance. The key challenge is how to preserve the feature expressiveness in each modality while learning cross-modal interactions. Previous approaches primarily focus on the cross-modal alignment, while over-emphasis on the alignment of marginal distributions of modalities may impose excess regularization and obstruct meaningful representations within each modality. The Dirichlet process (DP) mixture model is a powerful Bayesian non-parametric method that can amplify the most prominent features by its richer-gets-richer property, which allocates increasing weights to them. Inspired by this unique characteristic of DP, we propose a new DP-driven multimodal learning framework that automatically achieves an optimal balance between prominent intra-modal representation learning and cross-modal alignment. Specifically, we assume that each modality follows a mixture of multivariate Gaussian distributions and further adopt DP to calculate the mixture weights for all the components. This paradigm allows DP to dynamically allocate the contributions of features and select the most prominent ones, leveraging its richer-gets-richer property, thus facilitating multimodal feature fusion. Extensive experiments on several multimodal datasets demonstrate the superior performance of our model over other competitors. Ablation analysis further validates the effectiveness of DP in aligning modality distributions and its robustness to changes in key hyperparameters. Code is anonymously available at https://github.com/HKU-MedAI/DPMM.git

NeurIPS Conference 2025 Conference Paper

Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms

  • Zhihai Wang
  • Zijie Geng
  • Zhaojie Tu
  • Jie Wang
  • Yuxi Qian
  • Zhexuan Xu
  • Ziyan Liu
  • Siyuan Xu

Chip placement is a critical step in the Electronic Design Automation (EDA) workflow, which aims to arrange chip modules on the canvas to optimize the performance, power, and area (PPA) metrics of final designs. Recent advances show great potential of AI-based algorithms in chip placement. However, due to the lengthy EDA workflow, evaluations of these algorithms often focus on intermediate surrogate metrics, which are computationally efficient but often misalign with the final end-to-end performance (i.e., the final design PPA). To address this challenge, we propose to build ChiPBench, a comprehensive benchmark specifically designed to evaluate the effectiveness of AI-based algorithms in final design PPA metrics. Specifically, we generate a diverse evaluation dataset from 20 circuits across various domains, such as CPUs, GPUs, and NPUs. We then evaluate six state-of-the-art AI-based chip placement algorithms on the dataset and conduct a thorough analysis of their placement behavior. Extensive experiments show that AI-based chip placement algorithms produce unsatisfactory final PPA results, highlighting the significant influence of often-overlooked factors like regularity and dataflow. We believe ChiPBench will effectively bridge the gap between academia and industry.

ICML Conference 2025 Conference Paper

Cross-Modal Alignment via Variational Copula Modelling

  • Feng Wu
  • Tsai Hor Chan
  • Fuying Wang
  • Guosheng Yin
  • Lequan Yu

Various data modalities are common in real-world applications (e.g., EHR, medical images and clinical notes in healthcare). Thus, it is essential to develop multimodal learning methods to aggregate information from multiple modalities. The main challenge is appropriately aligning and fusing the representations of different modalities into a joint distribution. Existing methods mainly rely on concatenation or the Kronecker product, oversimplifying the interaction structure between modalities and indicating a need to model more complex interactions. Additionally, the joint distribution of latent representations with higher-order interactions is underexplored. Copula is a powerful statistical structure in modelling the interactions between variables, as it bridges the joint distribution and marginal distributions of multiple variables. In this paper, we propose a novel copula modelling-driven multimodal learning framework, which focuses on learning the joint distribution of various modalities to capture the complex interaction among them. The key idea is interpreting the copula model as a tool to align the marginal distributions of the modalities efficiently. By assuming a Gaussian mixture distribution for each modality and a copula model on the joint distribution, our model can also generate accurate representations for missing modalities. Extensive experiments on public MIMIC datasets demonstrate the superior performance of our model over other competitors. The code is anonymously available at https://github.com/HKU-MedAI/CMCM.

NeurIPS Conference 2025 Conference Paper

Dynamic Configuration for Cutting Plane Separators via Reinforcement Learning on Incremental Graph

  • Mingxuan Ye
  • Jie Wang
  • Fangzhou Fangzhou
  • Zhihai Wang
  • Yufei Kuang
  • Xijun Li
  • Weilin Luo
  • Jianye Hao

Cutting planes (cuts) are essential for solving mixed-integer linear programming (MILP) problems, as they tighten the feasible solution space and accelerate the solving process. Modern MILP solvers offer diverse cutting plane separators to generate cuts, enabling users to leverage their potential complementary strengths to tackle problems with different structures. Recent machine learning approaches learn to configure separators based on problem-specific features, selecting effective separators and deactivating ineffective ones to save unnecessary computing time. However, they ignore the dynamics of separator efficacy at different stages of cut generation and struggle to adapt the configurations for the evolving problems after multiple rounds of cut generation. To address this challenge, we propose a novel dynamic separator configuration (DynSep) method that models separator configuration in different rounds as a reinforcement learning task, making decisions based on an incremental triplet graph updated by iteratively added cuts. Specifically, we tokenize the incremental subgraphs and utilize a decoder-only Transformer as our policy to autoregressively predict when to halt separation and which separators to activate at each round. Evaluated on synthetic and large-scale real-world MILP problems, DynSep speeds up average solving time by 64% on easy and medium datasets, and reduces primal-dual gap integral within the given time limit by 16% on hard datasets. Moreover, experiments demonstrate that DynSep generalizes well to MILP instances of significantly larger sizes than those seen during training.

AAAI Conference 2025 Conference Paper

FFCG: Effective and Fast Family Column Generation for Solving Large-Scale Linear Program

  • Yi-Xiang Hu
  • Feng Wu
  • Shaoang Li
  • Yifang Zhao
  • Xiang-Yang Li

Column Generation (CG) is an effective and iterative algorithm to solve large-scale linear programs (LP). During each CG iteration, new columns are added to improve the solution of the LP. Typically, CG greedily selects one column with the most negative reduced cost, which can be improved by adding more columns at once. However, selecting all columns with negative reduced costs would lead to the addition of redundant columns that do not improve the objective value. Therefore, selecting the appropriate columns to add is still an open problem and previous machine-learning-based approaches for CG only add a constant quantity of columns per iteration due to the state-space explosion problem. To address this, we propose Fast Family Column Generation (FFCG), a novel reinforcement-learning-based CG that selects a variable number of columns as needed in an iteration. Specifically, we formulate the column selection problem in CG as an MDP and design a reward metric that balances both the convergence speed and the number of redundant columns. In our experiments, FFCG converges faster on the common benchmarks and reduces the number of CG iterations by 77.1% for Cutting Stock Problem (CSP) and 84.8% for Vehicle Routing Problem with Time Windows (VRPTW), and reduces computing time by 71.4% for CSP and 84.0% for VRPTW on average compared to several state-of-the-art baselines.
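For readers unfamiliar with the two baseline selection rules the abstract contrasts, the sketch below shows the textbook pricing step for a minimization LP: the reduced cost of a candidate column is its cost minus the dual-weighted sum of its coefficients. This illustrates only the classical greedy and take-all rules, not FFCG's learned policy; all function names and the toy data are hypothetical.

```python
def reduced_cost(cost, column, duals):
    """Reduced cost c_j - y^T a_j of a candidate column (minimization LP)."""
    return cost - sum(y * a for y, a in zip(duals, column))


def most_negative_column(candidates, duals, tol=1e-9):
    """Greedy rule: index and reduced cost of the single most negative column.

    Returns (None, 0.0) when no column has a reduced cost below -tol,
    i.e. the current restricted master solution cannot be improved.
    """
    best_index, best_rc = None, 0.0
    for idx, (cost, column) in enumerate(candidates):
        rc = reduced_cost(cost, column, duals)
        if rc < -tol and rc < best_rc:
            best_index, best_rc = idx, rc
    return best_index, best_rc


def all_negative_columns(candidates, duals, tol=1e-9):
    """Take-all rule: every column with negative reduced cost (may add redundant ones)."""
    return [
        idx
        for idx, (cost, column) in enumerate(candidates)
        if reduced_cost(cost, column, duals) < -tol
    ]


# Toy data: dual values of two master constraints and three (cost, column) candidates.
duals = [2.0, 1.0]
candidates = [(3.0, [1, 0]), (1.0, [1, 1]), (0.5, [0, 1])]
greedy_pick = most_negative_column(candidates, duals)
take_all = all_negative_columns(candidates, duals)
```

Here the greedy rule adds only column 1, while the take-all rule adds columns 1 and 2; a method that chooses a variable number of columns sits between these two extremes.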

NeurIPS Conference 2025 Conference Paper

GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images

  • Xiang Lan
  • Feng Wu
  • Kai He
  • Qinghao Zhao
  • Shenda Hong
  • Mengling Feng

While recent multimodal large language models (MLLMs) have advanced automated ECG interpretation, they still face two key limitations: (1) insufficient multimodal synergy between ECG time series and ECG images, and (2) limited explainability in linking diagnoses to granular waveform evidence. We introduce GEM, the first MLLM unifying ECG time series, 12-lead ECG images and text for grounded and clinician-aligned ECG interpretation. GEM enables feature-grounded analysis, evidence-driven reasoning, and a clinician-like diagnostic process through three core innovations: a dual-encoder framework extracting complementary time series and image features, cross-modal alignment for effective multimodal understanding, and knowledge-guided instruction data generation for generating high-granularity grounding data (ECG-Grounding) linking diagnoses to measurable parameters (e.g., QRS/PR intervals). Additionally, we propose the Grounded ECG Understanding task, a clinically motivated benchmark designed to comprehensively assess the MLLM's capability in grounded ECG understanding. Experimental results on both existing and our proposed benchmarks show GEM significantly improves predictive performance (CSN 7.4% ↑), explainability (22.7% ↑), and grounding (25.3% ↑), making it a promising approach for real-world clinical applications. Codes, model, and data are available at https://github.com/lanxiang1017/GEM.

IJCAI Conference 2025 Conference Paper

Guiding Large Language Models in Modeling Optimization Problems via Question Partitioning

  • Xiaotian Pan
  • Junhao Fang
  • Feng Wu
  • Sijia Zhang
  • Yi-Xiang Hu
  • Shaoang Li
  • Xiang-Yang Li

Optimization problems are ubiquitous across various domains, such as resource scheduling, production planning, and sales management. Traditionally, they are modeled manually, leading to inefficiencies due to difficulties in communication and collaboration between modeling and domain experts. The emergence of Large Language Models (LLMs) has made automated modeling possible. However, real-world applications are often large-scale and have numerous variables and constraints, limiting the applicability of existing methods. To address this, we propose PaMOP, a novel modeling framework based on LLMs, to model optimization problems automatically, given only natural language descriptions. Specifically, we extract and partition the problems using a tree structure, guiding the LLMs to model each set of constraints with self-augmented prompts, thus reducing the demands on the LLM's capabilities of large contents. The mathematical model is then iteratively corrected and validated through our correction procedures. The experiments demonstrate that our method improves performance on the common benchmark dataset NLP4LP, achieving an accuracy of 62.3% and a code executability rate of 86.8% when tested on GPT-4. Additionally, we demonstrate the effectiveness of our PaMOP in handling large real-world problems.

NeurIPS Conference 2025 Conference Paper

High-Performance Arithmetic Circuit Optimization via Differentiable Architecture Search

  • Xilin Xia
  • Jie Wang
  • Wanbo Zhang
  • Zhihai Wang
  • Mingxuan Yuan
  • Jianye Hao
  • Feng Wu

Arithmetic circuit optimization remains a fundamental challenge in modern integrated circuit design. Recent advances have cast this problem within the Learning to Optimize (L2O) paradigm, where intelligent agents autonomously explore high-performance design spaces with encouraging results. However, existing approaches predominantly target coarse-grained architectural configurations, while the crucial interconnect optimization stage is often relegated to oversimplified proxy models or a heuristic approach. This disconnect undermines design quality, leading to suboptimal solutions in the circuit topology search space. To bridge this gap, we present **Arith-DAS**, a **D**ifferentiable **A**rchitecture **S**earch framework for **Arith**metic circuits. To the best of our knowledge, **Arith-DAS** is the first to formulate interconnect optimization within arithmetic circuits as a differentiable edge prediction problem over a multi-relational directed acyclic graph, enabling fine-grained, proxy-free optimization at the interconnection level. We evaluate **Arith-DAS** on a suite of representative arithmetic circuits, including multipliers and multiply-accumulate units. Experiments show substantial improvements over state-of-the-art L2O and conventional methods, achieving up to 27.05% gain in hypervolume of area-delay Pareto front, a standard metric for evaluating multi-objective optimization performance. Moreover, integrating our optimized arithmetic units into large-scale AI accelerators yields up to 6.59% delay reduction, demonstrating both scalability and real-world applicability.

NeurIPS Conference 2025 Conference Paper

MURKA: Multi-Reward Reinforcement Learning with Knowledge Alignment for Optimization Tasks

  • Wantong Xie
  • Yi-Xiang Hu
  • Jieyang Xu
  • Feng Wu
  • Xiangyang Li

Optimization plays a central role in Operations Research (OR) and numerous industrial applications, yet automating the end-to-end process of translating natural language descriptions into executable optimization programs remains a formidable challenge. While recent efforts have applied Large Language Models (LLMs) to this task, existing approaches are hindered by high inference costs, limited robustness across domains, and weak verification mechanisms. In this work, we propose MURKA, a reinforcement learning and knowledge distillation-based framework that enhances LLM-driven optimization modeling via collaborative agent alignment. MURKA orchestrates three specialized agents (Extractor, Solver, and Checker) to achieve accurate problem understanding, robust formulation, and verifiable execution. The Extractor is trained using group relative policy optimization with a composite reward function that incorporates semantic correctness and execution fidelity. The Solver benefits from knowledge distillation from a powerful teacher model, yielding structurally valid and executable formulations in AMPL. The Checker iteratively verifies solution correctness via solver feedback. We validate MURKA's generalizability through extensive experiments across diverse OR benchmarks, demonstrating its robustness and scalability. Experimental results on eight diverse OR benchmarks, including NLP4LP, ComplexOR, and NL4Opt, demonstrate that MURKA, built on the LLaMa3-8B backbone, achieves a 5.9% absolute improvement in solution accuracy and a 5.1% increase in execution success rate compared to leading baselines. These results establish MURKA as an effective and scalable paradigm for LLM-driven optimization, with strong potential for deployment in real-world OR applications.

AAAI Conference 2025 Conference Paper

Relaxed Class-consensus Consistency for Semi-supervised Semantic Segmentation

  • Huayu Mai
  • Rui Sun
  • Feng Wu

The key to semi-supervised semantic segmentation lies in how to fully exploit a large amount of unlabeled data to improve the model’s generalization performance. Most methods are lured into the trap of taking each class independently (i.e., class-independent consistency) and neglecting the fact that there exist semantic dependencies among classes. In this paper, we analyze the bottlenecks of class-independent consistency inherent in previous methods and offer a fresh perspective of cooperative game theory to explicitly encourage class-consensus alignment (i.e., class-consensus consistency between the teacher (weak augmented view) and student network (strong augmented view)). We formulate classes as players in a cooperative game to model their interpretable consensus and shed light on the possibility of closer collaboration between consensus themselves and consistency regularization, yielding more comprehensive and effective supervision signals. To this end, we carefully design the class-consensus consistency without introducing any external knowledge to model class structure information which renders better interpretability, and further, prepend relaxed class-consensus consistency (RCC) to unlock the potential of modeling class consensus by relaxing the strict alignment of direct class consensus values to ranking alignment. Extensive experimental results on multiple benchmarks demonstrate that RCC performs favorably against state-of-the-art methods. Particularly in the low-data regimes, RCC achieves significant improvements.

AAAI Conference 2024 Conference Paper

Electron Microscopy Images as Set of Fragments for Mitochondrial Segmentation

  • Naisong Luo
  • Rui Sun
  • Yuwen Pan
  • Tianzhu Zhang
  • Feng Wu

Automatic mitochondrial segmentation enjoys great popularity with the development of deep learning. However, the coarse prediction raised by the presence of regular 3D grids in previous methods regardless of 3D CNN or the vision transformers suggest a possibly sub-optimal feature arrangement. To mitigate this limitation, we attempt to interpret the 3D EM image stacks as a set of interrelated 3D fragments for a better solution. However, it is non-trivial to model the 3D fragments without introducing excessive computational overhead. In this paper, we design a coherent fragment vision transformer (FragViT) combined with affinity learning to manipulate features on 3D fragments yet explore mutual relationships to model fragment-wise context, enjoying locality prior without sacrificing global reception. The proposed FragViT includes a fragment encoder and a hierarchical fragment aggregation module. The fragment encoder is equipped with affinity heads to transform the tokens into fragments with homogeneous semantics, and the multi-layer self-attention is used to explicitly learn inter-fragment relations with long-range dependencies. The hierarchical fragment aggregation module is responsible for hierarchically aggregating fragment-wise prediction back to the final voxel-wise prediction in a progressive manner. Extensive experimental results on the challenging MitoEM, Lucchi, and AC3/AC4 benchmarks demonstrate the effectiveness of the proposed method.

AAAI Conference 2024 Conference Paper

Learning Multimodal Volumetric Features for Large-Scale Neuron Tracing

  • Qihua Chen
  • Xuejin Chen
  • Chenxuan Wang
  • Yixiong Liu
  • Zhiwei Xiong
  • Feng Wu

The current neuron reconstruction pipeline for electron microscopy (EM) data usually includes automatic image segmentation followed by extensive human expert proofreading. In this work, we aim to reduce human workload by predicting connectivity between over-segmented neuron pieces, taking both microscopy image and 3D morphology features into account, similar to human proofreading workflow. To this end, we first construct a dataset, named FlyTracing, that contains millions of pairwise connections of segments expanding the whole fly brain, which is three orders of magnitude larger than existing datasets for neuron segment connection. To learn sophisticated biological imaging features from the connectivity annotations, we propose a novel connectivity-aware contrastive learning method to generate dense volumetric EM image embedding. The learned embeddings can be easily incorporated with any point or voxel-based morphological representations for automatic neuron tracing. Extensive comparisons of different combination schemes of image and morphological representation in identifying split errors across the whole fly brain demonstrate the superiority of the proposed approach, especially for the locations that contain severe imaging artifacts, such as section missing and misalignment. The dataset and code are available at https://github.com/Levishery/Flywire-Neuron-Tracing.

NeurIPS Conference 2024 Conference Paper

MILP-StuDio: MILP Instance Generation via Block Structure Decomposition

  • Haoyang Liu
  • Jie Wang
  • Wanbo Zhang
  • Zijie Geng
  • Yufei Kuang
  • Xijun Li
  • Yongdong Zhang
  • Bin Li

Mixed-integer linear programming (MILP) is one of the most popular mathematical formulations with numerous applications. In practice, improving the performance of MILP solvers often requires a large amount of high-quality data, which can be challenging to collect. Researchers thus turn to generation techniques to generate additional MILP instances. However, existing approaches do not take into account specific block structures—which are closely related to the problem formulations—in the constraint coefficient matrices (CCMs) of MILPs. Consequently, they are prone to generate computationally trivial or infeasible instances due to the disruptions of block structures and thus problem formulations. To address this challenge, we propose a novel MILP generation framework, called Block Structure Decomposition (MILP-StuDio), to generate high-quality instances by preserving the block structures. Specifically, MILP-StuDio begins by identifying the blocks in CCMs and decomposing the instances into block units, which serve as the building blocks of MILP instances. We then design three operators to construct new instances by removing, substituting, and appending block units in the original instances, enabling us to generate instances with flexible sizes. An appealing feature of MILP-StuDio is its strong ability to preserve the feasibility and computational hardness of the generated instances. Experiments on the commonly-used benchmarks demonstrate that using instances generated by MILP-StuDio is able to significantly reduce over 10% of the solving time for learning-based solvers.

AAAI Conference 2024 Conference Paper

Pay Attention to Target: Relation-Aware Temporal Consistency for Domain Adaptive Video Semantic Segmentation

  • Huayu Mai
  • Rui Sun
  • Yuan Wang
  • Tianzhu Zhang
  • Feng Wu

Video semantic segmentation has achieved conspicuous achievements attributed to the development of deep learning, but suffers from labor-intensive annotated training data gathering. To alleviate the data-hunger issue, domain adaptation approaches are developed in the hope of adapting the model trained on the labeled synthetic videos to the real videos in the absence of annotations. By analyzing the dominant paradigm consistency regularization in the domain adaptation task, we find that the bottlenecks exist in previous methods from the perspective of pseudo-labels. To take full advantage of the information contained in the pseudo-labels and empower more effective supervision signals, we propose a coherent PAT network including a target domain focalizer and relation-aware temporal consistency. The proposed PAT network enjoys several merits. First, the target domain focalizer is responsible for paying attention to the target domain, and increasing the accessibility of pseudo-labels in consistency training. Second, the relation-aware temporal consistency aims at modeling the inter-class consistent relationship across frames to equip the model with effective supervision signals. Extensive experimental results on two challenging benchmarks demonstrate that our method performs favorably against state-of-the-art domain adaptive video semantic segmentation methods.

IJCAI Conference 2024 Conference Paper

Safety Constrained Multi-Agent Reinforcement Learning for Active Voltage Control

  • Yang Qu
  • Jinming Ma
  • Feng Wu

Active voltage control presents a promising avenue for relieving power congestion and enhancing voltage quality, taking advantage of the distributed controllable generators in the power network, such as roof-top photovoltaics. While Multi-Agent Reinforcement Learning (MARL) has emerged as a compelling approach to address this challenge, existing MARL approaches tend to overlook the constrained optimization nature of this problem, failing to guarantee safety constraints. In this paper, we formalize the active voltage control problem as a constrained Markov game and propose a safety-constrained MARL algorithm. We expand the primal-dual optimization RL method to multi-agent settings, and augment it with a novel approach of double safety estimation to learn the policy and to update the Lagrange multiplier. In addition, we propose different cost functions and investigate their influences on the behavior of our constrained MARL method. We evaluate our approach in the power distribution network simulation environment with real-world scale scenarios. Experimental results demonstrate the effectiveness of the proposed method compared with the state-of-the-art MARL methods.

AAAI Conference 2024 Conference Paper

Test-Time Adaptation via Style and Structure Guidance for Histological Image Registration

  • Shenglong Zhou
  • Zhiwei Xiong
  • Feng Wu

Image registration plays a crucial role in histological image analysis, encompassing tasks like multi-modality fusion and disease grading. Traditional registration methods optimize objective functions for each image pair, yielding reliable accuracy but demanding heavy inference burdens. Recently, learning-based registration methods utilize networks to learn the optimization process during training and apply a one-step forward process during testing. While these methods offer promising registration performance with reduced inference time, they remain sensitive to appearance variances and local structure changes commonly encountered in histological image registration scenarios. In this paper, for the first time, we propose a novel test-time adaptation method for histological image registration, aiming to improve the generalization ability of learning-based methods. Specifically, we design two operations, style guidance and shape guidance, for the test-time adaptation process. The former leverages style representations encoded by feature statistics to address the issue of appearance variances, while the latter incorporates shape representations encoded by HOG features to improve registration accuracy in regions with structural changes. Furthermore, we consider the continuity of the model during the test-time adaptation process. Different from the previous methods initialized by a given trained model, we introduce a smoothing strategy to leverage historical models for better generalization. We conduct experiments with several representative learning-based backbones on the public histological dataset, demonstrating the superior registration performance of our test-time adaptation method.

NeurIPS Conference 2024 Conference Paper

Towards Next-Generation Logic Synthesis: A Scalable Neural Circuit Generation Framework

  • Zhihai Wang
  • Jie Wang
  • Qingyue Yang
  • Yinqi Bai
  • Xing Li
  • Lei Chen
  • Jianye Hao
  • Mingxuan Yuan

Logic Synthesis (LS) aims to generate an optimized logic circuit satisfying a given functionality, which generally consists of circuit translation and optimization. It is a challenging and fundamental combinatorial optimization problem in integrated circuit design. Traditional LS approaches rely on manually designed heuristics to tackle the LS task, while machine learning recently offers a promising approach towards next-generation logic synthesis by neural circuit generation and optimization. In this paper, we first revisit the application of differentiable neural architecture search (DNAS) methods to circuit generation and find from extensive experiments that existing DNAS methods struggle to exactly generate circuits, scale poorly to large circuits, and exhibit high sensitivity to hyper-parameters. Then we provide three major insights for these challenges from extensive empirical analysis: 1) DNAS tends to overfit to too many skip-connections, consequently wasting a significant portion of the network's expressive capabilities; 2) DNAS suffers from the structure bias between the network architecture and the circuit inherent structure, leading to inefficient search; 3) the learning difficulty of different input-output examples varies significantly, leading to severely imbalanced learning. To address these challenges in a systematic way, we propose a novel regularized triangle-shaped circuit network generation framework, which leverages our key insights for completely accurate and scalable circuit generation. Furthermore, we propose an evolutionary algorithm assisted by reinforcement learning agent restarting technique for efficient and effective neural circuit optimization. Extensive experiments on four different circuit benchmarks demonstrate that our method can precisely generate circuits with up to 1200 nodes. Moreover, our synthesized circuits significantly outperform the state-of-the-art results from several competitive winners in IWLS 2022 and 2023 competitions.

NeurIPS Conference 2023 Conference Paper

A Deep Instance Generative Framework for MILP Solvers Under Limited Data Availability

  • Zijie Geng
  • Xijun Li
  • Jie Wang
  • Xiao Li
  • Yongdong Zhang
  • Feng Wu

In the past few years, there has been an explosive surge in the use of machine learning (ML) techniques to address combinatorial optimization (CO) problems, especially mixed-integer linear programs (MILPs). Despite the achievements, the limited availability of real-world instances often leads to sub-optimal decisions and biased solver assessments, which motivates a suite of synthetic MILP instance generation techniques. However, existing methods either rely heavily on expert-designed formulations or struggle to capture the rich features of real-world instances. To tackle this problem, we propose G2MILP, the first deep generative framework for MILP instances. Specifically, G2MILP represents MILP instances as bipartite graphs, and applies a masked variational autoencoder to iteratively corrupt and replace parts of the original graphs to generate new ones. The appealing feature of G2MILP is that it can learn to generate novel and realistic MILP instances without prior expert-designed formulations, while preserving the structures and computational hardness of real-world datasets, simultaneously. Thus the generated instances can facilitate downstream tasks for enhancing MILP solvers under limited data availability. We design a suite of benchmarks to evaluate the quality of the generated MILP instances. Experiments demonstrate that our method can produce instances that closely resemble real-world datasets in terms of both structures and computational hardness. The deliverables are released at https://miralab-ustc.github.io/L2O-G2MILP.
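As a rough illustration of representing a MILP instance as a bipartite graph, the sketch below turns constraints and variables into two node sets and each nonzero coefficient into an edge. The class name, the node features, and the `A x <= b` form are assumptions for illustration; the abstract does not specify the encoding G2MILP uses.

```python
from dataclasses import dataclass, field


@dataclass
class BipartiteMILP:
    constraint_nodes: list  # one right-hand side per constraint
    variable_nodes: list    # (objective coefficient, is_integer) per variable
    edges: list = field(default_factory=list)  # (constraint idx, variable idx, coefficient)


def milp_to_bipartite(A, b, c, is_integer):
    """Encode  min c^T x  s.t.  A x <= b  as a bipartite graph (illustrative encoding)."""
    graph = BipartiteMILP(constraint_nodes=list(b), variable_nodes=list(zip(c, is_integer)))
    for i, row in enumerate(A):
        for j, coeff in enumerate(row):
            if coeff != 0:  # only nonzero coefficients become edges
                graph.edges.append((i, j, coeff))
    return graph
```

A generative model could then mask and resample a constraint node together with its incident edges, which is the kind of local corrupt-and-replace step the abstract describes.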

IJCAI Conference 2023 Conference Paper

A Diffusion Model with Contrastive Learning for ICU False Arrhythmia Alarm Reduction

  • Feng Wu
  • Guoshuai Zhao
  • Xueming Qian
  • Li-wei H. Lehman

The high rate of false arrhythmia alarms in intensive care units (ICUs) can negatively impact patient care and lead to slow staff response time due to alarm fatigue. To reduce false alarms in ICUs, previous works proposed conventional supervised learning methods which have inherent limitations in dealing with high-dimensional, sparse, unbalanced, and limited data. We propose a deep generative approach based on the conditional denoising diffusion model to detect false arrhythmia alarms in the ICUs. Conditioning on past waveform data of a patient, our approach generates waveform predictions of the patient during an actual arrhythmia event, and uses the distance between the generated and the observed samples to classify the alarm. We design a network with residual links and self-attention mechanism to capture long-term dependencies in signal sequences, and leverage the contrastive learning mechanism to maximize distances between true and false arrhythmia alarms. We demonstrate the effectiveness of our approach on the MIMIC II arrhythmia dataset for detecting false alarms in both retrospective and real-time settings.

YNICL Journal 2023 Journal Article

Altered gray matter volumes and plasma IL-6 level in major depressive disorder patients with suicidal ideation

  • Yingrui Guo
  • Xiaowei Jiang
  • Linna Jia
  • Yue Zhu
  • Xinyu Han
  • Yifan Wu
  • Wen Liu
  • Wenhui Zhao

BACKGROUND: Suicidal ideation (SI) is one of the most serious consequences of major depressive disorder (MDD). Understanding the unique mechanism of MDD with SI (MDD + S) is crucial for treatment development. While abundant research has studied MDD, past studies have not reached a consensus on the mechanism of MDD + S. The study aimed to investigate the abnormalities of the gray matter volumes (GMVs) and plasma IL-6 level in MDD + S to further reveal the mechanism of MDD + S. METHODS: We tested the plasma IL-6 level using Luminex multifactor assays and collected the Structural Magnetic Resonance Imaging (SMRI) data from 34 healthy controls (HCs), 36 MDD patients without SI (MDD - S) and 34 MDD + S patients. We performed a partial correlation between the GMVs of the brain regions with significant differences and plasma IL-6 level with age, sex, medication, scores of HAMD-17 and HAMA as the covariates. RESULTS: Compared with HCs and MDD - S, MDD + S had significantly decreased GMVs in the left cerebellum Crus I/II and significantly increased plasma IL-6 level; compared with HCs, both the MDD + S and MDD - S had significantly decreased GMVs in right precentral and postcentral gyri. No significant correlation was found between the GMVs and the plasma IL-6 level in either the MDD + S or the MDD - S group, whereas the GMVs of the right precentral and postcentral gyri negatively correlated with the level of IL-6 in the whole MDD group (r = -0.28, P = 0.03). The GMVs of the left cerebellum Crus I/II (r = -0.47, P = 0.02), and the right precentral and postcentral gyri (r = -0.42, P = 0.04) negatively correlated with the level of IL-6 in HCs. CONCLUSION: The altered GMVs and the plasma IL-6 level may provide a scientific basis to understand the pathophysiological mechanisms of MDD + S.

IJCAI Conference 2023 Conference Paper

Appearance Prompt Vision Transformer for Connectome Reconstruction

  • Rui Sun
  • Naisong Luo
  • Yuwen Pan
  • Huayu Mai
  • Tianzhu Zhang
  • Zhiwei Xiong
  • Feng Wu

Neural connectivity reconstruction aims to understand the function of biological reconstruction and promote basic scientific research. The intricate morphology and densely intertwined branches make it an extremely challenging task. Most previous best-performing methods adopt affinity learning or metric learning. Nevertheless, they either neglect to model explicit voxel semantics caused by implicit optimization or are hysteresis to spatial information. Furthermore, the inherent locality of 3D CNNs limits modeling long-range dependencies, leading to sub-optimal results. In this work, we propose a coherent and unified Appearance Prompt Vision Transformer (APViT) to integrate affinity and metric learning to exploit the complementarity by learning long-range spatial dependencies. The proposed APViT enjoys several merits. First, the extension continuity-aware attention module aims at constructing hierarchical attention customized for neuron extensibility and slice continuity to learn instance voxel semantic context from a global perspective and utilize continuity priors to enhance voxel spatial awareness. Second, the appearance prompt modulator is responsible for leveraging voxel-adaptive appearance knowledge conditioned on affinity rich in spatial information to instruct instance voxel semantics, exploiting the potential of affinity learning to complement metric learning. Extensive experimental results on multiple challenging benchmarks demonstrate that our APViT achieves consistent improvements with huge flexibility under the same post-processing strategy.

ICRA Conference 2023 Conference Paper

Automatic Generation of Robot Facial Expressions with Preferences

  • Bing Tang
  • Rongyun Cao
  • Rongya Chen
  • Xiaoping Chen
  • Bei Hua
  • Feng Wu

The capability of humanoid robots to generate facial expressions is crucial for enhancing interactivity and emotional resonance in human-robot interaction. However, humanoid robots vary in mechanics, manufacturing, and appearance. The lack of consistent processing techniques and the complexity of generating facial expressions pose significant challenges in the field. To acquire solutions with high confidence, it is necessary to enable robots to explore the solution space automatically based on performance feedback. To this end, we designed a physical robot with a human-like appearance and developed a general framework for automatic expression generation using the MAP-Elites algorithm. The main advantage of our framework is that it not only generates facial expressions automatically but can also be customized according to user preferences. The experimental results demonstrate that our framework can efficiently generate realistic facial expressions without hard coding or prior knowledge of the robot kinematics. Moreover, it can guide the solution-generation process in accordance with user preferences, which is desirable in many real-world applications.

AAAI Conference 2023 Conference Paper

Better and Faster: Adaptive Event Conversion for Event-Based Object Detection

  • Yansong Peng
  • Yueyi Zhang
  • Peilin Xiao
  • Xiaoyan Sun
  • Feng Wu

Event cameras are a kind of bio-inspired imaging sensor, which asynchronously collect sparse event streams with many advantages. In this paper, we focus on building better and faster event-based object detectors. To this end, we first propose a computationally efficient event representation Hyper Histogram, which adequately preserves both the polarity and temporal information of events. Then we devise an Adaptive Event Conversion module, which converts events into Hyper Histograms according to event density via an adaptive queue. Moreover, we introduce a novel event-based augmentation method Shadow Mosaic, which significantly improves the event sample diversity and enhances the generalization ability of detection models. We equip our proposed modules on three representative object detection models: YOLOv5, Deformable-DETR, and RetinaNet. Experimental results on three event-based detection datasets (1Mpx, Gen1, and MVSEC-NIGHTL21) demonstrate that our proposed approach outperforms other state-of-the-art methods by a large margin, while achieving a much faster running speed (< 14 ms and < 4 ms for 50 ms event data on the 1Mpx and Gen1 datasets).

NeurIPS Conference 2023 Conference Paper

DAW: Exploring the Better Weighting Function for Semi-supervised Semantic Segmentation

  • Rui Sun
  • Huayu Mai
  • Tianzhu Zhang
  • Feng Wu

The critical challenge of semi-supervised semantic segmentation lies in how to fully exploit a large volume of unlabeled data to improve the model’s generalization performance for robust segmentation. Existing methods tend to employ certain criteria (weighting function) to select pixel-level pseudo labels. However, the trade-off exists between inaccurate yet utilized pseudo-labels, and correct yet discarded pseudo-labels in these methods when handling pseudo-labels without thoughtful consideration of the weighting function, hindering the generalization ability of the model. In this paper, we systematically analyze the trade-off in previous methods when dealing with pseudo-labels. We formally define the trade-off between inaccurate yet utilized pseudo-labels, and correct yet discarded pseudo-labels by explicitly modeling the confidence distribution of correct and inaccurate pseudo-labels, equipped with a unified weighting function. To this end, we propose Distribution-Aware Weighting (DAW) to strive to minimize the negative equivalence impact raised by the trade-off. We find an interesting fact that the optimal solution for the weighting function is a hard step function, with the jump point located at the intersection of the two confidence distributions. Besides, we devise distribution alignment to mitigate the issue of the discrepancy between the prediction distributions of labeled and unlabeled data. Extensive experimental results on multiple benchmarks including mitochondria segmentation demonstrate that DAW performs favorably against state-of-the-art methods.
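The hard-step weighting function with its jump at the intersection of two confidence distributions can be sketched as follows. This is an illustration under stated assumptions, not the DAW implementation: both distributions are modelled as equally weighted Gaussians, `mu_bad < mu_good` is assumed, and the function names and the grid search are hypothetical.

```python
import math


def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def intersection_threshold(mu_bad, sigma_bad, mu_good, sigma_good, steps=4000):
    """Grid-search the confidence in [mu_bad, mu_good] where the two densities cross."""
    best_x, best_gap = mu_bad, float("inf")
    for k in range(steps + 1):
        x = mu_bad + (mu_good - mu_bad) * k / steps
        gap = abs(gaussian_pdf(x, mu_good, sigma_good) - gaussian_pdf(x, mu_bad, sigma_bad))
        if gap < best_gap:
            best_x, best_gap = x, gap
    return best_x


def step_weight(confidence, threshold):
    """Hard step: keep a pseudo-label only if its confidence reaches the threshold."""
    return 1.0 if confidence >= threshold else 0.0
```

With equal variances the crossing point is the midpoint of the two means, so pseudo-labels above it are more likely correct than inaccurate under this toy model.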

YNICL Journal 2023 Journal Article

Gray matter volume reduction in orbitofrontal cortex correlated with plasma glial cell line-derived neurotrophic factor (GDNF) levels within major depressive disorder

  • Yifan Wu
  • Lingtao Kong
  • Anqi Yang
  • Kaiqi Xin
  • Yihui Lu
  • Xintong Yan
  • Wen Liu
  • Yue Zhu

BACKGROUND: Major depressive disorder (MDD) is a severe mental disorder characterized by reduced gray matter volume (GMV). To date, the pathogenesis of MDD remains unclear, but neurotrophic factors play an essential role in the pathophysiological alterations of MDD during disease development. In particular, plasma glial cell line-derived neurotrophic factor (GDNF) has been suggested as a potential biomarker that may be associated with disease activity and neurological progression in MDD. Our study investigated whether plasma GDNF levels in MDD patients and healthy controls (HCs) are correlated with GMV alterations. METHODS: We studied 54 MDD patients and 48 HCs. The effect of different diagnoses on whole-brain GMV was investigated using ANOVA (Analysis of Variance). The threshold of significance was p < 0.05, and Gaussian random-field (GRF) correction for error was used. All analyses were controlled for covariates such as ethnicity, handedness, age, and gender that could affect GMV. RESULT: Compared with the HC group, the GMV in the MDD group was significantly reduced in the right inferior orbitofrontal cortex (OFC), and plasma GDNF levels were significantly higher in the MDD group than in the HC group. In the right inferior OFC, the GDNF levels were positively correlated with GMV reduction in the MDD group, whereas in the HC group, a negative correlation was observed between GDNF levels and GMV reduction. CONCLUSION: Although increased production of GDNF in MDD may help repair neural damage in brain regions associated with brain disease, its repairing effects may be interfered with and hindered by underlying neuroinflammatory processes.

AAMAS Conference 2023 Conference Paper

Learning to Coordinate from Offline Datasets with Uncoordinated Behavior Policies

  • Jinming Ma
  • Feng Wu

In offline multi-agent reinforcement learning (RL), multiple agents must learn to coordinate from previously collected datasets. Like the single-agent case, we must handle the distribution shift issue from the datasets. Most importantly, we also need to deal with possible miscoordination in the datasets, collected by some uncoordinated behavior policies. To address this, we propose a novel offline multi-agent RL method using counterfactual sample-average approximation with subteam masking. Specifically, we compute the best-response policy for each agent using sample-average approximation. For the miscoordination issue, we use a counterfactual mechanism and subteam masking to reason about the agents’ contributions to the team. Based on this, each agent learns to coordinate from the uncoordinated datasets. Empirically, we evaluate our method in two benchmark domains: a continuous multi-agent MuJoCo control domain, and a challenging cooperation environment, the StarCraft II domain. Our experimental results confirm that our approach can achieve significantly better performance than several state-of-the-art methods. The source code is available at: https://github.com/JinmingM/CAST-BCQ.

AAMAS Conference 2023 Conference Paper

Less Is More: Refining Datasets for Offline Reinforcement Learning with Reward Machines

  • Haoyuan Sun
  • Feng Wu

Offline reinforcement learning (RL) aims to learn a policy from a fixed dataset, without further interactions with the environment. However, offline datasets are often very noisy, consisting of large quantities of sub-optimal or task-agnostic trajectories. Therefore, it is very challenging for offline RL to learn an optimal policy from such datasets. To address this, we use reward machines (RMs) to encode human knowledge about the task and refine datasets for offline RL. Specifically, we define the event-ordered RM to label offline datasets with RM states. Then, we further use the labeled datasets to generate refined datasets, which are smaller but better suited to offline RL. By using the RM, we can decompose a long-horizon task into easier sub-tasks, inform the agent about its current stage of task completion, and guide the offline learning process. In addition, we generate counterfactual experiences with the RM to guide the agent to complete each sub-task. Experimental results in the D4RL benchmark confirm that our method achieves better performance in long-horizon manipulation tasks with sub-optimal datasets.
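Labeling a trajectory with reward-machine states can be sketched as running a small finite-state machine over the events observed at each step. The function name, the event symbols, and the transition table below are hypothetical; the abstract does not give the definition of the event-ordered RM.

```python
def label_with_rm(events, transitions, initial_state=0):
    """Return the RM state held at each step of a trajectory, plus the final state.

    events: one event symbol (or None) per step.
    transitions: dict mapping (state, event) to the next state.
    """
    state, labels = initial_state, []
    for event in events:
        labels.append(state)
        # Stay in the current state unless the event triggers a transition.
        state = transitions.get((state, event), state)
    return labels, state
```

For example, with transitions `{(0, "open_door"): 1, (1, "pick_cup"): 2}`, a trajectory that picks the cup before opening the door never leaves state 0, which is one way such labels could help filter task-agnostic data.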

NeurIPS Conference 2023 Conference Paper

State Sequences Prediction via Fourier Transform for Representation Learning

  • Mingxuan Ye
  • Yufei Kuang
  • Jie Wang
  • Yang Rui
  • Wengang Zhou
  • Houqiang Li
  • Feng Wu

While deep reinforcement learning (RL) has been demonstrated effective in solving complex control tasks, sample efficiency remains a key challenge due to the large amounts of data required for remarkable performance. Existing research explores the application of representation learning for data-efficient RL, e.g., learning predictive representations by predicting long-term future states. However, many existing methods do not fully exploit the structural information inherent in sequential state signals, which can potentially improve the quality of long-term decision-making but is difficult to discern in the time domain. To tackle this problem, we propose State Sequences Prediction via Fourier Transform (SPF), a novel method that exploits the frequency domain of state sequences to extract the underlying patterns in time series data for learning expressive representations efficiently. Specifically, we theoretically analyze the existence of structural information in state sequences, which is closely related to policy performance and signal regularity, and then propose to predict the Fourier transform of infinite-step future state sequences to extract such information. One of the appealing features of SPF is that it is simple to implement while not requiring storage of infinite-step future states as prediction targets. Experiments demonstrate that the proposed method outperforms several state-of-the-art algorithms in terms of both sample efficiency and performance.
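One way to see why a Fourier-transform target need not store long future sequences is that a discounted discrete-time Fourier transform obeys a one-step recursion. The sketch below checks that identity on a finite scalar sequence. The discount factor, the scalar states, and both function names are assumptions for illustration, and the abstract does not state that SPF uses exactly this form.

```python
import cmath


def discounted_dtft(states, omega, gamma):
    """Direct sum  F = sum_n gamma^n * exp(-i*omega*n) * s_n  over a finite scalar sequence."""
    return sum((gamma ** n) * cmath.exp(-1j * omega * n) * s for n, s in enumerate(states))


def discounted_dtft_recursive(states, omega, gamma):
    """Same quantity via the backward recursion  F_t = s_t + gamma * exp(-i*omega) * F_{t+1}."""
    total = 0j
    for s in reversed(states):
        total = s + gamma * cmath.exp(-1j * omega) * total
    return total
```

Because each value depends only on the current state and the next step's transform, a learner could bootstrap such a target in the same way as a value function.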

NeurIPS Conference 2023 Conference Paper

VTaC: A Benchmark Dataset of Ventricular Tachycardia Alarms from ICU Monitors

  • Li-wei Lehman
  • Benjamin Moody
  • Harsh Deep
  • Feng Wu
  • Hasan Saeed
  • Lucas McCullum
  • Diane Perry
  • Tristan Struja

False arrhythmia alarms in intensive care units (ICUs) are a continuing problem despite considerable effort from industrial and academic algorithm developers. Of all life-threatening arrhythmias, ventricular tachycardia (VT) stands out as the most challenging arrhythmia to detect reliably. We introduce a new annotated VT alarm database, VTaC (Ventricular Tachycardia annotated alarms from ICUs) consisting of over 5,000 waveform recordings with VT alarms triggered by bedside monitors in the ICU. Each VT alarm waveform in the dataset has been labeled by at least two independent human expert annotators. The dataset encompasses data collected from ICUs in two major US hospitals and includes data from three leading bedside monitor manufacturers, providing a diverse and representative collection of alarm waveform data. Each waveform recording comprises at least two electrocardiogram (ECG) leads and one or more pulsatile waveforms, such as photoplethysmogram (PPG or PLETH) and arterial blood pressure (ABP) waveforms. We demonstrate the utility of this new benchmark dataset for the task of false arrhythmia alarm reduction, and present performance of multiple machine learning approaches, including conventional supervised machine learning, deep learning, semi-supervised learning, and generative approaches for the task of VT false alarm reduction.

JBHI Journal 2022 Journal Article

A Joint Constrained CCA Model for Network-Dependent Brain Subregion Parcellation

  • Qinrui Ling
  • Aiping Liu
  • Yu Li
  • Xueyang Fu
  • Xun Chen
  • Martin J. McKeown
  • Feng Wu

Connectivity-based brain region parcellation from functional magnetic resonance imaging (fMRI) data is complicated by heterogeneity among aged and diseased subjects, particularly when the data are spatially transformed to a common space. Here, we propose a group-guided functional brain region parcellation model capable of obtaining subregions from a target region with consistent connectivity profiles across multiple subjects, even when the fMRI signals are kept in their native spaces. The model is based on a joint constrained canonical correlation analysis (JC-CCA) method that achieves group-guided parcellation while allowing the data dimension of the parcellated regions for each subject to vary. We performed extensive experiments on synthetic and real data to demonstrate the superiority of the proposed model compared to other classical methods. When applied to fMRI data of subjects with and without Parkinson's disease (PD) to estimate the subregions in the Putamen, significant between-group differences were found in the derived subregions and the connectivity patterns. Superior classification and regression results were obtained, demonstrating its potential in clinical practice.

YNICL Journal 2022 Journal Article

Altered dynamic amplitude of low-frequency fluctuation between bipolar type I and type II in the depressive state

  • Wen Liu
  • Xiaowei Jiang
  • Zijing Deng
  • Linna Jia
  • Qikun Sun
  • Lingtao Kong
  • Feng Wu
  • Yanqing Tang

BACKGROUND: Bipolar disorder is a chronic and highly recurrent mental disorder that can be classified as bipolar type I (BD I) and bipolar type II (BD II). BD II is sometimes taken as a milder form of BD I or even doubted as an independent subtype. However, the fact that symptoms and severity differ in patients with BD I and BD II suggests different pathophysiologies and underlying neurobiological mechanisms. In this study, we aimed to explore the shared and unique functional abnormalities between subtypes. METHODS: The dynamic amplitude of low-frequency fluctuation (dALFF) was used to compare 31 patients with BD I, 32 with BD II, and 79 healthy controls (HCs). Global dALFF was calculated using sliding-window analysis. Group differences in dALFF among the 3 groups were compared using analysis of covariance (ANCOVA), with covariates of age, sex, years of education, and mean FD, and Bonferroni correction was applied for post hoc analysis. Pearson and Spearman's correlations were conducted between clusters with significant differences and clinical features in the BD I and BD II groups, after which false discovery rate (FDR) correction was applied. RESULTS: We found a significant decrease in dALFF values in BD patients compared with HCs in the following brain regions: the bilateral-side inferior frontal gyrus (including the triangular, orbital, and opercular parts), inferior temporal gyrus, the medial part of the superior frontal gyrus, middle frontal gyrus, anterior cingulum, insula gyrus, lingual gyrus, calcarine gyrus, precuneus gyrus, cuneus gyrus, left-side precentral gyrus, postcentral gyrus, inferior parietal gyrus, superior temporal pole gyrus, middle temporal gyrus, middle occipital gyrus, superior occipital gyrus and right-side fusiform gyrus, parahippocampal gyrus, hippocampus, middle cingulum, orbital part of the medial frontal gyrus and superior frontal gyrus. Unique alterations in BD I were observed in the right-side supramarginal gyrus and postcentral gyrus. In addition, dALFF values in BD II were significantly higher than those in BD I in the right superior temporal gyrus and middle temporal gyrus. The variables of dALFF correlated with clinical characteristics differently according to the subtypes, but no correlations survived after FDR correction. LIMITATIONS: Our study was cross-sectional. Most of our patients were on medication, and the sample was limited. CONCLUSIONS: Our findings demonstrated neurobiological characteristics of BD subtypes, providing evidence for BD II as an independent subtype, which could be the underlying explanation for the specific symptoms and/or severity and point to potential biomarkers for the differential diagnosis of bipolar subtypes.

AAMAS Conference 2022 Conference Paper

Multimodal Reinforcement Learning with Effective State Representation Learning

  • Jinming Ma
  • Yingfeng Chen
  • Feng Wu
  • Xianpeng Ji
  • Yu Ding

Many real-world applications require an agent to make robust and deliberate decisions with multimodal information (e.g., robots with multi-sensory inputs). However, it is very challenging to train the agent via reinforcement learning (RL) due to the heterogeneity and dynamic importance of different modalities. Specifically, we observe that these issues make it difficult for conventional RL methods to learn a useful state representation in end-to-end training with multimodal information. To address this, we propose a novel multimodal RL approach that performs multimodal alignment and importance enhancement according to the modalities' similarity and importance for the RL task, respectively. By doing so, we are able to learn an effective state representation and consequently improve the RL training process. We test our approach on several multimodal RL domains, showing that it outperforms state-of-the-art methods in terms of learning speed and policy quality.

AAAI Conference 2022 Conference Paper

ProgressiveMotionSeg: Mutually Reinforced Framework for Event-Based Motion Segmentation

  • Jinze Chen
  • Yang Wang
  • Yang Cao
  • Feng Wu
  • Zheng-Jun Zha

Dynamic Vision Sensor (DVS) can asynchronously output the events reflecting apparent motion of objects with microsecond resolution, and shows great application potential in monitoring and other fields. However, the output event stream of existing DVS inevitably contains background activity noise (BA noise) due to dark current and junction leakage current, which will affect the temporal correlation of objects, resulting in deteriorated motion estimation performance. Particularly, the existing filter-based denoising methods cannot be directly applied to suppress the noise in event stream, since there is no spatial correlation. To address this issue, this paper presents a novel progressive framework, in which a Motion Estimation (ME) module and an Event Denoising (ED) module are jointly optimized in a mutually reinforced manner. Specifically, based on the maximum sharpness criterion, ME module divides the input event into several segments by adaptive clustering in a motion compensating warp field, and captures the temporal correlation of event stream according to the clustered motion parameters. Taking temporal correlation as guidance, ED module calculates the confidence that each event belongs to real activity events, and transmits it to ME module to update energy function of motion segmentation for noise suppression. The two steps are iteratively updated until stable motion segmentation results are obtained. Extensive experimental results on both synthetic and real datasets demonstrate the superiority of our proposed approaches against the State-Of-The-Art (SOTA) methods.

JBHI Journal 2022 Journal Article

Recursive Decomposition Network for Deformable Image Registration

  • Bo Hu
  • Shenglong Zhou
  • Zhiwei Xiong
  • Feng Wu

Deformation decomposition serves as a good solution for deformable image registration when the deformation is large. Current deformation decomposition methods can be categorized into cascade-based methods and pyramid-based methods. However, cascade-based methods suffer from heavy computational burdens and long inference time due to their structures of repeated subnetworks, while the effectiveness of pyramid-based methods is constrained by their limited numbers of resolution levels. In this paper, to address both the insufficient and inefficient decomposition problems in current deformation decomposition methods, we propose a recursive decomposition network (RDN) to offer a novel solution for deformable image registration. Stage-wise recursion can efficiently decompose a large deformation into different pyramid estimation stages without using repeated subnetworks like in cascade-based methods. Level-wise recursion can sufficiently decompose the deformation inside each resolution level instead of only one-time estimation like in pyramid-based methods. Extensive experiments and ablation studies on two representative datasets validate the effectiveness and efficiency of our proposed RDN.

NeurIPS Conference 2022 Conference Paper

Stochastic Window Transformer for Image Restoration

  • Jie Xiao
  • Xueyang Fu
  • Feng Wu
  • Zheng-Jun Zha

Thanks to their powerful representation capabilities, transformers have made impressive progress in image restoration. However, existing transformer-based methods do not carefully consider the particularities of image restoration. In general, image restoration requires that an ideal approach should be translation-invariant to the degradation, i.e., the undesirable degradation should be removed irrespective of its position within the image. Furthermore, the local relationships also play a vital role, which should be faithfully exploited for recovering clean images. Nevertheless, most transformers either adopt local attention with the fixed local window strategy or global attention, which unfortunately breaks the translation invariance and causes huge loss of local relationships. To address these issues, we propose an elegant stochastic window strategy for transformers. Specifically, we first introduce the window partition with stochastic shift to replace the original fixed window partition for training. Then, we design a new layer expectation propagation algorithm to efficiently approximate the expectation of the induced stochastic transformer for testing. Our stochastic window transformer not only enjoys powerful representation but also maintains the desired property of translation invariance and locality. Experiments validate that the stochastic window strategy consistently improves performance on various image restoration tasks (deraining, denoising and deblurring) by significant margins. The code is available at https://github.com/jiexiaou/Stoformer.

NeurIPS Conference 2021 Conference Paper

ConE: Cone Embeddings for Multi-Hop Reasoning over Knowledge Graphs

  • Zhanqiu Zhang
  • Jie Wang
  • Jiajun Chen
  • Shuiwang Ji
  • Feng Wu

Query embedding (QE), which aims to embed entities and first-order logical (FOL) queries in low-dimensional spaces, has shown great power in multi-hop reasoning over knowledge graphs. Recently, embedding entities and queries with geometric shapes has become a promising direction, as geometric shapes can naturally represent answer sets of queries and logical relationships among them. However, existing geometry-based models have difficulty in modeling queries with negation, which significantly limits their applicability. To address this challenge, we propose a novel query embedding model, namely Cone Embeddings (ConE), which is the first geometry-based QE model that can handle all the FOL operations, including conjunction, disjunction, and negation. Specifically, ConE represents entities and queries as Cartesian products of two-dimensional cones, where the intersection and union of cones naturally model the conjunction and disjunction operations. By further noticing that the closure of the complement of a cone remains a cone, we design geometric complement operators in the embedding space for the negation operations. Experiments demonstrate that ConE significantly outperforms existing state-of-the-art methods on benchmark datasets.

JBHI Journal 2021 Journal Article

Striatal Subdivisions Estimated via Deep Embedded Clustering With Application to Parkinson's Disease

  • Yu Li
  • Aiping Liu
  • Taomian Mi
  • Runyu Yang
  • Piu Chan
  • Martin J. McKeown
  • Xun Chen
  • Feng Wu

Recent fMRI connectivity-based parcellation (CBP) methods have been developed to obtain homogeneous and functionally coherent brain parcels. However, most of these studies utilize traditional clustering methods that neglect hidden nonlinear features. To enhance parcellation performance, here we propose a deep embedded connectivity-based parcellation (DECBP) framework and apply it to determine functional subdivisions of the striatum in public resting state fMRI data sets. This framework integrates fMRI connectivity features into deep embedded clustering (DEC), a deep neural network based on a stacked autoencoder. Compared to three prevalent clustering methods and their combinations with principal component analysis (PCA), the DECBP exhibited a significantly higher similarity between scans, individuals, and groups, indicating enhanced reproducibility. The generated reliable parcellations were also largely consistent with other public atlases. We further explored the functional subunits in the striatum in a data set from 23 Parkinson's disease (PD) subjects and 27 age-matched healthy controls (HC). All putaminal subregions of PD demonstrated lower interhemispheric connectivity than those of HC, which might reflect imbalance in the pathological progression of PD. Such hypo-connectivity was also observed between putaminal subregions and other brain regions, reflecting neuroimaging manifestations of the altered cortico-striato-thalamo-cortical circuit. These observed weaker couplings were associated with PD severity and duration. Our results support the utilization of the DECBP framework and suggest that abnormal connectivity in putaminal subregions may be a potential indicator of PD.

AAAI Conference 2021 Conference Paper

Topology-Aware Correlations Between Relations for Inductive Link Prediction in Knowledge Graphs

  • Jiajun Chen
  • Huarui He
  • Feng Wu
  • Jie Wang

Inductive link prediction—where entities during training and inference stages can be different—has been shown to be promising for completing continuously evolving knowledge graphs. Existing models of inductive reasoning mainly focus on predicting missing links by learning logical rules. However, many existing approaches do not take into account semantic correlations between relations, which are commonly seen in real-world knowledge graphs. To address this challenge, we propose a novel inductive reasoning approach, namely TACT, which can effectively exploit Topology-Aware CorrelaTions between relations in an entity-independent manner. TACT is inspired by the observation that the semantic correlation between two relations is highly correlated to their topological structure in knowledge graphs. Specifically, we categorize all relation pairs into several topological patterns, and then propose a Relational Correlation Network (RCN) to learn the importance of the different patterns for inductive link prediction. Experiments demonstrate that TACT can effectively model semantic correlations between relations, and significantly outperforms existing state-of-the-art methods on benchmark datasets for the inductive link prediction task.

AAAI Conference 2021 Conference Paper

Training Spiking Neural Networks with Accumulated Spiking Flow

  • Hao Wu
  • Yueyi Zhang
  • Wenming Weng
  • Yongting Zhang
  • Zhiwei Xiong
  • Zheng-Jun Zha
  • Xiaoyan Sun
  • Feng Wu

The fast development of neuromorphic hardware has made Spiking Neural Networks (SNNs) a thrilling research avenue. Current SNNs, though highly efficient, are less effective than leading Artificial Neural Networks (ANNs), especially in supervised learning tasks. Recent efforts further demonstrate the potential of SNNs in supervised learning by introducing approximated backpropagation (BP) methods. To deal with the non-differentiable spike function in SNNs, these BP methods utilize information from the spatio-temporal domain to adjust the model parameters. As the time window and network size increase, the computational complexity of spatio-temporal backpropagation grows dramatically. In this paper, we propose a new backpropagation method for SNNs based on the accumulated spiking flow (ASF), i.e., ASF-BP. In the proposed ASF-BP method, updating parameters does not rely on the spike train of spiking neurons but leverages accumulated inputs and outputs of spiking neurons over the time window, which reduces the BP complexity significantly. We further present an adaptive linear estimation model to approach the dynamic characteristics of spiking neurons statistically. Experimental results demonstrate that with our proposed ASF-BP method, light-weight convolutional SNNs achieve superior performance compared with other spike-based BP methods on both non-neuromorphic (MNIST, CIFAR10) and neuromorphic (CIFAR10-DVS) datasets. The code is available at https://github.com/neural-lab/ASF-BP.

NeurIPS Conference 2020 Conference Paper

Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping

  • Yujing Hu
  • Weixun Wang
  • Hangtian Jia
  • Yixiang Wang
  • Yingfeng Chen
  • Jianye Hao
  • Feng Wu
  • Changjie Fan

Reward shaping is an effective technique for incorporating domain knowledge into reinforcement learning (RL). Existing approaches such as potential-based reward shaping normally make full use of a given shaping reward function. However, since the transformation of human knowledge into numeric reward values is often imperfect due to reasons such as human cognitive bias, completely utilizing the shaping reward function may fail to improve the performance of RL algorithms. In this paper, we consider the problem of adaptively utilizing a given shaping reward function. We formulate the utilization of shaping rewards as a bi-level optimization problem, where the lower level is to optimize policy using the shaping rewards and the upper level is to optimize a parameterized shaping weight function for true reward maximization. We formally derive the gradient of the expected true reward with respect to the shaping weight function parameters and accordingly propose three learning algorithms based on different assumptions. Experiments in sparse-reward cartpole and MuJoCo environments show that our algorithms can fully exploit beneficial shaping rewards, and meanwhile ignore unbeneficial shaping rewards or even transform them into beneficial ones.
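To make the distinction in the abstract above concrete, the sketch below contrasts a shaping reward used in full with one scaled by a weight. It is only an illustration: the additive form `true_reward + weight * shaping_reward`, the scalar weight, and the function names are assumptions for this example, and the bi-level optimization that would learn the weight is not shown. The potential-based term follows the standard definition F(s, s') = gamma * phi(s') - phi(s).

```python
def potential_based_shaping(phi, state, next_state, gamma=0.99):
    """Standard potential-based shaping term: gamma * phi(s') - phi(s)."""
    return gamma * phi(next_state) - phi(state)


def shaped_reward(true_reward, shaping_reward, weight):
    """Combine the true reward with a weighted shaping reward.

    weight = 1 uses the shaping reward in full, weight = 0 ignores it,
    and a negative weight flips its sign (hypothetical scalar weight; in
    the abstract the weight is a learned, parameterized function).
    """
    return true_reward + weight * shaping_reward
```

For example, with a potential equal to the negative distance to a goal at position 10, a step from 3 to 4 with `gamma=1.0` yields a shaping term of 1.0, which the weight can then keep, damp, or invert.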

IJCAI Conference 2020 Conference Paper

Monte-Carlo Tree Search for Scalable Coalition Formation

  • Feng Wu
  • Sarvapali D. Ramchurn

We propose a novel algorithm based on Monte-Carlo tree search for the problem of coalition structure generation (CSG). Specifically, we find the optimal solution by sampling the coalition structure graph and incrementally expanding a search tree, which represents the partial space that has been searched. We prove that our algorithm is complete and converges to the optimal solution given a sufficient number of iterations. Moreover, it is anytime and can scale to large CSG problems with many agents. Experimental results on six common CSG benchmark problems and a disaster response domain confirm the advantages of our approach compared to the state-of-the-art methods.

AAAI Conference 2019 Conference Paper

A Two-Stream Mutual Attention Network for Semi-Supervised Biomedical Segmentation with Noisy Labels

  • Shaobo Min
  • Xuejin Chen
  • Zheng-Jun Zha
  • Feng Wu
  • Yongdong Zhang

Learning-based methods suffer from a deficiency of clean annotations, especially in biomedical segmentation. Although many semi-supervised methods have been proposed to provide extra training data, automatically generated labels are usually too noisy to retrain models effectively. In this paper, we propose a Two-Stream Mutual Attention Network (TSMAN) that weakens the influence of back-propagated gradients caused by incorrect labels, thereby rendering the network robust to unclean data. The proposed TSMAN consists of two sub-networks that are connected by three types of attention models in different layers. The target of each attention model is to indicate potentially incorrect gradients in a certain layer for both sub-networks by analyzing their inferred features using the same input. To this end, the attention models are designed based on the propagation analysis of noisy gradients at different layers. This allows the attention models to effectively discover incorrect labels and weaken their influence during the parameter updating process. By exchanging multi-level features within the two-stream architecture, the effects of noisy labels in each sub-network are reduced by decreasing the noisy gradients. Furthermore, a hierarchical distillation is developed to provide reliable pseudo labels for unlabeled data, which further boosts the performance of TSMAN. The experiments using both HVSMR 2016 and BRATS 2015 benchmarks demonstrate that our semi-supervised learning framework surpasses the state-of-the-art fully-supervised results.

IJCAI Conference 2019 Conference Paper

Densely Supervised Hierarchical Policy-Value Network for Image Paragraph Generation

  • Siying Wu
  • Zheng-Jun Zha
  • Zilei Wang
  • Houqiang Li
  • Feng Wu

Image paragraph generation aims to describe an image with a paragraph in natural language. Compared to image captioning with a single sentence, paragraph generation provides a more expressive and fine-grained description for storytelling. Existing approaches mainly optimize the paragraph generator by minimizing a word-wise cross-entropy loss, which neglects the linguistic hierarchy of a paragraph and results in "sparse" supervision for generator learning. In this paper, we propose a novel Densely Supervised Hierarchical Policy-Value (DHPV) network for effective paragraph generation. We design new hierarchical supervisions consisting of hierarchical rewards and values at both sentence and word levels. The joint exploration of hierarchical rewards and values provides dense supervision cues for learning an effective paragraph generator. We propose a new hierarchical policy-value architecture which exploits compositionality at token-to-token and sentence-to-sentence levels simultaneously and can preserve the semantic and syntactic constituent integrity. Extensive experiments on the Stanford image-paragraph benchmark have demonstrated the effectiveness of the proposed DHPV approach with performance improvements over multiple state-of-the-art methods.

AAMAS Conference 2018 Conference Paper

Human-UAV Teaming in Dynamic and Uncertain Environments

  • Alper Turan Alan
  • Chang Liu
  • Elliot Salisbury
  • Stephen D. Prior
  • Sarvapali D. Ramchurn
  • Feng Wu
  • Kerry Tatlock
  • Gareth Rees

In this demonstrator we show how an algorithm developed for human-agent coordination can be used to coordinate human actors on the ground and unmanned aerial vehicles in a rescue mission. A video can be found here: http://goo.gl/QLQD7q.

AAAI Conference 2018 Conference Paper

Privacy-Preserving Policy Iteration for Decentralized POMDPs

  • Feng Wu
  • Shlomo Zilberstein
  • Xiaoping Chen

We propose the first privacy-preserving approach to address the privacy issues that arise in multi-agent planning problems modeled as a Dec-POMDP. Our solution is a distributed message-passing algorithm based on trials, where the agents’ policies are optimized using the cross-entropy method. In our algorithm, the agents’ private information is protected using a public-key homomorphic cryptosystem. We prove the correctness of our algorithm and analyze its complexity in terms of message passing and encryption/decryption operations. Furthermore, we analyze several privacy aspects of our algorithm and show that it can preserve the agent privacy of non-neighbors, model privacy, and decision privacy. Our experimental results on several common Dec-POMDP benchmark problems confirm the effectiveness of our approach.

YNICL Journal 2017 Journal Article

Acupuncture modulates the abnormal brainstem activity in migraine without aura patients

  • Zhengjie Li
  • Fang Zeng
  • Tao Yin
  • Lei Lan
  • Nikos Makris
  • Kristen Jorgenson
  • Taipin Guo
  • Feng Wu

Migraine is a common neurological disease with a high prevalence and unsatisfactory treatment options. The specific pathophysiological mechanisms of migraine remain unclear, which restricts the development of effective treatments for this prevalent disorder. The aims of this study were to 1) compare the spontaneous brain activity differences between Migraine without Aura (MwoA) patients and healthy controls (HCs), using amplitude of low-frequency fluctuations (ALFF) calculation method, and 2) explore how an effective treatment (verum acupuncture) could modulate the ALFF of MwoA patients. One hundred MwoA patients and forty-six matched HCs were recruited. Patients were randomized to four weeks' verum acupuncture, sham acupuncture, and waiting list groups. Patients had resting state BOLD-fMRI scan before and after treatment, while HCs only had resting state BOLD-fMRI scan at baseline. Headache intensity, headache frequency, self-rating anxiety and self-rating depression were used for clinical efficacy evaluation. Compared with HCs, MwoA patients showed increased ALFF in posterior insula and putamen/caudate, and reduced ALFF in rostral ventromedial medulla (RVM)/trigeminocervical complex (TCC). After longitudinal verum acupuncture treatment, the decreased ALFF of the RVM/TCC was normalized in migraine patients. Verum acupuncture and sham acupuncture have different modulation effects on ALFF of RVM/TCC in migraine patients. Our results suggest that impairment of the homeostasis of the trigeminovascular nociceptive pathway is involved in the neural pathophysiology of migraines. Effective treatments, such as verum acupuncture, could help to restore this imbalance.

IJCAI Conference 2017 Conference Paper

Integrating Answer Set Programming with Semantic Dictionaries for Robot Task Planning

  • Dongcai Lu
  • Yi Zhou
  • Feng Wu
  • Zhao Zhang
  • Xiaoping Chen

In this paper, we propose a novel integrated task planning system for service robots in domestic domains. Given open-ended high-level user instructions in natural language, robots need to generate a plan, i.e., a sequence of low-level executable actions, to complete the required tasks. To address this, we exploit the knowledge on semantic roles of common verbs defined in semantic dictionaries such as FrameNet and integrate it with Answer Set Programming, a task planning framework with both representation language and solvers. In the experiments, we evaluated our approach using common benchmarks on service tasks and showed that it can successfully handle many more tasks than the state-of-the-art solution. Notably, we deployed the proposed planning system on our service robot for the annual RoboCup@Home competitions and achieved very encouraging results.

IJCAI Conference 2017 Conference Paper

Multi-Agent Planning with Baseline Regret Minimization

  • Feng Wu
  • Shlomo Zilberstein
  • Xiaoping Chen

We propose a novel baseline regret minimization algorithm for multi-agent planning problems modeled as finite-horizon decentralized POMDPs. It is guaranteed to produce a policy that is provably better than or at least equivalent to the baseline policy. We also propose an iterative belief generation algorithm to effectively and efficiently minimize the baseline regret, which requires only the necessary iterations to converge to the policy with minimum baseline regret. Experimental results on common benchmark problems confirm its advantage compared to the state-of-the-art approaches.

JAIR Journal 2016 Journal Article

A Disaster Response System based on Human-Agent Collectives

  • Sarvapali D. Ramchurn
  • Trung Dong Huynh
  • Feng Wu
  • Yukki Ikuno
  • Jack Flann
  • Luc Moreau
  • Joel E. Fischer
  • Wenchao Jiang

Major natural or man-made disasters such as Hurricane Katrina or the 9/11 terror attacks pose significant challenges for emergency responders. First, they have to develop an understanding of the unfolding event either using their own resources or through third-parties such as the local population and agencies. Second, based on the information gathered, they need to deploy their teams in a flexible manner, ensuring that each team performs tasks in the most effective way. Third, given the dynamic nature of a disaster space, and the uncertainties involved in performing rescue missions, information about the disaster space and the actors within it needs to be managed to ensure that responders are always acting on up-to-date and trusted information. Against this background, this paper proposes a novel disaster response system called HAC-ER. Thus HAC-ER interweaves humans and agents, both robotic and software, in social relationships that augment their individual and collective capabilities. To design HAC-ER, we involved end-users including both experts and volunteers in several participatory design workshops, lab studies, and field trials of increasingly advanced prototypes of individual components of HAC-ER as well as the overall system. This process generated a number of new quantitative and qualitative results but also raised a number of new research questions. HAC-ER thus demonstrates how such Human-Agent Collectives (HACs) can address key challenges in disaster response. Specifically, we show how HAC-ER utilises crowdsourcing combined with machine learning to obtain the most important situational awareness from large streams of reports posted by members of the public and trusted organisations. We then show how this information can inform human-agent teams in coordinating multi-UAV deployments, as well as task planning for responders on the ground. Finally, HAC-ER incorporates an infrastructure and the associated intelligence for tracking and utilising the provenance of information shared across the entire system to ensure its accountability. We individually validate each of these elements of HAC-ER and show how they perform against standard (non-HAC) baselines and also elaborate on the evaluation of the overall system.

IJCAI Conference 2016 Conference Paper

Coordinating Human-UAV Teams in Disaster Response

  • Feng Wu
  • Sarvapali D. Ramchurn
  • Xiaoping Chen

We consider a disaster response scenario where emergency responders have to complete rescue tasks in a dynamic and uncertain environment with the assistance of multiple UAVs to collect information about the disaster space. To capture the uncertainty and partial observability of the domain, we model this problem as a POMDP. However, the resulting model is computationally intractable and cannot be solved by most existing POMDP solvers due to the large state and action spaces. By exploiting the problem structure we propose a novel online planning algorithm to solve this model. Specifically, we generate plans for the responders based on Monte-Carlo simulations and compute actions for the UAVs according to the value of information. Our empirical results confirm that our algorithm significantly outperforms the state-of-the-art both in time and solution quality.

IJCAI Conference 2015 Conference Paper

A Study of Human-Agent Collaboration for Multi-UAV Task Allocation in Dynamic Environments

  • Sarvapali D. Ramchurn
  • Joel E Fischer
  • Yuki Ikuno
  • Feng Wu
  • Jack Flann
  • Antony Waldock

We consider a setting where a team of humans oversees the coordination of multiple Unmanned Aerial Vehicles (UAVs) to perform a number of search tasks in dynamic environments that may cause the UAVs to drop out. Hence, we develop a set of multi-UAV supervisory control interfaces and a multiagent coordination algorithm to support human decision making in this setting. To elucidate the resulting interactional issues, we compare manual and mixed-initiative task allocation in both static and dynamic environments in lab studies with 40 participants and observe that our mixed-initiative system results in lower workloads and better performance in re-planning tasks than one which only involves manual task allocation. Our analysis points to new insights into the way humans appropriate flexible autonomy.

IJCAI Conference 2015 Conference Paper

Agile Planning for Real-World Disaster Response

  • Feng Wu
  • Sarvapali D. Ramchurn
  • Wenchao Jiang
  • Joel E. Fischer
  • Tom Rodden
  • Nicholas R. Jennings

We consider a setting where an agent-based planner instructs teams of human emergency responders to perform tasks in the real world. Due to uncertainty in the environment and the inability of the planner to consider all human preferences and all attributes of the real world, humans may reject plans computed by the agent. A naïve solution that replans given a rejection is inefficient and does not guarantee the new plan will be acceptable. Hence, we propose a new model of the re-planning problem using a Multi-agent Markov Decision Process that integrates potential rejections as part of the planning process and propose a novel algorithm to efficiently solve this new model. We empirically evaluate our algorithm and show that it outperforms current benchmarks. Our algorithm is also shown to perform better in pilot studies with real humans.

JAAMAS Journal 2015 Journal Article

Human–agent collaboration for disaster response

  • Sarvapali D. Ramchurn
  • Feng Wu
  • Nicholas R. Jennings

In the aftermath of major disasters, first responders are typically overwhelmed with large numbers of spatially distributed search and rescue tasks, each with their own requirements. Moreover, responders have to operate in highly uncertain and dynamic environments where new tasks may appear and hazards may be spreading across the disaster space. Hence, rescue missions may need to be re-planned as new information comes in, tasks are completed, or new hazards are discovered. Finding an optimal allocation of resources to complete all the tasks is a major computational challenge. In this paper, we use decision theoretic techniques to solve the task allocation problem posed by emergency response planning and then deploy our solution as part of an agent-based planning tool in real-world field trials. By so doing, we are able to study the interactional issues that arise when humans are guided by an agent. Specifically, we develop an algorithm, based on a multi-agent Markov decision process representation of the task allocation problem, and show that it outperforms standard baseline solutions. We then integrate the algorithm into a planning agent that responds to requests for tasks from participants in a mixed-reality location-based game, called AtomicOrchid, that simulates disaster response settings in the real world. We then run a number of trials of our planning agent and compare it against a purely human driven system. Our analysis of these trials shows that human commanders adapt to the planning agent by taking on a more supervisory role and that providing humans with the flexibility of requesting plans from the agent allows them to perform more tasks more efficiently than using purely human interactions to allocate tasks. We also discuss how such flexibility could lead to poor performance if left unchecked.

TIST Journal 2015 Journal Article

Online Planning for Large Markov Decision Processes with Hierarchical Decomposition

  • Aijun Bai
  • Feng Wu
  • Xiaoping Chen

Markov decision processes (MDPs) provide a rich framework for planning under uncertainty. However, exactly solving a large MDP is usually intractable due to the “curse of dimensionality”: the state space grows exponentially with the number of state variables. Online algorithms tackle this problem by avoiding computing a policy for the entire state space. On the other hand, since an online algorithm has to find a near-optimal action online in almost real time, the computation time is often very limited. In the context of reinforcement learning, MAXQ is a value function decomposition method that exploits the underlying structure of the original MDP and decomposes it into a combination of smaller subproblems arranged over a task hierarchy. In this article, we present MAXQ-OP, a novel online planning algorithm for large MDPs that utilizes MAXQ hierarchical decomposition in online settings. Compared to traditional online planning algorithms, MAXQ-OP is able to reach much deeper states in the search tree with relatively less computation time by exploiting MAXQ hierarchical decomposition online. We empirically evaluate our algorithm in the standard Taxi domain, a common benchmark for MDPs, to show the effectiveness of our approach. We have also conducted a long-term case study in a highly complex simulated soccer domain and developed a team named WrightEagle that has won five world championships and five runner-up places in the recent 10 years of RoboCup Soccer Simulation 2D annual competitions. The results in the RoboCup domain confirm the scalability of MAXQ-OP to very large domains.

AAAI Conference 2014 Conference Paper

Regret-Based Multi-Agent Coordination with Uncertain Task Rewards

  • Feng Wu
  • Nicholas Jennings

Many multi-agent coordination problems can be represented as DCOPs. Motivated by task allocation in disaster response, we extend standard DCOP models to consider uncertain task rewards where the outcome of completing a task depends on its current state, which is randomly drawn from unknown distributions. The goal of solving this problem is to find a solution for all agents that minimizes the overall worst-case loss. This is a challenging problem for centralized algorithms because the search space grows exponentially with the number of agents and is nontrivial for existing algorithms for standard DCOPs. To address this, we propose a novel decentralized algorithm that incorporates Max-Sum with iterative constraint generation to solve the problem by passing messages among agents. By so doing, our approach scales well and can solve instances of the task allocation problem with hundreds of agents and tasks.

NeurIPS Conference 2013 Conference Paper

Bayesian Mixture Modelling and Inference based Thompson Sampling in Monte-Carlo Tree Search

  • Aijun Bai
  • Feng Wu
  • Xiaoping Chen

Monte-Carlo tree search is drawing great interest in the domain of planning under uncertainty, particularly when little or no domain knowledge is available. One of the central problems is the trade-off between exploration and exploitation. In this paper we present a novel Bayesian mixture modelling and inference based Thompson sampling approach to addressing this dilemma. The proposed Dirichlet-NormalGamma MCTS (DNG-MCTS) algorithm represents the uncertainty of the accumulated reward for actions in the MCTS search tree as a mixture of Normal distributions and performs inference on it in Bayesian settings by choosing conjugate priors in the form of combinations of Dirichlet and NormalGamma distributions. Thompson sampling is used to select the best action at each decision node. Experimental results show that our proposed algorithm achieves state-of-the-art performance compared with the popular UCT algorithm in the context of online planning for general Markov decision processes.

IJCAI Conference 2013 Conference Paper

Monte-Carlo Expectation Maximization for Dec-POMDPs

  • Feng Wu
  • Shlomo Zilberstein
  • Nicholas R. Jennings

We address two significant drawbacks of state-of-the-art solvers of decentralized POMDPs (DEC-POMDPs): the reliance on complete knowledge of the model and limited scalability as the complexity of the domain grows. We extend a recently proposed approach for solving DEC-POMDPs via a reduction to the maximum likelihood problem, which in turn can be solved using EM. We introduce a model-free version of this approach that employs Monte-Carlo EM (MCEM). While a naïve implementation of MCEM is inadequate in multiagent settings, we introduce several improvements in sampling that produce high-quality results on a variety of DEC-POMDP benchmarks, including large problems with thousands of agents.

AAMAS Conference 2012 Conference Paper

Online Planning for Large MDPs with MAXQ Decomposition

  • Aijun Bai
  • Feng Wu
  • Xiaoping Chen

Markov decision processes (MDPs) provide an expressive framework for planning in stochastic domains. However, exactly solving a large MDP is often intractable due to the curse of dimensionality. Online algorithms help overcome the high computational complexity by avoiding computing a policy for each possible state. Hierarchical decomposition is another promising way to help scale MDP algorithms up to large domains by exploiting their underlying structure. In this paper, we present an effort to combine the benefits of a general hierarchical structure based on MAXQ value function decomposition with the power of heuristic and approximate techniques for developing an online planning framework, called MAXQ-OP. The proposed framework provides a principled approach for programming autonomous agents in a large stochastic domain. We have been conducting a long-term case study with the RoboCup soccer simulation 2D domain, which is far larger than domains usually studied in the literature, as the major benchmark for this research. The case study showed that the agents developed with this framework and the related techniques reached outstanding performance, showing its high scalability to very large domains.

IJCAI Conference 2011 Conference Paper

Online Planning for Ad Hoc Autonomous Agent Teams

  • Feng Wu
  • Shlomo Zilberstein
  • Xiaoping Chen

We propose a novel online planning algorithm for ad hoc team settings - challenging situations in which an agent must collaborate with unknown teammates without prior coordination. Our approach is based on constructing and solving a series of stage games, and then using biased adaptive play to choose actions. The utility function in each stage game is estimated via Monte-Carlo tree search using the UCT algorithm. We establish analytically the convergence of the algorithm and show that it performs well in a variety of ad hoc team domains.
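The UCT algorithm mentioned above selects actions inside the search tree with the UCB1 rule, which adds an exploration bonus to each action's average reward. The snippet below is a minimal sketch of that selection step only; the `stats` dictionary layout, the function name `ucb1_select`, and the default exploration constant are assumptions for this example, and the stage-game construction and biased adaptive play from the abstract are not shown.

```python
import math


def ucb1_select(stats, c=math.sqrt(2)):
    """Pick an action by UCB1.

    `stats` maps each action to (visit_count, total_reward). Any action
    that has never been tried is returned first; otherwise the action
    maximizing mean reward + c * sqrt(ln(total visits) / visits) wins.
    """
    total_visits = sum(n for n, _ in stats.values())
    best_action, best_score = None, float("-inf")
    for action, (n, total_reward) in stats.items():
        if n == 0:
            return action  # always try unvisited actions first
        score = total_reward / n + c * math.sqrt(math.log(total_visits) / n)
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```

A rarely visited action receives a large bonus, so it can be chosen over an action with a higher average reward until its estimate becomes more reliable.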

AIJ Journal 2011 Journal Article

Online planning for multi-agent systems with bounded communication

  • Feng Wu
  • Shlomo Zilberstein
  • Xiaoping Chen

We propose an online algorithm for planning under uncertainty in multi-agent settings modeled as DEC-POMDPs. The algorithm helps overcome the high computational complexity of solving such problems offline. The key challenges in decentralized operation are to maintain coordinated behavior with little or no communication and, when communication is allowed, to optimize value with minimal communication. The algorithm addresses these challenges by generating identical conditional plans based on common knowledge and communicating only when history inconsistency is detected, allowing communication to be postponed when necessary. To be suitable for online operation, the algorithm computes good local policies using a new and fast local search method implemented using linear programming. Moreover, it bounds the amount of memory used at each step and can be applied to problems with arbitrary horizons. The experimental results confirm that the algorithm can solve problems that are too large for the best existing offline planning algorithms and it outperforms the best online method, producing much higher value with much less communication in most cases. The algorithm also proves to be effective when the communication channel is imperfect (periodically unavailable). These results contribute to the scalability of decision-theoretic planning in multi-agent settings.

AAMAS Conference 2010 Conference Paper

Point-Based Policy Generation for Decentralized POMDPs

  • Feng Wu
  • Shlomo Zilberstein
  • Xiaoping Chen

Memory-bounded techniques have shown great promise in solving complex multi-agent planning problems modeled as DEC-POMDPs. Much of the performance gain can be attributed to pruning techniques that alleviate the complexity of the exhaustive backup step of the original MBDP algorithm. Despite these improvements, state-of-the-art algorithms can still handle only a relatively small pool of candidate policies, which limits the quality of the solution in some benchmark problems. We present a new algorithm, Point-Based Policy Generation, which avoids searching the entire joint policy space altogether. The key observation is that the best policy for each reachable belief state can be constructed directly, instead of first producing a large set of candidates. We also provide an efficient approximate implementation of this operation. The experimental results show that our solution technique improves the performance significantly in terms of both runtime and solution quality.

AAAI Conference 2010 Conference Paper

Trial-Based Dynamic Programming for Multi-Agent Planning

  • Feng Wu
  • Shlomo Zilberstein
  • Xiaoping Chen

Trial-based approaches offer an efficient way to solve single-agent MDPs and POMDPs. These approaches allow agents to focus their computations on regions of the environment they encounter during the trials, leading to significant computational savings. We present a novel trial-based dynamic programming (TBDP) algorithm for DEC-POMDPs that extends these benefits to multi-agent settings. The algorithm uses trial-based methods for both belief generation and policy evaluation. Policy improvement is implemented efficiently using linear programming and a sub-policy reuse technique that helps bound the amount of memory. The results show that TBDP can produce significant value improvements and is much faster than the best existing planning algorithms.