Arrow Research search

Author name cluster

Qiang Fu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

39 papers
2 author rows

Possible papers (39)

AAAI Conference 2026 Conference Paper

Deep (Predictive) Discounted Counterfactual Regret Minimization

  • Hang Xu
  • Kai Li
  • Haobo Fu
  • Qiang Fu
  • Junliang Xing
  • Jian Cheng

Counterfactual regret minimization (CFR) is a family of algorithms for effectively solving imperfect-information games. To enhance CFR's applicability in large games, researchers use neural networks to approximate its behavior. However, existing methods are mainly based on vanilla CFR and struggle to effectively integrate more advanced CFR variants. In this work, we propose an efficient model-free neural CFR algorithm, overcoming the limitations of existing methods in approximating advanced CFR variants. At each iteration, it collects variance-reduced sampled advantages based on a value network, fits cumulative advantages by bootstrapping, and applies discounting and clipping operations to simulate the update mechanisms of advanced CFR variants. Experimental results show that, compared with model-free neural algorithms, it exhibits faster convergence in typical imperfect-information games and demonstrates stronger adversarial performance in a large poker game.
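
As a rough illustration of the update mechanism this paper trains a network to simulate, the sketch below applies DCFR-style discounting and RM+-style clipping to a vector of cumulative advantages. The hyperparameters and function names are illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch: DCFR-style discounting plus regret-matching clipping,
# the kind of advanced-CFR update the neural algorithm above approximates.
import numpy as np

def discount_and_clip(cum_adv: np.ndarray, t: int,
                      alpha: float = 1.5, beta: float = 0.0) -> np.ndarray:
    """Shrink positive cumulative advantages by t^alpha / (t^alpha + 1)
    and negative ones by t^beta / (t^beta + 1), as in Discounted CFR."""
    pos_w = t**alpha / (t**alpha + 1.0)
    neg_w = t**beta / (t**beta + 1.0)
    return np.where(cum_adv > 0, cum_adv * pos_w, cum_adv * neg_w)

def regret_matching(cum_adv: np.ndarray) -> np.ndarray:
    """Clip to the positive part and normalize into a strategy."""
    pos = np.maximum(cum_adv, 0.0)
    total = pos.sum()
    return pos / total if total > 0 else np.full_like(pos, 1.0 / len(pos))
```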

AAAI Conference 2026 Conference Paper

FGD-Align: Pluralistic Alignment for Large Language Models via Fuzzy Group Decision-Making

  • Weihang Pan
  • Zhengxu Yu
  • Yong Wu
  • Xun Liang
  • Zhongming Jin
  • Qiang Fu
  • Penghui Shang
  • Binbin Lin

Ensuring alignment with human values is essential for modern large language models (LLMs), especially amid growing concerns around AI safety and social impact. Yet achieving such alignment remains challenging due to the limited, noisy, and often conflicting nature of human feedback from diverse annotators. Most existing approaches, such as Direct Preference Optimization (DPO), assume consistent and conflict-free supervision, overlooking the ambiguity, inconsistency, and value trade-offs inherent in real-world preferences—often leading to reduced robustness and exclusion of minority views. To address this, we propose FGD-Align, a novel pluralistic alignment framework grounded in Fuzzy Group Decision-Making theory. Our approach rigorously models and aggregates human preferences while retaining the complexity of real-world value trade-offs. Unlike traditional methods that rely on coarse-grained preference pairs, FGD-Align introduces fuzzy preference modeling via triangular fuzzy numbers to capture nuanced, multi-criteria human judgments. We further develop a new training objective, Probabilistic Fuzzy DPO, which incorporates fuzzy preference strength as adaptive loss weights and gradient filters, enhancing robustness to ambiguity and inconsistency in feedback. Comprehensive experiments demonstrate that FGD-Align consistently outperforms both DPO variants and advanced preference aggregation methods in terms of preference accuracy and robustness to ambiguity. It achieves superior alignment stability and better preserves minority preferences, all with minimal computational overhead. Our work bridges the gap between algorithmic tractability and the nuanced landscape of human values, enabling more scalable, inclusive, and socially-aware AI alignment.
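
To make the core objective concrete, here is a minimal sketch, under stated assumptions, of a DPO loss re-weighted by a defuzzified preference strength. The centroid defuzzification and the per-pair weighting are illustrative choices, not the paper's exact Probabilistic Fuzzy DPO.

```python
# Hedged sketch: fuzzy preference strength as an adaptive weight on DPO.
import torch.nn.functional as F

def defuzzify(l, m, u):
    """Centroid of a triangular fuzzy number (l <= m <= u), giving a
    scalar preference strength in [0, 1]."""
    return (l + m + u) / 3.0

def weighted_dpo_loss(logp_chosen, logp_rejected,   # policy log-probs
                      ref_chosen, ref_rejected,     # frozen reference log-probs
                      strength, beta: float = 0.1):
    """Standard DPO logits, re-weighted per pair by fuzzy strength."""
    logits = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    per_pair = -F.logsigmoid(logits)
    return (strength * per_pair).mean()  # ambiguous (low-strength) pairs count less
```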

EAAI Journal 2025 Journal Article

A global linear attention incorporated video transformer for robust sintering condition recognition

  • Leyuan Wu
  • Junlin Wu
  • Dingxiang Wang
  • Qiang Fu

Robust and accurate sintering condition recognition is a fundamental yet critical issue in the design of image-based intelligent combustion control systems. However, owing to the weak texture and fast-changing characteristics of flame videos, capturing the condition indicator using existing gradient-based methods is challenging. To address this issue, we propose a global linear attention incorporated video transformer model for sintering condition recognition. First, to reduce the prediction error and uncertainty, the spatial-temporal features are extracted to describe the dynamic characteristics of the flame video streams based on the video shifted window (Swin) Transformer architecture. Next, to address the problem that the local attention strategy used in the Video Swin Transformer is insufficient for global flame feature extraction, we propose a Video Linear Attention block that obtains the global attention as a supplement. Extensive experiments conducted on a real-world rotary kiln sintering dataset demonstrate the effectiveness of our approach, achieving an overall accuracy of 97.76% and an F1-score of 95.30%. Compared to the Video Swin Transformer model, these results represent improvements of 2.00% in accuracy and 4.96% in F1-score, respectively. This research is particularly significant in the context of real-time identification of combustion process conditions, optimization of control parameters, and realization of more stable and efficient combustion process control.
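
For readers unfamiliar with linear attention, the generic mechanism such a block builds on can be sketched as follows: softmax(QK^T)V, which is quadratic in sequence length, is replaced by phi(Q)(phi(K)^T V), which is linear. The elu-plus-one feature map is a common choice and an assumption here; the paper's spatio-temporal block is more elaborate.

```python
# Minimal sketch of (non-causal) linear attention; cost is O(seq_len * dim^2).
import torch

def linear_attention(q, k, v, eps: float = 1e-6):
    """q, k, v: tensors of shape (batch, seq_len, dim)."""
    phi_q = torch.nn.functional.elu(q) + 1.0
    phi_k = torch.nn.functional.elu(k) + 1.0
    kv = torch.einsum("bsd,bse->bde", phi_k, v)           # (batch, dim, dim)
    z = torch.einsum("bsd,bd->bs", phi_q, phi_k.sum(1))   # normalizer
    return torch.einsum("bsd,bde->bse", phi_q, kv) / (z.unsqueeze(-1) + eps)
```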

EAAI Journal 2025 Journal Article

A novel condition monitoring approach using hybrid lightweighted adaptive models for complex machinery

  • Yingqian Liu
  • Rongyong Zhang
  • Luigi Grossi
  • Zhipin Ye
  • Huairui Li
  • Rongsheng Zhu
  • Qiang Fu

Condition monitoring of complex industrial systems is critical for ensuring operational reliability. Data-driven methods using artificial intelligence have advanced anomaly detection (AD) and fault diagnosis (FD), but existing approaches often treat them separately, focus on known faults, and struggle with previously unseen or rare conditions in multi-modal scenarios. This study proposes a novel condition monitoring framework that integrates AD and FD within a distributed architecture. Lightweight models—including kernel principal component analysis, support vector machines, and one-dimensional convolutional neural networks—enable efficient and scalable processing. A multilevel information fusion strategy ensures consistent detection and diagnosis while facilitating the isolation of previously unknown faults. Module test results demonstrate the effectiveness and robustness of the proposed feature extraction and adaptive modeling approaches. The overall test results for previously unknown faults vary across channels and modules. For samples with misalignment and inner blade wear, channel-level detection accuracy ranges from 0.007 to 0.989, with unknown recognition rates up to 0.933 and diagnosis probabilities from 0.508 to 0.933. For strong misalignment and fan-end inner race faults, nearly all channels achieve 100% detection accuracy, with some diagnosis probabilities above 0.9, while unknown recognition remains minimal (mostly below 0.05). Importantly, the proposed framework integrates detection and diagnostic outputs across channels, effectively mapping previously unseen faults to similar known categories or to an unknown category. Overall, the proposed framework offers a reference solution for condition monitoring of industrial systems like pumps, turbines, and compressors, and lays the foundation for future improvements incorporating domain knowledge and model-driven interpretability.

AAAI Conference 2025 Conference Paper

An Open-Ended Learning Framework for Opponent Modeling

  • Yuheng Jing
  • Kai Li
  • Bingyun Liu
  • Haobo Fu
  • Qiang Fu
  • Junliang Xing
  • Jian Cheng

Opponent Modeling (OM) aims to enhance decision-making by modeling other agents in multi-agent environments. Existing works typically learn opponent models against a pre-designated fixed set of opponents during training. However, this will cause poor generalization when facing unknown opponents during testing, as previously unseen opponents can exhibit out-of-distribution (OOD) behaviors that the learned opponent models cannot handle. To tackle this problem, we introduce a novel Open-Ended Opponent Modeling (OEOM) framework, which continuously generates opponents with diverse strengths and styles to reduce the possibility of OOD situations occurring during testing. Founded on population-based training and information-theoretic trajectory space diversity regularization, OEOM generates a dynamic set of opponents. This set is then fed to any OM approaches to train a potentially generalizable opponent model. Upon this, we further propose a simple yet effective OM approach that naturally fits within the OEOM framework. This approach is based on in-context reinforcement learning and learns a Transformer that dynamically recognizes and responds to opponents based on their trajectories. Extensive experiments in cooperative, competitive, and mixed environments demonstrate that OEOM is an approach-agnostic framework that improves generalizability compared to training against a fixed set of opponents, regardless of OM approaches or testing opponent settings. The results also indicate that our proposed approach generally outperforms existing OM baselines.

NeurIPS Conference 2025 Conference Paper

Hamiltonian Descent Algorithms for Optimization: Accelerated Rates via Randomized Integration Time

  • Qiang Fu
  • Andre Wibisono

We study the Hamiltonian flow for optimization (HF-opt), which simulates the Hamiltonian dynamics for some integration time and resets the velocity to zero to decrease the objective function; this is the optimization analogue of the Hamiltonian Monte Carlo algorithm for sampling. For short integration time, HF-opt has the same convergence rates as gradient descent for minimizing strongly and weakly convex functions. We show that by randomizing the integration time in HF-opt, the resulting randomized Hamiltonian flow (RHF) achieves accelerated convergence rates in continuous time, similar to the rates for accelerated gradient flow. We study a discrete-time implementation of RHF as the randomized Hamiltonian gradient descent (RHGD) algorithm. We prove that RHGD achieves the same accelerated convergence rates as Nesterov's accelerated gradient descent (AGD) for minimizing smooth strongly and weakly convex functions. We provide numerical experiments to demonstrate that RHGD is competitive with classical accelerated methods such as AGD across all settings and outperforms them in certain regimes.
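
A minimal sketch of the RHGD idea, assuming a leapfrog integrator and a uniform distribution over integration times (both illustrative choices, as is the step size):

```python
# Hedged sketch: run Hamiltonian dynamics for a random integration time,
# then reset the velocity to zero, and repeat.
import numpy as np

def rhgd(grad_f, x0, n_outer: int = 100, eta: float = 0.1,
         max_time: float = 1.0, seed: int = 0):
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_outer):
        v = np.zeros_like(x)                  # velocity reset between rounds
        t = rng.uniform(0.0, max_time)        # randomized integration time
        n_steps = max(1, int(t / eta))
        v -= 0.5 * eta * grad_f(x)            # leapfrog: initial half step
        for _ in range(n_steps - 1):
            x += eta * v
            v -= eta * grad_f(x)
        x += eta * v
        v -= 0.5 * eta * grad_f(x)            # final half step (v discarded)
    return x

# Example: minimize f(x) = 0.5 * ||x||^2, whose gradient is x.
x_min = rhgd(lambda x: x, x0=np.ones(3))
```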

ICLR Conference 2025 Conference Paper

Online-to-Offline RL for Agent Alignment

  • Xu Liu
  • Haobo Fu
  • Stefano V. Albrecht
  • Qiang Fu
  • Shuai Li

Reinforcement learning (RL) has shown remarkable success in training agents to achieve high-performing policies, particularly in domains like Game AI where simulation environments enable efficient interactions. However, despite their success in maximizing returns, such online-trained policies often fail to align with human preferences concerning actions, styles, and values. The challenge lies in efficiently adapting these online-trained policies to align with human preferences, given the scarcity and high cost of collecting human behavior data. In this work, we formalize the problem as *online-to-offline* RL and propose ALIGNment of Game AI to Preferences (ALIGN-GAP), an innovative approach for the alignment of well-trained game agents to human preferences. Our method features a carefully designed reward model that encodes human preferences from limited offline data and incorporates curriculum-based preference learning to align RL agents with targeted human preferences. Experiments across diverse environments and preference types demonstrate the performance of ALIGN-GAP, achieving effective alignment with human preferences.

NeurIPS Conference 2025 Conference Paper

TokenSqueeze: Performance-Preserving Compression for Reasoning LLMs

  • Yuxiang Zhang
  • Zhengxu Yu
  • Weihang Pan
  • Zhongming Jin
  • Qiang Fu
  • Deng Cai
  • Binbin Lin
  • Jieping Ye

Emerging reasoning LLMs such as OpenAI-o1 and DeepSeek-R1 have achieved strong performance on complex reasoning tasks by generating long chain-of-thought (CoT) traces. However, these long CoTs result in increased token usage, leading to higher inference latency and memory consumption. As a result, balancing accuracy and reasoning efficiency has become essential for deploying reasoning LLMs in practical applications. Existing long-to-short (Long2Short) methods aim to reduce inference length but often sacrifice accuracy, revealing a need for an approach that maintains performance while lowering token costs. To address this efficiency-accuracy tradeoff, we propose TokenSqueeze, a novel Long2Short method that condenses reasoning paths while preserving performance and relying exclusively on self-generated data. First, to prevent performance degradation caused by excessive compression of reasoning depth, we propose to select self-generated samples whose reasoning depth is adaptively matched to the complexity of the problem. To further optimize the linguistic expression without altering the underlying reasoning paths, we introduce a distribution-aligned linguistic refinement method that enhances the clarity and conciseness of the reasoning path while preserving its logical integrity. Comprehensive experimental results demonstrate the effectiveness of TokenSqueeze in reducing token usage while maintaining accuracy. Notably, DeepSeek-R1-Distill-Qwen-7B fine-tuned using our proposed method achieved a 50% average token reduction while preserving accuracy on the MATH500 benchmark. TokenSqueeze exclusively utilizes the model's self-generated data, enabling efficient and high-fidelity reasoning without relying on manually curated short-answer datasets across diverse applications. Our code is available at https://github.com/zhangyx1122/TokenSqueeze.
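
The adaptive sample-selection idea can be illustrated with a hedged sketch: among several self-generated solutions, keep the shortest one that remains correct, so the retained reasoning depth tracks problem difficulty. `generate` and `is_correct` are hypothetical stand-ins, and the paper's actual selection criterion is more refined than length alone.

```python
# Hedged sketch: depth-adaptive selection of self-generated training samples.
def select_adaptive_sample(generate, is_correct, problem, k: int = 8):
    """Sample k candidate solutions and keep the shortest correct one."""
    candidates = [generate(problem) for _ in range(k)]
    correct = [c for c in candidates if is_correct(problem, c)]
    return min(correct, key=len) if correct else None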

TMLR Journal 2024 Journal Article

Affordable Generative Agents

  • Yangbin Yu
  • Qin Zhang
  • Junyou Li
  • Qiang Fu
  • Deheng Ye

The emergence of large language models (LLMs) has significantly advanced the simulation of believable interactive agents. However, the substantial cost of maintaining prolonged agent interactions poses a challenge to the deployment of believable LLM-based agents. Therefore, in this paper, we develop Affordable Generative Agents (AGA), a framework for enabling the generation of believable and low-cost interactions at both the agent-environment and inter-agent levels. Specifically, for agent-environment interactions, we substitute repetitive LLM inferences with learned policies; for inter-agent interactions, we model the social relationships between agents and compress auxiliary dialogue information. Extensive experiments on multiple environments show the effectiveness and efficiency of our proposed framework. We also delve into the mechanisms of emergent believable behaviors in LLM agents, demonstrating that agents can only generate finite behaviors in fixed environments, based on which we identify ways to facilitate emergent interaction behaviors. Our code is publicly available at: https://github.com/AffordableGenerativeAgents/Affordable-Generative-Agents.

AIJ Journal 2024 Journal Article

Automatically designing counterfactual regret minimization algorithms for solving imperfect-information games

  • Kai Li
  • Hang Xu
  • Haobo Fu
  • Qiang Fu
  • Junliang Xing

Strategic decision-making in imperfect-information games is an important problem in artificial intelligence. Counterfactual regret minimization (CFR), a family of iterative algorithms, has been the workhorse for solving these types of games since its inception. In recent years, a series of novel CFR variants have been proposed, significantly improving the convergence rate of vanilla CFR. However, most of these new variants are hand-designed by researchers through trial and error, often based on different motivations, which generally requires a tremendous amount of effort and insight. This work proposes AutoCFR, a systematic framework that meta-learns novel CFR algorithms through evolution, easing the burden of manual algorithm design. We first design a search language that is rich enough to represent various CFR variants. We then exploit a scalable regularized evolution algorithm with a set of acceleration techniques to efficiently search over the combinatorial space of algorithms defined by this language. The learned novel CFR algorithm can generalize to new imperfect-information games not seen during training and performs on par with or better than existing state-of-the-art CFR variants. In addition to superior empirical performance, we also theoretically show that the learned algorithm converges to an approximate Nash equilibrium. Extensive experiments across diverse imperfect-information games highlight the scalability, extensibility, and generalizability of AutoCFR, establishing it as a general-purpose framework for solving imperfect-information games.
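
The search loop at AutoCFR's core is regularized (aging) evolution. A generic sketch of that loop follows, with the CFR-specific program representation, mutation, and evaluation left abstract as user-supplied callables:

```python
# Generic regularized-evolution search loop (aging evolution), as a sketch of
# the kind of search AutoCFR runs over its algorithm space.
import random
from collections import deque

def regularized_evolution(random_program, mutate, evaluate,
                          pop_size: int = 100, sample_size: int = 10,
                          n_cycles: int = 1000):
    population = deque(maxlen=pop_size)       # oldest member dies automatically
    for _ in range(pop_size):
        prog = random_program()
        population.append((prog, evaluate(prog)))
    for _ in range(n_cycles):
        parent = max(random.sample(list(population), sample_size),
                     key=lambda pf: pf[1])    # tournament selection
        child = mutate(parent[0])
        population.append((child, evaluate(child)))
    return max(population, key=lambda pf: pf[1])[0]
```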

NeurIPS Conference 2024 Conference Paper

Efficient Multi-task Reinforcement Learning with Cross-Task Policy Guidance

  • Jinmin He
  • Kai Li
  • Yifan Zang
  • Haobo Fu
  • Qiang Fu
  • Junliang Xing
  • Jian Cheng

Multi-task reinforcement learning endeavors to efficiently leverage shared information across various tasks, facilitating the simultaneous learning of multiple tasks. Existing approaches primarily focus on parameter sharing with carefully designed network structures or tailored optimization procedures. However, they overlook a direct and complementary way to exploit cross-task similarities: the control policies of tasks already proficient in some skills can provide explicit guidance for unmastered tasks to accelerate skill acquisition. To this end, we present a novel framework called Cross-Task Policy Guidance (CTPG), which trains a guide policy for each task to select the behavior policy interacting with the environment from all tasks' control policies, generating better training trajectories. In addition, we propose two gating mechanisms to improve the learning efficiency of CTPG: one gate filters out control policies that are not beneficial for guidance, while the other gate blocks tasks that do not necessitate guidance. CTPG is a general framework adaptable to existing parameter sharing approaches. Empirical evaluations demonstrate that incorporating CTPG with these approaches significantly enhances performance in manipulation and locomotion benchmarks.

IJCAI Conference 2024 Conference Paper

Minimizing Weighted Counterfactual Regret with Optimistic Online Mirror Descent

  • Hang Xu
  • Kai Li
  • Bingyun Liu
  • Haobo Fu
  • Qiang Fu
  • Junliang Xing
  • Jian Cheng

Counterfactual regret minimization (CFR) is a family of algorithms for effectively solving imperfect-information games. It decomposes the total regret into counterfactual regrets, utilizing local regret minimization algorithms, such as Regret Matching (RM) or RM+, to minimize them. Recent research establishes a connection between Online Mirror Descent (OMD) and RM+, paving the way for an optimistic variant PRM+ and its extension PCFR+. However, PCFR+ assigns uniform weights for each iteration when determining regrets, leading to substantial regrets when facing dominated actions. This work explores minimizing weighted counterfactual regret with optimistic OMD, resulting in a novel CFR variant PDCFR+. It integrates PCFR+ and Discounted CFR (DCFR) in a principled manner, swiftly mitigating negative effects of dominated actions and consistently leveraging predictions to accelerate convergence. Theoretical analyses prove that PDCFR+ converges to a Nash equilibrium, particularly under distinct weighting schemes for regrets and average strategies. Experimental results demonstrate PDCFR+'s fast convergence in common imperfect-information games. The code is available at https://github.com/rpSebastian/PDCFRPlus.
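
To convey the flavor of the combination, here is a rough per-infoset sketch mixing RM+-style clipping, DCFR-style discounting, and an optimistic prediction step. The weighting details are simplified assumptions rather than PDCFR+'s exact scheme.

```python
# Hedged sketch: one discounted, predictive regret-matching+ update.
import numpy as np

def pdcfr_plus_step(cum_regret: np.ndarray, last_regret: np.ndarray,
                    t: int, alpha: float = 1.5):
    w = t**alpha / (t**alpha + 1.0)                   # discount past regrets
    cum_regret = np.maximum(w * cum_regret + last_regret, 0.0)   # RM+ clip
    predicted = np.maximum(cum_regret + last_regret, 0.0)        # optimism
    total = predicted.sum()
    strategy = (predicted / total if total > 0
                else np.full_like(predicted, 1.0 / len(predicted)))
    return cum_regret, strategy
```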

TMLR Journal 2024 Journal Article

More Agents Is All You Need

  • Junyou Li
  • Qin Zhang
  • Yangbin Yu
  • Qiang Fu
  • Deheng Ye

We find that, simply via a sampling-and-voting method, the performance of large language models (LLMs) scales with the number of agents instantiated. Also, this method, termed Agent Forest, is orthogonal to existing complicated methods that further enhance LLMs, while the degree of enhancement is correlated with task difficulty. We conduct comprehensive experiments on a wide range of LLM benchmarks to verify our finding, and to study the properties that can facilitate its occurrence. Our code is publicly available at: https://github.com/MoreAgentsIsAllYouNeed/AgentForest
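
The sampling-and-voting method is simple enough to state directly. In this minimal sketch, `query_llm` is a hypothetical stand-in for whatever generation API is in use, and answers are assumed to be comparable strings:

```python
# Sketch of sampling-and-voting: query the same model n times, majority-vote.
from collections import Counter

def sample_and_vote(query_llm, prompt: str, n_agents: int = 10) -> str:
    answers = [query_llm(prompt) for _ in range(n_agents)]
    return Counter(answers).most_common(1)[0][0]
```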

AAAI Conference 2024 Conference Paper

Not All Tasks Are Equally Difficult: Multi-Task Deep Reinforcement Learning with Dynamic Depth Routing

  • Jinmin He
  • Kai Li
  • Yifan Zang
  • Haobo Fu
  • Qiang Fu
  • Junliang Xing
  • Jian Cheng

Multi-task reinforcement learning endeavors to accomplish a set of different tasks with a single policy. To enhance data efficiency by sharing parameters across multiple tasks, a common practice segments the network into distinct modules and trains a routing network to recombine these modules into task-specific policies. However, existing routing approaches employ a fixed number of modules for all tasks, neglecting that tasks with varying difficulties commonly require varying amounts of knowledge. This work presents a Dynamic Depth Routing (D2R) framework, which learns strategic skipping of certain intermediate modules, thereby flexibly choosing different numbers of modules for each task. Under this framework, we further introduce a ResRouting method to address the issue of disparate routing paths between behavior and target policies during off-policy training. In addition, we design an automatic route-balancing mechanism to encourage continued routing exploration for unmastered tasks without disturbing the routing of mastered ones. We conduct extensive experiments on various robotics manipulation tasks in the Meta-World benchmark, where D2R achieves state-of-the-art performance with significantly improved learning efficiency.

NeurIPS Conference 2024 Conference Paper

Opponent Modeling with In-context Search

  • Yuheng Jing
  • Bingyun Liu
  • Kai Li
  • Yifan Zang
  • Haobo Fu
  • Qiang Fu
  • Junliang Xing
  • Jian Cheng

Opponent modeling is a longstanding research topic aimed at enhancing decision-making by modeling information about opponents in multi-agent environments. However, existing approaches often face challenges such as difficulty generalizing to unknown opponent policies and unstable performance. To tackle these challenges, we propose a novel approach based on in-context learning and decision-time search named Opponent Modeling with In-context Search (OMIS). OMIS leverages in-context learning-based pretraining to train a Transformer model for decision-making. It consists of three in-context components: an actor learning best responses to opponent policies, an opponent imitator mimicking opponent actions, and a critic estimating state values. When testing in an environment that features unknown non-stationary opponent agents, OMIS uses pretrained in-context components for decision-time search to refine the actor's policy. Theoretically, we prove that under reasonable assumptions, OMIS without search converges in opponent policy recognition and has good generalization properties; with search, OMIS provides improvement guarantees, exhibiting performance stability. Empirically, in competitive, cooperative, and mixed environments, OMIS demonstrates more effective and stable adaptation to opponents than other approaches. See our project website at https://sites.google.com/view/nips2024-omis.

AAAI Conference 2024 Conference Paper

Text-to-Image Generation for Abstract Concepts

  • Jiayi Liao
  • Xu Chen
  • Qiang Fu
  • Lun Du
  • Xiangnan He
  • Xiang Wang
  • Shi Han
  • Dongmei Zhang

Recent years have witnessed the substantial progress of large-scale models across various domains, such as natural language processing and computer vision, facilitating the expression of concrete concepts. Unlike concrete concepts that are usually directly associated with physical objects, expressing abstract concepts through natural language requires considerable effort, since they are characterized by intricate semantics and connotations. An alternative approach is to leverage images to convey rich visual information as a supplement. Nevertheless, existing Text-to-Image (T2I) models are primarily trained on concrete physical objects and often struggle to visualize abstract concepts. Inspired by the three-layer artwork theory, which identifies the critical factors of intent, object, and form in artistic creation, we propose a framework of Text-to-Image generation for Abstract Concepts (TIAC). The abstract concept is clarified into a clear intent with a detailed definition to avoid ambiguity. LLMs then transform it into semantic-related physical objects, and the concept-dependent form is retrieved from an LLM-extracted form pattern set. Information from these three aspects is integrated to generate prompts for T2I models via an LLM. Evaluation results from human assessments and our newly designed metric, concept score, demonstrate the effectiveness of our framework in creating images that can sufficiently express abstract concepts.

NeurIPS Conference 2023 Conference Paper

A Robust and Opponent-Aware League Training Method for StarCraft II

  • Ruozi Huang
  • Xipeng Wu
  • Hongsheng Yu
  • Zhong Fan
  • Haobo Fu
  • Qiang Fu
  • Wei Yang

It is extremely difficult to train a superhuman Artificial Intelligence (AI) for games of similar size to StarCraft II. AlphaStar is the first AI that beat human professionals in the full game of StarCraft II, using a league training framework that is inspired by a game-theoretic approach. In this paper, we improve AlphaStar's league training in two significant aspects. We train goal-conditioned exploiters, whose abilities of spotting weaknesses in the main agent and the entire league are greatly improved compared to the unconditioned exploiters in AlphaStar. In addition, we endow the agents in the league with the new ability of opponent modeling, which makes the agent more responsive to the opponent's real-time strategy. Based on these improvements, we train a stronger, superhuman AI with orders of magnitude fewer resources than AlphaStar (see Table 1 for a full comparison). Considering the iconic role of StarCraft II in game AI research, we believe our method and results on StarCraft II provide valuable design principles on how one would utilize the general league training framework for obtaining a least-exploitable strategy in various large-scale, real-world games.

NeurIPS Conference 2023 Conference Paper

Automatic Grouping for Efficient Cooperative Multi-Agent Reinforcement Learning

  • Yifan Zang
  • Jinmin He
  • Kai Li
  • Haobo Fu
  • Qiang Fu
  • Junliang Xing
  • Jian Cheng

Grouping is ubiquitous in natural systems and is essential for promoting efficiency in team coordination. This paper proposes a novel formulation of Group-oriented Multi-Agent Reinforcement Learning (GoMARL), which learns automatic grouping without domain knowledge for efficient cooperation. In contrast to existing approaches that attempt to directly learn the complex relationship between the joint action-values and individual utilities, we empower subgroups as a bridge to model the connection between small sets of agents and encourage cooperation among them, thereby improving the learning efficiency of the whole team. In particular, we factorize the joint action-values as a combination of group-wise values, which guide agents to improve their policies in a fine-grained fashion. We present an automatic grouping mechanism to generate dynamic groups and group action-values. We further introduce a hierarchical control for policy learning that drives the agents in the same group to specialize in similar policies and possess diverse strategies for various groups. Experiments on the StarCraft II micromanagement tasks and Google Research Football scenarios verify our method's effectiveness. Extensive component studies show how grouping works and enhances performance.

IJCAI Conference 2023 Conference Paper

Causal-Based Supervision of Attention in Graph Neural Network: A Better and Simpler Choice towards Powerful Attention

  • Hongjun Wang
  • Jiyuan Chen
  • Lun Du
  • Qiang Fu
  • Shi Han
  • Xuan Song

Recent years have witnessed the great potential of attention mechanisms in graph representation learning. However, while variants of attention-based GNNs are setting new benchmarks for numerous real-world datasets, recent works have pointed out that their induced attentions are less robust and generalizable against noisy graphs due to a lack of direct supervision. In this paper, we present a new framework which utilizes the tool of causality to provide a powerful supervision signal for the learning process of attention functions. Specifically, we estimate the direct causal effect of attention on the final prediction, and then maximize such effect to guide attention toward more meaningful neighbors. Our method can serve as a plug-and-play module for any canonical attention-based GNNs in an end-to-end fashion. Extensive experiments on a wide range of benchmark datasets illustrate that, by directly supervising attention functions, the model is able to converge faster with a clearer decision boundary, and thus yields better performance.

NeurIPS Conference 2023 Conference Paper

Hokoff: Real Game Dataset from Honor of Kings and its Offline Reinforcement Learning Benchmarks

  • Yun Qu
  • Boyuan Wang
  • Jianzhun Shao
  • Yuhang Jiang
  • Chen Chen
  • Zhenbin Ye
  • Liu Linc
  • Yang Feng

The advancement of Offline Reinforcement Learning (RL) and Offline Multi-Agent Reinforcement Learning (MARL) critically depends on the availability of high-quality, pre-collected offline datasets that represent real-world complexities and practical applications. However, existing datasets are often overly simplistic and lack realism. To address this gap, we propose Hokoff, a comprehensive set of pre-collected datasets that covers both offline RL and offline MARL, accompanied by a robust framework, to facilitate further research. This data is derived from Honor of Kings, a recognized Multiplayer Online Battle Arena (MOBA) game known for its intricate nature, closely resembling real-life situations. Utilizing this framework, we benchmark a variety of offline RL and offline MARL algorithms. We also introduce a novel baseline algorithm tailored for the inherent hierarchical action space of the game. We reveal the shortcomings of current offline RL approaches in handling task complexity, generalization, and multi-task learning.

IJCAI Conference 2023 Conference Paper

Multi-objective Optimization-based Selection for Quality-Diversity by Non-surrounded-dominated Sorting

  • Ren-Jian Wang
  • Ke Xue
  • Haopu Shang
  • Chao Qian
  • Haobo Fu
  • Qiang Fu

Quality-Diversity (QD) algorithms, a subset of evolutionary algorithms, maintain an archive (i.e., a set of solutions) and simulate the natural evolution process through iterative selection and reproduction, with the goal of generating a set of high-quality and diverse solutions. Though they have found many successful applications in reinforcement learning, QD algorithms often select the parent solutions uniformly at random, which lacks selection pressure and may limit performance. Recent studies have treated each type of behavior of a solution as an objective and selected the parent solutions based on Multi-objective Optimization (MO), which is a natural idea but has not led to satisfactory performance as expected. This paper gives the reason for the first time, and then proposes a new MO-based selection method by non-surrounded-dominated sorting (NSS), which considers all possible directions of the behaviors, and thus can generate diverse solutions over the whole behavior space. By combining NSS with the most widespread QD algorithm, MAP-Elites, we perform experiments on synthetic functions and several complex tasks (i.e., QDGym, robotic arm, and Mario environment generation), showing that NSS achieves better performance than not only other MO-based selection methods but also state-of-the-art selection methods in QD.

NeurIPS Conference 2023 Conference Paper

Policy Space Diversity for Non-Transitive Games

  • Jian Yao
  • Weiming Liu
  • Haobo Fu
  • Yaodong Yang
  • Stephen McAleer
  • Qiang Fu
  • Wei Yang

Policy-Space Response Oracles (PSRO) is an influential algorithm framework for approximating a Nash Equilibrium (NE) in multi-agent non-transitive games. Many previous studies have tried to promote policy diversity in PSRO. A major weakness of existing diversity metrics is that a more diverse (according to their diversity metrics) population does not necessarily mean (as we prove in the paper) a better approximation to an NE. To alleviate this problem, we propose a new diversity metric, the improvement of which guarantees a better approximation to an NE. Meanwhile, we develop a practical and well-justified method to optimize our diversity metric using only state-action samples. By incorporating our diversity regularization into the best response solving of PSRO, we obtain a new PSRO variant, Policy Space Diversity PSRO (PSD-PSRO). We present the convergence property of PSD-PSRO. Empirically, extensive experiments on single-state games, Leduc, and Goofspiel demonstrate that PSD-PSRO is more effective in producing significantly less exploitable policies than state-of-the-art PSRO variants.

AAAI Conference 2023 Conference Paper

RLogist: Fast Observation Strategy on Whole-Slide Images with Deep Reinforcement Learning

  • Boxuan Zhao
  • Jun Zhang
  • Deheng Ye
  • Jian Cao
  • Xiao Han
  • Qiang Fu
  • Wei Yang

Whole-slide images (WSI) in computational pathology have high resolution with gigapixel size, but generally have sparse regions of interest, which leads to weak diagnostic relevance and data inefficiency for each area in the slide. Most of the existing methods rely on a multiple instance learning framework that requires densely sampling local patches at high magnification. The limitation is evident in the application stage, as the heavy computation for extracting patch-level features is inevitable. In this paper, we develop RLogist, a benchmarking deep reinforcement learning (DRL) method for fast observation strategy on WSIs. Imitating the diagnostic logic of human pathologists, our RL agent learns how to find regions of observation value and obtain representative features across multiple resolution levels, without having to analyze each part of the WSI at high magnification. We benchmark our method on two whole-slide-level classification tasks, including detection of metastases in WSIs of lymph node sections and subtyping of lung cancer. Experimental results demonstrate that RLogist achieves competitive classification performance compared to typical multiple instance learning algorithms, while having a significantly shorter observation path. In addition, the observation path given by RLogist provides good decision-making interpretability, and its reading-path navigation ability can potentially be used by pathologists for educational/assistive purposes. Our code is available at: https://github.com/tencent-ailab/RLogist.

TMLR Journal 2023 Journal Article

RLTF: Reinforcement Learning from Unit Test Feedback

  • Jiate Liu
  • Yiqin Zhu
  • Kaiwen Xiao
  • Qiang Fu
  • Xiao Han
  • Yang Wei
  • Deheng Ye

The goal of program synthesis, or code generation, is to generate executable code based on given descriptions. Recently, there has been an increasing number of studies employing reinforcement learning (RL) to improve the performance of large language models (LLMs) for code. However, some of the current representative RL methods use only offline frameworks, limiting the exploration of new sample spaces. Additionally, the utilization of unit test signals is limited, not accounting for specific error locations within the code. To address these issues, we propose RLTF, i.e., Reinforcement Learning from Unit Test Feedback, a novel online RL framework with unit test feedback of multi-granularity for refining code LLMs. Our approach generates data in real-time during training and simultaneously utilizes fine-grained feedback signals to guide the model towards producing higher-quality code. Extensive experiments show that RLTF achieves state-of-the-art performance on the APPS and MBPP benchmarks. Our code is available at: https://github.com/Zyq-scut/RLTF.
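
A hedged sketch of the multi-granularity feedback described: a coarse reward for passing the unit tests, plus a localized error signal that enables targeted penalties. The reward values and the `run_tests` interface are illustrative assumptions, not the paper's exact scheme.

```python
# Sketch: coarse pass/fail reward plus a fine-grained error location.
def unit_test_reward(run_tests, code: str):
    """`run_tests` is assumed to return (passed, error_line_or_None)."""
    passed, error_line = run_tests(code)
    if passed:
        return 1.0, None
    # Fine-grained signal: surface the failing location so the penalty
    # can be focused on the offending span rather than the whole program.
    return -0.3, error_line
```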

AAMAS Conference 2023 Conference Paper

Sequential Cooperative Multi-Agent Reinforcement Learning

  • Yifan Zang
  • Jinmin He
  • Kai Li
  • Haobo Fu
  • Qiang Fu
  • Junliang Xing

Cooperative multi-agent reinforcement learning (MARL) aims to coordinate the actions of multiple agents via a shared team reward. The complex interactions among agents make this problem extremely difficult. Mainstream MARL methods often implicitly learn an inexplicable value decomposition from the shared reward into individual utilities, failing to give insights into how well each agent acts and lacking direct policy optimization guidance. This paper presents a sequential MARL framework that factorizes and simplifies the complex interaction analysis into a sequential evaluation process for more effective and efficient learning. We explicitly formulate this factorization via a novel sequential advantage function to evaluate each agent's actions, which achieves an explicable credit assignment and substantially facilitates policy optimization. We realize the sequential credit assignment (SeCA) by dynamically adjusting the sequence in light of agents' contributions to the team. Extensive experimental validations on a challenging set of StarCraft II micromanagement tasks verify SeCA's effectiveness.

IROS Conference 2022 Conference Paper

A Novel Robot with Rolling and Climbing Modes for Power Transmission Line Inspection

  • Qiang Fu
  • Yisheng Guan
  • Haifei Zhu

Power transmission line inspection is demanding high-altitude work that robots are increasingly expected to perform in place of human workers. A variety of robots have been developed to this end, with basic locomotion and inspection implemented on the lines. However, most current line inspection robots (LIRs) are merely mobile platforms with complex structures and large weights, lacking sufficiently dexterous locomotion on lines, especially for obstacle overcoming and line transition. Moreover, with sensors fixed on the platform, the inspection range is largely limited. For higher mobility and a larger inspection range, this paper proposes Climbot-L, a novel biped robot that can roll and climb on a power transmission line for inspection. While the rolling mode has the advantage of high locomotion efficiency, the biped climbing mode makes it possible to easily overcome obstacles on the line, transition to adjacent cables, and perform multi-view detection. In this paper, the design of this novel robot is first introduced, the working principle of the wheel-gripper modules is then analyzed, and obstacle-overcoming gaits are described. The effectiveness and high maneuverability of the presented robot are verified by a series of experiments.

AAAI Conference 2022 Conference Paper

AutoCFR: Learning to Design Counterfactual Regret Minimization Algorithms

  • Hang Xu
  • Kai Li
  • Haobo Fu
  • Qiang Fu
  • Junliang Xing

Counterfactual regret minimization (CFR) is the most commonly used algorithm for approximately solving two-player zero-sum imperfect-information games (IIGs). In recent years, a series of novel CFR variants such as CFR+, Linear CFR, and DCFR have been proposed and have significantly improved the convergence rate of vanilla CFR. However, most of these new variants are hand-designed by researchers through trial and error based on different motivations, which generally requires a tremendous amount of effort and insight. This work proposes to meta-learn novel CFR algorithms through evolution to ease the burden of manual algorithm design. We first design a search language that is rich enough to represent many existing hand-designed CFR variants. We then exploit a scalable regularized evolution algorithm with a bag of acceleration techniques to efficiently search over the combinatorial space of algorithms defined by this language. The learned novel CFR algorithm can generalize to new IIGs not seen during training and performs on par with or better than existing state-of-the-art CFR variants. The code is available at https://github.com/rpSebastian/AutoCFR.

ICML Conference 2022 Conference Paper

BabelTower: Learning to Auto-parallelized Program Translation

  • Yuanbo Wen 0001
  • Qi Guo 0001
  • Qiang Fu
  • Xiaqing Li
  • Jianxing Xu
  • Yanlin Tang
  • Yongwei Zhao 0001
  • Xing Hu 0001

GPUs have become the dominant computing platforms for many applications, while programming GPUs with the widely-used CUDA parallel programming model is difficult. As sequential C code is relatively easy to obtain either from legacy repositories or by manual implementation, automatically translating C to its parallel CUDA counterpart is promising to relieve the burden of GPU programming. However, because of huge differences between the sequential C and the parallel CUDA programming models, existing approaches fail to conduct this challenging auto-parallelized program translation. In this paper, we propose a learning-based framework, i.e., BabelTower, to address this problem. We first create a large-scale dataset consisting of compute-intensive function-level monolingual corpora. We further propose using back-translation with a discriminative reranker to cope with unpaired corpora and parallel semantic conversion. Experimental results show that BabelTower outperforms the state of the art by 1.79, 6.09, and 9.39 in terms of BLEU, CodeBLEU, and the specifically designed ParaBLEU, respectively. The CUDA code generated by BabelTower attains a speedup of up to 347x over the sequential C code, and developer productivity is improved by at most 3.8x.

NeurIPS Conference 2022 Conference Paper

Honor of Kings Arena: an Environment for Generalization in Competitive Reinforcement Learning

  • Hua Wei
  • Jingxiao Chen
  • Xiyang Ji
  • Hongyang Qin
  • Minwen Deng
  • Siqin Li
  • Liang Wang
  • Weinan Zhang

This paper introduces Honor of Kings Arena, a reinforcement learning (RL) environment based on Honor of Kings, one of the world's most popular games at present. Compared to other environments studied in most previous work, ours presents new generalization challenges for competitive reinforcement learning. It is a multi-agent problem with one agent competing against its opponent, and it requires generalization ability as it has diverse targets to control and diverse opponents to compete with. We describe the observation, action, and reward specifications for the Honor of Kings domain and provide an open-source Python-based interface for communicating with the game engine. We provide twenty target heroes with a variety of tasks in Honor of Kings Arena and present initial baseline results for RL-based methods with feasible computing resources. Finally, we showcase the generalization challenges imposed by Honor of Kings Arena and possible remedies to the challenges. All of the software, including the environment class, is publicly available.
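
The released interface's actual API may differ, so the gym-style interaction loop below is only a hypothetical illustration of the control problem described (one agent competing against an opponent over an episode):

```python
# Hypothetical gym-style loop; `env` and `agent` interfaces are assumptions.
def run_episode(env, agent):
    obs = env.reset()
    done, total_reward = False, 0.0
    while not done:
        action = agent.act(obs)                    # pick an action for the hero
        obs, reward, done, info = env.step(action)  # advance the game engine
        total_reward += reward
    return total_reward
```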

IJCAI Conference 2022 Conference Paper

JueWu-MC: Playing Minecraft with Sample-efficient Hierarchical Reinforcement Learning

  • Zichuan Lin
  • Junyou Li
  • Jianing Shi
  • Deheng Ye
  • Qiang Fu
  • Wei Yang

Learning rational behaviors in open-world games like Minecraft remains challenging for Reinforcement Learning (RL) research due to the compound challenge of partial observability, high-dimensional visual perception, and delayed reward. To address this, we propose JueWu-MC, a sample-efficient hierarchical RL approach equipped with representation learning and imitation learning to deal with perception and exploration. Specifically, our approach includes two levels of hierarchy, where the high-level controller learns a policy over options and the low-level workers learn to solve each sub-task. To boost the learning of sub-tasks, we propose a combination of techniques including 1) action-aware representation learning which captures underlying relations between action and representation, 2) discriminator-based self-imitation learning for efficient exploration, and 3) ensemble behavior cloning with consistency filtering for policy robustness. Extensive experiments show that JueWu-MC significantly improves sample efficiency and outperforms a set of baselines by a large margin. Notably, we won the championship of the NeurIPS MineRL 2021 research competition and achieved the highest performance score ever.

NeurIPS Conference 2022 Conference Paper

Neuron with Steady Response Leads to Better Generalization

  • Qiang Fu
  • Lun Du
  • Haitao Mao
  • Xu Chen
  • Wei Fang
  • Shi Han
  • Dongmei Zhang

Regularization can mitigate the generalization gap between training and inference by introducing inductive bias. Existing works have already proposed various inductive biases from diverse perspectives. However, none of them explores inductive bias from the perspective of class-dependent response distribution of individual neurons. In this paper, we conduct a substantial analysis of the characteristics of such distribution. Based on the analysis results, we articulate the Neuron Steadiness Hypothesis: the neuron with similar responses to instances of the same class leads to better generalization. Accordingly, we propose a new regularization method called Neuron Steadiness Regularization (NSR) to reduce neuron intra-class response variance. Based on the Complexity Measure, we theoretically guarantee the effectiveness of NSR for improving generalization. We conduct extensive experiments on Multilayer Perceptron, Convolutional Neural Networks, and Graph Neural Networks with popular benchmark datasets of diverse domains, which show that our Neuron Steadiness Regularization consistently outperforms the vanilla version of models with significant gain and low additional computational overhead.
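
A minimal sketch of the regularizer the hypothesis motivates: for a chosen layer, penalize each neuron's response variance computed within each class of a batch. The layer choice and loss weighting are assumptions for illustration.

```python
# Sketch: intra-class response-variance penalty for one layer's activations.
import torch

def neuron_steadiness_penalty(activations: torch.Tensor,
                              labels: torch.Tensor) -> torch.Tensor:
    """activations: (batch, n_neurons); labels: (batch,). Returns a scalar."""
    penalty = activations.new_zeros(())
    for c in labels.unique():
        group = activations[labels == c]          # responses within one class
        if group.shape[0] > 1:
            penalty = penalty + group.var(dim=0, unbiased=False).mean()
    return penalty

# Used as: loss = task_loss + lambda_nsr * neuron_steadiness_penalty(h, y)
```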

IJCAI Conference 2021 Conference Paper

Boosting Offline Reinforcement Learning with Residual Generative Modeling

  • Hua Wei
  • Deheng Ye
  • Zhao Liu
  • Hao Wu
  • Bo Yuan
  • Qiang Fu
  • Wei Yang
  • Zhenhui Li

Offline reinforcement learning (RL) tries to learn a near-optimal policy from recorded offline experience without online exploration. Current offline RL research includes: 1) generative modeling, i.e., approximating a policy using fixed data; and 2) learning the state-action value function. While most research focuses on the state-action function part through reducing the bootstrapping error in value function approximation induced by the distribution shift of training data, the effects of error propagation in generative modeling have been neglected. In this paper, we analyze the error in generative modeling. We propose AQL (action-conditioned Q-learning), a residual generative model to reduce policy approximation error for offline RL. We show that our method can learn more accurate policy approximations on different benchmark datasets. In addition, we show that the proposed offline RL method can learn more competitive AI agents in complex control tasks in the multiplayer online battle arena (MOBA) game Honor of Kings.

IJCAI Conference 2021 Conference Paper

Combining Tree Search and Action Prediction for State-of-the-Art Performance in DouDiZhu

  • Yunsheng Zhang
  • Dong Yan
  • Bei Shi
  • Haobo Fu
  • Qiang Fu
  • Hang Su
  • Jun Zhu
  • Ning Chen

AlphaZero has achieved superhuman performance on various perfect-information games, such as chess, shogi, and Go. However, directly applying AlphaZero to imperfect-information games (IIG) is infeasible, due to the fact that traditional MCTS methods cannot handle missing information of other players. Meanwhile, there have been several extensions of MCTS for IIGs, by implicitly or explicitly sampling a state of other players. But, due to the inability to handle private and public information well, the performance of these methods is not satisfactory. In this paper, we extend AlphaZero to multiplayer IIGs by developing a new MCTS method, Action-Prediction MCTS (AP-MCTS). In contrast to traditional MCTS extensions for IIGs, AP-MCTS first builds the search tree based on public information, adopts the policy-value network to generalize between hidden states, and finally predicts other players' actions directly. This design bypasses the inefficiency of sampling and the difficulty of predicting the state of other players. We conduct extensive experiments on the popular 3-player poker game DouDiZhu to evaluate the performance of AP-MCTS within the AlphaZero framework. When playing against experienced human players, AP-MCTS achieved a 65.65% winning rate, almost twice the human players' winning rate. Compared with state-of-the-art DouDiZhu AIs, the Elo rating of AP-MCTS is 50 to 200 points higher. The ablation study shows that accurate action prediction is the key to AP-MCTS winning.

NeurIPS Conference 2021 Conference Paper

Learning Diverse Policies in MOBA Games via Macro-Goals

  • Yiming Gao
  • Bei Shi
  • Xueying Du
  • Liang Wang
  • Guangwei Chen
  • Zhenjie Lian
  • Fuhao Qiu
  • Guoan Han

Recently, many researchers have made successful progress in building the AI systems for MOBA-game-playing with deep reinforcement learning, such as on Dota 2 and Honor of Kings. Even though these AI systems have achieved or even exceeded human-level performance, they still suffer from the lack of policy diversity. In this paper, we propose a novel Macro-Goals Guided framework, called MGG, to learn diverse policies in MOBA games. MGG abstracts strategies as macro-goals from human demonstrations and trains a Meta-Controller to predict these macro-goals. To enhance policy diversity, MGG samples macro-goals from the Meta-Controller prediction and guides the training process towards these goals. Experimental results on the typical MOBA game Honor of Kings demonstrate that MGG can execute diverse policies in different matches and lineups, and also outperform the state-of-the-art methods over 102 heroes.

IJCAI Conference 2021 Conference Paper

MapGo: Model-Assisted Policy Optimization for Goal-Oriented Tasks

  • Menghui Zhu
  • Minghuan Liu
  • Jian Shen
  • Zhicheng Zhang
  • Sheng Chen
  • Weinan Zhang
  • Deheng Ye
  • Yong Yu

In goal-oriented reinforcement learning, relabeling the raw goals in past experience to provide agents with hindsight ability is a major solution to the reward sparsity problem. In this paper, to enhance the diversity of relabeled goals, we develop FGI (Foresight Goal Inference), a new relabeling strategy that relabels the goals by looking into the future with a learned dynamics model. Besides, to improve sample efficiency, we propose to use the dynamics model to generate simulated trajectories for policy training. By integrating these two improvements, we introduce the MapGo framework (Model-Assisted Policy optimization for Goal-oriented tasks). In our experiments, we first show the effectiveness of the FGI strategy compared with the hindsight one, and then show that the MapGo framework achieves higher sample efficiency when compared to model-free baselines on a set of complicated tasks.
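
A hedged sketch of FGI relabeling: instead of picking a goal achieved later in the same trajectory (hindsight), roll the learned dynamics model forward and relabel with the goal the imagined future state achieves. All names and the horizon are illustrative assumptions.

```python
# Sketch: foresight relabeling via an imagined model rollout.
def foresight_relabel(transition, dynamics_model, policy, achieved_goal_fn,
                      horizon: int = 5):
    """`transition` is assumed to carry .next_state and .goal attributes."""
    state = transition.next_state
    for _ in range(horizon):                      # imagined rollout
        action = policy(state, transition.goal)
        state = dynamics_model(state, action)
    transition.goal = achieved_goal_fn(state)     # relabel with foreseen goal
    return transition
```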

AAAI Conference 2020 Conference Paper

Mastering Complex Control in MOBA Games with Deep Reinforcement Learning

  • Deheng Ye
  • Zhao Liu
  • Mingfei Sun
  • Bei Shi
  • Peilin Zhao
  • Hao Wu
  • Hongsheng Yu
  • Shaojie Yang

We study the reinforcement learning problem of complex action control in Multi-player Online Battle Arena (MOBA) 1v1 games. This problem involves far more complicated state and action spaces than those of traditional 1v1 games, such as Go and Atari series, which makes it very difficult to find any policy with human-level performance. In this paper, we present a deep reinforcement learning framework to tackle this problem from the perspectives of both system and algorithm. Our system is of low coupling and high scalability, which enables efficient explorations at large scale. Our algorithm includes several novel strategies, including control dependency decoupling, action mask, target attention, and dual-clip PPO, with which our proposed actor-critic network can be effectively trained in our system. Tested on the MOBA game Honor of Kings, the trained AI agents can defeat top professional human players in full 1v1 games.
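
Of the listed strategies, dual-clip PPO is the most self-contained: for negative advantages, a second clip at c * A floors the objective so very large importance ratios cannot blow it up. A minimal sketch of the per-sample objective (to be maximized), with illustrative hyperparameter values, follows.

```python
# Sketch of the dual-clip PPO objective.
import torch

def dual_clip_ppo_objective(ratio: torch.Tensor, advantage: torch.Tensor,
                            eps: float = 0.2, c: float = 3.0) -> torch.Tensor:
    """ratio = pi_new(a|s) / pi_old(a|s); advantage = A(s, a)."""
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps)
    standard = torch.min(ratio * advantage, clipped * advantage)  # vanilla PPO
    dual = torch.max(standard, c * advantage)      # extra floor when A < 0
    return torch.where(advantage < 0, dual, standard).mean()
```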

NeurIPS Conference 2020 Conference Paper

Towards Playing Full MOBA Games with Deep Reinforcement Learning

  • Deheng Ye
  • Guibin Chen
  • Wen Zhang
  • Sheng Chen
  • Bo Yuan
  • Bo Liu
  • Jia Chen
  • Zhao Liu

MOBA games, e.g., Honor of Kings, League of Legends, and Dota 2, pose grand challenges to AI systems, such as multi-agent coordination, enormous state-action spaces, and complex action control. Developing AI for playing MOBA games has raised much attention accordingly. However, existing work falls short in handling the raw game complexity caused by the explosion of agent combinations, i.e., lineups, when expanding the hero pool; OpenAI's Dota AI, for instance, limits play to a pool of only 17 heroes. As a result, full MOBA games without restrictions are far from being mastered by any existing AI system. In this paper, we propose a MOBA AI learning paradigm that methodologically enables playing full MOBA games with deep reinforcement learning. Specifically, we develop a combination of novel and existing learning techniques, including off-policy adaption, multi-head value estimation, curriculum self-play learning, policy distillation, and Monte Carlo tree search, to train and play a large pool of heroes while skillfully addressing the scalability issue. Tested on Honor of Kings, a popular MOBA game, we show how to build superhuman AI agents that can defeat top esports players. The superiority of our AI is demonstrated by the first large-scale performance test of a MOBA AI agent in the literature.

IJCAI Conference 2018 Conference Paper

Complementary Binary Quantization for Joint Multiple Indexing

  • Qiang Fu
  • Xu Han
  • Xianglong Liu
  • Jingkuan Song
  • Cheng Deng

Building multiple hash tables has been proven a successful technique for indexing massive databases, which can guarantee a desired level of overall performance. However, existing hash-based multi-indexing methods suffer from heavy redundancy, without strong table complementarity and effective hash code learning. To address these problems, this paper proposes a complementary binary quantization (CBQ) method to jointly learn multiple hash tables. It exploits the power of incomplete binary coding based on prototypes to align the original space and the Hamming space, and further utilizes the nature of multi-indexing search to jointly reduce the quantization loss based on the prototype-based hash function. Our alternating optimization adaptively discovers the complementary prototype sets and the corresponding code sets of a varying size in an efficient way, which together robustly approximate the data relations. Our method can be naturally generalized to the product space for long hash codes. Extensive experiments carried out on two popular large-scale tasks, including Euclidean and semantic nearest neighbor search, demonstrate that the proposed CBQ method enjoys strong table complementarity and significantly outperforms the state of the art, with relative performance gains of up to 57.76%.

JBHI Journal 2017 Journal Article

Validation of an Adaptive Transfer Function Method to Estimate the Aortic Pressure Waveform

  • Yang Yao
  • Lisheng Xu
  • Yingxian Sun
  • Qiang Fu
  • Shuran Zhou
  • Dianning He
  • Yahui Zhang
  • Liang Guo

Aortic pulse wave reflects cardiovascular status but, unlike the peripheral pulse wave, is difficult to measure reliably using noninvasive techniques. Thus, estimating the aortic pulse wave from peripheral ones is of great significance. This study proposed an adaptive transfer function (ATF) method to estimate the aortic pulse wave from the brachial pulse wave. Aortic and brachial pulse waves were derived from 26 patients who underwent cardiac catheterization. Generalized transfer functions (GTF) were derived based on the autoregressive exogenous model. The GTF was then adapted by its peak resonance frequency, and the optimal peak resonance frequency for an individual was determined by regression formulas using brachial systolic blood pressure. The method was validated using leave-one-out cross validation. Compared with previous studies, the ATF method showed better performance in estimating the aortic pulse wave and predicting the feature parameters. The prediction errors of the aortic systolic blood pressure and pulse pressure were 0.2 ± 3.1 and -0.9 ± 3.1 mmHg, respectively. The percentage errors of augmentation index, percentage notch amplitude, and ejection duration were -2.1 ± 32.7%, 12.4 ± 9.2%, and -2.4 ± 3.3%, respectively.