Arrow Research search

Author name cluster

Kun Kuang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

30 papers
1 author row

Possible papers

30

AAAI Conference 2026 Conference Paper

Detecting Unobserved Confounders: A Kernelized Regression Approach

  • Yikai Chen
  • Yunxin Mao
  • Chunyuan Zheng
  • Hao Zou
  • Shanzhi Gu
  • Shixuan Liu
  • Yang Shi
  • Wenjing Yang

Detecting unobserved confounders is crucial for reliable causal inference in observational studies. Existing methods require either linearity assumptions or multiple heterogeneous environments, limiting applicability to nonlinear single-environment settings. To bridge this gap, we propose Kernel Regression Confounder Detection (KRCD), a novel method for detecting unobserved confounding in nonlinear observational data under single-environment conditions. KRCD leverages reproducing kernel Hilbert spaces to model complex dependencies. By comparing standard and higher-order kernel regressions, we derive a test statistic whose significant deviation from zero indicates unobserved confounding. Theoretically, we prove two key results: First, in infinite samples, regression coefficients coincide if and only if no unobserved confounders exist. Second, finite-sample differences converge to zero-mean Gaussian distributions with tractable variance. Extensive experiments on synthetic benchmarks and the Twins dataset demonstrate that KRCD not only outperforms existing baselines but also achieves superior computational efficiency.
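The abstract's core idea, that regressions built from different-order transformations of the data agree exactly when no unobserved confounder is present, can be illustrated with a deliberately simplified moment-based sketch. This is not the paper's RKHS construction or its formal test; the exponential confounder and all variable names are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, b = 200_000, 1.5

def moment_gap(x, y):
    # Compare the ordinary regression ratio E[xy]/E[x^2] with a
    # higher-order analogue E[x^3 y]/E[x^4].  If the regression
    # residual is independent of x, both ratios estimate the same
    # coefficient; with non-Gaussian data, an unobserved confounder
    # generically drives them apart.
    r1 = np.mean(x * y) / np.mean(x ** 2)
    r2 = np.mean(x ** 3 * y) / np.mean(x ** 4)
    return r2 - r1

u = rng.exponential(1.0, n) - 1.0   # skewed hidden confounder
nx = rng.normal(size=n)

# Confounded case: u affects both x and y.
x_c = u + nx
y_c = b * x_c + u + rng.normal(size=n)

# Unconfounded case: same x distribution, independent noise on y.
x_u = u + nx
y_u = b * x_u + rng.normal(size=n)

gap_conf = moment_gap(x_c, y_c)    # clearly nonzero
gap_clean = moment_gap(x_u, y_u)   # close to zero
```

The gap being statistically distinguishable from zero signals confounding, mirroring (in a toy way) the role of KRCD's test statistic.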

AAAI Conference 2026 Conference Paper

P2S: Probabilistic Process Supervision for General-Domain Reasoning Question Answering

  • Wenlin Zhong
  • Chengyuan Liu
  • Yiquan Wu
  • Bovin Tan
  • Changlong Sun
  • Yi Wang
  • Xiaozhong Liu
  • Kun Kuang

While reinforcement learning with verifiable rewards (RLVR) has advanced LLM reasoning in structured domains like mathematics and programming, its application to general-domain reasoning tasks remains challenging due to the absence of verifiable reward signals. To this end, methods like Reinforcement Learning with Reference Probability Reward (RLPR) have emerged, leveraging the probability of generating the final answer as a reward signal. However, these outcome-focused approaches neglect crucial step-by-step supervision of the reasoning process itself. To address this gap, we introduce Probabilistic Process Supervision (P2S), a novel self-supervision framework that provides fine-grained process rewards without requiring a separate reward model or human-annotated reasoning steps. During reinforcement learning, P2S synthesizes and filters a high-quality reference reasoning chain (gold-CoT). The core of our method is to calculate a Path Faithfulness Reward (PFR) for each reasoning step, which is derived from the conditional probability of generating the gold-CoT's suffix, given the model's current reasoning prefix. Crucially, this PFR can be flexibly integrated with any outcome-based reward, directly tackling the reward sparsity problem by providing dense guidance. Extensive experiments on reading comprehension and medical Question Answering benchmarks show that P2S significantly outperforms strong baselines.
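As a rough sketch of the Path Faithfulness Reward idea, one can score a reasoning prefix by the length-normalized conditional probability the model assigns to the gold-CoT suffix. The lookup table below stands in for the policy model's next-token log-probabilities; all tokens and numbers are invented for illustration, not taken from the paper:

```python
import math

# Stand-in for a policy model's next-token log-probabilities.
# A real PFR would query the LLM; this table is purely illustrative.
LOGPROB = {
    ("q", "s1"): math.log(0.9),
    ("s1", "s2"): math.log(0.8),
    ("s2", "ans"): math.log(0.7),
    ("bad", "s2"): math.log(0.05),
}

def path_faithfulness_reward(prefix, gold_suffix):
    """Geometric-mean conditional probability of generating the
    gold-CoT suffix given the model's current reasoning prefix."""
    ctx, total = prefix[-1], 0.0
    for tok in gold_suffix:
        total += LOGPROB.get((ctx, tok), math.log(1e-6))
        ctx = tok
    return math.exp(total / len(gold_suffix))

faithful = path_faithfulness_reward(["q", "s1"], ["s2", "ans"])
drifted = path_faithfulness_reward(["q", "bad"], ["s2", "ans"])
```

A prefix that stays on the gold path scores higher than one that has drifted, giving a dense per-step signal that can be mixed with any outcome-based reward.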

AAAI Conference 2026 Conference Paper

Think Then Rewrite: Reasoning Enhanced Query Rewriting for Domain Specific Retrieval

  • Ang Li
  • Yufei Shi
  • Yuxuan Si
  • Yiquan Wu
  • Ming Cai
  • Xu Tan
  • Yi Wang
  • Changlong Sun

Query rewriting is a crucial task for improving retrieval, especially in professional domains such as law and medicine, where user queries are often underspecified and ambiguous. While large language models (LLMs) offer strong understanding and generation capabilities, existing LLM-based approaches reduce the task to text transformation or expansion, neglecting the reasoning needed to disambiguate queries, and thus fail to bridge the cognitive gap between user queries and specialized documents. In this paper, we propose Think-Then-Rewrite (TTR), a reinforcement-learning-based framework that unleashes LLMs' reasoning ability for domain-specific query rewriting. TTR introduces a contrastive mutual information reward to encourage the LLM to generate reasoning processes that effectively distinguish confusing distractors. To boost early-stage training, TTR also constructs golden query rewrites as off-policy data, providing strong guidance for RL learning. A mixed-policy optimization then combines on-policy and off-policy signals, ensuring both effectiveness and stability. Extensive experiments on legal and medical retrieval benchmarks demonstrate that TTR achieves state-of-the-art performance.

AAAI Conference 2025 Conference Paper

FedCFA: Alleviating Simpson’s Paradox in Model Aggregation with Counterfactual Federated Learning

  • Zhonghua Jiang
  • Jimin Xu
  • Shengyu Zhang
  • Tao Shen
  • Jiwei Li
  • Kun Kuang
  • Haibin Cai
  • Fei Wu

Federated learning (FL) is a promising technology for data privacy and distributed optimization, but it suffers from data imbalance and heterogeneity among clients. Existing FL methods try to solve these problems by aligning the client model with the server model or by correcting the client model with control variables. These methods excel on IID and general non-IID data but perform mediocrely in Simpson's Paradox scenarios. Simpson's Paradox refers to the phenomenon that a trend observed on the global dataset disappears or reverses on a subset, which may mean that the global model obtained through aggregation in FL does not accurately reflect the distribution of the global data. Thus, we propose FedCFA, a novel FL framework employing counterfactual learning to generate counterfactual samples by replacing critical factors of local data with global average data, aligning local data distributions with the global distribution and mitigating Simpson's Paradox effects. In addition, to improve the quality of the counterfactual samples, we introduce a factor decorrelation (FDC) loss to reduce the correlation among features and thus improve the independence of extracted factors. We conduct extensive experiments on six datasets and verify that our method outperforms other FL methods in terms of efficiency and global model accuracy under limited communication rounds.
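The abstract describes the FDC loss only as reducing correlation among extracted factors. One common way to write such a penalty, an assumption here rather than the paper's exact form, is the mean squared off-diagonal entry of the factors' correlation matrix:

```python
import numpy as np

def fdc_loss(z):
    """Decorrelation penalty: mean squared off-diagonal entry of the
    correlation matrix of a factor matrix z with shape (n, d)."""
    zc = z - z.mean(axis=0)
    cov = zc.T @ zc / (len(z) - 1)
    std = np.sqrt(np.diag(cov))
    corr = cov / np.outer(std, std)
    d = len(corr)
    off = corr - np.eye(d)
    return float((off ** 2).sum() / (d * (d - 1)))

rng = np.random.default_rng(0)
a = rng.normal(size=(1000, 1))
# Two nearly duplicated factors plus one independent factor:
z_corr = np.hstack([a, a + 0.1 * rng.normal(size=(1000, 1)),
                    rng.normal(size=(1000, 1))])
z_ind = rng.normal(size=(1000, 3))   # independent factors
```

Minimizing such a term pushes the extracted factors toward pairwise decorrelation, which is the stated goal of FDC.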

AAAI Conference 2025 Conference Paper

Learning Causal Transition Matrix for Instance-dependent Label Noise

  • Jiahui Li
  • Tai-Wei Chang
  • Kun Kuang
  • Ximing Li
  • Long Chen
  • Jun Zhou

Noisy labels are both inevitable and problematic in machine learning methods, as they negatively impact models' generalization ability by causing overfitting. In the context of learning with noise, the transition matrix plays a crucial role in the design of statistically consistent algorithms. However, the transition matrix is often considered unidentifiable. One strand of methods typically addresses this problem by assuming that the transition matrix is instance-independent; that is, the probability of mislabeling a particular instance is not influenced by its characteristics or attributes. This assumption is clearly invalid in complex real-world scenarios. To better understand the transition relationship and relax this assumption, we propose to study the data generation process of noisy labels from a causal perspective. We discover that an unobservable latent variable can affect either the instance itself, the label annotation procedure, or both, which complicates the identification of the transition matrix. To address various scenarios, we have unified these observations within a new causal graph. In this graph, the input instance is divided into a noise-resistant component and a noise-sensitive component based on whether they are affected by the latent variable. These two components contribute to identifying the “causal transition matrix”, which approximates the true transition matrix with theoretical guarantee. In line with this, we have designed a novel training framework that explicitly models this causal relationship and, as a result, achieves a more accurate model for inferring the clean label.

AAAI Conference 2025 Conference Paper

MergeNet: Knowledge Migration Across Heterogeneous Models, Tasks, and Modalities

  • Kunxi Li
  • Tianyu Zhan
  • Kairui Fu
  • Shengyu Zhang
  • Kun Kuang
  • Jiwei Li
  • Zhou Zhao
  • Fan Wu

In this study, we focus on heterogeneous knowledge transfer across entirely different model architectures, tasks, and modalities. Existing knowledge transfer methods (e.g., backbone sharing, knowledge distillation) often hinge on shared elements within model structures or task-specific features/labels, limiting transfers to complex model types or tasks. To overcome these challenges, we present MergeNet, which learns to bridge the gap of parameter spaces of heterogeneous models, facilitating the direct interaction, extraction, and application of knowledge within these parameter spaces. The core mechanism of MergeNet lies in the parameter adapter, which operates by querying the source model's low-rank parameters and adeptly learning to identify and map parameters into the target model. MergeNet is learned alongside both models, allowing our framework to dynamically transfer and adapt knowledge relevant to the current stage, including the training trajectory knowledge of the source model. Extensive experiments on heterogeneous knowledge transfer demonstrate significant improvements in challenging settings, where representative approaches may falter or prove less applicable.

NeurIPS Conference 2025 Conference Paper

Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging

  • Jinluan Yang
  • Dingnan Jin
  • Anke Tang
  • Li Shen
  • Didi Zhu
  • Zhengyu Chen
  • Ziyu Zhao
  • Daixin Wang

Achieving balanced alignment of large language models (LLMs) in terms of Helpfulness, Honesty, and Harmlessness (3H optimization) constitutes a cornerstone of responsible AI. Existing methods like data mixture strategies face limitations, including heavy reliance on expert knowledge and conflicting optimization signals. While model merging offers parameter-level conflict-resolution strategies by integrating specialized models' parameters, its potential for 3H optimization remains underexplored. This paper systematically compares the effectiveness of model merging and data mixture methods in constructing 3H-aligned LLMs for the first time, revealing previously overlooked collaborative and conflicting relationships among the 3H dimensions, and discusses the advantages and drawbacks of data mixture (data-level) and model merging (parameter-level) methods in mitigating the conflicts for balanced 3H optimization. Specifically, we propose a novel Reweighting Enhanced task Singular Merging method (RESM), which uses outlier weighting and sparsity-aware rank selection strategies to address the challenges of preference-noise accumulation and layer-sparsity adaptation inherent in 3H-aligned LLM merging. Extensive evaluations verify the effectiveness and robustness of RESM over previous data mixture (2%-5% gain) and model merging (1%-3% gain) methods in achieving balanced LLM alignment.

NeurIPS Conference 2025 Conference Paper

MS-Bench: Evaluating LMMs in Ancient Manuscript Study through a Dunhuang Case Study

  • Yuqing Zhang
  • Yue Han
  • Shuanghe Zhu
  • Haoxiang Wu
  • Hangqi Li
  • Shengyu Zhang
  • Junchi Yan
  • Zemin Liu

Analyzing ancient manuscripts has traditionally been a labor-intensive and time-consuming task for philologists. While recent advancements in LMMs have demonstrated their potential across diverse domains, their effectiveness in manuscript study remains underexplored. In this paper, we introduce MS-Bench, the first comprehensive benchmark co-developed with archaeologists, comprising 5,076 high-resolution images from the 4th to the 14th century and 9,982 expert-curated questions across nine sub-tasks aligned with archaeological workflows. Through four prompting strategies, we systematically evaluate 32 LMMs on their effectiveness, robustness, and cultural contextualization. Our analysis reveals scale-driven improvements in performance and reliability, the impact of prompting strategies (CoT has a two-sided effect, while visual retrieval-augmented prompts provide a consistent boost), and task-specific preferences depending on an LMM's visual capabilities. Although current LMMs are not yet capable of replacing domain expertise, they demonstrate promising potential to accelerate manuscript research through future human-AI collaboration.

AAAI Conference 2025 Conference Paper

Optimize Incompatible Parameters Through Compatibility-aware Knowledge Integration

  • Zheqi Lv
  • Keming Ye
  • Zishu Wei
  • Qi Tian
  • Shengyu Zhang
  • Wenqiao Zhang
  • Wenjie Wang
  • Kun Kuang

Deep neural networks have become foundational to advancements in multiple domains, including recommendation systems and natural language processing. Despite their successes, these models often contain incompatible parameters that can be underutilized or detrimental to model performance, particularly when faced with specific, varying data distributions. Existing research excels at removing such parameters or at merging the outputs of multiple different pretrained models. However, the former focuses on efficiency rather than performance, while the latter requires several times more computing and storage resources to support inference. In this paper, we set the goal of explicitly improving these incompatible parameters by leveraging the complementary strengths of different models, thereby directly enhancing the models without any additional parameters. Specifically, we propose Compatibility-aware Knowledge Integration (CKI), which consists of two components: Parameter Compatibility Assessment, which evaluates the knowledge content of multiple models, and Parameter Splicing, which integrates that knowledge into one model. The integrated model can be used directly for inference or for further fine-tuning. Extensive experiments on various recommendation and language datasets show that CKI can effectively optimize incompatible parameters under multiple tasks and settings to break through the training limit of the original model without increasing the inference cost.
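Parameter Splicing can be pictured as an element-wise selection between two models' weights, driven by whatever compatibility scores the assessment step produces. The sketch below is a minimal stand-in under that reading; in the paper the assessment is part of the method, whereas here the scores are simply given as inputs:

```python
import numpy as np

def splice_parameters(theta_a, theta_b, score_a, score_b):
    """Keep, element-wise, the parameter from whichever model the
    compatibility assessment favors; scores are supplied externally."""
    return np.where(score_a >= score_b, theta_a, theta_b)

theta_a = np.array([0.5, -1.2, 3.0])
theta_b = np.array([0.4, 0.8, 2.9])
score_a = np.array([0.9, 0.1, 0.6])   # hypothetical compatibility scores
score_b = np.array([0.2, 0.7, 0.5])
spliced = splice_parameters(theta_a, theta_b, score_a, score_b)
```

The spliced vector mixes both sources without adding any parameters, matching the abstract's claim of no extra inference cost.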

AILAW Journal 2025 Journal Article

Specialized or general AI? a comparative evaluation of LLMs’ performance in legal tasks

  • Xue Guo
  • Yuting Huang
  • Bin Wei
  • Kun Kuang
  • Yiquan Wu
  • Leilei Gan
  • Xianshan Huang
  • Xianglin Dong

The rise of large language models (LLMs) such as ChatGPT and GPT-4, developed by OpenAI, has generated significant interest in the legal domain due to their sophisticated language processing capabilities. In particular, regions like China are vigorously developing legal-specific LLMs for legal purposes. Fine-tuned with fewer parameters on judicial documents and Chinese case datasets, these specialized LLMs are widely expected to meet practical needs in the judicial field more effectively. However, the ability of these law-specific LLMs to perform legal tasks, and their potential to outperform general LLMs, has not yet been established. To fill this research gap, we systematically evaluate a range of general and legal-specific LLMs on various legal tasks. The results show that GPT-4 maintains superior performance on most legal tasks, although legal-specific LLMs show superior performance in specific cases. This study provides insight into the factors leading to these results, hoping to enrich the discourse on the use of LLMs in the legal field.

AAAI Conference 2024 Conference Paper

CGMGM: A Cross-Gaussian Mixture Generative Model for Few-Shot Semantic Segmentation

  • Junao Shen
  • Kun Kuang
  • Jiaheng Wang
  • Xinyu Wang
  • Tian Feng
  • Wei Zhang

Few-shot semantic segmentation (FSS) aims to segment unseen objects in a query image using a few pixel-wise annotated support images, thus expanding the capabilities of semantic segmentation. The main challenge lies in extracting sufficient information from the limited support images to guide the segmentation process. Conventional methods typically address this problem by generating single or multiple prototypes from the support images and calculating their cosine similarity to the query image. However, these methods often fail to capture meaningful information for modeling the de facto joint distribution of pixel and category. Consequently, they result in incomplete segmentation of foreground objects and mis-segmentation of the complex background. To overcome this issue, we propose the Cross Gaussian Mixture Generative Model (CGMGM), a novel Gaussian Mixture Model (GMM)-based FSS method, which establishes the joint distribution of pixel and category in both the support and query images. Specifically, our method initially matches the feature representations of the query image with those of the support images to generate and refine an initial segmentation mask. It then employs GMMs to accurately model the joint distribution of foreground and background using the support masks and the initial segmentation mask. Subsequently, a parametric decoder applies Bayes' theorem to the joint distribution to obtain the posterior probability of each pixel in the query image and generates the final segmentation mask. Experimental results on the PASCAL-5i and COCO-20i datasets demonstrate CGMGM's effectiveness and superior performance compared to state-of-the-art methods.

AAAI Conference 2024 Conference Paper

Contrastive Balancing Representation Learning for Heterogeneous Dose-Response Curves Estimation

  • Minqin Zhu
  • Anpeng Wu
  • Haoxuan Li
  • Ruoxuan Xiong
  • Bo Li
  • Xiaoqing Yang
  • Xuan Qin
  • Peng Zhen

Estimating individuals' potential responses to varying treatment doses is crucial for decision-making in areas such as precision medicine and management science. Most recent studies predict counterfactual outcomes by learning a covariate representation that is independent of the treatment variable. However, such independence constraints neglect much of the covariate information that is useful for counterfactual prediction, especially when the treatment variables are continuous. To tackle this issue, in this paper we first theoretically demonstrate the importance of balancing and prognostic representations for unbiased estimation of heterogeneous dose-response curves; that is, the learned representations are constrained to satisfy conditional independence between the covariates and both the treatment variables and the potential responses. Based on this, we propose a novel Contrastive balancing Representation learning Network using a partial distance measure, called CRNet, for estimating heterogeneous dose-response curves without losing the continuity of treatments. Extensive experiments on synthetic and real-world datasets demonstrate that our proposal significantly outperforms previous methods.

AAAI Conference 2024 Conference Paper

CoreRec: A Counterfactual Correlation Inference for Next Set Recommendation

  • Kexin Li
  • Chengjiang Long
  • Shengyu Zhang
  • Xudong Tang
  • Zhichao Zhai
  • Kun Kuang
  • Jun Xiao

Next set recommendation aims to predict the items that are likely to be bought in the next purchase. Central to this endeavor is the task of capturing intra-set and cross-set correlations among items. However, modeling cross-set correlations poses specific challenges. Primarily, these correlations are often implicit, and the prevailing approach of establishing an indiscriminate link across the entire set of objects neglects factors like purchase frequency and correlations between purchased items; such hastily formed connections across sets introduce substantial noise. Additionally, the preeminence of high-frequency items in numerous sets could potentially overshadow and distort correlation modeling for low-frequency items. We therefore devote ourselves to mitigating misleading inter-set correlations. With a fresh perspective rooted in causality, we delve into the question of whether correlations between a particular item and items from other sets should be relied upon for item representation learning and set prediction. Technically, we introduce the Counterfactual Correlation Inference framework for next set recommendation, denoted as CoreRec. This framework establishes a counterfactual scenario in which the recommendation model impedes cross-set correlations to generate intervened predictions. By contrasting these intervened predictions with the original ones, we gauge the causal impact of inter-set neighbors on set prediction, essentially assessing whether they contribute to spurious correlations. During testing, we introduce a post-trained switch module that selects between set-aware item representations derived from either the original or the counterfactual scenario. To validate our approach, we extensively experiment on three real-world datasets, affirming both the effectiveness of CoreRec and the cogency of our analytical approach.

AAAI Conference 2024 Conference Paper

De-biased Attention Supervision for Text Classification with Causality

  • Yiquan Wu
  • Yifei Liu
  • Ziyu Zhao
  • Weiming Lu
  • Yating Zhang
  • Changlong Sun
  • Fei Wu
  • Kun Kuang

In text classification models, while the unsupervised attention mechanism can enhance performance, it often produces attention distributions that are puzzling to humans, such as assigning high weight to seemingly insignificant conjunctions. Recently, numerous studies have explored Attention Supervision (AS) to guide the model toward more interpretable attention distributions. However, such AS can impact classification performance, especially in specialized domains. In this paper, we address this issue from a causality perspective. Firstly, we leverage the causal graph to reveal two biases in AS: 1) bias caused by the label distribution of the dataset; and 2) bias caused by words' differing occurrence ranges, in that some words occur across labels while others occur only under a particular label. We then propose a novel De-biased Attention Supervision (DAS) method to eliminate these biases with causal techniques. Specifically, we adopt backdoor adjustment for the label-caused bias and reduce the word-caused bias by subtracting the direct causal effect of the word. Through extensive experiments on two professional text classification datasets (medicine and law), we demonstrate that our method achieves improved classification accuracy along with more coherent attention distributions.
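The backdoor adjustment invoked for the label-caused bias has a standard discrete form, P(y | do(x)) = Σ_z P(y | x, z) P(z). A toy numeric example, with a binary confounder z playing the role of the biasing variable and every probability invented for illustration:

```python
# Toy backdoor adjustment: P(y=1 | do(x)) = sum_z P(y=1 | x, z) * P(z).
# All numbers below are made up for illustration.
P_z = {0: 0.7, 1: 0.3}            # marginal of the confounder
P_y1_given_xz = {                 # P(y=1 | x, z)
    (0, 0): 0.2, (0, 1): 0.6,
    (1, 0): 0.5, (1, 1): 0.9,
}

def p_y1_do_x(x):
    # Average the conditional over the confounder's own marginal,
    # rather than its distribution given x.
    return sum(P_y1_given_xz[(x, z)] * P_z[z] for z in P_z)

effect = p_y1_do_x(1) - p_y1_do_x(0)   # interventional contrast
```

Averaging over P(z) instead of P(z | x) is what removes the confounder's influence from the estimated effect.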

AAAI Conference 2024 Conference Paper

Learning to Reweight for Generalizable Graph Neural Network

  • Zhengyu Chen
  • Teng Xiao
  • Kun Kuang
  • Zheqi Lv
  • Min Zhang
  • Jinluan Yang
  • Chengqiang Lu
  • Hongxia Yang

Graph Neural Networks (GNNs) show promising results for graph tasks. However, existing GNNs' generalization ability degrades when there are distribution shifts between testing and training graph data. The fundamental reason for the severe degeneration is that most GNNs are designed under the i.i.d. hypothesis. In such a setting, GNNs tend to exploit subtle statistical correlations in the training set for predictions, even when a correlation is spurious. In this paper, we study the generalization ability of GNNs in Out-Of-Distribution (OOD) settings. To solve this problem, we propose Learning to Reweight for Generalizable Graph Neural Networks (L2R-GNN) to enhance generalization and achieve satisfactory performance on unseen testing graphs whose distributions differ from the training graphs. We propose a novel nonlinear graph decorrelation method, which substantially improves out-of-distribution generalization and compares favorably to previous methods in restraining the over-reduced effective sample size. The variables of the graph representation are clustered based on the stability of their correlations, and the graph decorrelation method learns weights to remove correlations between variables of different clusters rather than between any two variables. Besides, we introduce an effective stochastic algorithm based on bi-level optimization for the L2R-GNN framework, which enables simultaneously learning the optimal weights and GNN parameters and avoids the over-fitting issue. Experiments show that L2R-GNN greatly outperforms baselines on various graph prediction benchmarks under distribution shifts.

AAMAS Conference 2023 Conference Paper

Adaptive Value Decomposition with Greedy Marginal Contribution Computation for Cooperative Multi-Agent Reinforcement Learning

  • Shanqi Liu
  • Yujing Hu
  • Runze Wu
  • Dong Xing
  • Yu Xiong
  • Changjie Fan
  • Kun Kuang
  • Yong Liu

Real-world cooperation often requires intensive, simultaneous coordination among agents. This task has been extensively studied within the framework of cooperative multi-agent reinforcement learning (MARL), and value decomposition methods are among the cutting-edge solutions. However, traditional methods that learn the value function as a monotonic mixing of per-agent utilities cannot solve tasks with non-monotonic returns, which hinders their application in generic scenarios. Recent methods tackle this problem from the perspective of implicit credit assignment by learning value functions with complete expressiveness or by using additional structures to improve cooperation. However, they are either difficult to learn due to large joint action spaces or insufficient to capture the complicated interactions among agents that are essential to solving tasks with non-monotonic returns. Moreover, real-world applications usually require policies to be interpretable, but interpretability is limited in implicit credit assignment methods. To address these problems, we propose a novel explicit credit assignment method for the non-monotonic problem. Our method, Adaptive Value decomposition with Greedy Marginal contribution (AVGM), is based on an adaptive value decomposition that learns the cooperative value of a group of dynamically changing agents. We first illustrate that the proposed value decomposition can consider the complicated interactions among agents and is feasible to learn in large-scale scenarios. Then, our method uses a greedy marginal contribution computed from the value decomposition as an individual credit to incentivize agents to learn the optimal cooperative policy. We further extend the module with an action encoder to guarantee linear time complexity for computing the greedy marginal contribution. Experimental results demonstrate that our method achieves significant performance improvements in several non-monotonic domains. Besides, we showcase that our model maintains a good sense of interpretability and rationality, suggesting it can be applied to scenarios with more realistic demands.

NeurIPS Conference 2023 Conference Paper

HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception

  • Junkun Yuan
  • Xinyu Zhang
  • Hao Zhou
  • Jian Wang
  • Zhongwei Qiu
  • Zhiyin Shao
  • Shaofeng Zhang
  • Sifan Long

Model pre-training is essential in human-centric perception. In this paper, we first introduce masked image modeling (MIM) as a pre-training approach for this task. Upon revisiting the MIM training strategy, we reveal that human structure priors offer significant potential. Motivated by this insight, we further incorporate an intuitive human structure prior - human parts - into pre-training. Specifically, we employ this prior to guide the mask sampling process. Image patches, corresponding to human part regions, have high priority to be masked out. This encourages the model to concentrate more on body structure information during pre-training, yielding substantial benefits across a range of human-centric perception tasks. To further capture human characteristics, we propose a structure-invariant alignment loss that enforces different masked views, guided by the human part prior, to be closely aligned for the same image. We term the entire method as HAP. HAP simply uses a plain ViT as the encoder yet establishes new state-of-the-art performance on 11 human-centric benchmarks, and an on-par result on one dataset. For example, HAP achieves 78.1% mAP on MSMT17 for person re-identification, 86.54% mA on PA-100K for pedestrian attribute recognition, 78.2% AP on MS COCO for 2D pose estimation, and 56.0 PA-MPJPE on 3DPW for 3D pose and shape estimation.

AAAI Conference 2023 Conference Paper

Learning Chemical Rules of Retrosynthesis with Pre-training

  • Yinjie Jiang
  • Ying Wei
  • Fei Wu
  • Zhengxing Huang
  • Kun Kuang
  • Zhihua Wang

Retrosynthesis aided by artificial intelligence has been a very active and burgeoning area of research, given its critical role in drug discovery as well as materials science. Three categories of solutions, i.e., template-based, template-free, and semi-template methods, constitute the mainstream solutions to this problem. In this paper, we focus on template-free methods, which are known to be less bothered by the template generalization issue and the atom-mapping challenge. Among the several remaining problems with template-free methods, the failure to conform to chemical rules is pronounced. To address this issue, we seek a pre-training solution that endows the pre-trained model with encoded chemical rules. Concretely, we enforce the atom conservation rule via a molecule reconstruction pre-training task, and the reaction rule that dictates reaction centers via a reaction-type-guided contrastive pre-training task. In our empirical evaluation, the proposed pre-training solution substantially improves single-step retrosynthesis accuracies on three downstream datasets.

AAAI Conference 2023 Conference Paper

Learning from Good Trajectories in Offline Multi-Agent Reinforcement Learning

  • Qi Tian
  • Kun Kuang
  • Furui Liu
  • Baoxiang Wang

Offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets, an important step toward deploying multi-agent systems in real-world applications. However, in practice, each individual behavior policy that generates multi-agent joint trajectories usually performs at a different level; e.g., one agent may be a random policy while the other agents are medium policies. In cooperative games with a global reward, an agent learned by existing offline MARL often inherits this random policy, jeopardizing the utility of the entire team. In this paper, we investigate offline MARL with explicit consideration of the diversity of agent-wise trajectories and propose a novel framework called Shared Individual Trajectories (SIT) to address this problem. Specifically, an attention-based reward decomposition network assigns credit to each agent through a differentiable key-value memory mechanism in an offline manner. These decomposed credits are then used to reconstruct the joint offline datasets into prioritized experience replay with individual trajectories, after which agents can share their good trajectories and conservatively train their policies with a graph attention network (GAT) based critic. We evaluate our method in both discrete control (i.e., StarCraft II and the multi-agent particle environment) and continuous control (i.e., multi-agent MuJoCo). The results indicate that our method achieves significantly better results on complex and mixed offline multi-agent datasets, especially when the difference in data quality between individual trajectories is large.

AAAI Conference 2023 Conference Paper

Learning Instrumental Variable from Data Fusion for Treatment Effect Estimation

  • Anpeng Wu
  • Kun Kuang
  • Ruoxuan Xiong
  • Minqin Zhu
  • Yuxuan Liu
  • Bo Li
  • Furui Liu
  • Zhihua Wang

The advent of the big data era has brought new opportunities and challenges for estimating treatment effects under data fusion, i.e., from a mixed dataset collected from multiple sources (each source with an independent treatment assignment mechanism). Due to possibly omitted source labels and unmeasured confounders, traditional methods cannot estimate individual treatment assignment probabilities or infer treatment effects effectively. Therefore, we propose to reconstruct the source label and model it as a Group Instrumental Variable (GIV) to implement IV-based regression for treatment effect estimation. In this paper, we conceptualize this line of thought and develop a unified framework (Meta-EM) to (1) map the raw data into a representation space to construct linear mixed models for the assigned treatment variable; (2) estimate the distribution differences and model the GIV for the different treatment assignment mechanisms; and (3) adopt an alternating training strategy to iteratively optimize the representations and the joint distribution to model the GIV for IV regression. Empirical results demonstrate the advantages of our Meta-EM compared with state-of-the-art methods. The project page with the code and the supplementary materials is available at https://github.com/causal-machine-learning-lab/meta-em.

AILAW Journal 2023 Journal Article

LK-IB: a hybrid framework with legal knowledge injection for compulsory measure prediction

  • Xiang Zhou
  • Qi Liu
  • Yiquan Wu
  • Qiangchao Chen
  • Kun Kuang

The interpretability of AI is just as important as its performance. In the LegalAI field, there have been efforts to enhance the interpretability of models, but a trade-off between interpretability and prediction accuracy remains inevitable. In this paper, we introduce a novel framework called LK-IB for compulsory measure prediction (CMP), one of the critical tasks in LegalAI. LK-IB leverages legal knowledge and combines an interpretable model and a black-box model to balance interpretability and prediction performance. Specifically, LK-IB involves three steps: (1) inputting cases into the first module, where first-order logic (FOL) rules are used to make predictions and output them directly if possible; (2) sending cases to the second module if no FOL rule is applicable, where a case distributor categorizes them as either “simple” or “complex”; and (3) sending simple cases to an interpretable model with strong interpretability and complex cases to a black-box model with outstanding performance. Experimental results demonstrate that the LK-IB framework provides more interpretable and accurate predictions than other state-of-the-art models. Given that the majority of cases in LegalAI are simple, the idea of model combination has significant potential for practical applications.
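The three-step dispatch described in the abstract can be sketched as a small routing function. This is a minimal illustration with hard-coded stub rules and stub models (the rule set, the "simple/complex" heuristic, and the prediction labels below are all hypothetical placeholders; in LK-IB these components are learned or expert-curated):

```python
# Hypothetical sketch of the LK-IB three-step dispatch; all rules and
# models below are illustrative stubs, not the paper's actual components.

def fol_rules(case: str):
    # Step 1: first-order logic rules; return a prediction, or None if no rule fires.
    if "released on bail" in case:
        return "bail"
    return None

def is_simple(case: str) -> bool:
    # Step 2: case distributor labels a case "simple" or "complex" (stub heuristic).
    return len(case.split()) < 50

def interpretable_model(case: str) -> str:
    # Step 3a: interpretable model handles simple cases (stub prediction).
    return "detention"

def black_box_model(case: str) -> str:
    # Step 3b: black-box model handles complex cases (stub prediction).
    return "arrest"

def lk_ib_predict(case: str) -> str:
    pred = fol_rules(case)
    if pred is not None:
        return pred                       # rule fired: output directly
    if is_simple(case):
        return interpretable_model(case)  # simple case -> interpretable model
    return black_box_model(case)          # complex case -> black-box model
```

The design keeps the most trustworthy component first: a rule-based answer, when available, short-circuits both learned models, which is why the framework stays interpretable on the majority of (simple) cases.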

NeurIPS Conference 2023 Conference Paper

Two Heads are Better Than One: A Simple Exploration Framework for Efficient Multi-Agent Reinforcement Learning

  • Jiahui Li
  • Kun Kuang
  • Baoxiang Wang
  • Xingchen Li
  • Fei Wu
  • Jun Xiao
  • Long Chen

Exploration strategy plays an important role in reinforcement learning, especially in sparse-reward tasks. In cooperative multi-agent reinforcement learning (MARL), designing a suitable exploration strategy is much more challenging due to the large state space and the complex interactions among agents. Currently, mainstream exploration methods in MARL either focus on exploring unfamiliar states, which are large and sparse, or on measuring the interactions among agents at high computational cost. We find an interesting phenomenon: different kinds of exploration play different roles in different MARL scenarios, and choosing a suitable one is often more effective than designing an exquisite algorithm. In this paper, we propose an exploration method that incorporates CuriOsity-based and INfluence-based exploration (COIN), which is simple but effective in various situations. First, COIN measures the influence of each agent on the other agents based on mutual information theory and designs it as intrinsic rewards which are applied to each individual value function. Moreover, COIN computes curiosity-based intrinsic rewards via prediction errors, which are added to the extrinsic reward. To integrate the two kinds of intrinsic rewards, COIN utilizes a novel framework in which they complement each other and lead to sufficient and effective exploration on cooperative MARL tasks. We perform extensive experiments on different challenging benchmarks, and results across different scenarios show the superiority of our method.
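The reward-shaping structure described above, a curiosity bonus added to the shared extrinsic reward and per-agent influence bonuses applied to individual value functions, can be sketched in a few lines. The forward model, the influence estimates, and the coefficients `beta` and `alpha` below are illustrative stand-ins, not the paper's actual networks or mutual-information estimator:

```python
import numpy as np

# Toy sketch of COIN-style reward shaping; all models and coefficients
# here are hypothetical placeholders for illustration only.

def curiosity_reward(pred_next_state, true_next_state):
    # Curiosity: squared prediction error of a forward model.
    return float(np.sum((pred_next_state - true_next_state) ** 2))

def shaped_rewards(extrinsic, curiosity, influence, beta=0.1, alpha=0.1):
    # Curiosity bonus augments the shared extrinsic reward; influence
    # bonuses are kept per-agent, for the individual value functions.
    team_reward = extrinsic + beta * curiosity
    individual_bonuses = [alpha * inf for inf in influence]
    return team_reward, individual_bonuses

pred = np.array([0.5, 0.5])
true = np.array([1.0, 0.0])
cur = curiosity_reward(pred, true)  # 0.25 + 0.25 = 0.5
team, bonuses = shaped_rewards(1.0, cur, influence=[0.2, 0.0])
```

Keeping the two signals at different levels (team vs. individual) is what lets them complement each other rather than compete within a single scalar reward.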

AIIM Journal 2022 Journal Article

Attribute-aware interpretation learning for thyroid ultrasound diagnosis

  • Ming Kong
  • Qing Guo
  • Shuowen Zhou
  • Mengze Li
  • Kun Kuang
  • Zhengxing Huang
  • Fei Wu
  • Xiaohong Chen

Thyroid nodule diagnosis from ultrasound images is a critical computer-aided diagnosis task. Previous works tried to imitate the doctor's diagnostic logic by considering key attributes to improve diagnosis performance and explain the conclusion. However, their clinical feasibility remains ambiguous because they ignore the correlation between attribute features and global characteristics, and lack a clinical effectiveness evaluation of their result interpretations. Following the common logic of ultrasonic investigation, we design a novel Attribute-Aware Interpretation Learning (AAIL) model, consisting of an attribute property discovery module and an attribute-global feature fusion module. Adequate result interpretation ensures the reliability and transparency of diagnostic conclusions, including the visualization of attribute features and the relationship between attributes and the global feature. Extensive experiments on a practical dataset demonstrate the model's effectiveness, and an innovative human-computer collaborative experiment demonstrates the auxiliary diagnostic ability of the interpretations, which can benefit professional doctors.

NeurIPS Conference 2022 Conference Paper

ConfounderGAN: Protecting Image Data Privacy with Causal Confounder

  • Qi Tian
  • Kun Kuang
  • Kelu Jiang
  • Furui Liu
  • Zhihua Wang
  • Fei Wu

The success of deep learning is partly attributed to the availability of massive data downloaded freely from the Internet. However, it also means that users' private data may be collected by commercial organizations without consent and used to train their models. Therefore, it is important and necessary to develop a method or tool to prevent unauthorized data exploitation. In this paper, we propose ConfounderGAN, a generative adversarial network (GAN) that can make personal image data unlearnable to protect the data privacy of its owners. Specifically, the noise produced by the generator for each image has the confounder property: it builds spurious correlations between images and labels, so that the model cannot learn the correct mapping from images to labels from this noise-added dataset. Meanwhile, the discriminator is used to ensure that the generated noise is small and imperceptible, thereby preserving the normal utility of the encrypted image for humans. The experiments are conducted on six image classification datasets, including three natural object datasets and three medical datasets. The results demonstrate that our method not only outperforms state-of-the-art methods in standard settings, but can also be applied to fast encryption scenarios. Moreover, we present a series of transferability and stability experiments to further illustrate the effectiveness and superiority of our method.

NeurIPS Conference 2022 Conference Paper

GRASP: Navigating Retrosynthetic Planning with Goal-driven Policy

  • Yemin Yu
  • Ying Wei
  • Kun Kuang
  • Zhengxing Huang
  • Huaxiu Yao
  • Fei Wu

Retrosynthetic planning occupies a crucial position in synthetic chemistry and, accordingly, drug discovery; it aims to find synthetic pathways to a target molecule through a sequential decision-making process over a set of feasible reactions. While the majority of recent works focus on predicting feasible reactions at each step, there have been limited attempts to improve the sequential decision-making policy itself. Existing strategies rely on either expensive, high-variance value estimation by online rollout, or a fixed value-estimation neural network pre-trained on simulated pathways of limited diversity and with no negative feedback. Besides, how to return multiple candidate pathways that are not only diverse but also desirable for chemists (e.g., built from affordable building block materials) remains an open challenge. To this end, we propose a Goal-dRiven Actor-critic retroSynthetic Planning (GRASP) framework, in which we identify a policy that performs goal-driven retrosynthesis navigation toward a user-specified objective. Our experiments on the benchmark Pistachio dataset and a chemist-designed dataset demonstrate that the framework outperforms state-of-the-art approaches by up to 32.2% in search efficiency and 5.6% in quality. Remarkably, our user studies show that GRASP successfully plans pathways that accomplish a user-designated goal (e.g., specified building block materials).

AAAI Conference 2021 Conference Paper

Judgment Prediction via Injecting Legal Knowledge into Neural Networks

  • Leilei Gan
  • Kun Kuang
  • Yi Yang
  • Fei Wu

Legal Judgment Prediction (LJP) is a key problem in legal artificial intelligence, which aims to predict a law case's judgment based on a given text describing the facts of the case. Most previous works treat LJP as a text classification task and generally adopt deep neural network (DNN) based methods to solve it. However, existing DNN-based models are data-hungry, and it is hard to explain which legal knowledge a prediction is based on. Thus, injecting legal knowledge into neural networks to make the model interpretable and improve its performance remains a significant problem. In this paper, we propose to represent declarative legal knowledge as a set of first-order logic rules and to integrate these logic rules explicitly into a co-attention network-based model. The use of logic rules equips neural networks with direct logical reasoning capabilities and makes the model more interpretable. We take the private loan scenario as a case study and demonstrate the effectiveness of the proposed method through comprehensive experiments and analyses conducted on the collected dataset.

AAAI Conference 2021 Conference Paper

Stable Adversarial Learning under Distributional Shifts

  • Jiashuo Liu
  • Zheyan Shen
  • Peng Cui
  • Linjun Zhou
  • Kun Kuang
  • Bo Li
  • Yishi Lin

Machine learning algorithms based on empirical risk minimization are vulnerable under distributional shifts due to their greedy adoption of all the correlations found in the training data. Recently, robust learning methods have addressed this problem by minimizing the worst-case risk over an uncertainty set. However, they treat all covariates equally when forming the decision sets, regardless of the stability of their correlations with the target, resulting in an overwhelmingly large set and low confidence of the learner. In this paper, we propose the Stable Adversarial Learning (SAL) algorithm, which leverages heterogeneous data sources to construct a more practical uncertainty set and conduct differentiated robustness optimization, where covariates are differentiated according to the stability of their correlations with the target. We theoretically show that our method is tractable for stochastic gradient-based optimization and provide performance guarantees for our method. Empirical studies on both simulated and real datasets validate the effectiveness of our method in terms of uniformly good performance across unknown distributional shifts.

IJCAI Conference 2020 Conference Paper

Decorrelated Clustering with Data Selection Bias

  • Xiao Wang
  • Shaohua Fan
  • Kun Kuang
  • Chuan Shi
  • Jiawei Liu
  • Bai Wang

Most existing clustering algorithms are proposed without considering selection bias in the data. In many real applications, however, one cannot guarantee that the data are unbiased. Selection bias may introduce unexpected correlations between features, and ignoring those unexpected correlations will hurt the performance of clustering algorithms. Therefore, how to remove the unexpected correlations induced by selection bias is extremely important yet largely unexplored for clustering. In this paper, we propose a novel Decorrelation regularized K-Means algorithm (DCKM) for clustering with data selection bias. Specifically, the decorrelation regularizer aims to learn global sample weights which are capable of balancing the sample distribution, so as to remove unexpected correlations among features. Meanwhile, the learned weights are combined with k-means, which makes the reweighted k-means cluster on the inherent data distribution without the influence of unexpected correlations. Moreover, we derive updating rules to effectively infer the parameters in DCKM. Extensive experimental results on real-world datasets demonstrate that our DCKM algorithm achieves significant performance gains, indicating the necessity of removing unexpected feature correlations induced by selection bias when clustering.
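The reweighting idea behind DCKM, per-sample weights entering the centroid update, can be sketched with a small weighted variant of Lloyd's algorithm. Here the weights are taken as given; in DCKM they are learned jointly by the decorrelation regularizer, which this sketch does not implement:

```python
import numpy as np

# Sketch of k-means with per-sample weights in the centroid update.
# The weights w are assumed given (DCKM learns them; not shown here).

def weighted_kmeans(X, w, k, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Weighted centroid update: centers move toward high-weight samples.
        for j in range(k):
            mask = labels == j
            if mask.any():
                centers[j] = np.average(X[mask], axis=0, weights=w[mask])
    return labels, centers

# Two well-separated blobs; with uniform weights this reduces to plain k-means.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels, centers = weighted_kmeans(X, np.ones(len(X)), k=2)
```

With non-uniform weights, over-represented regions of a biased sample can be down-weighted so the centroids reflect the inherent rather than the sampled distribution.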

AAAI Conference 2020 Conference Paper

Stable Prediction with Model Misspecification and Agnostic Distribution Shift

  • Kun Kuang
  • Ruoxuan Xiong
  • Peng Cui
  • Susan Athey
  • Bo Li

For many machine learning algorithms, two main assumptions are required to guarantee performance. One is that the test data are drawn from the same distribution as the training data, and the other is that the model is correctly specified. In real applications, however, we often have little prior knowledge of the test data and of the underlying true model. Under model misspecification, agnostic distribution shift between training and test data leads to inaccurate parameter estimation and instability of prediction across unknown test data. To address these problems, we propose a novel Decorrelated Weighting Regression (DWR) algorithm which jointly optimizes a variable decorrelation regularizer and a weighted regression model. The variable decorrelation regularizer estimates a weight for each sample such that variables are decorrelated on the weighted training data. These weights are then used in the weighted regression to improve the accuracy of estimating the effect of each variable, thus helping to improve the stability of prediction across unknown test data. Extensive experiments clearly demonstrate that our DWR algorithm can significantly improve the accuracy of parameter estimation and the stability of prediction under model misspecification and agnostic distribution shift.
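The weighted-regression half of DWR has a standard closed form. The sketch below assumes the per-sample weights w are already available (in DWR they come from the variable-decorrelation regularizer, which is not implemented here) and solves the weighted least-squares problem beta = (XᵀWX)⁻¹XᵀWy with W = diag(w):

```python
import numpy as np

# Closed-form weighted least squares; the decorrelation step that
# produces w in DWR is not shown, w is assumed given.

def weighted_least_squares(X, y, w):
    # Solve (X^T W X) beta = X^T W y with W = diag(w).
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)

# Noiseless example: any positive weights recover the true coefficients.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
beta_true = np.array([2.0, -1.0])
y = X @ beta_true
beta = weighted_least_squares(X, y, np.array([1.0, 2.0, 3.0, 4.0]))
```

The point of the reweighting is that when variables are decorrelated on the weighted data, each coefficient estimate is less contaminated by the effects of the other variables, which is what stabilizes prediction under shift.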

AAAI Conference 2017 Conference Paper

Treatment Effect Estimation with Data-Driven Variable Decomposition

  • Kun Kuang
  • Peng Cui
  • Bo Li
  • Meng Jiang
  • Shiqiang Yang
  • Fei Wang

One fundamental problem in causal inference is treatment effect estimation in observational studies when variables are confounded. Control for confounding is generally handled via the propensity score. But this treats all observed variables as confounders and ignores the adjustment variables, which have no influence on treatment but are predictive of the outcome. Recently, it has been demonstrated that adjustment variables are effective in reducing the variance of the estimated treatment effect. However, how to automatically separate the confounders and adjustment variables in observational studies is still an open problem, especially in scenarios with high-dimensional variables, which are common in the big data era. In this paper, we propose a Data-Driven Variable Decomposition (D²VD) algorithm, which can (1) automatically separate confounders and adjustment variables with a data-driven approach, and (2) simultaneously estimate treatment effects in observational studies with high-dimensional variables. Under standard assumptions, we show experimentally that our D²VD algorithm can automatically separate the variables precisely, and estimate treatment effects more accurately and with tighter confidence intervals than state-of-the-art methods on both synthetic data and a real online advertising dataset.
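For context, the propensity-score baseline the abstract contrasts against is typically the inverse-propensity-weighting (IPW) estimator of the average treatment effect. The sketch below shows IPW with known propensities on a toy randomized sample; it is background for the baseline only, not D²VD's confounder/adjustment separation:

```python
import numpy as np

# IPW estimate of the average treatment effect (ATE), the standard
# propensity-score baseline; propensities are assumed known here.

def ipw_ate(t, y, propensity):
    # ATE = E[T*Y / e(X)] - E[(1-T)*Y / (1 - e(X))]
    treated = np.mean(t * y / propensity)
    control = np.mean((1 - t) * y / (1 - propensity))
    return treated - control

# Toy randomized sample with e(X) = 0.5 for everyone:
t = np.array([1, 1, 0, 0])
y = np.array([3.0, 5.0, 1.0, 1.0])
ate = ipw_ate(t, y, np.full(4, 0.5))  # treated mean 4, control mean 1
```

Because IPW weights every observed covariate into the propensity model, variance can be inflated by outcome-only (adjustment) variables, which is precisely the inefficiency D²VD's decomposition is designed to remove.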