Author name cluster

Chen Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

58 papers

2 author rows

EAAI Journal 2026 Journal Article

A novel physics-constrained deep learning framework for the inverse design of assembly contact interfaces

Lifei Chen
Qiyin Lin
Mingjun Qiu
Chen Wang
Tao Wang
Hao Guan
Qiyuan Xie
Yuge Jiao

Assembly contact interface characteristics critically influence the performance of precision mechanical systems. Traditional design methods relying on iterative finite element analysis are computationally expensive, while existing deep learning approaches often neglect physical constraints and the complex effects of assembly processes. To address these limitations, this paper proposes a physics-constrained deep learning framework for the inverse design of assembly interfaces. Specifically, we introduce a novel network architecture that integrates multi-source inputs including target contact pressure, assembly parameters, and service conditions. To enforce physical consistency, a differentiable loss function incorporating the impenetrability condition is developed. Furthermore, an optimized learning rate scheduling strategy is implemented to enhance model convergence. Comprehensive ablation and comparative experiments demonstrate that our method outperforms conventional approaches in both accuracy and physical plausibility. When applied to an aero-engine flange structure, the framework enables rapid inverse design of interface morphology, reducing maximum contact pressure by 15.67% and increasing the effective contact area by 45.23% compared to traditional designs. This work provides a robust solution for assembly interface design and advances the application of physics-constrained deep learning in complex engineering systems.
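
The abstract does not give the form of the impenetrability loss. As a minimal sketch only, one common way to make such a constraint differentiable is a squared penalty on negative gaps between the two contacting surfaces. The function name `impenetrability_penalty` and the gap representation (a list of signed normal distances) are assumptions for illustration, not the authors' formulation.

```python
def impenetrability_penalty(gaps):
    """Mean squared penetration depth over sampled contact points.

    Assumption: each entry of `gaps` is a signed normal distance between
    the two surfaces, where a negative value means interpenetration.
    The penalty is zero whenever every gap is non-negative.
    """
    if not gaps:
        return 0.0
    # min(g, 0.0) keeps only the penetrating part of each gap.
    return sum(min(g, 0.0) ** 2 for g in gaps) / len(gaps)


# Hypothetical gap samples: one feasible set, one with two violations.
feasible = impenetrability_penalty([0.0, 0.2, 1.5])
violating = impenetrability_penalty([0.1, -0.2, -0.4])
```

In a training loop, a term like this would typically be added to the data-fitting loss with a weighting coefficient; that weighting is also not specified in the abstract.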

EAAI Journal 2026 Journal Article

Balance divergence for knowledge distillation

Yafei Qi
Chen Wang
Zhaoning Zhang
Yaping Liu
Yongmin Zhang

Knowledge distillation (KD) represents a fundamental artificial intelligence (AI) technique for model compression and optimization. In computer vision AI applications, most KD methods use Kullback–Leibler (KL) divergence to align teacher–student output probabilities, but often neglect crucial negative aspects of teacher “dark knowledge” by underweighting low-probability signals. This limitation leads to suboptimal logit mimicry and unbalanced knowledge transfer to the student network. In this paper, we investigate the impact of this imbalance and propose a novel method, named Balance Divergence Distillation (BDD). By introducing a compensatory operation using reverse KL divergence, our method can improve the modeling of the extremely small values in the negative from the teacher and preserve the learning capacity for the positive. Furthermore, we test the impact of different temperature coefficient adjustments, which can lead to further balance in knowledge transfer. The evaluation results demonstrate that our method achieves accuracy improvements of 1% to 3% for lightweight student networks over standard KD methods on both the Canadian Institute for Advanced Research 100 classes (CIFAR-100) and ImageNet datasets. Additionally, when applied to semantic segmentation, our approach enhances the student by 4.55% in mean Intersection over Union (mIoU) compared to the baseline on the Cityscapes dataset. These experiments confirm that our method provides a simple yet highly effective solution that can be seamlessly integrated with various KD frameworks across different vision tasks.
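
To illustrate the idea of pairing the usual forward KL term with a reverse KL term, here is a minimal sketch. The abstract does not say how the two terms are combined, so the weighted sum, the weights `alpha` and `beta`, and the separate temperatures `t_forward` and `t_reverse` are all assumptions made for illustration rather than the BDD formulation itself.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def balanced_divergence(teacher_logits, student_logits,
                        t_forward=4.0, t_reverse=4.0, alpha=1.0, beta=1.0):
    """Hypothetical combination: forward KL(teacher || student) plus
    reverse KL(student || teacher), each with its own temperature."""
    forward = kl(softmax(teacher_logits, t_forward),
                 softmax(student_logits, t_forward))
    reverse = kl(softmax(student_logits, t_reverse),
                 softmax(teacher_logits, t_reverse))
    return alpha * forward + beta * reverse


# Hypothetical logits for a three-class problem.
same = balanced_divergence([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
different = balanced_divergence([5.0, 1.0, 0.0], [1.0, 1.0, 1.0])
```

Forward KL weights each class by the teacher probability, so classes the teacher considers unlikely contribute little; the reverse term weights by the student probability instead, which is one way low-probability teacher signals can receive more attention.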

AAAI Conference 2026 Conference Paper

Benchmarking LLMs for Political Science: A United Nations Perspective

Yueqing Liang
Liangwei Yang
Chen Wang
Congying Xia
Rui Meng
Xiongxiao Xu
Haoran Wang
Ali Payani

Large Language Models (LLMs) have achieved significant advances in natural language processing, yet their potential for high-stakes political decision-making remains largely unexplored. This paper addresses the gap by focusing on the application of LLMs to the United Nations (UN) decision-making process, where the stakes are particularly high and political decisions can have far-reaching consequences. We introduce a novel dataset comprising publicly available UN Security Council (UNSC) records from 1994 to 2024, including draft resolutions, voting records, and diplomatic speeches. Using this dataset, we propose the United Nations Benchmark (UNBench), the first comprehensive benchmark designed to evaluate LLMs across four interconnected political science tasks: co-penholder judgment, representative voting simulation, draft adoption prediction, and representative statement generation. These tasks span the three stages of the UN decision-making process (drafting, voting, and discussing) and aim to assess LLMs' ability to understand and simulate political dynamics. Our experimental analysis demonstrates the potential and challenges of applying LLMs in this domain, providing insights into their strengths and limitations in political science. To the best of our knowledge, this is the first benchmark to systematically evaluate LLMs in UN decision-making, contributing to the growing intersection of AI and political science.

AAAI Conference 2026 Conference Paper

Instance Generation for Meta-Black-Box Optimization Through Latent Space Reverse Engineering

Chen Wang
Yue-Jiao Gong
Zhiguang Cao
Zeyuan Ma

To relieve the intensive human expertise required to design optimization algorithms, recent Meta-Black-Box Optimization (MetaBBO) research leverages the generalization strength of meta-learning to train neural network-based algorithm design policies over a predefined training problem set, which automates the adaptation of low-level optimizers to unseen problem instances. Currently, a common choice of training problem set in existing MetaBBO methods is the well-known benchmark suite CoCo-BBOB. Although this choice facilitates MetaBBO development, problem instances in CoCo-BBOB are somewhat limited in diversity, raising the risk of overfitting, which might in turn result in poor generalization. In this paper, we propose an instance generation approach, termed LSRE, which can generate diverse training problem instances so that MetaBBO methods learn more generalizable policies. LSRE first trains an autoencoder that maps high-dimensional problem features into a 2-dimensional latent space. Uniform-grid sampling in this latent space yields hidden representations of problem instances with sufficient diversity. By leveraging a genetic-programming approach to search for function formulas with minimal L2-distance to these hidden representations, LSRE reverse engineers a diversified problem set, termed Diverse-BBO. We validate the effectiveness of LSRE by training various MetaBBO methods on Diverse-BBO and observing their generalization performance in both synthetic and realistic scenarios. Extensive experimental results underscore the superiority of Diverse-BBO over existing training set choices in MetaBBO. Further ablation studies not only demonstrate the effectiveness of the design choices in LSRE, but also reveal interesting insights into instance diversity and MetaBBO generalization.
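
The uniform-grid sampling step and the L2-distance criterion can be sketched as follows. The latent bounds, the grid resolution, and the function names `uniform_grid` and `l2_distance` are assumptions for illustration; the autoencoder and the genetic-programming search described in the abstract are not reproduced here.

```python
import math


def uniform_grid(low, high, points_per_axis):
    """Evenly spaced target points on a square region of a 2-D latent space."""
    if points_per_axis < 2:
        raise ValueError("points_per_axis must be at least 2")
    step = (high - low) / (points_per_axis - 1)
    axis = [low + i * step for i in range(points_per_axis)]
    return [(x, y) for x in axis for y in axis]


def l2_distance(a, b):
    """Euclidean distance between two latent points."""
    return math.dist(a, b)


# Hypothetical latent bounds of [-1, 1] with a 5 x 5 grid.
grid = uniform_grid(-1.0, 1.0, 5)

# A search procedure would score a candidate function by how close its
# encoded latent point lands to a grid target (smaller is better).
score = l2_distance((0.0, 0.0), (3.0, 4.0))
```

Each grid point acts as a target representation; a search over candidate formulas would keep, for each target, the candidate whose latent encoding has the smallest distance to it.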

AAAI Conference 2026 Conference Paper

Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models

Meng Cao
Pengfei Hu
Yingyao Wang
Jihao Gu
Haoran Tang
Haoze Zhao
Chen Wang
Jiahua Dong

Recent advancements in Large Video Language Models (LVLMs) have highlighted their potential for multi-modal understanding, yet evaluating their factual grounding in videos remains a critical unsolved challenge. To address this gap, we introduce Video SimpleQA, the first comprehensive benchmark tailored for factuality evaluation in video contexts. Our work differs from existing video benchmarks through the following key features: 1) Knowledge required: demanding integration of external knowledge beyond the video’s explicit narrative; 2) Multi-hop fact-seeking question: Each question involves multiple explicit facts and requires strict factual grounding without hypothetical or subjective inferences. We include per-hop single-fact-based sub-QAs alongside final QAs to enable fine-grained, step-by-step evaluation; 3) Short-form definitive answer: Answers are crafted as unambiguous and definitively correct in a short format with minimal scoring variance; 4) Temporal grounded required: Requiring answers to rely on one or more temporal segments in videos, rather than single frames. We extensively evaluate 33 state-of-the-art LVLMs and summarize key findings as follows: 1) Current LVLMs exhibit notable deficiencies in factual adherence, with the best-performing model o3 merely achieving an F-score of 66.3%; 2) Most LVLMs are overconfident in what they generate, with self-stated confidence exceeding actual accuracy; 3) Retrieval-Augmented Generation demonstrates consistent improvements at the cost of additional inference time overhead; 4) Multi-hop QA demonstrates substantially degraded performance compared to single-hop sub-QAs, with first-hop object/event recognition emerging as the primary bottleneck. We position Video SimpleQA as the cornerstone benchmark for video factuality assessment, aiming to steer LVLM development toward verifiable grounding in real-world contexts.

YNIMG Journal 2025 Journal Article

Dynamic changes in brain function during sleep deprivation: Increased occurrence of non-stationary states indicates the extent of cognitive impairment

Ziliang Xu
Chaozong Ma
Chen Wang
Fan Guo
Minwen Zheng
Peng Fang
Yuanqiang Zhu

OBJECTIVE: The brain networks are inherently dynamic, constantly adjusting and reorganizing over time; therefore, the cognitive impairment caused by sleep deprivation (SD) should also exhibit dynamism. However, previous studies on SD that have provided valuable insights predominantly rely on static functional connectivity (FC) analysis. Hence, this study aims to employ dynamical FC (DFC) analysis to capture the dynamic changes in cognitive impairment during SD. METHODS: The data from 32 subjects, encompassing resting state and psychomotor vigilance task (PVT) functional magnetic resonance imaging data collected at five different timepoints (22:00, 00:00, 02:00, 04:00 and 06:00) during a whole night were acquired. Dynamic functional connectivity (DFC) analysis was employed to assess alterations in brain states across the five timepoints, resulting in the identification of three distinct DFC states. RESULTS: After conducting ANOVA analysis, significant changes were observed in the fraction rate of state 1 (non-stationary state) across five timepoints in both resting and task conditions. The transition time corresponding to state 1 consistently showed an increase over time. Furthermore, task condition-related DFC metrics, particularly those associated with state 1, exhibited significant correlations with PVT metrics across five timepoints as well as their changes. CONCLUSIONS: The collective findings suggest that cognitive impairment resulting from sleep deprivation is a dynamic process, with state 1-related indicators exerting the most significant influence on cognition.

NeurIPS Conference 2025 Conference Paper

First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training

Lai Wei
Yuting Li
Chen Wang
Yue Wang
Linghe Kong
Weiran Huang
Lichao Sun

Improving Multi-modal Large Language Models (MLLMs) in the post-training stage typically relies on supervised fine-tuning (SFT) or reinforcement learning (RL), which require expensive and manually annotated multi-modal data, an ultimately unsustainable resource. This limitation has motivated a growing interest in unsupervised paradigms as a third stage of post-training after SFT and RL. While recent efforts have explored this direction, their methods are complex and difficult to iterate. To address this, we propose MM-UPT, a simple yet effective framework for unsupervised post-training of MLLMs, enabling continual self-improvement without any external supervision. The training method of MM-UPT builds upon GRPO, replacing traditional reward signals with a self-rewarding mechanism based on majority voting over multiple sampled responses. Our experiments demonstrate that this training method effectively improves the reasoning ability of Qwen2.5-VL-7B (e.g., 66.3% → 72.9% on MathVista, 62.9% → 68.7% on We-Math), using a standard dataset without ground truth labels. To further explore scalability, we extend our framework to a data self-generation setting, designing two strategies that prompt the MLLM to synthesize new training samples on its own. Additional experiments show that combining these synthetic data with the unsupervised training method can also boost performance, highlighting a promising approach for scalable self-improvement. Overall, MM-UPT offers a new paradigm for autonomous enhancement of MLLMs, serving as a critical third step after initial SFT and RL in the absence of external supervision. Our code is available at https://github.com/waltonfuture/MM-UPT.
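
The self-rewarding mechanism based on majority voting can be sketched as below. This is an illustration under assumptions, not the code from the MM-UPT repository: the function name `majority_vote_rewards`, the binary 1.0/0.0 reward values, and the tie-breaking rule (first answer seen wins) are all hypothetical.

```python
from collections import Counter


def majority_vote_rewards(answers):
    """Assign 1.0 to each sampled answer that matches the most common
    answer in the group and 0.0 otherwise.

    Assumption: answers are already normalized to comparable strings.
    On a tie, the answer that appeared first is treated as the majority.
    """
    if not answers:
        return []
    majority, _ = Counter(answers).most_common(1)[0]
    return [1.0 if answer == majority else 0.0 for answer in answers]


# Hypothetical final answers extracted from five sampled responses
# to the same unlabeled question.
rewards = majority_vote_rewards(["12", "12", "15", "12", "9"])
```

Because the pseudo-label comes from agreement among the model's own samples, no ground truth answer is needed; a GRPO-style update would then use these per-response rewards within the group.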