Arrow Research search

Author name cluster

Chang Xu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

91 papers
2 author rows

Possible papers


AAAI Conference 2026 Conference Paper

Controllable Financial Market Generation with Diffusion Guided Meta Agent

  • Yu-Hao Huang
  • Chang Xu
  • Yang Liu
  • Weiqing Liu
  • Wu-Jun Li
  • Jiang Bian

Generative modeling has transformed many fields, such as language and visual modeling, while its application in financial markets remains under-explored. As the minimal unit within a financial market is an order, order-flow modeling represents a fundamental generative financial task. However, current approaches often yield unsatisfactory fidelity in generating order flow, and their generation lacks controllability, thereby limiting their practical applications. In this paper, we formulate the challenge of controllable financial market generation, and propose a Diffusion Guided Meta Agent (DigMA) model to address it. Specifically, we employ a conditional diffusion model to capture the dynamics of the market state represented by time-evolving distribution parameters of the mid-price return rate and the order arrival rate, and we define a meta agent with financial economic priors to generate orders from the corresponding distributions. Extensive experimental results show that DigMA achieves superior controllability and generation fidelity. Moreover, we validate its effectiveness as a generative environment for downstream high-frequency trading tasks and its computational efficiency.

AAAI Conference 2026 Conference Paper

Eliciting Chain-of-Thought in Base LLMs via Gradient-Based Representation Optimization

  • Zijian Wang
  • Yanxiang Ma
  • Chang Xu

Chain-of-Thought (CoT) reasoning is a critical capability for large language models (LLMs), enabling them to tackle complex multi-step tasks. While base LLMs, pre-trained on general text corpora, often struggle with reasoning due to a lack of specialized training, recent studies reveal their latent reasoning potential tied to hidden states. However, existing hidden state manipulation methods, such as linear activation steering, suffer from limitations due to their rigid and unconstrained nature, often leading to distribution shifts and degraded text quality. In this work, we propose a novel approach for eliciting CoT reasoning from base LLMs through hidden state manipulation grounded in probabilistic conditional generation. By reformulating the challenge as an optimization problem with a balanced likelihood and prior regularization framework, our method guides hidden states toward reasoning-oriented trajectories while preserving linguistic coherence. Extensive evaluations across mathematical, commonsense, and logical reasoning benchmarks demonstrate that our approach consistently outperforms existing steering methods, offering a theoretically principled and effective solution for enhancing reasoning capabilities in base LLMs.

AAAI Conference 2026 Conference Paper

Explainable Oracle Bone Script Recognition via Multimodal Pictographic Reasoning

  • Yin Wu
  • Zhengxuan Zhang
  • Jiayu Chen
  • Chang Xu
  • Yuyu Luo
  • Nan Tang
  • Hui Xiong

Oracle Bone Script, East Asia's earliest mature writing system from over 3,500 years ago, encodes ancient cognition through visual metaphors, yet remains largely undeciphered and inaccessible, severing modern society from its cultural roots. Traditional AI methods, while accurate in classification, treat glyphs as opaque data, neglecting their pictographic essence and failing to foster public understanding—exacerbating a heritage crisis amid linguistic evolution. We pioneer a paradigm shift toward AI-driven cultural democratization, introducing OracleVis, the first human-validated multimodal dataset of glyph-image-explanation triplets, curated through expert collaborations to overcome data scarcity, bias, and incompleteness in archaeological sources. Building on this, OBS-VM, an explainability-centric multimodal large language model fine-tuned on Qwen2-VL-7B, models pictographic reasoning by balancing semantic fidelity with interpretive transparency, transforming black-box predictions into cognition-aligned narratives. Rigorous evaluations, including benchmarks and a user study with 24 non-experts, reveal our system's superiority: it outperforms GPT-4o in pictographic rationality (3.79 vs. 3.58 in human evaluation) and achieves a 35.3% relative improvement in recognition accuracy, while interactive learning boosts knowledge gains (+5.5 vs. +1.7), interest (+1.9 vs. +0.4), and confidence (+2.0 vs. +0.3) over static methods. This work illuminates AI's potential to bridge ancient wisdom and contemporary audiences, redefining heritage preservation as an inclusive, socially impactful endeavor that turns cultural alienation into enlightened engagement.

AAAI Conference 2025 Conference Paper

Efficient Image-to-Image Diffusion Classifier for Adversarial Robustness

  • Hefei Mei
  • Minjing Dong
  • Chang Xu

Diffusion models (DMs) have demonstrated great potential in the field of adversarial robustness, where DM-based defense methods can achieve superior defense capability without adversarial training. However, they all require huge computational costs due to the usage of large-scale pre-trained DMs, making it difficult to conduct full evaluation under strong attacks and compare with traditional CNN-based methods. Simply reducing the network size and timesteps in DMs could significantly harm the image generation quality, which invalidates previous frameworks. To alleviate this issue, we redesign the diffusion framework from generating high-quality images to predicting distinguishable image labels. Specifically, we employ an image translation framework to learn many-to-one mapping from input samples to designed orthogonal image labels. Based on this framework, we introduce an efficient Image-to-Image diffusion classifier with a pruned U-Net structure and reduced diffusion timesteps. Besides the framework, we redesign the optimization objective of DMs to fit the target of image classification, where a new classification loss is incorporated in the DM-based image translation framework to distinguish the generated label from those of other classes. We conduct sufficient evaluations of the proposed classifier under various attacks on popular benchmarks. Extensive experiments show that our method achieves better adversarial robustness with fewer computational costs than DM-based and CNN-based methods.

AAAI Conference 2025 Conference Paper

EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba

  • Xiaohuan Pei
  • Tao Huang
  • Chang Xu

Prior efforts in light-weight model development mainly centered on CNN and Transformer-based designs yet faced persistent challenges. CNNs, adept at local feature extraction, compromise resolution, while Transformers offer global reach but escalate computational demands to O(N^2). This ongoing trade-off between accuracy and efficiency remains a significant hurdle. Recently, state space models (SSMs), such as Mamba, have shown outstanding performance and competitiveness in various tasks such as language modeling and computer vision, while reducing the time complexity of global information extraction to O(N). Inspired by this, this work explores the potential of visual state space models in light-weight model design and introduces a novel efficient model variant dubbed EfficientVMamba. Concretely, EfficientVMamba integrates an atrous-based selective scan approach via efficient skip sampling, constituting building blocks designed to harness both global and local representational features. Additionally, we investigate the integration between SSM blocks and convolutions, and introduce an efficient visual state space block combined with an additional convolution branch, which further elevates model performance. Experimental results show that EfficientVMamba scales down computational complexity while yielding competitive results across a variety of vision tasks. For example, our EfficientVMamba-S with 1.3G FLOPs improves over Vim-Ti with 1.5G FLOPs by a large margin of 5.6% accuracy on ImageNet.

AAAI Conference 2025 Conference Paper

Feature Clipping for Uncertainty Calibration

  • Linwei Tao
  • Minjing Dong
  • Chang Xu

Deep neural networks (DNNs) have achieved significant success across various tasks, but ensuring reliable uncertainty estimates, known as model calibration, is crucial for their safe and effective deployment. Modern DNNs often suffer from overconfidence, leading to miscalibration. We propose a novel post-hoc calibration method called feature clipping (FC) to address this issue. FC involves clipping feature values to a specified threshold, effectively increasing entropy in high calibration error samples while maintaining the information in low calibration error samples. This process reduces the overconfidence in predictions, improving the overall calibration of the model. Our extensive experiments on datasets such as CIFAR-10, CIFAR-100, and ImageNet, and models including CNNs and transformers, demonstrate that FC consistently enhances calibration performance. Additionally, we provide a theoretical analysis that validates the effectiveness of our method. As the first calibration technique based on feature modification, feature clipping offers a novel approach to improving model calibration, showing significant improvements over both post-hoc and train-time calibration methods and pioneering a new avenue for feature-based model calibration.
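The core operation described in the abstract can be illustrated with a minimal sketch: feature values feeding a linear classifier head are clamped to a threshold, which shrinks extreme activations, flattens the logits, and raises predictive entropy. The threshold value, feature dimension, and toy classifier head below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def entropy(p):
    return -np.sum(p * np.log(p + 1e-12))

def feature_clip(features, c):
    """Post-hoc calibration step: clamp each feature value into [-c, c]
    before it reaches the final linear layer."""
    return np.clip(features, -c, c)

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3))             # toy linear classifier head
feats = rng.normal(scale=3.0, size=8)   # large-magnitude (overconfident) features

p_raw = softmax(feats @ W)
p_clip = softmax(feature_clip(feats, c=1.0) @ W)

# Clipping bounds the activations, so logits are less peaked and the
# predictive distribution typically has higher entropy.
print(entropy(p_raw), entropy(p_clip))
```

Since clipping happens after training, it can be applied to any frozen model; in the paper the threshold would presumably be tuned on a validation set rather than fixed as here.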

NeurIPS Conference 2025 Conference Paper

MIRA: Medical Time Series Foundation Model for Real-World Health Data

  • Hao Li
  • Bowen Deng
  • Chang Xu
  • ZhiYuan Feng
  • Viktor Schlegel
  • Yu-Hao Huang
  • Yizheng Sun
  • Jingyuan Sun

A unified foundation model for medical time series—pretrained on open access and ethically reviewed medical corpora—offers the potential to reduce annotation burdens, minimize model customization, and enable robust transfer across clinical institutions, modalities, and tasks, particularly in data-scarce or privacy-constrained environments. However, existing time series foundation models struggle to handle medical time series data due to its inherent challenges, including irregular intervals, heterogeneous sampling rates, and frequent missingness. To address these challenges, we introduce MIRA, a unified foundation model specifically designed for medical time series forecasting. MIRA incorporates a Continuous-Time Rotary Positional Encoding that enables fine-grained modeling of variable time intervals, a frequency-specific mixture-of-experts layer that routes computation across latent frequency regimes to further promote temporal specialization, and a Continuous Dynamics Extrapolation Block based on Neural ODEs that models the continuous trajectory of latent states, enabling accurate forecasting at arbitrary target timestamps. Pretrained on a large-scale and diverse medical corpus comprising over 454 billion time points collected from publicly available datasets, MIRA achieves reductions in forecasting errors by an average of 8% and 6% in out-of-distribution and in-distribution scenarios, respectively. We also introduce a comprehensive benchmark spanning multiple downstream clinical tasks, establishing a foundation for future research in medical time series modeling.
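The continuous-time positional encoding mentioned in the abstract can be sketched as standard rotary position embedding evaluated at a real-valued timestamp instead of an integer token index: each pair of feature channels is rotated by an angle proportional to the measurement time. The frequency schedule and dimensions below are illustrative assumptions, not MIRA's actual configuration.

```python
import numpy as np

def continuous_rope(x, t, base=10000.0):
    """Rotate consecutive feature pairs of x by angles t * freq_k, where
    t is a continuous timestamp (e.g. minutes since admission) rather
    than an integer position index."""
    d = x.shape[-1]
    assert d % 2 == 0, "feature dimension must be even"
    freqs = base ** (-np.arange(0, d, 2) / d)  # one frequency per pair
    ang = t * freqs
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x, dtype=float)
    out[..., 0::2] = x1 * cos - x2 * sin       # 2D rotation per pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

# Irregularly sampled measurements get consistent relative phases because
# the rotation angle depends only on the (continuous) timestamp.
x = np.ones(4)
print(continuous_rope(x, t=0.0))   # t = 0 leaves the vector unchanged
```

Because each step is a pure rotation, the encoding preserves vector norms, and the inner product between two encoded vectors depends only on the time difference between their timestamps, which is the usual motivation for rotary encodings.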

IROS Conference 2025 Conference Paper

Origami-Inspired Soft Gripper with Tunable Constant Force Output

  • Zhenwei Ni
  • Chang Xu
  • Zhihang Qin
  • Ceng Zhang
  • Zhiqiang Tang
  • Peiyi Wang
  • Cecilia Laschi

Soft robotic grippers gently and safely manipulate delicate objects due to their inherent adaptability and softness. Limited by insufficient stiffness and imprecise force control, conventional soft grippers are not suitable for applications that require stable grasping force. In this work, we propose a soft gripper that utilizes an origami-inspired structure to achieve tunable constant force output over a wide strain range. The geometry of each taper panel is established to provide necessary parameters such as protrusion distance, taper angle, and crease thickness required for 3D modeling and FEA analysis. Simulations and experiments show that by optimizing these parameters, our design can achieve a tunable constant force output. Moreover, the origami-inspired soft gripper dynamically adapts to different shapes while preventing excessive forces, with potential applications in logistics, manufacturing, and other industrial settings that require stable and adaptive operations.

NeurIPS Conference 2025 Conference Paper

SRSR: Enhancing Semantic Accuracy in Real-World Image Super-Resolution with Spatially Re-Focused Text-Conditioning

  • Chen Chen
  • Majid Abdolshah
  • Violetta Shevchenko
  • Hongdong Li
  • Chang Xu
  • Pulak Purkait

Existing diffusion-based super-resolution approaches often exhibit semantic ambiguities due to inaccuracies and incompleteness in their text conditioning, coupled with the inherent tendency for cross-attention to divert towards irrelevant pixels. These limitations can lead to semantic misalignment and hallucinated details in the generated high-resolution outputs. To address these, we propose a novel, plug-and-play spatially re-focused super-resolution (SRSR) framework that consists of two core components: first, we introduce Spatially Re-focused Cross-Attention (SRCA), which refines text conditioning at inference time by applying visually-grounded segmentation masks to guide cross-attention. Second, we introduce a Spatially Targeted Classifier-Free Guidance (STCFG) mechanism that selectively bypasses text influences on ungrounded pixels to prevent hallucinations. Extensive experiments on both synthetic and real-world datasets demonstrate that SRSR consistently outperforms seven state-of-the-art baselines in standard fidelity metrics (PSNR and SSIM) across all datasets, and in perceptual quality measures (LPIPS and DISTS) on two real-world benchmarks, underscoring its effectiveness in achieving both high semantic fidelity and perceptual quality in super-resolution.

AAAI Conference 2025 Conference Paper

TimeDP: Learning to Generate Multi-Domain Time Series with Domain Prompts

  • Yu-Hao Huang
  • Chang Xu
  • Yueying Wu
  • Wu-Jun Li
  • Jiang Bian

Time series generation models are crucial for applications like data augmentation and privacy preservation. Most existing time series generation models are designed to generate data from one specified domain. While leveraging data from other domains for better generalization has proven effective in other application areas, this approach remains challenging for time series modeling due to the large divergence in patterns among different real-world time series categories. In this paper, we propose a multi-domain time series diffusion model with domain prompts, named TimeDP. In TimeDP, we utilize a time series semantic prototype module which defines time series prototypes to represent a time series basis, with each prototype vector serving as a "word" representing some elementary time series feature. A prototype assignment module is applied to extract domain-specific prototype weights, which learn domain prompts as the generation condition. During sampling, we extract a "domain prompt" with few-shot samples from the target domain and use it as the condition to generate time series samples. Experiments demonstrate that our method outperforms baselines, providing state-of-the-art in-domain generation quality and strong unseen-domain generation capability.
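The prototype-assignment idea in the abstract can be sketched as a soft assignment of few-shot samples over a bank of prototype vectors, averaged into a single domain-prompt weight vector. The similarity measure, softmax assignment, and all shapes below are simplifying assumptions for illustration, not TimeDP's exact module.

```python
import numpy as np

def domain_prompt(samples, prototypes):
    """Soft-assign each few-shot sample to a bank of prototype vectors
    (the "words" of the time-series basis) and average the assignment
    weights into one domain-prompt vector used as generation condition."""
    sims = samples @ prototypes.T                    # (n_samples, n_prototypes)
    w = np.exp(sims - sims.max(axis=-1, keepdims=True))
    weights = w / w.sum(axis=-1, keepdims=True)      # softmax over prototypes
    return weights.mean(axis=0)                      # few-shot domain prompt

rng = np.random.default_rng(0)
prototypes = rng.normal(size=(16, 32))   # hypothetical prototype bank
few_shot = rng.normal(size=(5, 32))      # embedded samples from target domain
prompt = domain_prompt(few_shot, prototypes)
print(prompt.shape)                      # one weight per prototype
```

At sampling time this prompt vector would condition the diffusion model, so unseen domains only require a handful of example series rather than retraining.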

NeurIPS Conference 2025 Conference Paper

VLA-Cache: Efficient Vision-Language-Action Manipulation via Adaptive Token Caching

  • Siyu Xu
  • Yunke Wang
  • Chenghao Xia
  • Dihao Zhu
  • Tao Huang
  • Chang Xu

Vision-Language-Action (VLA) models have demonstrated strong multi-modal reasoning capabilities, enabling direct action generation from visual perception and language instructions in an end-to-end manner. However, their substantial computational cost poses a challenge for real-time robotic control, where rapid decision-making is essential. This paper introduces VLA-Cache, a training-free inference acceleration method that reduces computational overhead by adaptively caching and reusing static visual tokens across frames. Exploiting the temporal continuity in robotic manipulation, VLA-Cache identifies minimally changed tokens between adjacent frames and reuses their cached key-value representations, thereby circumventing redundant computations. Additionally, to maintain action precision, VLA-Cache selectively re-computes task-relevant tokens that are environmentally sensitive, ensuring the fidelity of critical visual information. To further optimize efficiency, we introduce a layer-adaptive token reuse strategy that dynamically adjusts the reuse ratio based on attention concentration across decoder layers, prioritizing critical tokens for recomputation. Extensive experiments on two simulation platforms (LIBERO and SIMPLER) and a real-world robotic system demonstrate that VLA-Cache achieves up to 1.7× speedup in CUDA latency and a 15% increase in control frequency, with negligible loss in task success rate. The code and videos can be found at our project page: https://vla-cache.github.io.
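The token-selection step described above can be sketched simply: rank patch tokens by how much they changed between consecutive frames, recompute only the most-changed fraction, and reuse cached key/value states for the rest. The change metric, reuse ratio, and shapes below are illustrative assumptions; the paper additionally weighs task relevance and per-layer attention concentration.

```python
import numpy as np

def select_tokens_to_recompute(prev_tokens, curr_tokens, keep_ratio=0.25):
    """Mark the fraction of patch tokens that changed most since the
    previous frame for recomputation; unmarked tokens would reuse their
    cached key/value representations."""
    change = np.linalg.norm(curr_tokens - prev_tokens, axis=-1)
    k = max(1, int(keep_ratio * len(change)))
    recompute = np.argsort(change)[-k:]        # indices of most-changed tokens
    mask = np.zeros(len(change), dtype=bool)
    mask[recompute] = True
    return mask

rng = np.random.default_rng(1)
prev = rng.normal(size=(16, 8))   # 16 patch tokens from the previous frame
curr = prev.copy()
curr[:4] += 1.0                   # only the first four patches changed
mask = select_tokens_to_recompute(prev, curr, keep_ratio=0.25)
print(mask)                       # changed tokens are flagged for recomputation
```

In a real VLA decoder, the boolean mask would gate which rows of the key/value cache are refreshed at each control step, which is where the latency savings come from.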

AAAI Conference 2025 Conference Paper

WaterDiffusion: Learning a Prior-involved Unrolling Diffusion for Joint Underwater Saliency Detection and Visual Restoration

  • Laibin Chang
  • Yunke Wang
  • Longxiang Deng
  • Bo Du
  • Chang Xu

Underwater salient object detection (USOD) plays a pivotal role in various vision-based marine exploration tasks. However, existing USOD techniques face the dilemma of object mislocalization and imprecise boundaries due to the complex underwater environment. The quality degradation of raw underwater images (caused by selective absorption and medium scattering) makes it challenging to perform instance detection directly. One conceivable approach involves initially removing visual disturbances through underwater image enhancement (UIE), followed by saliency detection. However, this two-stage approach neglects the potential positive impact of the restoration procedure on saliency detection due to it executes in a cascade. Based on this insight, we propose a generalized prior-involved diffusion model, called WaterDiffusion for collaborative underwater saliency detection and visual restoration. Specifically, we first propose a revised self-attention joint diffusion, which embeds dynamic saliency masks into the diffusive network as latent features. By extending the underwater degradation prior into the multi-scale decoder, we innovatively exploit optical transmission maps to aid in localizing underwater salient objects. Then, we further design a gate-guided binary indicator to select either normalized or raw channels for improving feature generalization. Finally, the Half-quadratic Splitting is introduced into the unfolding sampling to refine saliency masks iteratively. Comprehensive experiments demonstrate the superior performance of WaterDiffusion over state-of-the-art methods in both quantitative and qualitative evaluations.

NeurIPS Conference 2024 Conference Paper

A Large-Scale Human-Centric Benchmark for Referring Expression Comprehension in the LMM Era

  • Fangyun Wei
  • Jinjing Zhao
  • Kun Yan
  • Hongyang Zhang
  • Chang Xu

Prior research in human-centric AI has primarily addressed single-modality tasks like pedestrian detection, action recognition, and pose estimation. However, the emergence of large multimodal models (LMMs) such as GPT-4V has redirected attention towards integrating language with visual content. Referring expression comprehension (REC) represents a prime example of this multimodal approach. Current human-centric REC benchmarks, typically sourced from general datasets, fall short in the LMM era due to their limitations, such as insufficient testing samples, overly concise referring expressions, and limited vocabulary, making them inadequate for evaluating the full capabilities of modern REC models. In response, we present HC-RefLoCo (Human-Centric Referring Expression Comprehension with Long Context), a benchmark that includes 13,452 images, 24,129 instances, and 44,738 detailed annotations, encompassing a vocabulary of 18,681 words. Each annotation, meticulously reviewed for accuracy, averages 93.2 words and includes topics such as appearance, human-object interaction, location, action, celebrity, and OCR. HC-RefLoCo provides a wider range of instance scales and diverse evaluation protocols, encompassing accuracy with various IoU criteria, scale-aware evaluation, and subject-specific assessments. Our experiments, which assess 24 models, highlight HC-RefLoCo's potential to advance human-centric AI by challenging contemporary REC models with comprehensive and varied data. Our benchmark, along with the evaluation code, is available at https://github.com/ZhaoJingjing713/HC-RefLoCo.

AAAI Conference 2024 Conference Paper

AMD: Autoregressive Motion Diffusion

  • Bo Han
  • Hao Peng
  • Minjing Dong
  • Yi Ren
  • Yixuan Shen
  • Chang Xu

Human motion generation aims to produce plausible human motion sequences according to various conditional inputs, such as text or audio. Despite the feasibility of existing methods in generating motion based on short prompts and simple motion patterns, they encounter difficulties when dealing with long prompts or complex motions. The challenges are two-fold: 1) the scarcity of human motion-captured data for long prompts and complex motions; 2) the high diversity of human motions in the temporal domain and the substantial divergence of distributions from conditional modalities, leading to a many-to-many mapping problem when generating motion from complex and long texts. In this work, we address these gaps by 1) constructing the first dataset pairing long textual descriptions with complex 3D motions (HumanLong3D), and 2) proposing an autoregressive motion diffusion model (AMD). Specifically, AMD integrates the text prompt at the current timestep with the text prompt and action sequences at the previous timestep as conditional information to predict the current action sequences in an iterative manner. Furthermore, we present its generalization to X-to-Motion with "No Modality Left Behind", enabling for the first time the generation of high-definition and high-fidelity human motions based on user-defined modality input.

IJCAI Conference 2024 Conference Paper

Boosting Diffusion Models with an Adaptive Momentum Sampler

  • Xiyu Wang
  • Anh-Dung Dinh
  • Daochang Liu
  • Chang Xu

Diffusion probabilistic models (DPMs) have been shown to generate high-quality images without the need for delicate adversarial training. The sampling process of DPMs is mathematically similar to Stochastic Gradient Descent (SGD), with both being iteratively updated with a function increment. Building on this, we present a novel reverse sampler for DPMs in this paper, drawing inspiration from the widely-used Adam optimizer. Our proposed sampler can be readily applied to a pre-trained diffusion model, utilizing momentum mechanisms and adaptive updating to enhance the generated image's quality. By effectively reusing update directions from early steps, our proposed sampler achieves a better balance between high-level semantics and low-level details. Additionally, this sampler is flexible and can be easily integrated into pre-trained DPMs regardless of the sampler used during training. Our experimental results on multiple benchmarks demonstrate that our proposed reverse sampler yields remarkable improvements over different baselines.
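The Adam analogy in the abstract can be made concrete with a toy sketch: the per-step update direction of an iterative sampler is smoothed with first- and second-moment estimates before being applied, exactly as Adam smooths gradients. The stand-in "score" below (pulling samples toward the origin), the step size, and the momentum coefficients are all illustrative assumptions; the paper's actual reverse-diffusion update differs.

```python
import numpy as np

def momentum_sampler_step(x, direction, state, beta1=0.9, beta2=0.999,
                          lr=0.1, eps=1e-8):
    """Apply one Adam-style smoothed update to a sampler's per-step
    direction: keep exponential moving averages of the direction and its
    square, bias-correct them, and take a normalized step."""
    m, v, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * direction
    v = beta2 * v + (1 - beta2) * direction ** 2
    m_hat = m / (1 - beta1 ** t)          # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)          # bias-corrected second moment
    x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
    return x, (m, v, t)

# Toy stand-in for the model's predicted increment at each reverse step:
# pull samples toward the mode at the origin.
x = np.array([5.0, -3.0])
state = (np.zeros_like(x), np.zeros_like(x), 0)
for _ in range(200):
    x, state = momentum_sampler_step(x, direction=x, state=state)
print(x)  # samples are driven close to the origin
```

The point of the sketch is the mechanics: reusing a running average of earlier update directions damps step-to-step noise, which is the intuition the paper gives for better balancing high-level semantics and low-level detail.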

NeurIPS Conference 2024 Conference Paper

Enhancing Large Language Models through Adaptive Tokenizers

  • Mengyu Zheng
  • Hanting Chen
  • Tianyu Guo
  • Chong Zhu
  • Binfan Zheng
  • Chang Xu
  • Yunhe Wang

Tokenizers serve as crucial interfaces between models and linguistic data, substantially influencing the efficacy and precision of large language models (LLMs). Traditional tokenization methods often rely on static frequency-based statistics and are not inherently synchronized with LLM architectures, which may limit model performance. In this study, we propose a simple but effective method to learn tokenizers specifically engineered for seamless integration with LLMs. Starting from a broad initial vocabulary, we refine our tokenizer by monitoring changes in the model's perplexity during training, allowing for the selection of a tokenizer that is closely aligned with the model's evolving dynamics. Through iterative refinement, we develop an optimized tokenizer. Our empirical evaluations demonstrate that this adaptive approach significantly enhances accuracy compared to conventional methods, maintaining comparable vocabulary sizes and affirming its potential to improve LLM functionality.

AAAI Conference 2024 Conference Paper

Harnessing Edge Information for Improved Robustness in Vision Transformers

  • Yanxi Li
  • Chengbin Du
  • Chang Xu

Deep Neural Networks (DNNs) have demonstrated remarkable accuracy in vision classification tasks. However, they are vulnerable to additive perturbations known as adversarial attacks. Previous studies hypothesize that this vulnerability might stem from the fact that high-accuracy DNNs heavily rely on irrelevant and non-robust features, such as textures and the background. In this work, we reveal that edge information extracted from images can provide relevant and robust features related to shapes and the foreground. These features assist pretrained DNNs in achieving improved adversarial robustness without compromising their accuracy on clean images. A lightweight and plug-and-play EdgeNet is proposed, which can be seamlessly integrated into existing pretrained DNNs, including Vision Transformers, a recent family of state-of-the-art models for vision classification. Our EdgeNet can process edges derived from either clean natural images or noisy adversarial images, yielding robust features which can be injected into the intermediate layers of the frozen backbone DNNs. The cost of obtaining such edges using conventional edge detection algorithms (e.g., the Canny edge detector) is marginal, and the cost of training the EdgeNet is equivalent to that of fine-tuning the backbone network with techniques such as Adapter.

AAAI Conference 2024 Conference Paper

Learning Visual Abstract Reasoning through Dual-Stream Networks

  • Kai Zhao
  • Chang Xu
  • Bailu Si

Visual abstract reasoning tasks present challenges for deep neural networks, exposing limitations in their capabilities. In this work, we present a neural network model that addresses the challenges posed by Raven's Progressive Matrices (RPM). Inspired by the two-stream hypothesis of visual processing, we introduce the Dual-stream Reasoning Network (DRNet), which utilizes two parallel branches to capture image features. On top of the two streams, a reasoning module first learns to merge the high-level features of the same image. Then, it employs a rule extractor to handle combinations involving the eight context images and each candidate image, extracting discrete abstract rules and utilizing a multilayer perceptron (MLP) to make predictions. Empirical results demonstrate that the proposed DRNet achieves state-of-the-art average performance across multiple RPM benchmarks. Furthermore, DRNet demonstrates robust generalization capabilities, even extending to various out-of-distribution scenarios. The dual streams within DRNet serve distinct functions by addressing local or spatial information. They are then integrated into the reasoning module, leveraging abstract rules to facilitate the execution of visual reasoning tasks. These findings indicate that the dual-stream architecture could play a crucial role in visual abstract reasoning.

NeurIPS Conference 2024 Conference Paper

Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model

  • Yuheng Shi
  • Minjing Dong
  • Chang Xu

Despite the significant achievements of Vision Transformers (ViTs) in various vision tasks, they are constrained by quadratic complexity. Recently, State Space Models (SSMs) have garnered widespread attention due to their global receptive field and linear complexity with respect to the input length, demonstrating substantial potential across fields including natural language processing and computer vision. To improve the performance of SSMs in vision tasks, a multi-scan strategy is widely adopted, which leads to significant redundancy of SSMs. For a better trade-off between efficiency and performance, we analyze the underlying reasons behind the success of the multi-scan strategy, where long-range dependency plays an important role. Based on the analysis, we introduce Multi-Scale Vision Mamba (MSVMamba) to preserve the superiority of SSMs in vision tasks with limited parameters. It employs a multi-scale 2D scanning technique on both original and downsampled feature maps, which not only benefits long-range dependency learning but also reduces computational costs. Additionally, we integrate a Convolutional Feed-Forward Network (ConvFFN) to address the lack of channel mixing. Our experiments demonstrate that MSVMamba is highly competitive, with the MSVMamba-Tiny model achieving 83.0% top-1 accuracy on ImageNet, 46.9% box mAP and 42.5% instance mAP with the Mask R-CNN framework and 1x training schedule on COCO, and 47.9% mIoU with single-scale testing on ADE20K. Code is available at https://github.com/YuHengsss/MSVMamba.

NeurIPS Conference 2023 Conference Paper

Adversarial Robustness through Random Weight Sampling

  • Yanxiang Ma
  • Minjing Dong
  • Chang Xu

Deep neural networks have been found to be vulnerable in a variety of tasks. Adversarial attacks can manipulate network outputs, resulting in incorrect predictions. Adversarial defense methods aim to improve the adversarial robustness of networks by countering potential attacks. In addition to traditional defense approaches, randomized defense mechanisms have recently received increasing attention from researchers. These methods introduce different types of perturbations during the inference phase to destabilize adversarial attacks. Although promising empirical results have been demonstrated by these approaches, the defense performance is quite sensitive to the randomness parameters, which are always manually tuned without further analysis. In contrast, we propose incorporating random weights into the optimization to fully exploit the potential of randomized defense. To better optimize the randomness parameters, we conduct a theoretical analysis of the connections between randomness parameters and gradient similarity as well as natural performance. From these two aspects, we suggest imposing theoretically-guided constraints on random weights during optimization, as these weights play a critical role in balancing natural performance and adversarial robustness. We derive both the upper and lower bounds of random weight parameters by considering prediction bias and gradient similarity. In this study, we introduce the Constrained Trainable Random Weight (CTRW), which adds random weight parameters to the optimization and includes a constraint guided by the upper and lower bounds to achieve better trade-offs between natural and robust accuracy. We evaluate the effectiveness of CTRW on several datasets and benchmark convolutional neural networks. Our results indicate that our model achieves a robust accuracy approximately 16% to 17% higher than the baseline model under PGD-20 and 22% to 25% higher under AutoAttack.
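The constraint mechanism described in the abstract can be sketched as projecting trainable randomness parameters into derived bounds before sampling the random weights. The bound values, Gaussian parameterization, and shapes below are illustrative placeholders; the paper derives its bounds from prediction bias and gradient similarity rather than fixing them by hand.

```python
import numpy as np

def sample_constrained_random_weights(mu, sigma, lower, upper, rng):
    """Draw random weights around trainable location mu and scale sigma,
    after projecting sigma into [lower, upper] -- the constraint step that
    keeps the randomness in a range balancing clean accuracy (small sigma)
    against attack destabilization (large sigma)."""
    sigma = np.clip(sigma, lower, upper)          # enforce derived bounds
    return mu + sigma * rng.standard_normal(mu.shape)

rng = np.random.default_rng(0)
mu = np.zeros((4, 4))
sigma = np.full((4, 4), 5.0)   # unconstrained randomness would be too large
w = sample_constrained_random_weights(mu, sigma, lower=0.1, upper=1.0, rng=rng)
print(w.std())                 # effective scale is capped at the upper bound
```

Because a fresh draw happens at every forward pass, an attacker's gradient is computed against a different realization of the weights than the one used at inference, which is the destabilizing effect randomized defenses rely on.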

NeurIPS Conference 2023 Conference Paper

Beyond Pretrained Features: Noisy Image Modeling Provides Adversarial Defense

  • Zunzhi You
  • Daochang Liu
  • Bohyung Han
  • Chang Xu

Recent advancements in masked image modeling (MIM) have made it a prevailing framework for self-supervised visual representation learning. The MIM pretrained models, like most deep neural network methods, remain vulnerable to adversarial attacks, limiting their practical application, and this issue has received little research attention. In this paper, we investigate how this powerful self-supervised learning paradigm can provide adversarial robustness to downstream classifiers. During the exploration, we find that noisy image modeling (NIM), a simple variant of MIM that adopts denoising as the pretext task, reconstructs noisy images surprisingly well despite severe corruption. Motivated by this observation, we propose an adversarial defense method, referred to as De^3, by exploiting the pretrained decoder for denoising. Through De^3, NIM is able to enhance adversarial robustness beyond providing pretrained features. Furthermore, we incorporate a simple modification, sampling the noise scale hyperparameter from random distributions, and enable the defense to achieve a better and tunable trade-off between accuracy and robustness. Experimental results demonstrate that, in terms of adversarial robustness, NIM is superior to MIM thanks to its effective denoising capability. Moreover, the defense provided by NIM achieves performance on par with adversarial training while offering the extra tunability advantage. Source code and models are available at https://github.com/youzunzhi/NIM-AdvDef.

AAAI Conference 2023 Conference Paper

Boosting Semi-Supervised Semantic Segmentation with Probabilistic Representations

  • Haoyu Xie
  • Changqi Wang
  • Mingkai Zheng
  • Minjing Dong
  • Shan You
  • Chong Fu
  • Chang Xu

Recent breakthroughs in semi-supervised semantic segmentation have been developed through contrastive learning. In prevalent pixel-wise contrastive learning solutions, the model maps pixels to deterministic representations and regularizes them in the latent space. However, there exist inaccurate pseudo-labels which map the ambiguous representations of pixels to the wrong classes due to the limited cognitive ability of the model. In this paper, we define pixel-wise representations from a new perspective of probability theory and propose a Probabilistic Representation Contrastive Learning (PRCL) framework that improves representation quality by taking its probability into consideration. By modelling the mapping from pixels to representations as a probability via multivariate Gaussian distributions, we can tune the contribution of ambiguous representations to tolerate the risk of inaccurate pseudo-labels. Furthermore, we define prototypes in the form of distributions, which indicate the confidence of a class, something a point prototype cannot express. Moreover, we propose to regularize the distribution variance to enhance the reliability of representations. Taking advantage of these benefits, high-quality feature representations can be derived in the latent space, thereby further improving the performance of semantic segmentation. We conduct extensive experiments to evaluate PRCL on Pascal VOC and CityScapes and demonstrate its superiority. The code is available at https://github.com/Haoyu-Xie/PRCL.
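The probabilistic-representation intuition can be sketched with a mutual-likelihood-style score (an illustrative toy, not the PRCL code): with Gaussian representations, a larger variance marks ambiguity and lowers the similarity, so unreliable pixels contribute less.

```python
import numpy as np

def gaussian_similarity(mu1, var1, mu2, var2):
    """Mutual-likelihood-style score between two Gaussian representations:
    larger variance (ambiguity) shrinks the score, so unreliable pixels
    contribute less to the contrastive objective."""
    var = var1 + var2
    return float(-0.5 * np.sum((mu1 - mu2) ** 2 / var + np.log(var)))

mu = np.zeros(4)
s_confident = gaussian_similarity(mu, np.full(4, 0.1), mu, np.full(4, 0.1))
s_ambiguous = gaussian_similarity(mu, np.full(4, 0.1), mu, np.full(4, 5.0))
# identical means, but the high-variance (ambiguous) pair scores lower
```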

IJCAI Conference 2023 Conference Paper

Calibrating a Deep Neural Network with Its Predecessors

  • Linwei Tao
  • Minjing Dong
  • Daochang Liu
  • Changming Sun
  • Chang Xu

Confidence calibration - the process of calibrating the output probability distribution of neural networks - is essential for safety-critical applications of such networks. Recent works verify the link between mis-calibration and overfitting. However, early stopping, as a well-known technique to mitigate overfitting, fails to calibrate networks. In this work, we study the limitations of early stopping and comprehensively analyze the overfitting problem of a network considering each individual block. We then propose a novel regularization method, predecessor combination search (PCS), to improve calibration by searching a combination of best-fitting block predecessors, where block predecessors are the corresponding network blocks with weight parameters from earlier training stages. PCS achieves state-of-the-art calibration performance on multiple datasets and architectures. In addition, PCS improves model robustness under dataset distribution shift. Supplementary material and code are available at https://github.com/Linwei94/PCS

AAAI Conference 2023 Conference Paper

ContraFeat: Contrasting Deep Features for Semantic Discovery

  • Xinqi Zhu
  • Chang Xu
  • Dacheng Tao

StyleGAN has shown strong potential for disentangled semantic control, thanks to its special design of multi-layer intermediate latent variables. However, existing semantic discovery methods on StyleGAN rely on manual selection of modified latent layers to obtain satisfactory manipulation results, which is tedious and demanding. In this paper, we propose a model that automates this process and achieves state-of-the-art semantic discovery performance. The model consists of an attention-equipped navigator module and losses contrasting deep-feature changes. We propose two model variants, with one contrasting samples in a binary manner, and another one contrasting samples with learned prototype variation patterns. The proposed losses are computed with pretrained deep features, based on our assumption that the features implicitly possess the desired semantic variation structure including consistency and orthogonality. Additionally, we design two metrics to quantitatively evaluate the performance of semantic discovery methods on FFHQ dataset, and also show that disentangled representations can be derived via a simple training process. Experimentally, we show that our models achieve state-of-the-art semantic discovery results without relying on layer-wise manual selection, and these discovered semantics can be used to manipulate real-world images.

NeurIPS Conference 2023 Conference Paper

Contrastive Sampling Chains in Diffusion Models

  • Junyu Zhang
  • Daochang Liu
  • Shichao Zhang
  • Chang Xu

The past few years have witnessed great success in the use of diffusion models (DMs) to generate high-fidelity images with the help of stochastic differential equations (SDEs). However, discretization error is an inevitable limitation when utilizing numerical solvers to solve SDEs. To address this limitation, we provide a theoretical analysis demonstrating that an appropriate combination of the contrastive loss and score matching serves as an upper bound of the KL divergence between the true data distribution and the model distribution. To obtain this bound, we utilize a contrastive loss to construct a contrastive sampling chain to fine-tune the pre-trained DM. In this manner, our method reduces the discretization error and thus yields a smaller gap between the true data distribution and our model distribution. Moreover, the presented method can be applied to fine-tuning various pre-trained DMs, both with or without fast sampling algorithms, contributing to better sample quality or slightly faster sampling speeds. To validate the efficacy of our method, we conduct comprehensive experiments. For example, on CIFAR10, when applied to a pre-trained EDM, our method improves the FID from 2.04 to 1.88 with 35 neural function evaluations (NFEs), and reduces NFEs from 35 to 25 to achieve the same 2.04 FID.

NeurIPS Conference 2023 Conference Paper

Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundation Models

  • Yichao Cao
  • Qingfei Tang
  • Xiu Su
  • Song Chen
  • Shan You
  • Xiaobo Lu
  • Chang Xu

Human-object interaction (HOI) detection aims to comprehend the intricate relationships between humans and objects, predicting ⟨human, action, object⟩ triplets, and serving as the foundation for numerous computer vision tasks. The complexity and diversity of human-object interactions in the real world, however, pose significant challenges for both annotation and recognition, particularly in recognizing interactions within an open world context. This study explores universal interaction recognition in an open-world setting through the use of Vision-Language (VL) foundation models and large language models (LLMs). The proposed method is dubbed UniHOI. We conduct a deep analysis of the three hierarchical features inherent in visual HOI detectors and propose a method for high-level relation extraction aimed at VL foundation models, which we call HO prompt-based learning. Our design includes an HO Prompt-guided Decoder (HOPD), which facilitates the association of high-level relation representations in the foundation model with various HO pairs within the image. Furthermore, we utilize an LLM (i.e., GPT) for interaction interpretation, generating a richer linguistic understanding for complex HOIs. For open-category interaction recognition, our method supports either of two input types: interaction phrase or interpretive sentence. Our efficient architecture design and learning methods effectively unleash the potential of the VL foundation models and LLMs, allowing UniHOI to surpass all existing methods by a substantial margin, under both supervised and zero-shot settings. The code and pre-trained weights will be made publicly available.

NeurIPS Conference 2023 Conference Paper

Knowledge Diffusion for Distillation

  • Tao Huang
  • Yuan Zhang
  • Mingkai Zheng
  • Shan You
  • Fei Wang
  • Chen Qian
  • Chang Xu

The representation gap between teacher and student is an emerging topic in knowledge distillation (KD). To reduce the gap and improve the performance, current methods often resort to complicated training schemes, loss functions, and feature alignments, which are task-specific and feature-specific. In this paper, we state that the essence of these methods is to discard the noisy information and distill the valuable information in the feature, and propose a novel KD method dubbed DiffKD, to explicitly denoise and match features using diffusion models. Our approach is based on the observation that student features typically contain more noise than teacher features due to the smaller capacity of the student model. To address this, we propose to denoise student features using a diffusion model trained by teacher features. This allows us to perform better distillation between the refined clean feature and the teacher feature. Additionally, we introduce a light-weight diffusion model with a linear autoencoder to reduce the computation cost and an adaptive noise matching module to improve the denoising performance. Extensive experiments demonstrate that DiffKD is effective across various types of features and achieves state-of-the-art performance consistently on image classification, object detection, and semantic segmentation tasks. Code is available at https://github.com/hunto/DiffKD.

AAAI Conference 2023 Conference Paper

Neural Architecture Search for Wide Spectrum Adversarial Robustness

  • Zhi Cheng
  • Yanxi Li
  • Minjing Dong
  • Xiu Su
  • Shan You
  • Chang Xu

One major limitation of CNNs is that they are vulnerable to adversarial attacks. Currently, adversarial robustness in neural networks is commonly optimized with respect to a small pre-selected adversarial noise strength, causing them to have potentially limited performance when under attack by larger adversarial noises in real-world scenarios. In this research, we aim to find Neural Architectures that have improved robustness on a wide range of adversarial noise strengths through Neural Architecture Search. In detail, we propose a lightweight Adversarial Noise Estimator to reduce the high cost of generating adversarial noise with respect to different strengths. Besides, we construct an Efficient Wide Spectrum Searcher to reduce the cost of adjusting network architecture with the large adversarial validation set during the search. With the two components proposed, the number of adversarial noise strengths searched can be increased significantly while having a limited increase in search time. Extensive experiments on benchmark datasets such as CIFAR and ImageNet demonstrate that with a significantly richer search signal in robustness, our method can find architectures with improved overall robustness while having a limited impact on natural accuracy and around 40% reduction in search time compared with the naive approach of searching. Codes available at: https://github.com/zhicheng2T0/Wsr-NAS.git

NeurIPS Conference 2023 Conference Paper

One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation

  • Zhiwei Hao
  • Jianyuan Guo
  • Kai Han
  • Yehui Tang
  • Han Hu
  • Yunhe Wang
  • Chang Xu

Knowledge distillation (KD) has proven to be a highly effective approach for enhancing model performance through a teacher-student training scheme. However, most existing distillation methods are designed under the assumption that the teacher and student models belong to the same model family, particularly the hint-based approaches. By using centered kernel alignment (CKA) to compare the learned features between heterogeneous teacher and student models, we observe significant feature divergence. This divergence illustrates the ineffectiveness of previous hint-based methods in cross-architecture distillation. To tackle the challenge in distilling heterogeneous models, we propose a simple yet effective one-for-all KD framework called OFA-KD, which significantly improves the distillation performance between heterogeneous architectures. Specifically, we project intermediate features into an aligned latent space such as the logits space, where architecture-specific information is discarded. Additionally, we introduce an adaptive target enhancement scheme to prevent the student from being disturbed by irrelevant information. Extensive experiments with various architectures, including CNN, Transformer, and MLP, demonstrate the superiority of our OFA-KD framework in enabling distillation between heterogeneous architectures. Specifically, when equipped with our OFA-KD, the student models achieve notable performance improvements, with a maximum gain of 8.0% on the CIFAR-100 dataset and 0.7% on the ImageNet-1K dataset. PyTorch code and checkpoints can be found at https://github.com/Hao840/OFAKD.
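A hedged sketch of the logits-space alignment idea (the projection head, shapes, and names here are stand-ins for illustration, not the released OFA-KD code):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def logits_space_kd(student_feat, proj_w, teacher_logits):
    """Project a heterogeneous student's intermediate feature into the
    shared logits space, where architecture-specific structure is
    discarded, and match the teacher's distribution there."""
    student_logits = student_feat @ proj_w          # exemplar projection head
    p_t = softmax(teacher_logits)
    p_s = softmax(student_logits)
    return float(np.sum(p_t * (np.log(p_t) - np.log(p_s + 1e-12))))  # KL(t||s)

rng = np.random.default_rng(1)
feat = rng.standard_normal((1, 16))     # e.g. a ViT token feature
proj = rng.standard_normal((16, 10))    # learnable projection to 10 classes
loss = logits_space_kd(feat, proj, rng.standard_normal((1, 10)))
```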

NeurIPS Conference 2023 Conference Paper

Rethinking Conditional Diffusion Sampling with Progressive Guidance

  • Anh-Dung Dinh
  • Daochang Liu
  • Chang Xu

This paper tackles two critical challenges encountered in classifier guidance for diffusion generative models, i.e., the lack of diversity and the presence of adversarial effects. These issues often result in a scarcity of diverse samples or the generation of non-robust features. The underlying cause lies in the mechanism of classifier guidance, where discriminative gradients push samples to be recognized as conditions aggressively. This inadvertently suppresses information with common features among relevant classes, resulting in a limited pool of features with less diversity or the absence of robust features for image construction. We propose a generalized classifier guidance method called Progressive Guidance, which mitigates the problems by allowing relevant classes' gradients to contribute to shared information construction when the image is noisy in early sampling steps. In the later sampling stage, we progressively enhance gradients to refine the details in the image toward the primary condition. This helps to attain a high level of diversity and robustness compared to the vanilla classifier guidance. Experimental results demonstrate that our proposed method further improves the image quality while offering a significant level of diversity as well as robust features.
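The progressive idea can be sketched as a time-dependent guidance scale (the polynomial schedule and the `eps - s(t) * grad` form below are assumptions for illustration, not the paper's exact formulation):

```python
import numpy as np

def guidance_scale(t, t_max, s_max=5.0, power=2.0):
    """Progressive schedule: weak guidance while the sample is still
    noisy (large t), stronger gradients near the end to refine details."""
    progress = 1.0 - t / t_max        # 0 at the first step, 1 at the last
    return s_max * progress ** power

def guided_noise(eps_pred, class_grad, t, t_max):
    """Classifier-guided noise prediction with a time-dependent scale."""
    return eps_pred - guidance_scale(t, t_max) * class_grad

t_max = 1000
early = guidance_scale(t=990, t_max=t_max)   # almost no push early on
late = guidance_scale(t=10, t_max=t_max)     # strong refinement late
```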

NeurIPS Conference 2023 Conference Paper

Revisit the Power of Vanilla Knowledge Distillation: from Small Scale to Large Scale

  • Zhiwei Hao
  • Jianyuan Guo
  • Kai Han
  • Han Hu
  • Chang Xu
  • Yunhe Wang

The tremendous success of large models trained on extensive datasets demonstrates that scale is a key ingredient in achieving superior results. Therefore, the reflection on the rationality of designing knowledge distillation (KD) approaches for limited-capacity architectures solely based on small-scale datasets is now deemed imperative. In this paper, we identify the small data pitfall present in previous KD methods, which results in the underestimation of the power of the vanilla KD framework on large-scale datasets such as ImageNet-1K. Specifically, we show that employing stronger data augmentation techniques and using larger datasets can directly decrease the gap between vanilla KD and other meticulously designed KD variants. This highlights the necessity of designing and evaluating KD approaches in the context of practical scenarios, casting off the limitations of small-scale datasets. Our investigation of the vanilla KD and its variants in more complex schemes, including stronger training strategies and different model capacities, demonstrates that vanilla KD is elegantly simple but astonishingly effective in large-scale scenarios. Without bells and whistles, we obtain state-of-the-art ResNet-50, ViT-S, and ConvNeXtV2-T models for ImageNet, which achieve 83.1%, 84.3%, and 85.0% top-1 accuracy, respectively. PyTorch code and checkpoints can be found at https://github.com/Hao840/vanillaKD.
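The vanilla KD objective being revisited here is Hinton's classic temperature-softened KL term; a minimal NumPy rendering for reference:

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def vanilla_kd_loss(student_logits, teacher_logits, T=4.0):
    """Classic distillation loss: KL between temperature-softened
    teacher and student distributions, scaled by T^2 to keep gradient
    magnitudes comparable across temperatures."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t) - np.log(p_s)), axis=-1)
    return float(T * T * kl.mean())

teacher = np.array([[2.0, 0.5, -1.0]])
matched = vanilla_kd_loss(teacher, teacher)               # identical logits
mismatched = vanilla_kd_loss(np.array([[0.1, 0.2, 0.3]]), teacher)
```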

NeurIPS Conference 2023 Conference Paper

Stable Diffusion is Unstable

  • Chengbin Du
  • Yanxi Li
  • Zhongwei Qiu
  • Chang Xu

Recently, text-to-image models have been thriving. Despite their powerful generative capacity, our research has uncovered a lack of robustness in this generation process. Specifically, the introduction of small perturbations to the text prompts can result in the blending of primary subjects with other categories or their complete disappearance in the generated images. In this paper, we propose Auto-attack on Text-to-image Models (ATM), a gradient-based approach, to effectively and efficiently generate such perturbations. By learning a Gumbel Softmax distribution, we can make the discrete process of word replacement or extension continuous, thus ensuring the differentiability of the perturbation generation. Once the distribution is learned, ATM can sample multiple attack samples simultaneously. These attack samples can prevent the generative model from generating the desired subjects without tampering with the category keywords in the prompt. ATM has achieved a 91.1% success rate in short-text attacks and an 81.2% success rate in long-text attacks. Further empirical analysis revealed three attack patterns based on: 1) variability in generation speed, 2) similarity of coarse-grained characteristics, and 3) polysemy of words. The code is available at https://github.com/duchengbin8/Stable_Diffusion_is_Unstable
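The Gumbel-Softmax relaxation at the core of ATM's continuous word replacement can be sketched as follows (an illustrative toy, not the authors' code; the token logits are hypothetical):

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Differentiable relaxation of discrete sampling: perturb logits
    with Gumbel noise, then apply a temperature-controlled softmax."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(1e-9, 1.0 - 1e-9, size=logits.shape)
    y = (logits - np.log(-np.log(u))) / tau
    y = y - y.max()
    e = np.exp(y)
    return e / e.sum()

rng = np.random.default_rng(0)
token_logits = np.array([2.0, 0.1, -1.0, 0.5])   # learned replacement distribution
soft = gumbel_softmax(token_logits, tau=0.5, rng=rng)
hard = int(np.argmax(soft))   # sampled word index for the perturbed prompt
```

Because the soft sample is a differentiable function of the logits, gradients from the attack objective can flow back into the replacement distribution.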

AAAI Conference 2023 Conference Paper

Unlabeled Imperfect Demonstrations in Adversarial Imitation Learning

  • Yunke Wang
  • Bo Du
  • Chang Xu

Adversarial imitation learning has become a widely used imitation learning framework. The discriminator is often trained by taking expert demonstrations and policy trajectories as examples respectively from two categories (positive vs. negative) and the policy is then expected to produce trajectories that are indistinguishable from the expert demonstrations. But in the real world, the collected expert demonstrations are more likely to be imperfect, where only an unknown fraction of the demonstrations are optimal. Instead of treating imperfect expert demonstrations as absolutely positive or negative, we investigate unlabeled imperfect expert demonstrations as they are. A positive-unlabeled adversarial imitation learning algorithm is developed to dynamically sample expert demonstrations that can well match the trajectories from the constantly optimized agent policy. The trajectories of an initial agent policy could be closer to those non-optimal expert demonstrations, but within the framework of adversarial imitation learning, the agent policy will be optimized to cheat the discriminator and produce trajectories that are similar to those optimal expert demonstrations. Theoretical analysis shows that our method learns from the imperfect demonstrations in a self-paced way. Experimental results on the MuJoCo and RoboSuite platforms demonstrate the effectiveness of our method from different aspects.

NeurIPS Conference 2022 Conference Paper

GhostNetV2: Enhance Cheap Operation with Long-Range Attention

  • Yehui Tang
  • Kai Han
  • Jianyuan Guo
  • Chang Xu
  • Chao Xu
  • Yunhe Wang

Light-weight convolutional neural networks (CNNs) are specially designed for applications on mobile devices with faster inference speed. The convolutional operation can only capture local information in a window region, which prevents performance from being further improved. Introducing self-attention into convolution can capture global information well, but it will largely encumber the actual speed. In this paper, we propose a hardware-friendly attention mechanism (dubbed DFC attention) and then present a new GhostNetV2 architecture for mobile applications. The proposed DFC attention is constructed based on fully-connected layers, which can not only execute fast on common hardware but also capture the dependence between long-range pixels. We further revisit the expressiveness bottleneck in the previous GhostNet and propose to enhance expanded features produced by cheap operations with DFC attention, so that a GhostNetV2 block can aggregate local and long-range information simultaneously. Extensive experiments demonstrate the superiority of GhostNetV2 over existing architectures. For example, it achieves 75.3% top-1 accuracy on ImageNet with 167M FLOPs, significantly surpassing GhostNetV1 (74.5%) with a similar computational cost. The source code will be available at https://github.com/huawei-noah/Efficient-AI-Backbones/tree/master/ghostnetv2_pytorch and https://gitee.com/mindspore/models/tree/master/research/cv/ghostnetv2_pytorch.

NeurIPS Conference 2022 Conference Paper

Knowledge Distillation from A Stronger Teacher

  • Tao Huang
  • Shan You
  • Fei Wang
  • Chen Qian
  • Chang Xu

Unlike existing knowledge distillation methods, which focus on baseline settings where the teacher models and training strategies are not as strong and competitive as state-of-the-art approaches, this paper presents a method dubbed DIST to distill better from a stronger teacher. We empirically find that the discrepancy of predictions between the student and a stronger teacher tends to be fairly severe. As a result, the exact match of predictions in KL divergence would disturb the training and make existing methods perform poorly. In this paper, we show that simply preserving the relations between the predictions of teacher and student would suffice, and propose a correlation-based loss to capture the intrinsic inter-class relations from the teacher explicitly. Besides, considering that different instances have different semantic similarities to each class, we also extend this relational match to the intra-class level. Our method is simple yet practical, and extensive experiments demonstrate that it adapts well to various architectures, model sizes and training strategies, and can achieve state-of-the-art performance consistently on image classification, object detection, and semantic segmentation tasks. Code is available at: https://github.com/hunto/DIST_KD.
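The correlation-based relational match can be sketched as follows (a simplified reading: Pearson correlation over rows for inter-class relations and over columns for intra-class relations; details such as temperature scaling are omitted):

```python
import numpy as np

def pearson(a, b):
    a, b = a - a.mean(), b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def relation_kd_loss(student_probs, teacher_probs):
    """Relational match: preserve inter-class relations (per instance,
    rows) and intra-class relations (per class, columns) rather than
    exactly matching the teacher's probabilities."""
    inter = np.mean([1 - pearson(s, t) for s, t in zip(student_probs, teacher_probs)])
    intra = np.mean([1 - pearson(s, t) for s, t in zip(student_probs.T, teacher_probs.T)])
    return inter + intra

teacher = np.array([[0.7, 0.2, 0.1],
                    [0.1, 0.6, 0.3]])
shifted = 0.5 * teacher + 0.1   # different values, same relations
loss = relation_kd_loss(shifted, teacher)
# a correlation match tolerates affine differences that an exact KL match would punish
```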

NeurIPS Conference 2022 Conference Paper

Random Normalization Aggregation for Adversarial Defense

  • Minjing Dong
  • Xinghao Chen
  • Yunhe Wang
  • Chang Xu

The vulnerability of deep neural networks has been widely found in various models as well as tasks where slight perturbations on the inputs could lead to incorrect predictions. These perturbed inputs are known as adversarial examples and one of their intriguing properties is Adversarial Transferability, i.e., the capability of adversarial examples to fool other models. Traditionally, this transferability is always regarded as a critical threat to the defense against adversarial attacks; however, we argue that the network robustness can be significantly boosted by utilizing adversarial transferability from a new perspective. In this work, we first discuss the influence of different popular normalization layers on the adversarial transferability, and then provide both empirical evidence and theoretical analysis to shed light on the relationship between normalization types and transferability. Based on our theoretical analysis, we propose a simple yet effective module named Random Normalization Aggregation (RNA) which replaces the batch normalization layers in the networks and aggregates different selected normalization types to form a huge random space. Specifically, a random path is sampled during each inference procedure so that the network itself can be treated as an ensemble of a wide range of different models. Since the entire random space is designed with low adversarial transferability, it is difficult to perform effective attacks even when the network parameters are accessible. We conduct extensive experiments on various models and datasets, and demonstrate the strong superiority of the proposed algorithm. The PyTorch code is available at https://github.com/UniSerj/Random-Norm-Aggregation and the MindSpore code is available at https://gitee.com/mindspore/models/tree/master/research/cv/RNA.
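A stripped-down sketch of the random-path idea (toy 2-D normalizations over a feature matrix, not the RNA module itself):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # normalize each feature across the batch dimension
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def instance_norm(x, eps=1e-5):
    # normalize each sample across its own features
    return (x - x.mean(axis=1, keepdims=True)) / np.sqrt(x.var(axis=1, keepdims=True) + eps)

def rna_forward(x, rng, depth=3):
    """Core trick of Random Normalization Aggregation, reduced to a toy:
    each inference samples one normalization type per layer, so the
    network acts like a randomly drawn member of a large ensemble."""
    for _ in range(depth):
        norm = batch_norm if rng.random() < 0.5 else instance_norm
        x = norm(x)
    return x

path_rng = np.random.default_rng(0)
x = np.random.default_rng(1).standard_normal((8, 5))
y = rna_forward(x, path_rng)   # a fresh random path is drawn per call
```

An attacker's gradients are computed against one sampled path, while the next inference follows another, which is the low-transferability effect the abstract exploits.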

NeurIPS Conference 2022 Conference Paper

Searching for Better Spatio-temporal Alignment in Few-Shot Action Recognition

  • Yichao Cao
  • Xiu Su
  • Qingfei Tang
  • Shan You
  • Xiaobo Lu
  • Chang Xu

Spatio-temporal feature matching and alignment are essential for few-shot action recognition as they determine the coherence and effectiveness of the temporal patterns. Nevertheless, this process may not be reliable, especially when dealing with complex video scenarios. In this paper, we propose to improve the performance of matching and alignment through the end-to-end design of models. Our solution is two-fold. First, we enhance the extracted spatio-temporal representations from few-shot videos from the perspective of architectures. To this end, we propose a specialized transformer search method for videos, so that the spatial and temporal attention can be well-organized and optimized for stronger feature representations. Second, we also design an efficient non-parametric spatio-temporal prototype alignment strategy to better handle the high variability of motion. In particular, a query-specific class prototype is generated for each query sample and category, which can better match query sequences against all support sequences. By doing so, our method SST achieves significant superiority on the benchmark UCF101 and HMDB51 datasets. For example, with no pretraining, our method achieves a 17.1% Top-1 accuracy improvement over the baseline TRX on the UCF101 5-way 1-shot setting while using 3x fewer FLOPs.

AAAI Conference 2021 Conference Paper

Adversarial Robustness through Disentangled Representations

  • Shuo Yang
  • Tianyu Guo
  • Yunhe Wang
  • Chang Xu

Despite the remarkable empirical performance of deep learning models, their vulnerability to adversarial examples has been revealed in many studies. They are prone to making incorrect predictions on inputs with imperceptible adversarial perturbations. Although recent works have remarkably improved model robustness under the adversarial training strategy, an evident gap between natural accuracy and adversarial robustness inevitably exists. To mitigate this problem, in this paper, we assume that the robust and non-robust representations are two basic ingredients entangled in the integral representation. For achieving adversarial robustness, the robust representations of natural and adversarial examples should be disentangled from the non-robust part, and the alignment of the robust representations can bridge the gap between accuracy and robustness. Inspired by this motivation, we propose a novel defence method called Deep Robust Representation Disentanglement Network (DRRDN). Specifically, DRRDN employs a disentangler to extract and align the robust representations from both adversarial and natural examples. Theoretical analysis guarantees the mitigation of the trade-off between robustness and accuracy with good disentanglement and alignment performance. Experimental results on benchmark datasets finally demonstrate the empirical superiority of our method.

NeurIPS Conference 2021 Conference Paper

An Empirical Study of Adder Neural Networks for Object Detection

  • Xinghao Chen
  • Chang Xu
  • Minjing Dong
  • Chunjing Xu
  • Yunhe Wang

Adder neural networks (AdderNets) have shown impressive performance on image classification with only addition operations, which are more energy efficient than traditional convolutional neural networks built with multiplications. Compared with classification, there is a strong demand for reducing the energy consumption of modern object detectors via AdderNets for real-world applications such as autonomous driving and face detection. In this paper, we present an empirical study of AdderNets for object detection. We first reveal that the batch normalization statistics in the pre-trained adder backbone should not be frozen, due to the relatively large feature variance of AdderNets. Moreover, we insert more shortcut connections in the neck part and design a new feature fusion architecture to avoid the sparse features of adder layers. We present extensive ablation studies to explore several design choices of adder detectors. Comparisons with state-of-the-art methods are conducted on the COCO and PASCAL VOC benchmarks. Specifically, the proposed Adder FCOS achieves 37.8% AP on the COCO val set, demonstrating performance comparable to that of the convolutional counterpart with an approximately 1.4x energy reduction.

NeurIPS Conference 2021 Conference Paper

Augmented Shortcuts for Vision Transformers

  • Yehui Tang
  • Kai Han
  • Chang Xu
  • An Xiao
  • Yiping Deng
  • Chao Xu
  • Yunhe Wang

Transformer models have achieved great progress on computer vision tasks recently. The rapid development of vision transformers is mainly attributed to their high representation ability for extracting informative features from input images. However, the mainstream transformer models are designed with deep architectures, and the feature diversity is continuously reduced as the depth increases, i.e., feature collapse. In this paper, we theoretically analyze the feature collapse phenomenon and study the relationship between shortcuts and feature diversity in these transformer models. Then, we present an augmented shortcut scheme, which inserts additional paths with learnable parameters in parallel to the original shortcuts. To save computational costs, we further explore an efficient approach that uses block-circulant projection to implement the augmented shortcuts. Extensive experiments conducted on benchmark datasets demonstrate the effectiveness of the proposed method, which brings an accuracy increase of about 1% for state-of-the-art vision transformers without obviously increasing their parameters and FLOPs.
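The augmented-shortcut scheme can be sketched as identity + main branch + parallel learnable paths (a hedged toy: dense matrices below stand in for the block-circulant projections, and the attention branch is stubbed out):

```python
import numpy as np

def augmented_shortcut_block(x, attn, thetas):
    """An augmented-shortcut block: alongside the identity shortcut and
    the attention branch, add parallel learnable projections of x,
    which helps preserve feature diversity as depth grows."""
    out = x + attn(x)                 # original shortcut + main branch
    for theta in thetas:              # augmented paths with parameters
        out = out + x @ theta
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
identity_attn = lambda z: np.zeros_like(z)   # stub for the attention branch
thetas = [0.01 * rng.standard_normal((8, 8)) for _ in range(2)]
y = augmented_shortcut_block(x, identity_attn, thetas)
```

With no augmented paths and a zero attention branch, the block reduces to the plain identity shortcut.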

NeurIPS Conference 2021 Conference Paper

Handling Long-tailed Feature Distribution in AdderNets

  • Minjing Dong
  • Yunhe Wang
  • Xinghao Chen
  • Chang Xu

Adder neural networks (ANNs) are designed for low energy cost, replacing expensive multiplications in convolutional neural networks (CNNs) with cheaper additions to yield energy-efficient neural networks and hardware accelerations. Although ANNs achieve satisfactory efficiency, there exist gaps between ANNs and CNNs, where the accuracy of ANNs can hardly match that of CNNs without the assistance of other training tricks, such as knowledge distillation. The inherent discrepancy lies in the similarity measurement between filters and features; however, how to alleviate this difference remains unexplored. To locate the potential problem of ANNs, we focus on the property difference due to similarity measurement. We demonstrate that unordered heavy tails in ANNs could be the key component preventing ANNs from achieving superior classification performance, since fatter tails tend to overlap in feature space. Through pre-defining Multivariate Skew Laplace distributions and embedding feature distributions into the loss function, ANN features can be fully controlled and designed for various properties. We further present a novel method for tackling the existing heavy tails in ANNs with only a modification of the classifier, where ANN features are clustered with their tails well-formulated through a proposed angle-based constraint on the distribution parameters to encourage high diversity of tails. Experiments conducted on several benchmarks and comparisons with other distributions demonstrate the effectiveness of the proposed approach for boosting the performance of ANNs.
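The similarity measurement the abstract refers to is the adder operation itself: a negative L1 distance computed with additions only, in place of the multiply-heavy cross-correlation of a CNN. A minimal sketch on flattened patches:

```python
import numpy as np

def adder_similarity(x, w):
    """Core AdderNet operation: the response of each filter is the
    negative L1 distance between the input patch and the filter,
    so no multiplications between inputs and weights are needed."""
    return -np.abs(x[:, None, :] - w[None, :, :]).sum(axis=-1)

rng = np.random.default_rng(0)
patches = rng.standard_normal((5, 9))    # 5 flattened input patches
filters = rng.standard_normal((3, 9))    # 3 flattened filters
resp = adder_similarity(patches, filters)
# the best-matching filter for a patch is the one with the smallest L1 distance
```

This L1-based response is what gives ANN features their Laplace-like, heavy-tailed distributions discussed in the abstract.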

NeurIPS Conference 2021 Conference Paper

Learning Frequency Domain Approximation for Binary Neural Networks

  • Yixing Xu
  • Kai Han
  • Chang Xu
  • Yehui Tang
  • Chunjing Xu
  • Yunhe Wang

Binary neural networks (BNNs) represent the original full-precision weights and activations as 1-bit values with the sign function. Since the gradient of the conventional sign function is almost zero everywhere and cannot be used for back-propagation, several attempts have been made to alleviate the optimization difficulty with approximate gradients. However, those approximations corrupt the main direction of the factual gradient. To this end, we propose to estimate the gradient of the sign function in the Fourier frequency domain using a combination of sine functions for training BNNs, namely frequency domain approximation (FDA). The proposed approach does not affect the low-frequency information of the original sign function, which occupies most of the overall energy, while high-frequency coefficients are ignored to avoid huge computational overhead. In addition, we embed a noise adaptation module into the training phase to compensate for the approximation error. Experiments on several benchmark datasets and neural architectures illustrate that the binary network learned using our method achieves state-of-the-art accuracy. Code will be available at https://gitee.com/mindspore/models/tree/master/research/cv/FDA-BNN.
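The "combination of sine functions" has a standard form worth making concrete: the sign function is a square wave on [-pi, pi], whose Fourier series is sign(x) ≈ (4/pi) · Σ_k sin((2k+1)x)/(2k+1). A minimal sketch (our notation, not the paper's implementation) of the truncated series and its everywhere-defined derivative, usable as a surrogate gradient:

```python
import math

def fda_sign(x, n_terms=10):
    """Truncated Fourier series of sign(x): keeps the dominant low frequencies."""
    return (4.0 / math.pi) * sum(
        math.sin((2 * k + 1) * x) / (2 * k + 1) for k in range(n_terms)
    )

def fda_sign_grad(x, n_terms=10):
    """Derivative of the truncated series, a smooth surrogate for the zero
    almost-everywhere gradient of the exact sign function."""
    return (4.0 / math.pi) * sum(math.cos((2 * k + 1) * x) for k in range(n_terms))
```

Keeping only the first few terms is exactly the low-frequency truncation the abstract describes: the partial sum already tracks sign(x) away from zero, yet remains differentiable everywhere.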

NeurIPS Conference 2021 Conference Paper

Neural Architecture Dilation for Adversarial Robustness

  • Yanxi Li
  • Zhaohui Yang
  • Yunhe Wang
  • Chang Xu

With the tremendous advances in the architecture and scale of convolutional neural networks (CNNs) over the past few decades, they can easily reach or even exceed human performance in certain tasks. However, a recently discovered shortcoming of CNNs is that they are vulnerable to adversarial attacks. Although the adversarial robustness of CNNs can be improved by adversarial training, there is a trade-off between standard accuracy and adversarial robustness. From the neural architecture perspective, this paper aims to improve the adversarial robustness of backbone CNNs that already have satisfactory accuracy. Under a minimal computational overhead, the introduced dilation architecture is expected to preserve the standard performance of the backbone CNN while pursuing adversarial robustness. Theoretical analyses of the standard and adversarial error bounds naturally motivate the proposed neural architecture dilation algorithm. Experimental results on real-world datasets and benchmark neural networks demonstrate the effectiveness of the proposed algorithm in balancing accuracy and adversarial robustness.

AAAI Conference 2021 Conference Paper

One-shot Graph Neural Architecture Search with Dynamic Search Space

  • Yanxi Li
  • Zean Wen
  • Yunhe Wang
  • Chang Xu

Relying on the diverse graph convolution operations that have emerged in recent years, graph neural networks (GNNs) have been shown to be powerful in dealing with high-dimensional non-Euclidean domains, such as social networks or citation networks. Despite the tremendous human effort devoted to exploring new graph convolution operations, there have been few attempts to automatically search for operations in GNNs. The search space of GNNs is significantly larger than that of CNNs because of the diverse components in the message-passing of GNNs, which prevents the straightforward application of classical NAS methods to GNNs. In this work, we propose a novel dynamic one-shot search space for multi-branch neural architectures of GNNs. The dynamic search space maintains a subset of the large search space along with a set of importance weights for the operation candidates in the subset as the architecture parameters. After each iteration, the subset is pruned by removing candidates with low importance weights and is expanded with new operations. The dynamic subsets of operation candidates are not uniform but individual for each edge in the computation graph of the neural architecture, which ensures that the diversity of operations in the final architecture is as competitive as direct search in the large search space. Our experiments on semi-supervised and supervised node classification on citation networks, including Cora, Citeseer, and Pubmed, demonstrate that our method outperforms current state-of-the-art manually designed architectures and reaches competitive performance with existing GNN NAS approaches with up to a 10x speedup.

AAAI Conference 2021 Conference Paper

PTN: A Poisson Transfer Network for Semi-supervised Few-shot Learning

  • Huaxi Huang
  • Junjie Zhang
  • Jian Zhang
  • Qiang Wu
  • Chang Xu

The predicament in semi-supervised few-shot learning (SSFSL) is to maximize the value of the extra unlabeled data to boost the few-shot learner. In this paper, we propose a Poisson Transfer Network (PTN) to mine the unlabeled information for SSFSL from two aspects. First, the Poisson Merriman–Bence–Osher (MBO) model builds a bridge for the communication between labeled and unlabeled examples. This model serves as a more stable and informative classifier than traditional graph-based SSFSL methods in the message-passing process of the labels. Second, the extra unlabeled samples are employed to transfer knowledge from base classes to novel classes through contrastive learning. Specifically, we pull augmented positive pairs together while pushing negative ones apart. Our contrastive transfer scheme implicitly learns the novel-class embeddings to alleviate the over-fitting problem on the few labeled data. Thus, we can mitigate the degeneration of embedding generality in novel classes. Extensive experiments indicate that PTN outperforms state-of-the-art few-shot and SSFSL models on the miniImageNet and tieredImageNet benchmark datasets.

NeurIPS Conference 2021 Conference Paper

ReSSL: Relational Self-Supervised Learning with Weak Augmentation

  • Mingkai Zheng
  • Shan You
  • Fei Wang
  • Chen Qian
  • Changshui Zhang
  • Xiaogang Wang
  • Chang Xu

Self-supervised learning (SSL), including mainstream contrastive learning, has achieved great success in learning visual representations without data annotations. However, most methods mainly focus on instance-level information (i.e., the different augmented images of the same instance should have the same feature or cluster into the same class), and there is a lack of attention to the relationships between different instances. In this paper, we introduce a novel SSL paradigm, termed the relational self-supervised learning (ReSSL) framework, that learns representations by modeling the relationships between different instances. Specifically, our proposed method employs a sharpened distribution of pairwise similarities among different instances as the relation metric, which is then used to match the feature embeddings of different augmentations. Moreover, to boost performance, we argue that weak augmentations matter for representing a more reliable relation, and we leverage a momentum strategy for practical efficiency. Experimental results show that our proposed ReSSL significantly outperforms previous state-of-the-art algorithms in terms of both performance and training efficiency.

IJCAI Conference 2021 Conference Paper

Robust Adversarial Imitation Learning via Adaptively-Selected Demonstrations

  • Yunke Wang
  • Chang Xu
  • Bo Du

The agent in imitation learning (IL) is expected to mimic the behavior of the expert, so its performance relies highly on the quality of the given expert demonstrations. However, the assumption that the collected demonstrations are optimal does not always hold in real-world tasks, which can seriously hurt the performance of the learned agent. In this paper, we propose a robust method within the framework of Generative Adversarial Imitation Learning (GAIL) to address the imperfect-demonstration issue, in which good demonstrations can be adaptively selected for training while bad demonstrations are abandoned. Specifically, a binary weight is assigned to each expert demonstration to indicate whether to select it for training. The reward function in GAIL is employed to determine this weight (i.e., a higher reward results in a higher weight). Compared to some existing solutions that require auxiliary information about this weight, we set up the connection between the weight and the model so that we can jointly optimize GAIL and learn the latent weight. Besides hard binary weighting, we also propose a soft weighting scheme. Experiments in MuJoCo demonstrate that the proposed method outperforms other GAIL-based methods when dealing with imperfect demonstrations.
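The hard and soft weighting schemes can be illustrated with a toy sketch. Everything below is our own simplification (names, threshold, and sigmoid form are assumptions, not the paper's learned joint optimization): each demonstration's GAIL reward is mapped either to a binary keep/abandon decision or to a smooth weight in (0, 1).

```python
import math

def select_demonstrations(rewards, threshold=0.0, soft=False, temperature=1.0):
    """Map per-demonstration rewards to selection weights.

    Hard scheme: binary keep (1.0) / abandon (0.0) by reward threshold.
    Soft scheme: sigmoid of the (shifted, scaled) reward, so better
    demonstrations get smoothly larger weights.
    """
    if soft:
        return [1.0 / (1.0 + math.exp(-(r - threshold) / temperature)) for r in rewards]
    return [1.0 if r > threshold else 0.0 for r in rewards]
```

In the paper the weight is optimized jointly with GAIL rather than set by a fixed threshold; the sketch only shows the shape of the two weighting schemes.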

NeurIPS Conference 2021 Conference Paper

Towards Stable and Robust AdderNets

  • Minjing Dong
  • Yunhe Wang
  • Xinghao Chen
  • Chang Xu

Adder neural networks (AdderNets) replace the massive expensive multiplications in the original convolutions with cheap additions while achieving comparable performance, thus yielding a series of energy-efficient neural networks. Compared with convolutional neural networks (CNNs), the training of AdderNets is much more sophisticated, including several techniques for adjusting gradients and batch normalization. In addition, the variances of both weights and activations in the resulting adder networks are enormous, which limits their performance and their potential for application to other tasks. To enhance the stability and robustness of AdderNets, we first thoroughly analyze the variance estimation of the weight parameters and output features of an arbitrary adder layer. Then, we develop a weight normalization scheme for adaptively optimizing the weight distribution of AdderNets during the training procedure, which can reduce the perturbation of the running mean and variance in batch normalization layers. The proposed weight normalization can also be utilized to enhance the adversarial robustness of the resulting networks. Experiments conducted on several benchmarks demonstrate the superiority of the proposed approach for generating AdderNets with higher performance.
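The generic form of per-filter weight normalization can be sketched directly; note this shows only the standard standardize-and-scale pattern, under our own naming, and the paper's adaptive scheme may differ in how the scale is chosen:

```python
import math

def normalize_filter(weights, scale=1.0, eps=1e-5):
    """Standardize one filter's weights to zero mean and (near) unit variance,
    then apply a learnable scale; this tames the large weight variance that
    would otherwise perturb batch-norm running statistics."""
    n = len(weights)
    mean = sum(weights) / n
    var = sum((w - mean) ** 2 for w in weights) / n
    return [scale * (w - mean) / math.sqrt(var + eps) for w in weights]
```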

NeurIPS Conference 2020 Conference Paper

Adapting Neural Architectures Between Domains

  • Yanxi Li
  • Zhaohui Yang
  • Yunhe Wang
  • Chang Xu

Neural architecture search (NAS) has demonstrated impressive performance in automatically designing high-performance neural networks. The power of deep neural networks is to be unleashed for analyzing a large volume of data (e.g., ImageNet), but the architecture search is often executed on another, smaller dataset (e.g., CIFAR-10) to finish in a feasible time. However, it is hard to guarantee that the optimal architecture derived on the proxy task maintains its advantages on another, more challenging dataset. This paper aims to improve the generalization of neural architectures via domain adaptation. We analyze the generalization bounds of the derived architecture and suggest its close relations with the validation error and the data distribution distance on both domains. These theoretical analyses lead to AdaptNAS, a novel and principled approach to adapting neural architectures between domains in NAS. Our experimental evaluation shows that only a small part of ImageNet is sufficient for AdaptNAS to extend its architectural success to the entire ImageNet and outperform state-of-the-art comparison algorithms.

AAAI Conference 2020 Conference Paper

Beyond Dropout: Feature Map Distortion to Regularize Deep Neural Networks

  • Yehui Tang
  • Yunhe Wang
  • Yixing Xu
  • Boxin Shi
  • Chao Xu
  • Chunjing Xu
  • Chang Xu

Deep neural networks often consist of a great number of trainable parameters for extracting powerful features from given datasets. On the one hand, massive trainable parameters significantly enhance the performance of these deep networks. On the other hand, they bring the problem of over-fitting. To this end, dropout-based methods disable some elements in the output feature maps during the training phase to reduce the co-adaptation of neurons. Although the generalization ability of the resulting models can be enhanced by these approaches, conventional binary dropout is not the optimal solution. Therefore, we investigate the empirical Rademacher complexity related to the intermediate layers of deep neural networks and propose a feature distortion method to address the aforementioned problem. During training, randomly selected elements in the feature maps are replaced with specific values derived by exploiting the generalization error bound. The superiority of the proposed feature map distortion for producing deep neural networks with higher test performance is analyzed and demonstrated on several benchmark image datasets.

AAAI Conference 2020 Conference Paper

Distilling Portable Generative Adversarial Networks for Image Translation

  • Hanting Chen
  • Yunhe Wang
  • Han Shu
  • Changyuan Wen
  • Chunjing Xu
  • Boxin Shi
  • Chao Xu
  • Chang Xu

Although Generative Adversarial Networks (GANs) have been widely used in various image-to-image translation tasks, they can hardly be applied on mobile devices due to their heavy computation and storage cost. Traditional network compression methods focus on visual recognition tasks and rarely deal with generation tasks. Inspired by knowledge distillation, a student generator with fewer parameters is trained by inheriting the low-level and high-level information from the original heavy teacher generator. To promote the capability of the student generator, we include a student discriminator to measure the distances between real images and the images generated by the student and teacher generators. An adversarial learning process is thereby established to optimize the student generator and the student discriminator. Qualitative and quantitative analyses from experiments on benchmark datasets demonstrate that the proposed method can learn portable generative models with strong performance.

AAAI Conference 2020 Conference Paper

Efficient Residual Dense Block Search for Image Super-Resolution

  • Dehua Song
  • Chang Xu
  • Xu Jia
  • Yiyi Chen
  • Chunjing Xu
  • Yunhe Wang

Although remarkable progress has been made on single-image super-resolution due to the revival of deep convolutional neural networks, deep learning methods are confronted with the challenges of computation and memory consumption in practice, especially on mobile devices. Focusing on this issue, we propose an efficient residual dense block search algorithm with multiple objectives to hunt for fast, lightweight, and accurate networks for image super-resolution. First, to accelerate the super-resolution network, we adequately exploit the variation of feature scale with the proposed efficient residual dense blocks. In the proposed evolutionary algorithm, the locations of the pooling and upsampling operators are searched automatically. Second, the network architecture is evolved with the guidance of block credits to acquire an accurate super-resolution network. A block credit reflects the effect of the current block and is earned during the model evaluation process. It guides the evolution by weighting the sampling probability of mutation to favor admirable blocks. Extensive experimental results demonstrate the effectiveness of the proposed search method, and the discovered efficient super-resolution models achieve better performance than state-of-the-art methods with a limited number of parameters and FLOPs.

NeurIPS Conference 2020 Conference Paper

Kernel Based Progressive Distillation for Adder Neural Networks

  • Yixing Xu
  • Chang Xu
  • Xinghao Chen
  • Wei Zhang
  • Chunjing Xu
  • Yunhe Wang

Adder Neural Networks (ANNs), which contain only additions, bring us a new way of developing deep neural networks with low energy consumption. Unfortunately, there is an accuracy drop when replacing all convolution filters with adder filters. The main reason is the optimization difficulty of ANNs using the ℓ1-norm, in which the estimation of the gradient in back-propagation is inaccurate. In this paper, we present a novel method for further improving the performance of ANNs without increasing the number of trainable parameters, via a progressive kernel-based knowledge distillation (PKKD) method. A convolutional neural network (CNN) with the same architecture is simultaneously initialized and trained as a teacher network; the features and weights of the ANN and the CNN are then transformed into a new space to eliminate the accuracy drop. The similarity is computed in a higher-dimensional space to disentangle the difference in their distributions using a kernel-based method. Finally, the desired ANN is learned progressively from the information of both the ground truth and the teacher. The effectiveness of the proposed method for learning ANNs with higher performance is well verified on several benchmarks. For instance, the ANN-50 trained using the proposed PKKD method obtains a 76.8% top-1 accuracy on the ImageNet dataset, which is 0.6% higher than that of ResNet-50.

AAAI Conference 2020 Conference Paper

Learning Student Networks with Few Data

  • Shumin Kong
  • Tianyu Guo
  • Shan You
  • Chang Xu

Recently, the teacher-student learning paradigm has drawn much attention for compressing neural networks on low-end edge devices, such as mobile phones and wearable watches. Current algorithms mainly assume that the complete dataset used for the teacher network is also available for training the student network. However, in real-world scenarios, users may only have access to part of the training examples due to commercial profits or data privacy, and severe over-fitting can result. In this paper, we tackle the challenge of learning student networks with few data by investigating the ground-truth data-generating distribution underlying these few data. Taking the Wasserstein distance as the measurement, we assume this ideal data distribution lies in a neighborhood of the discrete empirical distribution induced by the training examples. We thus propose to safely optimize the worst-case cost within this neighborhood to boost generalization. Furthermore, with theoretical analysis, we derive a novel and easy-to-implement loss for training the student network in an end-to-end fashion. Experimental results on benchmark datasets validate the effectiveness of our proposed method.

AAAI Conference 2020 Conference Paper

Reborn Filters: Pruning Convolutional Neural Networks with Limited Data

  • Yehui Tang
  • Shan You
  • Chang Xu
  • Jin Han
  • Chen Qian
  • Boxin Shi
  • Chao Xu
  • Changshui Zhang

Channel pruning is effective in compressing pretrained CNNs for deployment on low-end edge devices. Most existing methods independently prune some of the original channels and need the complete original dataset to fix the performance drop after pruning. However, due to commercial protection or data privacy, users may only have access to a tiny portion of the training examples, which could be insufficient for performance recovery. In this paper, for pruning with limited data, we propose to use all the original filters to directly develop new compact filters, named reborn filters, so that all useful structural priors in the original filters are well preserved in the pruned networks, alleviating the performance drop accordingly. During training, reborn filters can be easily implemented via 1 × 1 convolutional layers and then fused in the inference stage for acceleration. Based on reborn filters, the proposed channel pruning algorithm shows its effectiveness and superiority in extensive experiments.

NeurIPS Conference 2020 Conference Paper

SCOP: Scientific Control for Reliable Neural Network Pruning

  • Yehui Tang
  • Yunhe Wang
  • Yixing Xu
  • Dacheng Tao
  • Chunjing Xu
  • Chao Xu
  • Chang Xu

This paper proposes a reliable neural network pruning algorithm by setting up a scientific control. Existing pruning methods have developed various hypotheses to approximate the importance of filters to the network and then execute filter pruning accordingly. To increase the reliability of the results, we prefer a more rigorous research design that includes a scientific control group as an essential part, to minimize the effect of all factors except the association between the filter and the expected network output. Acting as the control group, knockoff features are generated to mimic the feature maps produced by network filters, but they are conditionally independent of the example label given the real feature maps. We theoretically suggest that the knockoff condition can be approximately preserved given the information propagation of network layers. Besides the real feature map at an intermediate layer, the corresponding knockoff feature is brought in as another auxiliary input signal for the subsequent layers. Redundant filters can be discovered in the adversarial process of the different features. Through experiments, we demonstrate the superiority of the proposed algorithm over state-of-the-art methods. For example, our method can reduce 57.8% of the parameters and 60.2% of the FLOPs of ResNet-101 with only a 0.01% top-1 accuracy loss on ImageNet.

NeurIPS Conference 2020 Conference Paper

Searching for Low-Bit Weights in Quantized Neural Networks

  • Zhaohui Yang
  • Yunhe Wang
  • Kai Han
  • Chunjing Xu
  • Chao Xu
  • Dacheng Tao
  • Chang Xu

Quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators. However, the quantization functions used in most conventional quantization methods are non-differentiable, which increases the optimization difficulty of quantized networks. Compared with full-precision parameters (i.e., 32-bit floating-point numbers), low-bit values are selected from a much smaller set; for example, there are only 16 possibilities in a 4-bit space. Thus, we propose to regard the discrete weights in an arbitrary quantized neural network as searchable variables and utilize a differentiable method to search for them accurately. In particular, each weight is represented as a probability distribution over the discrete value set. The probabilities are optimized during training, and the values with the highest probability are selected to establish the desired quantized network. Experimental results on benchmarks demonstrate that the proposed method is able to produce quantized neural networks with higher performance than the state of the art on both image classification and super-resolution tasks.
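The probability-over-values idea above has a natural minimal sketch (our own simplification: per-weight logits with a softmax; the paper's parameterization may differ): training uses the differentiable expected value of the distribution, and the final network takes the most probable discrete value.

```python
import math

def softmax(logits):
    m = max(logits)  # shift for numerical stability
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def expected_weight(logits, values):
    """Differentiable surrogate weight used during training."""
    return sum(p * v for p, v in zip(softmax(logits), values))

def discretize(logits, values):
    """Final low-bit weight: the value with the highest probability."""
    return values[max(range(len(logits)), key=lambda i: logits[i])]
```

As training pushes one logit up, the expected weight converges to the selected discrete value, so the train-time surrogate and the deployed quantized weight agree.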

NeurIPS Conference 2020 Conference Paper

UnModNet: Learning to Unwrap a Modulo Image for High Dynamic Range Imaging

  • Chu Zhou
  • Hang Zhao
  • Jin Han
  • Chang Xu
  • Chao Xu
  • Tiejun Huang
  • Boxin Shi

A conventional camera often suffers from over- or under-exposure when recording a real-world scene with a very high dynamic range (HDR). In contrast, a modulo camera with a Markov random field (MRF) based unwrapping algorithm can theoretically achieve an unbounded dynamic range, but it shows degenerate performance in the presence of modulus-intensity ambiguity, strong local contrast, and color misalignment. In this paper, we reformulate the modulo image unwrapping problem as a series of binary labeling problems and propose a modulo edge-aware model, named UnModNet, to iteratively estimate the binary rollover masks of the modulo image for unwrapping. Experimental results show that our approach can reliably generate 12-bit HDR images from 8-bit modulo images and runs much faster than the previous MRF-based algorithm thanks to GPU acceleration.

IJCAI Conference 2019 Conference Paper

Attribute Aware Pooling for Pedestrian Attribute Recognition

  • Kai Han
  • Yunhe Wang
  • Han Shu
  • Chuanjian Liu
  • Chunjing Xu
  • Chang Xu

This paper extends the strength of deep convolutional neural networks (CNNs) to the pedestrian attribute recognition problem by devising a novel attribute-aware pooling algorithm. Existing vanilla CNNs cannot be straightforwardly applied to handle multi-attribute data because of the larger label space as well as the attribute entanglement and correlations. We tackle the challenges that hamper the development of CNNs for multi-attribute classification by fully exploiting the correlations between different attributes. A multi-branch architecture is adopted, focusing on attributes in different regions. Besides the prediction based on each branch itself, the context information of each branch is employed for decision-making as well. The attribute-aware pooling is developed to integrate both kinds of information. Therefore, attributes that are indistinct or tangled with others can be accurately recognized by exploiting the context information. Experiments on benchmark datasets demonstrate that the proposed pooling method appropriately explores and exploits the correlations between attributes for pedestrian attribute recognition.

IJCAI Conference 2019 Conference Paper

Crafting Efficient Neural Graph of Large Entropy

  • Minjing Dong
  • Hanting Chen
  • Yunhe Wang
  • Chang Xu

Network pruning is widely applied to deep CNN models due to their heavy computation costs; it achieves high performance by keeping important weights while removing redundancy. Pruning redundant weights directly may hurt the global information flow, which suggests that an efficient sparse network should take graph properties into account. Thus, instead of paying attention to preserving important weights, we focus on the pruned architecture itself. We propose to use graph entropy as the measurement, which shows useful properties for crafting high-quality neural graphs and enables us to propose an efficient algorithm to construct them as the initial network architecture. Our algorithm can be easily implemented and deployed on different popular CNN models and achieves better trade-offs.

AAAI Conference 2019 Conference Paper

Deep Hierarchical Graph Convolution for Election Prediction from Geospatial Census Data

  • Mike Li
  • Elija Perrier
  • Chang Xu

Geographic information systems (GIS) research is widely used within the social and physical sciences and plays a crucial role in the development and implementation of economic, education, environment, and transportation policy by governments. While machine learning methods have been applied to GIS datasets, the uptake of powerful deep learning CNN methodologies has been limited, in part due to the challenges posed by the complex and often poorly structured nature of the data. In this paper, we demonstrate the utility of GCNNs for GIS analysis via a multi-graph hierarchical spatial-filter GCNN network model that predicts election outcomes using socio-economic features drawn from the 2016 Australian Census. We report a marked improvement in the performance accuracy of hierarchical GCNNs over benchmark generalised linear models and standard GCNNs, especially on semi-supervised tasks. These results indicate the widespread potential for GIS-GCNN research methods to enrich socio-economic GIS analysis, aiding the social sciences and policy development.

NeurIPS Conference 2019 Conference Paper

Learning from Bad Data via Generation

  • Tianyu Guo
  • Chang Xu
  • Boxin Shi
  • Chao Xu
  • Dacheng Tao

Bad training data challenge the learning model's ability to understand the underlying data-generating scheme, which then increases the difficulty of achieving satisfactory performance on unseen test data. We suppose the real data distribution lies in a distribution set supported by the empirical distribution of the bad data. A worst-case formulation can be developed over this distribution set and then interpreted as a generation task in an adversarial manner. The connections and differences between GANs and our framework are thoroughly discussed. We further theoretically show the influence of this generation task on learning from bad data and reveal its connection with a data-dependent regularization. Given different distance measures (e.g., Wasserstein distance or JS divergence) between distributions, we can derive different objective functions for the problem. Experimental results on different kinds of bad training data demonstrate the necessity and effectiveness of the proposed method.

IJCAI Conference 2019 Conference Paper

Learning Instance-wise Sparsity for Accelerating Deep Models

  • Chuanjian Liu
  • Yunhe Wang
  • Kai Han
  • Chunjing Xu
  • Chang Xu

Exploring deep convolutional neural networks of high efficiency and low memory usage is essential for a wide variety of machine learning tasks. Most existing approaches accelerate deep models by manipulating parameters or filters without regard to the data, e.g., pruning and decomposition. In contrast, we study this problem from a different perspective by respecting the differences between data. An instance-wise feature pruning is developed by identifying informative features for different instances. Specifically, by investigating a feature decay regularization, we expect the intermediate feature maps of each instance in a deep neural network to be sparse while preserving the overall network performance. During online inference, subtle features of input images extracted by the intermediate layers of a well-trained neural network can be eliminated to accelerate the subsequent calculations. We further take the coefficient of variation as a measure to select the layers that are appropriate for acceleration. Extensive experiments conducted on benchmark datasets and networks demonstrate the effectiveness of the proposed method.
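The two measurable ingredients above, per-instance pruning of weak features and the coefficient of variation as a layer-selection signal, can be sketched in a few lines. This is an illustrative simplification (threshold rule and names are ours, not the paper's regularized training scheme):

```python
import math

def coefficient_of_variation(mags):
    """std/mean of feature magnitudes; higher values indicate features that
    vary strongly across channels, a plausible layer-selection signal."""
    n = len(mags)
    mean = sum(mags) / n
    std = math.sqrt(sum((m - mean) ** 2 for m in mags) / n)
    return std / mean if mean else 0.0

def prune_features(features, keep_ratio=0.5):
    """Zero out the subtlest features of one instance, keeping the largest
    keep_ratio fraction by magnitude."""
    mags = sorted((abs(f) for f in features), reverse=True)
    cutoff = mags[max(0, int(len(mags) * keep_ratio) - 1)]
    return [f if abs(f) >= cutoff else 0.0 for f in features]
```

Because the cutoff is computed per instance, easy inputs with a few dominant features get pruned more aggressively than inputs whose activations are spread evenly, which is the instance-wise aspect the abstract emphasizes.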

AAAI Conference 2019 Conference Paper

Modeling Local Dependence in Natural Language with Multi-Channel Recurrent Neural Networks

  • Chang Xu
  • Weiran Huang
  • Hongwei Wang
  • Gang Wang
  • Tie-Yan Liu

Recurrent Neural Networks (RNNs) have been widely used in natural language processing tasks and have achieved huge success. Traditional RNNs usually treat each token in a sentence uniformly and equally. However, this may miss the rich semantic structure information of a sentence, which is useful for understanding natural languages. Since semantic structures such as word dependence patterns are not parameterized, capturing and leveraging structure information is a challenge. In this paper, we propose an improved variant of RNNs, the Multi-Channel RNN (MC-RNN), to dynamically capture and leverage local semantic structure information. Concretely, MC-RNN contains multiple channels, each of which represents a local dependence pattern at a time. An attention mechanism is introduced to combine these patterns at each step according to the semantic information. We then parameterize structure information by adaptively selecting the most appropriate connection structures among channels. In this way, diverse local structures and dependence patterns in sentences can be well captured by MC-RNN. To verify the effectiveness of MC-RNN, we conduct extensive experiments on typical natural language processing tasks, including neural machine translation, abstractive summarization, and language modeling. Experimental results on all of these tasks show significant improvements of MC-RNN over current top systems.

IJCAI Conference 2019 Conference Paper

On Retrospecting Human Dynamics with Attention

  • Minjing Dong
  • Chang Xu

Deep recurrent neural networks have achieved impressive success in forecasting human motion with sequence-to-sequence architectures. However, forecasting over longer time horizons often leads to implausible human poses or convergence to mean poses, because of error accumulation and difficulties in keeping track of longer-term information. To address these challenges, we propose to retrospect human dynamics with attention. A retrospection module is designed on top of the RNN to regularly retrospect past frames and correct mistakes in time. This significantly improves the memory of the RNN and provides sufficient information for the decoder networks to generate longer-term predictions. Moreover, we present a spatial attention module to explore and exploit the cooperation among joints in performing a particular motion. Residual connections are also included to guarantee the performance of short-term prediction. We evaluate the proposed algorithm on the largest and most challenging dataset in the field, Human3.6M. Experimental results demonstrate the necessity of investigating motion prediction in a self-audit manner and the effectiveness of the proposed algorithm in both short-term and long-term prediction.

IJCAI Conference 2019 Conference Paper

Polygon-Net: A General Framework for Jointly Boosting Multiple Unsupervised Neural Machine Translation Models

  • Chang Xu
  • Tao Qin
  • Gang Wang
  • Tie-Yan Liu

Neural machine translation (NMT) has achieved great success. However, collecting large-scale parallel data for training is costly and laborious. Recently, unsupervised neural machine translation has attracted more and more attention, due to its demand for monolingual corpora only, which are common and easy to obtain, and its great potential for low-resource or even zero-resource machine translation. In this work, we propose a general framework called Polygon-Net, which leverages multiple auxiliary languages for jointly boosting unsupervised neural machine translation models. Specifically, we design a novel loss function for multi-language unsupervised neural machine translation. In addition, different from the literature, which updates just one or two models individually, Polygon-Net for the first time enables multiple unsupervised models in the framework to update in turn and enhance each other. In this way, multiple unsupervised translation models are associated with each other during training to achieve better performance. Experiments on benchmark datasets including the UN Corpus and WMT show that our approach significantly improves over two-language-based methods, and achieves better performance as more languages are introduced to the framework.

NeurIPS Conference 2019 Conference Paper

Positive-Unlabeled Compression on the Cloud

  • Yixing Xu
  • Yunhe Wang
  • Hanting Chen
  • Kai Han
  • Chunjing Xu
  • Dacheng Tao
  • Chang Xu

Many attempts have been made to extend the great success of convolutional neural networks (CNNs) achieved on high-end GPU servers to portable devices such as smartphones. Providing compression and acceleration services for deep learning models on the cloud is therefore significant and attractive for end users. However, existing network compression and acceleration approaches usually fine-tune the svelte model by requesting the entire original training data (e.g., ImageNet), which could be more cumbersome than the network itself and cannot be easily uploaded to the cloud. In this paper, we present a novel positive-unlabeled (PU) setting for addressing this problem. In practice, only a small portion of the original training set is required as positive examples, and more useful training examples can be obtained from the massive unlabeled data on the cloud through a PU classifier with an attention-based multi-scale feature extractor. We further introduce a robust knowledge distillation (RKD) scheme to deal with the class imbalance problem of these newly augmented training examples. The superiority of the proposed method is verified through experiments conducted on benchmark models and datasets. Using only 8% of uniformly selected data from ImageNet, we can obtain an efficient model with performance comparable to the baseline ResNet-34.

AAAI Conference 2019 Conference Paper

Smooth Deep Image Generator from Noises

  • Tianyu Guo
  • Chang Xu
  • Boxin Shi
  • Chao Xu
  • Dacheng Tao

Generative Adversarial Networks (GANs) have demonstrated a strong ability to fit complex distributions since their introduction, especially in the field of generating natural images. Linear interpolation in the noise space produces continuous changes in the image space, which is an impressive property of GANs. However, this property receives no special consideration in the objective function of GANs or their derived models. This paper analyzes perturbations on the input of the generator and their influence on the generated images. A smooth generator is then developed by investigating the tolerable input perturbation. We further integrate this smooth generator with a gradient-penalized discriminator, and design a smooth GAN that generates stable and high-quality images. Experiments on real-world image datasets demonstrate the necessity of studying smooth generators and the effectiveness of the proposed algorithm.

AAAI Conference 2018 Conference Paper

Adversarial Learning of Portable Student Networks

  • Yunhe Wang
  • Chang Xu
  • Chao Xu
  • Dacheng Tao

Effective methods for learning deep neural networks with fewer parameters are urgently required, since the storage and computation demands of heavy neural networks have largely prevented their widespread use on mobile devices. Compared with algorithms that directly remove weights or filters to obtain considerable compression and speed-up ratios, training thin deep networks with the student-teacher learning paradigm is more flexible. However, it is very hard to determine which formulation is optimal for measuring the information inherited from teacher networks. To overcome this challenge, we utilize a generative adversarial network (GAN) to learn the student network. In practice, the generator is exactly the student network, with far fewer parameters, and the discriminator is used as a teaching assistant for distinguishing between features extracted from the student and teacher networks. By simultaneously optimizing the generator and the discriminator, the resulting student network can produce features of input data with a distribution similar to that of the teacher network's features. Extensive experimental results on benchmark datasets demonstrate that the proposed method is capable of learning well-performing portable networks, and is superior to state-of-the-art methods.

NeurIPS Conference 2018 Conference Paper

Learning Versatile Filters for Efficient Convolutional Neural Networks

  • Yunhe Wang
  • Chang Xu
  • Chunjing Xu
  • Chao Xu
  • Dacheng Tao

This paper introduces versatile filters for constructing efficient convolutional neural networks. Considering the demands of efficient deep learning techniques running on cost-effective hardware, a number of methods have been developed to learn compact neural networks. Most of these works aim to slim down filters in different ways, e.g., investigating small, sparse, or binarized filters. In contrast, we treat filters from an additive perspective. A series of secondary filters can be derived from a primary filter. These secondary filters are all inherited from the primary filter without occupying more storage, but once unfolded in computation they can significantly enhance the capability of the filter by integrating information extracted from different receptive fields. Besides spatial versatile filters, we additionally investigate versatile filters from the channel perspective. The new techniques are general and can upgrade filters in existing CNNs. Experimental results on benchmark datasets and neural networks demonstrate that CNNs constructed with our versatile filters achieve accuracy comparable to that of the original filters, but require less memory and fewer FLOPs.

AAAI Conference 2018 Conference Paper

Learning With Single-Teacher Multi-Student

  • Shan You
  • Chang Xu
  • Chao Xu
  • Dacheng Tao

In this paper we study a new learning problem, termed the “Single-Teacher Multi-Student” (STMS) problem, which investigates how to learn a series of student (simple and specific) models from a single teacher (complex and universal) model. Taking multiclass and binary classification as an example, we focus on learning multiple binary classifiers from a single multiclass classifier, where each binary classifier is responsible for a certain class. This derives from realistic problems, such as identifying a suspect based on a comprehensive face recognition system. By treating the already-trained multiclass classifier as the teacher, and multiple binary classifiers as the students, we propose a gated support vector machine (gSVM) as a solution. A series of gSVMs are learned with the help of the single teacher multiclass classifier. The teacher’s help is two-fold: first, the teacher’s score provides the gated values for the students’ decisions; second, the teacher can guide the students to accommodate training examples with different degrees of difficulty. Extensive experiments on real datasets validate its effectiveness.

IJCAI Conference 2018 Conference Paper

R-SVM+: Robust Learning with Privileged Information

  • Xue Li
  • Bo Du
  • Chang Xu
  • Yipeng Zhang
  • Lefei Zhang
  • Dacheng Tao

In practice, the assumption that training and test data are clean is not always satisfied. The performance of existing methods in the learning using privileged information (LUPI) paradigm may be seriously challenged, due to the lack of clear strategies for addressing potential noise in the data. This paper proposes a novel Robust SVM+ (R-SVM+) algorithm based on a rigorous theoretical analysis. Under the SVM+ framework in the LUPI paradigm, we study the lower bound of perturbations of both example feature data and privileged feature data that will mislead the model into making wrong decisions. By maximizing this lower bound, the tolerance of the learned model to perturbations is increased. Accordingly, a novel regularization function is introduced to upgrade a variant form of SVM+. The objective function of R-SVM+ is transformed into a quadratic programming problem, which can be efficiently optimized using off-the-shelf solvers. Experiments on real-world datasets demonstrate the necessity of studying robust SVM+ and the effectiveness of the proposed algorithm.

AAAI Conference 2018 Conference Paper

Reinforced Multi-Label Image Classification by Exploring Curriculum

  • Shiyi He
  • Chang Xu
  • Tianyu Guo
  • Chao Xu
  • Dacheng Tao

Humans and animals learn much better when examples are not presented randomly but organized in a meaningful order that illustrates gradually more concepts, and gradually more complex ones. Inspired by this curriculum learning mechanism, we propose a reinforced multi-label image classification approach that imitates human behavior by labeling images from easy to complex. This approach allows a reinforcement learning agent to sequentially predict labels by fully exploiting the image features and previously predicted labels. The agent discovers the optimal policies by maximizing the long-term reward, which reflects prediction accuracies. Experimental results on PASCAL VOC2007 and 2012 demonstrate the necessity of reinforced multi-label learning and the algorithm’s effectiveness in real-world multi-label image classification tasks.

AAAI Conference 2017 Conference Paper

Beyond RPCA: Flattening Complex Noise in the Frequency Domain

  • Yunhe Wang
  • Chang Xu
  • Chao Xu
  • Dacheng Tao

Discovering robust low-rank data representations is important in many real-world problems. Traditional robust principal component analysis (RPCA) assumes that the observed data are corrupted by some sparse noise (e.g., Laplacian noise) and utilizes the ℓ1-norm to separate out the noisy component. Nevertheless, beyond simple Gaussian or Laplacian noise, noise in real-world data is often more complex, and thus the ℓ1- and ℓ2-norms are insufficient for noise characterization. This paper presents a more flexible approach to modeling complex noise by investigating its properties in the frequency domain. Although the elements of a noise matrix are chaotic in the spatial domain, the absolute values of its alternative coefficients in the frequency domain are constant w.r.t. their variance. Based on this observation, a new robust PCA algorithm (FRPCA) is formulated by simultaneously discovering the low-rank and noisy components. Extensive experiments on synthetic data and video background subtraction demonstrate that FRPCA is effective at handling complex noise.

IJCAI Conference 2017 Conference Paper

Collaborative Rating Allocation

  • Yali Du
  • Chang Xu
  • Dacheng Tao

This paper studies the collaborative rating allocation problem, in which each user has limited ratings to allocate over all items. These users are termed "energy limited". Different from existing methods, which treat each rating independently, we investigate the geometric properties of a user's rating vector, and design a matrix completion method on the simplex. In this method, a user's rating vector is estimated by a combination of user profiles as basis points on the simplex. Instead of the Euclidean metric, a non-linear pull-back distance measurement from the sphere is adopted, since it can depict the geometric constraints on each user's rating vector. The resulting objective function is then efficiently optimized by a Riemannian conjugate gradient method on the simplex. Experiments on real-world datasets demonstrate our model's competitiveness versus other collaborative rating prediction methods.
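The "pull-back distance from the sphere" can be illustrated with the standard square-root embedding: a point p on the simplex maps to √p on the unit sphere (since Σ(√pᵢ)² = 1), and the great-circle distance between the images serves as the distance between rating vectors. The abstract does not give the exact metric, so this Fisher–Rao-style sketch is an assumption consistent with its description, not the paper's definition.

```python
import numpy as np

def pullback_dist(p, q):
    """Distance between two points on the probability simplex, pulled
    back from the sphere: embed p as sqrt(p) on the unit sphere, then
    take the great-circle (arc-length) distance between the images."""
    inner = np.clip(np.sum(np.sqrt(p) * np.sqrt(q)), -1.0, 1.0)
    return 2.0 * np.arccos(inner)

# two normalized "rating allocation" vectors (illustrative data)
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.3, 0.5])

print(pullback_dist(p, p))   # 0: identical allocations
print(pullback_dist(p, q))   # positive, and symmetric in p and q
```

Unlike the Euclidean metric, this distance respects the constraint that each user's allocation is non-negative and sums to a fixed budget.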

AAAI Conference 2017 Conference Paper

Cost-Sensitive Feature Selection via F-Measure Optimization Reduction

  • Meng Liu
  • Chang Xu
  • Yong Luo
  • Chao Xu
  • Yonggang Wen
  • Dacheng Tao

Feature selection aims to select a small subset of the high-dimensional features that can lead to better learning performance, lower computational complexity, and better model readability. The class imbalance problem has been neglected by traditional feature selection methods; as a result, the selected features are biased towards the majority classes. Because of the superiority of the F-measure over accuracy for imbalanced data, we propose to use the F-measure as the performance measure for feature selection algorithms. As a pseudo-linear function, the F-measure can be optimized by minimizing total costs. In this paper, we present a novel cost-sensitive feature selection (CSFS) method that optimizes the F-measure instead of accuracy to take the class imbalance issue into account. Features are selected according to the optimal F-measure classifier obtained after solving a series of cost-sensitive feature selection sub-problems. The features selected by our method fully represent the characteristics of not only the majority classes, but also the minority classes. Extensive experimental results on synthetic, multi-class, and multi-label datasets validate the efficiency and significance of our feature selection method.
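The pseudo-linearity argument can be made concrete with a small sketch. For binary F1 with P positives, F1 ≥ θ rearranges to the linear inequality (2−θ)·TP − θ·FP ≥ θ·P, so maximizing F1 reduces to sweeping θ and solving, for each θ, a cost-sensitive problem whose costs depend only on θ. The toy data below and the use of score thresholds as the "classifiers" are illustrative assumptions; the paper's actual sub-problems select features, not thresholds.

```python
import numpy as np

def f1(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# toy scores and binary labels (illustrative data)
y = np.array([1, 1, 1, 0, 0, 1, 0, 0, 1, 0])
s = np.array([0.9, 0.8, 0.7, 0.65, 0.6, 0.55, 0.4, 0.35, 0.3, 0.1])
P = int(y.sum())

def counts(thr):
    pred = s >= thr
    tp = int(np.sum(pred & (y == 1)))
    fp = int(np.sum(pred & (y == 0)))
    return tp, fp, P - tp   # TP, FP, FN

thresholds = np.unique(s)

# direct F1 maximization over the candidate classifiers
best_direct = max(f1(*counts(t)) for t in thresholds)

# pseudo-linear reduction: F1 >= theta  <=>  (2-theta)*TP - theta*FP >= theta*P,
# so each theta yields a cost-sensitive objective; keep the best solution found.
best_reduced = 0.0
for theta in np.linspace(0.01, 0.99, 99):
    t_star = max(thresholds,
                 key=lambda t: (2 - theta) * counts(t)[0] - theta * counts(t)[1])
    best_reduced = max(best_reduced, f1(*counts(t_star)))

print(best_direct, best_reduced)  # the reduction recovers the same optimum
```

Each inner problem is linear in (TP, FP), i.e., an ordinary cost-sensitive classification objective, which is what makes the reduction tractable.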

IJCAI Conference 2017 Conference Paper

Fast SVM Trained by Divide-and-Conquer Anchors

  • Meng Liu
  • Chang Xu
  • Chao Xu
  • Dacheng Tao

The support vector machine (SVM) is one of the most frequently used classifiers for machine learning tasks. However, its training time can become prohibitive when the training set is very large. Thus, many kinds of representative subsets are chosen from the original dataset to reduce the training complexity. In this paper, we propose to choose representative points, called anchors, obtained from non-negative matrix factorization (NMF) in a divide-and-conquer framework, and then use the anchors to train an approximate SVM. Our theoretical analysis shows that solving DCA-SVM yields an approximate solution close to that of the primal SVM. Experimental results on multiple datasets demonstrate that our DCA-SVM is faster than state-of-the-art algorithms without notably decreasing the accuracy of classification results.

IJCAI Conference 2017 Conference Paper

Multi-Positive and Unlabeled Learning

  • Yixing Xu
  • Chang Xu
  • Chao Xu
  • Dacheng Tao

The positive and unlabeled (PU) learning problem focuses on learning a classifier from positive and unlabeled data. Some methods have been developed to solve the PU learning problem, but they are often limited in practical applications, since only binary classes are involved and they cannot easily be adapted to multi-class data. Here we propose a one-step method that directly trains a multi-class model from the given multi-class input data and predicts the label based on the model's decision. Specifically, we construct different convex loss functions for labeled and unlabeled data to learn a discriminant function F. The theoretical analysis of the generalization error bound shows that it is no worse than k√k times that of fully supervised multi-class classification methods when the sizes of the data in the k classes are of the same order. Finally, our experimental results demonstrate the significance and effectiveness of the proposed algorithm on synthetic and real-world datasets.
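For intuition, the standard binary PU risk rewriting (a sketch of the general idea such methods build on, not this paper's multi-class formulation) expresses the unobservable negative-class term using only positives and the unlabeled pool, given a known class prior π: since the unlabeled data is a π/(1−π) mixture, (1−π)·E_N[ℓ(f(x),−1)] = E_U[ℓ(f(x),−1)] − π·E_P[ℓ(f(x),−1)]. The synthetic data, scoring function, and loss below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
pi = 0.4        # class prior P(y = +1), assumed known
n = 200_000

# synthetic 1-D data: positives ~ N(+1, 1), negatives ~ N(-1, 1)
y = rng.random(n) < pi
x = np.where(y, rng.normal(1.0, 1.0, n), rng.normal(-1.0, 1.0, n))

f = np.tanh                                    # a fixed scoring function
loss = lambda z, t: np.log1p(np.exp(-t * z))   # logistic loss

# fully supervised (PN) risk -- needs every label
r_pn = pi * loss(f(x[y]), 1).mean() + (1 - pi) * loss(f(x[~y]), -1).mean()

# PU rewriting: the negative-class term is replaced by an expression
# over the unlabeled pool (all of x) and the observed positives only
r_pu = (pi * loss(f(x[y]), 1).mean()
        + loss(f(x), -1).mean()
        - pi * loss(f(x[y]), -1).mean())

print(abs(r_pn - r_pu))   # small sampling gap only
```

The two risk estimates agree up to sampling error, which is what lets a classifier be trained without any labeled negatives.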

IJCAI Conference 2017 Conference Paper

Online Reputation Fraud Campaign Detection in User Ratings

  • Chang Xu
  • Jie Zhang
  • Zhu Sun

Reputation fraud campaigns (RFCs) distort the reputations of rated items by generating fake ratings through multiple spammers. One effective way of detecting RFCs is to characterize their collective behaviors based on rating histories. However, these campaigns are constantly evolving and changing tactics to evade detection. For example, they can launch early attacks on items to quickly dominate their reputations. They can also whitewash themselves by creating new accounts for subsequent attacks. It is thus challenging for existing approaches working on historical data to react promptly to such emerging fraud activities. In this paper, we conduct RFC detection in an online fashion, so as to spot campaign activities as early as possible. This leads to a unified and scalable optimization framework, FraudScan, that can adapt to emerging fraud patterns over time. Empirical analysis on two real-world datasets validates the effectiveness and efficiency of the proposed framework.

IJCAI Conference 2017 Conference Paper

Privileged Matrix Factorization for Collaborative Filtering

  • Yali Du
  • Chang Xu
  • Dacheng Tao

Collaborative filtering plays a crucial role in reducing information overload in online shopping by suggesting products to customers that match their potential interests. Observing that a user's review comments on purchases often accompany ratings, recent works exploit the review texts in representing user or item factors and have achieved prominent performance. Although the effectiveness of reviews has been verified, one major defect of existing works is that reviews are used to inform the learning of either user or item factors, without noticing that each review associates a pair of user and item concurrently. To better explore the value of review comments, this paper presents the privileged matrix factorization method, which utilizes reviews in the learning of both user and item factors. By mapping review texts into the privileged feature space, a learned privileged function compensates for the discrepancies between predicted ratings and ground-truth values on a per-rating basis. Thus, by minimizing discrepancies and prediction errors, our method harnesses the information present in the review comments for the learning of both user and item factors. Experiments on five real datasets testify to the effectiveness of the proposed method.

IJCAI Conference 2017 Conference Paper

Privileged Multi-label Learning

  • Shan You
  • Chang Xu
  • Yunhe Wang
  • Chao Xu
  • Dacheng Tao

This paper presents privileged multi-label learning (PrML) to explore and exploit the relationships between labels in multi-label learning problems. We suggest that each individual label can not only be implicitly connected with other labels via the low-rank constraint over label predictors, but can also receive explicit comments on its performance on examples from the other labels, which together act as an oracle teacher. We generate a privileged label feature for each example and its individual label, and then integrate it into the framework of low-rank-based multi-label learning. The proposed algorithm can therefore comprehensively explore and exploit label relationships by inheriting all the merits of privileged information and low-rank constraints. We show that PrML can be efficiently solved by a dual coordinate descent algorithm using an iterative optimization strategy with cheap updates. Experiments on benchmark datasets show that privileged label features can significantly improve performance, and that PrML is superior to several competing methods in most cases.

IJCAI Conference 2017 Conference Paper

Tag Disentangled Generative Adversarial Network for Object Image Re-rendering

  • Chaoyue Wang
  • Chaohui Wang
  • Chang Xu
  • Dacheng Tao

In this paper, we propose principled Tag Disentangled Generative Adversarial Networks (TD-GAN) for re-rendering new images of an object of interest from a single image of it, by specifying multiple scene properties (such as viewpoint, illumination, and expression). The whole framework consists of a disentangling network, a generative network, a tag mapping net, and a discriminative network, which are trained jointly on a given set of images that are completely/partially tagged (i.e., the supervised/semi-supervised setting). Given an input image, the disentangling network extracts disentangled and interpretable representations, which are then used to generate images via the generative network. In order to boost the quality of the disentangled representations, the tag mapping net is integrated to explore the consistency between an image and its tags. Furthermore, the discriminative network is introduced to implement the adversarial training strategy for generating more realistic images. Experiments on two challenging datasets demonstrate the state-of-the-art performance of the proposed framework on the problem of interest.

NeurIPS Conference 2016 Conference Paper

CNNpack: Packing Convolutional Neural Networks in the Frequency Domain

  • Yunhe Wang
  • Chang Xu
  • Shan You
  • Dacheng Tao
  • Chao Xu

Deep convolutional neural networks (CNNs) are successfully used in a number of applications. However, their storage and computational requirements have largely prevented their widespread use on mobile devices. Here we present an effective CNN compression approach in the frequency domain, which focuses not only on smaller weights but on all the weights and their underlying connections. By treating convolutional filters as images, we decompose their representations in the frequency domain into common parts (i.e., cluster centers) shared by other similar filters and their individual private parts (i.e., individual residuals). A large number of low-energy frequency coefficients in both parts can be discarded to produce high compression without significantly compromising accuracy. We relax the computational burden of convolution operations in CNNs by linearly combining the convolution responses of discrete cosine transform (DCT) bases. The compression and speed-up ratios of the proposed algorithm are thoroughly analyzed and evaluated on benchmark image datasets to demonstrate its superiority over state-of-the-art methods.

IROS Conference 2015 Conference Paper

A haptic shared control algorithm for flexible human assistance to semi-autonomous robots

  • Ningbo Yu
  • Kui Wang
  • Yuan Li
  • Chang Xu
  • Jingtai Liu

Autonomous as well as teleoperated robots find wide application in various environments. Their capability to accomplish complex and dynamic operations can be significantly improved by fusing human intelligence with autonomous algorithms. In this paper, we propose a haptic shared control algorithm to provide flexible human assistance to semi-autonomous mobile robots. Through admittance and impedance models, the haptic shared controller smoothly combines human operator inputs with robot autonomy. Further, the level of autonomy is fully determined by the operator through the grasp motion. A decomposed design is adopted for the autonomous controller of the mobile robot. The algorithm was implemented on the omega.7 haptic interface together with a QBot mobile robot, and its feasibility and efficacy have been validated by experiments.

AAAI Conference 2015 Conference Paper

Large-Margin Multi-Label Causal Feature Learning

  • Chang Xu
  • Dacheng Tao
  • Chao Xu

In multi-label learning, an example is represented by descriptive features associated with several labels. Simply considering labels as independent or correlated is crude; it would be beneficial to define and exploit the causality between multiple labels. For example, the image label ‘lake’ implies the label ‘water’, but not vice versa. Since the original features are a disorderly mixture of the properties originating from different labels, it is intuitive to factorize these raw features to clearly represent each individual label and its causality relationships. Following the large-margin principle, we propose an effective approach to discover the causal features of multiple labels, thus revealing the causality between labels from the feature perspective. We show theoretically that the proposed approach is a tight approximation of the empirical multi-label classification error, and that the causality revealed strengthens the consistency of the algorithm. Extensive experiments using synthetic and real-world data demonstrate that the proposed algorithm effectively discovers label causality, generates causal features, and improves multi-label learning.

IJCAI Conference 2015 Conference Paper

Multi-view Self-Paced Learning for Clustering

  • Chang Xu
  • Dacheng Tao
  • Chao Xu

Exploiting information from multiple views can improve clustering accuracy. However, most existing multi-view clustering algorithms are nonconvex and thus prone to getting stuck in bad local minima, especially when there are outliers and missing data. To overcome this problem, we present a new multi-view self-paced learning (MSPL) algorithm for clustering that learns the multi-view model by progressing not only from ‘easy’ to ‘complex’ examples, but also from ‘easy’ to ‘complex’ views. Instead of a binary separation of examples or views into ‘easy’ and ‘complex’, we design a novel probabilistic smoothed weighting scheme. Employing multiple views for clustering and defining complexity across both examples and views are shown theoretically to be beneficial to optimal clustering. Experimental results on toy and real-world data demonstrate the efficacy of the proposed algorithm.

AAAI Conference 2014 Conference Paper

Sequential Click Prediction for Sponsored Search with Recurrent Neural Networks

  • Yuyu Zhang
  • Hanjun Dai
  • Chang Xu
  • Jun Feng
  • Taifeng Wang
  • Jiang Bian
  • Bin Wang
  • Tie-Yan Liu

Click prediction is one of the fundamental problems in sponsored search. Most existing studies take advantage of machine learning approaches to predict the ad click for each ad view event independently. However, as observed in real-world sponsored search systems, a user’s behavior on ads depends heavily on how she behaved in the past, especially in terms of what queries she submitted, what ads she clicked or ignored, and how long she spent on the landing pages of clicked ads. Inspired by these observations, we introduce a novel framework based on Recurrent Neural Networks (RNNs). Compared to traditional methods, this framework directly models the dependency on a user’s sequential behaviors in the click prediction process through the recurrent structure of the RNN. Large-scale evaluations on click-through logs from a commercial search engine demonstrate that our approach can significantly improve click prediction accuracy compared to sequence-independent approaches.

AAAI Conference 2013 Conference Paper

Vector-Valued Multi-View Semi-Supervised Learning for Multi-Label Image Classification

  • Yong Luo
  • Dacheng Tao
  • Chang Xu
  • Dongchen Li
  • Chao Xu

Images are usually associated with multiple labels and comprised of multiple views, since an image may contain several objects (e.g., a pedestrian, a bicycle, and a tree) and multiple visual features (e.g., color, texture, and shape). Currently available tools tend to use either labels or features for classification, but both are necessary to describe the image properly. There have been recent successes in using vector-valued functions, which construct matrix-valued kernels, to explore the multi-label structure in the output space. This has motivated us to develop multi-view vector-valued manifold regularization (MV3MR) in order to integrate multiple features. MV3MR exploits the complementary properties of different features, and discovers the intrinsic local geometry of the compact support shared by different features, under the theme of manifold regularization. We validate the effectiveness of the proposed MV3MR methodology for image classification by conducting extensive experiments on two challenging datasets, PASCAL VOC'07 and MIR Flickr.