Author name cluster

Sirui Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers


AAAI Conference 2026 Conference Paper

CaTFormer: Causal Temporal Transformer with Dynamic Contextual Fusion for Driving Intention Prediction

Sirui Wang
Zhou Guan
Bingxi Zhao
Tongjia Gu
Jie Liu

Accurate prediction of driving intention is key to enhancing the safety and interactive efficiency of human-machine co-driving systems. It serves as a cornerstone for achieving high-level autonomous driving. However, current approaches remain inadequate for accurately modeling the complex spatiotemporal interdependencies and the unpredictable variability of human driving behavior. To address these challenges, we propose CaTFormer, a Causal Temporal Transformer that explicitly models causal interactions between driver behavior and environmental context for robust intention prediction. Specifically, CaTFormer introduces a novel Reciprocal Delayed Fusion (RDF) mechanism for precise temporal alignment of interior and exterior feature streams, a Counterfactual Residual Encoding (CRE) module that systematically eliminates spurious correlations to reveal authentic causal dependencies, and an innovative Feature Synthesis Network (FSN) that adaptively synthesizes these purified representations into coherent temporal representations. Experimental results demonstrate that CaTFormer attains state-of-the-art performance on the Brain4Cars dataset. It effectively captures complex causal temporal dependencies and enhances both the accuracy and transparency of driving intention prediction.


AAAI Conference 2026 Conference Paper

GEWDiff: Geometric Enhanced Wavelet-based Diffusion Model for Hyperspectral Image Super-resolution

Sirui Wang
Jiang He
Natàlia Blasco Andreo
Xiao Xiang Zhu

Improving the quality of hyperspectral images (HSIs), such as through super-resolution, is a crucial research area. However, generative modeling for HSIs presents several challenges. Due to their high spectral dimensionality, HSIs are too memory-intensive for direct input into conventional diffusion models. Furthermore, general generative models lack an understanding of the topological and geometric structures of ground objects in remote sensing imagery. In addition, most diffusion models optimize loss functions at the noise level, leading to non-intuitive convergence behavior and suboptimal generation quality for complex data. To address these challenges, we propose a Geometric Enhanced Wavelet-based Diffusion Model (GEWDiff), a novel framework for reconstructing hyperspectral images at 4-times super-resolution. A wavelet-based encoder-decoder is introduced that efficiently compresses HSIs into a latent space while preserving spectral-spatial information. To avoid distortion during generation, we incorporate a geometry-enhanced diffusion process that preserves the geometric features. Furthermore, a multi-level loss function is designed to guide the diffusion process, promoting stable convergence and improved reconstruction fidelity. Our model demonstrates state-of-the-art results across multiple dimensions, including fidelity, spectral accuracy, visual realism, and clarity.


AAAI Conference 2026 Conference Paper

Rectify Evaluation Preference: Improving LLMs’ Critique on Math Reasoning via Perplexity-aware Reinforcement Learning

Changyuan Tian
Zhicong Lu
Shuang Qian
Nayu Liu
Peiguang Li
Li Jin
Leiyi Hu
Zhizhao Zeng

To improve Multi-step Mathematical Reasoning (MsMR) of Large Language Models (LLMs), it is crucial to obtain scalable supervision from the corpus by automatically critiquing mistakes in the reasoning process of MsMR and rendering a final verdict on the problem-solution. Most existing methods rely on crafting high-quality supervised fine-tuning demonstrations to enhance critiquing capability and pay little attention to the underlying reason for the poor critiquing performance of LLMs. In this paper, we orthogonally quantify and investigate the potential reason, imbalanced evaluation preference, and conduct a statistical preference analysis. Motivated by this analysis, a novel perplexity-aware reinforcement learning algorithm is proposed to rectify the evaluation preference, elevating the critiquing capability. Specifically, to probe into LLMs' critiquing characteristics, a One-to-many Problem-Solution (OPS) benchmark is meticulously constructed to quantify the behavior difference of LLMs when evaluating problem solutions generated by themselves and by others. Then, to investigate the behavior difference in depth, we conduct a statistical preference analysis oriented on perplexity and find an intriguing phenomenon: "LLMs tend to judge solutions with lower perplexity as correct", which we dub imbalanced evaluation preference. To rectify this preference, we regard perplexity as the baton in the algorithm of Group Relative Policy Optimization, encouraging the LLMs to explore trajectories that judge lower perplexity as wrong and higher perplexity as correct. Extensive experimental results on our OPS benchmark and existing critic benchmarks demonstrate the validity of our method.
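
As a reading aid, the perplexity that the abstract refers to can be sketched with its standard definition: the exponential of the negative mean log-probability a model assigns to the tokens of a solution. The abstract does not state how the authors compute it, so the function name `perplexity` and the toy log-probabilities below are illustrative assumptions, not the paper's implementation.

```python
import math


def perplexity(token_logprobs):
    """Standard perplexity: exp of the negative mean natural-log probability.

    `token_logprobs` is a hypothetical list of per-token log-probabilities
    that a language model assigned to one candidate solution.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Toy inputs (assumed, not from the paper): a solution the model finds
# familiar versus one it finds surprising.
familiar = [math.log(0.5)] * 4
surprising = [math.log(0.1)] * 4

ppl_familiar = perplexity(familiar)
ppl_surprising = perplexity(surprising)
```

Under this definition the familiar solution has the lower perplexity, which is the kind of solution the abstract reports LLMs being more inclined to judge as correct.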


AAAI Conference 2025 Conference Paper

Enhancing LLMs via High-Knowledge Data Selection

Feiyu Duan
Xuemiao Zhang
Sirui Wang
Haoran Que
Yuqi Liu
Wenge Rong
Xunliang Cai

The performance of Large Language Models (LLMs) is intrinsically linked to the quality of their training data. Although several studies have proposed methods for high-quality data selection, they do not consider the importance of knowledge richness in text corpora. In this paper, we propose a novel and gradient-free High-Knowledge Scorer (HKS) to select high-quality data from the dimension of knowledge, to alleviate the problem of knowledge scarcity in the pre-training corpus. We propose a comprehensive multi-domain knowledge element pool and introduce knowledge density and coverage as metrics to assess the knowledge content of the text. Based on this, we propose a comprehensive knowledge scorer to select data with intensive knowledge, which can also be utilized for domain-specific high-knowledge data selection by restricting knowledge elements to the specific domain. We train models on a high-knowledge bilingual dataset, and experimental results demonstrate that our scorer improves the model's performance in knowledge-intensive and general comprehension tasks, and is effective in enhancing both the generic and domain-specific capabilities of the model.
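
The abstract names knowledge density and coverage without defining them. One plausible reading, offered purely as an assumption, is that density is the fraction of tokens that match the knowledge element pool, and coverage is the fraction of pool elements that appear in the text. The sketch below illustrates that reading only; the function names, the whitespace tokenization, and the toy pool are hypothetical and may differ from the HKS definitions in the paper.

```python
def knowledge_density(tokens, pool):
    """Assumed definition: share of tokens that are knowledge elements."""
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in pool)
    return hits / len(tokens)


def knowledge_coverage(tokens, pool):
    """Assumed definition: share of distinct pool elements found in the text."""
    if not pool:
        return 0.0
    return len(set(tokens) & pool) / len(pool)


# Toy knowledge element pool and text (illustrative, not from the paper).
pool = {"mitochondria", "atp", "ribosome", "cell"}
tokens = "the mitochondria produce atp in the cell".split()

density = knowledge_density(tokens, pool)    # 3 matching tokens out of 7
coverage = knowledge_coverage(tokens, pool)  # 3 of 4 pool elements present
```

Restricting `pool` to the elements of a single domain would correspond to the domain-specific selection the abstract mentions.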


IJCAI Conference 2023 Conference Paper

Local and Global: Temporal Question Answering via Information Fusion

Yonghao Liu
Di Liang
Mengyu Li
Fausto Giunchiglia
Ximing Li
Sirui Wang
Wei Wu
Lan Huang

Many models that leverage knowledge graphs (KGs) have recently demonstrated remarkable success in question answering (QA) tasks. In the real world, many facts contained in KGs are time-constrained, so temporal KGQA has received increasing attention. Despite the fruitful efforts of previous models in temporal KGQA, they still have several limitations. (I) They neither emphasize the graph structural information between entities in KGs nor explicitly utilize a multi-hop relation path through graph neural networks to enhance answer prediction. (II) They adopt pre-trained language models (LMs) to obtain question representations, focusing merely on the global information related to the question while not highlighting the local information of the entities in KGs. To address these limitations, we introduce a novel model that simultaneously explores both Local information and Global information for the task of temporal KGQA (LGQA). Specifically, we first introduce an auxiliary task in the temporal KG embedding procedure to make timestamp embeddings time-order aware. Then, we design information fusion layers that effectively incorporate local and global information to deepen question understanding. We conduct extensive experiments on two benchmarks, and LGQA significantly outperforms previous state-of-the-art models, especially on difficult questions. Moreover, LGQA can generate interpretable and trustworthy predictions.


IJCAI Conference 2022 Conference Paper

Searching for Optimal Subword Tokenization in Cross-domain NER

Ruotian Ma
Yiding Tan
Xin Zhou
Xuanting Chen
Di Liang
Sirui Wang
Wei Wu
Tao Gui

Input distribution shift is one of the vital problems in unsupervised domain adaptation (UDA). The most popular UDA approaches focus on domain-invariant representation learning (DIRL), trying to align the features from different domains into a similar feature distribution. However, these approaches ignore the direct alignment of input word distributions between domains, which is a vital factor in word-level classification tasks such as cross-domain NER. In this work, we shed new light on cross-domain NER by introducing a subword-level solution, X-Piece, for input word-level distribution shift in NER. Specifically, we re-tokenize the input words of the source domain to approach the target subword distribution, which is formulated and solved as an optimal transport problem. As this approach focuses on the input level, it can also be combined with previous DIRL methods for further improvement. Experimental results show the effectiveness of the proposed method based on BERT-tagger on four benchmark NER datasets. The proposed method is also shown to benefit DIRL methods such as DANN.


TIST Journal 2021 Journal Article

BATS: A Spectral Biclustering Approach to Single Document Topic Modeling and Segmentation

Qiong Wu
Adam Hare
Sirui Wang
Yuwei Tu
Zhenming Liu
Christopher G. Brinton
Yanhua Li

Existing topic modeling and text segmentation methodologies generally require large datasets for training, limiting their capabilities when only small collections of text are available. In this work, we reexamine the inter-related problems of “topic identification” and “text segmentation” for sparse document learning, when there is a single new text of interest. In developing a methodology to handle single documents, we face two major challenges. First is sparse information: with access to only one document, we cannot train traditional topic models or deep learning algorithms. Second is significant noise: a considerable portion of words in any single document will produce only noise and not help discern topics or segments. To tackle these issues, we design an unsupervised, computationally efficient methodology called Biclustering Approach to Topic modeling and Segmentation (BATS). BATS leverages three key ideas to simultaneously identify topics and segment text: (i) a new mechanism that uses word order information to reduce sample complexity, (ii) a statistically sound graph-based biclustering technique that identifies latent structures of words and sentences, and (iii) a collection of effective heuristics that remove noise words and award important words to further improve performance. Experiments on six datasets show that our approach outperforms several state-of-the-art baselines when considering topic coherence, topic diversity, segmentation, and runtime comparison metrics.


EAAI Journal 2019 Journal Article

Optimizing echo state network with backtracking search optimization algorithm for time series forecasting

Zhigang Wang
Yu-Rong Zeng
Sirui Wang
Lin Wang

The echo state network (ESN) is a state-of-the-art reservoir computing approach, which is particularly effective for time series forecasting problems because it is coupled with a time parameter. However, the linear regression algorithm commonly used to compute the output weights of an ESN often causes the trained network to overfit and thus yield unsatisfactory results. To overcome this problem, we present four optimized ESNs that are based on the backtracking search optimization algorithm (BSA) or its variants to improve generalizability. Concretely, we utilize BSA and its variants to determine the most appropriate output weights of the ESN, given that the optimization problem is complex while BSA is a novel evolutionary algorithm that effectively unscrambles optimal solutions in complex spaces. The three BSA variants, namely, adaptive population selection scheme (APSS)–BSA, adaptive mutation factor strategy (AMFS)–BSA, and APSS&AMFS–BSA, were designed to further improve the performance of BSA. Time series forecasting experiments were performed using two real-life time series. The experimental results of the optimized ESNs were compared with those of the basic ESN without optimization, the two other comparison approaches, and other existing approaches. Experimental results showed that (a) the results of the optimized ESNs are more accurate than those of the basic ESN and (b) APSS&AMFS–BSA–ESN nearly outperforms the basic ESN, the three other optimized ESNs, the two comparison approaches, and other existing optimization approaches.
