Arrow Research search

Author name cluster

Chenyang Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

16 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

LexChain: Modeling Legal Reasoning Chains for Chinese Tort Case Analysis

  • Huiyuan Xie
  • Chenyang Li
  • Huining Zhu
  • Chubin Zhang
  • Yuxiao Ye
  • Zhenghao Liu
  • Zhiyuan Liu

Legal reasoning is a fundamental component of legal analysis and decision-making. Existing computational approaches to legal reasoning predominantly rely on generic reasoning frameworks such as syllogism, which do not comprehensively examine the nuanced process of legal reasoning. Moreover, current research has largely focused on criminal cases, with insufficient modeling for civil cases. In this work, we present a novel framework to explicitly model legal reasoning in the analysis of Chinese tort-related civil cases. We first operationalize the legal reasoning process in tort analysis into the three-module LexChain framework, with each module consisting of multiple finer-grained sub-steps. Informed by the LexChain framework, we introduce the task of tort legal reasoning and construct an evaluation benchmark to systematically assess the critical steps within analytical reasoning chains for tort analysis. Leveraging this benchmark, we evaluate existing large language models for their legal reasoning ability in civil tort contexts. Our results indicate that current models still fall short in accurately handling crucial elements of tort legal reasoning. Furthermore, we introduce several baseline approaches that explicitly incorporate LexChain-style reasoning through prompting or post-training. The proposed baselines achieve significant improvements in tort-related legal reasoning and generalize well to related legal analysis tasks, demonstrating the value of explicitly modeling legal reasoning chains to enhance the reasoning capabilities of language models.

NeurIPS Conference 2025 Conference Paper

CoC-VLA: Delving into Adversarial Domain Transfer for Explainable Autonomous Driving via Chain-of-Causality Visual-Language-Action Model

  • Dapeng Zhang
  • Fei Shen
  • Rui Zhao
  • Yinda Chen
  • Peng Zhi
  • Chenyang Li
  • Rui Zhou
  • Qingguo Zhou

Autonomous driving represents a prominent application of artificial intelligence. Recent approaches have shifted from focusing solely on common scenarios to addressing complex, long-tail situations such as subtle human behaviors, traffic accidents, and non-compliant driving patterns. Given the demonstrated capabilities of large language models (LLMs) in understanding visual and natural language inputs and following instructions, recent methods have integrated LLMs into autonomous driving systems to enhance reasoning, interpretability, and performance across diverse scenarios. However, existing methods typically rely either on real-world data, which is suitable for industrial deployment, or on simulation data tailored to rare or hard case scenarios. Few approaches effectively integrate the complementary advantages of both data sources. To address this limitation, we propose a novel VLM-guided, end-to-end adversarial transfer framework for autonomous driving that transfers long-tail handling capabilities from simulation to real-world deployment, named CoC-VLA. The framework comprises a teacher VLM model, a student VLM model, and a discriminator. Both the teacher and student VLM models utilize a shared base architecture, termed the Chain-of-Causality Visual–Language Model (CoC VLM), which integrates temporal information via an end-to-end text adapter. This architecture supports chain-of-thought reasoning to infer complex driving logic. The teacher and student VLM models are pre-trained separately on simulated and real-world datasets. The discriminator is trained adversarially to facilitate the transfer of long-tail handling capabilities from simulated to real-world environments by the student VLM model, using a novel backpropagation strategy. Experimental results show that our method effectively bridges the gap between simulation and real-world autonomous driving, indicating a promising direction for future research.

IJCAI Conference 2025 Conference Paper

DiffusionIMU: Diffusion-Based Inertial Navigation with Iterative Motion Refinement

  • Xiaoqiang Teng
  • Chenyang Li
  • Shibiao Xu
  • Zhihao Hao
  • Deke Guo
  • Jingyuan Li
  • Haisheng Li
  • Weiliang Meng

Inertial navigation enables self-contained localization using only Inertial Measurement Units (IMUs), making it widely applicable in various domains such as navigation, augmented reality, and robotics. However, existing methods suffer from drift accumulation due to the sensor noise and difficulty capturing long-range temporal dependencies, limiting their robustness and accuracy. To address these challenges, we propose DiffusionIMU, a novel diffusion-based framework for inertial navigation. DiffusionIMU enhances direct velocity regression from IMU data through an iterative generative denoising process, progressively refining motion state estimation. It integrates the noise-adaptive feature modulation for sensor variability handling, the feature alignment mechanism for representation consistency, and the diffusion-based temporal modeling to decrease accumulated drift. Experiments show that DiffusionIMU consistently outperforms existing methods, demonstrating superior generalization to unseen users while alleviating the impact of the sensor noise.

JBHI Journal 2025 Journal Article

FROG: A Fine-Grained Spatiotemporal Graph Neural Network With Self-Supervised Guidance for Early Diagnosis of Alzheimer's Disease

  • Shuoyan Zhang
  • Qingmin Wang
  • Min Wei
  • Jiayi Zhong
  • Ying Zhang
  • Ziyan Song
  • Chenyang Li
  • Xiaochen Zhang

Functional magnetic resonance imaging (fMRI) has demonstrated significant potential in the early diagnosis and study of pathological mechanisms of Alzheimer's disease (AD). To fit subtle cross-spatiotemporal interactions and learn pathological features from fMRI, we propose a fine-grained spatiotemporal graph neural network with self-supervised learning (SSL) for diagnosis and biomarker extraction of early AD. First, considering the spatiotemporal interaction of the brain, we design two masks that leverage the spatial correlation and temporal repeatability of fMRI. Afterwards, temporal gated inception convolution and graph scalable inception convolution are proposed for the spatiotemporal autoencoder to enhance subtle cross-spatiotemporal variation and learn noise-suppressed signals. Furthermore, a spatiotemporal scalable cosine error with high selectivity for signal reconstruction is designed in SSL to guide the autoencoder to fit the fine-grained pathological features in an unsupervised manner. A total of 5,687 samples from four cross-population cohorts are involved. The accuracy of our model was 5.1% higher than the state-of-the-art models, which included four AD diagnostic models, four SSL strategies, and three multivariate time series models. The neuroimaging biomarkers were precisely localized to the abnormal brain regions, and correlated significantly with the cognitive scale and biomarkers ($P < 0.001$). Moreover, the AD progression was reflected through the mask reconstruction error of our SSL strategy. The results demonstrate that our model can effectively capture spatiotemporal and pathological features, providing a novel and relevant framework for the early diagnosis of AD based on fMRI.

ICLR Conference 2025 Conference Paper

Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency

  • Jerry Yao-Chieh Hu
  • Wei-Po Wang
  • Ammar Gilani
  • Chenyang Li
  • Zhao Song 0002
  • Han Liu 0001

We investigate the statistical and computational limits of prompt tuning for transformer-based foundation models. Our key contributions are that prompt tuning on *single-head* transformers with only a *single* self-attention layer: (i) is universal, and (ii) supports efficient (even almost-linear time) algorithms under the Strong Exponential Time Hypothesis (SETH). Statistically, we prove that prompt tuning on such simplest possible transformers yields universal approximators for sequence-to-sequence Lipschitz functions. In addition, we provide an exponential-in-$dL$ and -in-$(1/\epsilon)$ lower bound on the required soft-prompt tokens for prompt tuning to memorize any dataset with 1-layer, 1-head transformers. Computationally, we identify a phase transition in the efficiency of prompt tuning, determined by the norm of the *soft-prompt-induced* keys and queries, and provide an upper bound criterion. Beyond this criterion, no sub-quadratic (efficient) algorithm for prompt tuning exists under SETH. Within this criterion, we showcase our theory by proving the existence of almost-linear time prompt tuning inference algorithms. These fundamental limits provide important necessary conditions for designing expressive and efficient prompt tuning methods for practitioners.

YNIMG Journal 2025 Journal Article

Hippocampal subfields in aging: Sex-specific trajectories in structure and hemodynamics

  • Jiaqi Wen
  • Chenyang Li
  • Zhe Sun
  • Chao Wang
  • Jiangyang Zhang
  • Xiaojun Guan
  • Xiaojun Xu
  • Thomas Wisniewski

Sex differences in hippocampal aging have been increasingly recognized, with females showing greater vulnerability to neurodegeneration, particularly after menopause. However, the underlying neurobiological mechanisms remain unclear, especially at the level of hippocampal subfields. Leveraging high-resolution T1-, T2-weighted, and multi-delay arterial spin labeling MRI from 650 adults in the Human Connectome Project-Aging dataset, we examined sex-specific alterations in hippocampal subfield volume, arterial transit time (ATT), and cerebral blood flow (CBF) across the adult lifespan. All hippocampal subfields showed age-related atrophy and ATT prolongation. An age × sex interaction effect on ATT was observed in CA1 and CA2, indicating that age-related increases in ATT were more pronounced in females than in males in these subfields. Moreover, females exhibited more pronounced hippocampal subfields CBF reductions with aging and atrophy, while males showed relatively preserved CBF, with an increase in subiculum perfusion. Furthermore, CA1 showed the lowest perfusion and the strongest association with atrophy among hippocampal subfields. To investigate the potential impact of menopausal hormonal changes on sex-specific patterns, we explored the hypothalamic structure and hemodynamic alterations during aging and their effects on the hippocampus, given that hypothalamus regulates gonadal hormone secretion through the hypothalamic-pituitary-gonadal axis. We found significant hypothalamic atrophy during aging in both sexes, accompanied by ATT prolongation exclusively in females, which was associated with hippocampal atrophy and impaired hemodynamics. Our study highlights the intricate interplay between hippocampal structure and vascular function, revealing sex- and subfield-specific aging trajectories. These findings provide a normative quantitative imaging reference to age-related neurodegenerative diseases such as Alzheimer's Disease.

AAAI Conference 2025 Conference Paper

MapExpert: Online HD Map Construction with Simple and Efficient Sparse Map Element Expert

  • Dapeng Zhang
  • Dayu Chen
  • Peng Zhi
  • Yinda Chen
  • Zhenlong Yuan
  • Chenyang Li
  • Sunjing
  • Rui Zhou

Constructing online High-Definition (HD) maps is crucial for the static environment perception of autonomous driving systems (ADS). Existing solutions typically attempt to detect vectorized HD map elements with unified models; however, these methods often overlook the distinct characteristics of different non-cubic map elements, making accurate distinction challenging. To address these issues, we introduce an expert-based online HD map method, termed MapExpert. MapExpert utilizes sparse experts, distributed by our routers, to describe various non-cubic map elements accurately. Additionally, we propose an auxiliary balance loss function to distribute the load evenly across experts. Furthermore, we theoretically analyze the limitations of prevalent bird's-eye view (BEV) feature temporal fusion methods and introduce an efficient temporal fusion module called Learnable Weighted Moving Descentage. This module effectively integrates relevant historical information into the final BEV features. Combined with an enhanced slice head branch, the proposed MapExpert achieves state-of-the-art performance and maintains good efficiency on both nuScenes and Argoverse2 datasets.

YNIMG Journal 2025 Journal Article

Motor-cognitive aging: The role of motor cortex and its pathways

  • Jiaqi Wen
  • Zifei Liang
  • Chenyang Li
  • Huize Pang
  • Li Jiang
  • Jiayi Li
  • Xiaojun Guan
  • Jiangyang Zhang

BACKGROUND: Motor and cognitive decline are hallmark features of aging. In the primary motor cortex (M1), pyramidal neurons project to the corticospinal tract (CST), a well-established motor pathway, and send collaterals to the ipsilateral striatum, forming the corticostriatal tract (CStrT). While the CST has been extensively studied, the role of the CStrT in motor and cognitive aging remains poorly understood. METHODS: We analyzed T1- and T2-weighted MRI, multi-delay arterial spin labeling, and multi-shell diffusion MRI data from 339 right-handed healthy adults (aged 36-90 years) in the Human Connectome Project-Aging dataset. Age-related trajectories of M1 structure and hemodynamics, as well as CST and CStrT microstructure, were assessed. Segment-wise along-tract analyses were conducted to identify localized tract degeneration. Mediation analyses were performed to examine whether tract integrity linked M1 atrophy to motor and cognitive performance. RESULTS: With age, M1 exhibited reduced volume and hemodynamics, altered T1/T2 ratio, and increased cortical curvature, reflecting structural and hemodynamic alterations. Along-tract analyses revealed localized microstructural degeneration in the CST adjacent to M1, whereas the CStrT showed more extensive degeneration along its trajectory. These tract changes were associated with structural and hemodynamic alterations in M1. Furthermore, integrity of the dominant (left) CST and CStrT mediated the relationship between ipsilateral M1 atrophy and motor decline. Notably, CStrT integrity also mediated the association between M1 atrophy and motor cognition decline. CONCLUSION: These findings establish age-related structural and functional degeneration of M1 and its pathways, highlighting the CStrT as a critical mediator between motor cortical atrophy and both motor and cognitive decline. These normative imaging markers of healthy aging may help inform the early detection of neurodegenerative diseases.

NeurIPS Conference 2025 Conference Paper

Rethinking Tokenized Graph Transformers for Node Classification

  • Jinsong Chen
  • Chenyang Li
  • Gaichao Li
  • John Hopcroft
  • Kun He

Node tokenized graph Transformers (GTs) have shown promising performance in node classification. The generation of token sequences is the key module in existing tokenized GTs which transforms the input graph into token sequences, facilitating the node representation learning via Transformer. In this paper, we observe that the generations of token sequences in existing GTs only focus on the first-order neighbors on the constructed similarity graphs, which leads to the limited usage of nodes to generate diverse token sequences, further restricting the potential of tokenized GTs for node classification. To this end, we propose a new method termed SwapGT. SwapGT first introduces a novel token swapping operation based on the characteristics of token sequences that fully leverages the semantic relevance of nodes to generate more informative token sequences. Then, SwapGT leverages a Transformer-based backbone to learn node representations from the generated token sequences. Moreover, SwapGT develops a center alignment loss to constrain the representation learning from multiple token sequences, further enhancing the model performance. Extensive empirical results on various datasets showcase the superiority of SwapGT for node classification. Code is available at https://github.com/JHL-HUST/SwapGT.

EAAI Journal 2024 Journal Article

An improved flocking control algorithm to solve the effect of individual communication barriers on flocking cohesion in multi-agent systems

  • Chenyang Li
  • Yonghui Yang
  • Tian-Yun Huang
  • Xue-Bo Chen

Flocking cohesion is a crucial factor for groups to maintain aggregation. Communication barriers can significantly challenge the maintenance of flocking aggregation and integrity. Although several studies have focused on communication barriers between agents and the Leader, less attention has been given to agents' self-communication barriers. This paper investigates the effect of two kinds of communication barriers that coexist on flocking cohesion. One is the communication barrier between agents and the Leader. There are informed and uninformed agents, with informed agents receiving different degrees of complete information about the Leader. The other is the agent's self-communication barrier, where each agent has a failure rate that prevents it from obtaining and transmitting information. Then, a novel potential function is designed by analyzing the correlation between the potential function and flocking cohesion. It enhances flocking cohesion and reduces flocking time. Moreover, the flocking control algorithm is improved by incorporating a local feedback mechanism. It allows informed agents to transfer information more broadly and promotes interaction among agents, further enhancing flocking cohesion and integrity. Stability analysis is performed using the Lyapunov theorem. Finally, we propose three criteria for evaluating flocking performance. Based on these criteria, simulation results demonstrate the proposed potential function and control algorithm's significant advantages. This control algorithm is applicable for addressing individual communication issues in swarming drones or flocking robots during missions in engineering applications.

YNIMG Journal 2024 Journal Article

In vivo mapping of hippocampal venous vasculature and oxygenation using susceptibility imaging at 7T

  • Chenyang Li
  • Sagar Buch
  • Zhe Sun
  • Marco Muccio
  • Li Jiang
  • Yongsheng Chen
  • E. Mark Haacke
  • Jiangyang Zhang

Mapping the small venous vasculature of the hippocampus in vivo is crucial for understanding how functional changes of hippocampus evolve with age. Oxygen utilization in the hippocampus could serve as a sensitive biomarker for early degenerative changes, surpassing hippocampal tissue atrophy as the main source of information regarding tissue degeneration. Using an ultrahigh field (7T) susceptibility-weighted imaging (SWI) sequence, it is possible to capture oxygen-level dependent contrast of submillimeter-sized vessels. Moreover, the quantitative susceptibility mapping (QSM) results derived from SWI data allow for the simultaneous estimation of venous oxygenation levels, thereby enhancing the understanding of hippocampal function. In this study, we proposed two potential imaging markers in a cohort of 19 healthy volunteers aged between 20 and 74 years. These markers were: 1) hippocampal venous density on SWI images and 2) venous susceptibility ($\Delta\chi_{\mathrm{vein}}$) in the hippocampus-associated draining veins (the inferior ventricular veins (IVV) and the basal veins of Rosenthal (BVR)) using QSM images. They were chosen specifically to help characterize the oxygen utilization of the human hippocampus and medial temporal lobe (MTL). As part of the analysis, we demonstrated the feasibility of measuring hippocampal venous density and $\Delta\chi_{\mathrm{vein}}$ in the IVV and BVR at 7T with high spatial resolution (0.25 × 0.25 × 1 mm³). Our results demonstrated the in vivo reconstruction of the hippocampal venous system, providing initial evidence regarding the presence of the venous arch structure within the hippocampus. Furthermore, we evaluated the age effect of the two quantitative estimates and observed a significant increase in $\Delta\chi_{\mathrm{vein}}$ for the IVV with age (p = 0.006, r² = 0.369). This may suggest the potential application of $\Delta\chi_{\mathrm{vein}}$ in IVV as a marker for assessing changes in atrophy-related hippocampal oxygen utilization in normal aging and neurodegenerative diseases such as AD and dementia.

ECAI Conference 2024 Conference Paper

Semantic Similarity Driven Multi-Modal Model for Rumor Detection

  • Chenyang Li
  • Bo Xu
  • Meng Wang 0039
  • Kun He 0001

The wide spread of rumors with images and texts on social media has attracted broad attention in the academy and industry. Existing models focus on utilizing powerful feature extractors to obtain multi-modal features and introducing various external knowledge. However, the intrinsic semantic similarity of different modalities is either simply ignored in most models or far from adequate in others. The insufficiency of semantic similarity information suppresses the potential of rumor detection models severely. To address this issue, we propose a novel model termed the Semantic Similarity driven Multi-modal model (SemSim) for rumor detection, which deeply captures the semantic similarity through more comprehensive fusion between different modalities and designs a new classification method consequently. Specifically, the proposed SemSim first integrates the raw image and raw text into a virtual image, which fuses information at a new view, i.e., via the diffusion process inside stable diffusion models. Then SemSim captures the semantic similarity score between virtual image and raw image as the intrinsic information to drive SemSim. Besides, co-attention mechanism is employed to further perceive consistency and enhance interaction between the raw text-image pair. The fused representations via co-attention are utilized to evaluate the multi-modal feature score. In the end, SemSim balances the above two scores for final classification. Experiments on two typical real-world datasets show that SemSim can effectively detect rumors and outperform state-of-the-art methods.

ICRA Conference 2024 Conference Paper

SGCalib: A Two-stage Camera-LiDAR Calibration Method Using Semantic Information and Geometric Features

  • Zhipeng Lin
  • Zhi Gao 0005
  • Xinyi Liu 0002
  • Jialiang Wang
  • Weiwei Song
  • Ben M. Chen
  • Chenyang Li
  • Yue Huang

Extrinsic calibration is an essential prerequisite for the applications of camera-LiDAR fusion. Existing methods either suffer from the complex offline setting of man-made targets or tend to produce suboptimal and unrobust results. In this paper, we propose an online two-stage calibration method that estimates robust and accurate extrinsic parameters between camera and LiDAR. This is a novel work to use semantic information and geometric features jointly in calibration to promote accuracy and robustness. In the first stage, we detect objects in the image and point cloud and build graphs on the objects using Delaunay triangulation. Then, we design a novel graph matching algorithm to associate the objects in the two data domains and extract pairs of 2D-3D points. Using the PnP solver, we get robust initial extrinsic parameters. Then, in the second stage, we design a new optimization formulation with semantic information and geometric features to generate accurate extrinsic parameters with the initial value from the first stage. Extensive experiments on solid-state LiDAR, conventional spinning LiDAR and KITTI datasets have verified the robustness and accuracy of our method which outperforms existing works. We will share the code publicly to benefit the community (after review stages).

IJCAI Conference 2023 Conference Paper

DAMO-StreamNet: Optimizing Streaming Perception in Autonomous Driving

  • Jun-Yan He
  • Zhi-Qi Cheng
  • Chenyang Li
  • Wangmeng Xiang
  • Binghui Chen
  • Bin Luo
  • Yifeng Geng
  • Xuansong Xie

In the realm of autonomous driving, real-time perception or streaming perception remains under-explored. This research introduces DAMO-StreamNet, a novel framework that merges the cutting-edge elements of the YOLO series with a detailed examination of spatial and temporal perception techniques. DAMO-StreamNet's main inventions include: (1) a robust neck structure employing deformable convolution, bolstering receptive field and feature alignment capabilities; (2) a dual-branch structure synthesizing short-path semantic features and long-path temporal features, enhancing the accuracy of motion state prediction; (3) logits-level distillation facilitating efficient optimization, which aligns the logits of teacher and student networks in semantic space; and (4) a real-time prediction mechanism that updates the features of support frames with the current frame, providing smooth streaming perception during inference. Our testing shows that DAMO-StreamNet surpasses current state-of-the-art methodologies, achieving 37.8% (normal size (600, 960)) and 43.3% (large size (1200, 1920)) sAP without requiring additional data. This study not only establishes a new standard for real-time perception but also offers valuable insights for future research. The source code is at https://github.com/zhiqic/DAMO-StreamNet.

YNIMG Journal 2023 Journal Article

Improving measurement of blood-brain barrier permeability with reduced scan time using deep-learning-derived capillary input function

  • Jonghyun Bae
  • Chenyang Li
  • Arjun Masurkar
  • Yulin Ge
  • Sungheon Gene Kim

PURPOSE: In Dynamic contrast-enhanced MRI (DCE-MRI), Arterial Input Function (AIF) has been shown to be a significant contributor to uncertainty in the estimation of kinetic parameters. This study is to assess the feasibility of using a deep learning network to estimate local Capillary Input Function (CIF) to estimate blood-brain barrier (BBB) permeability, while reducing the required scan time. MATERIALS AND METHOD: -10min methods in estimating the PS values. RESULTS: -10min. We found a 75% increase of BBB permeability in the gray matter and a 35% increase in the white matter, when comparing the older group to the younger group. CONCLUSIONS: We demonstrated the feasibility of estimating the capillary-level input functions using a deep learning network. We also showed that this method can be used to estimate subtle age-related changes in BBB permeability with reduced scan time, without compromising accuracy. Moreover, the trained deep learning network can automatically select CIF, reducing the potential uncertainty resulting from manual user-intervention.

AAAI Conference 2019 Conference Paper

Skeleton-Based Gesture Recognition Using Several Fully Connected Layers with Path Signature Features and Temporal Transformer Module

  • Chenyang Li
  • Xin Zhang
  • Lufan Liao
  • Lianwen Jin
  • Weixin Yang

Skeleton-based gesture recognition is gaining popularity due to its wide range of possible applications. The key issues are how to extract discriminative features and how to design the classification model. In this paper, we first leverage a robust feature descriptor, path signature (PS), and propose three PS features to explicitly represent the spatial and temporal motion characteristics, i.e., spatial PS (S_PS), temporal PS (T_PS) and temporal spatial PS (T_S_PS). Considering the significance of fine hand movements in the gesture, we propose an "attention on hand" (AOH) principle to define joint pairs for the S_PS and select a single joint for the T_PS. In addition, the dyadic method is employed to extract the T_PS and T_S_PS features that encode global and local temporal dynamics in the motion. Secondly, without the recurrent strategy, the classification model still faces challenges on temporal variation among different sequences. We propose a new temporal transformer module (TTM) that can match the sequence key frames by learning the temporal shifting parameter for each input. This is a learning-based module that can be included in standard neural network architectures. Finally, we design a multi-stream fully connected layer based network to treat spatial and temporal features separately and fuse them together for the final result. We have tested our method on three benchmark gesture datasets, i.e., ChaLearn 2016, ChaLearn 2013 and MSRC-12. Experimental results demonstrate that we achieve state-of-the-art performance on skeleton-based gesture recognition with high computational efficiency.