EAAI Journal 2026 Journal Article
Cross-query contextual clues dynamic enhancement for partially relevant video retrieval
- Ou Ye
- Rongkang Wang
- Zhenhua Yu
- Yun Zhang
- Wenchao Zhang
- Liangguo Xiao
Author name cluster
Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.
EAAI Journal 2026 Journal Article
EAAI Journal 2026 Journal Article
ICLR Conference 2025 Conference Paper
Knowledge distillation (KD) compresses deep neural networks by transferring task-related knowledge from cumbersome pre-trained teacher models to more compact student models. However, vanilla KD for image super-resolution (SR) networks yields only limited improvements due to the inherent nature of SR tasks, where the outputs of teacher models are noisy approximations of high-quality label images. In this work, we show that the potential of vanilla KD has been underestimated and demonstrate that the ingenious application of data augmentation methods can close the gap between it and more complex, well-designed methods. Unlike conventional training processes, which typically apply image augmentations simultaneously to both low-quality inputs and high-quality labels, we propose AugKD, which utilizes unpaired data augmentations to 1) generate auxiliary distillation samples and 2) impose label-consistency regularization. Comprehensive experiments show that AugKD significantly outperforms existing state-of-the-art KD methods across a range of SR tasks.
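The two uses of unpaired augmentation named in the abstract can be sketched as follows. This is a minimal illustration based only on the abstract: the `teacher`/`student` callables, the flip augmentation, and the L1 loss are illustrative stand-ins, not AugKD's actual implementation.

```python
# Hypothetical sketch of unpaired augmentation for KD in super-resolution.
# Images are lists of rows; horizontal flip is its own inverse.

def hflip(img):
    """Horizontally flip an image given as a list of rows."""
    return [list(reversed(row)) for row in img]

def l1(a, b):
    """Mean absolute difference between two images of equal shape."""
    n = sum(len(row) for row in a)
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n

def augkd_losses(student, teacher, lr_img):
    # 1) Auxiliary distillation sample: augment the input only (unpaired),
    #    and distill from the teacher's output on the augmented input.
    aug_lr = hflip(lr_img)
    distill = l1(student(aug_lr), teacher(aug_lr))
    # 2) Label-consistency regularization: the student's prediction on the
    #    augmented input, mapped back by the inverse augmentation, should
    #    match its prediction on the original input.
    consistency = l1(hflip(student(aug_lr)), student(lr_img))
    return distill, consistency

# Toy check with identity "networks" on a 2x2 "image": both terms vanish.
identity = lambda img: img
d, c = augkd_losses(identity, identity, [[1, 2], [3, 4]])
```

The key contrast with paired augmentation is that only the input is transformed; the high-quality label is never flipped alongside it.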
NeurIPS Conference 2025 Conference Paper
Recent advancements in Vision-Language-Action (VLA) models have shown promise for end-to-end autonomous driving by leveraging world knowledge and reasoning capabilities. However, current VLA models often struggle with physically infeasible action outputs, complex model structures, or unnecessarily long reasoning. In this paper, we propose AutoVLA, a novel VLA model that unifies reasoning and action generation within a single autoregressive generation model for end-to-end autonomous driving. AutoVLA performs semantic reasoning and trajectory planning directly from raw visual inputs and language instructions. We tokenize continuous trajectories into discrete, feasible actions, enabling direct integration into the language model. For training, we employ supervised fine-tuning to equip the model with dual thinking modes: fast thinking (trajectory-only) and slow thinking (enhanced with chain-of-thought reasoning). To further enhance planning performance and efficiency, we introduce a reinforcement fine-tuning method based on Group Relative Policy Optimization (GRPO), reducing unnecessary reasoning in straightforward scenarios. Extensive experiments across real-world and simulated datasets and benchmarks, including nuPlan, nuScenes, Waymo, and CARLA, demonstrate the competitive performance of AutoVLA in both open-loop and closed-loop settings. Qualitative results showcase the adaptive reasoning and accurate planning capabilities of AutoVLA in diverse scenarios.
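The abstract's step of tokenizing continuous trajectories into discrete, feasible actions can be sketched as below. The uniform grid over per-waypoint displacements, the bin count, and the range limit are assumptions for illustration, not AutoVLA's actual tokenization scheme.

```python
# Hypothetical trajectory tokenizer: quantize (dx, dy) waypoint
# displacements onto a uniform grid, one vocabulary id per grid cell.

BINS = 21          # tokens per axis
LIMIT = 5.0        # max |displacement| covered by the grid (assumed metres)
STEP = 2 * LIMIT / (BINS - 1)

def tokenize(traj):
    """Map (dx, dy) waypoint displacements to integer token ids."""
    def to_bin(v):
        v = max(-LIMIT, min(LIMIT, v))   # clamp to the feasible range
        return round((v + LIMIT) / STEP)
    return [to_bin(dx) * BINS + to_bin(dy) for dx, dy in traj]

def detokenize(tokens):
    """Inverse map: token ids back to (dx, dy) bin centres."""
    def to_val(b):
        return b * STEP - LIMIT
    return [(to_val(t // BINS), to_val(t % BINS)) for t in tokens]

traj = [(0.0, 0.5), (1.0, 0.5), (2.0, 0.0)]
tokens = tokenize(traj)          # each token is an index into BINS*BINS ids
recovered = detokenize(tokens)   # grid-aligned inputs round-trip exactly
```

Clamping to the grid range is one simple way a discrete action set can enforce physical feasibility: any id decodes to a displacement inside the allowed envelope.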
ICLR Conference 2025 Conference Paper
Post-training quantization (PTQ) has played a pivotal role in compressing large language models (LLMs) at ultra-low costs. Although current PTQ methods have achieved promising results by addressing outliers and employing layer- or block-wise loss optimization techniques, they still suffer from significant performance degradation at ultra-low bit precision. To dissect this issue, we conducted an in-depth analysis of quantization errors specific to LLMs and surprisingly discovered that, unlike traditional sources of quantization errors, the growing number of model parameters, combined with the reduction in quantization bits, intensifies inter-layer and intra-layer dependencies, which severely impact quantization accuracy. This finding highlights a critical challenge in quantizing LLMs. To address this, we propose CBQ, a cross-block reconstruction-based PTQ method for LLMs. CBQ leverages a cross-block dependency to establish long-range dependencies across multiple blocks and integrates an adaptive LoRA-Rounding technique to manage intra-layer dependencies. To further enhance performance, CBQ incorporates a coarse-to-fine pre-processing mechanism for processing weights and activations. Extensive experiments show that CBQ achieves superior low-bit quantization (W4A4, W4A8, W2A16) and outperforms existing state-of-the-art methods across various LLMs and datasets. Notably, CBQ takes only 4.3 hours to perform weight-only 4-bit quantization of a LLAMA1-65B model, achieving a commendable trade-off between performance and efficiency.
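The inter-layer dependency the abstract identifies can be made concrete with a toy example: per-layer quantization error can understate the error of the cascaded layers. The scalar "layers" and the symmetric uniform quantizer below are illustrative assumptions, not CBQ's method.

```python
# Toy illustration of why per-layer error metrics miss inter-layer
# dependency in PTQ: errors compound through a cascade of layers.

def quantize(w, bits=2, max_abs=1.0):
    """Uniform symmetric quantization of a scalar weight."""
    levels = 2 ** (bits - 1) - 1          # e.g. one level per side at 2 bits
    step = max_abs / levels
    return round(w / step) * step

def forward(weights, x):
    """Cascade of scalar multiplications standing in for blocks."""
    for w in weights:
        x = x * w
    return x

weights = [0.6, 0.7]                      # two consecutive "blocks"
qweights = [quantize(w) for w in weights] # both snap to 1.0 at 2 bits

# Worst per-layer weight error vs. error of the full cascade on input 1.0:
per_layer_err = max(abs(w - q) for w, q in zip(weights, qweights))
end_to_end_err = abs(forward(weights, 1.0) - forward(qweights, 1.0))
```

Here the end-to-end error exceeds any single layer's error, which is the kind of compounding a cross-block reconstruction objective can account for while purely layer-wise objectives cannot.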
EAAI Journal 2025 Journal Article
YNIMG Journal 2025 Journal Article
YNIMG Journal 2025 Journal Article
ICLR Conference 2025 Conference Paper
Knowledge distillation (KD) is a promising yet challenging model compression approach that transmits rich learning representations from robust but resource-demanding teacher models to efficient student models. Previous methods for image super-resolution (SR) are often tailored to specific teacher-student architectures, limiting their potential for improvement and hindering broader applications. This work presents a novel KD framework for SR models, the multi-granularity Mixture of Priors Knowledge Distillation (MiPKD), which can be universally applied to a wide range of architectures at both feature and block levels. The teacher's knowledge is effectively integrated with the student's feature via the Feature Prior Mixer, and the reconstructed feature propagates dynamically in the training phase with the Block Prior Mixer. Extensive experiments illustrate the significance of the proposed MiPKD technique.
NeurIPS Conference 2025 Conference Paper
Large language models (LLMs) have shown that generative pretraining can distill vast world knowledge into compact token representations. While LLMs encapsulate extensive world knowledge, they remain limited in modeling the behavioral knowledge contained within user interaction histories. User behavior forms a distinct modality, where each action—defined by multi-dimensional attributes such as time, context, and transaction type—constitutes a behavioral token. Modeling these high-cardinality, sparse, and irregular sequences is challenging, and discriminative models often falter under limited supervision. To bridge this gap, we extend generative pretraining to user behavior, learning transferable representations from unlabeled behavioral data analogous to how LLMs learn from text. We present PANTHER, a hybrid generative–discriminative framework that unifies user behavior pretraining and downstream adaptation, enabling large-scale sequential user representation learning and real-time inference. PANTHER introduces: (1) Structured Tokenization to compress multi-dimensional transaction attributes into an interpretable vocabulary; (2) Sequence Pattern Recognition Module (SPRM) for modeling periodic transaction motifs; (3) a Unified User-Profile Embedding that fuses static demographics with dynamic transaction histories, enabling both personalized predictions and population-level knowledge transfer; and (4) Real-time scalability enabled by offline caching of pre-trained embeddings for millisecond-level inference. Fully deployed and operational online at WeChat Pay, PANTHER delivers a 25.6% boost in next-transaction prediction HitRate@1 and a 38.6% relative improvement in fraud detection recall over baselines. Cross-domain evaluations on public benchmarks (CCT, MBD, MovieLens-1M, Yelp) show strong generalization, achieving up to 21% HitRate@1 gains over transformer baselines, establishing PANTHER as a scalable, high-performance framework for industrial user sequential behavior modeling.
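The "Structured Tokenization" component described above—compressing a transaction's multi-dimensional attributes into one id in a compact vocabulary—can be sketched as a mixed-radix encoding. The attribute names, bucket choices, and vocabulary layout below are assumptions for illustration, not PANTHER's actual tokenizer.

```python
# Hypothetical structured tokenizer: each (time bucket, context, type)
# combination maps to a single behavioral token id, and back.

TIME_BUCKETS = ["night", "morning", "afternoon", "evening"]
CONTEXTS = ["online", "offline"]
TXN_TYPES = ["transfer", "purchase", "refund"]

VOCAB_SIZE = len(TIME_BUCKETS) * len(CONTEXTS) * len(TXN_TYPES)

def tokenize_txn(hour, context, txn_type):
    """Map (hour-of-day, context, type) to a single behavioral token id."""
    t = min(hour // 6, len(TIME_BUCKETS) - 1)     # 6-hour time buckets
    c = CONTEXTS.index(context)
    k = TXN_TYPES.index(txn_type)
    # Mixed-radix encoding: one id per attribute combination.
    return (t * len(CONTEXTS) + c) * len(TXN_TYPES) + k

def detokenize(token):
    """Recover the readable attribute triple from a token id."""
    k = token % len(TXN_TYPES)
    c = (token // len(TXN_TYPES)) % len(CONTEXTS)
    t = token // (len(TXN_TYPES) * len(CONTEXTS))
    return TIME_BUCKETS[t], CONTEXTS[c], TXN_TYPES[k]

tok = tokenize_txn(hour=14, context="online", txn_type="purchase")
```

Because every id decodes back to a readable attribute triple, the vocabulary stays interpretable while the sequence model only ever sees integer tokens.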
IROS Conference 2025 Conference Paper
In this paper, we introduce a novel estimator for vision-aided inertial navigation systems (VINS), the Preconditioned Cholesky-based Square Root Information Filter (PC-SRIF). When solving linear systems, employing Cholesky decomposition offers superior efficiency but can compromise numerical stability. Due to this, existing VINS utilizing (Square Root) Information Filters often opt for QR decomposition on platforms where single precision is preferred, avoiding the numerical challenges associated with Cholesky decomposition. While these issues are often attributed to the ill-conditioned information matrix in VINS, our analysis reveals that this is not an inherent property of VINS but rather a consequence of specific parameterizations. We identify several factors that contribute to an ill-conditioned information matrix and propose a preconditioning technique to mitigate these conditioning issues. Building on this analysis, we present PC-SRIF, which exhibits remarkable stability in performing Cholesky decomposition in single precision when solving linear systems in VINS. Consequently, PC-SRIF achieves superior theoretical efficiency compared to alternative estimators. To validate the efficiency advantages and numerical stability of PC-SRIF based VINS, we have conducted well controlled experiments, which provide empirical evidence in support of our theoretical findings. Remarkably, in our VINS implementation, PC-SRIF’s runtime is 41% faster than QR-based SRIF.
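The preconditioning idea the abstract describes—rescaling an ill-conditioned information matrix before Cholesky factorization—can be sketched with a diagonal (Jacobi) preconditioner, D^(-1/2) A D^(-1/2). The tiny example matrix and the diagonal-spread conditioning proxy are illustrative assumptions based only on the abstract, not PC-SRIF's actual preconditioner.

```python
# Sketch: Jacobi preconditioning of an information matrix with wildly
# mismatched scales (e.g. mixed parameter units), then Cholesky.

def cholesky(a):
    """Plain Cholesky factorization A = L L^T (A symmetric pos. definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = (a[i][i] - s) ** 0.5
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l

def jacobi_precondition(a):
    """Return D^(-1/2) A D^(-1/2), which has a unit diagonal."""
    d = [row[i] ** -0.5 for i, row in enumerate(a)]
    n = len(a)
    return [[d[i] * a[i][j] * d[j] for j in range(n)] for i in range(n)]

A = [[1e8, 3e2], [3e2, 1e-2]]     # diagonal entries differ by 10 decades
P = jacobi_precondition(A)
L = cholesky(P)                    # factorization of the rescaled matrix

# Crude conditioning proxy: spread of the diagonal entries.
spread_before = max(A[i][i] for i in range(2)) / min(A[i][i] for i in range(2))
spread_after = max(P[i][i] for i in range(2)) / min(P[i][i] for i in range(2))
```

Python floats are double precision, so this sketch only shows the rescaling itself; the paper's point is that such rescaling is what makes Cholesky viable in single precision.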
EAAI Journal 2024 Journal Article
EAAI Journal 2024 Journal Article
YNIMG Journal 2024 Journal Article
YNIMG Journal 2023 Journal Article
AILAW Journal 2023 Journal Article
Law article prediction is the task of predicting the relevant laws and regulations involved in a case from its description text, and it has broad application prospects for improving judicial efficiency. Existing research typically considers only a single case, employing neural network methods to extract features for prediction, and thus fails to mine the related and common element information shared across different cases. To solve this problem, we propose a law article prediction method that integrates the characteristics of common elements. It can effectively utilize the co-occurrence information in the training data, fully mine the relevant common elements between cases, and fuse local features. Experiments show that our method performs well.
EAAI Journal 2023 Journal Article
JBHI Journal 2021 Journal Article
Adverse drug-drug interactions (ADDIs) pose a significant threat to public health. Although the detection of ADDIs is experimentally carried out in the early phases of drug design, many potential ADDIs are still discovered clinically by accident, leading to substantial morbidity and mortality. Several computational models have been designed for ADDI prediction. However, they take no account of drug dependency, even though many drugs produce synergistic effects and exhibit high mutual dependency in treatments, which contains underlying information about ADDIs and benefits ADDI prediction. In this paper, we design a dependent network to model drug dependency and propose an attribute-supervised learning model, Probabilistic Dependent Matrix Tri-Factorization (PDMTF), for ADDI prediction. In particular, PDMTF incorporates two drug attributes, molecular structure and side effects, and their correlation to model the adverse interactions among drugs. The dependent network is represented by a dependent matrix, which is first formulated from the row precision matrix of the predicted attribute matrices and then regularized by the molecular structure similarities among drugs. Meanwhile, an efficient alternating algorithm is designed to solve the optimization problem of PDMTF. Experiments demonstrate the superior performance of the proposed model compared with eight baselines and its two variants.
EAAI Journal 2021 Journal Article
EAAI Journal 2017 Journal Article
JBHI Journal 2017 Journal Article
In this paper, we propose a novel concordance coefficient, called the order statistics concordance coefficient (OSCOC), to quantify the association among multichannel biosignals. To uncover its properties, we compare OSCOC with three other similar indexes, i.e., the average Pearson's product moment correlation coefficient (APPMCC), Kendall's concordance coefficient (KCC), and average Kendall's tau (AKT), under a multivariate normal model (MNM), a linear model (LM), and a nonlinear model. To further demonstrate its usefulness, we present an example of atrial arrhythmia analysis based on real-world multichannel cardiac signals. Theoretical derivations as well as numerical results suggest that 1) under MNM and LM, OSCOC performs equally well with APPMCC and outperforms the other two methods, 2) in the nonlinear case, OSCOC even outperforms KCC and AKT, which are well known to be robust under increasing nonlinear transformations, and 3) OSCOC performs the best in the case study of arrhythmia analysis in terms of the volume under the surface.
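OSCOC itself is not specified in the abstract, so no sketch of it is attempted here; the APPMCC baseline it is compared against, however, is standard and can be shown directly: the average of Pearson correlations over all channel pairs.

```python
# APPMCC baseline from the abstract: average pairwise Pearson correlation
# across the channels of a multichannel signal.

def pearson(x, y):
    """Pearson's product moment correlation between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def appmcc(channels):
    """Average Pearson correlation over all distinct channel pairs."""
    m = len(channels)
    pairs = [(i, j) for i in range(m) for j in range(i + 1, m)]
    return sum(pearson(channels[i], channels[j]) for i, j in pairs) / len(pairs)

# Three perfectly concordant channels (all linearly related):
sigs = [[1, 2, 3, 4], [2, 4, 6, 8], [0, 1, 2, 3]]
score = appmcc(sigs)
```

As a linear measure, APPMCC saturates at 1 for linearly related channels; the abstract's point is that rank-based measures like KCC and AKT, and OSCOC, degrade more gracefully under nonlinear transformations.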
EAAI Journal 2015 Journal Article
YNIMG Journal 2015 Journal Article
EAAI Journal 2013 Journal Article
EAAI Journal 2008 Journal Article