Arrow Research search

Author name cluster

Hao Lin

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers
2 author rows

Possible papers

9

AAAI Conference 2026 Conference Paper

Driving with Advice: Large Model as Motion Advisor for Joint Planning

  • Junyin Wang
  • Jinlei Yu
  • Hao Lin
  • Huikai Liu
  • Wenqian Zhu
  • Shengwu Xiong

We address the challenge of integrating high-level semantic reasoning with low-level trajectory planning in end-to-end autonomous driving, where most existing frameworks decouple perception, decision-making, and control, leading to limited interpretability and poor instruction compliance. To bridge this gap, we propose Driving with Advice, a novel closed-loop framework that treats a vision-language model (VLM) as a motion advisor to provide interpretable, language-mediated guidance for trajectory generation. Our approach introduces three key innovations: (1) Semantic-Intentional Pretraining (SIP), which injects driving rationale into a compact VLM via machine-generated question-answering pairs; (2) a discrete action space grounded in directional and speed primitives, enabling structured and interpretable policy learning; and (3) an advice-following diffusion policy refined via Group Relative Policy Optimization under a multi-objective reward that ensures safety, comfort, and alignment with semantic intent. We evaluate our method on the NAVSIM benchmark in a closed-loop setting, achieving a state-of-the-art Predictive Driver Model Score (PDMS) of 91.5, outperforming strong baselines in safety (NC: 99.2). The results demonstrate that leveraging language as a cognitive interface between perception and control enhances both generalization and behavioral transparency, advancing the paradigm of language-conditioned driving.

AAAI Conference 2026 Conference Paper

Not Just What’s There: Enabling CLIP to Comprehend Negated Visual Descriptions Without Fine-Tuning

  • Junhao Xiao
  • Zhiyu Wu
  • Hao Lin
  • Yi Chen
  • Yahui Liu
  • Xiaoran Zhao
  • Zixu Wang
  • Zejiang He

Vision-Language Models (VLMs) like CLIP struggle to understand negation, often embedding affirmatives and negatives similarly (e.g., matching "no dog" with dog images). Existing methods refine negation understanding via fine-tuning CLIP’s text encoder, risking overfitting. In this work, we propose CLIPGlasses, a plug-and-play framework that enhances CLIP’s ability to comprehend negated visual descriptions. CLIPGlasses adapts a dual-stage design: a Lens module disentangles negated semantics from text embeddings, and a Frame module predicts context-aware repulsion strength, which is integrated into the modified similarity computation to penalize alignment with negated semantics, thereby reducing false positive matches. Experiments show that CLIP equipped with CLIPGlasses achieves competitive in-domain performance and outperforms state-of-the-art methods in cross-domain generalization. Its superiority is especially evident under low-resource conditions, indicating stronger robustness across domains.

AAAI Conference 2025 Conference Paper

Bridging Traffic State and Trajectory for Dynamic Road Network and Trajectory Representation Learning

  • Chengkai Han
  • Jingyuan Wang
  • Yongyao Wang
  • Xie Yu
  • Hao Lin
  • Chao Li
  • Junjie Wu

Effective urban traffic management is vital for sustainable city development, relying on intelligent systems with machine learning tasks such as traffic flow prediction and travel time estimation. Traditional approaches usually focus on static road network and trajectory representation learning, and overlook the dynamic nature of traffic states and trajectories, which is crucial for downstream tasks. To address this gap, we propose TRACK, a novel framework to bridge traffic state and trajectory data for dynamic road network and trajectory representation learning. TRACK leverages graph attention networks (GAT) to encode static and spatial road segment features, and introduces a transformer-based model for trajectory representation learning. By incorporating transition probabilities from trajectory data into GAT attention weights, TRACK captures dynamic spatial features of road segments. Meanwhile, TRACK designs a traffic transformer encoder to capture the spatial-temporal dynamics of road segments from traffic state data. To further enhance dynamic representations, TRACK proposes a co-attentional transformer encoder and a trajectory-traffic state matching task. Extensive experiments on real-life urban traffic datasets demonstrate the superiority of TRACK over state-of-the-art baselines. Case studies confirm TRACK’s ability to capture spatial-temporal dynamics effectively.

ICML Conference 2025 Conference Paper

RollingQ: Reviving the Cooperation Dynamics in Multimodal Transformer

  • Haotian Ni
  • Yake Wei
  • Hang Liu
  • Gong Chen
  • Chong Peng
  • Hao Lin
  • Di Hu 0001

Multimodal learning faces challenges in effectively fusing information from diverse modalities, especially when modality quality varies across samples. Dynamic fusion strategies, such as attention mechanism in Transformers, aim to address such challenge by adaptively emphasizing modalities based on the characteristics of input data. However, through amounts of carefully designed experiments, we surprisingly observed that the dynamic adaptability of widely-used self-attention models diminishes. Model tends to prefer one modality regardless of data characteristics. This bias triggers a self-reinforcing cycle that progressively overemphasizes the favored modality, widening the distribution gap in attention keys across modalities and deactivating attention mechanism’s dynamic properties. To revive adaptability, we propose a simple yet effective method Rolling Query (RollingQ), which balances attention allocation by rotating the query to break the self-reinforcing cycle and mitigate the key distribution gap. Extensive experiments on various multimodal scenarios validate the effectiveness of RollingQ and the restoration of cooperation dynamics is pivotal for enhancing the broader capabilities of widely deployed multimodal Transformers. The source code is available at https: //github. com/GeWu-Lab/RollingQ_ICML2025.

TCS Journal 2024 Journal Article

Hardness of Entropic Module-LWE

  • Hao Lin
  • Mingqiang Wang
  • Jincheng Zhuang
  • Yang Wang

The Learning with Errors (LWE) problem is a versatile basis for building various purpose post-quantum schemes. Goldwasser et al. [ISC 2010] initialized the study of a variant of this problem called the Entropic LWE problem, where the LWE secret is generated from a distribution with a certain min-entropy. Brakerski and Döttling recently further extended the study in this field, and first proved the hardness of the Entropic LWE problem with unbounded secret [Eurocrypt 2020], then gave a similar result for the Entropic Ring-LWE problem [TCC 2020]. In this work, we systematically study the hardness of the Entropic Module-LWE problem. Adapting the “lossiness approach” to the module setting, we give lower entropy bounds for the secret distributions that guarantee the hardness of the Entropic Module-LWE problem in both search and decision cases, where results are divided into two settings: bounded and unbounded norm. We also present that our search entropy lower bound in the unbounded case is essentially tight. An application of our bounded result is to deduce the hardness for the Binary Module-LWE problem. One of our central techniques is a new generalized leftover hash lemma over rings, which might be of independent interest.

ICML Conference 2023 Conference Paper

Surrogate Module Learning: Reduce the Gradient Error Accumulation in Training Spiking Neural Networks

  • Shikuang Deng
  • Hao Lin
  • Yuhang Li 0001
  • Shi Gu

Spiking neural networks provide an alternative solution to conventional artificial neural networks with energy-saving and high-efficiency characteristics after hardware implantation. However, due to its non-differentiable activation function and the temporally delayed accumulation in outputs, the direct training of SNNs is extraordinarily tough even adopting a surrogate gradient to mimic the backpropagation. For SNN training, this non-differentiability causes the intrinsic gradient error that would be magnified through layerwise backpropagation, especially through multiple layers. In this paper, we propose a novel approach to reducing gradient error from a new perspective called surrogate module learning (SML). Surrogate module learning tries to construct a shortcut path to back-propagate more accurate gradient to a certain SNN part utilizing the surrogate modules. Then, we develop a new loss function for concurrently training the network and enhancing the surrogate modules’ surrogate capacity. We demonstrate that when the outputs of surrogate modules are close to the SNN output, the fraction of the gradient error drops significantly. Our method consistently and significantly enhances the performance of SNNs on all experiment datasets, including CIFAR-10/100, ImageNet, and ES-ImageNet. For example, for spiking ResNet-34 architecture on ImageNet, we increased the SNN accuracy by 3. 46%.

AAAI Conference 2017 Conference Paper

Collaborative Company Profiling: Insights from an Employee’s Perspective

  • Hao Lin
  • Hengshu Zhu
  • Yuan Zuo
  • Chen Zhu
  • Junjie Wu
  • Hui Xiong

Company profiling is an analytical process to build an indepth understanding of company’s fundamental characteristics. It serves as an effective way to gain vital information of the target company and acquire business intelligence. Traditional approaches for company profiling rely heavily on the availability of rich finance information about the company, such as finance reports and SEC filings, which may not be readily available for many private companies. However, the rapid prevalence of online employment services enables a new paradigm — to obtain the variety of company’s information from their employees’ online ratings and comments. This, in turn, raises the challenge to develop company pro- files from an employee’s perspective. To this end, in this paper, we propose a method named Company Profiling based Collaborative Topic Regression (CPCTR), for learning the latent structural patterns of companies. By formulating a joint optimization framework, CPCTR has the ability in collaboratively modeling both textual (e. g. , reviews) and numerical information (e. g. , salaries and ratings). Indeed, with the identi- fied patterns, including the positive/negative opinions and the latent variable that influences salary, we can effectively carry out opinion analysis and salary prediction. Extensive experiments were conducted on a real-world data set to validate the effectiveness of CPCTR. The results show that our method provides a comprehensive understanding of company characteristics and delivers a more effective prediction of salaries than other baselines.

AIIM Journal 2017 Journal Article

Identify and analysis crotonylation sites in histone by using support vector machines

  • Wang-Ren Qiu
  • Bi-Qian Sun
  • Hua Tang
  • Jian Huang
  • Hao Lin

Objective Lysine crotonylation (Kcr) is a newly discovered histone posttranslational modification, which is specifically enriched at active gene promoters and potential enhancers in mammalian cell genomes. Although lysine crotonylation sites can be correctly identified with high-resolution mass spectrometry, the experimental methods are time-consuming and expensive. Therefore, it is necessary to develop computational methods to deal with this problem. Methods We proposed a new encoding scheme named position weight amino acid composition to extract sequence information of histone around crotonylation sites. We chose protein data from Uniprot database. A series of steps were used to construct a strict and objective benchmark dataset for training and testing the proposed method. All samples were characterized by a significant number of features derived from position weight amino acid composition. The support vector machine was used to perform classification. Results Based on a series of experiments, we found that the sensitivity (Sn), specificity (Sp), accuracy (Acc), and Matthew’s correlation coefficient (MCC) were respectively 71. 69%, 98. 7%, 94. 43%, and 0. 778 in jackknife cross-validation. Comparison results demonstrated that our proposed model was better than random forest algorithm. We also performed the feature analysis on samples. Conclusion Identification of the Kcr sites in histone is an indispensable step for decoding protein function. Therefore, the method can promote the deep understanding of the physiological roles of crotonylation and provide useful information for developing drugs to treat various diseases associated with crotonylation.

IS Journal 2015 Journal Article

Noninvasive and Continuous Blood Pressure Monitoring Using Wearable Body Sensor Networks

  • Hao Lin
  • Wenyao Xu
  • Nan Guan
  • Dong Ji
  • Yangjie Wei
  • Wang Yi

Hypertension is a major health risk that influences the quality of life for many people. The importance of monitoring hypertension in a continuous and noninvasive manner increases as more people experience raised blood pressure (BP). The authors present a smartphone-centric body sensor network to measure the pulse transit time (PTT) in real time. Their robust method for calculating BP uses PPT information that considers the baroreflex mechanism, which reflects the relationship between BP and the heart rate. To evaluate the performance of their proposed method, they collected 300 groups of data from six subjects before and after exercise. Experimental results show that their proposed method can estimate BP values in real time with good precision.