Arrow Research search

Author name cluster

Xiuqiang He

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers
1 author row

Possible papers

13

NeurIPS Conference 2025 Conference Paper

Beyond Higher Rank: Token-wise Input-Output Projections for Efficient Low-Rank Adaptation

  • Shiwei Li
  • Xiandi Luo
  • Haozhao Wang
  • Xing Tang
  • Ziqiang Cui
  • Dugang Liu
  • Yuhua Li
  • Xiuqiang He

Low-rank adaptation (LoRA) is a parameter-efficient fine-tuning (PEFT) method widely used in large language models (LLMs). LoRA essentially describes the projection of an input space into a low-dimensional output space, with the dimensionality determined by the LoRA rank. In standard LoRA, all input tokens share the same weights and undergo an identical input-output projection. This limits LoRA's ability to capture token-specific information due to the inherent semantic differences among tokens. To address this limitation, we propose **Token-wise Projected Low-Rank Adaptation (TopLoRA)**, which dynamically adjusts LoRA weights according to the input token, thereby learning token-wise input-output projections in an end-to-end manner. Formally, the weights of TopLoRA can be expressed as $B\Sigma_X A$, where $A$ and $B$ are low-rank matrices (as in standard LoRA), and $\Sigma_X$ is a diagonal matrix generated from each input token $X$. Notably, TopLoRA does not increase the rank of LoRA weights but achieves more granular adaptation by learning token-wise LoRA weights (i.e., token-wise input-output projections). Extensive experiments across multiple models and datasets demonstrate that TopLoRA consistently outperforms LoRA and its variants. The code is available at https://github.com/Leopold1423/toplora-neurips25.
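The $B\Sigma_X A$ formulation above can be illustrated with a minimal numpy sketch. How $\Sigma_X$ is generated from the token is not detailed in the abstract, so the single linear generator `W_sigma` below is an assumption for illustration, as are all the toy dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 16, 16, 4        # toy hidden sizes and LoRA rank (assumptions)

W0 = rng.normal(size=(d_out, d_in))              # frozen base weight
A = rng.normal(scale=0.01, size=(r, d_in))       # standard LoRA down-projection
B = np.zeros((d_out, r))                          # LoRA up-projection, init to zero
W_sigma = rng.normal(scale=0.01, size=(r, d_in))  # hypothetical generator of diag(Sigma_x)

def toplora_forward(x):
    """y = W0 x + B diag(sigma(x)) A x  -- the LoRA update is rescaled per token."""
    sigma_x = W_sigma @ x             # r-dim diagonal, recomputed for each token x
    return W0 @ x + B @ (sigma_x * (A @ x))

x = rng.normal(size=d_in)
y = toplora_forward(x)
```

Because $\Sigma_X$ is diagonal, the adapter still has rank at most $r$; only the scaling of the $r$ projection directions changes per token, which matches the claim that TopLoRA does not increase the rank of the LoRA weights.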

NeurIPS Conference 2025 Conference Paper

Semantic Retrieval Augmented Contrastive Learning for Sequential Recommendation

  • Ziqiang Cui
  • Yunpeng Weng
  • Xing Tang
  • Xiaokun Zhang
  • Shiwei Li
  • Peiyang Liu
  • Bowei He
  • Dugang Liu

Contrastive learning has shown effectiveness in improving sequential recommendation models. However, existing methods still face challenges in generating high-quality contrastive pairs: they either rely on random perturbations that corrupt user preference patterns or depend on sparse collaborative data that generates unreliable contrastive pairs. Furthermore, existing approaches typically require predefined selection rules that impose strong assumptions, limiting the model's ability to autonomously learn optimal contrastive pairs. To address these limitations, we propose a novel approach named Semantic Retrieval Augmented Contrastive Learning (SRA-CL). SRA-CL leverages the semantic understanding and reasoning capabilities of LLMs to generate expressive embeddings that capture both user preferences and item characteristics. These semantic embeddings enable the construction of candidate pools for inter-user and intra-user contrastive learning through semantic-based retrieval. To further enhance the quality of the contrastive samples, we introduce a learnable sample synthesizer that optimizes the contrastive sample generation process during model training. SRA-CL adopts a plug-and-play design, enabling seamless integration with existing sequential recommendation architectures. Extensive experiments on four public datasets demonstrate the effectiveness and model-agnostic nature of our approach. Our code is available at https://github.com/ziqiangcui/SRA-CL.

NeurIPS Conference 2023 Conference Paper

Towards Hybrid-grained Feature Interaction Selection for Deep Sparse Network

  • Fuyuan Lyu
  • Xing Tang
  • Dugang Liu
  • Chen Ma
  • Weihong Luo
  • Liang Chen
  • Xiuqiang He
  • Xue (Steve) Liu

Deep sparse networks are widely investigated as a neural network architecture for prediction tasks with high-dimensional sparse features, in which feature interaction selection is a critical component. While previous methods primarily focus on how to search feature interactions in a coarse-grained space, less attention has been given to a finer granularity. In this work, we introduce a hybrid-grained feature interaction selection approach that targets both feature field and feature value for deep sparse networks. To explore such an expansive space, we propose a decomposed space which is calculated on the fly. We then develop a selection algorithm called OptFeature, which efficiently selects feature interactions from both the feature field and the feature value simultaneously. Results from experiments on three large real-world benchmark datasets demonstrate that OptFeature performs well in terms of accuracy and efficiency. Additional studies support the feasibility of our method. All source code is publicly available at https://anonymous.4open.science/r/OptFeature-Anonymous.

AAAI Conference 2022 Conference Paper

Regularization Penalty Optimization for Addressing Data Quality Variance in OoD Algorithms

  • Runpeng Yu
  • Hong Zhu
  • Kaican Li
  • Lanqing Hong
  • Rui Zhang
  • Nanyang Ye
  • Shao-Lun Huang
  • Xiuqiang He

Due to the poor generalization performance of traditional empirical risk minimization (ERM) in the case of distributional shift, Out-of-Distribution (OoD) generalization algorithms receive increasing attention. However, OoD generalization algorithms overlook the great variance in the quality of training data, which significantly compromises the accuracy of these methods. In this paper, we theoretically reveal the relationship between training data quality and algorithm performance and analyze the optimal regularization scheme for Lipschitz regularized invariant risk minimization. A novel algorithm is proposed based on the theoretical results to alleviate the influence of low-quality data at both the sample level and the domain level. The experiments on both the regression and classification benchmarks validate the effectiveness of our method with statistical significance.

IJCAI Conference 2021 Conference Paper

Deep Learning for Click-Through Rate Estimation

  • Weinan Zhang
  • Jiarui Qin
  • Wei Guo
  • Ruiming Tang
  • Xiuqiang He

Click-through rate (CTR) estimation serves as a core function module in various personalized online services, including online advertising, recommender systems, and web search. Since 2015, the success of deep learning has benefited CTR estimation performance, and deep CTR models are now widely applied on many industrial platforms. In this survey, we provide a comprehensive review of deep learning models for CTR estimation tasks. First, we review the transition from shallow to deep CTR models and explain why going deep is a necessary trend of development. Second, we concentrate on explicit feature interaction learning modules of deep CTR models. Then, as an important perspective on large platforms with abundant user histories, deep behavior models are discussed. Moreover, the recently emerged automated methods for deep CTR architecture design are presented. Finally, we summarize the survey and discuss the future prospects of this field.

AAAI Conference 2021 Conference Paper

Graph Heterogeneous Multi-Relational Recommendation

  • Chong Chen
  • Weizhi Ma
  • Min Zhang
  • Zhaowei Wang
  • Xiuqiang He
  • Chenyang Wang
  • Yiqun Liu
  • Shaoping Ma

Traditional studies on recommender systems usually leverage only one type of user behavior (the optimization target, such as purchase), despite the fact that users also generate a large amount of interaction data of various other types (e.g., view, click, add-to-cart). Generally, these heterogeneous multi-relational data provide well-structured information and can be used for high-quality recommendation. Early efforts to leverage these heterogeneous data fail to capture the high-hop structure of user-item interactions, and are therefore unable to make full use of them, achieving only constrained recommendation performance. In this work, we propose a new multi-relational recommendation model named Graph Heterogeneous Collaborative Filtering (GHCF). To explore the high-hop heterogeneous user-item interactions, we take advantage of Graph Convolutional Networks (GCN) and further improve them to jointly embed representations of both nodes (users and items) and relations for multi-relational prediction. Moreover, to fully utilize the whole heterogeneous data, we perform advanced efficient non-sampling optimization under a multi-task learning framework. Experimental results on two public benchmarks show that GHCF significantly outperforms the state-of-the-art recommendation methods, especially for cold-start users who have few primary item interactions. Further analysis verifies the importance of the proposed embedding propagation for modelling high-hop heterogeneous user-item interactions, showing the rationality and effectiveness of GHCF. Our implementation has been released (https://github.com/chenchongthu/GHCF).

AAAI Conference 2021 Short Paper

Modeling High-order Interactions across Multi-interests for Micro-video Recommendation (Student Abstract)

  • Dong Yao
  • Shengyu Zhang
  • Zhou Zhao
  • Wenyan Fan
  • Jieming Zhu
  • Xiuqiang He
  • Fei Wu

Personalized recommendation systems have become pervasive on various video platforms. Many effective methods have been proposed, but most of them do not capture users' multi-level interest traits or the dependencies between their viewed micro-videos well. To solve these problems, we propose a Self-over-Co Attention module to enhance the user's interest representation. In particular, we first use co-attention to model correlation patterns across different levels and then use self-attention to model correlation patterns within a specific level. Experimental results on filtered public datasets verify that our presented module is useful.

IJCAI Conference 2021 Conference Paper

UNBERT: User-News Matching BERT for News Recommendation

  • Qi Zhang
  • Jingjie Li
  • Qinglin Jia
  • Chuyuan Wang
  • Jieming Zhu
  • Zhaowei Wang
  • Xiuqiang He

Nowadays, news recommendation has become a popular channel for users to access news of their interests. How to represent rich textual contents of news and precisely match users' interests with candidate news lies at the core of news recommendation. However, existing recommendation methods merely learn textual representations from in-domain news data, which limits their ability to generalize to new news articles, which are common in cold-start scenarios. Meanwhile, many of these methods represent each user by aggregating the historically browsed news into a single vector and then compute the matching score with the candidate news vector, which may lose low-level matching signals. In this paper, we explore the use of the successful BERT pre-training technique from NLP for news recommendation and propose a BERT-based user-news matching model, called UNBERT. In contrast to existing research, our UNBERT model not only leverages the pre-trained model with rich language knowledge to enhance textual representation, but also captures multi-grained user-news matching signals at both the word level and the news level. Extensive experiments on the Microsoft News Dataset (MIND) demonstrate that our approach consistently outperforms the state-of-the-art methods.

NeurIPS Conference 2020 Conference Paper

Counterfactual Contrastive Learning for Weakly-Supervised Vision-Language Grounding

  • Zhu Zhang
  • Zhou Zhao
  • Zhijie Lin
  • Jieming Zhu
  • Xiuqiang He

Weakly-supervised vision-language grounding aims to localize a target moment in a video or a specific region in an image according to the given sentence query, where only video-level or image-level sentence annotations are provided during training. Most existing approaches employ the MIL-based or reconstruction-based paradigms for the WSVLG task, but the former heavily depends on the quality of randomly-selected negative samples and the latter cannot directly optimize the visual-textual alignment score. In this paper, we propose a novel Counterfactual Contrastive Learning (CCL) to develop sufficient contrastive training between counterfactual positive and negative results, which are based on robust and destructive counterfactual transformations. Concretely, we design three counterfactual transformation strategies from the feature-, interaction- and relation-level, where the feature-level method damages the visual features of selected proposals, interaction-level approach confuses the vision-language interaction and relation-level strategy destroys the context clues in proposal relationships. Extensive experiments on five vision-language grounding datasets verify the effectiveness of our CCL paradigm.

AAAI Conference 2020 Conference Paper

Less Is Better: Unweighted Data Subsampling via Influence Function

  • Zifeng Wang
  • Hong Zhu
  • Zhenhua Dong
  • Xiuqiang He
  • Shao-Lun Huang

In the era of Big Data, training complex models on large-scale datasets is challenging, making it appealing to reduce data volume via subsampling to save computation resources. Most previous subsampling works are weighted methods designed to help the performance of the subset-model approach that of the full-set-model; hence, weighted methods have no chance of acquiring a subset-model that is better than the full-set-model. This raises the question: how can we achieve a better model with less data? In this work, we propose a novel Unweighted Influence Data Subsampling (UIDS) method, and prove that the subset-model acquired through our method can outperform the full-set-model. Besides, we show that overconfidence in a given test set during sampling is common in influence-based subsampling methods, which can eventually cause the subset-model to fail in out-of-sample tests. To mitigate this, we develop a probabilistic sampling scheme to control the worst-case risk over all distributions close to the empirical distribution. The experimental results demonstrate our methods' superiority over existing subsampling methods in diverse tasks, such as text classification, image classification, click-through prediction, etc.
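The influence-function machinery underlying such subsampling can be sketched on L2-regularized logistic regression, using the standard formula $\mathcal{I}(z) = -\nabla L(z_{\text{val}})^\top H^{-1} \nabla L(z)$ for the effect of up-weighting a training point. This is a generic sketch of influence computation, not the UIDS weighting scheme itself; using the training set as the validation set is purely a toy simplification:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
y = (X @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.normal(size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit L2-regularized logistic regression with plain gradient descent.
lam, w = 0.1, np.zeros(d)
for _ in range(500):
    w -= 0.5 * (X.T @ (sigmoid(X @ w) - y) / n + lam * w)

p = sigmoid(X @ w)
H = (X * (p * (1 - p))[:, None]).T @ X / n + lam * np.eye(d)  # loss Hessian at w
g_val = X.T @ (p - y) / n + lam * w        # validation gradient (toy: = training set)
grads = X * (p - y)[:, None]               # per-sample gradients (without regularizer)

# I(i) = -g_val^T H^{-1} grad_i: positive values mark points whose removal
# is predicted (to first order) to lower the validation loss.
influence = -grads @ np.linalg.solve(H, g_val)
drop_candidates = np.flatnonzero(influence > 0)
```

Subsampling then amounts to preferentially dropping (or down-weighting the sampling probability of) the points with the largest positive influence.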

AAAI Conference 2020 Conference Paper

Leveraging Title-Abstract Attentive Semantics for Paper Recommendation

  • Guibing Guo
  • Bowei Chen
  • Xiaoyan Zhang
  • Zhirong Liu
  • Zhenhua Dong
  • Xiuqiang He

Paper recommendation is a research topic aiming to provide users with personalized papers of interest. However, most existing approaches treat title and abstract equally as input when learning the representation of a paper, ignoring their semantic relationship. In this paper, we regard the abstract as a sequence of sentences and propose a two-level attentive neural network to capture: (1) the extent to which each word within a sentence is semantically close to the words of the title; and (2) the importance of each sentence in the abstract relative to the title, which is often a good summarization of the abstract. Specifically, we propose a Long Short-Term Memory (LSTM) network with attention to learn sentence representations, and integrate a Gated Recurrent Unit (GRU) network with a memory network to learn the long-term sequential sentence patterns of interacted papers for both user and item (paper) modeling. We conduct extensive experiments on two real datasets, and show that our approach outperforms other state-of-the-art approaches in terms of accuracy.

IJCAI Conference 2017 Conference Paper

DeepFM: A Factorization-Machine based Neural Network for CTR Prediction

  • Huifeng Guo
  • Ruiming Tang
  • Yunming Ye
  • Zhenguo Li
  • Xiuqiang He

Learning sophisticated feature interactions behind user behaviors is critical in maximizing CTR for recommender systems. Despite great progress, existing methods seem to have a strong bias towards low- or high-order interactions, or require expert feature engineering. In this paper, we show that it is possible to derive an end-to-end learning model that emphasizes both low- and high-order feature interactions. The proposed model, DeepFM, combines the power of factorization machines for recommendation and deep learning for feature learning in a new neural network architecture. Compared to the latest Wide & Deep model from Google, DeepFM has a shared input to its "wide" and "deep" parts, with no need for feature engineering besides raw features. Comprehensive experiments are conducted to demonstrate the effectiveness and efficiency of DeepFM over the existing models for CTR prediction, on both benchmark data and commercial data.
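The shared-input idea can be illustrated with a minimal sketch: a single embedding table feeds both the FM part (second-order interactions via the sum-square trick) and the deep MLP part. All sizes, the single hidden layer, and the field layout below are toy assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_fields, vocab, k = 3, 10, 4                 # toy: fields, feature ids, embedding dim

emb = rng.normal(scale=0.1, size=(vocab, k))  # shared embedding table
w = rng.normal(scale=0.1, size=vocab)         # first-order FM weights
W1 = rng.normal(scale=0.1, size=(8, n_fields * k))  # deep part: one hidden layer
w2 = rng.normal(scale=0.1, size=8)

def deepfm_logit(feat_ids):
    e = emb[feat_ids]                          # (n_fields, k): same embeddings for both parts
    # FM part: first-order terms + pairwise interactions via 0.5*((sum e)^2 - sum e^2)
    fm = w[feat_ids].sum() + 0.5 * ((e.sum(0) ** 2 - (e ** 2).sum(0)).sum())
    # Deep part: MLP over the concatenated field embeddings
    h = np.maximum(W1 @ e.reshape(-1), 0.0)
    return fm + w2 @ h

p = 1.0 / (1.0 + np.exp(-deepfm_logit(np.array([1, 4, 7]))))  # predicted CTR
```

Because the embeddings are shared, the pairwise FM interactions act directly on the same raw-feature representation the MLP refines, which is what removes the need for hand-crafted cross features.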

AAAI Conference 2015 Conference Paper

Personalized Tag Recommendation through Nonlinear Tensor Factorization Using Gaussian Kernel

  • Xiaomin Fang
  • Rong Pan
  • Guoxiang Cao
  • Xiuqiang He
  • Wenyuan Dai

Personalized tag recommendation systems recommend a list of tags to a user when they are about to annotate an item, exploiting both individual preferences and the characteristics of the items. Tensor factorization techniques have been applied to many applications, such as tag recommendation. Models based on Tucker Decomposition can achieve good performance but require a lot of computation power. On the other hand, models based on Canonical Decomposition can run in linear time and are more feasible for online recommendation. In this paper, we propose a novel method for personalized tag recommendation, which can be considered a nonlinear extension of Canonical Decomposition. Different from linear tensor factorization, we exploit the Gaussian radial basis function to increase the model's capacity. The experimental results show that our proposed method outperforms the state-of-the-art methods for tag recommendation on real datasets and performs well even with a small number of features, which verifies that our models can make better use of features.
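The contrast between linear Canonical Decomposition scoring and a Gaussian-kernel variant can be sketched as follows. The abstract does not specify the exact kernelized form, so applying the RBF between the user-item interaction vector and the tag factor is an assumption for illustration, as are the toy dimensions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_tags, k = 5, 6, 7, 3   # toy sizes (assumptions)
U = rng.normal(size=(n_users, k))          # user factors
V = rng.normal(size=(n_items, k))          # item factors
T = rng.normal(size=(n_tags, k))           # tag factors
gamma = 0.5                                # RBF bandwidth (assumption)

def cd_score(u, i, t):
    # Linear Canonical Decomposition: elementwise product summed over factors
    return np.sum(U[u] * V[i] * T[t])

def rbf_score(u, i, t):
    # Hypothetical nonlinear variant: Gaussian RBF similarity between the
    # user-item interaction vector and the tag factor
    z = U[u] * V[i]
    return np.exp(-gamma * np.sum((z - T[t]) ** 2))

scores = [rbf_score(0, 1, t) for t in range(n_tags)]
best_tag = int(np.argmax(scores))          # tag recommended to user 0 for item 1
```

Like Canonical Decomposition, this scoring remains linear in the number of factors $k$ per triple, which is what keeps the nonlinear variant feasible for online recommendation.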