Arrow Research search

Author name cluster

Chuhan Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers
2 author rows

Possible papers

11

NeurIPS Conference 2025 Conference Paper

P-Law: Predicting Quantitative Scaling Law with Entropy Guidance in Large Recommendation Models

  • Tingjia Shen
  • Hao Wang
  • Chuhan Wu
  • Jin Yao Chin
  • Wei Guo
  • Yong Liu
  • Huifeng Guo
  • Defu Lian

With the growing size of data and models in Large Recommendation Models, the time required for debugging has become increasingly prohibitive, underscoring the urgent need for effective guidance in parameter configuration. The Scaling Law (SL) offers analogous guidance in the sequential language domain, where it has achieved significant success by predicting model loss as model size scales. However, the existing guidance that SL provides for Sequential Recommendation (SR) remains qualitative, because quantitative analysis of SL on SR faces two challenges: measuring the quality of redundant sequences, and the loss-performance discrepancy. In response, we introduce the Performance Law (P-Law) for SR models, which predicts model performance across various settings and aims to provide a quantitative framework for guiding the parameter optimization of future models. First, the Performance Law uses Real Entropy to measure data quality, removing the low-quality influence of low-entropy redundant sequences. Second, it introduces a fitted decay term that predicts overfitting, the major loss-performance discrepancy phenomenon, ultimately achieving quantitative performance prediction. Extensive experiments on various datasets demonstrate the effectiveness of the Performance Law through its exceptional quantitative prediction ability compared with the original and modified qualitative SL. Additional experiments on optimal parameter prediction and model expansion potential prediction further demonstrate its broad applicability.
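The abstract does not spell out how Real Entropy is computed. As a hedged sketch, the empirical Shannon entropy of an interaction sequence captures the stated intuition that low-entropy, redundant sequences carry less usable signal; the paper's actual Real Entropy estimator (e.g., a compression-based one) may differ:

```python
import math
from collections import Counter

def sequence_entropy(items):
    """Empirical Shannon entropy (in bits) of an interaction sequence.

    Highly repetitive sequences score low, diverse sequences score high,
    which is the intuition behind using entropy to flag redundant data.
    """
    counts = Counter(items)
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A redundant sequence scores lower than a diverse one of equal length.
redundant = ["item_a"] * 7 + ["item_b"]
diverse = ["item_a", "item_b", "item_c", "item_d",
           "item_e", "item_f", "item_g", "item_h"]
print(sequence_entropy(redundant) < sequence_entropy(diverse))  # True
```

A data-quality filter in this spirit would discard or down-weight sequences whose entropy falls below a threshold before fitting the performance curve.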

ICLR Conference 2025 Conference Paper

RevisEval: Improving LLM-as-a-Judge via Response-Adapted References

  • Qiyuan Zhang 0001
  • Yufei Wang 0005
  • Tiezheng Yu
  • Yuxin Jiang
  • Chuhan Wu
  • Liangyou Li
  • Yasheng Wang
  • Xin Jiang 0002

With significant efforts in recent studies, LLM-as-a-Judge has become a cost-effective alternative to human evaluation for assessing text generation quality in a wide range of tasks. However, a reliability gap still remains between LLM-as-a-Judge and human evaluation. One important reason is the lack of guided oracles in the evaluation process. Motivated by the role of references pervasively used in classic text evaluation, we introduce RevisEval, a novel text generation evaluation paradigm via response-adapted references. RevisEval is driven by the key observation that an ideal reference should maintain the necessary relevance to the response to be evaluated. Specifically, RevisEval leverages the text revision capabilities of large language models (LLMs) to adaptively revise the response, and then treats the revised text as the reference (response-adapted reference) for the subsequent evaluation. Extensive experiments demonstrate that RevisEval outperforms traditional reference-free and reference-based evaluation paradigms that use LLM-as-a-Judge across NLG tasks and open-ended instruction-following tasks. More importantly, our response-adapted references can further boost classical text metrics, e.g., BLEU and BERTScore, compared to traditional references, and even rival LLM-as-a-Judge. A detailed analysis is also conducted to confirm RevisEval's effectiveness in bias reduction, the impact of inference cost, and reference relevance.

ICLR Conference 2025 Conference Paper

ToolACE: Winning the Points of LLM Function Calling

  • Weiwen Liu
  • Xu Huang 0008
  • Xingshan Zeng
  • Xinlong Hao
  • Shuai Yu
  • Dexun Li
  • Shuai Wang 0020
  • Weinan Gan

Function calling significantly extends the application boundary of large language models (LLMs), where high-quality and diverse training data is critical for unlocking this capability. However, collecting and annotating real function-calling data is challenging, while synthetic data from existing pipelines often lacks coverage and accuracy. In this paper, we present ToolACE, an automatic agentic pipeline designed to generate accurate, complex, and diverse tool-learning data, specifically tailored to the capabilities of LLMs. ToolACE leverages a novel self-evolution synthesis process to curate a comprehensive API pool of 26,507 diverse APIs. Dialogs are further generated through the interplay among multiple agents, under the guidance of a complexity evaluator. To ensure data accuracy, we implement a dual-layer verification system combining rule-based and model-based checks. We demonstrate that models trained on our synthesized data---even with only 8B parameters---achieve state-of-the-art performance, comparable to the latest GPT-4 models. Our model and a subset of the data are publicly available at https://huggingface.co/Team-ACE.

NeurIPS Conference 2022 Conference Paper

FairVFL: A Fair Vertical Federated Learning Framework with Contrastive Adversarial Learning

  • Tao Qi
  • Fangzhao Wu
  • Chuhan Wu
  • Lingjuan Lyu
  • Tong Xu
  • Hao Liao
  • Zhongliang Yang
  • Yongfeng Huang

Vertical federated learning (VFL) is a machine learning paradigm that can learn models from features distributed on different platforms in a privacy-preserving way. Since in real-world applications the data may contain bias on fairness-sensitive features (e.g., gender), VFL models may inherit bias from training data and become unfair to some user groups. However, existing fair machine learning methods usually rely on centralized storage of fairness-sensitive features to achieve model fairness, which is usually inapplicable in federated scenarios. In this paper, we propose a fair vertical federated learning framework (FairVFL), which can improve the fairness of VFL models. The core idea of FairVFL is to learn unified and fair representations of samples based on decentralized feature fields in a privacy-preserving way. Specifically, each platform with fairness-insensitive features first learns local data representations from local features. Then, these local representations are uploaded to a server and aggregated into a unified representation for the target task. In order to learn a fair unified representation, we send it to each platform storing fairness-sensitive features and apply adversarial learning to remove the bias inherited from the biased data. Moreover, to protect user privacy, we further propose a contrastive adversarial learning method to remove private information from the unified representation on the server before sending it to the platforms keeping fairness-sensitive features. Experiments on three real-world datasets validate that our method can effectively improve model fairness while keeping user privacy well protected.
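A minimal sketch of the VFL representation flow described above, with hypothetical linear local encoders; the adversarial debiasing and contrastive steps are omitted since they require trainable adversaries:

```python
def local_representation(features, weights):
    """Each platform encodes its private feature field locally
    (a linear encoder here, purely for illustration)."""
    return [sum(f * w for f, w in zip(features, row)) for row in weights]

def unify(local_reps):
    """The server concatenates the uploaded local representations into a
    unified sample representation; FairVFL would then debias this vector
    with adversarial learning before fairness-aware training."""
    unified = []
    for rep in local_reps:
        unified.extend(rep)
    return unified

# Two platforms, each holding a different feature field of the same user.
rep_a = local_representation([1.0, 2.0], [[0.5, 0.5], [1.0, 0.0]])
rep_b = local_representation([3.0], [[2.0]])
print(unify([rep_a, rep_b]))  # [1.5, 1.0, 6.0]
```

Note that only the learned representations leave each platform; the raw feature fields stay local, which is the core privacy property of the vertical setting.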

TIST Journal 2022 Journal Article

FedCTR: Federated Native Ad CTR Prediction with Cross-platform User Behavior Data

  • Chuhan Wu
  • Fangzhao Wu
  • Lingjuan Lyu
  • Yongfeng Huang
  • Xing Xie

Native ads are a popular type of online advertisement that takes similar forms to the native content displayed on websites. Native ad click-through rate (CTR) prediction is useful for improving user experience and platform revenue. However, it is challenging due to the lack of explicit user intent, and user behaviors on the platform with native ads may be insufficient to infer users' interest in ads. Fortunately, user behaviors on many other online platforms can provide complementary information for user-interest mining. Thus, leveraging multi-platform user behaviors is useful for native ad CTR prediction. However, user behaviors are highly privacy-sensitive, and the behavior data on different platforms cannot be directly aggregated due to user privacy concerns and data protection regulations. Existing CTR prediction methods usually require centralized storage of user behavior data for user modeling, and therefore cannot be directly applied to CTR prediction with multi-platform user behaviors. In this article, we propose a federated native ad CTR prediction method named FedCTR, which can learn user-interest representations from cross-platform user behaviors in a privacy-preserving way. On each platform, a local user model learns user embeddings from the local user behaviors on that platform. The local user embeddings from different platforms are uploaded to a server for aggregation, and the aggregated embeddings are sent to the ad platform for CTR prediction. Besides, we apply local differential privacy and differential privacy to the local and aggregated user embeddings, respectively, for better privacy protection. Moreover, we propose a federated framework for collaborative model training with distributed models and user behaviors. Extensive experiments on a real-world dataset show that FedCTR can effectively leverage multi-platform user behaviors for native ad CTR prediction in a privacy-preserving manner.
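The local differential privacy step can be illustrated with the standard Laplace mechanism. The function name `ldp_perturb`, the `epsilon`/`sensitivity` parameters, and the absence of clipping are assumptions for this sketch, not FedCTR's exact mechanism:

```python
import math
import random

def ldp_perturb(embedding, epsilon, sensitivity=1.0, rng=None):
    """Laplace-mechanism perturbation of a user embedding.

    A standard local differential privacy primitive: each coordinate is
    perturbed with Laplace noise of scale sensitivity/epsilon before the
    embedding leaves the platform.
    """
    rng = rng or random.Random()
    scale = sensitivity / epsilon

    def laplace():
        # Inverse-CDF sampling of Laplace(0, scale) from a uniform draw.
        u = rng.random() - 0.5
        sign = 1.0 if u >= 0 else -1.0
        return -scale * sign * math.log(1.0 - 2.0 * abs(u))

    return [x + laplace() for x in embedding]

noisy = ldp_perturb([0.1, -0.2, 0.3], epsilon=1.0, rng=random.Random(42))
```

Smaller `epsilon` means larger noise scale and stronger privacy; the server only ever sees the perturbed embedding.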

IJCAI Conference 2022 Conference Paper

Rethinking InfoNCE: How Many Negative Samples Do You Need?

  • Chuhan Wu
  • Fangzhao Wu
  • Yongfeng Huang

InfoNCE is a widely used contrastive training loss. It aims to estimate the mutual information between a pair of variables by discriminating between each positive pair and its associated K negative pairs. It has been proved that when the sample labels are clean, the lower bound of the mutual information estimate is tighter when more negative samples are incorporated, which usually yields better model performance. However, in practice labels often contain noise, and incorporating too many noisy negative samples into model training may be suboptimal. In this paper, we study how many negative samples are optimal for InfoNCE in different scenarios via a semi-quantitative theoretical framework. More specifically, we first propose a probabilistic model to analyze the influence of the negative sampling ratio K on training sample informativeness. Then, we design a training effectiveness function to measure the overall influence of training samples based on their informativeness. We estimate the optimal negative sampling ratio as the K value that maximizes the training effectiveness function. Based on our framework, we further propose an adaptive negative sampling method that can dynamically adjust the negative sampling ratio to improve InfoNCE-based model training. Extensive experiments on three different tasks show that our framework can accurately predict the optimal negative sampling ratio, and that various models can benefit from our adaptive negative sampling method.
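The InfoNCE loss the abstract analyzes, for one positive pair and K negatives, can be sketched as a plain log-softmax over similarity scores; the `temperature` parameter is a common addition, not something the abstract specifies:

```python
import math

def info_nce_loss(pos_score, neg_scores, temperature=1.0):
    """InfoNCE loss for one positive pair and K negative pairs.

    The loss discriminates the positive pair against the K negatives;
    with clean labels, larger K tightens the mutual information lower
    bound, which is the regime the paper's analysis starts from.
    """
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)
```

When all K + 1 scores are equal, the loss is log(K + 1), so more negatives raise the loss ceiling even as they tighten the bound; the paper's point is that with noisy labels, pushing K higher eventually hurts.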

AAAI Conference 2021 Conference Paper

Fairness-aware News Recommendation with Decomposed Adversarial Learning

  • Chuhan Wu
  • Fangzhao Wu
  • Xiting Wang
  • Yongfeng Huang
  • Xing Xie

News recommendation is important for online news services. Existing news recommendation models are usually learned from users' news click behaviors. Since the behaviors of users with the same sensitive attributes (e.g., gender) often have similar patterns, news recommendation models can easily capture these patterns. This may introduce biases related to sensitive user attributes into the recommendation results, e.g., always recommending sports news to male users, which is unfair since users may not receive diverse news information. In this paper, we propose a fairness-aware news recommendation approach with decomposed adversarial learning and orthogonality regularization, which can alleviate the unfairness in news recommendation brought by the biases of sensitive user attributes. In our approach, we decompose the user interest model into two components. One component aims to learn a bias-aware user embedding that captures the bias information on sensitive user attributes, and the other aims to learn a bias-free user embedding that only encodes attribute-independent user interest information for fairness-aware news recommendation. In addition, we apply an attribute prediction task to the bias-aware user embedding to enhance its ability in bias modeling, and we apply adversarial learning to the bias-free user embedding to remove the bias information from it. Moreover, we propose an orthogonality regularization method that encourages the bias-free user embedding to be orthogonal to the bias-aware one, so that the two are better separated. For fairness-aware news ranking, we only use the bias-free user embedding. Extensive experiments on a benchmark dataset show that our approach can effectively improve fairness in news recommendation with minor performance loss.
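The orthogonality regularization can be illustrated with a squared-cosine penalty between the two user embeddings; this concrete form is an assumption, since the abstract does not give the exact regularizer:

```python
import math

def orthogonality_penalty(bias_free, bias_aware):
    """Squared cosine similarity between the two user embeddings.

    The penalty is 0 when the embeddings are orthogonal and 1 when they
    are parallel, so minimizing it pushes the bias-free embedding away
    from the directions the bias-aware embedding occupies.
    """
    dot = sum(a * b for a, b in zip(bias_free, bias_aware))
    norm_free = math.sqrt(sum(a * a for a in bias_free))
    norm_aware = math.sqrt(sum(b * b for b in bias_aware))
    return (dot / (norm_free * norm_aware)) ** 2
```

In training, this term would be added to the recommendation loss alongside the attribute-prediction and adversarial objectives, weighted by a coefficient.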

IJCAI Conference 2021 Conference Paper

User-as-Graph: User Modeling with Heterogeneous Graph Pooling for News Recommendation

  • Chuhan Wu
  • Fangzhao Wu
  • Yongfeng Huang
  • Xing Xie

Accurate user modeling is critical for news recommendation. Existing news recommendation methods usually model users' interest from their behaviors via sequential or attentive models. However, they cannot model the rich relatedness between user behaviors, which can provide useful context for user interest modeling. In this paper, we propose a novel user modeling approach for news recommendation, which models each user as a personalized heterogeneous graph built from user behaviors to better capture the fine-grained behavior relatedness. In addition, in order to learn user interest embeddings from the personalized heterogeneous graph, we propose a novel heterogeneous graph pooling method, which can summarize both node features and graph topology and is aware of the varied characteristics of different types of nodes. Experiments on a large-scale benchmark dataset show that the proposed methods can effectively improve the performance of user modeling for news recommendation.

IJCAI Conference 2020 Conference Paper

User Modeling with Click Preference and Reading Satisfaction for News Recommendation

  • Chuhan Wu
  • Fangzhao Wu
  • Tao Qi
  • Yongfeng Huang

Modeling user interest is critical for accurate news recommendation. Existing news recommendation methods usually infer user interest from click behaviors on news. However, users may click a news article because they are attracted by its title shown on the news website homepage, yet may not be satisfied with its content after reading; in many cases, users close the news page quickly after clicking. In this paper, we propose to model user interest from both click behaviors on news titles and reading behaviors on news content for news recommendation. More specifically, we propose a personalized reading speed metric to measure users' satisfaction with news content. We learn user embeddings from the news content they have read and their satisfaction with these articles to model their interest in news content. In addition, we learn another user embedding from the news titles they have clicked to model their preference for news titles. We combine both kinds of user embeddings into a unified user representation for news recommendation. We train the user representation model with two supervised learning tasks built from user behaviors, i.e., news-title-based click prediction and news-content-based satisfaction prediction, to encourage our model to recommend news articles that are not only likely to be clicked but also have content that satisfies the user. Experiments on a real-world dataset show that our method can effectively boost the performance of user modeling for news recommendation.
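As a hedged illustration of a personalized reading speed metric, one could compare dwell time per word on an article against the user's own average pace; the function name and normalization below are assumptions, not the paper's definition:

```python
def satisfaction_score(article_words, dwell_seconds, user_avg_sec_per_word):
    """Dwell time per word on this article, relative to the user's norm.

    Values well below 1 indicate the page was closed much faster than
    the user's usual reading pace, a signal of dissatisfaction after
    the click; values around or above 1 suggest engaged reading.
    """
    sec_per_word = dwell_seconds / article_words
    return sec_per_word / user_avg_sec_per_word

# A 500-word article closed after 10 seconds, for a user who normally
# spends 0.25 s/word, yields a score far below 1 (likely dissatisfied).
print(satisfaction_score(500, 10, 0.25))  # 0.08
```

Personalizing by each user's own average speed matters because absolute reading speeds vary widely between users.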

AAAI Conference 2019 Conference Paper

Incorporating Semantic Similarity with Geographic Correlation for Query-POI Relevance Learning

  • Ji Zhao
  • Dan Peng
  • Chuhan Wu
  • Huan Chen
  • Meiyu Yu
  • Wanji Zheng
  • Li Ma
  • Hua Chai

Point-of-interest (POI) retrieval that searches for relevant destination locations plays a significant role in on-demand ride-hailing services. Existing solutions to POI retrieval mainly retrieve and rank POIs based on their semantic similarity scores. Although intuitive, quantifying the relevance of a Query-POI pair by single-field semantic similarity is subject to inherent limitations. In this paper, we propose a novel Query-POI relevance model for effective POI retrieval in on-demand ride-hailing services. Different from existing relevance models, we capture and represent multi-field and local-and-global semantic features of a Query-POI pair to measure semantic similarity. Besides, we observe a hidden correlation between origin-destination locations in ride-hailing scenarios, and propose two location embeddings to characterize this correlation. By incorporating the geographic correlation with the semantic similarity, our model achieves better performance in POI ranking. Experimental results on two real-world click-through datasets demonstrate the improvements of our model over state-of-the-art methods.

IJCAI Conference 2019 Conference Paper

Neural News Recommendation with Attentive Multi-View Learning

  • Chuhan Wu
  • Fangzhao Wu
  • Mingxiao An
  • Jianqiang Huang
  • Yongfeng Huang
  • Xing Xie

Personalized news recommendation is very important for online news platforms to help users find news they are interested in and to improve user experience. News and user representation learning is critical for news recommendation. Existing news recommendation methods usually learn these representations from a single kind of news information, e.g., the title, which may be insufficient. In this paper, we propose a neural news recommendation approach that can learn informative representations of users and news by exploiting different kinds of news information. The core of our approach is a news encoder and a user encoder. In the news encoder, we propose an attentive multi-view learning model to learn unified news representations from titles, bodies, and topic categories by regarding them as different views of news. In addition, we apply both word-level and view-level attention mechanisms in the news encoder to select important words and views for learning informative news representations. In the user encoder, we learn representations of users based on their browsed news and apply an attention mechanism to select informative news for user representation learning. Extensive experiments on a real-world dataset show that our approach can effectively improve the performance of news recommendation.
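Attention pooling of the kind used at the word and view levels can be sketched as a softmax-weighted average; the dot-product scoring below is a simplification of the additive attention such encoders typically use:

```python
import math

def attention_pool(vectors, query):
    """Attention pooling: score each vector against a query vector,
    softmax the scores, and return the weighted average.

    At the word level the vectors are word representations; at the view
    level they are the per-view news representations.
    """
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
```

With a zero query, every input gets equal weight and the result is a plain average; a learned query instead concentrates weight on the most informative words or views.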