Author name cluster

Fangzhao Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

17 papers

2 author rows

ICML Conference 2023 Conference Paper

Byzantine-Robust Learning on Heterogeneous Data via Gradient Splitting

Yuchen Liu
Chen Chen 0043
Lingjuan Lyu
Fangzhao Wu
Sai Wu
Gang Chen 0001

Federated learning has exhibited vulnerabilities to Byzantine attacks, where the Byzantine attackers can send arbitrary gradients to a central server to destroy the convergence and performance of the global model. A wealth of robust AGgregation Rules (AGRs) have been proposed to defend against Byzantine attacks. However, Byzantine clients can still circumvent robust AGRs when data is non-Identically and Independently Distributed (non-IID). In this paper, we first reveal the root causes of performance degradation of current robust AGRs in non-IID settings: the curse of dimensionality and gradient heterogeneity. In order to address this issue, we propose GAS, a GrAdient Splitting approach that can successfully adapt existing robust AGRs to non-IID settings. We also provide a detailed convergence analysis when the existing robust AGRs are combined with GAS. Experiments on various real-world datasets verify the efficacy of our proposed GAS. The implementation code is provided in https: //github. com/YuchenLiu-a/byzantine-gas.

Details

IJCAI Conference 2023 Conference Paper

FedSampling: A Better Sampling Strategy for Federated Learning

Tao Qi
Fangzhao Wu
Lingjuan Lyu
Yongfeng Huang
Xing Xie

Federated learning (FL) is an important technique for learning models from decentralized data in a privacy-preserving way. Existing FL methods usually uniformly sample clients for local model learning in each round. However, different clients may have significantly different data sizes, and the clients with more data cannot have more opportunities to contribute to model training, which may lead to inferior performance. In this paper, instead of client uniform sampling, we propose a novel data uniform sampling strategy for federated learning (FedSampling), which can effectively improve the performance of federated learning especially when client data size distribution is highly imbalanced across clients. In each federated learning round, local data on each client is randomly sampled for local model learning according to a probability based on the server desired sample size and the total sample size on all available clients. Since the data size on each client is privacy-sensitive, we propose a privacy-preserving way to estimate the total sample size with a differential privacy guarantee. Experiments on four benchmark datasets show that FedSampling can effectively improve the performance of federated learning.

PDF Details DOI

ICML Conference 2023 Conference Paper

Personalized Federated Learning with Inferred Collaboration Graphs

Rui Ye 0001
Zhenyang Ni
Fangzhao Wu
Siheng Chen
Yanfeng Wang 0001

Personalized federated learning (FL) aims to collaboratively train a personalized model for each client. Previous methods do not adaptively determine who to collaborate at a fine-grained level, making them difficult to handle diverse data heterogeneity levels and those cases where malicious clients exist. To address this issue, our core idea is to learn a collaboration graph, which models the benefits from each pairwise collaboration and allocates appropriate collaboration strengths. Based on this, we propose a novel personalized FL algorithm, pFedGraph, which consists of two key modules: (1) inferring the collaboration graph based on pairwise model similarity and dataset size at server to promote fine-grained collaboration and (2) optimizing local model with the assistance of aggregated model at client to promote personalization. The advantage of pFedGraph is flexibly adaptive to diverse data heterogeneity levels and model poisoning attacks, as the proposed collaboration graph always pushes each client to collaborate more with similar and beneficial clients. Extensive experiments show that pFedGraph consistently outperforms the other $14$ baseline methods across various heterogeneity levels and multiple cases where malicious clients exist. Code will be available at https: //github. com/MediaBrain-SJTU/pFedGraph.

Details

NeurIPS Conference 2022 Conference Paper

CATER: Intellectual Property Protection on Text Generation APIs via Conditional Watermarks

Xuanli He
Qiongkai Xu
Yi Zeng
Lingjuan Lyu
Fangzhao Wu
Jiwei Li
Ruoxi Jia

Previous works have validated that text generation APIs can be stolen through imitation attacks, causing IP violations. In order to protect the IP of text generation APIs, recent work has introduced a watermarking algorithm and utilized the null-hypothesis test as a post-hoc ownership verification on the imitation models. However, we find that it is possible to detect those watermarks via sufficient statistics of the frequencies of candidate watermarking words. To address this drawback, in this paper, we propose a novel Conditional wATERmarking framework (CATER) for protecting the IP of text generation APIs. An optimization method is proposed to decide the watermarking rules that can minimize the distortion of overall word distributions while maximizing the change of conditional word selections. Theoretically, we prove that it is infeasible for even the savviest attacker (they know how CATER works) to reveal the used watermarks from a large pool of potential word pairs based on statistical inspection. Empirically, we observe that high-order conditions lead to an exponential growth of suspicious (unused) watermarks, making our crafted watermarks more stealthy. In addition, CATER can effectively identify IP infringement under architectural mismatch and cross-domain imitation attacks, with negligible impairments on the generation quality of victim APIs. We envision our work as a milestone for stealthily protecting the IP of text generation APIs.

PDF Details

NeurIPS Conference 2022 Conference Paper

FairVFL: A Fair Vertical Federated Learning Framework with Contrastive Adversarial Learning

Tao Qi
Fangzhao Wu
Chuhan Wu
Lingjuan Lyu
Tong Xu
Hao Liao
Zhongliang Yang
Yongfeng Huang

Vertical federated learning (VFL) is a privacy-preserving machine learning paradigm that can learn models from features distributed on different platforms in a privacy-preserving way. Since in real-world applications the data may contain bias on fairness-sensitive features (e. g. , gender), VFL models may inherit bias from training data and become unfair for some user groups. However, existing fair machine learning methods usually rely on the centralized storage of fairness-sensitive features to achieve model fairness, which are usually inapplicable in federated scenarios. In this paper, we propose a fair vertical federated learning framework (FairVFL), which can improve the fairness of VFL models. The core idea of FairVFL is to learn unified and fair representations of samples based on the decentralized feature fields in a privacy-preserving way. Specifically, each platform with fairness-insensitive features first learns local data representations from local features. Then, these local representations are uploaded to a server and aggregated into a unified representation for the target task. In order to learn a fair unified representation, we send it to each platform storing fairness-sensitive features and apply adversarial learning to remove bias from the unified representation inherited from the biased data. Moreover, for protecting user privacy, we further propose a contrastive adversarial learning method to remove private information from the unified representation in server before sending it to the platforms keeping fairness-sensitive features. Experiments on three real-world datasets validate that our method can effectively improve model fairness with user privacy well-protected.

PDF Details

TIST Journal 2022 Journal Article

FedCTR: Federated Native Ad CTR Prediction with Cross-platform User Behavior Data

Chuhan Wu
Fangzhao Wu
Lingjuan Lyu
Yongfeng Huang
Xing Xie

Native ad is a popular type of online advertisement that has similar forms with the native content displayed on websites. Native ad click-through rate (CTR) prediction is useful for improving user experience and platform revenue. However, it is challenging due to the lack of explicit user intent, and user behaviors on the platform with native ads may be insufficient to infer users’ interest in ads. Fortunately, user behaviors exist on many online platforms that can provide complementary information for user-interest mining. Thus, leveraging multi-platform user behaviors is useful for native ad CTR prediction. However, user behaviors are highly privacy-sensitive, and the behavior data on different platforms cannot be directly aggregated due to user privacy concerns and data protection regulations. Existing CTR prediction methods usually require centralized storage of user behavior data for user modeling, which cannot be directly applied to the CTR prediction task with multi-platform user behaviors. In this article, we propose a federated native ad CTR prediction method named FedCTR, which can learn user-interest representations from cross-platform user behaviors in a privacy-preserving way. On each platform a local user model learns user embeddings from the local user behaviors on that platform. The local user embeddings from different platforms are uploaded to a server for aggregation, and the aggregated ones are sent to the ad platform for CTR prediction. Besides, we apply local differential privacy and differential privacy to the local and aggregated user embeddings, respectively, for better privacy protection. Moreover, we propose a federated framework for collaborative model training with distributed models and user behaviors. Extensive experiments on real-world dataset show that FedCTR can effectively leverage multi-platform user behaviors for native ad CTR prediction in a privacy-preserving manner.

Details DOI

AAAI Conference 2022 Conference Paper

Protecting Intellectual Property of Language Generation APIs with Lexical Watermark

Xuanli He
Qiongkai Xu
Lingjuan Lyu
Fangzhao Wu
Chenguang Wang

Nowadays, due to the breakthrough in natural language generation (NLG), including machine translation, document summarization, image captioning, etc. , NLG models have been encapsulated in cloud APIs to serve over half a billion people worldwide and process over one hundred billion word generations per day. Thus, NLG APIs have already become essential profitable services in many commercial companies. Due to the substantial financial and intellectual investments, service providers adopt a pay-as-you-use policy to promote sustainable market growth. However, recent works have shown that cloud platforms suffer from financial losses imposed by model extraction attacks, which aim to imitate the functionality and utility of the victim services, thus violating the intellectual property (IP) of cloud APIs. This work targets at protecting IP of NLG APIs by identifying the attackers who have utilized watermarked responses from the victim NLG APIs. However, most existing watermarking techniques are not directly amenable for IP protection of NLG APIs. To bridge this gap, we first present a novel watermarking method for text generation APIs by conducting lexical modification to the original outputs. Compared with the competitive baselines, our watermark approach achieves better identifiable performance in terms of p-value, with fewer semantic losses. In addition, our watermarks are more understandable and intuitive to humans than the baselines. Finally, the empirical studies show our approach is also applicable to queries from different domains, and is effective on the attacker trained on a mixture of the corpus which includes less than 10% watermarked samples.

PDF Details

IJCAI Conference 2022 Conference Paper

Rethinking InfoNCE: How Many Negative Samples Do You Need?

Chuhan Wu
Fangzhao Wu
Yongfeng Huang

InfoNCE is a widely used contrastive training loss. It aims to estimate the mutual information between a pair of variables by discriminating between each positive pair and its associated K negative pairs. It is proved that when the sample labels are clean, the lower bound of mutual information estimation is tighter when more negative samples are incorporated, which usually yields better model performance. However, in practice the labels often contain noise, and incorporating too many noisy negative samples into model training may be suboptimal. In this paper, we study how many negative samples are optimal for InfoNCE in different scenarios via a semi-quantitative theoretical framework. More specifically, we first propose a probabilistic model to analyze the influence of the negative sampling ratio K on training sample informativeness. Then, we design a training effectiveness function to measure the overall influence of training samples based on their informativeness. We estimate the optimal negative sampling ratio using the K value that maximizes the training effectiveness function. Based on our framework, we further propose an adaptive negative sampling method that can dynamically adjust the negative sampling ratio to improve InfoNCE-based model training. Extensive experiments in three different tasks show our framework can accurately predict the optimal negative sampling ratio, and various models can benefit from our adaptive negative sampling method.

PDF Details DOI

AAAI Conference 2021 Conference Paper

Fairness-aware News Recommendation with Decomposed Adversarial Learning

Chuhan Wu
Fangzhao Wu
Xiting Wang
Yongfeng Huang
Xing Xie

News recommendation is important for online news services. Existing news recommendation models are usually learned from users’ news click behaviors. Usually the behaviors of users with the same sensitive attributes (e. g. , genders) have similar patterns and news recommendation models can easily capture these patterns. It may lead to some biases related to sensitive user attributes in the recommendation results, e. g. , always recommending sports news to male users, which is unfair since users may not receive diverse news information. In this paper, we propose a fairness-aware news recommendation approach with decomposed adversarial learning and orthogonality regularization, which can alleviate unfairness in news recommendation brought by the biases of sensitive user attributes. In our approach, we propose to decompose the user interest model into two components. One component aims to learn a bias-aware user embedding that captures the bias information on sensitive user attributes, and the other aims to learn a bias-free user embedding that only encodes attribute-independent user interest information for fairnessaware news recommendation. In addition, we propose to apply an attribute prediction task to the bias-aware user embedding to enhance its ability on bias modeling, and we apply adversarial learning to the bias-free user embedding to remove the bias information from it. Moreover, we propose an orthogonality regularization method to encourage the bias-free user embeddings to be orthogonal to the bias-aware one to better distinguish the bias-free user embedding from the bias-aware one. For fairness-aware news ranking, we only use the biasfree user embedding. Extensive experiments on benchmark dataset show that our approach can effectively improve fairness in news recommendation with minor performance loss.

PDF Details

IJCAI Conference 2021 Conference Paper

User-as-Graph: User Modeling with Heterogeneous Graph Pooling for News Recommendation

Chuhan Wu
Fangzhao Wu
Yongfeng Huang
Xing Xie

Accurate user modeling is critical for news recommendation. Existing news recommendation methods usually model users' interest from their behaviors via sequential or attentive models. However, they cannot model the rich relatedness between user behaviors, which can provide useful contexts of these behaviors for user interest modeling. In this paper, we propose a novel user modeling approach for news recommendation, which models each user as a personalized heterogeneous graph built from user behaviors to better capture the fine-grained behavior relatedness. In addition, in order to learn user interest embedding from the personalized heterogeneous graph, we propose a novel heterogeneous graph pooling method, which can summarize both node features and graph topology, and be aware of the varied characteristics of different types of nodes. Experiments on large-scale benchmark dataset show the proposed methods can effectively improve the performance of user modeling for news recommendation.

PDF Details DOI

ECAI Conference 2020 Conference Paper

A Multi-Task Learning Neural Network for Emotion-Cause Pair Extraction

Sixing Wu
Fang Chen
Fangzhao Wu
Yongfeng Huang 0001
Xing Li 0001

Emotion-cause pair extraction, which aims at extracting both the emotion and its corresponding cause in text, is a significant and challenging task in emotion analysis. Previous work formulated the task in a two-step framework, i. e. , emotion and cause extraction, and emotion-cause relation classification. However, different tasks may correlate with each other and the two-step framework does not fully exploit the interactions between tasks. In this paper, we propose a multi-task neural network to perform emotion-cause pair extraction in a unified model. The task of relation classification is learned together with emotion and cause extraction. To this end, we develop a method to obtain training samples for relation classification without the dependence on the result of emotion and cause extraction. To fully exploit the interactions between different tasks, our model shares useful features across tasks. Moreover, we propose a method to incorporate position-aware emotion information in cause extraction to further improve the performance. Experimental results show that our model outperforms the state-of-the-art model on emotion-cause pair extraction.

Details

IJCAI Conference 2020 Conference Paper

User Modeling with Click Preference and Reading Satisfaction for News Recommendation

Chuhan Wu
Fangzhao Wu
Tao Qi
Yongfeng Huang

Modeling user interest is critical for accurate news recommendation. Existing news recommendation methods usually infer user interest from click behaviors on news. However, users may click a news article because attracted by its title shown on the news website homepage, but may not be satisfied with its content after reading. In many cases users close the news page quickly after click. In this paper we propose to model user interest from both click behaviors on news titles and reading behaviors on news content for news recommendation. More specifically, we propose a personalized reading speed metric to measure users’ satisfaction with news content. We learn embeddings of users from the news content they have read and their satisfaction with these news to model their interest in news content. In addition, we also learn another user embedding from the news titles they have clicked to model their preference in news titles. We combine both kinds of user embeddings into a unified user representation for news recommendation. We train the user representation model using two supervised learning tasks built from user behaviors, i. e. , news title based click prediction and news content based satisfaction prediction, to encourage our model to recommend the news articles which not only are likely to be clicked but also have the content satisfied by the user. Experiments on real-world dataset show our method can effectively boost the performance of user modeling for news recommendation.

PDF Details DOI

AAAI Conference 2019 Conference Paper

DRr-Net: Dynamic Re-Read Network for Sentence Semantic Matching

Kun Zhang
Guangyi Lv
Linyuan Wang
Le Wu
Enhong Chen
Fangzhao Wu
Xing Xie

Sentence semantic matching requires an agent to determine the semantic relation between two sentences, which is widely used in various natural language tasks such as Natural Language Inference (NLI) and Paraphrase Identification (PI). Among all matching methods, attention mechanism plays an important role in capturing the semantic relations and properly aligning the elements of two sentences. Previous methods utilized attention mechanism to select important parts of sentences at one time. However, the important parts of the sentence during semantic matching are dynamically changing with the degree of sentence understanding. Selecting the important parts at one time may be insufficient for semantic understanding. To this end, we propose a Dynamic Re-read Network (DRr-Net) approach for sentence semantic matching, which is able to pay close attention to a small region of sentences at each step and re-read the important words for better sentence semantic understanding. To be specific, we first employ Attention Stack-GRU (ASG) unit to model the original sentence repeatedly and preserve all the information from bottom-most word embedding input to up-most recurrent output. Second, we utilize Dynamic Re-read (DRr) unit to pay close attention to one important word at one time with the consideration of learned information and re-read the important words for better sentence semantic understanding. Extensive experiments on three sentence matching benchmark datasets demonstrate that DRr-Net has the ability to model sentence semantic more precisely and significantly improve the performance of sentence semantic matching. In addition, it is very interesting that some of finding in our experiments are consistent with the findings of psychological research.

PDF Details

IJCAI Conference 2019 Conference Paper

Hi-Fi Ark: Deep User Representation via High-Fidelity Archive Network

Zheng Liu
Yu Xing
Fangzhao Wu
Mingxiao An
Xing Xie

Deep learning techniques have been widely applied to modern recommendation systems, bringing in flexible and effective ways of user representation. Conventionally, user representations are generated purely in the offline stage. Without referencing to the specific candidate item for recommendation, it is difficult to fully capture user preference from the perspective of interest. More recent algorithms tend to generate user representation at runtime, where user's historical behaviors are attentively summarized w. r. t. the presented candidate item. In spite of the improved efficacy, it is too expensive for many real-world scenarios because of the repetitive access to user's entire history. In this work, a novel user representation framework, Hi-Fi Ark, is proposed. With Hi-Fi Ark, user history is summarized into highly compact and complementary vectors in the offline stage, known as archives. Meanwhile, user preference towards a specific candidate item can be precisely captured via the attentive aggregation of such archives. As a result, both deployment feasibility and superior recommendation efficacy are achieved by Hi-Fi Ark. The effectiveness of Hi-Fi Ark is empirically validated on three real-world datasets, where remarkable and consistent improvements are made over a variety of well-recognized baseline methods.

PDF Details

IJCAI Conference 2019 Conference Paper

Neural News Recommendation with Attentive Multi-View Learning

Chuhan Wu
Fangzhao Wu
Mingxiao An
Jianqiang Huang
Yongfeng Huang
Xing Xie

Personalized news recommendation is very important for online news platforms to help users find interested news and improve user experience. News and user representation learning is critical for news recommendation. Existing news recommendation methods usually learn these representations based on single news information, e. g. , title, which may be insufficient. In this paper we propose a neural news recommendation approach which can learn informative representations of users and news by exploiting different kinds of news information. The core of our approach is a news encoder and a user encoder. In the news encoder we propose an attentive multi-view learning model to learn unified news representations from titles, bodies and topic categories by regarding them as different views of news. In addition, we apply both word-level and view-level attention mechanism to news encoder to select important words and views for learning informative news representations. In the user encoder we learn the representations of users based on their browsed news and apply attention mechanism to select informative news for user representation learning. Extensive experiments on a real-world dataset show our approach can effectively improve the performance of news recommendation.

PDF Details

AAAI Conference 2016 Conference Paper

Personalized Microblog Sentiment Classification via Multi-Task Learning

Fangzhao Wu
Yongfeng Huang

Microblog sentiment classiﬁcation is an interesting and important research topic with wide applications. Traditional microblog sentiment classiﬁcation methods usually use a single model to classify the messages from different users and omit individuality. However, microblogging users frequently embed their personal character, opinion bias and language habits into their messages, and the same word may convey different sentiments in messages posted by different users. In this paper, we propose a personalized approach for microblog sentiment classiﬁcation. In our approach, each user has a personalized sentiment classiﬁer, which is decomposed into two components, a global one and a user-speciﬁc one. Our approach can capture the individual personality and at the same time leverage the common sentiment knowledge shared by all users. The personalized sentiment classiﬁers of massive users are trained in a collaborative way based on multi-task learning to handle the data sparseness problem. In addition, we incorporate users’ social relations into our model to strengthen the learning of the personalized models. Moreover, we propose a distributed optimization algorithm to solve our model in parallel. Experiments on two real-world microblog sentiment datasets validate that our approach can improve microblog sentiment classiﬁcation accuracy effectively and efﬁciently.

PDF Details

AAAI Conference 2015 Conference Paper

Microblog Sentiment Classification with Contextual Knowledge Regularization

Fangzhao Wu
Yangqiu Song
Yongfeng Huang

Microblog sentiment classification is an important research topic which has wide applications in both academia and industry. Because microblog messages are short, noisy and contain masses of acronyms and informal words, microblog sentiment classification is a very challenging task. Fortunately, collectively the contextual information about these idiosyncratic words provide knowledge about their sentiment orientations. In this paper, we propose to use the microblogs’ contextual knowledge mined from a large amount of unlabeled data to help improve microblog sentiment classification. We define two kinds of contextual knowledge: wordword association and word-sentiment association. The contextual knowledge is formulated as regularization terms in supervised learning algorithms. An efficient optimization procedure is proposed to learn the model. Experimental results on benchmark datasets show that our method can consistently and significantly outperform the state-of-the-art methods.

PDF Details