Arrow Research search

Author name cluster

Cen Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

23 papers
1 author row

Possible papers

23

AAAI Conference 2026 Conference Paper

FILTER: A Framework for Defending Against Backdoor Attacks in Vertical Federated Learning

  • Zhanyi Hu
  • Cen Chen
  • Yanhao Wang

Vertical Federated Learning (VFL) is a distributed machine learning paradigm in which participants train models with vertically partitioned data. Many previous studies have identified backdoor vulnerabilities in VFL systems. However, limited effort has been devoted to developing defenses against such attacks. Unlike centralized machine learning or horizontal FL, VFL poses new challenges for defending against backdoor attacks, particularly because the central server lacks control over the entire model. In this paper, we first explore defenses against backdoor attacks in VFL when the attacker possesses sufficient knowledge of the label information. Specifically, we propose FILTER, a framework for defending against backdoor attacks in VFL to ensure the integrity of VFL systems during training in the presence of malicious participants. To address backdoor risks in VFL, it incorporates two novel filters: an embedding-based filter and a loss-based filter, which effectively identify and remove poisoned samples in later stages of training. Through extensive experiments on five benchmark datasets against four state-of-the-art backdoor attacks, we demonstrate that FILTER significantly reduces the success rate of attacks while maintaining accuracy on clean data close to that of the models trained without such defenses.

AAAI Conference 2026 Conference Paper

RCP-Merging: Merging Long Chain-of-Thought Models with Domain-Specific Models by Considering Reasoning Capability as Prior

  • Junyao Yang
  • Jianwei Wang
  • Huiping Zhuang
  • Cen Chen
  • Ziqian Zeng

Large Language Models (LLMs) with long chain-of-thought (CoT) capability, termed Reasoning Models, demonstrate superior intricate problem-solving abilities through multi-step long CoT reasoning. To create a dual-capability model with long CoT capability and domain-specific knowledge without substantial computational and data costs, model merging emerges as a highly resource-efficient method. However, significant challenges lie in merging domain-specific LLMs with long CoT ones, since existing merging methods suffer from reasoning capability degradation and can even produce gibberish or collapsed output. To overcome this, we introduce RCP-Merging: Merging Long Chain-of-Thought Models with Domain-Specific Models by Considering Reasoning Capability as Prior, a novel merging framework designed to integrate domain-specific LLMs with long CoT capability while maintaining model performance in the original domain. Treating the reasoning model's weights as a foundational prior, our method utilizes a reasoning capability indicator to preserve core long CoT model weights while selectively merging essential domain-specific weights. We conducted extensive experiments on Qwen2.5-7B, Llama3.1-8B, and Qwen2.5-1.5B models in the BioMedicine and Finance domains. Our results show that RCP-Merging successfully merges a reasoning model with domain-specific ones, improving domain task performance by 9.5% and 9.2% over state-of-the-art methods, without significantly harming the original long CoT reasoning capability.

NeurIPS Conference 2025 Conference Paper

Comprehensive Assessment and Analysis for NSFW Content Erasure in Text-to-Image Diffusion Models

  • Die Chen
  • Zhiwen Li
  • Cen Chen
  • Yuexiang Xie
  • Xiaodan Li
  • Jinyan Ye
  • Yingda Chen
  • Yaliang Li

Text-to-image diffusion models have gained widespread application across various domains, demonstrating remarkable creative potential. However, the strong generalization capabilities of diffusion models can inadvertently lead to the generation of not-safe-for-work (NSFW) content, posing significant risks to their safe deployment. While several concept erasure methods have been proposed to mitigate the issue associated with NSFW content, a comprehensive evaluation of their effectiveness across various scenarios remains absent. To bridge this gap, we introduce a full-pipeline toolkit specifically designed for concept erasure and conduct the first systematic study of NSFW concept erasure methods. By examining the interplay between the underlying mechanisms and empirical observations, we provide in-depth insights and practical guidance for the effective application of concept erasure methods in various real-world scenarios, with the aim of advancing the understanding of content safety in diffusion models and establishing a solid foundation for future research and development in this critical area.

IJCAI Conference 2025 Conference Paper

Enhancing Portfolio Optimization via Heuristic-Guided Inverse Reinforcement Learning with Multi-Objective Reward and Graph-based Policy Learning

  • Wenyi Zhang
  • Renjun Jia
  • Yanhao Wang
  • Dawei Cheng
  • Minghao Zhao
  • Cen Chen

Portfolio optimization encounters persistent challenges in adapting to dynamic markets due to static assumptions and high-dimensional decision spaces. Although reinforcement learning (RL) has emerged as a potential solution, conventional reward engineering often fails to capture complex market dynamics. Recent advances in deep RL and graph neural networks have attempted to enhance market microstructure modeling. However, these methods still struggle with the systematic integration of financial knowledge. To address the above issues, we propose a novel heuristic-guided inverse reinforcement learning framework for portfolio optimization. Specifically, our framework provides an interpretable expert strategy generation mechanism that takes into account sector diversification and correlation constraints. Then, a multi-objective reward optimization method is adopted to adaptively strike a balance between returns and risks. Furthermore, it also utilizes heterogeneous graph policy learning with hierarchical attention mechanisms to explicitly model inter-stock relationships. Finally, we conduct extensive experiments on real-world financial market data to demonstrate that our framework outperforms several state-of-the-art deep learning and RL baselines in terms of risk-adjusted returns. We provide case studies to showcase the ability of our framework to balance return maximization and risk containment. Our code is publicly available at https://github.com/ChloeWenyiZhang/SmartFolio/.

IJCAI Conference 2025 Conference Paper

ExVideo: Extending Video Diffusion Models via Parameter-Efficient Post-Tuning

  • Zhongjie Duan
  • Hong Zhang
  • Wenmeng Zhou
  • Cen Chen
  • Yaliang Li
  • Yu Zhang
  • Yingda Chen

Recently, advancements in video synthesis have attracted significant attention. Video synthesis models have demonstrated the practical applicability of diffusion models in creating dynamic visual content. Despite these advancements, the extension of video lengths remains constrained by computational resources. Most existing video synthesis models are limited to generating short video clips. In this paper, we propose a novel post-tuning methodology for video synthesis models, called ExVideo. This approach is designed to enhance the capability of current video synthesis models, allowing them to produce content over extended temporal durations while incurring lower training expenditures. In particular, we design extension strategies across common temporal model architectures, including 3D convolution, temporal attention, and positional embedding. To evaluate the efficacy of our proposed post-tuning approach, we trained ExSVD, an extended model based on the Stable Video Diffusion model. Our approach enhances the model's capacity to generate up to 5x its original number of frames, requiring only 1.5k GPU hours of training on a dataset comprising 40k videos. Importantly, the substantial increase in video length does not compromise the model's innate generalization capabilities, and the model showcases its advantages in generating videos of diverse styles and resolutions. We have released the source code and the enhanced model publicly.

IJCAI Conference 2025 Conference Paper

FastBlend: Enhancing Video Stylization Consistency via Model-Free Patch Blending

  • Zhongjie Duan
  • Chengyu Wang
  • Cen Chen
  • Weining Qian
  • Jun Huang
  • Mingyi Jin

With the emergence of diffusion models and the rapid development of image processing, generating artistic images in style transfer tasks has become effortless. However, these impressive image processing approaches face consistency issues in video processing due to the independent processing of each frame. In this paper, we propose a powerful, model-free approach called FastBlend to address the consistency problem in video stylization. FastBlend functions as a post-processor and can be seamlessly integrated with diffusion models to create a robust video stylization pipeline. Based on a patch-matching algorithm, we remap and blend the aligned content across multiple frames, thus compensating for inconsistent content with neighboring frames. Moreover, we propose a tree-like data structure and a specialized loss function, aiming to optimize computational efficiency and visual quality for different application scenarios. Extensive experiments have demonstrated the effectiveness of FastBlend. Compared with both independent video deflickering algorithms and diffusion-based video processing methods, FastBlend is capable of synthesizing more coherent and realistic videos.

IJCAI Conference 2025 Conference Paper

GSDNet: Revisiting Incomplete Multimodality-Diffusion Emotion Recognition from the Perspective of Graph Spectrum

  • Yuntao Shou
  • Jun Yao
  • Tao Meng
  • Wei Ai
  • Cen Chen
  • Keqin Li

Multimodal Emotion Recognition (MER) combines technologies from multiple fields (e.g., computer vision, natural language processing, and audio signal processing), aiming to infer an individual's emotional state by analyzing information from different sources (i.e., video, audio, and text). Compared with a single modality, by fusing complementary semantic information from different modalities, the model can obtain a more robust knowledge representation. However, the modality missing problem limits the performance of MER in practical scenarios. Recent work has achieved impressive performance on modality completion using graph neural networks and diffusion models, respectively. This inspires us to combine these two dimensions in the completion network to obtain more powerful representation capabilities. However, we argue that directly running a full-rank score-based diffusion model on the entire graph adjacency matrix space may adversely affect the learning process of the diffusion model. This is because the model assumes a direct relationship between each pair of nodes and ignores local structural features and sparse connections between nodes, thereby significantly reducing the quality of the generated data. Based on the above ideas, we propose a novel Graph Spectral Diffusion Network (GSDNet), which utilizes a low-rank score-based diffusion model to map Gaussian noise to the graph spectral distribution space of missing modalities and recover the missing data according to its original distribution. Extensive experiments have demonstrated that GSDNet achieves state-of-the-art emotion recognition performance in various modality loss scenarios.

IJCAI Conference 2025 Conference Paper

TOTF: Missing-Aware Encoders for Clustering on Multi-View Incomplete Attributed Graphs

  • Mengyao Li
  • Xu Zhou
  • Jiapeng Zhang
  • Zhibang Yang
  • Cen Chen
  • Kenli Li

As the network data in real life become multi-modal and multi-relational, multi-view attributed graphs have garnered significant attention. Numerous methods have achieved excellent performance in multi-view attributed graph clustering; however, they cannot efficiently handle incomplete attribute scenarios, which are prevalent in many real-life applications. Inspired by this, we investigate the problem of multi-view incomplete attributed graph clustering for the first time. In particular, the TOTF (Train Once Then Freeze) framework is designed to train missing-aware encoders that capture view-specific information while ignoring the impact of incomplete attributes, and then employs frozen encoders to uncover common information driven by clustering. After that, we propose a correlation strength-aware graph neural network on the basis of the inherent relationships among attributes to enhance accuracy. It is proven theoretically that traditional Generative Adversarial Networks (GANs) are unable to generate the unique real distribution. To address this issue, we further introduce the missing-position reminder mechanism into our intra-view adversarial games for better clustering results. Extensive experimental results demonstrate that our method achieves up to a 17% improvement in accuracy over the state-of-the-art methods. The source code is available at https://anonymous.4open.science/r/TOTF-main.

AAAI Conference 2024 Conference Paper

ConsistentEE: A Consistent and Hardness-Guided Early Exiting Method for Accelerating Language Models Inference

  • Ziqian Zeng
  • Yihuai Hong
  • Hongliang Dai
  • Huiping Zhuang
  • Cen Chen

Early Exiting is one of the most popular methods to achieve efficient inference. Current early exiting methods adopt the (weighted) sum of the cross entropy loss of all internal classifiers as the objective function during training, requiring all these classifiers to predict all instances correctly. However, during inference, as long as one internal classifier predicts an instance correctly, it can accelerate inference without losing accuracy. Thus, there is a notable gap between training and inference. We propose ConsistentEE, an early exiting method that is consistent in training and inference. ConsistentEE formulates the early exiting process as a reinforcement learning problem. A policy network is added to decide whether an instance should exit or continue. The training objective of ConsistentEE only requires each instance to be predicted correctly by one internal classifier. Additionally, we introduce the concept of a "Memorized Layer" to measure the hardness of an instance. We incorporate the memorized layer into the reward function design, which allows "easy" instances to focus more on acceleration and "hard" instances to focus more on accuracy. Experimental results show that our method outperforms other baselines on various natural language understanding and generation tasks, using PLMs and LLMs as backbones respectively.
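The exit-or-continue inference loop that the abstract describes can be sketched as follows. This is an illustrative toy, not the paper's implementation: `classifiers` and `policy` are hypothetical stand-ins for the learned internal classifiers and policy network, and the threshold rule is an assumption.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def early_exit_infer(hidden_states, classifiers, policy, threshold=0.5):
    """Walk layer by layer; return a prediction as soon as the policy
    says "exit" (or at the final layer), together with the exit layer."""
    last = len(hidden_states) - 1
    for layer, h in enumerate(hidden_states):
        logits = classifiers[layer](h)
        if policy(h) >= threshold or layer == last:
            probs = softmax(logits)
            pred = max(range(len(probs)), key=probs.__getitem__)
            return pred, layer

# Toy demo: 3 "layers", 2 classes. The stub policy treats the first
# hidden activation as an exit probability (a stand-in for a network).
states = [[0.1, 0.0], [0.4, 0.2], [0.9, 0.1]]
clfs = [lambda h: [h[0], h[1]] for _ in range(3)]
policy = lambda h: h[0]
pred, layer = early_exit_infer(states, clfs, policy)  # exits at layer 2
```

An "easy" instance whose policy score crosses the threshold early would return after the first layers, which is exactly where the acceleration comes from.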

IJCAI Conference 2024 Conference Paper

Diffutoon: High-Resolution Editable Toon Shading via Diffusion Models

  • Zhongjie Duan
  • Chengyu Wang
  • Cen Chen
  • Weining Qian
  • Jun Huang

Toon shading is a type of non-photorealistic rendering task in animation. Its primary purpose is to render objects with a flat and stylized appearance. As diffusion models have ascended to the forefront of image synthesis, this paper delves into an innovative form of toon shading based on diffusion models, aiming to directly render photorealistic videos into anime styles. In video stylization, existing methods encounter persistent challenges, notably in maintaining consistency and achieving high visual quality. In this paper, we model the toon shading problem as four subproblems, i.e., stylization, consistency enhancement, structure guidance, and colorization. To address the challenges in video stylization, we propose an effective toon shading approach called Diffutoon. Diffutoon is capable of rendering remarkably detailed, high-resolution, and extended-duration videos in anime style. It can also edit the video content according to input prompts via an additional branch. The efficacy of Diffutoon is evaluated through quantitative metrics and human evaluation. Notably, Diffutoon surpasses both open-source and closed-source baseline approaches in our experiments. Our work is accompanied by the release of both the source code and example videos on GitHub.

AAAI Conference 2024 Conference Paper

DS-AL: A Dual-Stream Analytic Learning for Exemplar-Free Class-Incremental Learning

  • Huiping Zhuang
  • Run He
  • Kai Tong
  • Ziqian Zeng
  • Cen Chen
  • Zhiping Lin

Class-incremental learning (CIL) under an exemplar-free constraint has presented a significant challenge. Existing methods adhering to this constraint are prone to catastrophic forgetting, far more so than replay-based techniques that retain access to past samples. In this paper, to solve the exemplar-free CIL problem, we propose a Dual-Stream Analytic Learning (DS-AL) approach. The DS-AL contains a main stream offering an analytical (i.e., closed-form) linear solution, and a compensation stream improving the inherent under-fitting limitation due to adopting linear mapping. The main stream redefines the CIL problem into a Concatenated Recursive Least Squares (C-RLS) task, allowing an equivalence between the CIL and its joint-learning counterpart. The compensation stream is governed by a Dual-Activation Compensation (DAC) module. This module re-activates the embedding with a different activation function from the main stream one, and seeks fitting compensation by projecting the embedding to the null space of the main stream's linear mapping. Empirical results demonstrate that the DS-AL, despite being an exemplar-free technique, delivers performance comparable with or better than that of replay-based methods across various datasets, including CIFAR-100, ImageNet-100 and ImageNet-Full. Additionally, the C-RLS's equivalence property allows the DS-AL to execute CIL in a phase-invariant manner. This is evidenced by a never-before-seen 500-phase CIL ImageNet task, which performs on a level identical to a 5-phase one. Our codes are available at https://github.com/ZHUANGHP/Analytic-continual-learning.

NeurIPS Conference 2024 Conference Paper

F-OAL: Forward-only Online Analytic Learning with Fast Training and Low Memory Footprint in Class Incremental Learning

  • Huiping Zhuang
  • Yuchen Liu
  • Run He
  • Kai Tong
  • Ziqian Zeng
  • Cen Chen
  • Yi Wang
  • Lap-Pui Chau

Online Class Incremental Learning (OCIL) aims to train models incrementally, where data arrive in mini-batches and previous data are not accessible. A major challenge in OCIL is catastrophic forgetting, i.e., the loss of previously learned knowledge. Among existing baselines, replay-based methods show competitive results but require extra memory for storing exemplars, while exemplar-free (i.e., data need not be stored for replay in production) methods are resource friendly but often lack accuracy. In this paper, we propose an exemplar-free approach, Forward-only Online Analytic Learning (F-OAL). Unlike traditional methods, F-OAL does not rely on back-propagation and is forward-only, significantly reducing memory usage and computational time. Cooperating with a pre-trained frozen encoder with Feature Fusion, F-OAL only needs to update a linear classifier by recursive least squares. This approach simultaneously achieves high accuracy and low resource consumption. Extensive experiments on benchmark datasets demonstrate F-OAL's robust performance in OCIL scenarios. Code is available at https://github.com/liuyuchen-cz/F-OAL.
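A textbook recursive-least-squares update of a linear classifier, of the kind the abstract says F-OAL performs on top of a frozen encoder, can be sketched in a few lines. This is a generic RLS sketch under assumed notation (weights `W`, inverse-autocorrelation matrix `P`, ridge parameter `lam`), not the paper's actual code; the "frozen-encoder features" in the demo are made up.

```python
def rls_init(d, c, lam=1.0):
    """Zero weight matrix W (d x c) and regularized P = (lam * I)^-1."""
    W = [[0.0] * c for _ in range(d)]
    P = [[1.0 / lam if i == j else 0.0 for j in range(d)] for i in range(d)]
    return W, P

def rls_update(W, P, x, y):
    """One forward-only update from feature x and one-hot label y."""
    d, c = len(x), len(y)
    Px = [sum(P[i][j] * x[j] for j in range(d)) for i in range(d)]
    denom = 1.0 + sum(x[i] * Px[i] for i in range(d))
    k = [v / denom for v in Px]                       # gain vector
    pred = [sum(W[i][j] * x[i] for i in range(d)) for j in range(c)]
    err = [y[j] - pred[j] for j in range(c)]
    for i in range(d):                                # W += k * err^T
        for j in range(c):
            W[i][j] += k[i] * err[j]
    xP = [sum(x[i] * P[i][j] for i in range(d)) for j in range(d)]
    for i in range(d):                                # P -= k * (x^T P)
        for j in range(d):
            P[i][j] -= k[i] * xP[j]

def classify(W, x):
    scores = [sum(W[i][j] * x[i] for i in range(len(x))) for j in range(len(W[0]))]
    return max(range(len(scores)), key=scores.__getitem__)

# Stream two toy "frozen-encoder features" with one-hot labels.
W, P = rls_init(d=2, c=2)
rls_update(W, P, [1.0, 0.0], [1.0, 0.0])
rls_update(W, P, [0.0, 1.0], [0.0, 1.0])
```

Because each update only reads the current mini-batch and mutates `W` and `P` in place, nothing from earlier batches needs to be stored, which is the memory property the abstract emphasizes.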

NeurIPS Conference 2024 Conference Paper

GACL: Exemplar-Free Generalized Analytic Continual Learning

  • Huiping Zhuang
  • Yizhu Chen
  • Di Fang
  • Run He
  • Kai Tong
  • Hongxin Wei
  • Ziqian Zeng
  • Cen Chen

Class incremental learning (CIL) trains a network on sequential tasks with separated categories in each task but suffers from catastrophic forgetting, where models quickly lose previously learned knowledge when acquiring new tasks. Generalized CIL (GCIL) aims to address the CIL problem in a more real-world scenario, where incoming data have mixed data categories and unknown sample size distributions. Existing attempts at GCIL either have poor performance or invade data privacy by saving exemplars. In this paper, we propose a new exemplar-free GCIL technique named Generalized Analytic Continual Learning (GACL). The GACL adopts analytic learning (a gradient-free training technique) and delivers an analytical (i.e., closed-form) solution to the GCIL scenario. This solution is derived by decomposing the incoming data into exposed and unexposed classes, thereby attaining a weight-invariant property, a rare yet valuable property supporting an equivalence between incremental learning and its joint training. Such an equivalence is crucial in GCIL settings, as data distributions among different tasks no longer pose challenges to adopting our GACL. Theoretically, this equivalence property is validated through matrix analysis tools. Empirically, we conduct extensive experiments where, compared with existing GCIL methods, our GACL exhibits consistently leading performance across various datasets and GCIL settings. Source code is available at https://github.com/CHEN-YIZHU/GACL.

NeurIPS Conference 2024 Conference Paper

Transferability Bound Theory: Exploring Relationship between Adversarial Transferability and Flatness

  • Mingyuan Fan
  • Xiaodan Li
  • Cen Chen
  • Wenmeng Zhou
  • Yaliang Li

A prevailing belief in the attack and defense community is that the higher flatness of adversarial examples enables their better cross-model transferability, leading to a growing interest in employing sharpness-aware minimization and its variants. However, the theoretical relationship between the transferability of adversarial examples and their flatness has not been well established, making the belief questionable. To bridge this gap, we embark on a theoretical investigation and, for the first time, derive a theoretical bound for the transferability of adversarial examples with few practical assumptions. Our analysis challenges this belief by demonstrating that the increased flatness of adversarial examples does not necessarily guarantee improved transferability. Moreover, building upon the theoretical analysis, we propose TPA, a Theoretically Provable Attack that optimizes a surrogate of the derived bound to craft adversarial examples. Extensive experiments across widely used benchmark datasets and various real-world applications show that TPA can craft more transferable adversarial examples compared to state-of-the-art baselines. We hope that these results can recalibrate preconceived impressions within the community and facilitate the development of stronger adversarial attack and defense mechanisms.

JBHI Journal 2023 Journal Article

CoIn: Correlation Induced Clustering for Cognition of High Dimensional Bioinformatics Data

  • Zeng Zeng
  • Ziyuan Zhao
  • Kaixin Xu
  • Yangfan Li
  • Cen Chen
  • Xiaofeng Zou
  • Yulan Wang
  • Wei Wei

Analysis of high dimensional biomedical data, such as microarray gene expression data and mass spectrometry images, is crucial to providing better medical services, including cancer subtyping, protein homology detection, etc. Clustering is a fundamental cognitive task that aims to group unlabeled data into multiple clusters based on their intrinsic similarities. However, for most clustering methods, including the most widely used $K$-means algorithm, all features of the high dimensional data are considered equally relevant, which distorts performance when clustering high-dimensional data with many redundant and correlated variables. In this paper, we address the problem of high dimensional bioinformatics data clustering and propose a new correlation induced clustering method, CoIn, to capture complex correlations among high dimensional data and guarantee correlation consistency within each cluster. We evaluate the proposed method on a high dimensional mass spectrometry dataset of liver cancer tumors to explore the metabolic differences across tissues and discover intra-tumor heterogeneity (ITH). By comparing the results of baselines and ours, we find that our method produces more explainable and understandable results for clinical analysis, which demonstrates that the proposed clustering paradigm has potential for knowledge discovery in high dimensional bioinformatics data.

TIST Journal 2022 Journal Article

Toward Scalable and Privacy-preserving Deep Neural Network via Algorithmic-Cryptographic Co-design

  • Jun Zhou
  • Longfei Zheng
  • Chaochao Chen
  • Yan Wang
  • Xiaolin Zheng
  • Bingzhe Wu
  • Cen Chen
  • Li Wang

Deep Neural Networks (DNNs) have achieved remarkable progress in various real-world applications, especially when abundant training data are provided. However, data isolation has become a serious problem currently. Existing works build privacy-preserving DNN models from either algorithmic perspective or cryptographic perspective. The former mainly splits the DNN computation graph between data holders or between data holders and server, which demonstrates good scalability but suffers from accuracy loss and potential privacy risks. In contrast, the latter leverages time-consuming cryptographic techniques, which has strong privacy guarantee but poor scalability. In this article, we propose SPNN—a Scalable and Privacy-preserving deep Neural Network learning framework, from an algorithmic-cryptographic co-perspective. From algorithmic perspective, we split the computation graph of DNN models into two parts, i.e., the private-data-related computations that are performed by data holders and the rest heavy computations that are delegated to a semi-honest server with high computation ability. From cryptographic perspective, we propose using two types of cryptographic techniques, i.e., secret sharing and homomorphic encryption, for the isolated data holders to conduct private-data-related computations privately and cooperatively. Furthermore, we implement SPNN in a decentralized setting and introduce user-friendly APIs. Experimental results conducted on real-world datasets demonstrate the superiority of our proposed SPNN.

JBHI Journal 2021 Journal Article

DSAL: Deeply Supervised Active Learning From Strong and Weak Labelers for Biomedical Image Segmentation

  • Ziyuan Zhao
  • Zeng Zeng
  • Kaixin Xu
  • Cen Chen
  • Cuntai Guan

Image segmentation is one of the most essential biomedical image processing problems for different imaging modalities, including microscopy and X-ray, in the Internet-of-Medical-Things (IoMT) domain. However, annotating biomedical images is knowledge-driven, time-consuming, and labor-intensive, making it difficult to obtain abundant labels at limited cost. Active learning strategies come in to ease the burden of human annotation by querying only a subset of the training data for annotation. Despite receiving attention, most active learning methods still require huge computational costs and utilize unlabeled data inefficiently. They also tend to ignore the intermediate knowledge within networks. In this work, we propose a deep active semi-supervised learning framework, DSAL, combining active learning and semi-supervised learning strategies. In DSAL, a new criterion based on a deep supervision mechanism is proposed to select informative samples with high uncertainties and low uncertainties for strong labelers and weak labelers, respectively. The internal criterion leverages the disagreement of intermediate features within the deep learning network for active sample selection, which subsequently reduces computational costs. We use the proposed criteria to select samples for strong and weak labelers to produce oracle labels and pseudo labels simultaneously at each active learning iteration in an ensemble learning manner, which can be examined on an IoMT platform. Extensive experiments on multiple medical image datasets demonstrate the superiority of the proposed method over state-of-the-art active learning methods.

AAAI Conference 2021 Conference Paper

Reinforced History Backtracking for Conversational Question Answering

  • Minghui Qiu
  • Xinjing Huang
  • Cen Chen
  • Feng Ji
  • Chen Qu
  • Wei Wei
  • Jun Huang
  • Yin Zhang

Modeling the context history in multi-turn conversations has become a critical step towards a better understanding of the user query in question answering systems. To utilize the context history, most existing studies treat the whole context as input, which inevitably faces the following two challenges. First, modeling a long history can be costly, as it requires more computational resources. Second, the long context history contains a lot of irrelevant information, which makes it difficult to model the information relevant to the user query. To alleviate these problems, in this paper we propose a reinforcement learning based method to capture and backtrack the related conversation history to boost model performance. Our method seeks to automatically backtrack the history information with implicit feedback from the model performance. We further consider both immediate and delayed rewards to guide the reinforced backtracking policy. Extensive experiments on a large conversational question answering dataset show that the proposed method can help to alleviate the problems arising from longer context history. Meanwhile, experiments show that the method yields better performance than other strong baselines, and the actions made by the method are insightful.

AAAI Conference 2020 Conference Paper

Characterizing Membership Privacy in Stochastic Gradient Langevin Dynamics

  • Bingzhe Wu
  • Chaochao Chen
  • Shiwan Zhao
  • Cen Chen
  • Yuan Yao
  • Guangyu Sun
  • Li Wang
  • Xiaolu Zhang

Bayesian deep learning is recently regarded as an intrinsic way to characterize the weight uncertainty of deep neural networks (DNNs). Stochastic Gradient Langevin Dynamics (SGLD) is an effective method to enable Bayesian deep learning on large-scale datasets. Previous theoretical studies have shown various appealing properties of SGLD, ranging from convergence properties to generalization bounds. In this paper, we study the properties of SGLD from the novel perspective of membership privacy protection (i.e., preventing the membership attack). The membership attack, which aims to determine whether a specific sample is used for training a given DNN model, has emerged as a common threat against deep learning algorithms. To this end, we build a theoretical framework to analyze the information leakage (w.r.t. the training dataset) of a model trained using SGLD. Based on this framework, we demonstrate that SGLD can prevent the information leakage of the training dataset to a certain extent. Moreover, our theoretical analysis can be naturally extended to other types of Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) methods. Empirical results on different datasets and models verify our theoretical findings and suggest that the SGLD algorithm can not only reduce the information leakage but also improve the generalization ability of DNN models in real-world applications.
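The SGLD update rule itself is compact: a half-step of gradient descent plus Gaussian noise whose variance matches the step size. The sketch below samples from a toy 1-D posterior (a standard normal, via the loss L(theta) = theta^2 / 2); it is a generic illustration of the update, not the paper's experimental setup, and the step size and burn-in length are arbitrary choices.

```python
import math
import random

def sgld_step(theta, grad_fn, eps, rng):
    """One SGLD update: theta <- theta - (eps/2) * grad(theta) + N(0, eps)."""
    return theta - 0.5 * eps * grad_fn(theta) + rng.gauss(0.0, math.sqrt(eps))

# Target density exp(-L) with L(theta) = theta^2 / 2, i.e. a standard normal,
# so the gradient of the loss is simply theta.
rng = random.Random(0)
theta, eps = 3.0, 0.1
samples = []
for t in range(6000):
    theta = sgld_step(theta, lambda th: th, eps, rng)
    if t >= 1000:                 # discard burn-in iterations
        samples.append(theta)
mean = sum(samples) / len(samples)   # should hover near 0
```

With mini-batch gradients in place of the exact gradient, the same update yields the approximate posterior sampling that the paper's leakage analysis is built on; the injected noise is also the intuitive source of the membership-privacy effect.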

IJCAI Conference 2019 Conference Paper

AntProphet: an Intention Mining System behind Alipay's Intelligent Customer Service Bot

  • Cen Chen
  • Xiaolu Zhang
  • Sheng Ju
  • Chilin Fu
  • Caizhi Tang
  • Jun Zhou
  • Xiaolong Li

We create an intention mining system, named AntProphet, for Alipay's intelligent customer service bot, to alleviate the burden on customer service. Whenever users have questions, AntProphet is the first stop to help answer them. Our system gathers users' profiles and historical behavioral trajectories, together with contextual information, to predict users' intentions, i.e., the potential questions that users want to resolve. AntProphet handles more than 90% of the customer service demands in the Alipay app and resolves most of the users' problems on the spot, thus significantly reducing the manpower burden. With its help, the overall satisfaction rate of our customer service bot exceeds 85%.

AAAI Conference 2019 Conference Paper

Gated Residual Recurrent Graph Neural Networks for Traffic Prediction

  • Cen Chen
  • Kenli Li
  • Sin G. Teo
  • Xiaofeng Zou
  • Kang Wang
  • Jie Wang
  • Zeng Zeng

Traffic prediction is of great importance to traffic management and public safety, yet it is very challenging because it is affected by many complex factors, such as the spatial dependency of complicated road networks and temporal dynamics; the uncertainty and complexity of traffic states make the task harder still. In the literature, many works have applied deep learning methods to traffic prediction by combining convolutional neural networks (CNNs) with recurrent neural networks (RNNs), where CNNs capture spatial dependency and RNNs capture temporal dynamics. However, such combinations cannot capture the connectivity and globality of traffic networks. In this paper, we first propose residual recurrent graph neural networks (Res-RGNN), which capture graph-based spatial dependencies and temporal dynamics jointly. Because of vanishing gradients, RNNs struggle to capture periodic temporal correlations; hence, we further incorporate a novel hop scheme into Res-RGNN to exploit periodic temporal dependencies. Based on Res-RGNN and hop Res-RGNN, we finally propose a novel end-to-end framework of multiple Res-RGNNs, referred to as “MRes-RGNN”, for traffic prediction. Experimental results on two traffic datasets demonstrate that the proposed MRes-RGNN significantly outperforms state-of-the-art methods.
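The joint spatial/temporal step that a residual recurrent graph network performs can be sketched as a graph convolution feeding a simple recurrent cell, plus a residual (skip) connection from the input. This is an illustrative sketch only: the function names, the tanh cell, and the toy line graph of four road sensors are assumptions, not the paper's exact architecture.

```python
import numpy as np

def normalized_adj(A):
    """Symmetrically normalized adjacency with self-loops:
    D^{-1/2} (A + I) D^{-1/2}."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def res_rgnn_step(A_norm, x_t, h_prev, W_in, W_rec):
    """One recurrent step: graph-convolved input combined with the
    previous hidden state, plus a residual connection from the input
    (the residual path eases gradient flow through time)."""
    h = np.tanh(A_norm @ x_t @ W_in + h_prev @ W_rec)
    return h + x_t @ W_in

# Toy run: 4 road sensors on a line graph, 3 time steps of 2-dim features.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
A_norm = normalized_adj(A)
W_in, W_rec = rng.normal(size=(2, 8)), rng.normal(size=(8, 8))
h = np.zeros((4, 8))
for t in range(3):
    x_t = rng.normal(size=(4, 2))
    h = res_rgnn_step(A_norm, x_t, h, W_in, W_rec)
```

Because the graph convolution mixes each sensor's features with its neighbors', the hidden state reflects network connectivity directly, which a grid-based CNN cannot do for an irregular road network.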

TIST Journal 2017 Journal Article

Scalable Urban Mobile Crowdsourcing

  • Shih-Fen Cheng
  • Cen Chen
  • Thivya Kandappu
  • Hoong Chuin Lau
  • Archan Misra
  • Nikita Jaiman
  • Randy Tandriansyah
  • Desmond Koh

In this article, we investigate effective ways of utilizing crowdworkers to provide various urban services. The task recommendation platform we design matches tasks to crowdworkers based on workers' historical trajectories and time budget limits, making recommendations both personal and efficient. One major challenge we address is the handling of crowdworkers' trajectory uncertainties: we explicitly allow multiple routine routes to be probabilistically associated with each worker. We formulate this problem as an integer linear program whose goal is to maximize the expected total utility achieved by all workers. We further exploit the separable structure of the formulation and apply the Lagrangian relaxation technique to scale up computation. Numerical experiments were performed on instances generated from a realistic public transit dataset in Singapore. The results show that we can find significantly better solutions than the deterministic formulation, and in most cases solutions that are very close to the theoretical performance limit. To demonstrate the practicality of our approach, we deployed our recommendation engine in a campus-scale field trial and show that workers receiving our recommendations incur fewer detours, complete more tasks, and are more efficient than workers relying on their own planning (25% more for top workers who receive recommendations), despite highly uncertain worker trajectories. We also demonstrate how to further improve the robustness of the system with a simple multi-coverage mechanism.
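The Lagrangian relaxation idea can be illustrated on a toy version of such an assignment problem: relax the "each task assigned to at most one worker" constraint with multipliers, solve each worker's subproblem independently, and tighten the multipliers by subgradient steps. This is a simplified sketch under assumed utilities and budgets, not the paper's full stochastic formulation.

```python
import numpy as np

def lagrangian_dual_bound(u, budget, steps=200, alpha=0.1):
    """Subgradient method on the Lagrangian dual of a task-assignment ILP.

    u[w, t] -- utility of worker w performing task t
    budget  -- maximum number of tasks per worker
    The task-capacity constraints (each task at most once) are relaxed
    with multipliers lam; the dual decomposes by worker.
    Returns the best (smallest) dual upper bound found.
    """
    n_workers, n_tasks = u.shape
    lam = np.zeros(n_tasks)
    best_bound = np.inf
    for _ in range(steps):
        reduced = u - lam                        # reduced utilities
        x = np.zeros_like(u)
        for w in range(n_workers):               # independent worker subproblems
            top = np.argsort(reduced[w])[::-1][:budget]
            for t in top:
                if reduced[w, t] > 0:            # only profitable tasks
                    x[w, t] = 1.0
        bound = (reduced * x).sum() + lam.sum()  # dual objective L(lam)
        best_bound = min(best_bound, bound)
        g = 1.0 - x.sum(axis=0)                  # subgradient w.r.t. lam
        lam = np.maximum(0.0, lam - alpha * g)   # projected subgradient step
    return best_bound

rng = np.random.default_rng(1)
u = rng.uniform(0, 1, size=(3, 5))               # 3 workers, 5 tasks
bound = lagrangian_dual_bound(u, budget=2)
```

Each dual evaluation upper-bounds the primal optimum, which is how one measures how close a feasible recommendation is to the "theoretical performance limit" mentioned above.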

IJCAI Conference 2015 Conference Paper

Towards City-Scale Mobile Crowdsourcing: Task Recommendations under Trajectory Uncertainties

  • Cen Chen
  • Shih-Fen Cheng
  • Hoong Chuin Lau
  • Archan Misra

In this work, we investigate the problem of large-scale mobile crowdsourcing, where workers are financially motivated to physically perform location-based tasks. Unlike current industry practice, which relies on workers manually picking tasks to perform, we automatically make task recommendations based on workers' historical trajectories and desired time budgets. The challenge in predicting workers' trajectories is that they are uncertain, as a worker does not take the same routes every day. We therefore depart from deterministic modeling and study the stochastic task recommendation problem, in which each worker is associated with several predicted routine routes, each with a probability. We formulate this problem as a stochastic integer linear program whose goal is to maximize the expected total utility achieved by all workers. We further exploit the separable structure of the formulation and apply the Lagrangian relaxation technique to scale up computation. Experiments were performed on instances generated from the real Singapore transportation network. The results show that we can find significantly better solutions than the deterministic formulation.
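The expected-utility objective over probabilistic routes reduces, per worker-task pair, to weighting the task's utility by the probability that at least one of the worker's routine routes covers the task location. A minimal sketch (the function name and the toy numbers are illustrative assumptions):

```python
def expected_task_utility(route_probs, route_covers, task_utility):
    """Expected utility of recommending a task to a worker whose routine
    route is uncertain.

    route_probs  -- probability of each candidate route (sums to 1)
    route_covers -- route_covers[r] is True if route r passes the task
    task_utility -- utility earned when the worker can reach the task
    """
    p_reach = sum(p for p, c in zip(route_probs, route_covers) if c)
    return p_reach * task_utility

# Worker with three routine routes; only the first and third pass the task.
eu = expected_task_utility([0.5, 0.3, 0.2], [True, False, True], 10.0)
```

Summing such terms over all worker-task pairs, with binary assignment variables, yields the expected-total-utility objective of the stochastic integer linear program.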