Arrow Research search

Author name cluster

Chun Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

21 papers
1 author row

Possible papers

AAAI Conference 2026 Conference Paper

DICE: Distilling Classifier-Free Guidance into Text Embeddings

  • Zhenyu Zhou
  • Defang Chen
  • Can Wang
  • Chun Chen
  • Siwei Lyu

Text-to-image diffusion models are capable of generating high-quality images, but suboptimal pre-trained text representations often result in these images failing to align closely with the given text prompts. Classifier-free guidance (CFG) is a popular and effective technique for improving text-image alignment in the generative process. However, CFG introduces significant computational overhead. In this paper, we present DIstilling CFG by sharpening text Embeddings (DICE) that replaces CFG in the sampling process with half the computational complexity while maintaining similar generation quality. DICE distills a CFG-based text-to-image diffusion model into a CFG-free version by refining text embeddings to replicate CFG-based directions. In this way, we avoid the computational drawbacks of CFG, enabling high-quality, well-aligned image generation at a fast sampling speed. Furthermore, examining the enhancement pattern, we identify the underlying mechanism of DICE that sharpens specific components of text embeddings to preserve semantic information while enhancing fine-grained details. Extensive experiments on multiple Stable Diffusion v1.5 variants, SDXL, and PixArt-α demonstrate the effectiveness of our method.
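The overhead mentioned in the abstract comes from the way classifier-free guidance is usually formulated: each sampling step evaluates the denoiser twice, once with the prompt and once without, and extrapolates between the two predictions. The sketch below illustrates that standard combination rule only; it is not the DICE method. The names `cfg_noise` and `toy_denoiser`, and the list-based vectors, are assumptions made for illustration.

```python
from typing import Callable, List, Optional

Vector = List[float]
Denoiser = Callable[[Vector, Optional[Vector]], Vector]

calls = 0  # counts denoiser evaluations to make the cost visible


def toy_denoiser(x: Vector, cond: Optional[Vector]) -> Vector:
    """Hypothetical stand-in for a diffusion denoiser (not a real model)."""
    global calls
    calls += 1
    if cond is None:
        return list(x)  # unconditional prediction
    return [xi + ci for xi, ci in zip(x, cond)]  # conditional prediction


def cfg_noise(denoiser: Denoiser, x: Vector, cond: Vector, scale: float) -> Vector:
    """Standard CFG rule: eps_uncond + scale * (eps_cond - eps_uncond)."""
    eps_cond = denoiser(x, cond)    # first network evaluation
    eps_uncond = denoiser(x, None)  # second network evaluation
    return [u + scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]


guided = cfg_noise(toy_denoiser, [1.0, 2.0], [0.5, -0.5], scale=2.0)
```

With `scale=1.0` the rule reduces to the conditional prediction alone; a guidance-free sampler of the kind the abstract describes would need only one evaluation per step.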

AAAI Conference 2025 Conference Paper

Advancing Loss Functions in Recommender Systems: A Comparative Study with a Rényi Divergence-Based Solution

  • Shengjia Zhang
  • Jiawei Chen
  • Changdong Li
  • Sheng Zhou
  • Qihao Shi
  • Yan Feng
  • Chun Chen
  • Can Wang

Loss functions play a pivotal role in optimizing recommendation models. Among various loss functions, Softmax Loss (SL) and Cosine Contrastive Loss (CCL) are particularly effective. Their theoretical connections and differences warrant in-depth exploration. This work conducts comprehensive analyses of these losses, yielding significant insights: 1) Common strengths: both can be viewed as augmentations of traditional losses with Distributional Robust Optimization (DRO), enhancing robustness to distributional shifts; 2) Respective limitations: stemming from their use of different distribution distance metrics in DRO optimization, SL exhibits high sensitivity to false negative instances, whereas CCL suffers from low data utilization. To address these limitations, this work proposes a new loss function, DrRL, which generalizes SL and CCL by leveraging Rényi-divergence in DRO optimization. DrRL incorporates the advantageous structures of both SL and CCL, and can be demonstrated to effectively mitigate their limitations. Extensive experiments have been conducted to validate the superiority of DrRL on both recommendation accuracy and robustness.

TMLR Journal 2025 Journal Article

Conditional Image Synthesis with Diffusion Models: A Survey

  • Zheyuan Zhan
  • Defang Chen
  • Jian-Ping Mei
  • Zhenghe Zhao
  • Jiawei Chen
  • Chun Chen
  • Siwei Lyu
  • Can Wang

Conditional image synthesis based on user-specified requirements is a key component in creating complex visual content. In recent years, diffusion-based generative modeling has become a highly effective way for conditional image synthesis, leading to exponential growth in the literature. However, the complexity of diffusion-based modeling, the wide range of image synthesis tasks, and the diversity of conditioning mechanisms present significant challenges for researchers to keep up with rapid developments and to understand the core concepts on this topic. In this survey, we categorize existing works based on how conditions are integrated into the two fundamental components of diffusion-based modeling, i.e., the denoising network and the sampling process. We specifically highlight the underlying principles, advantages, and potential challenges of various conditioning approaches during the training, re-purposing, and specialization stages to construct a desired denoising network. We also summarize six mainstream conditioning mechanisms in the sampling process. All discussions are centered around popular applications. Finally, we pinpoint several critical yet still unsolved problems and suggest some possible solutions for future research.

NeurIPS Conference 2024 Conference Paper

PSL: Rethinking and Improving Softmax Loss from Pairwise Perspective for Recommendation

  • Weiqin Yang
  • Jiawei Chen
  • Xin Xin
  • Sheng Zhou
  • Binbin Hu
  • Yan Feng
  • Chun Chen
  • Can Wang

Softmax Loss (SL) is widely applied in recommender systems (RS) and has demonstrated effectiveness. This work analyzes SL from a pairwise perspective, revealing two significant limitations: 1) the relationship between SL and conventional ranking metrics like DCG is not sufficiently tight; 2) SL is highly sensitive to false negative instances. Our analysis indicates that these limitations are primarily due to the use of the exponential function. To address these issues, this work extends SL to a new family of loss functions, termed Pairwise Softmax Loss (PSL), which replaces the exponential function in SL with other appropriate activation functions. While the revision is minimal, we highlight three merits of PSL: 1) it serves as a tighter surrogate for DCG with suitable activation functions; 2) it better balances data contributions; and 3) it acts as a specific BPR loss enhanced by Distributionally Robust Optimization (DRO). We further validate the effectiveness and robustness of PSL through empirical experiments. The code is available at https://github.com/Tiny-Snow/IR-Benchmark.
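The pairwise view described in the abstract can be illustrated with a small sketch. It assumes one common form of softmax loss in which the denominator runs over sampled negatives only; under that assumption the loss can be rewritten in terms of score differences between each negative and the positive. The functions `softmax_loss`, `pairwise_loss`, and `bounded_surrogate`, the placement of the temperature, and the particular surrogate are illustrative assumptions, not the definitions from the paper or its repository.

```python
import math
from typing import Callable, List


def softmax_loss(pos: float, negs: List[float], tau: float = 1.0) -> float:
    """Assumed form: -log(exp(pos/tau) / sum_j exp(neg_j/tau))."""
    neg_sum = sum(math.exp(s / tau) for s in negs)
    return -math.log(math.exp(pos / tau) / neg_sum)


def pairwise_loss(
    pos: float,
    negs: List[float],
    tau: float = 1.0,
    surrogate: Callable[[float], float] = math.exp,
) -> float:
    """Same loss written over differences d_j = neg_j - pos.

    With surrogate=exp this equals softmax_loss; other positive-valued
    surrogates give a different member of the family.
    """
    return math.log(sum(surrogate(s - pos) ** (1.0 / tau) for s in negs))


def bounded_surrogate(d: float) -> float:
    """Hypothetical bounded replacement for exp, chosen for illustration."""
    return 1.0 + math.tanh(d)


pos_score, neg_scores = 2.0, [1.0, 0.5, 5.0]  # 5.0 mimics a false negative
sl = softmax_loss(pos_score, neg_scores)
pairwise_exp = pairwise_loss(pos_score, neg_scores)
pairwise_bounded = pairwise_loss(pos_score, neg_scores, surrogate=bounded_surrogate)
```

Because the bounded surrogate never exceeds 2, a single high-scoring negative cannot dominate the sum the way it does under the exponential, which is one way to read the sensitivity to false negatives that the abstract describes.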

NeurIPS Conference 2024 Conference Paper

Simple and Fast Distillation of Diffusion Models

  • Zhenyu Zhou
  • Defang Chen
  • Can Wang
  • Chun Chen
  • Siwei Lyu

Diffusion-based generative models have demonstrated their powerful performance across various tasks, but this comes at the cost of slow sampling speed. To achieve both efficient and high-quality synthesis, various distillation-based accelerated sampling methods have been developed recently. However, they generally require time-consuming fine-tuning with elaborate designs to achieve satisfactory performance at a specific number of function evaluations (NFE), making them difficult to employ in practice. To address this issue, we propose Simple and Fast Distillation (SFD) of diffusion models, which simplifies the paradigm used in existing methods and shortens their fine-tuning time by up to 1000×. We begin with a vanilla distillation-based sampling method and boost its performance to state of the art by identifying and addressing several small yet vital factors affecting the synthesis efficiency and quality. Our method can also achieve sampling with variable NFEs using a single distilled model. Extensive experiments demonstrate that SFD strikes a good balance between sample quality and fine-tuning costs in few-step image generation tasks. For example, SFD achieves 4.53 FID (NFE=2) on CIFAR-10 with only 0.64 hours of fine-tuning on a single NVIDIA A100 GPU.

NeurIPS Conference 2023 Conference Paper

OpenGSL: A Comprehensive Benchmark for Graph Structure Learning

  • Zhiyao Zhou
  • Sheng Zhou
  • Bochao Mao
  • Xuanyi Zhou
  • Jiawei Chen
  • Qiaoyu Tan
  • Daochen Zha
  • Yan Feng

Graph Neural Networks (GNNs) have emerged as the de facto standard for representation learning on graphs, owing to their ability to effectively integrate graph topology and node attributes. However, the inherent suboptimal nature of node connections, resulting from the complex and contingent formation process of graphs, presents significant challenges in modeling them effectively. To tackle this issue, Graph Structure Learning (GSL), a family of data-centric learning approaches, has garnered substantial attention in recent years. The core concept behind GSL is to jointly optimize the graph structure and the corresponding GNN models. Despite the proposal of numerous GSL methods, the progress in this field remains unclear due to inconsistent experimental protocols, including variations in datasets, data processing techniques, and splitting strategies. In this paper, we introduce OpenGSL, the first comprehensive benchmark for GSL, aimed at addressing this gap. OpenGSL enables a fair comparison among state-of-the-art GSL methods by evaluating them across various popular datasets using uniform data processing and splitting strategies. Through extensive experiments, we observe that existing GSL methods do not consistently outperform vanilla GNN counterparts. We also find that there is no significant correlation between the homophily of the learned structure and task performance, challenging the common belief. Moreover, we observe that the learned graph structure demonstrates a strong generalization ability across different GNN models, despite the high computational and space consumption. We hope that our open-sourced library will facilitate rapid and equitable evaluation and inspire further innovative research in this field. The code of the benchmark can be found at https://github.com/OpenGSL/OpenGSL.

AAAI Conference 2023 Conference Paper

Robust Sequence Networked Submodular Maximization

  • Qihao Shi
  • Bingyang Fu
  • Can Wang
  • Jiawei Chen
  • Sheng Zhou
  • Yan Feng
  • Chun Chen

In this paper, we study the Robust optimization for sequence Networked submodular maximization (RoseNets) problem. We interweave robust optimization with sequence networked submodular maximization. The elements are connected by a directed acyclic graph and the objective function is not submodular on the elements but on the edges in the graph. Under such a networked submodular scenario, the impact of removing an element from a sequence depends both on its position in the sequence and in the network. This makes the existing robust algorithms inapplicable and calls for new robust algorithms. In this paper, we take the first step to study the RoseNets problem. We design a robust greedy algorithm, which is robust against the removal of an arbitrary subset of the selected elements. The approximation ratio of the algorithm depends both on the number of removed elements and on the network topology. We further conduct experiments on real applications of recommendation and link prediction. The experimental results demonstrate the effectiveness of the proposed algorithm.

AAAI Conference 2021 Conference Paper

Cross-Layer Distillation with Semantic Calibration

  • Defang Chen
  • Jian-Ping Mei
  • Yuan Zhang
  • Can Wang
  • Zhe Wang
  • Yan Feng
  • Chun Chen

Recently proposed knowledge distillation approaches based on feature-map transfer validate that intermediate layers of a teacher model can serve as effective targets for training a student model to obtain better generalization ability. Existing studies mainly focus on particular representation forms for knowledge transfer between manually specified pairs of teacher-student intermediate layers. However, semantics of intermediate layers may vary in different networks and manual association of layers might lead to negative regularization caused by semantic mismatch between certain teacher-student layer pairs. To address this problem, we propose Semantic Calibration for Cross-layer Knowledge Distillation (SemCKD), which automatically assigns proper target layers of the teacher model for each student layer with an attention mechanism. With a learned attention distribution, each student layer distills knowledge contained in multiple layers rather than a single fixed intermediate layer from the teacher model for appropriate cross-layer supervision in training. Consistent improvements over state-of-the-art approaches are observed in extensive experiments with various network architectures for teacher and student models, demonstrating the effectiveness and flexibility of the proposed attention-based soft layer association mechanism for cross-layer distillation.

AAAI Conference 2020 Conference Paper

Fast Adaptively Weighted Matrix Factorization for Recommendation with Implicit Feedback

  • Jiawei Chen
  • Can Wang
  • Sheng Zhou
  • Qihao Shi
  • Jingbang Chen
  • Yan Feng
  • Chun Chen

Recommendation from implicit feedback is a highly challenging task due to the lack of reliable observed negative data. A popular and effective approach for implicit recommendation is to treat unobserved data as negative but downweight their confidence. Naturally, how to assign confidence weights and how to handle the large number of unobserved data are two key problems for implicit recommendation models. However, existing methods either pursue fast learning by manually assigning simple confidence weights, which lacks flexibility and may create empirical bias in evaluating users' preferences, or adaptively infer personalized confidence weights but suffer from low efficiency. To achieve both adaptive weight assignment and efficient model learning, we propose a fast adaptively weighted matrix factorization (FAWMF) based on a variational auto-encoder. The personalized data confidence weights are adaptively assigned with a parameterized neural network (function), and the network can be inferred from the data. Further, to support fast and stable learning of FAWMF, a new batch-based learning algorithm, fBGD, has been developed, which trains on all feedback data while its complexity is linear in the number of observed data. Extensive experiments on real-world datasets demonstrate the superiority of the proposed FAWMF and its learning algorithm fBGD.

AAAI Conference 2020 Conference Paper

Online Knowledge Distillation with Diverse Peers

  • Defang Chen
  • Jian-Ping Mei
  • Can Wang
  • Yan Feng
  • Chun Chen

Distillation is an effective knowledge-transfer technique that uses predicted distributions of a powerful teacher model as soft targets to train a less-parameterized student model. A pre-trained high capacity teacher, however, is not always available. Recently proposed online variants use the aggregated intermediate predictions of multiple student models as targets to train each student model. Although group-derived targets give a good recipe for teacher-free distillation, group members are homogenized quickly with simple aggregation functions, leading to early saturated solutions. In this work, we propose Online Knowledge Distillation with Diverse peers (OKDDip), which performs two-level distillation during training with multiple auxiliary peers and one group leader. In the first-level distillation, each auxiliary peer holds an individual set of aggregation weights generated with an attention-based mechanism to derive its own targets from predictions of other auxiliary peers. Learning from distinct target distributions helps to boost peer diversity for effectiveness of group-based distillation. The second-level distillation is performed to transfer the knowledge in the ensemble of auxiliary peers further to the group leader, i.e., the model used for inference. Experimental results show that the proposed framework consistently gives better performance than state-of-the-art approaches without sacrificing training or inference complexity, demonstrating the effectiveness of the proposed two-level distillation framework.
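The first sentence of the abstract, training a student on the teacher's predicted distribution as soft targets, can be sketched as follows. This shows only the generic soft-target objective, not the two-level OKDDip scheme; the temperature parameter and the function names `softmax` and `distillation_loss` are assumptions drawn from common practice rather than from the abstract.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: List[float],
    student_logits: List[float],
    temperature: float = 4.0,
) -> float:
    """KL(teacher || student) between the two softened distributions."""
    p = softmax(teacher_logits, temperature)  # soft targets
    q = softmax(student_logits, temperature)  # student prediction
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


teacher = [3.0, 1.0, 0.2]
student = [2.0, 1.5, 0.1]
loss = distillation_loss(teacher, student)
```

In the online variants the abstract discusses, the teacher logits would be replaced by an aggregate of peer predictions; how that aggregate is weighted is the part the paper addresses.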

AAAI Conference 2014 Conference Paper

Mapping Users across Networks by Manifold Alignment on Hypergraph

  • Shulong Tan
  • Ziyu Guan
  • Deng Cai
  • Xuzhen Qin
  • Jiajun Bu
  • Chun Chen

Nowadays many people are members of multiple online social networks simultaneously, such as Facebook, Twitter and some other instant messaging circles. But these networks are usually isolated from each other. Mapping common users across these social networks will benefit many applications. Methods based on username comparison perform well for some users; however, they cannot work in the following situations: (a) users choose different usernames in different networks; (b) a unique username corresponds to different individuals. In this paper, we propose to utilize social structures to improve the mapping performance. Specifically, a novel subspace learning algorithm, Manifold Alignment on Hypergraph (MAH), is proposed. Different from traditional semi-supervised manifold alignment methods, we use a hypergraph to model high-order relations here. For a target user in one network, the proposed algorithm ranks all users in the other network by their probabilities of being the corresponding user. Moreover, methods based on username comparison can be incorporated into our algorithm easily to further boost the mapping accuracy. Experimental results have demonstrated the effectiveness of our proposed algorithm in mapping users across networks.

IJCAI Conference 2013 Conference Paper

Harmonious Hashing

  • Bin Xu
  • Jiajun Bu
  • Yue Lin
  • Chun Chen
  • Xiaofei He
  • Deng Cai

Hashing-based fast nearest neighbor search has attracted great attention in both research and industry recently. Many existing hashing approaches encode data with projection-based hash functions and represent each projected dimension with 1 bit. However, the dimensions with high variance hold large energy or information of data but are treated equivalently to dimensions with low variance, which leads to a serious information loss. In this paper, we introduce a novel hashing algorithm called Harmonious Hashing which aims at learning hash functions with low information loss. Specifically, we learn a set of optimized projections to preserve the maximum cumulative energy and meet the constraint of equivalent variance on each dimension as much as possible. In this way, we could minimize the information loss after binarization. Despite its extreme simplicity, our method outperforms many state-of-the-art hashing methods in large-scale and high-dimensional nearest neighbor search experiments.

JBHI Journal 2013 Journal Article

Secure and Lightweight Network Admission and Transmission Protocol for Body Sensor Networks

  • Daojing He
  • Chun Chen
  • Sammy Chan
  • Jiajun Bu
  • Pingxin Zhang

A body sensor network (BSN) is a wireless network of biosensors and a local processing unit, which is commonly referred to as the personal wireless hub (PWH). Personal health information (PHI) is collected by biosensors and delivered to the PWH before it is forwarded to the remote healthcare center for further processing. In a BSN, it is critical to only admit eligible biosensors and PWH into the network. Also, securing the transmission from each biosensor to PWH is essential not only for ensuring safety of PHI delivery, but also for preserving the privacy of PHI. In this paper, we present the design, implementation, and evaluation of a secure network admission and transmission subsystem based on a polynomial-based authentication scheme. The procedures in this subsystem to establish keys for each biosensor are communication efficient and energy efficient. Moreover, based on the observation that an adversary eavesdropping in a BSN faces inevitable channel errors, we propose to exploit the adversary's uncertainty regarding the PHI transmission to update the individual key dynamically and improve key secrecy. In addition to the theoretical analysis that demonstrates the security properties of our system, this paper also reports the experimental results of the proposed protocol on resource-limited sensor platforms, which show the efficiency of our system in practice.

AAAI Conference 2012 Conference Paper

A Bregman Divergence Optimization Framework for Ranking on Data Manifold and Its New Extensions

  • Bin Xu
  • Jiajun Bu
  • Chun Chen
  • Deng Cai

Recently, graph-based ranking algorithms have received considerable interest in the machine learning, computer vision and information retrieval communities. Ranking on data manifold (or manifold ranking, MR) is one of the representative approaches. One of the limitations of manifold ranking is its high computational complexity (O(n^3), where n is the number of samples in the database). In this paper, we cast the manifold ranking into a Bregman divergence optimization framework under which we transform the original MR to an equivalent optimal kernel matrix learning problem. With this new formulation, two effective and efficient extensions are proposed to enhance the ranking performance. Extensive experimental results on two real world image databases show the effectiveness of the proposed approach.
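For context on the cubic cost, the usual formulation of manifold ranking (assumed here, since the abstract does not spell it out) scores samples as f* = (I - alpha * S)^(-1) y, where S is the symmetrically normalized affinity matrix and y marks the query; inverting the n-by-n matrix is what costs O(n^3). The sketch below uses the equivalent fixed-point iteration f <- alpha * S f + y on a toy chain graph. The function name `manifold_rank`, the graph, and the parameter values are illustrative assumptions, and this is the baseline method rather than the Bregman divergence framework the paper proposes.

```python
import math
from typing import List


def manifold_rank(
    W: List[List[float]], y: List[float], alpha: float = 0.5, iterations: int = 200
) -> List[float]:
    """Iterate f <- alpha * S f + y with S = D^(-1/2) W D^(-1/2).

    W is a symmetric, non-negative affinity matrix with a zero diagonal.
    For 0 < alpha < 1 the iteration converges to (I - alpha * S)^(-1) y.
    """
    n = len(W)
    degree = [sum(row) for row in W]
    S = [
        [W[i][j] / math.sqrt(degree[i] * degree[j]) if W[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    f = list(y)
    for _ in range(iterations):
        f = [alpha * sum(S[i][j] * f[j] for j in range(n)) + y[i] for i in range(n)]
    return f


# Toy chain graph 0 - 1 - 2 - 3 with the query at node 0.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
scores = manifold_rank(W, [1.0, 0.0, 0.0, 0.0])
```

On this chain the scores fall off with graph distance from the query, which is the behaviour that ranking along the data manifold is meant to capture.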

AAAI Conference 2012 Conference Paper

Document Summarization Based on Data Reconstruction

  • Zhanying He
  • Chun Chen
  • Jiajun Bu
  • Can Wang
  • Lijun Zhang
  • Deng Cai
  • Xiaofei He

Document summarization is of great value to many real world applications, such as snippet generation for search results and news headline generation. Traditionally, document summarization is implemented by extracting sentences that cover the main topics of a document with minimum redundancy. In this paper, we take a different perspective from data reconstruction and propose a novel framework named Document Summarization based on Data Reconstruction (DSDR). Specifically, our approach generates a summary which consists of those sentences that can best reconstruct the original document. To model the relationship among sentences, we introduce two objective functions: (1) linear reconstruction, which approximates the document by linear combinations of the selected sentences; (2) nonnegative linear reconstruction, which allows only additive, not subtractive, linear combinations. In this framework, the reconstruction error becomes a natural criterion for measuring the quality of the summary. For each objective function, we develop an efficient algorithm to solve the corresponding optimization problem. Extensive experiments on summarization benchmark data sets DUC 2006 and DUC 2007 demonstrate the effectiveness of our proposed approach.

AAAI Conference 2012 Conference Paper

Efficient Online Learning for Large-Scale Sparse Kernel Logistic Regression

  • Lijun Zhang
  • Rong Jin
  • Chun Chen
  • Jiajun Bu
  • Xiaofei He

In this paper, we study the problem of large-scale Kernel Logistic Regression (KLR). A straightforward approach is to apply stochastic approximation to KLR. We refer to this approach as a non-conservative online learning algorithm because it updates the kernel classifier after every received training example, leading to a dense classifier. To improve the sparsity of the KLR classifier, we propose two conservative online learning algorithms that update the classifier in a stochastic manner and generate sparse solutions. With appropriately designed updating strategies, our analysis shows that the two conservative algorithms enjoy theoretical guarantees similar to those of the non-conservative algorithm. Empirical studies on several benchmark data sets demonstrate that, compared to batch-mode algorithms for KLR, the proposed conservative online learning algorithms are able to produce sparse KLR classifiers and achieve similar classification accuracy with significantly shorter training time. Furthermore, both the sparsity and classification accuracy of our methods are comparable to those of the online kernel SVM.

AAAI Conference 2011 Conference Paper

Social Recommendation Using Low-Rank Semidefinite Program

  • Jianke Zhu
  • Hao Ma
  • Chun Chen
  • Jiajun Bu

The most critical challenge for a recommendation system is to achieve high prediction quality on the large-scale sparse data contributed by users. In this paper, we present a novel approach to the social recommendation problem, which takes advantage of graph Laplacian regularization to capture the underlying social relationships among users. Unlike previous approaches, which are based on conventional gradient descent optimization, we formulate the graph Laplacian regularized social recommendation problem as a low-rank semidefinite program, which can be solved efficiently by a quasi-Newton algorithm. We have conducted an empirical evaluation on a large-scale dataset of high sparsity; the promising experimental results show that our method is very effective and efficient for the social recommendation task.

AAAI Conference 2010 Conference Paper

G-Optimal Design with Laplacian Regularization

  • Chun Chen
  • Zhengguang Chen
  • Jiajun Bu
  • Can Wang
  • Lijun Zhang
  • Cheng Zhang

In many real world applications, labeled data are usually expensive to get, while there may be a large amount of unlabeled data. To reduce the labeling cost, active learning attempts to discover the most informative data points for labeling. Recently, Optimal Experimental Design (OED) techniques have attracted an increasing amount of attention. OED is concerned with the design of experiments that minimizes variances of a parameterized model. Typical design criteria include D-, A-, and E-optimality. However, all these criteria are based on an ordinary linear regression model which aims to minimize the empirical error, whereas the geometrical structure of the data space is not well respected. In this paper, we propose a novel optimal experimental design approach for active learning, called Laplacian G-Optimal Design (LapGOD), which considers both discriminating and geometrical structures. By using Laplacian Regularized Least Squares, which incorporates manifold regularization into linear regression, our proposed algorithm selects those data points that minimize the maximum variance of the predicted values on the data manifold. We also extend our algorithm to the nonlinear case by using the kernel trick. The experimental results on various image databases have shown that our proposed LapGOD active learning algorithm can significantly enhance the classification accuracy if the selected data points are used as training data.

AAAI Conference 2010 Conference Paper

Modeling Dynamic Multi-Topic Discussions in Online Forums

  • Hao Wu
  • Jiajun Bu
  • Chun Chen
  • Can Wang
  • Guang Qiu
  • Lijun Zhang
  • Jianfeng Shen

In the form of topic discussions, users interact with each other to share knowledge and exchange information in online forums. Modeling the evolution of topic discussions reveals how information propagates on the Internet and can thus help understand sociological phenomena and improve the performance of applications such as recommendation systems. In this paper, we argue that a user’s participation in topic discussions is motivated by either her friends or her own preferences. Inspired by the theory of information flow, we propose dynamic topic discussion models by mining influential relationships between users and individual preferences. Reply relations of users are exploited to construct the fundamental influential social network. The properties of the discussed topics and a time lapse factor are also considered in our modeling. Furthermore, we propose a novel measure called ParticipationRank to rank users according to how important they are in the social network and to what extent they prefer to participate in the discussion of a certain topic. The experiments show that our model can simulate the evolution of topic discussions well and predict the tendency of users' participation accurately.

IJCAI Conference 2009 Conference Paper

Expanding Domain Sentiment Lexicon through Double Propagation

  • Guang Qiu
  • Bing Liu
  • Jiajun Bu
  • Chun Chen

In most sentiment analysis applications, the sentiment lexicon plays a key role. However, it is hard, if not impossible, to collect and maintain a universal sentiment lexicon for all application domains because different words may be used in different domains. The main existing technique extracts such sentiment words from a large domain corpus based on different conjunctions and the idea of sentiment coherency in a sentence. In this paper, we propose a novel propagation approach that exploits the relations between sentiment words and topics or product features that the sentiment words modify, and also sentiment words and product features themselves to extract new sentiment words. As the method propagates information through both sentiment words and features, we call it double propagation. The extraction rules are designed based on relations described in dependency trees. A new method is also proposed to assign polarities to newly discovered sentiment words in a domain. Experimental results show that our approach is able to extract a large number of new sentiment words. The polarity assignment method is also effective.