Arrow Research search

Author name cluster

Kun Jin

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers
2 author rows

Possible papers

10

AAAI Conference 2026 Conference Paper

Addressing Polarization and Unfairness in Performative Prediction

  • Kun Jin
  • Tian Xie
  • Yang Liu
  • Xueru Zhang

In many real-world applications of machine learning—such as recommendations, hiring, and lending—deployed models influence the data they are trained on, leading to feedback loops between predictions and data distribution. The performative prediction (PP) framework captures this phenomenon by modeling the data distribution as a function of the deployed model. While prior work has focused on finding performative stable (PS) solutions for robustness, their societal impacts, particularly regarding fairness, remain underexplored. We show that PS solutions can lead to severe polarization and prediction performance disparities, and that conventional fairness interventions from prior work often fail under model-dependent distribution shifts because they violate the PS criteria. To address these challenges in PP, we introduce novel fairness mechanisms that provably ensure both stability and fairness, validated by theoretical analysis and empirical results.

NeurIPS Conference 2025 Conference Paper

Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization

  • Tao Zhang
  • Cheng Da
  • Kun Ding
  • Huan Yang
  • Kun Jin
  • Yan Li
  • Tingting Gao
  • Di Zhang

Preference optimization for diffusion models aims to align them with human preferences for images. Previous methods typically use Vision-Language Models (VLMs) as pixel-level reward models to approximate human preferences. However, when used for step-level preference optimization, these models face challenges in handling noisy images of different timesteps and require complex transformations into pixel space. In this work, we show that pre-trained diffusion models are naturally suited for step-level reward modeling in the noisy latent space, as they are explicitly designed to process latent images at various noise levels. Accordingly, we propose the Latent Reward Model (LRM), which repurposes components of the diffusion model to predict preferences of latent images at arbitrary timesteps. Building on LRM, we introduce Latent Preference Optimization (LPO), a step-level preference optimization method conducted directly in the noisy latent space. Experimental results indicate that LPO significantly improves the model's alignment with general, aesthetic, and text-image alignment preferences, while achieving a 2.5-28x training speedup over existing preference optimization methods.

NeurIPS Conference 2024 Conference Paper

Consistency Purification: Effective and Efficient Diffusion Purification towards Certified Robustness

  • Yiquan Li
  • Zhongzhu Chen
  • Kun Jin
  • Jiongxiao Wang
  • Jiachen Lei
  • Bo Li
  • Chaowei Xiao

Diffusion Purification, purifying noised images with diffusion models, has been widely used for enhancing certified robustness via randomized smoothing. However, existing frameworks often grapple with the balance between efficiency and effectiveness. While the Denoising Diffusion Probabilistic Model (DDPM) offers an efficient single-step purification, it falls short in ensuring purified images reside on the data manifold. Conversely, the Stochastic Diffusion Model effectively places purified images on the data manifold but demands solving cumbersome stochastic differential equations, while its derivative, the Probability Flow Ordinary Differential Equation (PF-ODE), though solving simpler ordinary differential equations, still requires multiple computational steps. In this work, we demonstrate that an ideal purification pipeline should, in a single step for efficiency, generate purified images that lie on the data manifold and are semantically aligned with the original images for effectiveness. We therefore introduce Consistency Purification, a purifier that is Pareto-superior to previous work on the efficiency-effectiveness trade-off. Consistency Purification employs the consistency model, a one-step generative model distilled from the PF-ODE, and can thus generate on-manifold purified images with a single network evaluation. However, the consistency model is not designed for purification and does not inherently ensure semantic alignment between purified and original images. To resolve this issue, we further refine it through Consistency Fine-tuning with LPIPS loss, which enables better semantic alignment while keeping the purified images on the data manifold. Our comprehensive experiments demonstrate that our Consistency Purification framework achieves state-of-the-art certified robustness and efficiency compared to baseline methods.

AAAI Conference 2024 Conference Paper

Performative Federated Learning: A Solution to Model-Dependent and Heterogeneous Distribution Shifts

  • Kun Jin
  • Tongxin Yin
  • Zhongzhu Chen
  • Zeyu Sun
  • Xueru Zhang
  • Yang Liu
  • Mingyan Liu

We consider a federated learning (FL) system consisting of multiple clients and a server, where the clients aim to collaboratively learn a common decision model from their distributed data. Unlike the conventional FL framework that assumes the clients' data is static, we consider scenarios where the clients' data distributions may be reshaped by the deployed decision model. In this work, we leverage the idea of distribution shift mappings in performative prediction to formalize this model-dependent data distribution shift and propose a performative FL framework. We first introduce necessary and sufficient conditions for the existence of a unique performative stable solution and characterize its distance to the performative optimal solution. Then we propose the performative FedAvg algorithm and show that it converges to the performative stable solution at a rate of O(1/T) under both full and partial participation schemes. In particular, we use novel proof techniques and show how the clients' heterogeneity influences the convergence. Numerical results validate our analysis and provide valuable insights into real-world applications.
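As a rough intuition for the fixed point the abstract describes, here is a scalar toy sketch. The linear distribution map `mu_i + eps * theta` and the mean-estimation objective are illustrative assumptions, not the paper's general setting: each round, every client observes data whose mean depends on the currently deployed model, fits a local estimator, and the server averages the local solutions.

```python
def performative_fedavg(base_means, eps=0.3, rounds=50, theta=0.0):
    """Toy performative-FedAvg loop: each client i observes data whose
    mean is shifted by the deployed model (mu_i + eps * theta), fits a
    local mean estimator, and the server averages the local solutions."""
    for _ in range(rounds):
        local = [mu + eps * theta for mu in base_means]  # model-dependent shift
        theta = sum(local) / len(local)                  # FedAvg aggregation
    return theta
```

In this toy, retraining is a contraction whenever `eps < 1`, so the loop converges geometrically to the performative stable point `mean(base_means) / (1 - eps)`, which in general differs from the performative optimum.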

NeurIPS Conference 2024 Conference Paper

User-Creator Feature Polarization in Recommender Systems with Dual Influence

  • Tao Lin
  • Kun Jin
  • Andrew Estornell
  • Xiaoying Zhang
  • Yiling Chen
  • Yang Liu

Recommender systems serve the dual purpose of presenting relevant content to users and helping content creators reach their target audience. The dual nature of these systems naturally influences both users and creators: users' preferences are affected by the items they are recommended, while creators may be incentivized to alter their content to attract more users. We define a model, called user-creator feature dynamics, to capture the dual influence of recommender systems. We prove that a recommender system with dual influence is guaranteed to polarize, causing diversity loss in the system. We then investigate, both theoretically and empirically, approaches for mitigating polarization and promoting diversity in recommender systems. Unexpectedly, we find that common diversity-promoting approaches do not work in the presence of dual influence, while relevancy-optimizing methods like top-$k$ truncation can prevent polarization and improve diversity of the system.
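For intuition on the top-$k$ truncation the abstract mentions, a minimal sketch follows; the cosine-similarity relevance score and the raw feature vectors are illustrative assumptions, not the paper's model. The idea is to recommend only the $k$ creators most relevant to the user instead of spreading exposure across all creators.

```python
import math

def topk_truncation(user_vec, creator_vecs, k=2):
    """Keep only the k creators most relevant to the user, ranked by
    cosine similarity; the remaining creators are never recommended."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) *
                      math.sqrt(sum(x * x for x in b)))
    ranked = sorted(range(len(creator_vecs)),
                    key=lambda i: cos(user_vec, creator_vecs[i]),
                    reverse=True)
    return ranked[:k]
```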

ICLR Conference 2023 Conference Paper

DensePure: Understanding Diffusion Models for Adversarial Robustness

  • Chaowei Xiao
  • Zhongzhu Chen
  • Kun Jin
  • Jiongxiao Wang
  • Weili Nie
  • Mingyan Liu
  • Anima Anandkumar
  • Bo Li 0026

Diffusion models have been recently employed to improve certified robustness through the process of denoising. However, the theoretical understanding of why diffusion models are able to improve certified robustness is still lacking, preventing further improvement. In this study, we close this gap by analyzing the fundamental properties of diffusion models and establishing the conditions under which they can enhance certified robustness. This deeper understanding allows us to propose a new method, DensePure, designed to improve the certified robustness of a pretrained model (i.e., a classifier). Given an (adversarial) input, DensePure consists of multiple runs of denoising via the reverse process of the diffusion model (with different random seeds) to get multiple reversed samples, which are then passed through the classifier, followed by majority voting of inferred labels to make the final prediction. This design of using multiple runs of denoising is informed by our theoretical analysis of the conditional distribution of the reversed sample. Specifically, when the data density of a clean sample is high, its conditional density under the reverse process in a diffusion model is also high; thus sampling from the latter conditional distribution can purify the adversarial example and return the corresponding clean sample with high probability. By using the highest density point in the conditional distribution as the reversed sample, we identify the robust region of a given instance under the diffusion model's reverse process. We show that this robust region is a union of multiple convex sets, and is potentially much larger than the robust regions identified in previous works. In practice, DensePure can approximate the label of the high density region in the conditional distribution so that it can enhance certified robustness.
We conduct extensive experiments to demonstrate the effectiveness of DensePure by evaluating its certified robustness given a standard model via randomized smoothing. We show that DensePure is consistently better than existing methods on ImageNet, with 7% improvement on average.
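The multiple-runs-plus-majority-vote procedure described above can be sketched as follows; the `denoise` and `classify` callables are hypothetical stand-ins for the diffusion model's reverse process and the pretrained classifier, not the paper's implementation.

```python
from collections import Counter
import random

def dense_pure_predict(x, denoise, classify, n_runs=5, seed=0):
    """Run the stochastic reverse process n_runs times with different
    seeds, classify each purified sample, and majority-vote the labels."""
    labels = []
    for i in range(n_runs):
        rng = random.Random(seed + i)            # a distinct seed per run
        labels.append(classify(denoise(x, rng)))
    return Counter(labels).most_common(1)[0][0]  # most frequent label wins
```

Because each denoising run is stochastic, the vote over several reversed samples approximates the label of the high-density region of the conditional distribution rather than trusting any single run.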

EAAI Journal 2022 Journal Article

A novel multi-modal analysis model with Baidu Search Index for subway passenger flow forecasting

  • Kun Jin
  • Shaolong Sun
  • Hongtao Li
  • Fengting Zhang

With the boom of big data, the Internet contains more and more personal behavior information, but this information is difficult to extract effectively. A model with multivariate processing capability must be constructed to deal with such time series with complex characteristics. In this paper, a novel hybrid model embedding the Baidu Search Index is therefore proposed to implement multi-step-ahead subway passenger flow forecasting. Firstly, we collect data from the informative Baidu Search Index, reduce its dimensionality, and screen out the powerful predictors by statistical analysis. Secondly, we extract matching common modes at similar time scales between the subway passenger flow and the screened Baidu Search Index via multivariate mode decomposition optimized by a multi-objective algorithm. Furthermore, to eliminate pseudo statistical causality, we select the optimal combination of modal components between subway passenger flow and its corresponding Baidu Search Index at each time scale by an innovative multi-modal analysis strategy. Thirdly, we reconstruct the forecasting values of each selected optimal combination as the final results. The empirical results for Beijing, Shanghai and Guangzhou show that the proposed model significantly outperforms six benchmark models in both level and directional accuracy. Thus, introducing the Baidu Search Index creates a sound opportunity to enhance subway passenger flow forecasting.

ICML Conference 2022 Conference Paper

Fairness Interventions as (Dis)Incentives for Strategic Manipulation

  • Xueru Zhang
  • Mohammad Mahdi Khalili
  • Kun Jin
  • Parinaz Naghizadeh
  • Mingyan Liu

Although machine learning (ML) algorithms are widely used to make decisions about individuals in various domains, concerns have arisen that (1) these algorithms are vulnerable to strategic manipulation and "gaming the algorithm"; and (2) ML decisions may exhibit bias against certain social groups. Existing works have largely examined these as two separate issues, e.g., by focusing on building ML algorithms robust to strategic manipulation, or on training a fair ML algorithm. In this study, we set out to understand the impact they each have on the other, and examine how to characterize fair policies in the presence of strategic behavior. The strategic interaction between a decision maker and individuals (as decision takers) is modeled as a two-stage (Stackelberg) game; when designing an algorithm, the former anticipates the latter may manipulate their features in order to receive more favorable decisions. We analytically characterize the equilibrium strategies of both, and examine how the algorithms and their resulting fairness properties are affected when the decision maker is strategic (anticipates manipulation), as well as the impact of fairness interventions on equilibrium strategies. In particular, we identify conditions under which anticipation of strategic behavior may mitigate/exacerbate unfairness, and conditions under which fairness interventions can serve as (dis)incentives for strategic manipulation.

AAAI Conference 2021 Conference Paper

Multi-Scale Games: Representing and Solving Games on Networks with Group Structure

  • Kun Jin
  • Yevgeniy Vorobeychik
  • Mingyan Liu

Network games provide a natural machinery to compactly represent strategic interactions among agents whose payoffs exhibit sparsity in their dependence on the actions of others. Besides encoding interaction sparsity, however, real networks often exhibit a multi-scale structure, in which agents can be grouped into communities, those communities further grouped, and so on, and where interactions among such groups may also exhibit sparsity. We present a general model of multi-scale network games that encodes such multi-level structure. We then develop several algorithmic approaches that leverage this multi-scale structure, and derive sufficient conditions for convergence of these to a Nash equilibrium. Our numerical experiments demonstrate that the proposed approaches enable orders of magnitude improvements in scalability when computing Nash equilibria in such games. For example, we can solve previously intractable instances involving up to 1 million agents in under 15 minutes.
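As a minimal illustration of equilibrium computation in a network game, here is a sketch of iterated best responses; the linear-quadratic best-response form is an assumption for illustration, not the paper's multi-scale algorithm. When the interaction weights form a contraction, the iteration converges to the unique Nash equilibrium.

```python
def best_response_dynamics(W, b, iters=100):
    """Iterated best responses in a linear-quadratic network game where
    player i's best response to the others' actions a is
    a_i = b_i + sum_j W[i][j] * a[j]; W encodes (sparse) interactions."""
    n = len(b)
    a = [0.0] * n
    for _ in range(iters):
        a = [b[i] + sum(W[i][j] * a[j] for j in range(n)) for i in range(n)]
    return a
```

The multi-scale structure described in the abstract would group rows and columns of `W` into communities so that most cross-community interactions can be summarized at the group level rather than enumerated per agent.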

IJCAI Conference 2021 Conference Paper

Two-Sided Wasserstein Procrustes Analysis

  • Kun Jin
  • Chaoyue Liu
  • Cathy Xia

Learning correspondence between sets of objects is a key component in many machine learning tasks. Recently, Optimal Transport (OT) has been successfully applied to such correspondence problems, and it is appealing as a fully unsupervised approach. However, OT requires that pairwise instances be directly comparable in a common metric space. This limits its applicability when feature spaces are of different dimensions or not directly comparable. In addition, OT only focuses on pairwise correspondence without sensing global transformations. To address these challenges, we propose a new method to jointly learn the optimal coupling between two sets, and the optimal transformations (e.g., rotation, projection and scaling) of each set, based on a two-sided Wasserstein Procrustes analysis (TWP). Since the joint problem is a non-convex optimization problem, we present a reformulation that renders the problem component-wise convex. We then propose a novel algorithm to solve the problem harnessing a Gauss–Seidel method. We further present competitive results of TWP on various applications compared with state-of-the-art methods.