Author name cluster

Chenyou Fan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers

2 author rows

AAAI Conference 2026 Conference Paper

A Novel Fine-Tuned CLIP-OOD Detection Method with Double Loss Constraint Through Optimal Transport Semantic Alignment

Hengyang Lu
Xin Guo
Shuai Feng
Wenyu Jiang
Yuntao Du
Chang Xia
Chenyou Fan

Detecting Out-Of-Distribution (OOD) samples in image classification is crucial for model reliability. With the rise of Vision-Language Models (VLMs), CLIP-OOD has become a research hotspot. However, we observe a Low Focus Attention phenomenon in the image encoders of CLIP: the attention of the image encoders often spreads to non-in-distribution regions. This phenomenon stems from semantic misalignment and inter-class feature confusion. To address these issues, we propose a novel fine-tuned OOD detection method with a Double loss constraint based on Optimal Transport (DOT-OOD). DOT-OOD integrates a Double Loss Constraint (DLC) module and an Optimal Transport (OT) module. The DLC module comprises the Aligned Image-Text Concept Matching Loss and the Negative Sample Repulsion Loss, which respectively (1) focus on the core semantics of in-distribution (ID) images and achieve cross-modal semantic alignment, and (2) expand inter-class distances and enhance discriminability. The OT module is introduced to obtain enhanced image feature representations. Extensive experimental results show that in the 16-shot scenario of the ImageNet-1k benchmark, DOT-OOD reduces the FPR95 by over 10% and improves the AUROC from 94.48% to 96.57% compared with SOTAs.
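
The two metrics quoted in this abstract, FPR95 and AUROC, can be illustrated with a minimal sketch. It assumes the common convention that a higher score means "more in-distribution" and that ID samples are the positive class; the function names and toy scores are hypothetical and are not taken from the paper.

```python
import math


def fpr_at_tpr(id_scores, ood_scores, tpr=0.95):
    """Fraction of OOD samples accepted at the threshold that keeps `tpr` of ID samples.

    Assumption: higher score = more in-distribution.
    """
    ranked = sorted(id_scores, reverse=True)
    # Smallest number of top-scoring ID samples needed to reach the target TPR.
    k = math.ceil(tpr * len(ranked))
    threshold = ranked[k - 1]
    false_pos = sum(1 for s in ood_scores if s >= threshold)
    return false_pos / len(ood_scores)


def auroc(id_scores, ood_scores):
    """Probability that a random ID sample outscores a random OOD sample (ties count 0.5)."""
    wins = 0.0
    for i in id_scores:
        for o in ood_scores:
            if i > o:
                wins += 1.0
            elif i == o:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


if __name__ == "__main__":
    id_scores = [0.9, 0.8, 0.7, 0.6]     # toy confidence scores for ID images
    ood_scores = [0.65, 0.3, 0.2, 0.1]   # toy confidence scores for OOD images
    print(fpr_at_tpr(id_scores, ood_scores, tpr=0.95))
    print(auroc(id_scores, ood_scores))
```

Lower FPR95 and higher AUROC both indicate better separation of ID and OOD scores. The pairwise AUROC loop is quadratic and is meant only for illustration on small inputs.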

AAAI Conference 2025 Conference Paper

Forward KL Regularized Preference Optimization for Aligning Diffusion Policies

Zhao Shan
Chenyou Fan
Shuang Qiu
Jiyuan Shi
Chenjia Bai

Diffusion models have achieved remarkable success in sequential decision-making by leveraging their highly expressive modeling capabilities in policy learning. A central problem in learning diffusion policies is to align the policy output with human intents across various tasks. To achieve this, previous methods conduct return-conditioned policy generation or Reinforcement Learning (RL)-based policy optimization, but both rely on pre-defined reward functions. In this work, we propose a novel framework, Forward KL regularized Preference optimization for aligning Diffusion policies, to align the diffusion policy with preferences directly. We first train a diffusion policy from the offline dataset without considering the preference, and then align the policy to the preference data via direct preference optimization. During the alignment phase, we formulate direct preference learning in a diffusion policy, where forward KL regularization is employed in preference optimization to avoid generating out-of-distribution actions. We conduct extensive experiments on MetaWorld manipulation and D4RL tasks. The results show that our method exhibits superior alignment with preferences and outperforms previous state-of-the-art algorithms.

IJCAI Conference 2025 Conference Paper

Heterogeneous Federated Learning with Scalable Server Mixture-of-Experts

Jingang Jiang
Yanzhao Chen
Xiangyang Liu
Haiqi Jiang
Chenyou Fan

Classical Federated Learning (FL) encounters significant challenges when deploying large models on power-constrained clients. To tackle this, we propose an asymmetric FL mechanism that enables the aggregation of compact client models into a comprehensive server model. We design the server model as a Mixture-of-Experts (MoE), where each expert has the same architecture as each client model. This uniformity allows for efficient fusion of the most pertinent client models to update each server expert, based on the measured relevance between each client and server expert. To address the Non-IID data issue, we further optimize the server-side MoE architecture by incorporating a main expert that always activates alongside a set of selectively activated routed experts. This configuration ensures a balance between learning general knowledge and specific data distribution. Our Fed-MoE framework is model-agnostic and has demonstrated notable improvements on vision FL tasks with million-scale ResNet backbones, and language tasks with billion-scale BERT and GPT-2 backbones.
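
The server-side layout described here, one main expert that always fires plus a few selectively routed experts, can be sketched as follows. This is a toy illustration under stated assumptions: experts are plain scalar functions, the main expert's output is added unweighted, and the gate is a softmax renormalised over the top-k routed experts. The name `moe_forward` and these combination rules are hypothetical and are not the paper's Fed-MoE implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, main_expert, routed_experts, gate_logits, top_k=2):
    """Combine an always-active main expert with the top-k routed experts.

    Assumption: the main expert contributes with weight 1 and the routed
    experts share a softmax gate computed over the selected experts only.
    """
    # Indices of the k routed experts with the largest gate logits.
    order = sorted(range(len(routed_experts)),
                   key=lambda i: gate_logits[i], reverse=True)
    top = order[:top_k]
    weights = softmax([gate_logits[i] for i in top])

    out = main_expert(x)  # general knowledge, always used
    for w, i in zip(weights, top):
        out += w * routed_experts[i](x)  # distribution-specific knowledge
    return out, top


if __name__ == "__main__":
    main = lambda x: x
    routed = [lambda x: 2 * x, lambda x: 3 * x, lambda x: -x]
    print(moe_forward(1.0, main, routed, gate_logits=[0.0, 0.0, -5.0], top_k=2))
```

In a real system the gate logits would come from a learned router conditioned on the input, and each expert would be a network with the same architecture as the client models.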

ICML Conference 2025 Conference Paper

Task-Agnostic Pre-training and Task-Guided Fine-tuning for Versatile Diffusion Planner

Chenyou Fan
Chenjia Bai
Zhao Shan
Haoran He
Yang Zhang
Zhen Wang 0004

Diffusion models have demonstrated their capabilities in modeling multi-task trajectories. However, existing multi-task planners or policies typically rely on task-specific demonstrations via multi-task imitation, or require task-specific reward labels to facilitate policy optimization via Reinforcement Learning (RL). Both are costly due to the substantial human effort required to collect expert data or design reward functions. To address these challenges, we aim to develop a versatile diffusion planner capable of leveraging large-scale inferior data that contains task-agnostic sub-optimal trajectories, with the ability to adapt quickly to specific tasks. In this paper, we propose SODP, a two-stage framework that leverages Sub-Optimal data to learn a Diffusion Planner, which is generalizable to various downstream tasks. Specifically, in the pre-training stage, we train a foundation diffusion planner that extracts general planning capabilities by modeling the versatile distribution of multi-task trajectories, which can be sub-optimal and have wide data coverage. Then, for downstream tasks, we adopt RL-based fine-tuning with task-specific rewards to quickly refine the diffusion planner, which aims to generate action sequences with higher task-specific returns. Experimental results from multi-task domains including Meta-World and Adroit demonstrate that SODP outperforms state-of-the-art methods with only a small amount of data for reward-guided fine-tuning.

ECAI Conference 2023 Conference Paper

A Novel Differentiable Rank Learning Method Towards Stock Movement Quantile Forecasting

Chenyou Fan
Hengyang Lu
Aimin Huang

We focus on Stock Movement Forecasting (SMF) using AI techniques to develop modern automated trading systems. Previous studies with deep-learning-based methodology have only considered binary up-or-down trends, ignoring the importance of fine-grained categorization of the stock movements to facilitate decision-making. However, the challenges of SMF arise from the randomness of the global market impacting cross-sectional stocks and the volatility of internal dynamics in each time series. To address these challenges, we present a novel end-to-end learning-to-rank framework that incorporates both market-level and stock-level dynamics. Specifically, we aim to identify cross-sectional stocks that exhibit notable movements at every time step and learn to rank steps with the most significant movements in the temporal dimension. We conduct extensive evaluations of our multi-task learning framework utilizing real-world market data, which demonstrate superior performance when compared to state-of-the-art methods, with improvements in the Gain and Sharpe Ratio by 5–15%.
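
For readers unfamiliar with the Sharpe Ratio reported above, a minimal sketch of the textbook definition follows. The annualisation factor of 252 trading periods, the zero default risk-free rate, and the function name are assumptions made for illustration; the paper may compute the metric differently.

```python
import statistics


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualised Sharpe ratio from a list of per-period returns.

    Assumption: `risk_free` is the per-period risk-free return, and the
    sample standard deviation is used as the risk estimate.
    """
    excess = [r - risk_free for r in returns]
    mean = statistics.fmean(excess)
    std = statistics.stdev(excess)  # sample standard deviation (n - 1)
    return (mean / std) * periods_per_year ** 0.5


if __name__ == "__main__":
    daily_returns = [0.01, -0.005, 0.007, 0.002, -0.003]  # toy data
    print(sharpe_ratio(daily_returns))
```

A higher value means more return per unit of volatility, which is why it is a common yardstick for comparing trading strategies.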

IJCAI Conference 2022 Conference Paper

Private Semi-Supervised Federated Learning

Chenyou Fan
Junjie Hu
Jianwei Huang

We study a federated learning (FL) framework to effectively train models from scarce and unevenly distributed labeled data. We consider a challenging yet practical scenario: a few data sources own a small amount of labeled data, while the large majority of remaining sources own purely unlabeled data. Classical FL requires each client to have enough labeled data for local training, and thus is not applicable in this scenario. In this work, we design an effective federated semi-supervised learning framework (FedSSL) to fully leverage both labeled and unlabeled data sources. We establish a unified data space across all participating agents, so that each agent can generate mixed data samples to boost semi-supervised learning (SSL) while keeping data locality. We further show that FedSSL can integrate differential privacy protection techniques to prevent labeled data leakage at the cost of minimal performance degradation. On SSL tasks with as little as 0.17% and 1% of the MNIST and CIFAR-10 datasets as labeled data, respectively, our approach achieves a 5-20% performance boost over state-of-the-art methods.

NeurIPS Conference 2020 Conference Paper

Projection Robust Wasserstein Distance and Riemannian Optimization

Tianyi Lin
Chenyou Fan
Nhat Ho
Marco Cuturi
Michael Jordan

Projection robust Wasserstein (PRW) distance, or Wasserstein projection pursuit (WPP), is a robust variant of the Wasserstein distance. Recent work suggests that this quantity is more robust than the standard Wasserstein distance, in particular when comparing probability measures in high dimensions. However, it is ruled out for practical application because the optimization model is essentially non-convex and non-smooth, which makes the computation intractable. Our contribution in this paper is to revisit the original motivation behind WPP/PRW, but take the hard route of showing that, despite its non-convexity and non-smoothness, and even despite some hardness results proved by Niles-Weed and Rigollet (2019) in a minimax sense, the original formulation for PRW/WPP can be efficiently computed in practice using Riemannian optimization, yielding in relevant cases better behavior than its convex relaxation. More specifically, we provide three simple algorithms with solid theoretical guarantees on their complexity bounds (one in the appendix), and demonstrate their effectiveness and efficiency by conducting extensive experiments on synthetic and real data. This paper provides a first step toward a computational theory of the PRW distance and provides links between optimal transport and Riemannian optimization.
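
For orientation, the quantity discussed in this abstract is usually written as a max-min problem. The notation below is assumed for illustration and does not appear on this page: $\mu, \nu$ are probability measures on $\mathbb{R}^d$, $\Pi(\mu,\nu)$ is the set of couplings between them, and $\mathrm{St}(d,k)$ is the Stiefel manifold of $d \times k$ matrices with orthonormal columns.

```latex
% PRW / WPP: worst-case k-dimensional projection of the squared 2-Wasserstein distance
\mathcal{P}_k^2(\mu,\nu)
  \;=\; \max_{U \in \mathrm{St}(d,k)} \;
        \min_{\pi \in \Pi(\mu,\nu)}
        \int \bigl\| U^\top (x - y) \bigr\|^2 \, d\pi(x,y)

% Convex relaxation (subspace robust Wasserstein): the order of min and max is swapped
\mathcal{S}_k^2(\mu,\nu)
  \;=\; \min_{\pi \in \Pi(\mu,\nu)} \;
        \max_{U \in \mathrm{St}(d,k)}
        \int \bigl\| U^\top (x - y) \bigr\|^2 \, d\pi(x,y)
```

The outer maximisation over $\mathrm{St}(d,k)$ is what makes the first problem non-convex, and it is also why optimisation on a Riemannian manifold is a natural fit. Since a max-min never exceeds the corresponding min-max, $\mathcal{P}_k^2 \le \mathcal{S}_k^2$.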

AAAI Conference 2017 Conference Paper

Learning Latent Subevents in Activity Videos Using Temporal Attention Filters

A. Piergiovanni
Chenyou Fan
Michael Ryoo

In this paper, we introduce the concept of temporal attention filters, and describe how they can be used for human activity recognition from videos. Many high-level activities are composed of multiple temporal parts (e.g., sub-events) with different duration/speed, and our objective is to make the model explicitly learn such temporal structure using multiple attention filters and benefit from them. Our temporal filters are designed to be fully differentiable, allowing end-to-end training of the temporal filters together with the underlying frame-based or segment-based convolutional neural network architectures. This paper presents an approach for learning a set of optimal static temporal attention filters to be shared across different videos, and extends this approach to dynamically adjust attention filters per testing video using recurrent long short-term memory networks (LSTMs). This allows our temporal attention filters to learn latent sub-events specific to each activity. We experimentally confirm that the proposed concept of temporal attention filters benefits activity recognition, and we visualize the learned latent sub-events.
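
The idea of a differentiable temporal filter can be sketched as a soft weighting over frames. The version below assumes Gaussian-shaped filters parameterised by a center and a width, which is an illustrative choice; the function names and the parameterisation are hypothetical and may differ from the paper's.

```python
import math


def gaussian_filters(num_frames, centers, width):
    """Return one normalised Gaussian weight vector over frames per center.

    Assumption: each filter is a Gaussian bump over frame indices
    0..num_frames-1, normalised so that its weights sum to 1.
    """
    filters = []
    for c in centers:
        raw = [math.exp(-((t - c) ** 2) / (2.0 * width ** 2))
               for t in range(num_frames)]
        total = sum(raw)
        filters.append([w / total for w in raw])
    return filters


def apply_filters(frame_features, filters):
    """Pool per-frame feature vectors into one weighted-sum vector per filter."""
    dim = len(frame_features[0])
    pooled = []
    for weights in filters:
        pooled.append([
            sum(w * frame[d] for w, frame in zip(weights, frame_features))
            for d in range(dim)
        ])
    return pooled


if __name__ == "__main__":
    # Toy video: 5 frames with 2-dimensional per-frame features.
    features = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0], [3.0, 0.0], [4.0, 2.0]]
    filters = gaussian_filters(num_frames=5, centers=[1.0, 3.0], width=0.8)
    print(apply_filters(features, filters))
```

Because the weights are smooth functions of the center and width, the same computation written in an autodiff framework would let those parameters be trained jointly with the frame-level network. Each filter then attends to one temporal segment, which is one way to read the "latent sub-events" in the abstract.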
