Arrow Research search

Author name cluster

Pengfei Jin

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
2 author rows

Possible papers

6

AAAI Conference 2026 Conference Paper

Beyond Adapter Retrieval: Latent Geometry-Preserving Composition via Sparse Task Projection

  • Pengfei Jin
  • Peng Shu
  • Sifan Song
  • Sekeun Kim
  • Qing Xiao
  • Cheng Chen
  • Tianming Liu
  • Xiang Li

Recent advances in parameter-efficient transfer learning have demonstrated the utility of composing LoRA adapters from libraries of pretrained modules. However, most existing approaches rely on simple retrieval heuristics or uniform averaging, which overlook the latent structure of task relationships in representation space. We propose a new framework for adapter reuse that moves beyond retrieval, formulating adapter composition as a geometry-aware sparse reconstruction problem. Specifically, we represent each task by a latent prototype vector derived from the base model’s encoder and aim to approximate the target task prototype as a sparse linear combination of retrieved reference prototypes, under an L1-regularized optimization objective. The resulting combination weights are then used to blend the corresponding LoRA adapters, yielding a composite adapter tailored to the target task. This formulation not only preserves the local geometric structure of the task representation manifold, but also promotes interpretability and efficient reuse by selecting a minimal set of relevant adapters. We demonstrate the effectiveness of our approach across multiple domains—including medical image segmentation, medical report generation and image synthesis. Our results highlight the benefit of coupling retrieval with latent geometry-aware optimization for improved zero-shot generalization.

ICML Conference 2025 Conference Paper

Distribution-aware Fairness Learning in Medical Image Segmentation From A Control-Theoretic Perspective

  • Yujin Oh
  • Pengfei Jin
  • Sangjoon Park
  • Sekeun Kim
  • Siyeop Yoon
  • Jinsung Kim
  • Kyungsang Kim
  • Xiang Li 0001

Ensuring fairness in medical image segmentation is critical due to biases in imbalanced clinical data acquisition caused by demographic attributes (e. g. , age, sex, race) and clinical factors (e. g. , disease severity). To address these challenges, we introduce Distribution-aware Mixture of Experts (dMoE), inspired by optimal control theory. We provide a comprehensive analysis of its underlying mechanisms and clarify dMoE’s role in adapting to heterogeneous distributions in medical image segmentation. Furthermore, we integrate dMoE into multiple network architectures, demonstrating its broad applicability across diverse medical image analysis tasks. By incorporating demographic and clinical factors, dMoE achieves state-of-the-art performance on two 2D benchmark datasets and a 3D in-house dataset. Our results highlight the effectiveness of dMoE in mitigating biases from imbalanced distributions, offering a promising approach to bridging control theory and medical image segmentation within fairness learning paradigms. The source code is available at https: //github. com/tvseg/dMoE.

ICLR Conference 2025 Conference Paper

ECHOPulse: ECG Controlled Echocardio-gram Video Generation

  • Yiwei Li 0002
  • Sekeun Kim
  • Zihao Wu 0001
  • Hanqi Jiang
  • Yi Pan 0001
  • Pengfei Jin
  • Sifan Song
  • Yucheng Shi

Echocardiography (ECHO) is essential for cardiac assessments, but its video quality and interpretation heavily relies on manual expertise, leading to inconsistent results from clinical and portable devices. ECHO video generation offers a solution by improving automated monitoring through synthetic data and generating high-quality videos from routine health data. However, existing models often face high computational costs, slow inference, and rely on complex conditional prompts that require experts' annotations. To address these challenges, we propose ECHOPulse, an ECG-conditioned ECHO video generation model. ECHOPulse introduces two key advancements: (1) it accelerates ECHO video generation by leveraging VQ-VAE tokenization and masked visual token modeling for fast decoding, and (2) it conditions on readily accessible ECG signals, which are highly coherent with ECHO videos, bypassing complex conditional prompts. To the best of our knowledge, this is the first work to use time-series prompts like ECG signals for ECHO video generation. ECHOPulse not only enables controllable synthetic ECHO data generation but also provides updated cardiac function information for disease monitoring and prediction beyond ECG alone. Evaluations on three public and private datasets demonstrate state-of-the-art performance in ECHO video generation across both qualitative and quantitative measures. Additionally, ECHOPulse can be easily generalized to other modality generation tasks, such as cardiac MRI, fMRI, and 3D CT generation. We will make the synthetic ECHO dataset, along with the code and model, publicly available upon acceptance.

JBHI Journal 2025 Journal Article

MediViSTA: Medical Video Segmentation Via Temporal Fusion SAM Adaptation for Echocardiography

  • Sekeun Kim
  • Pengfei Jin
  • Cheng Chen
  • Kyungsang Kim
  • Zhiliang Lyu
  • Hui Ren
  • Sunghwan Kim
  • Zhengliang Liu

Despite achieving impressive results in general-purpose semantic segmentation with strong generalization on natural images, the Segment Anything Model (SAM) has shown less precision and stability in medical image segmentation. In particular, the original SAM architecture is designed for 2D natural images and is therefore not support to handle three-dimensional information, which is particularly important for medical imaging modalities that are often volumetric or video data. In this paper, we introduce MediViSTA, a parameter-efficient fine-tuning method designed to adapt the vision foundation model for medical video, with a specific focus on echocardiography segmentation. To achieve spatial adaptation, we propose a frequency feature fusion technique that injects spatial frequency information from a CNN branch. For temporal adaptation, we integrate temporal adapters within the transformer blocks of the image encoder. Using a fine-tuning strategy, only a small subset of pre-trained parameters is updated, allowing efficient adaptation to echocardiography data. The effectiveness of our method has been comprehensively evaluated on three datasets, comprising two public datasets and one multi-center in-house dataset. Our method consistently outperforms various state-of-the-art approaches without using any prompts. Furthermore, our model exhibits strong generalization capabilities on unseen datasets, surpassing the second-best approach by 2. 15% in Dice and 0. 09 in temporal consistency. The results demonstrate the potential of MediViSTA to significantly advance echocardiography video segmentation, offering improved accuracy and robustness in cardiac assessment applications.

NeurIPS Conference 2025 Conference Paper

RODS: Robust Optimization Inspired Diffusion Sampling for Detecting and Reducing Hallucination in Generative Models

  • Yiqi Tian
  • Pengfei Jin
  • Mingze Yuan
  • Na Li
  • Bo Zeng
  • Quanzheng Li

Diffusion models have achieved state-of-the-art performance in generative modeling, yet their sampling procedures remain vulnerable to hallucinations—often stemming from inaccuracies in score approximation. In this work, we reinterpret diffusion sampling through the lens of optimization and introduce RODS (Robust Optimization–inspired Diffusion Sampler), a novel method that detects and corrects high-risk sampling steps using geometric cues from the loss landscape. RODS enforces smoother sampling trajectories and \textit{adaptively} adjusts perturbations, reducing hallucinations without retraining and at minimal additional inference cost. Experiments on AFHQv2, FFHQ, and 11k-hands demonstrate that RODS maintains comparable image quality and preserves generation diversity. More importantly, it improves both sampling fidelity and robustness, detecting over 70\% of hallucinated samples and correcting more than 25\%, all while avoiding the introduction of new artifacts. We release our code at https: //github. com/Yiqi-Verna-Tian/RODS.

NeurIPS Conference 2025 Conference Paper

System-Embedded Diffusion Bridge Models

  • Bartlomiej Sobieski
  • Matthew Tivnan
  • Yuang Wang
  • Siyeop yoon
  • Pengfei Jin
  • Dufan Wu
  • Quanzheng Li
  • Przemyslaw Biecek

Solving inverse problems—recovering signals from incomplete or noisy measurements—is fundamental in science and engineering. Score-based generative models (SGMs) have recently emerged as a powerful framework for this task. Two main paradigms have formed: unsupervised approaches that adapt pretrained generative models to inverse problems, and supervised bridge methods that train stochastic processes conditioned on paired clean and corrupted data. While the former typically assume knowledge of the measurement model, the latter have largely overlooked this structural information. We introduce System-embedded Diffusion Bridge Models (SDBs), a new class of supervised bridge methods that explicitly embed the known linear measurement system into the coefficients of a matrix-valued SDE. This principled integration yields consistent improvements across diverse linear inverse problems and demonstrates robust generalization under system misspecification between training and deployment, offering a promising solution to real-world applications.