Arrow Research search

Author name cluster

Eunseop Yoon

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
2 author rows

Possible papers (6)

NeurIPS 2025 Conference Paper

A Gradient Guidance Perspective on Stepwise Preference Optimization for Diffusion Models

  • Joshua Tian Jin Tee
  • Hee Suk Yoon
  • Abu Hanif Muhammad Syarubany
  • Eunseop Yoon
  • Chang Yoo

Direct Preference Optimization (DPO) is a key framework for aligning text-to-image models with human preferences, extended by Stepwise Preference Optimization (SPO) to leverage intermediate steps for preference learning, generating more aesthetically pleasing images at significantly lower computational cost. While effective, SPO's underlying mechanisms remain underexplored. In light of this, we critically re-examine SPO by formalizing its mechanism as gradient guidance. This new lens shows that SPO uses biased temporal weighting, giving too little weight to later generative steps, and, unlike likelihood-centric views, it reveals substantial noise in the gradient estimates. Leveraging these insights, our GradSPO algorithm introduces a simplified loss and a targeted, variance-informed noise-reduction strategy, enhancing training stability. Evaluations on SD 1.5 and SDXL show GradSPO substantially outperforms leading baselines in human preference, yielding images with markedly improved aesthetics and semantic faithfulness and leading to more robust alignment. Code and models are available at https://github.com/JoshuaTTJ/GradSPO.

ICLR 2025 Conference Paper

Can Video LLMs Refuse to Answer? Alignment for Answerability in Video Large Language Models

  • Eunseop Yoon
  • Hee Suk Yoon
  • Mark Hasegawa-Johnson
  • Chang D. Yoo

In the broader context of deep learning, Multimodal Large Language Models have achieved significant breakthroughs by leveraging powerful Large Language Models as a backbone to align different modalities into the language space. A prime exemplification is the development of Video Large Language Models (Video-LLMs). While numerous advancements have been proposed to enhance the video understanding capabilities of these models, they are predominantly trained on questions generated directly from video content. However, in real-world scenarios, users often pose questions that extend beyond the informational scope of the video, highlighting the need for Video-LLMs to assess the relevance of the question. We demonstrate that even the best-performing Video-LLMs fail to reject unfit questions, not necessarily due to a lack of video understanding, but because they have not been trained to identify and refuse such questions. To address this limitation, we propose alignment for answerability, a framework that equips Video-LLMs with the ability to evaluate the relevance of a question to the input video and appropriately decline to answer when the question exceeds the video's scope. We also introduce an evaluation framework with a comprehensive set of metrics designed to measure model behavior before and after alignment. Furthermore, we present a pipeline for creating a dataset specifically tailored for alignment for answerability, leveraging existing video-description paired datasets.

ICML 2025 Conference Paper

ConfPO: Exploiting Policy Model Confidence for Critical Token Selection in Preference Optimization

  • Hee Suk Yoon
  • Eunseop Yoon
  • Mark Hasegawa-Johnson
  • Sungwoong Kim
  • Chang D. Yoo

We introduce ConfPO, a method for preference learning in Large Language Models (LLMs) that identifies and optimizes preference-critical tokens based solely on the training policy’s confidence, without requiring any auxiliary models or compute. Unlike prior Direct Alignment Algorithms (DAAs) such as Direct Preference Optimization (DPO), which uniformly adjust all token probabilities regardless of their relevance to preference, ConfPO focuses optimization on the most impactful tokens. This targeted approach improves alignment quality while mitigating overoptimization (i.e., reward hacking) by using the KL divergence budget more efficiently. In contrast to recent token-level methods that rely on credit-assignment models or AI annotators, raising concerns about scalability and reliability, ConfPO is simple, lightweight, and model-free. Experimental results on challenging alignment benchmarks, including AlpacaEval 2 and Arena-Hard, demonstrate that ConfPO consistently outperforms uniform DAAs across various LLMs, delivering better alignment with zero additional computational overhead.
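The abstract's core idea, selecting tokens by the policy's own confidence, can be sketched with a hypothetical criterion: keep only the positions where the policy assigned low probability to the chosen token. The function name, the quantile threshold, and the low-confidence rule are illustrative assumptions of this sketch; the paper's exact selection criterion may differ.

```python
import numpy as np

def select_critical_tokens(token_logprobs, quantile=0.5):
    """Pick candidate preference-critical token positions from a policy's
    per-token log-probabilities. Hypothetical rule: tokens where the policy
    is least confident (probability at or below the batch quantile) are
    kept for preference optimization; all others are left untouched."""
    probs = np.exp(token_logprobs)           # policy confidence per token
    threshold = np.quantile(probs, quantile)
    return np.where(probs <= threshold)[0]   # indices of low-confidence tokens

# Toy example: 6 tokens with varying confidence; the three least
# confident positions are selected.
logps = np.log(np.array([0.9, 0.2, 0.95, 0.1, 0.6, 0.3]))
idx = select_critical_tokens(logps, quantile=0.5)
```

In a full DAA training loop, the per-token preference loss would then be applied only at the returned indices, spending the KL budget where it matters most.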

ICLR 2024 Conference Paper

C-TPT: Calibrated Test-Time Prompt Tuning for Vision-Language Models via Text Feature Dispersion

  • Hee Suk Yoon
  • Eunseop Yoon
  • Joshua Tian Jin Tee
  • Mark Hasegawa-Johnson
  • Yingzhen Li
  • Chang D. Yoo

In deep learning, test-time adaptation has gained attention as a method for model fine-tuning without the need for labeled data. A prime exemplification is the recently proposed test-time prompt tuning for large-scale vision-language models such as CLIP. Unfortunately, these prompts have been mainly developed to improve accuracy, overlooking the importance of calibration, which is a crucial aspect for quantifying prediction uncertainty. However, traditional calibration methods rely on substantial amounts of labeled data, making them impractical for test-time scenarios. To this end, this paper explores calibration during test-time prompt tuning by leveraging the inherent properties of CLIP. Through a series of observations, we find that the prompt choice significantly affects the calibration in CLIP, where the prompts leading to higher text feature dispersion result in better-calibrated predictions. Introducing the Average Text Feature Dispersion (ATFD), we establish its relationship with calibration error and present a novel method, Calibrated Test-time Prompt Tuning (C-TPT), for optimizing prompts during test-time with enhanced calibration. Through extensive experiments on different CLIP architectures and datasets, we show that C-TPT can effectively improve the calibration of test-time prompt tuning without needing labeled data. The code is publicly accessible at https://github.com/hee-suk-yoon/C-TPT.
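The ATFD quantity described above can be sketched as the average distance of the per-class text features from their centroid. The L2 normalization and Euclidean distance used here are assumptions of this sketch, not necessarily the paper's exact definition.

```python
import numpy as np

def average_text_feature_dispersion(text_features):
    """Average distance of (L2-normalized) class text features from their
    centroid. Per the abstract, prompts that yield higher dispersion tend
    to produce better-calibrated CLIP predictions."""
    f = text_features / np.linalg.norm(text_features, axis=1, keepdims=True)
    centroid = f.mean(axis=0)
    return np.mean(np.linalg.norm(f - centroid, axis=1))

# Two hypothetical prompt choices embedding 3 classes: well-spread
# features give a higher ATFD than nearly collinear ones.
tight = np.array([[1.0, 0.0], [0.99, 0.14], [0.98, 0.2]])
spread = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
assert average_text_feature_dispersion(spread) > average_text_feature_dispersion(tight)
```

During test-time prompt tuning, C-TPT would add a term rewarding higher ATFD alongside the usual accuracy-oriented objective, steering the prompt toward better-calibrated solutions.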

AAAI 2024 Conference Paper

SimPSI: A Simple Strategy to Preserve Spectral Information in Time Series Data Augmentation

  • Hyun Ryu
  • Sunjae Yoon
  • Hee Suk Yoon
  • Eunseop Yoon
  • Chang D. Yoo

Data augmentation is a crucial component in training neural networks to overcome the limitation imposed by data size, and several techniques have been studied for time series. Although these techniques are effective in certain tasks, they have yet to be generalized to time series benchmarks. We find that current data augmentation techniques ruin the core information contained within the frequency domain. To address this issue, we propose a simple strategy to preserve spectral information (SimPSI) in time series data augmentation. SimPSI preserves the spectral information by mixing the original and augmented input spectrum weighted by a preservation map, which indicates the importance score of each frequency. Specifically, our experimental contributions are to build three distinct preservation maps: magnitude spectrum, saliency map, and spectrum-preservative map. We apply SimPSI to various time series data augmentations and evaluate its effectiveness across a wide range of time series benchmarks. Our experimental results support that SimPSI considerably enhances the performance of time series data augmentations by preserving core spectral information. The source code used in the paper is available at https://github.com/Hyun-Ryu/simpsi.
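The mixing step described in the abstract can be sketched in a few lines: transform both signals to the frequency domain, blend them with the preservation map, and transform back. The convention that p = 1 keeps the original spectrum at a frequency is an assumption of this sketch; how the map itself is built (magnitude spectrum, saliency map, or spectrum-preservative map) is a separate component.

```python
import numpy as np

def simpsi_mix(x_orig, x_aug, preservation_map):
    """Blend the spectra of the original and augmented signals, weighted
    per frequency by a preservation map p in [0, 1] (p = 1 keeps the
    original spectrum at that bin -- a convention assumed here).
    The map encodes the importance score of each frequency."""
    S_orig = np.fft.rfft(x_orig)
    S_aug = np.fft.rfft(x_aug)
    S_mix = preservation_map * S_orig + (1 - preservation_map) * S_aug
    return np.fft.irfft(S_mix, n=len(x_orig))

# Sanity check: with p = 1 at every frequency, the augmentation is undone.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 64, endpoint=False)
x = np.sin(2 * np.pi * 4 * t)
x_aug = x + 0.5 * rng.standard_normal(64)
p = np.ones(33)  # rfft of a length-64 signal has 64 // 2 + 1 = 33 bins
x_mix = simpsi_mix(x, x_aug, p)
assert np.allclose(x_mix, x)
```

With an intermediate map, only the frequencies flagged as important are restored toward the original, letting the augmentation perturb the rest.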

ICLR 2023 Conference Paper

ESD: Expected Squared Difference as a Tuning-Free Trainable Calibration Measure

  • Hee Suk Yoon
  • Joshua Tian Jin Tee
  • Eunseop Yoon
  • Sunjae Yoon
  • Gwangsu Kim
  • Yingzhen Li
  • Chang D. Yoo

Studies have shown that modern neural networks tend to be poorly calibrated due to over-confident predictions. Traditionally, post-processing methods have been used to calibrate the model after training. In recent years, various trainable calibration measures have been proposed to incorporate them directly into the training process. However, these methods all incorporate internal hyperparameters, and the performance of these calibration objectives relies on tuning these hyperparameters, incurring more computational cost as neural networks and datasets grow larger. As such, we present Expected Squared Difference (ESD), a tuning-free (i.e., hyperparameter-free) trainable calibration objective loss, where we view the calibration error from the perspective of the squared difference between the two expectations. With extensive experiments on several architectures (CNNs, Transformers) and datasets, we demonstrate that (1) incorporating ESD into the training improves model calibration in various batch size settings without the need for internal hyperparameter tuning, (2) ESD yields the best-calibrated results compared with previous approaches, and (3) ESD drastically reduces the computational cost required for calibration during training due to the absence of internal hyperparameters. The code is publicly accessible at https://github.com/hee-suk-yoon/ESD.
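One literal, simplified reading of "the squared difference between the two expectations" compares the expected confidence with the expected accuracy over a batch. This is an illustrative sketch only, not the paper's exact ESD estimator; the function name and batch-level estimates are assumptions.

```python
import numpy as np

def squared_expectation_gap(confidences, correct):
    """Illustrative calibration gap (NOT the paper's exact ESD estimator):
    squared difference between the batch-mean confidence and the batch-mean
    accuracy. A perfectly calibrated model drives this gap to zero, and the
    quantity has no internal hyperparameters to tune."""
    return (np.mean(confidences) - np.mean(correct)) ** 2

# Toy batch where average confidence (0.75) matches average accuracy (0.75),
# so the gap is (numerically) zero.
conf = np.array([0.9, 0.8, 0.7, 0.6])
hits = np.array([1.0, 1.0, 0.0, 1.0])
gap = squared_expectation_gap(conf, hits)
```

Because the quantity is differentiable in the confidences, a term like this can be added directly to the training loss, which is the spirit of a trainable, tuning-free calibration objective.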