Arrow Research Search

Author name cluster

Minsu Lee

Papers possibly associated with this exact author name in Arrow. This page groups case-insensitive exact-name matches and is not a full identity-disambiguation profile.

4 papers
1 author row

Possible papers (4)

AAAI Conference 2025 Conference Paper

Truncated Gaussian Policy for Debiased Continuous Control

  • Ganghun Lee
  • Minji Kim
  • Minsu Lee
  • Byoung-Tak Zhang

In continuous domains, reinforcement learning policies are often based on Gaussian distributions for their generality. However, the unbounded support of a Gaussian policy can bias sampling toward boundary actions in the many continuous control tasks that impose action limits due to physical constraints. This "boundary action bias" can negatively impact training in algorithms such as Proximal Policy Optimization, yet it has been overlooked in much existing research and many applications. In this paper, we revisit the issue with illustrative explanations and analysis from the sampling point of view. We then introduce a truncated Gaussian policy with inherent bounds as a minimal alternative that mitigates the bias. However, we find that the plain truncated Gaussian policy may introduce a counter-bias, preferring interior actions; to balance the two, we ultimately propose a scale-adjusted truncated Gaussian policy, in which the distribution's scale shrinks as its location nears the boundaries. This property makes boundary actions more deterministic than under the plain truncated Gaussian, but still less so than under the original Gaussian. Extensive empirical studies and comparisons on various continuous control tasks demonstrate that truncated Gaussian policies significantly reduce the rate of boundary-action usage, while scale-adjusted ones successfully balance the bias and counter-bias. The scale-adjusted policy generally outperforms the Gaussian policy and shows competitive results compared to other approaches designed to counteract the bias.
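The mechanism in this abstract can be sketched in a few lines. This is a minimal illustration, not the paper's exact formulation: the function names and the particular scale-shrinking rule (scale proportional to the location's normalized distance from the nearest boundary) are assumptions made for clarity.

```python
import random

def sample_truncated_gaussian(mu, sigma, low=-1.0, high=1.0):
    """Rejection-sample from a Gaussian truncated to [low, high].

    Unlike a plain Gaussian clipped to the bounds, samples never
    pile up exactly on the boundaries, avoiding boundary action bias.
    """
    while True:
        a = random.gauss(mu, sigma)
        if low <= a <= high:
            return a

def adjusted_scale(mu, sigma, low=-1.0, high=1.0):
    """Illustrative scale adjustment: shrink sigma as the location mu
    approaches a boundary, so boundary actions become more deterministic
    than under the plain truncated Gaussian."""
    # Normalized distance to the nearest boundary: 0 at a boundary, 0.5 at center.
    dist = min(mu - low, high - mu) / (high - low)
    return sigma * max(2.0 * dist, 1e-3)
```

With this rule the scale is unchanged when the location is centered (`adjusted_scale(0.0, 0.5)` returns `0.5`) and shrinks near the bounds, which is the qualitative behavior the abstract describes.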

AAAI Conference 2024 Conference Paper

Unveiling the Significance of Toddler-Inspired Reward Transition in Goal-Oriented Reinforcement Learning

  • Junseok Park
  • Yoonsung Kim
  • Hee bin Yoo
  • Min Whoo Lee
  • Kibeom Kim
  • Won-Seok Choi
  • Minsu Lee
  • Byoung-Tak Zhang

Toddlers evolve from free exploration with sparse feedback to exploiting prior experience for goal-directed learning with denser rewards. Drawing inspiration from this Toddler-Inspired Reward Transition, we explore the implications of varying reward transitions when incorporated into Reinforcement Learning (RL) tasks. Central to our inquiry is the transition from sparse to potential-based dense rewards, which share optimal strategies regardless of reward changes. Through various experiments, including egocentric navigation and robotic-arm manipulation tasks, we found that proper reward transitions significantly influence sample efficiency and success rates. Of particular note is the efficacy of the toddler-inspired Sparse-to-Dense (S2D) transition. Beyond these performance metrics, using the Cross-Density Visualizer technique, we observed that transitions, especially S2D, smooth the policy loss landscape, promoting wide minima that enhance generalization in RL models.
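The key property the abstract relies on, that sparse and potential-based dense rewards share optimal strategies, comes from classic potential-based reward shaping: adding a term of the form gamma*phi(s') - phi(s) leaves the optimal policy unchanged. A minimal sketch of the S2D idea, assuming a simple `dense` flag for the transition point and a user-supplied potential `phi` (both illustrative, not the paper's interface):

```python
def shaped_reward(r_sparse, phi, s, s_next, gamma=0.99, dense=True):
    """Sparse-to-Dense (S2D) style reward.

    dense=False: early, free-exploration phase with the raw sparse reward.
    dense=True:  later phase adds the potential-based shaping term
                 gamma*phi(s_next) - phi(s), which preserves the
                 optimal policy while densifying the signal.
    """
    if not dense:
        return r_sparse
    return r_sparse + gamma * phi(s_next) - phi(s)
```

For example, with a potential that rewards closeness to a goal at position 5, `phi = lambda s: -abs(5 - s)`, a step from state 0 to state 1 with zero sparse reward and `gamma=1.0` yields a shaped reward of `1.0`, reflecting progress toward the goal.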

AAAI Conference 2021 Conference Paper

DramaQA: Character-Centered Video Story Understanding with Hierarchical QA

  • Seongho Choi
  • Kyoung-Woon On
  • Yu-Jung Heo
  • Ahjeong Seo
  • Youwon Jang
  • Minsu Lee
  • Byoung-Tak Zhang

Despite recent progress in computer vision and natural language processing, building a machine that can understand video stories remains hard due to their intrinsic difficulty. Moreover, research on how to evaluate the degree of video understanding based on the human cognitive process has not yet progressed. In this paper, we propose a novel video question answering (Video QA) task, DramaQA, for comprehensive understanding of video stories. DramaQA focuses on two perspectives: 1) hierarchical QAs as an evaluation metric based on the cognitive developmental stages of human intelligence, and 2) character-centered video annotations to model the local coherence of the story. Our dataset is built upon the TV drama "Another Miss Oh" and contains 17,983 QA pairs from 23,928 video clips of various lengths, with each QA pair belonging to one of four difficulty levels. We provide 217,308 annotated images with rich character-centered annotations, including visual bounding boxes, behaviors and emotions of main characters, and coreference-resolved scripts. Additionally, we suggest a Multi-level Context Matching model that hierarchically understands character-centered representations of video to answer questions. We release our dataset and model publicly for research purposes, and we expect our work to provide a new perspective on video story understanding research.

NeurIPS Conference 2021 Conference Paper

Goal-Aware Cross-Entropy for Multi-Target Reinforcement Learning

  • Kibeom Kim
  • Min Whoo Lee
  • Yoonsung Kim
  • JeHwan Ryu
  • Minsu Lee
  • Byoung-Tak Zhang

Learning in a multi-target environment without prior knowledge about the targets requires a large number of samples and makes generalization difficult. To solve this problem, it is important to be able to discriminate targets through semantic understanding. In this paper, we propose a goal-aware cross-entropy (GACE) loss that can be utilized in a self-supervised way using auto-labeled goal states alongside reinforcement learning. Based on this loss, we then devise goal-discriminative attention networks (GDAN), which utilize goal-relevant information to focus on the given instruction. We evaluate the proposed methods on visual navigation and robot-arm manipulation tasks in multi-target environments and show that GDAN outperforms state-of-the-art methods in task success ratio, sample efficiency, and generalization. Additionally, qualitative analyses demonstrate that our method helps the agent become aware of, and focus clearly on, the given instruction, promoting goal-directed behavior.
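The core idea of an auxiliary goal-aware cross-entropy term can be illustrated without any deep-learning framework. This is a hedged sketch, not the paper's implementation: the `weight` hyperparameter and the way the auxiliary term is combined with the RL loss are assumptions for illustration; in the actual method the labels come from auto-labeled goal states collected during training.

```python
import math

def softmax_cross_entropy(logits, target_idx):
    """Cross-entropy of a softmax over target logits against a
    (self-supervised, auto-labeled) goal class index."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return -math.log(exps[target_idx] / z)

def total_loss(rl_loss, goal_logits, goal_label, weight=0.1):
    """Combine the RL objective with the goal-aware auxiliary term.
    The auxiliary loss pushes the encoder to discriminate targets
    semantically, which the abstract argues aids generalization."""
    return rl_loss + weight * softmax_cross_entropy(goal_logits, goal_label)
```

With two equal logits the cross-entropy is log(2), the usual uniform-prediction baseline for a two-class target.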