
Author name cluster

Kibum Kim

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers · 2 author rows

Possible papers (6)

AAAI 2025 · Conference Paper

RA-SGG: Retrieval-Augmented Scene Graph Generation Framework via Multi-Prototype Learning

  • Kanghoon Yoon
  • Kibum Kim
  • Jaehyeong Jeon
  • Yeonjun In
  • Donghyun Kim
  • Chanyoung Park

Scene Graph Generation (SGG) research has suffered from two fundamental challenges: the long-tailed predicate distribution and semantic ambiguity between predicates. These challenges lead to a bias towards head predicates in SGG models, favoring dominant general predicates while overlooking fine-grained predicates. In this paper, we address the challenges of SGG by framing it as a multi-label classification problem with partial annotation, where relevant labels of fine-grained predicates are missing. Under this new framing, we propose Retrieval-Augmented Scene Graph Generation (RA-SGG), which identifies potential instances to be multi-labeled and enriches the single label with multi-labels that are semantically similar to the original label by retrieving relevant samples from our established memory bank. Based on the augmented relations (i.e., discovered multi-labels), we apply multi-prototype learning to train our SGG model. Comprehensive experiments demonstrate that RA-SGG outperforms state-of-the-art baselines by up to 3.6% on VG and 5.9% on GQA, particularly in terms of F@K, showing that RA-SGG effectively alleviates the issue of biased prediction caused by the long-tailed distribution and semantic ambiguity of predicates.
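
To make the retrieval-augmented labeling idea concrete, here is a minimal sketch, not the authors' code: a memory bank of relation features is queried by cosine similarity, and a single gold label is enriched with the labels of sufficiently similar neighbors. Every name, threshold, and dimension below is an illustrative assumption.

```python
import numpy as np

def build_memory_bank(features, labels):
    """Store L2-normalized relation features alongside their predicate labels."""
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    return feats, np.asarray(labels)

def augment_labels(query, bank_feats, bank_labels, num_classes, k=5, tau=0.5):
    """Enrich a single-label query with the labels of its k nearest
    memory-bank neighbors whose cosine similarity exceeds tau."""
    q = query / np.linalg.norm(query)
    sims = bank_feats @ q                        # cosine similarities
    topk = np.argsort(sims)[-k:]
    target = np.zeros(num_classes)
    for idx in topk:
        if sims[idx] >= tau:
            target[bank_labels[idx]] = 1.0       # add semantically similar predicate
    return target / max(target.sum(), 1.0)       # soft multi-label target

# Toy usage with random vectors (tau lowered so something is retrieved).
rng = np.random.default_rng(0)
bank_feats, bank_labels = build_memory_bank(rng.normal(size=(100, 16)),
                                            rng.integers(0, 10, size=100))
soft_target = augment_labels(rng.normal(size=16), bank_feats, bank_labels,
                             num_classes=10, tau=0.2)
```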

NeurIPS 2025 · Conference Paper

Training Robust Graph Neural Networks by Modeling Noise Dependencies

  • Yeonjun In
  • Kanghoon Yoon
  • Sukwon Yun
  • Kibum Kim
  • Sungchul Kim
  • Chanyoung Park

In real-world applications, node features in graphs often contain noise from various sources, leading to significant performance degradation in GNNs. Although several methods have been developed to enhance robustness, they rely on the unrealistic assumption that noise in node features is independent of the graph structure and node labels, thereby limiting their applicability. To address this, we introduce a more realistic noise scenario, dependency-aware noise on graphs (DANG), where noise in node features creates a chain of noise dependencies that propagates to the graph structure and node labels. We propose a novel robust GNN, DA-GNN, which captures the causal relationships among variables in the data generating process (DGP) of DANG using variational inference. In addition, we present new benchmark datasets that simulate DANG in real-world applications, enabling more practical research on robust GNNs. Extensive experiments demonstrate that DA-GNN consistently outperforms existing baselines across various noise scenarios, including both DANG and conventional noise models commonly considered in this field.
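
As a rough illustration of the DANG setting, the toy simulation below injects feature noise and lets the same corrupted nodes drive structure and label noise, in contrast to the usual independence assumption. The generative details are my own simplification, not the paper's DGP.

```python
import numpy as np

def simulate_dang(X, A, y, noise_rate=0.2, seed=0):
    """Toy dependency-aware noise: corrupt some node features, then let
    that corruption propagate to the graph structure and the labels of
    the same nodes (rather than corrupting each independently)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    noisy = rng.random(n) < noise_rate            # nodes whose features get noise
    X, A, y = X.copy(), A.copy(), y.copy()
    X[noisy] += rng.normal(scale=2.0, size=X[noisy].shape)
    for i in np.where(noisy)[0]:
        j = rng.integers(n)
        A[i, j] = A[j, i] = 1 - A[i, j]           # structure noise follows feature noise
        y[i] = rng.integers(y.max() + 1)          # label noise on the same node
    return X, A, y

# Toy usage: 50 nodes, 8-dim features, 3 classes, empty adjacency.
rng = np.random.default_rng(1)
X, A, y = rng.normal(size=(50, 8)), np.zeros((50, 50), int), rng.integers(0, 3, 50)
Xn, An, yn = simulate_dang(X, A, y)
```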

ICLR 2025 · Conference Paper

Weakly Supervised Video Scene Graph Generation via Natural Language Supervision

  • Kibum Kim
  • Kanghoon Yoon
  • Yeonjun In
  • Jaehyeong Jeon
  • Jinyoung Moon
  • Dong Hyun Kim
  • Chanyoung Park 0001

Existing Video Scene Graph Generation (VidSGG) studies are trained in a fully supervised manner, which requires all frames in a video to be annotated, thereby incurring a high annotation cost compared to Image Scene Graph Generation (ImgSGG). Although the annotation cost of VidSGG can be alleviated by adopting a weakly supervised approach commonly used for ImgSGG (WS-ImgSGG) that uses image captions, two key obstacles hinder such a naive adoption: 1) Temporality within video captions, i.e., unlike image captions, video captions include temporal markers (e.g., before, while, then, after) that indicate time-related details, and 2) Variability in action duration, i.e., unlike human actions in image captions, human actions in video captions unfold over varying durations. To address these issues, we propose a Natural Language-based Video Scene Graph Generation (NL-VSGG) framework that only utilizes the readily available video captions for training a VidSGG model. NL-VSGG consists of two key modules: a Temporality-aware Caption Segmentation (TCS) module and an Action Duration Variability-aware caption-frame alignment (ADV) module. Specifically, TCS segments the video captions into multiple sentences in temporal order based on a Large Language Model (LLM), and ADV aligns each segmented sentence with appropriate frames considering the variability in action duration. Our approach leads to a significant enhancement in performance compared to simply applying the WS-ImgSGG pipeline to VidSGG on the Action Genome dataset. As a further benefit of utilizing video captions as weak supervision, we show that the VidSGG model trained by NL-VSGG is able to predict a broader range of action classes that are not included in the training data, which makes our framework practical in real-world settings.
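
A crude stand-in for the TCS step, assuming a rule-based split on the temporal markers the abstract lists; the paper uses an LLM, which can also handle the order inversion that markers like "after" require. Everything here is illustrative.

```python
import re

# Temporal markers mentioned in the abstract, used as naive split points.
TEMPORAL_MARKERS = r"\b(?:before|while|then|after|afterwards|next)\b"

def segment_caption(caption: str) -> list[str]:
    """Split one video caption into temporally ordered sentence fragments."""
    parts = re.split(f"(?:,\\s*)?{TEMPORAL_MARKERS}\\s*", caption, flags=re.I)
    return [p.strip(" ,.") for p in parts if p.strip(" ,.")]

print(segment_caption(
    "A man opens the fridge, then pours milk into a glass before drinking it"))
# ['A man opens the fridge', 'pours milk into a glass', 'drinking it']
```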

ICLR 2024 · Conference Paper

Adaptive Self-training Framework for Fine-grained Scene Graph Generation

  • Kibum Kim
  • Kanghoon Yoon
  • Yeonjun In
  • Jinyoung Moon
  • Donghyun Kim 0006
  • Chanyoung Park 0001

Scene graph generation (SGG) models have suffered from inherent problems in the benchmark datasets, such as the long-tailed predicate distribution and missing annotations. In this work, we aim to alleviate the long-tailed problem of SGG by utilizing unannotated triplets. To this end, we introduce a Self-Training framework for SGG (ST-SGG) that assigns pseudo-labels to unannotated triplets, on which the SGG models are then trained. While there has been significant progress in self-training for image recognition, designing a self-training framework for the SGG task is more challenging due to its inherent nature, such as the semantic ambiguity and the long-tailed distribution of predicate classes. Hence, we propose a novel pseudo-labeling technique for SGG, called Class-specific Adaptive Thresholding with Momentum (CATM), which is a model-agnostic framework that can be applied to any existing SGG model. Furthermore, we devise a graph structure learner (GSL) that is beneficial when adopting our proposed self-training framework to state-of-the-art message-passing neural network (MPNN)-based SGG models. Our extensive experiments verify the effectiveness of ST-SGG on various SGG models, particularly in enhancing performance on fine-grained predicate classes.
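
A minimal sketch of the CATM idea, assuming per-class confidence thresholds updated by an exponential moving average of accepted-sample confidence; the class and variable names are mine, not the paper's.

```python
import numpy as np

class ClassAdaptiveThreshold:
    """Class-specific adaptive thresholding with momentum: each predicate
    class keeps its own confidence threshold, nudged toward the mean
    confidence of the pseudo-labels accepted for that class."""
    def __init__(self, num_classes, init=0.3, momentum=0.99):
        self.thresholds = np.full(num_classes, init)
        self.momentum = momentum

    def pseudo_label(self, probs):
        """probs: (N, C) softmax outputs for unannotated triplets.
        Returns class indices, with -1 for rejected (below-threshold) rows."""
        conf, cls = probs.max(axis=1), probs.argmax(axis=1)
        accept = conf >= self.thresholds[cls]
        for c in np.unique(cls[accept]):
            mean_conf = conf[accept & (cls == c)].mean()
            self.thresholds[c] = (self.momentum * self.thresholds[c]
                                  + (1 - self.momentum) * mean_conf)
        return np.where(accept, cls, -1)

# Toy usage on random softmax outputs.
probs = np.random.default_rng(2).dirichlet(np.ones(10), size=32)
catm = ClassAdaptiveThreshold(num_classes=10)
labels = catm.pseudo_label(probs)
```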

AAAI 2023 · Conference Paper

Unbiased Heterogeneous Scene Graph Generation with Relation-Aware Message Passing Neural Network

  • Kanghoon Yoon
  • Kibum Kim
  • Jinyoung Moon
  • Chanyoung Park

Recent scene graph generation (SGG) frameworks have focused on learning complex relationships among multiple objects in an image. Thanks to the nature of the message passing neural network (MPNN) that models high-order interactions between objects and their neighboring objects, MPNNs are the dominant representation learning modules for SGG. However, existing MPNN-based frameworks treat the scene graph as a homogeneous graph, which restricts the context-awareness of visual relations between objects. That is, they overlook the fact that the relations tend to be highly dependent on the objects with which the relations are associated. In this paper, we propose an unbiased heterogeneous scene graph generation (HetSGG) framework that captures relation-aware context using message passing neural networks. We devise a novel message passing layer, called relation-aware message passing neural network (RMP), that aggregates the contextual information of an image considering the predicate type between objects. Our extensive evaluations demonstrate that HetSGG outperforms state-of-the-art methods, especially on tail predicate classes. The source code for HetSGG is available at https://github.com/KanghoonYoon/hetsgg-torch.
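
The sketch below illustrates relation-aware aggregation in the spirit of RMP, assuming one transform per predicate type (similar to a relational GCN); it is a simplification under my own assumptions, not HetSGG's actual layer.

```python
import torch

def relation_aware_message_passing(h, edges, W_rel):
    """One sketch layer: each message passes through a weight matrix chosen
    by the predicate type of its edge, so aggregation is relation-aware
    rather than homogeneous.
    h:     (N, D) object features
    edges: list of (src, dst, rel) triples
    W_rel: (R, D, D) one transform per predicate type
    """
    out = h.clone()
    for src, dst, rel in edges:
        out[dst] = out[dst] + W_rel[rel] @ h[src]   # relation-typed message
    return torch.relu(out)

# Toy usage: 4 objects, 8-dim features, 3 predicate types.
h = torch.randn(4, 8)
W = torch.randn(3, 8, 8) * 0.1
edges = [(0, 1, 2), (2, 1, 0), (3, 0, 1)]
h_next = relation_aware_message_passing(h, edges, W)
```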

JBHI 2022 · Journal Article

Synergy Through Integration of Wearable EEG and Virtual Reality for Mild Cognitive Impairment and Mild Dementia Screening

  • Bohee Lee
  • Taeheon Lee
  • Hyungsin Jeon
  • Songsub Lee
  • Kibum Kim
  • Wanhee Cho
  • Jeonghwan Hwang
  • Yong-Wook Chae

Virtual reality (VR) technologies have shown promising potential in the early diagnosis of dementia by enabling accessible and regular assessment. However, previous VR studies were restricted to the analysis of behavioral responses, so information about degenerated brain dynamics could not be directly acquired. To address this issue, we provide a cognitive impairment (CI) screening tool based on a wearable EEG device integrated into a VR platform. Subjects used a hardware setup consisting of a frontal six-channel EEG device mounted on a VR device and performed four cognitive tasks in VR. Behavioral response profiles and EEG features were extracted during the tasks, and classifiers were trained on the extracted features to differentiate subjects with CI from healthy controls (HCs). Notably, classification performance consistently improved when EEG features measured during the cognitive tasks were included alongside the task scores, compared with using only the task scores or resting-state EEG features, suggesting that our protocol provides discriminative information for screening. These results suggest that integrating EEG devices into a VR framework could emerge as a powerful and synergistic strategy for constructing an easily accessible EEG-based CI screening tool.
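
A schematic of the feature-fusion comparison the abstract describes, on synthetic stand-in data: a classifier trained on task scores alone versus task scores concatenated with task-time EEG features. The dataset shapes, subject count, and classifier choice are assumptions, and random data will of course show no real gain; the point is only the pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: per-subject VR task scores plus frontal
# six-channel EEG band-power features recorded during the tasks.
rng = np.random.default_rng(3)
n = 120
task_scores = rng.normal(size=(n, 4))          # 4 cognitive tasks
eeg_features = rng.normal(size=(n, 6 * 5))     # 6 channels x 5 frequency bands
y = rng.integers(0, 2, size=n)                 # CI vs. healthy control

clf = LogisticRegression(max_iter=1000)
for name, X in [("task scores only", task_scores),
                ("task scores + task-time EEG", np.hstack([task_scores, eeg_features]))]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: {acc:.3f}")
```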