Author name cluster

Hojun Son

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers
2 author rows

Possible papers (2)

TMLR · 2025 · Journal Article

Quantifying Context Bias in Domain Adaptation for Object Detection

  • Hojun Son
  • Asma A. Almutairi
  • Arpan Kusari

Domain adaptation for object detection (DAOD) has become essential to counter performance degradation caused by distribution shifts between training and deployment domains. However, a critical factor influencing DAOD remains underexplored: context bias resulting from learned foreground–background (FG–BG) association. In this work, we present the first comprehensive empirical and causal analysis specifically targeting context bias in DAOD. We address three key questions regarding FG–BG association in object detection: (a) whether FG–BG association is encoded during training, (b) whether there is a causal relationship between FG–BG association and detection performance, and (c) whether FG–BG association affects DAOD. To examine how models capture FG–BG association, we analyze class-wise and feature-wise performance degradation using background masking and feature perturbation, measured via change in accuracy (defined as drop rate). To explore the causal role of FG–BG association, we apply do-calculus to FG–BG pairs guided by class activation mapping (CAM). To quantify the causal influence of FG–BG association across domains, we propose a novel metric, the Domain Association Gradient, defined as the ratio of drop rate to maximum mean discrepancy (MMD). Through systematic experiments involving background masking, feature-level perturbations, and CAM, we reveal that convolution-based object detection models encode FG–BG association. The association substantially impacts detection performance, particularly under domain shifts where background information significantly diverges. Our results demonstrate that context bias not only exists but also causally undermines the generalization capabilities of object detection models across domains. Furthermore, we validate these findings across multiple models and datasets, including state-of-the-art architectures such as ALDI++. This study highlights the necessity of addressing context bias explicitly in DAOD frameworks, providing insights that pave the way for developing more robust and generalizable object detection systems.
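To make the two quantities in this abstract concrete, here is a minimal PyTorch sketch. The accuracy-based drop rate and the Gaussian-kernel MMD are one plausible reading of the definitions above; every function name, the kernel choice, and the placeholder features are assumptions, not the authors' released code.

import torch

def drop_rate(acc_clean: float, acc_masked: float) -> float:
    # Relative accuracy drop after background masking or feature
    # perturbation; one plausible reading of the abstract's "drop rate".
    return (acc_clean - acc_masked) / acc_clean

def mmd_rbf(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    # Biased MMD^2 estimate with a Gaussian (RBF) kernel between
    # source-domain features x and target-domain features y.
    def k(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def domain_association_gradient(dr: float, mmd: float, eps: float = 1e-8) -> float:
    # The abstract defines the metric as the ratio of drop rate to MMD.
    return dr / (mmd + eps)

# Toy example: random 128-d features standing in for detector activations.
src = torch.randn(256, 128)
tgt = torch.randn(256, 128) + 0.5   # crude stand-in for a domain shift
# 0.81 / 0.62 are placeholder accuracies before and after masking.
dag = domain_association_gradient(drop_rate(0.81, 0.62), mmd_rbf(src, tgt).item())
print(f"Domain Association Gradient: {dag:.3f}")

A larger value under this reading would indicate that performance loss from removing background context is large relative to how far apart the two domains' feature distributions actually are.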

IROS · 2023 · Conference Paper

Lightweight Semantic Segmentation Network for Semantic Scene Understanding on Low-Compute Devices

  • Hojun Son
  • James D. Weiland

Semantic scene understanding is beneficial for mobile robots. Semantic information obtained through onboard cameras can improve robots' navigation performance. However, obtaining semantic information on small mobile robots with constrained power and computation resources is challenging. We propose a new lightweight convolutional neural network comparable to previous semantic segmentation algorithms for mobile applications. Our network achieved 73.06% mIoU on the Cityscapes validation set and 71.8% on the Cityscapes test set. Our model runs at 116 fps at 1024×2048 resolution, 172 fps at 1024×1024, and 175 fps at 720×960 on an NVIDIA GTX 1080. We analyze model size, defined as the sum of the number of floating-point operations and the number of parameters. A smaller model size lets tiny mobile robot systems that must run multiple tasks simultaneously operate efficiently. Our model has the smallest model size among the real-time semantic segmentation networks ranked on the Cityscapes real-time benchmark and other high-performing, lightweight convolutional neural networks. On the CamVid test set, our model achieved an mIoU of 73.29% with Cityscapes pre-training, outperforming other lightweight convolutional neural networks. For mobile applicability, we measured frames per second on several low-compute devices. Our model runs at 35 fps on a Jetson AGX Xavier, 21 fps on a Jetson Xavier NX, and 14 fps on an ASUS ROG gaming phone; 1024×2048 resolution is used for the Jetson devices, and 512×512 for the measurement on the phone. Our network did not use extra datasets such as ImageNet, the coarse Cityscapes annotations, or Mapillary, and we did not use TensorRT to achieve fast inference speed. Compared to other real-time and lightweight CNNs, our model is significantly more efficient while balancing accuracy, inference speed, and model size.
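For a concrete picture of the measurement protocol this abstract describes, here is a rough Python sketch of the model-size score (FLOPs plus parameter count) and a per-resolution fps timing loop. The warm-up length, iteration count, toy network, and the suggestion to obtain FLOPs from an external counter such as fvcore's FlopCountAnalysis are all assumptions, not the paper's actual tooling.

import time
import torch
import torch.nn as nn

def fps(model: nn.Module, height: int, width: int, iters: int = 100) -> float:
    # Average frames per second at one input resolution, mirroring the
    # per-resolution timing the abstract reports (e.g. 1024x2048, 512x512).
    model.eval()
    x = torch.randn(1, 3, height, width)
    with torch.no_grad():
        for _ in range(10):        # warm-up passes, excluded from timing
            model(x)
        start = time.perf_counter()
        for _ in range(iters):
            model(x)
    return iters / (time.perf_counter() - start)

def model_size_score(model: nn.Module, flops: int) -> int:
    # The abstract defines model size as the sum of floating-point
    # operations and parameters; `flops` would come from an external
    # counter (assumption: e.g. fvcore.nn.FlopCountAnalysis).
    return flops + sum(p.numel() for p in model.parameters())

# Toy stand-in network; the real model is the paper's segmentation CNN.
net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Conv2d(8, 19, 1))
print(f"{fps(net, 512, 512):.1f} fps at 512x512")

Folding FLOPs and parameter count into a single score is what lets the paper rank architectures by one efficiency number rather than trading the two off informally.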