Arrow Research search

Author name cluster

Jae-Jun Lee

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers
2 author rows

Possible papers

ICLR Conference 2025 Conference Paper

Can One Modality Model Synergize Training of Other Modality Models?

  • Jae-Jun Lee
  • Sung Whan Yoon

Learning with multiple modalities has recently demonstrated significant gains in many domains by maximizing the shared information across modalities. However, the current approaches strongly rely on high-quality paired datasets, which allow co-training from the paired labels from different modalities. In this context, we raise a pivotal question: Can a model with one modality synergize the training of other models with the different modalities, even without the paired multimodal labels? Our answer is 'Yes'. As a figurative description, we argue that a writer, i.e., a language model, can promote the training of a painter, i.e., a visual model, even without the paired ground truth of text and image. We theoretically argue that a superior representation can be achieved by the synergy between two different modalities without paired supervision. As proofs of concept, we broadly confirm the considerable performance gains from the synergy among visual, language, and audio models. From a theoretical viewpoint, we first establish a mathematical foundation of the synergy between two different modality models, where each one is trained with its own modality. From a practical viewpoint, our work aims to broaden the scope of multimodal learning to encompass the synergistic usage of single-modality models, relieving a strong limitation of paired supervision. The code is available at https://github.com/johnjaejunlee95/synergistic-multimodal.

AAAI Conference 2020 Short Paper

An Automatic Shoplifting Detection from Surveillance Videos (Student Abstract)

  • U-Ju Gim
  • Jae-Jun Lee
  • Jeong-Hun Kim
  • Young-Ho Park
  • Aziz Nasridinov

The use of closed circuit television (CCTV) surveillance devices is increasing every year to prevent abnormal behaviors, including shoplifting. However, damage from shoplifting is also increasing every year. Thus, there is a need for intelligent CCTV surveillance systems that ensure the integrity of shops, despite workforce shortages. In this study, we propose an automatic detection system of shoplifting behaviors from surveillance videos. Instead of extracting features from the whole frame, we use the Region of Interest (ROI) optical-flow fusion network to highlight the necessary features more accurately.