AAAI 2026 Conference Paper
HGLTR: Hierarchical Knowledge Injection for Calibrating Pre-trained Models in Long-Tail Recognition
- Jinpeng Zheng
- Shao-Yuan Li
- Gan Xu
- Wenhai Wan
- Zijian Tao
- Songcan Chen
- Kangkan Wang
Long-tail recognition remains challenging for pre-trained foundation models like CLIP, which often suffer performance degradation under imbalanced data. This stems not only from overfitting/underfitting during fine-tuning but, more fundamentally, from the bias inherited from the long-tail distribution of their massive pre-training datasets. To address this, we propose HGLTR (Hierarchy-Guided Long-Tail Recognition), a novel framework that calibrates pre-trained models by injecting objective class-hierarchy knowledge. We argue that the semantic proximity defined by a hierarchy provides a robust, data-independent prior that counteracts model bias. Our method is specifically designed for the dual-modality architecture of vision-language models. At the feature level, we align image embeddings with a hierarchy-guided text similarity structure. At the classifier level, we employ a distillation loss to regularize predictions using soft labels derived from the hierarchy. This dual-level injection effectively transfers knowledge from head to tail classes. Experiments on ImageNet-LT, Places-LT, and iNaturalist 2018 demonstrate that HGLTR achieves state-of-the-art performance, particularly in tail-class accuracy, highlighting the importance of leveraging structural priors to calibrate foundation models for real-world data.
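To make the classifier-level idea concrete, the following is a minimal sketch (not the paper's actual implementation) of deriving soft labels from a class hierarchy and using them as distillation targets. The toy hierarchy, the tree distances, and the temperature value are all illustrative assumptions; the paper's exact soft-label construction may differ.

```python
import numpy as np

# Hypothetical 4-class hierarchy: {animal: [cat, dog], vehicle: [car, bus]}.
# Pairwise tree distances between leaves (siblings = 2, cross-branch = 4).
TREE_DIST = np.array([
    [0, 2, 4, 4],  # cat
    [2, 0, 4, 4],  # dog
    [4, 4, 0, 2],  # car
    [4, 4, 2, 0],  # bus
], dtype=float)

def hierarchy_soft_labels(target, temperature=2.0):
    """Soft label for `target`: classes closer in the hierarchy get more mass."""
    logits = -TREE_DIST[target] / temperature
    exp = np.exp(logits - logits.max())  # stable softmax over negative distances
    return exp / exp.sum()

def distillation_loss(model_log_probs, soft_label):
    """Cross-entropy against hierarchy-derived soft targets
    (equals KL(soft_label || model) up to a constant)."""
    return -np.sum(soft_label * model_log_probs)

# Usage: the soft label for "cat" keeps some probability mass on its
# sibling "dog", regularizing predictions toward semantically close classes.
q = hierarchy_soft_labels(0)
loss = distillation_loss(np.log(np.full(4, 0.25)), q)
```

The key design point this sketch illustrates: because tail classes share hierarchy ancestors with head classes, their soft targets overlap, which is one mechanism by which knowledge can transfer from head to tail.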