Arrow Research

Author name cluster

Changyi He

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers
2 author rows

Possible papers (2)

ICML 2025 Conference Paper

DA-KD: Difficulty-Aware Knowledge Distillation for Efficient Large Language Models

  • Changyi He
  • Yifu Ding
  • Jinyang Guo
  • Ruihao Gong
  • Haotong Qin
  • Xianglong Liu 0001

Although knowledge distillation (KD) is an effective approach to improve the performance of a smaller LLM (i.e., the student model) by transferring knowledge from a large LLM (i.e., the teacher model), it still suffers from high training cost. Existing LLM distillation methods ignore the difficulty differences among samples, so they spend unnecessary effort distilling easy samples, which leads to high distillation cost. In this paper, we propose a difficulty-aware knowledge distillation (DA-KD) framework for efficient knowledge distillation, in which we dynamically adjust the distillation dataset based on the difficulty of samples. We further observe that existing KD losses perform poorly when most samples in the distillation dataset are difficult, because of unstable optimization and the neglect of hard samples. Therefore, we also propose a new KD loss, the bidirectional discrepancy loss (BDL), for effective KD. Extensive experiments demonstrate that our DA-KD framework is effective and efficient. Without bells and whistles, DA-KD outperforms existing state-of-the-art KD methods by 2% at half the training cost and even surpasses the teacher model with 4.7× compression.
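The abstract names two ingredients: a difficulty-based rule for pruning the distillation set and a bidirectional discrepancy loss. The paper's exact formulations are not given here, so the following is a minimal, hypothetical PyTorch sketch; the symmetric-KL form of `bdl_loss`, the use of per-sample loss as the difficulty score, and the fixed keep threshold are all assumptions for illustration, not the authors' method.

```python
import torch
import torch.nn.functional as F

def bdl_loss(student_logits, teacher_logits, tau=1.0):
    # Symmetric pair of KL terms between teacher and student distributions.
    # NOTE: this symmetric form is an assumption for illustration; the
    # paper's exact BDL definition may differ.
    s = F.log_softmax(student_logits / tau, dim=-1)
    t = F.log_softmax(teacher_logits / tau, dim=-1)
    # F.kl_div(input, target, log_target=True) computes KL(target || input)
    kl_t_s = F.kl_div(s, t, log_target=True, reduction="none").sum(-1)
    kl_s_t = F.kl_div(t, s, log_target=True, reduction="none").sum(-1)
    return kl_t_s + kl_s_t  # per-position loss; also usable as a difficulty signal

def prune_easy_samples(per_sample_loss: torch.Tensor, threshold: float) -> torch.Tensor:
    # Keep indices of samples whose current distillation loss exceeds the
    # threshold, dropping "easy" samples from the distillation set
    # (an illustrative selection rule, not the paper's exact criterion).
    return (per_sample_loss > threshold).nonzero(as_tuple=True)[0]
```

Under these assumptions, a training loop would periodically score each sample with `bdl_loss`, rebuild the distillation set via `prune_easy_samples`, and continue distilling only on the retained hard samples, which is how the dataset shrinks and training cost drops.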

NeurIPS 2024 Conference Paper

LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment

  • Ge Yang
  • Changyi He
  • Jinyang Guo
  • Jianyu Wu
  • Yifu Ding
  • Aishan Liu
  • Haotong Qin
  • Pengliang Ji

Although large language models (LLMs) have demonstrated strong capabilities, their high demand for computation and storage hinders practical application. To this end, many model compression techniques have been proposed to increase the efficiency of LLMs. However, current research validates these methods only on limited models, datasets, and metrics, and still lacks a comprehensive evaluation under more general scenarios, so it remains unclear which model compression approach to use in a specific case. To mitigate this gap, we present the Large Language Model Compression Benchmark (LLMCBench), a rigorously designed benchmark with an in-depth analysis of LLM compression algorithms. We first analyze actual model production requirements and carefully design evaluation tracks and metrics. Then, we conduct extensive experiments and comparisons using multiple mainstream LLM compression approaches. Finally, we perform an in-depth analysis based on the evaluation and provide useful insights for LLM compression design. We hope LLMCBench can offer insightful suggestions for LLM compression algorithm design and serve as a foundation for future research.
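A benchmark of this shape pairs a registry of compression methods with a set of evaluation tracks and fills in the resulting grid of scores. The sketch below only illustrates that structure; the track names, method names, and the `evaluate` interface are invented for illustration and are not LLMCBench's actual API.

```python
from dataclasses import dataclass

@dataclass
class Result:
    method: str
    track: str
    score: float

# Hypothetical registries: real tracks and methods are defined by the
# benchmark itself, not by this sketch.
TRACKS = ["accuracy", "speed", "memory"]
METHODS = {
    "8bit_quantization": lambda model: model,  # placeholder compressors
    "magnitude_pruning": lambda model: model,
}

def evaluate(model, track):
    # Placeholder metric; a real harness would run the track's datasets
    # and aggregate its metrics here.
    return 0.0

def run_benchmark(base_model):
    # Evaluate every compression method on every track, producing the
    # methods-by-tracks grid of scores the abstract describes.
    results = []
    for name, compress in METHODS.items():
        compressed = compress(base_model)  # compress once per method
        for track in TRACKS:
            results.append(Result(name, track, evaluate(compressed, track)))
    return results
```

The point of the grid layout is that no single metric decides the comparison: each method gets a row of scores across tracks, and the cross-track analysis is what yields deployment guidance.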