Multi-Objective One-Shot Pruning for Large Language Models

Weiyu Chen; Hansi Yang; Yunhao Gou; Han Shi; Enliang Hu; Zhenguo Li; James Kwok

NeurIPS 2025

Conference Paper · Main Conference Track · Artificial Intelligence · Machine Learning

Abstract

Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks but require substantial computational resources, limiting their deployment in resource-constrained environments. While one-shot pruning methods can reduce model size without expensive retraining, they typically optimize for single objectives, ignoring LLMs' multi-faceted applications. We introduce Multi-Objective One-Shot Pruning (MOSP), which formulates LLM pruning as a multi-objective optimization problem. MOSP efficiently generates a Pareto set of pruned models representing different capability trade-offs, allowing users to select solutions aligned with their preferences. The proposed approach identifies a shared core support while enabling specialized support. Experiments across various LLMs and sparsity levels demonstrate MOSP's superior performance in navigating multi-objective trade-offs compared to baseline methods.
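
To make the notion of a Pareto set of pruned models concrete, the following is a minimal sketch of generic Pareto filtering followed by preference-based selection. It is not the MOSP algorithm, which the abstract does not specify. The function names (`pareto_front`, `select`), the candidate names, and the two-objective loss values are all hypothetical and chosen only for illustration; lower loss is assumed to be better on every objective.

```python
from typing import Dict, List, Sequence, Tuple

Losses = Tuple[float, ...]


def dominates(a: Losses, b: Losses) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Losses]) -> List[str]:
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    ]


def select(scores: Dict[str, Losses], front: List[str], weights: Sequence[float]) -> str:
    """Pick the front member with the lowest preference-weighted loss."""
    return min(front, key=lambda n: sum(w * x for w, x in zip(weights, scores[n])))


# Hypothetical per-objective losses for four pruned models (illustrative values only).
candidates = {
    "model_a": (2.0, 5.0),
    "model_b": (3.0, 3.0),
    "model_c": (5.0, 2.0),
    "model_d": (4.0, 4.0),  # worse than model_b on both objectives, so dominated
}

front = pareto_front(candidates)
choice = select(candidates, front, weights=(0.8, 0.2))
print(front, choice)
```

In this sketch a user who weights the first objective heavily ends up with `model_a`, while reversing the weights to `(0.2, 0.8)` selects `model_c`. A weighted sum is only one simple way to encode preferences and is used here as an assumption, not as the paper's selection rule.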


Context

Venue: Annual Conference on Neural Information Processing Systems
Paper id: 373045219575700742