
NeurIPS 2025

Let the LLM Stick to Its Strengths: Learning to Route Economical LLM

Conference Paper Main Conference Track Artificial Intelligence · Machine Learning

Abstract

Recently, test-time scaling of Large Language Models (LLMs) has emerged as a practical alternative to parameter and data scaling. Reasoning tasks often require large-scale, RLVR-based LLMs, while more economical LLMs can handle simpler tasks. Routing each query to an LLM suited to it (i.e., matching capability and cost) ensures both usability and efficiency. We introduce LLMRec, which routes each user query to the most suitable LLM without pre-inference over the candidate LLM zoo. It is the first to reframe LLM routing as a full recommendation-system (RecSys) task. Our core insight is that an LLM's suitability for a query is a complex, latent signal analogous to user-item preference. LLMRec systematically engineers features for candidate LLMs (intrinsic attributes and capability distributions), queries (general semantics and meta-dimensional information), and context (inference type, cost budgets). It also incorporates behavioral features to learn high-order interactions. LLMRec is designed to generalize to out-of-domain datasets and to adapt to new LLMs as the model zoo evolves. We define an evaluation metric based on the Pareto frontier under user-specified cost budgets. Across six datasets, LLMRec reduces cost by over 38% on average while maintaining accuracy and consistently outperforming baselines in converging toward the Pareto frontier.
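The two mechanics the abstract leans on, budget-constrained selection over a candidate zoo and a Pareto-frontier metric over (cost, accuracy) trade-offs, can be made concrete. Below is a minimal Python sketch of both under stated assumptions: the feature layout, the plain linear scorer, and the `route` / `pareto_frontier` helpers are hypothetical stand-ins for illustration, not LLMRec's actual architecture or evaluation code.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class CandidateLLM:
    name: str
    cost_per_query: float  # hypothetical unit, e.g. USD per query
    features: np.ndarray   # intrinsic attributes + capability distribution

def route(query_feats: np.ndarray,
          context_feats: np.ndarray,
          candidates: list[CandidateLLM],
          weights: np.ndarray,
          budget: float) -> CandidateLLM:
    """Score every affordable candidate and return the top-scoring one.

    The linear scorer over concatenated (query, LLM, context) features is a
    stand-in for the high-order interaction model LLMRec actually learns.
    """
    affordable = [m for m in candidates if m.cost_per_query <= budget]
    if not affordable:  # assumed fallback: cheapest model if none fit the budget
        return min(candidates, key=lambda m: m.cost_per_query)

    def score(m: CandidateLLM) -> float:
        x = np.concatenate([query_feats, m.features, context_feats])
        return float(weights @ x)

    return max(affordable, key=score)

def pareto_frontier(points: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """Return the (cost, accuracy) points not dominated by any other point.

    A point dominates another if it costs no more, is at least as accurate,
    and is strictly better on one of the two axes.
    """
    frontier: list[tuple[float, float]] = []
    # Sort by ascending cost, breaking ties by descending accuracy, so a
    # same-cost, lower-accuracy point is correctly filtered out.
    for c, a in sorted(points, key=lambda p: (p[0], -p[1])):
        if not frontier or a > frontier[-1][1]:
            frontier.append((c, a))
    return frontier
```

In the paper's setting, the linear scorer would be replaced by the learned RecSys model over the engineered LLM, query, and context features with behavioral interactions; the sketch captures only the budget-constrained argmax over the zoo and the frontier definition used by the metric.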

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue
Annual Conference on Neural Information Processing Systems
Archive span
1987-2025
Indexed papers
30,776
Paper id
133780814704032425