Arrow Research search

Author name cluster

Bin Qian

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

2 papers
1 author row

Possible papers


AAAI Conference 2026 Conference Paper

FIRM-MoE: Fine-Grained Expert Decomposition for Resource-Adaptive MoE Inference

  • Keyu Chen
  • Qihang Zhou
  • Bin Qian
  • Zhenyu Wen
  • Wenchao Meng
  • Shibo He

Mixture-of-Experts (MoE) is a sparse neural architecture that significantly increases model capacity while maintaining low computational complexity. However, deploying MoE-based large language models (LLMs) on memory-constrained edge devices remains challenging due to their substantial memory requirements. To address this issue, we propose FIRM-MoE, a fine-grained expert offloading framework designed to enable flexible and efficient MoE inference. The core insight of our approach is to reduce the risk of inaccurate expert loading by decomposing each expert into fine-grained sub-experts and then dynamically allocating them through a fine-grained scheduling strategy. To further reduce the error in expert loading, we introduce a multi-layer expert prediction mechanism and a resource-adaptive expert pre-loading algorithm to enable more robust expert allocation. This design allows our model to achieve more efficient expert utilization and improved resilience to prediction errors. We conduct extensive experiments to demonstrate the superiority of FIRM-MoE across diverse memory constraints. The results show that FIRM-MoE achieves up to 1.5× speedup and 2.8× memory savings in decoding, compared to state-of-the-art MoE offloading strategies.
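As a rough illustration of the decomposition idea in the abstract above, the sketch below splits one feed-forward expert along its hidden dimension into sub-experts whose outputs sum to the original expert's output. The abstract does not say how FIRM-MoE decomposes experts, so the split rule, the ReLU activation, and the names `expert_forward` and `split_expert` are assumptions made for illustration, not the paper's method.

```python
# Hypothetical sketch: split a two-layer ReLU feed-forward expert into
# sub-experts along the hidden dimension. Not taken from the FIRM-MoE paper.

def expert_forward(x, w_in, w_out):
    """Compute the expert output for input vector x.

    w_in has one row per hidden unit (input weights);
    w_out has one row per hidden unit (that unit's output weights).
    """
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    out_dim = len(w_out[0])
    return [sum(h * w_out[j][k] for j, h in enumerate(hidden)) for k in range(out_dim)]

def split_expert(w_in, w_out, n_parts):
    """Partition the hidden units into contiguous groups (assumed split rule)."""
    size = -(-len(w_in) // n_parts)  # ceiling division
    return [(w_in[i:i + size], w_out[i:i + size]) for i in range(0, len(w_in), size)]

# Toy expert: 2 inputs, 4 hidden units, 2 outputs.
w_in = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.0], [2.0, 1.0]]
w_out = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 2.0]]
x = [1.0, 2.0]

full = expert_forward(x, w_in, w_out)
subs = split_expert(w_in, w_out, 2)
partial = [expert_forward(x, wi, wo) for wi, wo in subs]

# Summing the sub-expert outputs recovers the full expert output.
recombined = [sum(p[k] for p in partial) for k in range(len(full))]
```

Because the activation acts on each hidden unit independently, the sum over sub-experts is exact in this toy setting, so a scheduler could in principle load and evaluate each sub-expert separately.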

EAAI Journal 2024 Journal Article

A multidimensional probabilistic model based evolutionary algorithm for the energy-efficient distributed flexible job-shop scheduling problem

  • Zi-Qi Zhang
  • Ying Li
  • Bin Qian
  • Rong Hu
  • Jian-Bo Yang

With escalating environmental effects, the spotlight on low-carbon manufacturing has garnered significant attention. The rise of distributed production has emerged as a prominent trend in response to the imperatives of economic globalization. This article focuses on addressing the energy-efficient distributed flexible job-shop scheduling problem (EE_DFJSP), with the aim of minimizing both makespan and total energy consumption (TEC) simultaneously. The production process contains four pivotal phases: 1) job assignment in distributed factories; 2) machine selection within factories; 3) operation allocation on flexible machines; and 4) machine speed adjustment for processing. Given the problem's multi-phase and strong coupling characteristics, it is imperative to develop a promising evolutionary algorithm (EA) for EE_DFJSP. To tackle this challenge, we propose a multidimensional probabilistic model-based EA (MPMEA) paradigm. First, problem-specific encoding and decoding schemes are developed based on the solution features of EE_DFJSP. Second, a hybrid initialization strategy incorporating four heuristic rules is devised to yield an initial population with diversity. Third, an effective union probabilistic model (UPM) is formulated to learn promising patterns from superior solutions, and an efficient sampling strategy is designed to produce high-quality offspring individuals. To achieve a balance between global exploration and local exploitation, problem-specific multiple neighborhood operators are proposed to perform an in-depth local search. Furthermore, a two-stage energy-saving speed adjustment strategy is designed for the superior solutions obtained through local search. Finally, computational comparisons and simulation studies are conducted to validate the effectiveness and superiority of the MPMEA in effectively addressing EE_DFJSP.
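The abstract does not define the union probabilistic model (UPM), so the following is only a generic estimation-of-distribution sketch for the first phase (job assignment to factories): it counts how often elite solutions assign each job to each factory, then samples offspring from those frequencies. The function names, the Laplace smoothing, and the toy data are all assumptions for illustration, not the MPMEA design.

```python
import random

# Hypothetical sketch of a per-job probabilistic model learned from elite
# solutions. Not the UPM from the paper, whose form the abstract omits.

def learn_model(elites, n_factories, smoothing=1.0):
    """Estimate P(job j -> factory f) from elite assignments (smoothed counts)."""
    n_jobs = len(elites[0])
    model = []
    for j in range(n_jobs):
        counts = [smoothing] * n_factories
        for sol in elites:
            counts[sol[j]] += 1.0
        total = sum(counts)
        model.append([c / total for c in counts])
    return model

def sample_solution(model, rng):
    """Draw one factory index per job from the learned distribution."""
    return [rng.choices(range(len(p)), weights=p)[0] for p in model]

# Toy elite set: each solution maps 4 jobs to one of 2 factories.
elites = [[0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 1, 1]]
model = learn_model(elites, n_factories=2)

rng = random.Random(0)  # fixed seed so runs are repeatable
offspring = [sample_solution(model, rng) for _ in range(5)]
```

The smoothing term keeps every factory reachable for every job, so sampling does not collapse onto the elite assignments and some exploration is preserved.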