Arrow Research search

Author name cluster

Bo Bai

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers
1 author row

Possible papers (3)

JBHI · Journal Article · 2026

LELN: A Large Language Model-Dynamically Enhanced Learning Network for Patient Similarity Calculation

  • Zhichao Zhu
  • Bo Bai
  • Jianqiang Li
  • Han Wang
  • Rui Li
  • Lan Lan

The rapid expansion of Electronic Medical Record (EMR) data has advanced AI-driven patient similarity computation, a key technology for intelligent healthcare. However, existing methods remain constrained by heterogeneous EMR formats and the difficulty of integrating domain knowledge. While graph-based approaches show promise, they still struggle with these issues. To address this, we propose a Large Language Model-Dynamically Enhanced Learning Network (LELN), leveraging LLMs' commonsense knowledge and reasoning to dynamically structure EMR data and enhance medical knowledge integration. LELN integrates two LLM-based modules: DS-EE (DeepSeek-Event Extraction) extracts medical events to construct structured EMR event graphs, and DS-KB (DeepSeek-Knowledge Base) infers disease-relevant knowledge to augment feature representations. The model employs a dual-stage spatial-temporal feature aggregation strategy: a Graph Attention Network captures intra- and inter-event dependencies, followed by a Bidirectional Long Short-Term Memory (BiLSTM) network with attention to model temporal disease progression. Additionally, a clinical prior-guided attention mechanism emphasizes discriminative diagnostic features, improving clinical relevance. Extensive experiments on heterogeneous datasets (a real-world Chinese dataset and the public MIMIC-III) show that LELN outperforms baselines, achieving F1 scores of 87.66% and 85.95%, demonstrating its robustness and accuracy.
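The full model combines a GAT and an attention-augmented BiLSTM; as a minimal stand-in for the idea, the sketch below pools each patient's event embeddings with softmax attention and then scores patient similarity by cosine distance. All names, the shared query vector, and the random toy data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def attention_pool(events, query):
    """Softmax-attention pooling over per-event embeddings (rows of `events`)."""
    scores = events @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ events  # attention-weighted average of event embeddings

def patient_similarity(emb_a, emb_b):
    """Cosine similarity between two pooled patient representations."""
    return float(emb_a @ emb_b / (np.linalg.norm(emb_a) * np.linalg.norm(emb_b)))

# Toy usage: two patients, three medical events each, 4-dim embeddings.
rng = np.random.default_rng(0)
query = rng.normal(size=4)
patient_a = rng.normal(size=(3, 4))
patient_b = rng.normal(size=(3, 4))
sim = patient_similarity(attention_pool(patient_a, query),
                         attention_pool(patient_b, query))
```

In the paper the pooled representation would instead come from GAT message passing over the event graph followed by temporal BiLSTM aggregation; the attention-then-compare pattern is the shared skeleton.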

NeurIPS · Conference Paper · 2025

Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference

  • Weizhi Fei
  • Xueyan Niu
  • Guoqing Xie
  • Yingqing Liu
  • Bo Bai
  • Wei Han

Although applications involving long-context inputs are crucial for the effective utilization of large language models (LLMs), they also incur increased computational costs and reduced performance. To address this challenge, we propose an efficient, training-free prompt compression method that retains key information within compressed prompts. We identify specific attention heads in transformer-based LLMs, which we designate as evaluator heads, that are capable of selecting the tokens in long inputs that are most significant for inference. Building on this discovery, we develop EHPC, an Evaluator Head-based Prompt Compression method, which enables LLMs to rapidly "skim through" input prompts by leveraging only the first few layers with evaluator heads during the pre-filling stage, subsequently passing only the important tokens to the model for inference. EHPC achieves state-of-the-art results across two mainstream benchmarks: prompt compression and long-context inference acceleration. Consequently, it improves performance while reducing the costs associated with commercial API calls, compared with prior prompt compression methods. We further show that EHPC attains competitive results relative to key-value cache-based acceleration methods, highlighting its potential to enhance the efficiency of LLMs for long-context tasks.
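The core mechanism (score tokens by the attention a single designated head pays them, then keep only the top-scoring tokens in their original order) can be sketched in a few lines. The single-head scoring, the `keep_ratio` parameter, and the toy query/key matrices below are illustrative assumptions; the actual method identifies evaluator heads empirically inside a real transformer.

```python
import numpy as np

def evaluator_head_compress(tokens, q, k, keep_ratio=0.5):
    """Keep the tokens that receive the most attention from one
    (hypothetical) evaluator head; q, k are that head's query/key
    projections for the sequence, each of shape (len(tokens), d)."""
    d = q.shape[1]
    logits = q @ k.T / np.sqrt(d)
    attn = np.exp(logits - logits.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)
    importance = attn.mean(axis=0)  # attention each token receives on average
    n_keep = max(1, int(round(len(tokens) * keep_ratio)))
    keep = np.sort(np.argsort(importance)[-n_keep:])  # preserve token order
    return [tokens[i] for i in keep]

# Toy usage: compress a 6-token prompt to half its length.
rng = np.random.default_rng(1)
tokens = ["The", "report", "states", "revenue", "rose", "sharply"]
q = rng.normal(size=(6, 4))
k = rng.normal(size=(6, 4))
compressed = evaluator_head_compress(tokens, q, k, keep_ratio=0.5)
```

Only the retained tokens would then be passed to the full model, which is where the inference-cost savings come from.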

NeurIPS · Conference Paper · 2021

PartialFed: Cross-Domain Personalized Federated Learning via Partial Initialization

  • Benyuan Sun
  • Hongxing Huo
  • Yi Yang
  • Bo Bai

The burst of applications empowered by massive data has aroused unprecedented privacy concerns in the AI community. Data confidentiality protection is now a core issue in deep model training. Federated Learning (FL), which enables privacy-preserving training across multiple silos, has gained popularity for its parameter-only communication. However, previous works have shown that FL suffers a significant performance drop when data distributions are heterogeneous among clients, especially when the clients have cross-domain characteristics, such as traffic, aerial, and indoor imagery. To address this challenging problem, we propose a novel idea, PartialFed, which loads a subset of the global model's parameters rather than the entire model used in most previous works. We first validate our algorithm with manually decided loading strategies inspired by various expert priors, named PartialFed-Fix. We then develop PartialFed-Adaptive, which automatically selects a personalized loading strategy for each client. The superiority of our algorithm is demonstrated by new state-of-the-art results on cross-domain federated classification and detection. In particular, solely by initializing a small fraction of layers locally, we improve the performance of FedAvg on Office-Home and UODB by 4.88% and 2.65%, respectively. Further studies show that the adaptive strategy performs significantly better on domains with large deviation, e.g., improving AP50 by 4.03% and 4.89% on aerial and medical image detection compared to FedAvg.
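The partial-initialization idea reduces to a per-layer choice at the start of each round: take this layer from the global model, or keep the client's local copy. A minimal sketch, assuming parameters are stored in a name-keyed dict (the layer names and values below are made up for illustration):

```python
def partial_load(local_params, global_params, load_keys):
    """PartialFed-style initialization sketch: layers named in `load_keys`
    are taken from the global model; every other layer keeps its local
    (personalized) value."""
    return {name: (global_params[name] if name in load_keys else value)
            for name, value in local_params.items()}

# Toy usage: share the feature extractor globally, keep the head local.
global_params = {"backbone.w": [1.0, 1.0], "head.w": [9.0]}
local_params  = {"backbone.w": [0.0, 0.0], "head.w": [5.0]}
client = partial_load(local_params, global_params, load_keys={"backbone.w"})
```

PartialFed-Fix corresponds to choosing `load_keys` once by hand from expert priors; PartialFed-Adaptive would instead learn a per-client selection automatically.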