Arrow Research search

Author name cluster

Irene Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers

Possible papers


AAAI Conference 2026 Conference Paper

MEML-GRPO: Heterogeneous Multi-Expert Mutual Learning for RLVR Advancement

  • Weitao Jia
  • Jinghui Lu
  • Haiyang Yu
  • Siqi Wang
  • Guozhi Tang
  • An-Lan Wang
  • Weijie Yin
  • Dingkang Yang

Recent advances demonstrate that reinforcement learning with verifiable rewards (RLVR) significantly enhances the reasoning capabilities of large language models (LLMs). However, standard RLVR faces challenges with reward sparsity, where zero rewards from consistently incorrect candidate answers provide no learning signal, particularly in challenging tasks. To address this, we propose Multi-Expert Mutual Learning GRPO (MEML-GRPO), an innovative framework that utilizes diverse expert prompts as system prompts to generate a broader range of responses, substantially increasing the likelihood of identifying correct solutions. Additionally, we introduce an inter-expert mutual learning mechanism that facilitates knowledge sharing and transfer among experts, further boosting the model’s performance through RLVR. Extensive experiments across multiple reasoning benchmarks show that MEML-GRPO delivers significant improvements, achieving an average performance gain of 4.89% with Qwen and 11.33% with Llama, effectively overcoming the core limitations of traditional RLVR methods.
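For readers skimming the abstract, a minimal sketch of the control flow it describes may help: several expert system prompts each generate a group of candidate answers, verifiable rewards become group-relative (GRPO-style) advantages, and experts whose group found no correct answer receive an imitation signal from the strongest expert. The prompts, reward function, and helper names below are illustrative stand-ins, not the paper's implementation.

```python
"""Illustrative sketch of the multi-expert idea behind MEML-GRPO.

NOT the authors' code: the expert prompts, the reward function, and the
advantage computation are simplified stand-ins showing the control flow
described in the abstract.
"""
import random
import statistics

EXPERT_PROMPTS = [  # hypothetical "diverse expert" system prompts
    "You are a careful step-by-step mathematician.",
    "You are a programmer who reasons by writing code.",
    "You are a teacher who explains with simple examples.",
]

def generate(system_prompt: str, question: str) -> str:
    """Stand-in for an LLM call: returns a candidate answer string."""
    return f"{system_prompt[:10]}... answer={random.randint(0, 3)}"

def verifiable_reward(answer: str, gold: str) -> float:
    """Binary verifiable reward: 1.0 if the extracted answer matches gold."""
    return 1.0 if answer.endswith(f"answer={gold}") else 0.0

def meml_grpo_step(question: str, gold: str, k: int = 4):
    # 1) Each expert prompt generates a group of k candidate responses.
    groups = {p: [generate(p, question) for _ in range(k)] for p in EXPERT_PROMPTS}
    rewards = {p: [verifiable_reward(a, gold) for a in g] for p, g in groups.items()}

    # 2) GRPO-style advantage: reward minus the group mean (per expert).
    advantages = {p: [r - statistics.mean(rs) for r in rs] for p, rs in rewards.items()}

    # 3) Mutual learning: experts whose group found no correct answer get an
    #    imitation target from the strongest expert (knowledge transfer).
    best_expert = max(rewards, key=lambda p: max(rewards[p]))
    teachers = {
        p: max(zip(rewards[best_expert], groups[best_expert]))[1]
        for p in EXPERT_PROMPTS
        if max(rewards[p]) == 0.0
    }
    return advantages, teachers

if __name__ == "__main__":
    adv, teach = meml_grpo_step("What is 1 + 2?", gold="3")
    print("advantages per expert:", adv)
    print("experts needing a teacher signal:", list(teach))
```

The point of the multi-expert step is that at least one prompt is more likely to surface a verifiably correct answer, which restores a non-zero learning signal on problems where a single prompt would only produce incorrect candidates.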

JBHI Journal 2025 Journal Article

Benchmarking Foundation Models with Multimodal Public Electronic Health Records

  • Kunyu Yu
  • Rui Yang
  • Jingchi Liao
  • Siqi Li
  • Huitao Li
  • Irene Li
  • Yifan Peng
  • Rishikesan Kamaleswaran

Foundation models have emerged as a powerful approach for processing electronic health records (EHRs), offering flexibility to handle diverse medical data modalities. In this study, we present a comprehensive benchmark that evaluates the performance, fairness, and interpretability of foundation models, both as unimodal encoders and as multimodal learners, using the publicly available MIMIC-IV database. To support consistent and reproducible evaluation, we developed a standardized data processing pipeline that harmonizes heterogeneous clinical records into an analysis-ready format. We systematically compared twelve foundation models, encompassing both unimodal and multimodal models, as well as domain-specific and general-purpose variants. Our findings demonstrate that incorporating multiple data modalities generally improves predictive performance without introducing additional bias. While domain-specific fine-tuning offers a cost-effective solution for unimodal foundation models, this effectiveness does not translate well to multimodal scenarios. Additionally, our experiments reveal limited task generalizability in current large vision-language models (LVLMs), emphasizing the need for more versatile and robust medical LVLMs. Through this benchmark, we aim to support the development of effective and trustworthy multimodal artificial intelligence (AI) systems for real-world clinical applications.
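As a rough illustration of the benchmarking recipe described above (unimodal encoders versus a simple multimodal learner, scored for both accuracy and a basic fairness gap), here is a toy late-fusion comparison. The synthetic features, the fusion-by-concatenation choice, and the group-gap metric are assumptions made for this sketch; they are not the paper's pipeline or the MIMIC-IV schema.

```python
"""Toy comparison of unimodal vs. late-fusion multimodal prediction,
reporting AUROC and a crude group fairness gap. All data is synthetic."""
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000
notes_emb = rng.normal(size=(n, 32))   # stand-in: text-encoder embeddings
labs_emb = rng.normal(size=(n, 16))    # stand-in: structured-EHR embeddings
group = rng.integers(0, 2, size=n)     # sensitive attribute for the fairness gap
logits = notes_emb[:, 0] + labs_emb[:, 0] + 0.3 * group
y = (logits + rng.normal(scale=0.5, size=n) > 0).astype(int)

def evaluate(x, name):
    x_tr, x_te, y_tr, y_te, g_tr, g_te = train_test_split(
        x, y, group, test_size=0.3, random_state=0
    )
    clf = LogisticRegression(max_iter=1000).fit(x_tr, y_tr)
    p = clf.predict_proba(x_te)[:, 1]
    auroc = roc_auc_score(y_te, p)
    # crude fairness check: AUROC gap between the two groups
    gap = abs(
        roc_auc_score(y_te[g_te == 0], p[g_te == 0])
        - roc_auc_score(y_te[g_te == 1], p[g_te == 1])
    )
    print(f"{name:<22} AUROC={auroc:.3f}  group-gap={gap:.3f}")

evaluate(notes_emb, "unimodal: notes")
evaluate(labs_emb, "unimodal: labs")
evaluate(np.hstack([notes_emb, labs_emb]), "multimodal: fusion")
```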

AAAI Conference 2023 Conference Paper

Diffuser: Efficient Transformers with Multi-Hop Attention Diffusion for Long Sequences

  • Aosong Feng
  • Irene Li
  • Yuang Jiang
  • Rex Ying

Efficient Transformers have been developed for long sequence modeling, due to their subquadratic memory and time complexity. Sparse Transformers are a popular approach to improving the efficiency of Transformers by restricting self-attention to locations specified by predefined sparse patterns. However, leveraging sparsity may sacrifice expressiveness compared to full attention when important token correlations are multiple hops away. To combine the advantages of both, the efficiency of sparse Transformers and the expressiveness of full-attention Transformers, we propose Diffuser, a new state-of-the-art efficient Transformer. Diffuser incorporates all token interactions within one attention layer while maintaining low computation and memory costs. The key idea is to expand the receptive field of sparse attention using Attention Diffusion, which computes multi-hop token correlations based on all paths between corresponding disconnected tokens, in addition to attention among neighboring tokens. Theoretically, we show the expressiveness of Diffuser as a universal sequence approximator for sequence-to-sequence modeling, and investigate its ability to approximate full attention by analyzing the graph expander property from the spectral perspective. Experimentally, we investigate the effectiveness of Diffuser with extensive evaluations, including language modeling, image modeling, and Long Range Arena (LRA). Evaluation results show that Diffuser achieves average improvements of 0.94% on text classification tasks and 2.30% on LRA, with 1.67x memory savings compared to state-of-the-art benchmarks, which demonstrates the superior performance of Diffuser in both expressiveness and efficiency.
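The core mechanism in the abstract, expanding a sparse attention pattern to multi-hop interactions, can be illustrated with a small numpy sketch: start from a row-normalized local-window attention matrix and propagate values along increasingly long paths by applying it repeatedly, mixing the hops with decaying weights. The window size, number of hops, and weighting below are illustrative choices, not the paper's configuration.

```python
"""Minimal numpy sketch of "attention diffusion" over a sparse pattern."""
import numpy as np

def sparse_local_attention(q, k, window=2):
    """Row-normalized attention restricted to a +/-window band (sparse pattern)."""
    n = q.shape[0]
    scores = q @ k.T / np.sqrt(q.shape[1])
    mask = np.abs(np.subtract.outer(np.arange(n), np.arange(n))) <= window
    scores = np.where(mask, scores, -np.inf)
    scores -= scores.max(axis=-1, keepdims=True)
    a = np.exp(scores)
    return a / a.sum(axis=-1, keepdims=True)

def attention_diffusion(q, k, v, hops=3, alpha=0.5):
    """Multi-hop diffusion: mix A v, A^2 v, ..., A^hops v with decaying weights.

    A single sparse step only mixes neighboring tokens; taking powers of A
    lets information flow along multi-hop paths, emulating longer-range
    interactions without materializing a dense attention matrix.
    """
    a = sparse_local_attention(q, k)
    out = np.zeros_like(v)
    state, weight, total = v.copy(), 1.0, 0.0
    for _ in range(hops):
        state = a @ state          # one more hop of propagation
        weight *= alpha
        out += weight * state
        total += weight
    return out / total

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 16, 8
    q, k, v = (rng.normal(size=(n, d)) for _ in range(3))
    print(attention_diffusion(q, k, v).shape)  # (16, 8)
```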

AAAI Conference 2019 Conference Paper

ScisummNet: A Large Annotated Corpus and Content-Impact Models for Scientific Paper Summarization with Citation Networks

  • Michihiro Yasunaga
  • Jungo Kasai
  • Rui Zhang
  • Alexander R. Fabbri
  • Irene Li
  • Dan Friedman
  • Dragomir R. Radev

Scientific article summarization is challenging: large, annotated corpora are not available, and the summary should ideally include the paper's impacts on the research community. This paper provides novel solutions to these two challenges. We 1) develop and release the first large-scale manually-annotated corpus for scientific papers (on computational linguistics) by enabling faster annotation, and 2) propose summarization methods that integrate the authors' original highlights (abstract) and the article's actual impacts on the community (citations) to create comprehensive, hybrid summaries. We conduct experiments to demonstrate the efficacy of our corpus in training data-driven models for scientific paper summarization and the advantage of our hybrid summaries over abstracts and traditional citation-based summaries. Our large annotated corpus and hybrid methods provide a new framework for scientific paper summarization research.
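To make the "hybrid" idea concrete, here is a toy extractive sketch: score each sentence of a paper by how strongly it is supported both by the authors' abstract and by sentences from citing papers, then extract the top-scoring ones. The TF-IDF scoring scheme and the tiny example texts are invented for illustration; they are not the ScisummNet corpus or models.

```python
"""Toy extractive illustration of hybrid (abstract + citation) salience."""
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

paper_sentences = [
    "We propose a graph-based model for citation-aware summarization.",
    "The annotation tool was implemented as a web interface.",
    "Experiments show the model outperforms abstract-only baselines.",
]
abstract = ["We propose a citation-aware summarization model and show it outperforms baselines."]
citing_sentences = [
    "Their graph-based model is a strong baseline for scientific summarization.",
    "They report gains over abstract-only systems.",
]

vec = TfidfVectorizer().fit(paper_sentences + abstract + citing_sentences)
S = vec.transform(paper_sentences)
sim_abstract = cosine_similarity(S, vec.transform(abstract)).max(axis=1)
sim_citations = cosine_similarity(S, vec.transform(citing_sentences)).max(axis=1)

# Hybrid salience: content the authors highlight AND the community cites.
salience = 0.5 * sim_abstract + 0.5 * sim_citations
ranked = sorted(zip(salience, paper_sentences), reverse=True)
print("top hybrid-summary sentence:", ranked[0][1])
```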

AAAI Conference 2019 Conference Paper

What Should I Learn First: Introducing LectureBank for NLP Education and Prerequisite Chain Learning

  • Irene Li
  • Alexander R. Fabbri
  • Robert R. Tung
  • Dragomir R. Radev

Recent years have witnessed the rising popularity of Natural Language Processing (NLP) and related fields such as Artificial Intelligence (AI) and Machine Learning (ML). Many online courses and resources are available even for those without a strong background in the field. Often the student is curious about a specific topic but does not quite know where to begin studying. To answer the question of “what should one learn first,” we apply an embedding-based method to learn prerequisite relations for course concepts in the domain of NLP. We introduce LectureBank, a publicly available dataset containing 1,352 English lecture files collected from university courses, each classified according to an existing taxonomy, as well as 208 manually-labeled prerequisite relation topics. The dataset will be useful for educational purposes such as lecture preparation and organization as well as applications such as reading list generation. Additionally, we experiment with neural graph-based networks and non-neural classifiers to learn these prerequisite relations from our dataset.
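The embedding-based setup the abstract refers to can be sketched as an ordered-pair classification problem: represent each concept with an embedding, build order-sensitive features for a pair (A, B), and train a classifier to predict whether A is a prerequisite of B. The synthetic embeddings, pairs, and labels below are assumptions for illustration; the actual LectureBank features and models differ.

```python
"""Small synthetic sketch of prerequisite-relation classification over concept pairs."""
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_concepts, dim = 50, 16
concept_emb = rng.normal(size=(n_concepts, dim))   # stand-in concept embeddings
# Synthetic "curriculum depth": shallower concepts tend to be prerequisites of deeper ones.
depth = concept_emb[:, 0]

def pair_features(a, b):
    # Concatenate both embeddings plus their difference (order-sensitive).
    return np.concatenate([concept_emb[a], concept_emb[b], concept_emb[b] - concept_emb[a]])

pairs = [(a, b) for a in range(n_concepts) for b in range(n_concepts) if a != b]
X = np.stack([pair_features(a, b) for a, b in pairs])
y = np.array([1 if depth[a] < depth[b] - 0.5 else 0 for a, b in pairs])

clf = LogisticRegression(max_iter=2000).fit(X, y)
print("train accuracy on synthetic prerequisite pairs:", round(clf.score(X, y), 3))
```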