Arrow Research search

Author name cluster

Ming Zhu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
2 author rows

Possible papers

12

EAAI Journal 2026 Journal Article

A dual-stream regional feature learning and adaptive fusion method for electroencephalogram-based emotion recognition

  • Yong Yang
  • Wenhao Wang
  • Kaibo Shi
  • Yuanlun Xie
  • Nan Zhou
  • Shiping Wen
  • Ming Zhu
  • Badong Chen

Electroencephalogram (EEG) has become a research hotspot in emotion recognition due to its high temporal resolution and ability to truly reflect brain activity. However, few existing EEG-based emotion recognition methods integrate brain-region information into their algorithms, and they do not fully extract the deep features of each region. Brain science has shown that different brain regions have different functions and are highly correlated with the production of emotions. In this paper, based on the division of brain regions, a dual-branch regional feature learning and adaptive fusion neural network (DRFNet) is proposed to extract the features of different brain regions and adaptively fuse them, thereby achieving accurate EEG emotion recognition. Specifically, DRFNet mainly consists of regional feature extraction modules (DB-CTFEM) and a feature fusion module (RFM). DB-CTFEM extracts regional local and global features through a dual-branch structure of a convolutional neural network (CNN) and a Transformer, respectively, and then uses cross-attention to effectively fuse the two and obtain enhanced regional features. Considering the differences among brain regions, RFM uses an attention mechanism to fuse regional features and adaptively reconstruct global brain features. In addition, a region loss function based on the importance of regional features is proposed to dynamically adjust the contribution weights of different brain regions, thereby guiding the model to pay more attention to key regions. This paper conducts subject-dependent experiments on the SJTU Emotion EEG Datasets (SEED, SEED-IV, SEED-V, and SEED-VII) to verify the effectiveness and robustness of the proposed method.

EAAI Journal 2026 Journal Article

Long-term cooperative path planning for stratospheric airships based on hierarchical multi-agent reinforcement learning

  • Chao Lv
  • Ming Zhu
  • Xiao Guo
  • Jiajun Ou
  • Baojin Zheng
  • Liran Sun

Stratospheric airships are increasingly used for long-term collaborative tasks, requiring efficient path planning for multiple airships. Traditional methods struggle with collaborative optimization and state-space explosion in such tasks. To address these issues, this paper presents a hierarchical cooperative airship path planning framework (HiCAPP). HiCAPP employs a dual-layer control architecture, with the high-level controller responsible for task allocation and the low-level controller concentrating on path planning. Experimental results show that HiCAPP outperforms traditional multi-agent reinforcement learning methods on two critical metrics: average remaining energy and average distance to the task center. Additionally, through experiments with varying numbers of agents, task durations, and disturbances, HiCAPP demonstrates robustness and scalability. These results confirm its effectiveness in long-term cooperative monitoring tasks and highlight the advantages of hierarchical decision-making in multi-agent systems.

NeurIPS Conference 2025 Conference Paper

APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay

  • Akshara Prabhakar
  • Zuxin Liu
  • Ming Zhu
  • Jianguo Zhang
  • Tulika Manoj Awalgaonkar
  • Shiyu Wang
  • Zhiwei Liu
  • Haolin Chen

Training effective AI agents for multi-turn interactions requires high-quality data that captures realistic human-agent dynamics, yet such data is scarce and expensive to collect manually. We introduce APIGen-MT, a two-phase framework that generates verifiable and diverse multi-turn agent data. In the first phase, our agentic pipeline produces detailed task blueprints with ground-truth actions, leveraging a committee of LLM reviewers and iterative feedback loops. These blueprints are then transformed into complete interaction trajectories through simulated human-agent interplay. We train a family of models---the xLAM-2-fc-r series with sizes ranging from 1B to 70B parameters. Our models outperform frontier models such as GPT-4o and Claude 3.5 on $\tau$-bench and BFCL benchmarks, with the smaller models surpassing their larger counterparts, particularly in multi-turn settings, while maintaining superior consistency across multiple trials. Comprehensive experiments demonstrate that our verified blueprint-to-details approach yields high-quality training data, enabling the development of more reliable, efficient, and capable agents. We open-source both the synthetic data collected and the trained xLAM-2-fc-r models to advance research in AI agents. Dataset: https://huggingface.co/datasets/Salesforce/APIGen-MT-5k & Models: https://huggingface.co/collections/Salesforce/xlam-2-67ef5be12949d8dcdae354c4

NeurIPS Conference 2024 Conference Paper

APIGen: Automated PIpeline for Generating Verifiable and Diverse Function-Calling Datasets

  • Zuxin Liu
  • Thai Hoang
  • Jianguo Zhang
  • Ming Zhu
  • Tian Lan
  • Shirley Kokane
  • Juntao Tan
  • Weiran Yao

The advancement of function-calling agent models requires diverse, reliable, and high-quality datasets. This paper presents APIGen, an automated data generation pipeline designed to synthesize high-quality datasets for function-calling applications. We leverage APIGen and collect 3,673 executable APIs across 21 different categories to generate diverse function-calling datasets in a scalable and structured manner. Each entry in our dataset is verified through three hierarchical stages: format checking, actual function executions, and semantic verification, improving its reliability and correctness. We demonstrate that models trained with our curated datasets, even with only 7B parameters, can achieve state-of-the-art performance on the Berkeley Function-Calling Benchmark, outperforming multiple GPT-4 models. Moreover, our 1B model achieves exceptional performance, surpassing GPT-3.5-Turbo and Claude-3 Haiku. We release a dataset containing 60,000 high-quality entries, aiming to advance the field of function-calling agents. The dataset and models are available on the project homepage \url{https://apigen-pipeline.github.io/}.

ICML Conference 2024 Conference Paper

InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks

  • Xueyu Hu
  • Ziyu Zhao 0001
  • Shuang Wei
  • Ziwei Chai
  • Qianli Ma
  • Guoyin Wang 0002
  • Xuwu Wang
  • Jing Su

In this paper, we introduce InfiAgent-DABench, the first benchmark specifically designed to evaluate LLM-based agents on data analysis tasks. Agents need to solve these tasks end-to-end by interacting with an execution environment. This benchmark contains DAEval, a dataset consisting of 603 data analysis questions derived from 124 CSV files, and an agent framework which incorporates LLMs to serve as data analysis agents for both serving and evaluation. Since data analysis questions are often open-ended and hard to evaluate without human supervision, we adopt a format-prompting technique to convert each question into a closed-form format so that they can be automatically evaluated. Our extensive benchmarking of 34 LLMs uncovers the current challenges encountered in data analysis tasks. In addition, building upon our agent framework, we develop a specialized agent, DAAgent, which surpasses GPT-3.5 by 3.9% on DABench. Evaluation datasets and toolkits for InfiAgent-DABench are released at https://github.com/InfiAgent/InfiAgent.

TCS Journal 2024 Journal Article

Universal enzymatic numerical P systems with small number of enzymatic rules

  • Jun Liu
  • Leiya Wang
  • Gexiang Zhang
  • Sergey Verlan
  • Ming Zhu

Enzymatic Numerical P Systems (ENPSs) are a model of membrane computing that is well-suited for the simulation of physical processes and that has been used for the design and the implementation of motion controllers for wheeled robots and flying drones. The ENPSs model has been proven to be Turing universal and the theoretical effort was focused on minimizing various descriptional complexity parameters. In this paper, we explore the minimum number of enzymatic rules needed to achieve universality in ENPSs, specifically focusing on the all-parallel derivation mode where all applicable rules are applied at the same time. We show that in the case of a linear restriction for production functions, the universality can be obtained using 21 enzymatic rules, substantially improving previously known results. If production functions are allowed to be polynomials of degree 2, we show that a single enzymatic rule is sufficient to achieve universality. To obtain these results, a new proof method is introduced based on the translation of ENPSs to systems of conditional recurrences.

NeurIPS Conference 2022 Conference Paper

FinRL-Meta: Market Environments and Benchmarks for Data-Driven Financial Reinforcement Learning

  • Xiao-Yang Liu
  • Ziyi Xia
  • Jingyang Rui
  • Jiechao Gao
  • Hongyang Yang
  • Ming Zhu
  • Christina Wang
  • Zhaoran Wang

Finance is a particularly challenging playground for deep reinforcement learning. However, establishing high-quality market environments and benchmarks for financial reinforcement learning is challenging due to three major factors, namely, the low signal-to-noise ratio of financial data, survivorship bias of historical data, and backtesting overfitting. In this paper, we present an openly accessible FinRL-Meta library that has been actively maintained by the AI4Finance community. First, following a DataOps paradigm, we provide hundreds of market environments through an automatic data curation pipeline that processes dynamic datasets from real-world markets into gym-style market environments. Second, we reproduce popular papers as stepping stones for users to design new trading strategies. We also deploy the library on cloud platforms so that users can visualize their own results and assess the relative performance via community-wise competitions. Third, FinRL-Meta provides tens of Jupyter/Python demos organized into a curriculum and a documentation website to serve the rapidly growing community. FinRL-Meta is available at: \url{https://github.com/AI4Finance-Foundation/FinRL-Meta}

AAAI Conference 2022 Conference Paper

Multilingual Code Snippets Training for Program Translation

  • Ming Zhu
  • Karthik Suresh
  • Chandan K Reddy

Program translation aims to translate source code from one programming language to another. It is particularly useful in applications such as multiple-platform adaptation and legacy code migration. Traditional rule-based program translation methods usually rely on meticulous manual rule-crafting, which is costly both in terms of time and effort. Recently, neural network based methods have been developed to address this problem. However, the absence of high-quality parallel code data is one of the main bottlenecks which impedes the development of program translation models. In this paper, we introduce CoST, a new multilingual Code Snippet Translation dataset that contains parallel data from 7 commonly used programming languages. The dataset is parallel at the level of code snippets, which provides much more fine-grained alignments between different languages than the existing translation datasets. We also propose a new program translation model that leverages multilingual snippet denoising auto-encoding and Multilingual Snippet Translation (MuST) pre-training. Extensive experiments show that the multilingual snippet training is effective in improving program translation performance, especially for low-resource languages. Moreover, our training method shows good generalizability and consistently improves the translation performance of a number of baseline models. The proposed model outperforms the baselines on both snippet-level and program-level translation, and achieves state-of-the-art performance on the CodeXGLUE translation task. The code, data, and appendix for this paper can be found at https://github.com/reddy-labcode-research/MuST-CoST.

IROS Conference 2021 Conference Paper

Cross-Modal 3D Object Detection and Tracking for Auto-Driving

  • Yihan Zeng
  • Chao Ma 0004
  • Ming Zhu
  • Zhiming Fan
  • Xiaokang Yang 0001

Detecting and tracking objects in 3D scenes plays a crucial role in autonomous driving. Successfully recognizing objects through space and time hinges on a strong detector and a reliable association scheme. Recent 3D detection and tracking approaches widely represent objects as points when associating detection results with trajectories. Despite the demonstrated success, these approaches do not fully exploit the rich appearance information of objects. In this paper, we present a conceptually simple yet effective algorithm, named AlphaTrack, which considers both location and appearance changes to perform joint 3D object detection and tracking. To achieve this, we propose a cross-modal fusion scheme that fuses camera appearance features with LiDAR features to facilitate 3D detection and tracking. We further attach an additional branch to the 3D detector to output instance-aware appearance embeddings, which significantly improves tracking performance with our designed association mechanisms. Extensive validations on a large-scale autonomous driving dataset demonstrate the effectiveness of the proposed algorithm in comparison with state-of-the-art approaches. Notably, the proposed algorithm ranks first on the nuScenes tracking leaderboard to date.

AAAI Conference 2020 Conference Paper

LATTE: Latent Type Modeling for Biomedical Entity Linking

  • Ming Zhu
  • Busra Celikkaya
  • Parminder Bhatia
  • Chandan K. Reddy

Entity linking is the task of linking mentions of named entities in natural language text to entities in a curated knowledge-base. This is of significant importance in the biomedical domain, where it could be used to semantically annotate a large volume of clinical records and biomedical literature with standardized concepts described in an ontology such as the Unified Medical Language System (UMLS). We observe that with precise type information, entity disambiguation becomes a straightforward task. However, fine-grained type information is usually not available in the biomedical domain. Thus, we propose LATTE, a LATent Type Entity Linking model that improves entity linking by modeling the latent fine-grained type information about mentions and entities. Unlike previous methods that perform entity linking directly between the mentions and the entities, LATTE jointly performs entity disambiguation and latent fine-grained type learning, without direct supervision. We evaluate our model on two biomedical datasets: MedMentions, a large-scale public dataset annotated with UMLS concepts, and a de-identified corpus of dictated doctor's notes that has been annotated with ICD concepts. Extensive experimental evaluation shows our model achieves significant performance improvements over several state-of-the-art techniques.

ICRA Conference 1999 Conference Paper

Experiments with Transparent Teleoperation Under Position and Rate Control

  • Wenhong Zhu
  • Septimiu E. Salcudean
  • Ming Zhu

A four-channel data transmission structure has been proposed in the literature to achieve "transparency" for master-slave teleoperator systems under both position and rate control. In this paper, experimental results for a one degree-of-freedom system are given to demonstrate that the proposed scheme possesses good stability and transparency under both position and rate control in contact with both flexible and rigid environments. A unilateral constraint correction under rate control is also discussed.

IROS Conference 1995 Conference Paper

Achieving transparency for teleoperator systems under position and rate control

  • Ming Zhu
  • Septimiu E. Salcudean

A four-channel data transmission structure has been suggested in the literature to achieve "transparency" for master-slave teleoperator systems under position control. In this paper, the result is generalized to include teleoperator systems that are under rate control or more general master-slave kinematic correspondence laws, such as a mixed position/rate mode. A one degree-of-freedom example is given to outline the design and analysis of such a system for transparency and stability.