Arrow Research search

Author name cluster

Ming Gao

Papers possibly associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity-disambiguation profile.

13 papers
2 author rows

Possible papers

13

JBHI Journal 2026 Journal Article

Synergizing Anti-Cancer Drug Combinations With Dual-View Hypergraph Representation Fusion

  • Jixiang Yu
  • Nanjun Chen
  • Linlin Cao
  • Ming Gao
  • Daizong Liu
  • Fuzhou Wang
  • Qiuzhen Lin
  • Xiangtao Li

Drug combination therapy plays a vital role in disease treatment, including cancer, as it contributes to treatment efficacy and can alleviate the effect of drug resistance. Although clinical trials and screening may provide valuable information about synergistic drug combinations, they suffer from a challenging combinatorial space. Multiple methods have been proposed to address these issues. However, they still fail to make full use of the global and local triplet context relationships of known synergistic combinations. To this end, we propose DVHSyn, a deep learning model that leverages dual-view hypergraph representation fusion to identify synergistic drug combinations. It first extracts the transcriptome features of cancer cell lines and the molecular structures of drugs. Subsequently, by modeling the synergistic effect on a hypergraph, DVHSyn simultaneously learns the local and global context of the sample triplets via a hypergraph view and its expanded heterogeneous graph view. Finally, the learned representations of the two branches are fused selectively to predict synergistic drug combinations. Experimental results demonstrate that DVHSyn surpasses six competing methods. A case study also shows that DVHSyn has the potential to predict novel synergistic drug combinations. Overall, our method is effective in identifying synergistic drug combinations and provides new insights for novel drug development.

ICRA Conference 2025 Conference Paper

Off-Road Freespace Detection with LiDAR-Camera Fusion and Self-Distillation

  • Shuo Gu
  • Ming Gao

LiDAR-camera fusion has gradually become the mainstream for freespace detection in unstructured off-road environments. However, existing methods mainly use traditional techniques to densify the sparse LiDAR data in the perspective view, which introduces noise and limits the representation ability. In this paper, we propose a lightweight end-to-end freespace detection network with cascaded LiDAR-camera fusion and multi-scale self-distillation. It first performs sparse freespace detection in the range view, then projects the range-view features onto the perspective view and densifies them. The resulting dense features are fused with camera images to obtain the final freespace detection results. In our method, the cascaded fusion strategy reduces both the impact of resolution differences between LiDAR point clouds and camera images and the noise introduced during the data densification process. The multi-scale self-distillation strategy distills knowledge from the LiDAR-camera fusion module to the perspective-view module to further improve freespace detection performance using LiDAR data only. Experiments on the off-road ORFD dataset demonstrate the effectiveness of the proposed cascaded fusion and multi-scale self-distillation strategies; our method obtains 93.4% IoU at speeds of more than 50 Hz. It also achieves state-of-the-art performance among all LiDAR-based freespace detection methods.

IROS Conference 2025 Conference Paper

PCMF2-Net: A Pyramid Cross-Modal Feature Fusion Network for Off-Road Freespace Detection

  • Ming Gao
  • Chunpeng Lu
  • Shuo Gu
  • Yigong Zhang
  • Chenyang Zhang
  • Hui Kong 0001

Freespace detection plays an important role in autonomous driving. In recent years, deep-learning-based freespace detection methods have performed well in urban scenes. However, for off-road scenes, freespace detection poses significant challenges due to the complexity of the scenes and the lack of clear edges, and existing methods have not effectively fused LiDAR data and camera images. In this paper, we propose a Pyramid Cross-Modal Feature Fusion Network (PCMF2-Net) for off-road freespace detection. The dense depth maps are concatenated with RGB images and used as input along with surface normal maps. The dual-branch CNN-Transformer encoder combines convolutional neural networks and transformers to extract local and global features from the RGBD images and surface normal maps, respectively. Then, in the pyramid cross-modal feature fusion module, the multi-scale and multimodal encoder features are fused in a top-down manner. In addition, we use an edge segmentation task and a two-step training strategy to further improve performance. Experiments on the off-road freespace detection dataset (ORFD) demonstrate that the proposed PCMF2-Net achieves a competitive result of 93.9% IoU at a speed of 23 Hz.

NeurIPS Conference 2024 Conference Paper

Cross-model Control: Improving Multiple Large Language Models in One-time Training

  • Jiayi Wu
  • Hao Sun
  • Hengyi Cai
  • Lixin Su
  • Shuaiqiang Wang
  • Dawei Yin
  • Xiang Li
  • Ming Gao

The number of large language models (LLMs) with varying parameter scales and vocabularies is increasing. While they deliver powerful performance, they also face a set of common optimization needs to meet specific requirements or standards, such as instruction following or avoiding the output of sensitive information from the real world. However, how to reuse the fine-tuning outcomes of one model for other models to reduce training costs remains a challenge. To bridge this gap, we introduce Cross-model Control (CMC), a method that improves multiple LLMs in one-time training with a portable tiny language model. Specifically, we have observed that the logit shift before and after fine-tuning is remarkably similar across different models. Based on this insight, we incorporate a tiny language model with a minimal number of parameters. By training alongside a frozen template LLM, the tiny model gains the capability to alter the logits output by the LLMs. To make this tiny language model applicable to models with different vocabularies, we propose a novel token mapping strategy named PM-MinED. We have conducted extensive experiments on instruction tuning and unlearning tasks, demonstrating the effectiveness of CMC. Our code is available at https://github.com/wujwyi/CMC

AAAI Conference 2024 Conference Paper

Unsupervised Gene-Cell Collective Representation Learning with Optimal Transport

  • Jixiang Yu
  • Nanjun Chen
  • Ming Gao
  • Xiangtao Li
  • Ka-Chun Wong

Cell type identification plays a vital role in single-cell RNA sequencing (scRNA-seq) data analysis. Although many deep embedded methods for clustering scRNA-seq data have been proposed, they still fail to elucidate the intrinsic properties of cells and genes. Here, we present scGCOT, a novel end-to-end deep graph clustering model for single-cell transcriptomics data based on unsupervised Gene-Cell Collective representation learning and Optimal Transport, which integrates both cell and gene correlations. Specifically, scGCOT learns the latent embeddings of cells and genes simultaneously and reconstructs the cell graph, the gene graph, and the gene expression count matrix. A zero-inflated negative binomial (ZINB) model is estimated via the reconstructed count matrix to capture the essential properties of scRNA-seq data. By leveraging optimal-transport-based joint representation alignment, scGCOT learns the clustering process and the latent representations through a mutually supervised self-optimization strategy. Extensive experiments with 14 competing methods on 15 real scRNA-seq datasets demonstrate the competitive edges of scGCOT.

NeurIPS Conference 2023 Conference Paper

DFRD: Data-Free Robustness Distillation for Heterogeneous Federated Learning

  • Kangyang Luo
  • Shuai Wang
  • Yexuan Fu
  • Xiang Li
  • Yunshi Lan
  • Ming Gao

Federated Learning (FL) is a privacy-constrained decentralized machine learning paradigm in which clients collaborate on training without compromising private data. However, learning a robust global model in data-heterogeneous and model-heterogeneous FL scenarios is challenging. To address this, we resort to data-free knowledge distillation and propose a new FL method, namely DFRD. DFRD equips a conditional generator on the server to approximate the training space of the local models uploaded by clients, and systematically investigates its training in terms of fidelity, transferability and diversity. To overcome the catastrophic forgetting of the global model caused by the distribution shifts of the generator across communication rounds, we maintain an exponential moving average copy of the generator on the server. Additionally, we propose dynamic weighting and label sampling to accurately extract knowledge from local models. Finally, our extensive experiments on various image classification tasks illustrate that DFRD achieves significant performance gains compared to SOTA baselines.

AAAI Conference 2023 Conference Paper

Uncertainty-Aware Self-Training for Low-Resource Neural Sequence Labeling

  • Jianing Wang
  • Chengyu Wang
  • Jun Huang
  • Ming Gao
  • Aoying Zhou

Neural sequence labeling (NSL) aims at assigning labels to input language tokens, which covers a broad range of applications, such as named entity recognition (NER) and slot filling. However, the satisfying results achieved by traditional supervised approaches heavily depend on large amounts of human-annotated data, which may not be feasible in real-world scenarios due to data privacy and computation efficiency issues. This paper presents SeqUST, a novel uncertainty-aware self-training framework for NSL, to address the labeled-data scarcity issue and to effectively utilize unlabeled data. Specifically, we incorporate Monte Carlo (MC) dropout in a Bayesian neural network (BNN) to perform uncertainty estimation at the token level, and then select reliable language tokens from unlabeled data based on the model's confidence and certainty. A well-designed masked sequence labeling task with a noise-robust loss supports robust training, which aims to suppress the problem of noisy pseudo labels. In addition, we develop a Gaussian-based consistency regularization technique to further improve the model's robustness on Gaussian-distributed perturbed representations. This effectively alleviates the over-fitting dilemma originating from pseudo-labeled augmented data. Extensive experiments over six benchmarks demonstrate that our SeqUST framework effectively improves the performance of self-training, and consistently outperforms strong baselines by a large margin in low-resource scenarios.

NeurIPS Conference 2021 Conference Paper

Efficient Bayesian network structure learning via local Markov boundary search

  • Ming Gao
  • Bryon Aragam

We analyze the complexity of learning directed acyclic graphical models from observational data in general settings without specific distributional assumptions. Our approach is information-theoretic and uses a local Markov boundary search procedure in order to recursively construct ancestral sets in the underlying graphical model. Perhaps surprisingly, we show that for certain graph ensembles, a simple forward greedy search algorithm (i.e., without a backward pruning phase) suffices to learn the Markov boundary of each node. This substantially improves the sample complexity, which we show is at most polynomial in the number of nodes. This is then applied to learn the entire graph under a novel identifiability condition that generalizes existing conditions from the literature. As a matter of independent interest, we establish finite-sample guarantees for the problem of recovering Markov boundaries from data. Moreover, we apply our results to the special case of polytrees, for which the assumptions simplify, and provide explicit conditions under which polytrees are identifiable and learnable in polynomial time. We further illustrate the performance of the algorithm, which is easy to implement, in a simulation study. Our approach is general, works for discrete or continuous distributions without distributional assumptions, and as such sheds light on the minimal assumptions required to efficiently learn the structure of directed graphical models from data.

NeurIPS Conference 2021 Conference Paper

Structure learning in polynomial time: Greedy algorithms, Bregman information, and exponential families

  • Goutham Rajendran
  • Bohdan Kivva
  • Ming Gao
  • Bryon Aragam

Greedy algorithms have long been a workhorse for learning graphical models, and more broadly for learning statistical models with sparse structure. In the context of learning directed acyclic graphs, greedy algorithms are popular despite their worst-case exponential runtime. In practice, however, they are very efficient. We provide new insight into this phenomenon by studying a general greedy score-based algorithm for learning DAGs. Unlike edge-greedy algorithms such as the popular GES and hill-climbing algorithms, our approach is vertex-greedy and requires at most a polynomial number of score evaluations. We then show how recent polynomial-time algorithms for learning DAG models are a special case of this algorithm, thereby illustrating how these order-based algorithms can be rigorously interpreted as score-based algorithms. This observation suggests new score functions and optimality conditions based on the duality between Bregman divergences and exponential families, which we explore in detail. Explicit sample and computational complexity bounds are derived. Finally, we provide extensive experiments suggesting that this algorithm indeed optimizes the score in a variety of settings.

NeurIPS Conference 2020 Conference Paper

A polynomial-time algorithm for learning nonparametric causal graphs

  • Ming Gao
  • Yi Ding
  • Bryon Aragam

We establish finite-sample guarantees for a polynomial-time algorithm for learning a nonlinear, nonparametric directed acyclic graphical (DAG) model from data. The analysis is model-free and does not assume linearity, additivity, independent noise, or faithfulness. Instead, we impose a condition on the residual variances that is closely related to previous work on linear models with equal variances. Compared to an optimal algorithm with oracle knowledge of the variable ordering, the additional cost of the algorithm is linear in the dimension $d$ and the number of samples $n$. Finally, we compare the proposed algorithm to existing approaches in a simulation study.