Arrow Research search

Author name cluster

Mingyang Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers
2 author rows

Possible papers (13)

ICLR Conference 2025 Conference Paper

A Policy-Gradient Approach to Solving Imperfect-Information Games with Best-Iterate Convergence

  • Mingyang Liu
  • Gabriele Farina
  • Asuman E. Ozdaglar

Policy gradient methods have become a staple of any single-agent reinforcement learning toolbox, due to their combination of desirable properties: iterate convergence, efficient use of stochastic trajectory feedback, and theoretically sound avoidance of importance sampling corrections. In multi-agent imperfect-information settings (extensive-form games), however, it is still unknown whether the same desiderata can be achieved while retaining theoretical guarantees. Instead, sound methods for extensive-form games rely on approximating counterfactual values (as opposed to Q values), which are incompatible with policy gradient methodologies. In this paper, we investigate whether policy gradient can be safely used in two-player zero-sum imperfect-information extensive-form games (EFGs). We establish positive results, showing for the first time that a policy gradient method leads to provable best-iterate convergence to a regularized Nash equilibrium in self-play.
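The flavor of the result can be illustrated on a much simpler object than an EFG. The sketch below (not the paper's algorithm) runs entropy-regularized policy gradient self-play on a 2x2 zero-sum matrix game; the payoff matrix, step size, and regularization weight are illustrative choices. Both players' softmax policies converge to the regularized Nash equilibrium, which for matching pennies is uniform play.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Matching pennies: player 1 maximizes x^T A y, player 2 minimizes it.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
tau, lr = 0.1, 0.1          # entropy-regularization weight, step size
lx = np.array([1.0, 0.0])   # player 1 policy logits
ly = np.array([0.0, 1.0])   # player 2 policy logits

for _ in range(20000):
    x, y = softmax(lx), softmax(ly)
    # Gradients of the entropy-regularized payoffs w.r.t. the policies
    gx = A @ y - tau * (np.log(x) + 1.0)
    gy = -(A.T @ x) - tau * (np.log(y) + 1.0)
    # Chain rule through softmax: J_softmax(z) v = x*v - x*(x . v)
    lx += lr * (x * gx - x * (x @ gx))
    ly += lr * (y * gy - y * (y @ gy))

x, y = softmax(lx), softmax(ly)
```

Without the entropy term the same gradient dynamics merely cycle around the equilibrium; the regularization is what turns rotation into contraction, which is the intuition behind regularized-equilibrium convergence guarantees.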

NeurIPS Conference 2025 Conference Paper

Activation-Guided Consensus Merging for Large Language Models

  • Yuxuan Yao
  • Shuqi LIU
  • Zehua Liu
  • Qintong Li
  • Mingyang Liu
  • Xiongwei Han
  • Zhijiang Guo
  • Han Wu

Recent research has increasingly focused on reconciling the reasoning capabilities of System 2 with the efficiency of System 1. While existing training-based and prompt-based approaches face significant challenges in terms of efficiency and stability, model merging emerges as a promising strategy to integrate the diverse capabilities of different Large Language Models (LLMs) into a unified model. However, conventional model merging methods often assume uniform importance across layers, overlooking the functional heterogeneity inherent in neural components. To address this limitation, we propose Activation-Guided Consensus Merging (ACM), a plug-and-play merging framework that determines layer-specific merging coefficients based on mutual information between activations of pre-trained and fine-tuned models. ACM effectively preserves task-specific capabilities without requiring gradient computations or additional training. Extensive experiments on Long-to-Short (L2S) and general merging tasks demonstrate that ACM consistently outperforms all baseline methods. For instance, in the case of Qwen-7B models, TIES-Merging equipped with ACM achieves a 55.3% reduction in response length while simultaneously improving reasoning accuracy by 1.3 points. We submit the code with the paper for reproducibility, and it will be publicly available.
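A minimal sketch of the activation-guided idea, under loud assumptions: the histogram MI estimator, the sigmoid mapping from MI to a coefficient, and the linear weight interpolation below are illustrative stand-ins, not ACM's actual estimator or coefficient rule.

```python
import numpy as np

def mutual_info(a, b, bins=16):
    """Histogram estimate of mutual information between two 1-D
    activation samples (a crude stand-in for a real MI estimator)."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

def layer_coefficient(act_pre, act_ft, scale=1.0):
    """Map a layer's activation MI to a merging coefficient in (0, 1).
    The sigmoid mapping here is an assumption, not ACM's rule."""
    return 1.0 / (1.0 + np.exp(-scale * mutual_info(act_pre, act_ft)))

def merge_layer(w_pre, w_ft, coeff):
    # Interpolate one layer's weights with its layer-specific coefficient.
    return (1.0 - coeff) * w_pre + coeff * w_ft

# Toy activations: a fine-tuned layer whose activations track the
# pre-trained ones gets a different coefficient than an unrelated one.
rng = np.random.default_rng(0)
acts = rng.normal(size=2048)
noisy = acts + 0.1 * rng.normal(size=2048)   # strongly related
indep = rng.normal(size=2048)                # unrelated
```

The point of the sketch is only the shape of the pipeline: per-layer activation statistics in, per-layer coefficient out, no gradients or training required.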

NeurIPS Conference 2025 Conference Paper

Breakthrough Sensor-Limited Single View: Towards Implicit Temporal Dynamics for Time Series Domain Adaptation

  • Mingyang Liu
  • Xinyang Chen
  • Xiucheng Li
  • Weili Guan
  • Liqiang Nie

Unsupervised domain adaptation has emerged as a pivotal paradigm for mitigating distribution shifts in time series analysis. The fundamental challenge in time series domain adaptation arises from the entanglement of domain shifts and intricate temporal patterns. Crucially, the latent continuous-time dynamics, which are often inaccessible due to sensor constraints, are only partially observable through discrete time series from an explicit sensor-limited single view. This partial observability hinders the modeling of intricate temporal patterns, impeding domain-invariant representation learning. To mitigate this limitation, we propose EDEN (multiple Explicit Domain Enhanced adaptation Network), expanding the raw dataset to multi-scale explicit domains, multi-subspace explicit domains and multi-segment explicit domains. EDEN enhances domain adaptation with three coordinated modules tailored to integrate multiple explicit domains: (1) Multi-Scale Curriculum Adaptation implements progressive domain alignment from coarse scale to fine scale. (2) Quality-Aware Feature Fusion evaluates feature quality in multi-subspace explicit domains and adaptively integrates temporal-frequency features. (3) Temporal Coherence Learning enforces segment-level consistency with multi-segment explicit domains. The representation enriched by multiple explicit domains bridges the gap between partially observed discrete samples and the underlying implicit temporal dynamics, enabling more accurate approximation of implicit temporal patterns for effective cross-domain adaptation. Our comprehensive evaluation across 6 time series benchmarks demonstrates EDEN's consistent superiority, achieving average accuracy improvements of 4.8% over state-of-the-art methods in cross-domain scenarios. Code is available at an anonymous link.

ICLR Conference 2025 Conference Paper

Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling

  • Yuxuan Yao
  • Han Wu 0004
  • Mingyang Liu
  • Sichun Luo
  • Xiongwei Han
  • Jie Liu 0022
  • Zhijiang Guo
  • Linqi Song

Large language models (LLMs) exhibit varying strengths and weaknesses across different tasks, prompting recent studies to explore the benefits of ensembling models to leverage their complementary advantages. However, existing LLM ensembling methods often overlook model compatibility and struggle with inefficient alignment of probabilities across the entire vocabulary. In this study, we empirically investigate the factors influencing ensemble performance, identifying model performance, vocabulary size, and response style as key determinants, and revealing that compatibility among models is essential for effective ensembling. This analysis leads to the development of a simple yet effective model selection strategy that identifies compatible models. Additionally, we introduce Union Top-k Ensembling (UniTE), a novel approach that efficiently combines models by focusing on the union of the top-k tokens from each model, thereby avoiding the need for full vocabulary alignment and reducing computational overhead. Extensive evaluations across multiple benchmarks demonstrate that UniTE significantly enhances performance compared to existing methods, offering a more efficient framework for LLM ensembling.
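The union-of-top-k idea can be sketched in a few lines. This is a toy single-step version under assumptions: distributions arrive as token-string-to-probability dicts (sidestepping tokenizer alignment), and the combination is a plain weighted average over the union, which may differ from UniTE's exact aggregation.

```python
def unite_step(dists, k=3, weights=None):
    """Combine next-token distributions on the union of each model's
    top-k tokens, then renormalize. Token-string keys avoid aligning
    the models' full vocabularies."""
    weights = weights or [1.0 / len(dists)] * len(dists)
    union = set()
    for d in dists:
        union.update(sorted(d, key=d.get, reverse=True)[:k])
    combined = {t: sum(w * d.get(t, 0.0) for w, d in zip(weights, dists))
                for t in union}
    z = sum(combined.values())
    return {t: p / z for t, p in combined.items()}

# Two toy models with partially overlapping vocabularies
m1 = {"the": 0.5, "a": 0.3, "cat": 0.1, "dog": 0.1}
m2 = {"the": 0.4, "dog": 0.4, "a": 0.1, "fish": 0.1}
out = unite_step([m1, m2], k=2)
```

Only the union of top-k tokens (here at most 4 entries) is ever touched, which is the source of the efficiency claim: per-step cost scales with k times the number of models, not with vocabulary size.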

ICRA Conference 2025 Conference Paper

GERA: Geometric Embedding for Efficient Point Registration Analysis

  • Geng Li
  • Haozhi Cao
  • Mingyang Liu
  • Shenghai Yuan 0001
  • Jianfei Yang

Point cloud registration aims to provide estimated transformations to align point clouds, which plays a crucial role in pose estimation of various navigation systems, such as surgical guidance systems and autonomous vehicles. Despite the impressive performance of recent models on benchmark datasets, many rely on complex modules like KPConv and Transformers, which impose significant computational and memory demands. These requirements hinder their practical application, particularly in resource-constrained environments such as mobile robotics. In this paper, we propose a novel point cloud registration network that leverages a pure MLP architecture, constructing geometric information offline. This approach eliminates the computational and memory burdens associated with traditional complex feature extractors and significantly reduces inference time and resource consumption. Our method is the first to replace 3D coordinate inputs with offline-constructed geometric encoding, improving generalization and stability, as demonstrated by Maximum Mean Discrepancy (MMD) comparisons. This efficient and accurate geometric representation marks a significant advancement in point cloud analysis, particularly for applications requiring speed and reliability.

JBHI Journal 2025 Journal Article

The Large Language Models on Biomedical Data Analysis: A Survey

  • Wei Lan
  • Zhentao Tang
  • Mingyang Liu
  • Qingfeng Chen
  • Wei Peng
  • Yi-Ping Phoebe Chen
  • Yi Pan

With the rapid development of Large Language Model (LLM) technology, LLMs have become an indispensable force in biomedical data analysis research. However, biomedical researchers currently have limited knowledge of LLMs. Therefore, there is an urgent need for a summary of LLM applications in biomedical data analysis. Herein, we present this review by summarizing the latest research on LLMs in biomedicine. In this review, LLM techniques are first outlined. We then discuss biomedical datasets and frameworks for biomedical data analysis, followed by a detailed analysis of LLM applications in genomics, proteomics, transcriptomics, radiomics, single-cell analysis, medical texts and drug discovery. Finally, the challenges of LLMs in biomedical data analysis are discussed. In summary, this review is intended for researchers interested in LLM technology and aims to help them understand and apply LLMs in biomedical data analysis research.

NeurIPS Conference 2025 Conference Paper

UFT: Unifying Supervised and Reinforcement Fine-Tuning

  • Mingyang Liu
  • Gabriele Farina
  • Asuman Ozdaglar

Post-training has demonstrated its importance in enhancing the reasoning capabilities of large language models (LLMs). The primary post-training methods can be categorized into supervised fine-tuning (SFT) and reinforcement fine-tuning (RFT). SFT is efficient and well-suited for small language models, but it may lead to overfitting and limit the reasoning abilities of larger models. In contrast, RFT generally yields better generalization but depends heavily on the strength of the base model. To address the limitations of SFT and RFT, we propose Unified Fine-Tuning (UFT), a novel post-training paradigm that unifies SFT and RFT into a single, integrated process. UFT enables the model to effectively explore solutions while incorporating informative supervision signals, bridging the gap between memorizing and thinking underlying existing methods. Notably, UFT outperforms both SFT and RFT in general, regardless of model sizes. Furthermore, we theoretically prove that UFT breaks RFT's inherent exponential sample complexity bottleneck, showing for the first time that unified training can exponentially accelerate convergence on long-horizon reasoning tasks.

IROS Conference 2025 Conference Paper

Unsupervised Liver Deformation Correction Network Using Optimal Transport for Image-Guided Liver Surgery

  • Mingyang Liu
  • Geng Li
  • Hao Yu
  • Xinzhe Du
  • Rui Song 0002
  • Yibin Li 0001
  • Max Q.-H. Meng
  • Zhe Min

In this paper, we propose a novel unsupervised intraoperative liver deformation correction method, called Learning Coherent point drift Network (LCNet), for image-guided liver surgery (IGLS). We first estimate the correspondences between the preoperative and intraoperative point sets in the optimal transport (OT) module by leveraging both the original points and extracted features. Afterwards, we compute the point-wise displacement vector by solving the involved matrix equation in the Transformation module, where the point localisation noise is explicitly considered and modeled. Additionally, we present three variants of the proposed approach, i.e., LCNet, LCNet-ED and LCNet-WD, where the better registration performance of LCNet against the other two demonstrates the superiority of the utilised Chamfer loss. We have extensively evaluated LCNet on the MedShapeNet dataset, consisting of 615 different liver shapes of real patients, and the 3Dircadb dataset, comprising 20 liver models of real patients. Extensive experimental results under different deformation and noise magnitudes demonstrate that LCNet outperforms existing state-of-the-art registration algorithms and holds significant application potential in IGLS. For example, when the overlapping ratio between the preoperative and intraoperative point sets is 25%, the deformation magnitude is 8 mm, the maximum point localization noise magnitude is 2 mm and the rotation angle lies in the range [−45°, 45°], LCNet achieves a root-mean-square error (RMSE) of 3.21 mm on the MedShapeNet dataset, significantly outperforming Lepard and RoITr at 5.41 mm (p < 0.001) and 4.90 mm (p < 0.001) respectively.
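The OT-based correspondence step can be sketched generically with entropic optimal transport (Sinkhorn iterations) between two point sets. This is not LCNet's module (which also uses learned features and a noise model); it is a minimal cost-from-coordinates version, with the regularization strength and iteration count chosen for the toy example.

```python
import numpy as np

def sinkhorn_correspondence(src, tgt, eps=0.05, iters=200):
    """Soft correspondences between point sets src (N x 3) and tgt (M x 3)
    via entropy-regularized OT with uniform marginals."""
    C = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(-1)  # squared dists
    K = np.exp(-C / eps)
    a = np.ones(len(src)) / len(src)   # uniform source marginal
    b = np.ones(len(tgt)) / len(tgt)   # uniform target marginal
    u, v = a.copy(), b.copy()
    for _ in range(iters):             # Sinkhorn scaling iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    T = u[:, None] * K * v[None, :]    # transport plan = soft matches
    # Barycentric projection: the soft-assigned target of each source point
    matched = (T / T.sum(1, keepdims=True)) @ tgt
    return T, matched

# Toy check: cube corners translated by 0.1 along x should match 1-to-1.
src = np.array([[x, y, z] for x in (0., 1.) for y in (0., 1.) for z in (0., 1.)])
tgt = src + np.array([0.1, 0.0, 0.0])
T, matched = sinkhorn_correspondence(src, tgt)
```

The displacement field `matched - src` is then what a transformation module would regularize and solve for; in this rigid toy case it recovers the translation directly.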

IROS Conference 2024 Conference Paper

Bidirectional Partial-to-Full Non-Rigid Point Set Registration with Non-Overlapping Filtering

  • Hao Yu
  • Mingyang Liu
  • Rui Song 0002
  • Yibin Li 0001
  • Max Q.-H. Meng
  • Zhe Min

In this paper, we introduce the Bidirectional Non-Overlapping Filtering Network (Bi-NOFNet), which registers the partial intraoperative point set with the full preoperative point set for computer-assisted interventions (CAI). Our contributions are threefold. First, Bi-NOFNet adopts a customised feature extractor to extract distinctive features from both point sets, with which the per-point overlap mask is predicted and the overlapping region is segmented for the preoperative point set. Furthermore, we propose two methods to filter out the non-overlapping regions, at the feature level (i.e., Bi-NOFNet (Feature)) and the point level (i.e., Bi-NOFNet (Point)). For these two methods, we develop a supervised registration strategy where the ground-truth overlap mask and displacement vectors are employed, and weakly-supervised registration strategies where only the ground-truth overlap mask is available. Additionally, to fully utilise the information in both spaces, we propose a bidirectional registration mechanism, which predicts the displacement vectors associated with the intraoperative point set (i.e., the forward way) and those warping the preoperative point set (i.e., the backward way). Experiments have been conducted on the proposed DeformMedShapeNet dataset, which contains 615 different liver shapes. Extensive results demonstrate that Bi-NOFNet performs well for partial-to-full registration tasks under various scenarios of noise, overlap ratios and deformation levels, outperforming existing non-rigid registration approaches.

NeurIPS Conference 2024 Conference Paper

Boosting Transferability and Discriminability for Time Series Domain Adaptation

  • Mingyang Liu
  • Xinyang Chen
  • Yang Shu
  • Xiucheng Li
  • Weili Guan
  • Liqiang Nie

Unsupervised domain adaptation excels in transferring knowledge from a labeled source domain to an unlabeled target domain, playing a critical role in time series applications. Existing time series domain adaptation methods either ignore frequency features or treat temporal and frequency features equally, which makes it challenging to fully exploit the advantages of both types of features. In this paper, we delve into transferability and discriminability, two crucial properties in transferable representation learning. Notably, frequency features are more discriminative within a specific domain, while temporal features show better transferability across domains. Based on these findings, we propose Adversarial CO-learning Networks (ACON) to enhance transferable representation learning in a collaborative manner in three aspects: (1) Considering the multi-periodicity in time series, multi-period frequency feature learning is proposed to enhance the discriminability of frequency features; (2) Temporal-frequency domain mutual learning is proposed to enhance the discriminability of temporal features in the source domain and improve the transferability of frequency features in the target domain; (3) Domain adversarial learning is conducted in the correlation subspaces of temporal-frequency features instead of the original feature spaces to further enhance the transferability of both features. Extensive experiments conducted on a wide range of time series datasets and five common applications demonstrate the state-of-the-art performance of ACON. Code is available at https://github.com/mingyangliu1024/ACON.

ICLR Conference 2023 Conference Paper

The Power of Regularization in Solving Extensive-Form Games

  • Mingyang Liu
  • Asuman E. Ozdaglar
  • Tiancheng Yu
  • Kaiqing Zhang

In this paper, we investigate the power of regularization, a common technique in reinforcement learning and optimization, in solving extensive-form games (EFGs). We propose a series of new algorithms based on regularizing the payoff functions of the game, and establish a set of convergence results that strictly improve over the existing ones, with either weaker assumptions or stronger convergence guarantees. In particular, we first show that dilated optimistic mirror descent (DOMD), an efficient variant of OMD for solving EFGs, with adaptive regularization can achieve a fast Õ(1/T) last-iterate convergence in terms of duality gap and distance to the set of Nash equilibria (NE), without assuming uniqueness of the NE. Second, we show that regularized counterfactual regret minimization (Reg-CFR), with a variant of the optimistic mirror descent algorithm as the regret minimizer, can achieve O(1/T^{1/4}) best-iterate and O(1/T^{3/4}) average-iterate convergence rates for finding an NE in EFGs. Finally, we show that Reg-CFR can achieve asymptotic last-iterate convergence, and an optimal O(1/T) average-iterate convergence rate, for finding the NE of perturbed EFGs, which is useful for finding approximate extensive-form perfect equilibria (EFPE). To the best of our knowledge, these constitute the first last-iterate convergence results for CFR-type algorithms, while matching the state-of-the-art average-iterate convergence rate for finding an NE in non-perturbed EFGs. We also provide numerical results to corroborate the advantages of our algorithms.
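For readers unfamiliar with payoff regularization, the generic form of the construction the abstract refers to can be written as follows; this is the standard template only, not the paper's specific dilated regularizers or adaptive schedule.

```latex
% Regularizing a zero-sum game's payoff: replace u(x, y) with
\[
  u_\tau(x, y) \;=\; u(x, y) \;-\; \tau\,\psi(x) \;+\; \tau\,\psi(y),
\]
% where \psi is a strongly convex regularizer over each player's
% (sequence-form) strategy set and \tau > 0 controls the perturbation.
% The regularized game has a unique equilibrium, and its equilibria
% approach those of the original game as \tau \to 0.
```

Strong convexity of the regularized payoff in each player's strategy is what enables last-iterate (rather than only average-iterate) convergence arguments.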

NeurIPS Conference 2022 Conference Paper

Safe Opponent-Exploitation Subgame Refinement

  • Mingyang Liu
  • Chengjie Wu
  • Qihan Liu
  • Yansen Jing
  • Jun Yang
  • Pingzhong Tang
  • Chongjie Zhang

In zero-sum games, an NE strategy tends to be overly conservative when confronted with opponents of limited rationality, because it does not actively exploit their weaknesses. From another perspective, best responding to an estimated opponent model is vulnerable to estimation errors and lacks safety guarantees. Inspired by the recent success of real-time search algorithms in developing superhuman AI, we investigate the dilemma of safety versus opponent exploitation and present a novel real-time search framework, called Safe Exploitation Search (SES), which continuously interpolates between the two extremes of online strategy refinement. We prove that SES enjoys a theoretical upper bound on exploitability and a lower bound on evaluation performance. Additionally, SES enables computationally efficient online adaptation to a possibly updating opponent model, whereas previous safe exploitation methods have to recompute for the whole game. Empirical results show that SES significantly outperforms NE baselines and previous algorithms while keeping exploitability low.
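The safety/exploitation trade-off that SES navigates inside a search framework can be seen in its simplest form on a matrix game. The sketch below (a toy, not SES itself) mixes a best response to a biased opponent model with the NE strategy; by linearity, the mixture's exploitability is bounded by the mixing weight, while its payoff against the modeled opponent improves over pure NE play. The game, weight, and opponent model are illustrative choices.

```python
import numpy as np

# Matching pennies payoff for the row player; the column player minimizes.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
nash = np.array([0.5, 0.5])         # NE strategy: safe, unexploitable
opp_model = np.array([0.8, 0.2])    # estimated (biased) opponent strategy

def best_response(A, y):
    # Pure best response of the row player to opponent model y
    br = np.zeros(len(A))
    br[np.argmax(A @ y)] = 1.0
    return br

def exploitability(A, x):
    # Row player's worst-case shortfall vs. the game value (0 here),
    # against a best-responding pure column
    return 0.0 - float((x @ A).min())

lam = 0.3                                  # exploitation weight
br = best_response(A, opp_model)
mixed = lam * br + (1.0 - lam) * nash      # interpolated strategy
```

Here `exploitability(mixed)` equals `lam * exploitability(br)` exactly (bilinear payoff), the toy analogue of an exploitability upper bound that scales with how far the strategy leans toward exploitation.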