Arrow Research search

Author name cluster

Lai Wei

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers
2 author rows

Possible papers (8)

AAAI Conference 2026 Conference Paper

Clustering with Self-Learned Graph Regression

  • Lai Wei
  • Jin Liu

Graph-based clustering algorithms aim to construct an affinity graph that accurately captures the intrinsic structure of a dataset. To achieve this goal, these algorithms often use the k-nearest-neighbor (k-nn) method to build a graph regularizer for the required affinity graph, enabling it to have a grouping effect. However, due to the complex nature of real-world data, the k-nn method often fails to capture the true neighborhood relationships of a dataset, which in turn limits the quality of the learned affinity graph. Motivated by the insight that a learned affinity graph itself can more effectively reflect the underlying data structure, we propose a new graph-based clustering method, termed Self-learned Graph Regression (SGR). Unlike traditional approaches, SGR constructs its graph regularizer directly from the affinity graph being learned, allowing the graph to adaptively capture more accurate structural information. To solve the proposed problem, we develop an optimization algorithm along with an acceleration strategy. We further analyze the convergence and computational complexity of the proposed algorithm. Extensive clustering experiments on various benchmark datasets demonstrate that our method outperforms the state-of-the-art graph-based clustering algorithms.
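The fixed k-nn regularizer that the abstract says SGR replaces can be sketched as a Gaussian-weighted k-nearest-neighbor affinity graph. This is a generic baseline construction, not the paper's code; the function name and weighting scheme are illustrative.

```python
import numpy as np

def knn_affinity(X, k=3, sigma=1.0):
    """Symmetric k-nearest-neighbor affinity graph with Gaussian weights.
    This is the fixed regularizer graph that k-nn-based clustering methods
    start from; SGR instead learns the regularizer jointly with the
    affinity graph itself."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]                # skip self (dist 0)
        W[i, nbrs] = np.exp(-d2[i, nbrs] / (2 * sigma ** 2))
    return np.maximum(W, W.T)                            # symmetrize
```

With two well-separated point clouds and a small k, the graph connects only within-cluster neighbors, which is exactly the grouping effect such a regularizer is meant to encourage.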

AAAI Conference 2026 Conference Paper

SASST: Leveraging Syntax-Aware Chunking and LLMs for Simultaneous Speech Translation

  • Zeyu Yang
  • Lai Wei
  • Roman Koshkin
  • Xi Chen
  • Satoshi Nakamura

This work proposes a grammar-based chunking strategy that segments input streams into semantically complete units by parsing dependency relations (e.g., noun-phrase boundaries, verb-object structures) and punctuation features. The method ensures chunk coherence and minimizes semantic fragmentation. Building on this mechanism, we present SASST (Syntax-Aware Simultaneous Translation), an end-to-end framework integrating a frozen Whisper encoder with a decoder-only LLM. The unified architecture dynamically outputs translation tokens or symbols to jointly optimize translation timing and content, with target-side reordering addressing word-order divergence. Experiments on the CoVoST2 multilingual corpus (En→De/Zh/Ja) demonstrate significant translation quality improvements across languages, validating the effectiveness of syntactic structures in LLM-driven SimulST systems.

NeurIPS Conference 2025 Conference Paper

First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training

  • Lai Wei
  • Yuting Li
  • Chen Wang
  • Yue Wang
  • Linghe Kong
  • Weiran Huang
  • Lichao Sun

Improving Multi-modal Large Language Models (MLLMs) in the post-training stage typically relies on supervised fine-tuning (SFT) or reinforcement learning (RL), which require expensive, manually annotated multi-modal data, an ultimately unsustainable resource. This limitation has motivated growing interest in unsupervised paradigms as a third stage of post-training after SFT and RL. While recent efforts have explored this direction, their methods are complex and difficult to iterate on. To address this, we propose MM-UPT, a simple yet effective framework for unsupervised post-training of MLLMs, enabling continual self-improvement without any external supervision. MM-UPT builds upon GRPO, replacing traditional reward signals with a self-rewarding mechanism based on majority voting over multiple sampled responses. Our experiments demonstrate that this training method effectively improves the reasoning ability of Qwen2.5-VL-7B (e.g., 66.3% → 72.9% on MathVista, 62.9% → 68.7% on We-Math) using standard datasets without ground-truth labels. To further explore scalability, we extend our framework to a data self-generation setting, designing two strategies that prompt the MLLM to synthesize new training samples on its own. Additional experiments show that combining these synthetic data with the unsupervised training method can further boost performance, highlighting a promising approach to scalable self-improvement. Overall, MM-UPT offers a new paradigm for autonomous enhancement of MLLMs, serving as a critical third step after initial SFT and RL in the absence of external supervision. Our code is available at https://github.com/waltonfuture/MM-UPT.
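The self-rewarding signal described in the abstract — majority voting over multiple sampled responses in place of a ground-truth reward, combined with GRPO's group-relative advantage normalization — can be sketched in a few lines. This is a minimal illustration of the idea, not the MM-UPT implementation; function names and the example answers are hypothetical.

```python
from collections import Counter

def majority_vote_rewards(answers):
    """Reward each sampled answer 1.0 if it matches the group's majority
    answer, else 0.0 -- a self-supervised stand-in for a labeled reward."""
    majority, _ = Counter(answers).most_common(1)[0]
    return [1.0 if a == majority else 0.0 for a in answers]

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize rewards within the sampled group,
    so majority answers get positive advantage and minority ones negative."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    return [(r - mean) / (var ** 0.5 + eps) for r in rewards]

# Five sampled answers to the same question; "7" wins the vote:
samples = ["7", "7", "5", "7", "12"]
rewards = majority_vote_rewards(samples)   # [1.0, 1.0, 0.0, 1.0, 0.0]
advs = group_relative_advantages(rewards)
```

The key property is that no external label is ever consulted: the model's own sample agreement defines the reward, and the within-group standardization is what the policy-gradient update would then consume.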

ICML Conference 2024 Conference Paper

Adaptive Online Experimental Design for Causal Discovery

  • Muhammad Qasim Elahi
  • Lai Wei
  • Murat Kocaoglu
  • Mahsa Ghasemi

Causal discovery aims to uncover cause-and-effect relationships encoded in causal graphs by leveraging observational data, interventional data, or their combination. The majority of existing causal discovery methods are developed assuming access to infinite interventional data. We focus on interventional data efficiency and formalize causal discovery from the perspective of online learning, inspired by pure exploration in bandit problems. A graph separating system, consisting of interventions that cut every edge of the graph at least once, is sufficient for learning causal graphs when infinite interventional data is available, even in the worst case. We propose a track-and-stop causal discovery algorithm that adaptively selects interventions from the graph separating system via allocation matching and learns the causal graph based on sampling history. Given any desired confidence value, the algorithm determines a termination condition and runs until it is met. We analyze the algorithm to establish a problem-dependent upper bound on the expected number of required interventional samples. Our proposed algorithm outperforms existing methods in simulations across various randomly generated causal graphs. It achieves higher accuracy, measured by the structural Hamming distance (SHD) between the learned causal graph and the ground truth, with significantly fewer samples.
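The graph separating system the abstract relies on — a set of interventions such that every edge has exactly one endpoint intervened on at least once — admits a standard binary-code construction using about log₂(n) interventions. The sketch below illustrates that generic construction, not the paper's adaptive algorithm; the function names are illustrative.

```python
def separating_system(nodes):
    """Binary-code separating system: intervention b contains exactly the
    nodes whose index has bit b set. Any two distinct nodes differ in some
    bit, so every edge of any graph on `nodes` is cut at least once."""
    idx = {v: i for i, v in enumerate(nodes)}
    bits = max(1, (len(nodes) - 1).bit_length())
    return [{v for v in nodes if (idx[v] >> b) & 1} for b in range(bits)]

def cuts_all_edges(system, edges):
    """An edge (u, v) is cut by intervention S when exactly one endpoint
    lies in S; check that every edge is cut by some intervention."""
    return all(any((u in S) != (v in S) for S in system) for u, v in edges)
```

For five nodes this yields three interventions, and they cut every edge of the complete graph, hence of any graph on those nodes; the paper's contribution is in *adaptively allocating* samples across such interventions rather than in constructing the system itself.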

NeurIPS Conference 2024 Conference Paper

Diff-eRank: A Novel Rank-Based Metric for Evaluating Large Language Models

  • Lai Wei
  • Zhiquan Tan
  • Chenghai Li
  • Jindong Wang
  • Weiran Huang

Large Language Models (LLMs) have transformed natural language processing and extended their powerful capabilities to multi-modal domains. As LLMs continue to advance, it is crucial to develop diverse and appropriate metrics for their evaluation. In this paper, we introduce a novel rank-based metric, Diff-eRank, grounded in principles from information theory and geometry. Diff-eRank assesses LLMs by analyzing their hidden representations, providing a quantitative measure of how efficiently they eliminate redundant information during training. We demonstrate the applicability of Diff-eRank in both single-modal (e.g., language) and multi-modal settings. For language models, our results show that Diff-eRank increases with model size and correlates well with conventional metrics such as loss and accuracy. In the multi-modal context, we propose an alignment evaluation method based on the eRank, and verify that contemporary multi-modal LLMs exhibit strong alignment performance based on our method. Our code is publicly available at https://github.com/waltonfuture/Diff-eRank.
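The effective rank underlying Diff-eRank is the exponential of the Shannon entropy of a matrix's normalized singular-value distribution. Below is a minimal numpy sketch of that quantity and of taking a difference between "before" and "after" representations; the paper's exact normalization (e.g., covariance centering) may differ, and the matrices here are synthetic stand-ins, not model activations.

```python
import numpy as np

def effective_rank(X):
    """Effective rank: exponential of the Shannon entropy of the normalized
    singular-value distribution of X. A flat spectrum gives eRank close to
    rank(X); a concentrated spectrum gives a much smaller value."""
    s = np.linalg.svd(X, compute_uv=False)
    p = s / s.sum()
    p = p[p > 1e-12]          # drop numerically zero singular values
    return float(np.exp(-(p * np.log(p)).sum()))

# Hypothetical "before vs. after training" representations of the same
# inputs; compressing redundant directions lowers the effective rank:
rng = np.random.default_rng(0)
before = rng.normal(size=(64, 32))
after = before @ np.diag([1.0] * 8 + [0.01] * 24)
diff_erank = effective_rank(before) - effective_rank(after)
```

Sanity checks on the definition: an identity matrix (flat spectrum) has eRank equal to its dimension, a rank-one matrix has eRank 1, and the synthetic "after" matrix above, with most of its spectrum suppressed, scores strictly lower than "before".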

AAAI Conference 2024 Conference Paper

Discriminatively Fuzzy Multi-View K-means Clustering with Local Structure Preserving

  • Jun Yin
  • Shiliang Sun
  • Lai Wei
  • Pei Wang

Multi-view K-means clustering successfully generalizes K-means from the single-view to the multi-view setting and achieves excellent clustering performance. In every view, it makes each data point close to the center of the corresponding cluster. However, multi-view K-means only considers the compactness of each cluster and ignores the separability of different clusters, which is of great importance for producing a good clustering result. In this paper, we propose Discriminatively Fuzzy Multi-view K-means clustering with Local Structure Preserving (DFMKLS). On the basis of minimizing the distance between each data point and the center of the corresponding cluster, DFMKLS separates clusters by maximizing the distance between the centers of pairwise clusters. DFMKLS also relaxes its objective by introducing the idea of fuzzy clustering, which calculates the probability that a data point belongs to each cluster. Since multi-view K-means mainly focuses on the global information of the data, we integrate local structure preserving into the framework of DFMKLS to efficiently exploit local information. The effectiveness of DFMKLS is evaluated on benchmark multi-view datasets, where it achieves superior performance compared with state-of-the-art multi-view clustering methods, including multi-view K-means.
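The "probability that a data point belongs to each cluster" that the abstract borrows from fuzzy clustering is, in its classic fuzzy c-means form, a closed-form function of the point's distances to the cluster centers. The sketch below shows that standard update only — it is not DFMKLS, which adds discriminative and local-structure terms on top.

```python
import numpy as np

def fuzzy_memberships(X, centers, m=2.0):
    """Classic fuzzy c-means membership update: u[i, j] is the probability
    that point i belongs to cluster j, with fuzzifier m > 1. Memberships
    across clusters sum to 1 for every point."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
    d = np.maximum(d, 1e-12)                          # guard divide-by-zero
    # u[i, j] = 1 / sum_k (d[i, j] / d[i, k]) ** (2 / (m - 1))
    ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=2)
```

A point sitting on a center receives membership near 1 for that cluster and near 0 elsewhere, while points between centers receive graded memberships, which is the relaxation of the hard K-means assignment the abstract refers to.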

NeurIPS Conference 2023 Conference Paper

Approximate Allocation Matching for Structural Causal Bandits with Unobserved Confounders

  • Lai Wei
  • Muhammad Qasim Elahi
  • Mahsa Ghasemi
  • Murat Kocaoglu

The structural causal bandit provides a framework for online decision-making problems when causal information is available. It models the stochastic environment with a structural causal model (SCM) that governs the causal relations between random variables. In each round, an agent applies an intervention (or no intervention) by setting certain variables to some constants and receives a stochastic reward from a non-manipulable variable. Though the causal structure is given, the observational and interventional distributions of these random variables are unknown beforehand and can only be learned through interactions with the environment. Therefore, to maximize the expected cumulative reward, it is critical to balance the exploration-versus-exploitation tradeoff. We assume each random variable takes a finite number of distinct values, and consider a semi-Markovian setting, where random variables are affected by unobserved confounders. Using the canonical SCM formulation to discretize the domains of unobserved variables, we efficiently integrate samples to reduce model uncertainty. This gives the decision maker a natural advantage over those in a classical multi-armed bandit setup. We provide a logarithmic asymptotic regret lower bound for the structural causal bandit problem. Inspired by the lower bound, we design an algorithm that can utilize the causal structure to accelerate the learning process and take informative and rewarding interventions. We establish that our algorithm achieves logarithmic regret and demonstrate that it outperforms existing methods via simulations.

ICML Conference 2023 Conference Paper

Approximate Causal Effect Identification under Weak Confounding

  • Ziwei Jiang
  • Lai Wei
  • Murat Kocaoglu

Causal effect estimation has been studied by many researchers when only observational data is available. Sound and complete algorithms have been developed for pointwise estimation of identifiable causal queries. For non-identifiable causal queries, researchers have developed polynomial programs to estimate tight bounds on causal effects. However, these are computationally difficult to optimize for variables with large support sizes. In this paper, we analyze the effect of "weak confounding" on causal estimands. More specifically, under the assumption that the unobserved confounders that render a query non-identifiable have small entropy, we propose an efficient linear program to derive upper and lower bounds on the causal effect. We show that our bounds are consistent in the sense that as the entropy of unobserved confounders goes to zero, the gap between the upper and lower bound vanishes. Finally, we conduct synthetic and real data simulations to compare our bounds with the bounds obtained by existing work that cannot incorporate such entropy constraints, and show that our bounds are tighter in settings with weak confounders.