Author name cluster

Yiming Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

46 papers
2 author rows

Possible papers (46)

AAAI Conference 2026 Conference Paper

CO-Bench: Benchmarking Language Model Agents in Algorithm Search for Combinatorial Optimization

  • Weiwei Sun
  • Shengyu Feng
  • Shanda Li
  • Yiming Yang

Although LLM-based agents have attracted significant attention in domains such as software engineering and machine learning research, their role in advancing combinatorial optimization (CO) remains relatively underexplored. This gap underscores the need for a deeper understanding of their potential in tackling structured, constraint-intensive problems, a pursuit currently limited by the absence of comprehensive benchmarks for systematic investigation. To address this, we introduce CO-Bench, a benchmark suite featuring 36 real-world CO problems drawn from a broad range of domains and complexity levels. CO-Bench includes structured problem formulations and curated data to support rigorous investigation of LLM agents. We evaluate multiple agentic frameworks against established human-designed algorithms, revealing the strengths and limitations of existing LLM agents and identifying promising directions for future research.

TMLR Journal 2026 Journal Article

CodePDE: An Inference Framework for LLM-driven PDE Solver Generation

  • Shanda Li
  • Tanya Marwah
  • Junhong Shen
  • Weiwei Sun
  • Andrej Risteski
  • Yiming Yang
  • Ameet Talwalkar

Partial differential equations (PDEs) are fundamental to modeling physical systems, yet solving them remains a complex challenge. Traditional numerical solvers rely on expert knowledge to implement and are computationally expensive, while neural-network-based solvers require large training datasets and often lack interpretability. In this work, we frame PDE solving as a code generation task and introduce CodePDE, the first inference framework for generating PDE solvers using large language models (LLMs). With CodePDE, we present a thorough evaluation of the critical capacities of LLMs for PDE solving: reasoning, debugging, self-refinement, and test-time scaling. CodePDE shows that, with advanced inference-time algorithms and scaling strategies, LLMs can achieve strong performance across a range of representative PDE problems. We also identify novel insights into LLM-driven solver generation, such as trade-offs between solver reliability and sophistication, design principles for LLM-powered PDE solving agents, and failure modes of LLMs on hard tasks. These insights offer guidance for building more capable and reliable LLM-based scientific engines.
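
As a rough sketch of such an inference loop (not the CodePDE implementation; all prompts and helper names below are invented for illustration), the generate-execute-refine cycle can be as simple as:

```python
# Hedged sketch of the loop the abstract describes: an LLM writes a candidate
# solver, the code is executed, and runtime errors or poor accuracy are fed
# back for another round (debugging / self-refinement).

def generate_solver(llm, pde_description: str, rounds: int = 3) -> str:
    prompt = f"Write a Python function solve(u0, t) for this PDE:\n{pde_description}"
    code = llm(prompt)
    for _ in range(rounds):
        ok, report = run_and_score(code)          # execute against reference data
        if ok:
            break
        code = llm(f"{prompt}\n\nPrevious attempt:\n{code}\nIssue:\n{report}\nFix it:")
    return code

def run_and_score(code: str):
    try:
        namespace: dict = {}
        exec(code, namespace)                     # unsafe outside a sandbox!
        # ... evaluate namespace["solve"] on held-out initial conditions ...
        return True, "passed"
    except Exception as e:
        return False, repr(e)
```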

AAAI Conference 2026 Conference Paper

Do Large Language Models Think like the Brain? Sentence-Level Evidences from Layer-Wise Embeddings and fMRI

  • Yu Lei
  • Xingyang Ge
  • Yi Zhang
  • Yiming Yang
  • Bolei Ma

Understanding whether large language models (LLMs) and the human brain converge on similar computational principles remains a fundamental and important question in cognitive neuroscience and AI. Do the brain-like patterns observed in LLMs emerge simply from scaling, or do they reflect deeper alignment with the architecture of human language processing? This study focuses on the sentence-level neural mechanisms of language models, systematically investigating how layer-wise representations in LLMs align with the dynamic neural responses during human sentence comprehension. By comparing hierarchical embeddings from 14 publicly available LLMs with fMRI data collected from participants exposed to a naturalistic narrative story, we constructed sentence-level neural prediction models to identify the model layers most significantly correlated with brain region activations. Results show that improvements in model performance drive the evolution of representational architectures toward brain-like hierarchies, particularly achieving stronger functional and anatomical correspondence at higher semantic abstraction levels. These findings advance our understanding of the computational parallels between LLMs and the human brain, highlighting the potential of LLMs as models for human language processing.
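
For readers unfamiliar with this kind of analysis, the following is a generic sketch of a layer-wise encoding pipeline of the sort the abstract outlines; it is the standard encoding-model recipe with stand-in data, not the authors' code:

```python
# For each LLM layer, fit a ridge regression from sentence embeddings to fMRI
# responses and record held-out correlation; the layer with the highest score
# is the one most aligned with a brain region. All data below are stand-ins.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

def layer_brain_correlation(layer_embs, voxels, alpha=10.0):
    """layer_embs: list of (n_sentences, d) arrays; voxels: (n_sentences, v)."""
    scores = []
    for X in layer_embs:
        Xtr, Xte, ytr, yte = train_test_split(X, voxels, test_size=0.2, random_state=0)
        pred = Ridge(alpha=alpha).fit(Xtr, ytr).predict(Xte)
        # mean Pearson r across voxels
        r = [np.corrcoef(pred[:, j], yte[:, j])[0, 1] for j in range(yte.shape[1])]
        scores.append(float(np.mean(r)))
    return scores   # one alignment score per layer

rng = np.random.default_rng(0)
embs = [rng.normal(size=(100, 32)) for _ in range(4)]   # stand-in for 4 layers
fmri = rng.normal(size=(100, 10))                       # stand-in voxel data
print(layer_brain_correlation(embs, fmri))
```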

AAAI Conference 2026 Conference Paper

DriveFlow: Rectified Flow Adaptation for Robust 3D Object Detection in Autonomous Driving

  • Hongbin Lin
  • Yiming Yang
  • Chaoda Zheng
  • Yifan Zhang
  • Shuaicheng Niu
  • Zilu Guo
  • Yafeng Li
  • Gui Gui

In autonomous driving, vision-centric 3D object detection recognizes and localizes 3D objects from RGB images. However, due to high annotation costs and diverse outdoor scenes, training data often fails to cover all possible test scenarios, known as the out-of-distribution (OOD) issue. Training-free image editing offers a promising solution for improving model robustness by training data enhancement without any modifications to pre-trained diffusion models. Nevertheless, inversion-based methods often suffer from limited effectiveness and inherent inaccuracies, while recent rectified-flow-based approaches struggle to preserve objects with accurate 3D geometry. In this paper, we propose DriveFlow, a Rectified Flow Adaptation method for training data enhancement in autonomous driving based on pre-trained Text-to-Image flow models. Based on frequency decomposition, DriveFlow introduces two strategies to adapt noise-free editing paths derived from text-conditioned velocities. 1) High-Frequency Foreground Preservation: DriveFlow incorporates a high-frequency alignment loss for foreground to maintain precise 3D object geometry. 2) Dual-Frequency Background Optimization: DriveFlow also conducts dual-frequency optimization for background, balancing editing flexibility and semantic consistency. Comprehensive experiments validate the effectiveness and efficiency of DriveFlow, demonstrating comprehensive performance improvements on all categories across OOD scenarios.

AAAI Conference 2026 Conference Paper

From Pretrain to Pain: Adversarial Vulnerability of Video Foundation Models Without Task Knowledge

  • Hui Lu
  • Yi Yu
  • Song Xia
  • Yiming Yang
  • Deepu Rajan
  • Boon Poh Ng
  • Alex Kot
  • Xudong Jiang

Large-scale Video Foundation Models (VFMs) have significantly advanced various video-related tasks, either through task-specific models or Multi-modal Large Language Models (MLLMs). However, the open accessibility of VFMs also introduces critical security risks, as adversaries can exploit full knowledge of the VFMs to launch potent attacks. This paper investigates a novel and practical adversarial threat scenario: attacking downstream models or MLLMs fine-tuned from open-source VFMs, without requiring access to the victim task, training data, model queries, or architecture. In contrast to conventional transfer-based attacks that rely on task-aligned surrogate models, we demonstrate that adversarial vulnerabilities can be exploited directly from the VFMs. To this end, we propose the Transferable Video Attack (TVA), a temporal-aware adversarial attack method that leverages the temporal representation dynamics of VFMs to craft effective perturbations. TVA integrates a bidirectional contrastive learning mechanism to maximize the discrepancy between clean and adversarial features, and introduces a temporal consistency loss that exploits motion cues to enhance the sequential impact of perturbations. TVA avoids the need to train expensive surrogate models or to access domain-specific data, thereby offering a more practical and efficient attack strategy. Extensive experiments across 24 video-related tasks demonstrate the efficacy of TVA against downstream models and MLLMs, revealing a previously underexplored security vulnerability in the deployment of video models.

AAAI Conference 2026 Conference Paper

Less Is Better: Sparse Instance Learning for Cross-Domain Few-Shot Object Detection

  • Yali Huang
  • Jie Mei
  • Ziyi Wu
  • Yiming Yang
  • Hongru Zhao
  • Mingyuan Jiu
  • Hichem Sahbi

Cross-Domain Few-Shot Object Detection (CD-FSOD) is an extremely challenging task due to the inherent data scarcity and substantial domain shift between the source and target domains. Existing methods often suffer from overfitting and noisy feature representations, which hinder the construction of discriminative class prototypes in the target domain. In this paper, we propose a novel framework with sparse instance learning (SI-ViTO) for CD-FSOD, which leverages instance sparsity to achieve better detection with leaner representations. SI-ViTO adopts a dual-stage sparsity module that applies instance feature sparsity not only to the few-shot support images but also to the query images. This dual sparsity enables the model to preserve salient foreground semantics while filtering out redundant or noisy information. Furthermore, a new prototype calibration strategy dynamically refines the class prototypes with query instances to accelerate prototype adaptation. Extensive experimental results on CD-FSOD benchmarks show that SI-ViTO outperforms state-of-the-art methods, demonstrating that fewer, more discriminative representations yield better cross-domain few-shot detection performance than more abundant ones.

ICML Conference 2025 Conference Paper

Chaos Meets Attention: Transformers for Large-Scale Dynamical Prediction

  • Yi He
  • Yiming Yang
  • Xiaoyuan Cheng
  • Hai Wang
  • Xiao Xue
  • Boli Chen
  • Yukun Hu

Generating long-term trajectories of dissipative chaotic systems autoregressively is a highly challenging task. The inherent positive Lyapunov exponents amplify prediction errors over time. Many chaotic systems possess a crucial property, ergodicity on their attractors, which makes long-term prediction possible. State-of-the-art methods address ergodicity by preserving statistical properties using optimal transport techniques. However, these methods face scalability challenges due to the curse of dimensionality when matching distributions. To overcome this bottleneck, we propose a scalable transformer-based framework capable of stably generating long-term high-dimensional and high-resolution chaotic dynamics while preserving ergodicity. Our method is grounded in a physical perspective, revisiting the Von Neumann mean ergodic theorem to ensure the preservation of long-term statistics in the $\mathcal{L}^2$ space. We introduce novel modifications to the attention mechanism, making the transformer architecture well-suited for learning large-scale chaotic systems. Compared to operator-based and transformer-based methods, our model achieves better performance across five metrics, from short-term prediction accuracy to long-term statistics. In addition to our methodological contributions, we introduce new chaotic system benchmarks: a machine learning dataset of 140k snapshots of turbulent channel flow and a processed high-dimensional Kolmogorov Flow dataset, along with various evaluation metrics for both short- and long-term performance. Both are well-suited for machine learning research on chaotic systems.

NeurIPS Conference 2025 Conference Paper

Enhancing Bioactivity Prediction via Spatial Emptiness Representation of Protein-ligand Complex and Union of Multiple Pockets

  • Zhiyuan Zhou
  • Yueming Yin
  • Yiming Yang
  • Yuguang Mu
  • Hoi-Yeung Li
  • Adams Wai Kin Kong

Predicting the bioactivity of candidate ligands remains a central challenge in drug discovery. Ligands and endogenous substrates often compete for the same binding sites on target proteins, and the extent to which a ligand can modulate protein function depends not only on its binding but also on how effectively it occupies the relevant pocket. However, most existing methods focus narrowly on local interactions within protein–ligand complexes and neglect spatial emptiness—the unoccupied regions within the binding site that may permit endogenous molecules to engage or interfere. Such unfilled space can diminish the ligand’s functional impact, regardless of binding affinity. To overcome this key limitation in protein–ligand modeling, we propose LigoSpace, a novel method integrating three core components. LigoSpace introduces GeoREC (Geometric Representation of Spatial Emptiness in Complexes) to quantify atomic-level empty space and Union-Pocket to unify multiple protein pockets, providing a global view of binding sites. Additionally, LigoSpace employs a pairwise loss, instead of the commonly used MSE loss, to better capture relative relationships critical for drug discovery. Extensive experiments on multiple datasets with diverse bioactivity types demonstrate that LigoSpace significantly improves performance when integrated into state-of-the-art models, highlighting the effectiveness of its novel components.
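
The pairwise objective mentioned at the end is easy to illustrate. Below is a minimal, hypothetical sketch (names and shapes are ours, not from LigoSpace) of a logistic pairwise ranking loss that rewards correct ordering of ligand pairs rather than matching absolute activity values:

```python
# Logistic pairwise ranking loss over all pairs in a batch: the model is
# penalized when sign(pred_i - pred_j) disagrees with sign(label_i - label_j).
import torch
import torch.nn.functional as F

def pairwise_rank_loss(pred: torch.Tensor, label: torch.Tensor) -> torch.Tensor:
    diff_pred = pred.unsqueeze(0) - pred.unsqueeze(1)    # pred_j - pred_i, all pairs
    diff_true = label.unsqueeze(0) - label.unsqueeze(1)  # label_j - label_i
    sign = torch.sign(diff_true)
    mask = sign != 0                                     # skip tied pairs
    return F.softplus(-sign[mask] * diff_pred[mask]).mean()

pred = torch.tensor([0.2, 1.5, 0.7], requires_grad=True)
label = torch.tensor([5.1, 7.9, 6.0])   # e.g. measured pKd values
loss = pairwise_rank_loss(pred, label)
loss.backward()
```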

NeurIPS Conference 2025 Conference Paper

Enhancing Training Data Attribution with Representational Optimization

  • Weiwei Sun
  • Haokun Liu
  • Nikhil Kandpal
  • Colin Raffel
  • Yiming Yang

Training data attribution (TDA) methods aim to measure how training data impacts a model's predictions. While gradient-based attribution methods, such as influence functions, offer theoretical grounding, their computational costs make them impractical for large-scale applications. Representation-based approaches are far more scalable, but typically rely on heuristic embeddings that are not optimized for attribution, limiting their fidelity. To address these challenges, we propose AirRep, a scalable, representation-based approach that closes this gap by learning task-specific and model-aligned representations optimized explicitly for TDA. AirRep introduces two key innovations: a trainable encoder tuned for attribution quality, and an attention-based pooling mechanism that enables accurate estimation of group-wise influence. We train AirRep using a ranking objective over automatically constructed training subsets labeled by their empirical effect on target predictions. Experiments on instruction-tuned LLMs demonstrate that AirRep achieves performance on par with state-of-the-art gradient-based approaches while being nearly two orders of magnitude more efficient at inference time. Further analysis highlights its robustness and generalization across tasks and models. Our code is available at https://github.com/sunnweiwei/AirRep.
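
A toy reconstruction of the two named ingredients, a trainable encoder and attention-based pooling for group-wise influence, might look like the following; every class and dimension here is an assumption for illustration, not the AirRep architecture:

```python
# Per-example features are encoded, attention-pooled into one group embedding,
# and the group's influence on a target example is scored by similarity.
# Per the abstract, such a scorer would be trained with a ranking objective
# over subsets labeled by their empirical effect (not shown here).
import torch
import torch.nn as nn

class GroupInfluenceScorer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.attn = nn.Linear(dim, 1)   # scores each member of the group

    def forward(self, group: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
        # group: (n, dim) features of a candidate training subset
        # query: (dim,) feature of the target example
        h = self.encoder(group)                          # (n, dim)
        w = torch.softmax(self.attn(h).squeeze(-1), 0)   # (n,) attention weights
        pooled = (w.unsqueeze(-1) * h).sum(0)            # (dim,) group embedding
        return pooled @ self.encoder(query)              # scalar influence score

scorer = GroupInfluenceScorer(dim=16)
score = scorer(torch.randn(8, 16), torch.randn(16))
```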

NeurIPS Conference 2025 Conference Paper

Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning

  • Yiqun Chen
  • Lingyong Yan
  • Weiwei Sun
  • Xinyu Ma
  • Yi Zhang
  • Shuaiqiang Wang
  • Dawei Yin
  • Yiming Yang

Retrieval-augmented generation (RAG) is widely utilized to incorporate external knowledge into large language models, thereby enhancing factuality and reducing hallucinations in question-answering (QA) tasks. A standard RAG pipeline consists of several components, such as query rewriting, document retrieval, document filtering, and answer generation. However, these components are typically optimized separately through supervised fine-tuning, which can lead to misalignments between the objectives of individual components and the overarching aim of generating accurate answers. Although recent efforts have explored using reinforcement learning (RL) to optimize specific RAG components, these approaches often focus on simple pipelines with only two components or do not adequately address the complex interdependencies and collaborative interactions among the modules. To overcome these limitations, we propose treating the complex RAG pipeline with multiple components as a multi-agent cooperative task, in which each component can be regarded as an RL agent. Specifically, we present MMOA-RAG, a Multi-Module joint Optimization Algorithm for RAG (code available at https://github.com/chenyiqun/MMOA-RAG), which employs multi-agent reinforcement learning to harmonize all agents' goals toward a unified reward, such as the F1 score of the final answer. Experiments conducted on various QA benchmarks demonstrate that MMOA-RAG effectively boosts the overall performance of the pipeline and outperforms existing baselines. Furthermore, comprehensive ablation studies validate the contributions of individual components and demonstrate that MMOA-RAG can be adapted to different RAG pipelines and benchmarks.
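
The shared-reward idea is concrete enough to sketch. Assuming, as the abstract suggests, that the unified reward is the token-level F1 of the final answer, every agent in the pipeline can be credited with the same terminal signal; the helper below is the standard SQuAD-style F1, not code from MMOA-RAG:

```python
# Token-level F1 between the generated answer and the gold answer, used as a
# single cooperative reward shared by all pipeline components.
from collections import Counter

def answer_f1(prediction: str, gold: str) -> float:
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# Each agent receives the same reward for the episode.
reward = answer_f1("paris is the capital", "Paris")
rewards = {agent: reward for agent in ("rewriter", "retriever", "filter", "generator")}
```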

ICML Conference 2025 Conference Paper

Large Language Models are Demonstration Pre-Selectors for Themselves

  • Jiarui Jin
  • Yuwei Wu
  • Haoxuan Li 0001
  • Xiaoting He
  • Weinan Zhang 0001
  • Yiming Yang
  • Yong Yu 0001
  • Jun Wang 0012

In-context learning with large language models (LLMs) delivers strong few-shot performance by choosing few-shot demonstrations from the entire training dataset. However, previous few-shot in-context learning methods, which calculate similarity scores for choosing demonstrations, incur high computational costs by repeatedly retrieving large-scale datasets for each query. This is due to their failure to recognize that not all demonstrations are equally informative, and many less informative demonstrations can be inferred from a core set of highly informative ones. To this end, we propose FEEDER (FEw yet Essential Demonstration prE-selectoR), a novel pre-selection framework that identifies a core subset of demonstrations containing the most informative examples. This subset, referred to as the FEEDER set, consists of demonstrations that capture both the "sufficiency" and "necessity" information needed to infer the entire dataset. Because FEEDER is selected before few-shot in-context learning, it enables more efficient demonstration selection from a smaller set. To identify FEEDER, we propose a novel, effective tree-based algorithm. Once selected, it can replace the original dataset, leading to improved efficiency and prediction accuracy in few-shot in-context learning. FEEDER also benefits LLM fine-tuning: we propose a bi-level optimization method that enables more efficient training on the smaller dataset without sacrificing performance. Our experiments on 6 text classification datasets, 1 reasoning dataset, and 1 semantic-parsing dataset, across 6 LLMs (ranging from 335M to 7B parameters), demonstrate that: (i) in few-shot inference, FEEDER achieves superior (or comparable) performance while utilizing only half the input training data; (ii) in fine-tuning, FEEDER significantly boosts the performance of LLMs.

ICLR Conference 2025 Conference Paper

Learning Chaos In A Linear Way

  • Xiaoyuan Cheng
  • Yi He
  • Yiming Yang
  • Xiao Xue
  • Sibo Cheng
  • Daniel Giles
  • Xiaohang Tang
  • Yukun Hu

Learning long-term behaviors in chaotic dynamical systems, such as turbulent flows and climate modelling, is challenging due to their inherent instability and unpredictability. These systems exhibit positive Lyapunov exponents, which significantly hinder accurate long-term forecasting. As a result, understanding long-term statistical behavior is far more valuable than focusing on short-term accuracy. While autoregressive deep sequence models have been applied to capture long-term behavior, they often lead to exponentially increasing errors in learned dynamics. To address this, we shift the focus from simple prediction errors to preserving an invariant measure in dissipative chaotic systems. These systems have attractors, where trajectories settle, and the invariant measure is the probability distribution on attractors that remains unchanged under dynamics. Existing methods generate long trajectories of dissipative chaotic systems by aligning invariant measures, but it is not always possible to obtain invariant measures for arbitrary datasets. We propose the Poincaré Flow Neural Network (PFNN), a novel operator learning framework designed to capture behaviors of chaotic systems without any explicit knowledge of the invariant measure. PFNN employs an auto-encoder to map the chaotic system to a finite-dimensional feature space, effectively linearizing the chaotic evolution. It then learns the linear evolution operators to match the physical dynamics by addressing two critical properties in dissipative chaotic systems: (1) contraction, the system’s convergence toward its attractors, and (2) measure invariance, trajectories on the attractors following a probability distribution invariant to the dynamics. Our experiments on a variety of chaotic systems, including Lorenz systems, Kuramoto-Sivashinsky equation and Navier–Stokes equation, demonstrate that PFNN has more accurate predictions and physical statistics compared to competitive baselines including the Fourier Neural Operator and the Markov Neural Operator.
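
A stripped-down sketch of the central construction, an auto-encoder plus a learned linear evolution operator in feature space, is given below. The contraction and measure-invariance constraints that distinguish PFNN are omitted; all sizes and names are illustrative:

```python
# Encode the state, advance it one step with a *linear* map K, decode back:
# decode(K(encode(x_t))) should approximate x_{t+1}, plus a reconstruction term.
import torch
import torch.nn as nn

state_dim, latent_dim = 64, 128
encoder = nn.Sequential(nn.Linear(state_dim, 256), nn.Tanh(), nn.Linear(256, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.Tanh(), nn.Linear(256, state_dim))
K = nn.Linear(latent_dim, latent_dim, bias=False)   # linear evolution operator

opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters(), *K.parameters()], lr=1e-3)
x_t, x_next = torch.randn(32, state_dim), torch.randn(32, state_dim)  # stand-in trajectory pairs

for _ in range(10):
    z = encoder(x_t)
    pred = decoder(K(z))
    loss = nn.functional.mse_loss(pred, x_next) + nn.functional.mse_loss(decoder(z), x_t)
    opt.zero_grad(); loss.backward(); opt.step()
```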

NeurIPS Conference 2025 Conference Paper

Safe and Stable Control via Lyapunov-Guided Diffusion Models

  • Xiaoyuan Cheng
  • Xiaohang Tang
  • Yiming Yang

Diffusion models have made significant strides in recent years, exhibiting strong generalization capabilities in planning and control tasks. However, most diffusion-based policies remain focused on reward maximization or cost minimization, often overlooking critical aspects of safety and stability. In this work, we propose Safe and Stable Diffusion ($S^2$Diff), a model-based framework that explores how diffusion models can ensure safety and stability from a Lyapunov perspective. We demonstrate that $S^2$Diff eliminates the reliance on both complex gradient-based solvers (e.g., quadratic programming, non-convex solvers) and control-affine structures, leading to globally valid control policies driven by the learned certificate functions. Additionally, we uncover intrinsic connections between diffusion sampling and almost Lyapunov theory, enabling the use of trajectory-level control policies to learn better certificate functions for safety and stability guarantees. To validate our approach, we conduct experiments on a wide variety of dynamical control systems, where $S^2$Diff consistently outperforms both certificate-based controllers and model-based diffusion baselines in terms of safety, stability, and overall control performance.

AAAI Conference 2025 Conference Paper

SORREL: Suboptimal-Demonstration-Guided Reinforcement Learning for Learning to Branch

  • Shengyu Feng
  • Yiming Yang

Mixed Integer Linear Program (MILP) solvers are mostly built upon a Branch-and-Bound (B&B) algorithm, where the efficiency of traditional solvers heavily depends on hand-crafted heuristics for branching. The past few years have witnessed the increasing popularity of data-driven approaches to automatically learn these heuristics. However, the success of these methods is highly dependent on the availability of high-quality demonstrations, which requires either the development of near-optimal heuristics or a time-consuming sampling process. This paper sidesteps this challenge by proposing Suboptimal-Demonstration-Guided Reinforcement Learning (SORREL) for learning to branch. SORREL selectively learns from suboptimal demonstrations based on value estimation. It utilizes suboptimal demonstrations through both offline reinforcement learning on the demonstrations generated by suboptimal heuristics and self-imitation learning on past good experiences sampled by itself. Our experiments demonstrate its superior performance in both branching quality and training efficiency over previous methods for various MILPs.

ICML Conference 2025 Conference Paper

Tensor-Var: Efficient Four-Dimensional Variational Data Assimilation

  • Yiming Yang
  • Xiaoyuan Cheng
  • Daniel Giles
  • Sibo Cheng
  • Yi He
  • Xiao Xue
  • Boli Chen
  • Yukun Hu

Variational data assimilation estimates the dynamical system states by minimizing a cost function that fits the numerical models with the observational data. Although four-dimensional variational assimilation (4D-Var) is widely used, it faces high computational costs in complex nonlinear systems and depends on imperfect state-observation mappings. Deep learning (DL) offers more expressive approximators, but integrating DL models into 4D-Var is challenging due to their nonlinearity and lack of theoretical guarantees on assimilation results. In this paper, we propose Tensor-Var, a novel framework that integrates kernel conditional mean embedding (CME) with 4D-Var to linearize nonlinear dynamics, achieving convex optimization in a learned feature space. Moreover, our method provides a new perspective for solving 4D-Var in a linear way, offering theoretical guarantees of consistent assimilation results between the original and feature spaces. To handle large-scale problems, we propose a method to learn deep features (DFs) using neural networks within the Tensor-Var framework. Experiments on chaotic systems and global weather prediction with real-time observations show that Tensor-Var outperforms conventional and DL hybrid 4D-Var baselines in accuracy while achieving a 10- to 20-fold speed improvement.
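
For context, the strong-constraint 4D-Var cost that this line of work starts from is the textbook expression below; per the abstract, Tensor-Var's move is to learn features in which the model and observation operators act linearly, making the cost convex:

```latex
% Textbook strong-constraint 4D-Var cost: background term plus misfits to the
% observations y_t under the (generally nonlinear) model M and observation
% operator H. Tensor-Var linearizes M and H in a learned feature space.
J(x_0) = \tfrac{1}{2}\,(x_0 - x_b)^\top B^{-1} (x_0 - x_b)
       + \tfrac{1}{2}\sum_{t=0}^{T}
         \bigl(y_t - \mathcal{H}_t(\mathcal{M}_{0\to t}(x_0))\bigr)^\top
         R_t^{-1}
         \bigl(y_t - \mathcal{H}_t(\mathcal{M}_{0\to t}(x_0))\bigr)
```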

ICLR Conference 2025 Conference Paper

TFG-Flow: Training-free Guidance in Multimodal Generative Flow

  • Haowei Lin
  • Shanda Li
  • Haotian Ye
  • Yiming Yang
  • Stefano Ermon
  • Yitao Liang
  • Jianzhu Ma

Given an unconditional generative model and a predictor for a target property (e.g., a classifier), the goal of training-free guidance is to generate samples with desirable target properties without additional training. As a highly efficient technique for steering generative models toward flexible outcomes, training-free guidance has gained increasing attention in diffusion models. However, existing methods only handle data in continuous spaces, while many scientific applications involve both continuous and discrete data (referred to as multimodality). Another emerging trend is the growing use of the simple and general flow matching framework in building generative foundation models, where guided generation remains under-explored. To address this, we introduce TFG-Flow, a novel training-free guidance method for multimodal generative flow. TFG-Flow addresses the curse of dimensionality while maintaining unbiased sampling when guiding discrete variables. We validate TFG-Flow on four molecular design tasks and show that TFG-Flow has great potential in drug design by generating molecules with desired properties.

AAAI Conference 2025 Conference Paper

Topo2Seq: Enhanced Topology Reasoning via Topology Sequence Learning

  • Yiming Yang
  • Yueru Luo
  • Bingkun He
  • Erlong Li
  • Zhipeng Cao
  • Chao Zheng
  • Shuqi Mei
  • Zhen Li

Extracting lane topology from perspective views (PV) is crucial for planning and control in autonomous driving. This approach extracts potential drivable trajectories for self-driving vehicles without relying on high-definition (HD) maps. However, the unordered nature and weak long-range perception of the DETR-like framework can result in misaligned segment endpoints and limited topological prediction capabilities. Inspired by the learning of contextual relationships in language models, the connectivity relations in roads can be characterized as explicit topology sequences. In this paper, we introduce Topo2Seq, a novel approach for enhancing topology reasoning via topology sequence learning. The core concept of Topo2Seq is a randomized order prompt-to-sequence learning between the lane segment decoder and the topology sequence decoder. The dual-decoder branches simultaneously learn the lane topology sequences extracted from the Directed Acyclic Graph (DAG) and the lane graph containing geometric information. Randomized order prompt-to-sequence learning extracts unordered key points from the lane graph predicted by the lane segment decoder, which are then fed into the prompt design of the topology sequence decoder to reconstruct an ordered and complete lane graph. In this way, the lane segment decoder learns powerful long-range perception and accurate topological reasoning from the topology sequence decoder. Notably, the topology sequence decoder is introduced only during training and does not affect inference efficiency. Experimental evaluations on the OpenLane-V2 dataset demonstrate the state-of-the-art performance of Topo2Seq in topology reasoning.

NeurIPS Conference 2024 Conference Paper

AutoMix: Automatically Mixing Language Models

  • Pranjal Aggarwal
  • Aman Madaan
  • Ankit Anand
  • Srividya Pranavi Potharaju
  • Swaroop Mishra
  • Pei Zhou
  • Aditya Gupta
  • Dheeraj Rajagopal

Large language models (LLMs) are now available from cloud API providers in various sizes and configurations. While this diversity offers a broad spectrum of choices, effectively leveraging the options to optimize computational cost and performance remains challenging. In this work, we present AutoMix, an approach that strategically routes queries to larger LMs, based on the approximate correctness of outputs from a smaller LM. Central to AutoMix are two key technical contributions. First, it has a few-shot self-verification mechanism, which estimates the reliability of its own outputs without requiring extensive training. Second, given that self-verification can be noisy, it employs a POMDP-based router that can effectively select an appropriately sized model, based on answer confidence. Experiments across five language models and five challenging datasets show that AutoMix consistently surpasses strong baselines, reducing computational cost by over 50% for comparable performance.
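
The routing logic is easy to picture. Below is a deliberately simplified, hypothetical version in which the POMDP router is replaced by a single confidence threshold; all function names are stand-ins:

```python
# A small model answers first; a few-shot self-verification score decides
# whether to keep the draft or escalate the query to a larger model.

def automix_answer(query: str, small_lm, large_lm, verifier, threshold: float = 0.7) -> str:
    draft = small_lm(query)
    confidence = verifier(query, draft)   # self-verification estimate in [0, 1]
    if confidence >= threshold:
        return draft                      # cheap path: keep the small model's answer
    return large_lm(query)                # escalate only when verification is weak

# Toy stand-ins so the sketch runs end to end.
small = lambda q: "draft answer"
large = lambda q: "careful answer"
verify = lambda q, a: 0.4                 # pretend verifier; always escalates here
print(automix_answer("What is the boiling point of neon?", small, large, verify))
```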

NeurIPS Conference 2024 Conference Paper

Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision

  • Zhiqing Sun
  • Longhui Yu
  • Yikang Shen
  • Weiyang Liu
  • Yiming Yang
  • Sean Welleck
  • Chuang Gan

Current AI alignment methodologies rely on human-provided demonstrations or judgments, and the learned capabilities of AI systems would be upper-bounded by human capabilities as a result. This raises a challenging research question: How can we keep improving the systems when their capabilities have surpassed the levels of humans? This paper answers this question in the context of tackling hard reasoning tasks (e.g., level 4-5 MATH problems) via learning from human annotations on easier tasks (e.g., level 1-3 MATH problems), which we term easy-to-hard generalization. Our key insight is that an evaluator (reward model) trained on supervision for easier tasks can be effectively used for scoring candidate solutions of harder tasks, hence facilitating easy-to-hard generalization over different levels of tasks. Based on this insight, we propose a novel approach to scalable alignment, which first trains (process-supervised) reward models on easy problems (e.g., level 1-3), and then uses them to evaluate the performance of policy models on hard problems. We show that such easy-to-hard generalization from evaluators can enable easy-to-hard generalizations in generators either through re-ranking or reinforcement learning (RL). Notably, our process-supervised 7b RL model and 34b model (reranking@1024) achieve accuracies of 34.0% and 52.5% on MATH500, respectively, despite only using human supervision on easy problems. Our approach suggests a promising path toward AI systems that advance beyond the frontier of human supervision.
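
The re-ranking route can be sketched in a few lines: a reward model trained on easy problems scores n sampled solutions to a hard problem, and the best is kept. Everything below is a toy stand-in, not the paper's code:

```python
# Best-of-n re-ranking with an evaluator: sample candidates from the generator
# and return the one the reward model scores highest.
import random

def rerank_best_of_n(problem: str, generator, reward_model, n: int = 8) -> str:
    candidates = [generator(problem) for _ in range(n)]
    return max(candidates, key=lambda sol: reward_model(problem, sol))

# Toy stand-ins to make the sketch executable.
gen = lambda p: f"solution-{random.randint(0, 99)}"
rm = lambda p, s: len(s)        # pretend process-reward score
print(rerank_best_of_n("Prove x^2 >= 0.", gen, rm))
```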

ECAI Conference 2023 Conference Paper

Cardsformer: Grounding Language to Learn a Generalizable Policy in Hearthstone

  • Wannian Xia
  • Yiming Yang
  • Jingqing Ruan
  • Dengpeng Xing
  • Bo Xu 0002

Hearthstone is a widely played collectible card game that challenges players to strategize using cards with various effects described in natural language. While human players can easily comprehend card descriptions and make informed decisions, artificial agents struggle to understand the game’s inherent rules and are unable to generalize their policies through natural language. To address this issue, we propose Cardsformer, a method capable of acquiring linguistic knowledge and learning a generalizable policy in Hearthstone. Cardsformer consists of a Prediction Model trained with offline trajectories to predict state transitions based on card descriptions and a Policy Model capable of generalizing its policy on unseen cards. To our knowledge, this is the first work to consider language knowledge in a card game. Experiments show that our approach significantly improves data efficiency and outperforms the state-of-the-art in Hearthstone even when there are untrained cards in the deck, suggesting a new perspective on tackling such problems with knowledge representations from large language models. As the game constantly releases new cards along with new descriptions and new effects, the challenge in Hearthstone remains. To encourage further research, we make our code publicly available and publish PyStone, the code base of Hearthstone on which we conducted our experiments, as an open benchmark.

NeurIPS Conference 2023 Conference Paper

DIFUSCO: Graph-based Diffusion Solvers for Combinatorial Optimization

  • Zhiqing Sun
  • Yiming Yang

Neural network-based Combinatorial Optimization (CO) methods have shown promising results in solving various NP-complete (NPC) problems without relying on hand-crafted domain knowledge. This paper broadens the current scope of neural solvers for NPC problems by introducing a new graph-based diffusion framework, namely DIFUSCO. It formulates NPC problems into a discrete {0, 1}-vector space and uses graph-based denoising diffusion models to generate high-quality solutions. Specifically, we explore diffusion models with Gaussian and Bernoulli noise, respectively, and also introduce an effective inference schedule to improve the generation quality. We evaluate our methods on two well-studied combinatorial optimization problems: Traveling Salesman Problem (TSP) and Maximal Independent Set (MIS). Experimental results show that DIFUSCO strongly outperforms the previous state-of-the-art neural solvers, narrowing the performance gap between ground-truth and neural solvers from 1.76% to 0.46% on TSP-500, from 2.46% to 1.17% on TSP-1000, and from 3.19% to 2.58% on TSP-10000. For the MIS problem, DIFUSCO outperforms the previous state-of-the-art neural solver on the challenging SATLIB benchmark. Our code is available at this url.
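
To make the discrete diffusion concrete: with Bernoulli noise, the forward process independently flips each bit of the solution vector with a schedule-dependent probability, and a denoiser is trained to reverse the corruption. A toy caricature, not DIFUSCO's graph-based model:

```python
# Bernoulli forward process on a {0,1} solution vector: every entry is flipped
# with probability beta at each step, progressively destroying the solution.
import numpy as np

rng = np.random.default_rng(0)

def bernoulli_noise(x: np.ndarray, beta: float) -> np.ndarray:
    """Flip each entry of a binary vector independently with probability beta."""
    flips = rng.random(x.shape) < beta
    return np.where(flips, 1 - x, x)

x0 = rng.integers(0, 2, size=12)            # e.g. node indicators of a candidate MIS
xt = x0.copy()
for beta in np.linspace(0.02, 0.3, num=10): # increasing corruption schedule
    xt = bernoulli_noise(xt, beta)
print(x0, xt, sep="\n")
```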

IROS Conference 2023 Conference Paper

Generalized Robot Dynamics Learning and Gen2Real Transfer

  • Dengpeng Xing
  • Yiming Yang
  • Zechang Wang
  • Jiale Li
  • Bo Xu 0002

Acquiring dynamics is critical for robot learning and is fundamental to planning and control. This paper concerns two fundamental questions: How can we learn a model that covers massive, diverse robot dynamics? Can we construct a model that lifts the data-collection pain and domain expertise required for building specific robot models? We learn the dynamics involved in a dataset containing a large number of serial articulated robots and propose a new concept, “Gen2Real”, to transfer simulated, generalized models to physical, specific robots. We generate a large-scale dataset by randomizing dynamics parameters, topology configurations, and model dimensions, which, in sequence, correspond to different properties, connections, and numbers of robot links. A structure modified from the generative pre-trained transformer is applied to approximate the dynamics of massive heterogeneous robots. In Gen2Real, we transfer the pre-trained model to a target robot using distillation, for the sake of real-time computation. The results demonstrate the superiority of the proposed method in terms of its accuracy in learning a tremendous amount of robot dynamics and its generality to transfer to different robots.

NeurIPS Conference 2023 Conference Paper

Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision

  • Zhiqing Sun
  • Yikang Shen
  • Qinhong Zhou
  • Hongxin Zhang
  • Zhenfang Chen
  • David Cox
  • Yiming Yang
  • Chuang Gan

Recent AI-assistant agents, such as ChatGPT, predominantly rely on supervised fine-tuning (SFT) with human annotations and reinforcement learning from human feedback (RLHF) to align the output of large language models (LLMs) with human intentions, ensuring they are helpful, ethical, and reliable. However, this dependence can significantly constrain the true potential of AI-assistant agents due to the high cost of obtaining human supervision and the related issues of quality, reliability, diversity, self-consistency, and undesirable biases. To address these challenges, we propose a novel approach called SELF-ALIGN, which combines principle-driven reasoning and the generative power of LLMs for the self-alignment of AI agents with minimal human supervision. Our approach encompasses four stages: first, we use an LLM to generate synthetic prompts, and a topic-guided method to augment the prompt diversity; second, we use a small set of human-written principles for AI models to follow, and guide the LLM through in-context learning from demonstrations (of principles application) to produce helpful, ethical, and reliable responses to user's queries; third, we fine-tune the original LLM with the high-quality self-aligned responses so that the resulting model can generate desirable responses for each query directly without the principle set and the demonstrations anymore; and finally, we offer a refinement step to address the issues of overly-brief or indirect responses. Applying SELF-ALIGN to the LLaMA-65b base language model, we develop an AI assistant named Dromedary. With fewer than 300 lines of human annotations (including < 200 seed prompts, 16 generic principles, and 5 exemplars for in-context learning), Dromedary significantly surpasses the performance of several state-of-the-art AI systems, including Text-Davinci-003 and Alpaca, on benchmark datasets with various settings.

NeurIPS Conference 2023 Conference Paper

Self-Refine: Iterative Refinement with Self-Feedback

  • Aman Madaan
  • Niket Tandon
  • Prakhar Gupta
  • Skyler Hallinan
  • Luyu Gao
  • Sarah Wiegreffe
  • Uri Alon
  • Nouha Dziri

Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement. The main idea is to generate an initial output using an LLM; then, the same LLM provides *feedback* for its output and uses it to *refine* itself, iteratively. Self-Refine does not require any supervised training data, additional training, or reinforcement learning, and instead uses a single LLM as the generator, refiner, and feedback provider. We evaluate Self-Refine across 7 diverse tasks, ranging from dialog response generation to mathematical reasoning, using state-of-the-art LLMs (GPT-3.5, ChatGPT, and GPT-4). Across all evaluated tasks, outputs generated with Self-Refine are preferred by humans and automatic metrics over those generated with the same LLM using conventional one-step generation, improving by ~20% absolute on average in task performance. Our work demonstrates that even state-of-the-art LLMs like GPT-4 can be further improved at test-time using our simple, standalone approach.
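
The loop is simple enough to state directly. A hedged sketch, with placeholder prompts and a placeholder stop test (Self-Refine's actual prompts are task-specific):

```python
# One model plays generator, feedback provider, and refiner in turn.

def self_refine(task: str, llm, max_iters: int = 4) -> str:
    output = llm(f"Task: {task}\nAnswer:")
    for _ in range(max_iters):
        feedback = llm(f"Task: {task}\nAnswer: {output}\nGive actionable feedback:")
        if "no further changes" in feedback.lower():    # illustrative stop condition
            break
        output = llm(f"Task: {task}\nAnswer: {output}\nFeedback: {feedback}\nRevised answer:")
    return output

# Canned responses so the sketch runs without a real model.
canned = iter(["v1", "make it shorter", "v2", "no further changes"])
print(self_refine("write a haiku", lambda prompt: next(canned)))   # prints "v2"
```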

NeurIPS Conference 2022 Conference Paper

DIMES: A Differentiable Meta Solver for Combinatorial Optimization Problems

  • Ruizhong Qiu
  • Zhiqing Sun
  • Yiming Yang

Recently, deep reinforcement learning (DRL) models have shown promising results in solving NP-hard Combinatorial Optimization (CO) problems. However, most DRL solvers can only scale to a few hundred nodes for combinatorial optimization problems on graphs, such as the Traveling Salesman Problem (TSP). This paper addresses the scalability challenge in large-scale combinatorial optimization by proposing a novel approach, namely, DIMES. Unlike previous DRL methods which suffer from costly autoregressive decoding or iterative refinements of discrete solutions, DIMES introduces a compact continuous space for parameterizing the underlying distribution of candidate solutions. Such a continuous space allows stable REINFORCE-based training and fine-tuning via massively parallel sampling. We further propose a meta-learning framework to enable the effective initialization of model parameters in the fine-tuning stage. Extensive experiments show that DIMES outperforms recent DRL-based methods on large benchmark datasets for Traveling Salesman Problems and Maximal Independent Set problems.

AAAI Conference 2022 Conference Paper

JAKET: Joint Pre-training of Knowledge Graph and Language Understanding

  • Donghan Yu
  • Chenguang Zhu
  • Yiming Yang
  • Michael Zeng

Knowledge graphs (KGs) contain rich information about world knowledge, entities, and relations. Thus, they can be great supplements to existing pre-trained language models. However, it remains a challenge to efficiently integrate information from KGs into language modeling. Moreover, understanding a knowledge graph requires related context. We propose a novel joint pre-training framework, JAKET, to model both the knowledge graph and language. The knowledge module and language module provide essential information to mutually assist each other: the knowledge module produces embeddings for entities in text while the language module generates context-aware initial embeddings for entities and relations in the graph. Our design enables the pre-trained model to easily adapt to unseen knowledge graphs in new domains. Experiment results on several knowledge-aware NLP tasks show that our proposed framework achieves superior performance by effectively leveraging knowledge in language understanding.

NeurIPS Conference 2020 Conference Paper

Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing

  • Zihang Dai
  • Guokun Lai
  • Yiming Yang
  • Quoc Le

With the success of language pretraining, it is highly desirable to develop more efficient architectures of good scalability that can exploit the abundant unlabeled data at a lower cost. To improve the efficiency, we examine the much-overlooked redundancy in maintaining a full-length token-level representation, especially for tasks that only require a single-vector representation of the sequence. With this intuition, we propose Funnel-Transformer which gradually compresses the sequence of hidden states to a shorter one and hence reduces the computation cost. More importantly, by re-investing the saved FLOPs from length reduction in constructing a deeper or wider model, we further improve the model capacity. In addition, to perform token-level predictions as required by common pretraining objectives, Funnel-Transformer is able to recover a deep representation for each token from the reduced hidden sequence via a decoder. Empirically, with comparable or fewer FLOPs, Funnel-Transformer outperforms the standard Transformer on a wide variety of sequence-level prediction tasks, including text classification, language understanding, and reading comprehension.
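
The compression step can be illustrated with stride-2 mean pooling, one of the pooling choices discussed in the paper (which applies pooling to the attention query; the standalone version below is a simplification):

```python
# Halve the hidden-state sequence between blocks so later layers run on
# shorter sequences, saving FLOPs.
import torch

def pool_hidden(h: torch.Tensor) -> torch.Tensor:
    """(batch, seq, dim) -> (batch, seq//2, dim) by stride-2 mean pooling."""
    b, s, d = h.shape
    return h[:, : s // 2 * 2].reshape(b, s // 2, 2, d).mean(dim=2)

h = torch.randn(2, 128, 768)
print(pool_hidden(h).shape)   # torch.Size([2, 64, 768])
```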

NeurIPS Conference 2019 Conference Paper

Re-examination of the Role of Latent Variables in Sequence Modeling

  • Guokun Lai
  • Zihang Dai
  • Yiming Yang
  • Shinjae Yoo

With latent variables, stochastic recurrent models have achieved state-of-the-art performance in modeling sound-wave sequences. However, opposite results are also observed in other domains, where standard recurrent networks often outperform stochastic models. To better understand this discrepancy, we re-examine the roles of latent variables in stochastic recurrent models for speech density estimation. Our analysis reveals that under the restriction of fully factorized output distribution in previous evaluations, the stochastic variants were implicitly leveraging intra-step correlation but the deterministic recurrent baselines were prohibited from doing so, resulting in an unfair comparison. To correct the unfairness, we remove such restriction in our re-examination, where all the models can explicitly leverage intra-step correlation with an auto-regressive structure. Over a diverse set of univariate and multivariate sequential data, including human speech, MIDI music, handwriting trajectory, and frame-permuted speech, our results show that stochastic recurrent models fail to deliver the performance advantage claimed in previous work. In contrast, standard recurrent models equipped with an auto-regressive output distribution consistently perform better, dramatically advancing the state-of-the-art results on three speech datasets.

AAAI Conference 2019 Conference Paper

Switch-Based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning

  • Yuexin Wu
  • Xiujun Li
  • Jingjing Liu
  • Jianfeng Gao
  • Yiming Yang

Training task-completion dialogue agents with reinforcement learning usually requires a large number of real user experiences. The Dyna-Q algorithm extends Q-learning by integrating a world model, and thus can effectively boost training efficiency using simulated experiences generated by the world model. The effectiveness of Dyna-Q, however, depends on the quality of the world model - or implicitly, the pre-specified ratio of real vs. simulated experiences used for Q-learning. To this end, we extend the recently proposed Deep Dyna-Q (DDQ) framework by integrating a switcher that automatically determines whether to use a real or simulated experience for Q-learning. Furthermore, we explore the use of active learning for improving sample efficiency, by encouraging the world model to generate simulated experiences in the state-action space where the agent has not (fully) explored. Our results show that by combining switcher and active learning, the new framework, named Switch-based Active Deep Dyna-Q (Switch-DDQ), leads to significant improvement over DDQ and Q-learning baselines in both simulation and human evaluations.

NeurIPS Conference 2019 Conference Paper

XLNet: Generalized Autoregressive Pretraining for Language Understanding

  • Zhilin Yang
  • Zihang Dai
  • Yiming Yang
  • Jaime Carbonell
  • Russ Salakhutdinov
  • Quoc Le

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, under comparable experimental settings, XLNet outperforms BERT on 20 tasks, often by a large margin, including question answering, natural language inference, sentiment analysis, and document ranking.
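
For reference, the permutation language modeling objective from the paper: maximize the expected log-likelihood over factorization orders z of the sequence, so every token learns from bidirectional context while the model stays autoregressive:

```latex
% XLNet's pretraining objective: expectation over sampled permutations z of
% the indices 1..T, with each token predicted from its predecessors under z.
\max_{\theta}\; \mathbb{E}_{z \sim \mathcal{Z}_T}
\left[ \sum_{t=1}^{T} \log p_{\theta}\!\left(x_{z_t} \mid x_{z_{<t}}\right) \right]
```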

IROS Conference 2018 Conference Paper

Automatic Parameter Tuning of Motion Planning Algorithms

  • José Cano 0001
  • Yiming Yang
  • Bruno Bodin
  • Vijay Nagarajan
  • Michael F. P. O'Boyle

Motion planning algorithms attempt to find a good compromise between planning time and quality of solution. Due to their heuristic nature, they are typically configured with several parameters. In this paper we demonstrate that, in many scenarios, the widely used default parameter values are not ideal. However, finding the best parameters to optimise some metric(s) is not trivial because the size of the parameter space can be large. We evaluate and compare the efficiency of four different methods (i.e., random sampling, AUC-Bandit, random forest, and Bayesian optimisation) to tune the parameters of two motion planning algorithms, BKPIECE and RRT-Connect. We present a table-top-reaching scenario where the seven-degree-of-freedom KUKA LWR robotic arm has to move from an initial to a goal pose in the presence of several objects in the environment. We show that the best methods for BKPIECE (AUC-Bandit) and RRT-Connect (random forest) improve the performance by 4.5x and 1.26x on average, respectively. Then, we generate a set of random scenarios of increasing complexity, and we observe that optimal parameters found in simple environments perform well in more complex scenarios. Finally, we find that the time required to evaluate parameter configurations can be reduced by more than 2/3 with low error. Overall, our results demonstrate that for a variety of motion planning problems it is possible to find solutions that significantly improve the performance over default configurations while requiring very reasonable computation times.
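
The simplest of the four tuners, random sampling, looks like the sketch below. The parameter names follow OMPL's BKPIECE, but the ranges and the evaluation function are placeholders:

```python
# Random-sampling tuner: draw configurations from the parameter space,
# evaluate each on the planning scenario, keep the best.
import random

space = {
    "range": (0.0, 10.0),                 # continuous parameters: (low, high)
    "border_fraction": (0.0, 1.0),
    "failed_expansion_score_factor": (0.0, 1.0),
}

def random_search(evaluate, n_trials: int = 50, seed: int = 1):
    rng = random.Random(seed)
    best_cfg, best_cost = None, float("inf")
    for _ in range(n_trials):
        cfg = {k: rng.uniform(lo, hi) for k, (lo, hi) in space.items()}
        cost = evaluate(cfg)              # e.g. planning time on the scenario
        if cost < best_cost:
            best_cfg, best_cost = cfg, cost
    return best_cfg, best_cost

# Stand-in objective for demonstration.
cfg, cost = random_search(lambda c: abs(c["range"] - 5) + c["border_fraction"])
```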

IROS Conference 2018 Conference Paper

Dual-Arm Coordinated Motion Planning and Compliance Control for Capturing Moving Objects with Large Momentum

  • Lei Yan 0011
  • Yiming Yang
  • Wenfu Xu
  • Sethu Vijayakumar

Capturing a moving object with large momentum by a dual-arm robot is especially challenging because of the requirement of dual-arm coordinated motion planning for tracking the moving object, and the operational force control for contact and momentum transfer. In this paper, we present a dual-arm coordinated motion planning and compliance control method with a unique null-space projected relative Jacobian and relative operational force between the two arms. The proposed method is able to plan dual-arm capturing motion and control the capturing force without disturbing the tracking motion. We have also adopted a direct collocation trajectory optimization method to generate optimal trajectory to decrease the object's momentum with minimum effort. Simulation and experiment of dual-arm robots picking up a moving box on a mobile platform are carried out to verify the proposed method.

AAAI Conference 2017 Conference Paper

Cross-Domain Kernel Induction for Transfer Learning

  • Wei-Cheng Chang
  • Yuexin Wu
  • Hanxiao Liu
  • Yiming Yang

The key question in transfer learning (TL) research is how to make model induction transferable across different domains. Common methods so far require source and target domains to have a shared/homogeneous feature space, or the projection of features from heterogeneous domains onto a shared space. This paper proposes a novel framework, which does not require a shared feature space but instead uses a parallel corpus to calibrate domain-specific kernels into a unified kernel, to leverage graph-based label propagation in cross-domain settings, and to optimize semi-supervised learning based on labeled and unlabeled data in both source and target domains. Our experiments on benchmark datasets show advantageous performance of the proposed method over that of other state-of-the-art TL methods.

IJCAI Conference 2017 Conference Paper

Data-driven Random Fourier Features using Stein Effect

  • Wei-Cheng Chang
  • Chun-Liang Li
  • Yiming Yang
  • Barnabás Póczos

Large-scale kernel approximation is an important problem in machine learning research. Approaches using random Fourier features have become increasingly popular (Rahimi & Recht, 2007), where kernel approximation is treated as empirical mean estimation via Monte Carlo (MC) or Quasi-Monte Carlo (QMC) integration (Yang et al., 2014). A limitation of the current approaches is that all the features receive an equal weight summing to 1. In this paper, we propose a novel shrinkage estimator based on the "Stein effect", which provides a data-driven weighting strategy for random features and enjoys theoretical justifications in terms of lowering the empirical risk. We further present an efficient randomized algorithm for large-scale applications of the proposed method. Our empirical results on six benchmark data sets demonstrate the advantageous performance of this approach over representative baselines in both kernel approximation and supervised learning tasks.
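
The random Fourier feature baseline the paper builds on is compact enough to show in full: z(x) = sqrt(2/D) cos(Wx + b) gives k(x, y) approximately equal to z(x) . z(y) for an RBF kernel. The paper's shrinkage reweighting of the D features is not shown; this is the uniform-weight baseline:

```python
# Random Fourier features for k(x, y) = exp(-gamma * ||x - y||^2):
# rows of W are drawn from the kernel's spectral density N(0, 2*gamma*I).
import numpy as np

rng = np.random.default_rng(0)
d, D, gamma = 5, 2000, 0.5                       # input dim, feature count, RBF width

W = rng.normal(scale=np.sqrt(2 * gamma), size=(D, d))
b = rng.uniform(0, 2 * np.pi, size=D)

def z(x: np.ndarray) -> np.ndarray:
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.normal(size=d), rng.normal(size=d)
exact = np.exp(-gamma * np.sum((x - y) ** 2))    # true kernel value
approx = z(x) @ z(y)                             # Monte Carlo estimate
print(exact, approx)                             # close for large D
```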

NeurIPS Conference 2017 Conference Paper

MMD GAN: Towards Deeper Understanding of Moment Matching Network

  • Chun-Liang Li
  • Wei-Cheng Chang
  • Yu Cheng
  • Yiming Yang
  • Barnabas Poczos

Generative moment matching network (GMMN) is a deep generative model that differs from Generative Adversarial Network (GAN) by replacing the discriminator in GAN with a two-sample test based on kernel maximum mean discrepancy (MMD). Although some theoretical guarantees of MMD have been studied, the empirical performance of GMMN is still not as competitive as that of GAN on challenging and large benchmark datasets. The computational efficiency of GMMN is also less desirable in comparison with GAN, partially due to its requirement for a rather large batch size during the training. In this paper, we propose to improve both the model expressiveness of GMMN and its computational efficiency by introducing adversarial kernel learning techniques, as the replacement of a fixed Gaussian kernel in the original GMMN. The new approach combines the key ideas in both GMMN and GAN, hence we name it MMD-GAN. The new distance measure in MMD-GAN is a meaningful loss that enjoys the advantage of weak* topology and can be optimized via gradient descent with relatively small batch sizes. In our evaluation on multiple benchmark datasets, including MNIST, CIFAR-10, CelebA and LSUN, MMD-GAN significantly outperforms GMMN, and is competitive with other representative GAN works.
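
The statistic at the heart of GMMN and MMD-GAN, a kernel two-sample test, can be computed in a few lines. Below is the biased V-statistic estimate of squared MMD under a fixed Gaussian kernel; MMD-GAN's contribution is to learn the kernel adversarially rather than fix it:

```python
# Squared MMD between samples X and Y: E[k(x,x')] - 2 E[k(x,y)] + E[k(y,y')].
import numpy as np

def gaussian_kernel(A: np.ndarray, B: np.ndarray, sigma: float = 1.0) -> np.ndarray:
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-sq / (2 * sigma**2))

def mmd2(X: np.ndarray, Y: np.ndarray, sigma: float = 1.0) -> float:
    return (gaussian_kernel(X, X, sigma).mean()
            - 2 * gaussian_kernel(X, Y, sigma).mean()
            + gaussian_kernel(Y, Y, sigma).mean())

rng = np.random.default_rng(0)
same = mmd2(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
diff = mmd2(rng.normal(size=(200, 2)), rng.normal(3.0, 1.0, size=(200, 2)))
print(same, diff)   # diff should be much larger than same
```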

NeurIPS Conference 2016 Conference Paper

Adaptive Smoothed Online Multi-Task Learning

  • Keerthiram Murugesan
  • Hanxiao Liu
  • Jaime Carbonell
  • Yiming Yang

This paper addresses the challenge of jointly learning both the per-task model parameters and the inter-task relationships in a multi-task online learning setting. The proposed algorithm features probabilistic interpretation, efficient updating rules and flexible modulation of whether learners focus on their specific task or jointly address all tasks. The paper also proves a sub-linear regret bound relative to the best linear predictor in hindsight. Experiments over three multi-task learning benchmark datasets show advantageous performance of the proposed approach over several state-of-the-art online multi-task learning baselines.

JAIR Journal 2016 Journal Article

Learning Concept Graphs from Online Educational Data

  • Hanxiao Liu
  • Wanli Ma
  • Yiming Yang
  • Jaime Carbonell

This paper addresses an open challenge in educational data mining, i.e., the problem of automatically mapping online courses from different providers (universities, MOOCs, etc.) onto a universal space of concepts, and predicting latent prerequisite dependencies (directed links) among both concepts and courses. We propose a novel approach for inference within and across course-level and concept-level directed graphs. In the training phase, our system projects partially observed course-level prerequisite links onto directed concept-level links; in the testing phase, the induced concept-level links are used to infer the unknown course-level prerequisite links. Whereas courses may be specific to one institution, concepts are shared across different providers. The bi-directional mappings enable our system to perform interlingua-style transfer learning, e.g. treating the concept graph as the interlingua and transferring the prerequisite relations across universities via the interlingua. Experiments on our newly collected datasets of courses from MIT, Caltech, Princeton and CMU show promising results.

NeurIPS Conference 2012 Conference Paper

Bayesian models for Large-scale Hierarchical Classification

  • Siddharth Gopal
  • Yiming Yang
  • Bing Bai
  • Alexandru Niculescu-Mizil

A challenging problem in hierarchical classification is to leverage the hierarchical relations among classes for improving classification performance. An even greater challenge is to do so in a manner that is computationally feasible for the large scale problems usually encountered in practice. This paper proposes a set of Bayesian methods to model hierarchical dependencies among class labels using multivariate logistic regression. Specifically, the parent-child relationships are modeled by placing a hierarchical prior over the children nodes centered around the parameters of their parents; thereby encouraging classes nearby in the hierarchy to share similar model parameters. We present new, efficient variational algorithms for tractable posterior inference in these models, and provide a parallel implementation that can comfortably handle large-scale problems with hundreds of thousands of dimensions and tens of thousands of classes. We run a comparative evaluation on multiple large-scale benchmark datasets that highlights the scalability of our approach, and shows a significant performance advantage over the other state-of-the-art hierarchical methods.
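
The parent-child prior described here can be written out explicitly; the notation below is generic rather than lifted from the paper:

```latex
% Each class node's parameter vector is drawn around its parent's, so
% siblings share statistical strength; classification is multivariate
% logistic regression with these hierarchically tied weights.
w_{\text{root}} \sim \mathcal{N}(0,\, \sigma_0^2 I), \qquad
w_c \mid w_{\mathrm{pa}(c)} \sim \mathcal{N}\!\bigl(w_{\mathrm{pa}(c)},\, \sigma_c^2 I\bigr), \qquad
p(y = c \mid x) \propto \exp\bigl(w_c^\top x\bigr)
```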

IS Journal 2007 Journal Article

Cost-Sensitive-Data Preprocessing for Mining Customer Relationship Management Databases

  • Junfeng Pan
  • Qiang Yang
  • Yiming Yang
  • Lei Li
  • Frances Li
  • George Li

A staged framework for data preprocessing has been developed to support data mining and help service providers identify customers who might switch to a competitor. The framework pushes the cost sensitivity and data imbalance of customer retention data into the data preprocessing itself. Tests using a data set from the ACM KDD Cup 1998 showed that the framework outperformed the winner of that data mining and knowledge discovery competition. The framework has also been incorporated into a software system, called ED-Money. To demonstrate the framework's ability to predict customer attrition with high accuracy, it was applied to some benchmark data and to a real customer attrition data set from a large Chinese mobile telecommunications company.

NeurIPS Conference 2005 Conference Paper

From Lasso regression to Feature vector machine

  • Fan Li
  • Yiming Yang
  • Eric Xing

Lasso regression tends to assign zero weights to most irrelevant or redundant features, and hence is a promising technique for feature selection. Its limitation, however, is that it only offers solutions to linear models. Kernel machines with feature scaling techniques have been studied for feature selection with non-linear models. However, such approaches require solving hard non-convex optimization problems. This paper proposes a new approach named the Feature Vector Machine (FVM). It reformulates the standard Lasso regression into a form isomorphic to SVM, and this form can be easily extended for feature selection with non-linear models by introducing kernels defined on feature vectors. FVM generates sparse solutions in the nonlinear feature space and it is much more tractable compared to feature scaling kernel machines. Our experiments with FVM on simulated data show encouraging results in identifying the small number of dominating features that are non-linearly correlated to the response, a task the standard Lasso fails to complete.
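
The premise of the first sentence is quick to demonstrate with plain Lasso (FVM's kernelized extension is not shown):

```python
# The L1 penalty drives weights on irrelevant features to exactly zero,
# so the nonzero coefficients act as a feature selector.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
y = 3 * X[:, 0] - 2 * X[:, 5] + 0.1 * rng.normal(size=200)   # only features 0 and 5 matter

model = Lasso(alpha=0.1).fit(X, y)
selected = np.flatnonzero(model.coef_)
print(selected)   # typically [0, 5]; the rest are driven to exactly zero
```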

NeurIPS Conference 2005 Conference Paper

Learning Multiple Related Tasks using Latent Independent Component Analysis

  • Jian Zhang
  • Zoubin Ghahramani
  • Yiming Yang

We propose a probabilistic model based on Independent Component Analysis for learning multiple related tasks. In our model the task parameters are assumed to be generated from independent sources which account for the relatedness of the tasks. We use Laplace distributions to model hidden sources which makes it possible to identify the hidden, independent components instead of just modeling correlations. Furthermore, our model enjoys a sparsity property which makes it both parsimonious and robust. We also propose efficient algorithms for both empirical Bayes method and point estimation. Our experimental results on two multi-label text classification data sets show that the proposed approach is promising.

NeurIPS Conference 2004 Conference Paper

A Probabilistic Model for Online Document Clustering with Application to Novelty Detection

  • Jian Zhang
  • Zoubin Ghahramani
  • Yiming Yang

In this paper we propose a probabilistic model for online document clustering. We use a non-parametric Dirichlet process prior to model the growing number of clusters, and use a general English language model as the base distribution to handle the generation of novel clusters. Furthermore, cluster uncertainty is modeled with a Bayesian Dirichlet-multinomial distribution. We use an empirical Bayes method to estimate hyperparameters based on a historical dataset. Our probabilistic model is applied to the novelty detection task in Topic Detection and Tracking (TDT) and compared with existing approaches in the literature.

JMLR Journal 2004 Journal Article

RCV1: A New Benchmark Collection for Text Categorization Research

  • David D. Lewis
  • Yiming Yang
  • Tony G. Rose
  • Fan Li

Reuters Corpus Volume I (RCV1) is an archive of over 800,000 manually categorized newswire stories recently made available by Reuters, Ltd. for research purposes. Use of this data for research on text categorization requires a detailed understanding of the real world constraints under which the data was produced. Drawing on interviews with Reuters personnel and access to Reuters documentation, we describe the coding policy and quality control procedures used in producing the RCV1 data, the intended semantics of the hierarchical category taxonomies, and the corrections necessary to remove errorful data. We refer to the original data as RCV1-v1, and the corrected data as RCV1-v2. We benchmark several widely used supervised learning methods on RCV1-v2, illustrating the collection's properties, suggesting new directions for research, and providing baseline results for future studies. We make available detailed, per-category experimental results, as well as corrected versions of the category assignments and taxonomy structures, via online appendices.