Arrow Research search

Author name cluster

Wei Ding

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

25 papers
2 author rows

Possible papers


TMLR Journal 2026 Journal Article

Contextual Learning for Anomaly Detection in Tabular Data

  • Spencer King
  • Zhilu Zhang
  • Ruofan Yu
  • Baris Coskun
  • Wei Ding
  • Qian Cui

Anomaly detection is critical in domains such as cybersecurity and finance, especially when working with large-scale tabular data. Yet unsupervised anomaly detection, where no labeled anomalies are available, remains challenging because traditional deep learning methods model a single global distribution, assuming all samples follow the same behavior. In contrast, real-world data often contain heterogeneous contexts (e.g., different users, accounts, or devices), where globally rare events may be normal within specific conditions. We introduce a contextual learning framework that explicitly models how normal behavior varies across contexts by learning conditional data distributions $P(\mathbf{Y} \mid \mathbf{C})$ rather than a global joint distribution $P(\mathbf{X})$. The framework encompasses (1) a probabilistic formulation for context-conditioned learning, (2) a principled bilevel optimization strategy for automatically selecting informative context features using early validation loss, and (3) theoretical grounding through variance decomposition and discriminative learning principles. We instantiate this framework using a novel conditional Wasserstein autoencoder as a simple yet effective model for tabular anomaly detection. Extensive experiments across eight benchmark datasets demonstrate that contextual learning consistently outperforms global approaches, even when the optimal context is not intuitively obvious, establishing a new foundation for anomaly detection in heterogeneous tabular data.
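The contrast between a global distribution and context-conditioned distributions can be illustrated with a toy sketch. This is not the paper's conditional Wasserstein autoencoder: it swaps in simple per-group z-scores, and the function names and the toy data are assumptions made only for illustration.

```python
from collections import defaultdict
from statistics import mean, pstdev


def global_scores(values):
    """Absolute z-score of each value under a single global distribution."""
    mu, sigma = mean(values), pstdev(values) or 1.0  # guard against zero spread
    return [abs(v - mu) / sigma for v in values]


def contextual_scores(contexts, values):
    """Absolute z-score of each value within its own context group."""
    groups = defaultdict(list)
    for c, v in zip(contexts, values):
        groups[c].append(v)
    # Per-context mean and standard deviation (hypothetical stand-in for P(Y | C)).
    stats = {c: (mean(vs), pstdev(vs) or 1.0) for c, vs in groups.items()}
    return [abs(v - stats[c][0]) / stats[c][1] for c, v in zip(contexts, values)]


# Hypothetical toy data: account "a" usually makes small payments, account "b" large ones.
contexts = ["a"] * 5 + ["b"] * 5
values = [10, 11, 9, 10, 60, 100, 101, 99, 100, 102]

g = global_scores(values)
c = contextual_scores(contexts, values)
print("most anomalous index (global):    ", g.index(max(g)))
print("most anomalous index (contextual):", c.index(max(c)))
```

In this toy data the payment of 60 lies almost exactly at the global mean, so a global score treats it as the most typical value, while the per-account score flags it as the largest deviation from account "a"'s usual behavior.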

ICLR Conference 2025 Conference Paper

AnoLLM: Large Language Models for Tabular Anomaly Detection

  • Che-Ping Tsai
  • Ganyu Teng
  • Phillip Wallis
  • Wei Ding

We introduce AnoLLM, a novel framework that leverages large language models (LLMs) for unsupervised tabular anomaly detection. By converting tabular data into a standardized text format, we further adapt a pre-trained LLM with this serialized data, and assign anomaly scores based on the negative log likelihood generated by the LLM. Unlike traditional methods that can require extensive feature engineering, and often lose textual information during data processing, AnoLLM preserves data integrity and streamlines the preprocessing required for tabular anomaly detection. This approach can effectively handle mixed-type data, especially those containing textual features. Our empirical results indicate that AnoLLM delivers the best performance on six benchmark datasets with mixed feature types. Additionally, across 30 datasets from the ODDS library, which are predominantly numerical, AnoLLM performs on par with top performing baselines.
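The two steps named in the abstract, serializing a row to text and scoring it by negative log-likelihood, can be sketched as follows. The "<column> is <value>" template and both function names are assumptions for illustration, not AnoLLM's actual format or API, and the per-token probabilities are supplied by hand here rather than produced by a language model.

```python
import math


def serialize_row(row):
    """Turn one table row (column -> value) into a single line of text.

    The "<column> is <value>" template is a hypothetical choice for this sketch.
    """
    return ", ".join(f"{col} is {val}" for col, val in row.items())


def anomaly_score(token_probs):
    """Negative log-likelihood of a serialized row, given per-token probabilities."""
    return -sum(math.log(p) for p in token_probs)


text = serialize_row({"age": 42, "job": "nurse"})
print(text)

# Rows whose tokens the model finds unlikely receive higher scores.
print(anomaly_score([0.9, 0.9]) < anomaly_score([0.1, 0.9]))
```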

NeurIPS Conference 2025 Conference Paper

Dual Alignment Framework for Few-shot Learning with Inter-Set and Intra-Set Shifts

  • Siyang Jiang
  • Rui Fang
  • Hsi-Wen Chen
  • Wei Ding
  • Guoliang Xing
  • Ming-Syan Chen

Few-shot learning (FSL) aims to classify unseen examples (query set) into labeled data (support set) through low-dimensional embeddings. However, the diversity and unpredictability of environments and capture devices make FSL more challenging in real-world applications. In this paper, we propose Dual Support Query Shift (DSQS), a novel challenge in FSL that integrates two key issues: inter-set shifts (between support and query sets) and intra-set shifts (within each set), which significantly hinder model performance. To tackle these challenges, we introduce a Dual Alignment framework (DUAL), whose core insight is that clean features can improve optimal transportation (OT) alignment. Firstly, DUAL leverages a robust embedding function enhanced by a repairer network trained with perturbed and adversarially generated “hard” examples to obtain clean features. Additionally, it incorporates a two-stage OT approach with a negative entropy regularizer, which aligns support set instances, minimizes intra-class distances, and uses query data as anchor nodes to achieve effective distribution alignment. We provide a theoretical bound of DUAL and experimental results on three image datasets, compared against 10 state-of-the-art baselines, showing that DUAL achieves a remarkable average performance improvement of 25.66%. Our code is available at https://github.com/siyang-jiang/DUAL.

TIST Journal 2025 Journal Article

Horizon Forcing: Improving the Recurrent Forecasting of Chaotic Systems

  • Yong Zhuang
  • Matthew Almeida
  • Wei Ding
  • Shafiqul Islam
  • Zihan Li
  • Ping Chen

Chaotic dynamics are ubiquitous in many real-world systems, ranging from biological and industrial processes to climate dynamics and the spread of viruses. These systems are characterized by high sensitivity to initial conditions, making it challenging to predict their future behavior confidently. In this study, we propose a novel deep-learning framework that addresses this challenge by directly exploiting the long-term compounding of local prediction errors during model training, aiming to extend the time horizon for reliable predictions of chaotic systems. Our approach observes the future trajectories of initial errors at a time horizon, modeling the evolution of the loss to that point through the use of two major components: (1) a recurrent architecture (Error Trajectory Tracing) designed to trace the trajectories of predictive errors through phase space, and (2) a training regime, Horizon Forcing, that pushes the model’s focus out to a predetermined time horizon. We validate our method on three classic chaotic systems and six real-world time series prediction tasks with chaotic characteristics. The results show that our approach outperforms the state-of-the-art methods.

AAAI Conference 2023 Conference Paper

Generalized Category Discovery with Decoupled Prototypical Network

  • Wenbin An
  • Feng Tian
  • Qinghua Zheng
  • Wei Ding
  • Qianying Wang
  • Ping Chen

Generalized Category Discovery (GCD) aims to recognize both known and novel categories from a set of unlabeled data, based on another dataset labeled with only known categories. Without considering differences between known and novel categories, current methods learn about them in a coupled manner, which can hurt the model's generalization and discriminative ability. Furthermore, the coupled training approach prevents these models from transferring category-specific knowledge explicitly from labeled data to unlabeled data, which can lose high-level semantic information and impair model performance. To mitigate the above limitations, we present a novel model called Decoupled Prototypical Network (DPN). By formulating a bipartite matching problem for category prototypes, DPN can not only decouple known and novel categories to achieve different training targets effectively, but also align known categories in labeled and unlabeled data to transfer category-specific knowledge explicitly and capture high-level semantics. Furthermore, DPN can learn more discriminative features for both known and novel categories through our proposed Semantic-aware Prototypical Learning (SPL). Besides capturing meaningful semantic information, SPL can also alleviate the noise of hard pseudo labels through semantic-weighted soft assignment. Extensive experiments show that DPN outperforms state-of-the-art models by a large margin on all evaluation metrics across multiple benchmark datasets. Code and data are available at https://github.com/Lackel/DPN.

AAAI Conference 2023 Conference Paper

Incremental Reinforcement Learning with Dual-Adaptive ε-Greedy Exploration

  • Wei Ding
  • Siyang Jiang
  • Hsi-Wen Chen
  • Ming-Syan Chen

Reinforcement learning (RL) has achieved impressive performance in various domains. However, most RL frameworks oversimplify the problem by assuming a fixed-yet-known environment and often generalize poorly to real-world scenarios. In this paper, we address a new challenge with a more realistic setting, Incremental Reinforcement Learning, where the search space of the Markov Decision Process continually expands. While previous methods usually explore unseen transitions inefficiently, especially as the search space grows, we present a new exploration framework named Dual-Adaptive ϵ-greedy Exploration (DAE) to address the challenge of Incremental RL. Specifically, DAE employs a Meta Policy and an Explorer to avoid redundant computation on samples that have already been sufficiently learned. Furthermore, we release a testbed based on a synthetic environment and the Atari benchmark to validate the effectiveness of any exploration algorithm under Incremental RL. Experimental results demonstrate that the proposed framework can efficiently learn the unseen transitions in new environments, leading to a notable performance improvement, averaging more than 80%, over the eight baselines examined.
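For readers unfamiliar with the baseline that DAE builds on, plain ϵ-greedy action selection can be sketched as below. This is the standard rule only, not the Dual-Adaptive variant with its Meta Policy and Explorer; the function name and the example Q-values are assumptions for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: uniform over all actions
    # Exploit: index of the largest estimated action value.
    return max(range(len(q_values)), key=q_values.__getitem__)


# Hypothetical action-value estimates for a state with three actions.
q = [0.1, 0.7, 0.3]
print(epsilon_greedy(q, epsilon=0.0))  # always the greedy action
print(epsilon_greedy(q, epsilon=0.2))  # greedy most of the time, random otherwise
```

With a fixed ϵ the agent explores at the same rate everywhere, which is the inefficiency an adaptive scheme aims to reduce when the search space keeps expanding.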

IS Journal 2023 Journal Article

New User Intent Discovery With Robust Pseudo Label Training and Source Domain Joint Training

  • Wenbin An
  • Feng Tian
  • Ping Chen
  • Qinghua Zheng
  • Wei Ding

Discovering new user intents based on existing intents from constantly incoming unlabeled data is an important task in many intelligent systems deployed in the real world (e.g., dialogue systems). Since data with new intents are completely unlabeled, most current approaches employ clustering methods to generate pseudo labels to train their models. However, due to intent gaps between existing and new intents, pseudo labels generated by these models are noisy, and prior knowledge from existing intents is not fully utilized. To mitigate these issues, we propose a robust pseudo label training and source domain joint-training network to refine the noisy pseudo labels and make full use of prior knowledge. Experimental results on three intent detection datasets show that our model is more effective and robust than state-of-the-art methods. The code and data are released at https://github.com/Lackel/PTJN.

IROS Conference 2023 Conference Paper

Self-Supervised Object Goal Navigation with In-Situ Finetuning

  • So Yeon Min
  • Yao-Hung Hubert Tsai
  • Wei Ding
  • Ali Farhadi
  • Ruslan Salakhutdinov
  • Yonatan Bisk
  • Jian Zhang

A household robot should be able to navigate to target objects without requiring users to first annotate everything in their home. Most current approaches to object navigation do not test on real robots and rely solely on reconstructed scans of houses and their expensively labeled semantic 3D meshes. In this work, our goal is to build an agent that builds self-supervised models of the world via exploration, as a child might; thus we (1) eschew the expense of labeled 3D meshes and (2) enable self-supervised in-situ finetuning in the real world. We identify a strong source of self-supervision (Location Consistency, LocCon) that can train all components of an ObjectNav agent, using unannotated simulated houses. Our key insight is that embodied agents can leverage location consistency as a self-supervision signal by collecting images from different views/angles and applying contrastive learning. We show that our agent can perform competitively in the real world and simulation. Our results also indicate that supervised training with 3D mesh annotations causes models to learn simulation artifacts, which are not transferable to the real world. In contrast, LocCon shows the most robust transfer to the real world among the models we compare, and the real-world performance of all models can be further improved with self-supervised LocCon in-situ training.

IS Journal 2022 Journal Article

Maximizing Fairness in Deep Neural Networks via Mode Connectivity

  • Olga Andreeva
  • Matthew Almeida
  • Wei Ding
  • Scott E. Crouter
  • Ping Chen

With frequent reports of biased outcomes of AI systems, fairness rightfully becomes an active area of current ML research. However, while progress has been made on theoretical analysis and formulation of fairness as constraints on error probabilities, our ability to design and train modern deep learning models that reach the targeted fairness goals in practice is still limited. In this work, we focus on an interesting yet common fairness setting, where multiple samples are collected from each individual, and the goal is to maximally reduce performance disparity among individuals while maintaining overall model performance. To obtain such fair deep learning models, we use mode connectivity combined with multiobjective optimization to select the best model out of an identified feasible set of model weight configurations with similar overall performance but different distributions of performance over individuals. Our method is model-agnostic and effectively bridges fairness theory and practice.

TIST Journal 2019 Journal Article

BAMB

  • Zhaolong Ling
  • Kui Yu
  • Hao Wang
  • Lin Liu
  • Wei Ding
  • Xindong Wu

The discovery of the Markov blanket (MB) for feature selection has attracted much attention in recent years, since the MB of the class attribute is the optimal feature subset for feature selection. However, almost all existing MB discovery algorithms focus on either improving computational efficiency or boosting learning accuracy, instead of both. In this article, we propose a novel MB discovery algorithm for balancing efficiency and accuracy, called BAlanced Markov Blanket (BAMB) discovery. To achieve this goal, given a class attribute of interest, BAMB finds candidate PC (parents and children) and spouses and removes false positives from the candidate MB set in one go. Specifically, once a feature is successfully added to the current PC set, BAMB finds the spouses with regard to this feature, then uses the updated PC and the spouse set to remove false positives from the current MB set. This makes the PC and spouses of the target as small as possible and thus achieves a trade-off between computational efficiency and learning accuracy. In the experiments, we first compare BAMB with 8 state-of-the-art MB discovery algorithms on 7 benchmark Bayesian networks, then we use 10 real-world datasets and compare BAMB with 12 feature selection algorithms, including 8 state-of-the-art MB discovery algorithms and 4 other well-established feature selection methods. On prediction accuracy, BAMB outperforms all 12 feature selection algorithms compared. On computational efficiency, BAMB is close to the IAMB algorithm and much faster than the remaining seven MB discovery algorithms.

AAAI Conference 2018 Conference Paper

A Semantic QA-Based Approach for Text Summarization Evaluation

  • Ping Chen
  • Fei Wu
  • Tong Wang
  • Wei Ding

Many Natural Language Processing and Computational Linguistics applications involve the generation of new texts based on some existing texts, such as summarization, text simplification and machine translation. However, a serious problem has haunted these applications for decades: how to automatically and accurately assess the quality of their output. In this paper, we present some preliminary results on one especially useful and challenging problem in NLP system evaluation: how to pinpoint content differences between two text passages (especially large passages such as articles and books). Our idea is intuitive and very different from existing approaches. We treat one text passage as a small knowledge base, and ask it a large number of questions to exhaustively identify all content points in it. By comparing the correctly answered questions from two text passages, we are able to compare their content precisely. An experiment using the 2007 DUC summarization corpus shows promising results.

IROS Conference 2017 Conference Paper

An end-to-end system for crowdsourced 3D maps for autonomous vehicles: The mapping component

  • Onkar Dabeer
  • Wei Ding
  • Radhika Gowaiker
  • Slawomir K. Grzechnik
  • Mythreya J. Lakshman
  • Sean Lee
  • Gerhard Reitmayr
  • Arunandan Sharma

Autonomous vehicles rely on precise high definition (HD) 3D maps for navigation. This paper presents the mapping component of an end-to-end system for crowdsourcing precise 3D maps with semantically meaningful landmarks such as traffic signs (6 dof pose, shape and size) and traffic lanes (3D splines). The system uses consumer grade parts, and in particular, relies on a single front facing camera and a consumer grade GPS. Using real-time sign and lane triangulation on-device in the vehicle, with offline sign/lane clustering across multiple journeys and offline Bundle Adjustment across multiple journeys in the backend, we construct maps with mean absolute accuracy at sign corners of less than 20 cm from 25 journeys. To the best of our knowledge, this is the first end-to-end HD mapping pipeline in global coordinates in the automotive context using cost effective sensors.

AAAI Conference 2013 Conference Paper

Online Group Feature Selection from Feature Streams

  • Haiguang Li
  • Xindong Wu
  • Zhao Li
  • Wei Ding

Standard feature selection algorithms deal with given candidate feature sets at the individual feature level. When features exhibit certain group structures, it is beneficial to conduct feature selection in a grouped manner. For high-dimensional features, it can be far preferable to generate and process features online, one at a time, rather than wait for all features to be generated before learning begins. In this paper, we discuss a new and interesting problem: online group feature selection from feature streams at both the group and individual feature levels simultaneously. Extensive experiments on both real-world and synthetic datasets demonstrate the superiority of the proposed algorithm.

TIST Journal 2011 Journal Article

Subkilometer crater discovery with boosting and transfer learning

  • Wei Ding
  • Tomasz F. Stepinski
  • Yang Mu
  • Lourenco Bandeira
  • Ricardo Ricardo
  • Youxi Wu
  • Zhenyu Lu
  • Tianyu Cao

Counting craters in remotely sensed images is the only tool that provides relative dating of remote planetary surfaces. Surveying craters requires counting a large number of small subkilometer craters, which calls for highly efficient automatic crater detection. In this article, we present an integrated framework for automatic detection of subkilometer craters with boosting and transfer learning. The framework contains three key components. First, we utilize mathematical morphology to efficiently identify crater candidates, the regions of an image that can potentially contain craters. Only those regions, which occupy relatively small portions of the original image, are subject to further processing. Second, we extract and select image texture features, in combination with supervised boosting ensemble learning algorithms, to accurately classify crater candidates into craters and noncraters. Third, we integrate transfer learning into boosting, to enhance detection performance in the regions where surface morphology differs from what is characterized by the training set. Our framework is evaluated on a large test image of 37,500 × 56,250 m² on Mars, which exhibits a heavily cratered Martian terrain characterized by nonuniform surface morphology. Empirical studies demonstrate that the proposed crater detection framework can achieve an F1 score above 0.85, a significant improvement over the other crater detection algorithms.

ICRA Conference 2009 Conference Paper

Formations on two-layer pursuit systems

  • Wei Ding
  • Gangfeng Yan
  • Zhiyun Lin

The paper studies hierarchical pursuit strategies for groups of mobile agents in the plane. It is shown that fascinating global patterns emerge from simple two-layer pursuit schemes, including rendezvous, uniform circular motion, complex circular motion, concentric circular motion, and concentric logarithmic spiral motion. Both rigorous analysis and simulations are provided.

IROS Conference 2009 Conference Paper

Leader-following formation control based on pursuit strategies

  • Wei Ding
  • Gangfeng Yan
  • Zhiyun Lin
  • Ying Lan

The paper studies formation control of multi-agent systems under a directed acyclic graph. In a directed acyclic graph, the agents without neighbors are leaders and the others are followers. Leaders move in a formation with a time-varying velocity, and followers can access the relative positions of their neighbors and the leaders' velocity. A local formation control law based on pursuit strategies is proposed, and necessary and sufficient conditions for stability and convergence are derived. Moreover, the results are extended to the case with arbitrary communication delays, for which the steady-state formation is presented according to both the control parameters and time delays.

IROS Conference 2007 Conference Paper

Biomimetic control of pan-tilt-zoom camera for visual tracking based-on an autonomous helicopter

  • Shaorong Xie
  • Jun Luo
  • Zhenbang Gong
  • Wei Ding
  • Hairong Zou
  • Xiangguo Fu

A novel control strategy for a pan-tilt-zoom camera is described. In the visual tracking system, the active camera is mounted on a moving autonomous helicopter while the tracked object is also moving, and the helicopter's vibration further disturbs the camera; as a result, image stabilization becomes poor and all pixels are in motion. Therefore, a biomimetic control strategy for the on-board pan-tilt-zoom camera is presented. In this paper, the biomimetic oculomotor control model is derived from the physiological neural pathway of eye movement control. To validate the biomimetic control model, simulation experiments were carried out under the same conditions as the corresponding physiological experiments. The biomimetic controller for the on-board pan-tilt-zoom camera is then developed. The results of flight tracking experiments show that the biomimetic controller can compensate for the deflection caused by the flight platform and enhance the performance of the visual tracking system.

IROS Conference 2006 Conference Paper

Real-time Vision-based Object Tracking from a Moving Platform in the Air

  • Wei Ding
  • Zhenbang Gong
  • Shaorong Xie
  • Hairong Zou

Generally, the scene covered by visual surveillance with a single camera is confined within certain limits, and extending the surveillance area requires multiple cameras. Visual surveillance attempts to detect, recognize and track certain objects from image sequences. This paper presents a simple and effective approach for segmenting and tracking a moving object from a moving platform in the air, such as a UAV or ballonet airship. The moving platform can follow the tracked object over a long distance. It can be used to track a suspicious vehicle and strengthen visual surveillance. The proposed approach is efficient and robust. The object is searched for only in a sub-window of the current frame. The position of the sub-window is estimated from its position in the last two frames. The features of the object are updated in each frame. The approach produces good tracking results at a frame rate of 25 fps. It is applicable to tracking a moving object in various scenes.
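One simple way to estimate a search sub-window from the last two frames is linear extrapolation. The sketch below assumes constant velocity between frames and clamps the window to the image bounds; the abstract does not state the actual estimation rule, so the function name, the constant-velocity model and the example numbers are all assumptions.

```python
def predict_subwindow(prev_center, last_center, size, frame_size):
    """Predict the next search window as (x0, y0, width, height).

    Assumes constant velocity: next = last + (last - prev). This model is an
    assumption of the sketch, not a rule taken from the paper.
    """
    px, py = prev_center
    lx, ly = last_center
    cx, cy = 2 * lx - px, 2 * ly - py  # extrapolated object centre
    w, h = size
    fw, fh = frame_size
    # Clamp the top-left corner so the window stays fully inside the frame.
    x0 = min(max(cx - w // 2, 0), fw - w)
    y0 = min(max(cy - h // 2, 0), fh - h)
    return x0, y0, w, h


# Hypothetical object centres in the two most recent frames of a 640x480 video.
print(predict_subwindow((100, 100), (110, 105), (40, 40), (640, 480)))
```

Restricting the search to this window keeps the per-frame cost roughly proportional to the window area rather than the full frame.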