Arrow Research search

Author name cluster

Tao Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

30 papers
2 author rows

Possible papers (30)

AAAI Conference 2025 Conference Paper

CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities

  • Tao Wu
  • Yong Zhang
  • Xintao Wang
  • Xianpan Zhou
  • Guangcong Zheng
  • Zhongang Qi
  • Ying Shan
  • Xi Li

Customized video generation aims to generate high-quality videos guided by text prompts and reference images of a subject. However, because it is trained only on static images, the fine-tuning process of subject learning disrupts the abilities of video diffusion models (VDMs) to combine concepts and generate motion. To restore these abilities, some methods use an additional video similar to the prompt to fine-tune or guide the model. This requires frequently changing the guiding videos, and even re-tuning the model, when generating different motions, which is very inconvenient for users. In this paper, we propose CustomCrafter, a novel framework that preserves the model's motion generation and concept composition abilities without any additional video or fine-tuning for recovery. To preserve the concept composition ability, we design a plug-and-play module that updates a few parameters in VDMs, enhancing the model's ability to capture appearance details and compose concepts for new subjects. For motion generation, we observe that VDMs tend to restore the motion of a video in the early stage of denoising, while focusing on recovering subject details in the later stage. We therefore propose a Dynamic Weighted Video Sampling Strategy: exploiting the pluggability of our subject learning module, we reduce its impact during the early stage of denoising to preserve the VDMs' ability to generate motion, and restore it in the later stage to repair the appearance details of the specified subject, ensuring the fidelity of the subject's appearance. Experimental results show that our method improves significantly over previous methods.

NeurIPS Conference 2025 Conference Paper

LithoSim: A Large, Holistic Lithography Simulation Benchmark for AI-Driven Semiconductor Manufacturing

  • Hongquan He
  • Zhen Wang
  • Jingya Wang
  • Tao Wu
  • Xuming He
  • Bei Yu
  • Jingyi Yu
  • Hao Geng

Lithography orchestrates a symphony of light, mask, and photochemicals to transfer integrated circuit patterns onto the wafer. Lithography simulation serves as the critical nexus between circuit design and manufacturing, where its speed and accuracy fundamentally govern the optimization quality of downstream resolution enhancement techniques (RET). While machine learning promises to circumvent the computational limitations of the lithography process through data-driven or physics-informed approximations of computational lithography, existing simulators suffer from inadequate lithographic awareness due to insufficient training data capturing essential process variations and mask correction rules. We present LithoSim, the most comprehensive lithography simulation benchmark to date, featuring over 4 million high-resolution input-output pairs with rigorous physical correspondence. The dataset systematically incorporates alterable optical source distributions, metal and via mask topologies with optical proximity correction (OPC) variants, and process windows reflecting fab-realistic variations. By integrating domain-specific metrics spanning AI performance and lithographic fidelity, LithoSim establishes a unified evaluation framework for data-driven and physics-informed computational lithography. The data (https://huggingface.co/datasets/grandiflorum/LithoSim), code (https://dw-hongquan.github.io/LithoSim), and pre-trained models (https://huggingface.co/grandiflorum/LithoSim) are released openly to support the development of hybrid ML-based and high-fidelity lithography simulation for the benefit of semiconductor manufacturing.

ECAI Conference 2025 Conference Paper

PMR: Physical Model-Driven Multi-Stage Restoration of Turbulent Dynamic Videos

  • Tao Wu
  • Jingyuan Ye
  • Cheng Zhou
  • Wenlong Chen
  • Zheng Liu
  • Huiming Zheng
  • Wei Liu
  • Ying Fu

Geometric distortions and blurring caused by atmospheric turbulence degrade the quality of long-range dynamic scene videos. Existing methods struggle to restore edge details and eliminate mixed distortions, especially under strong turbulence and complex dynamics. To address these challenges, we introduce a Dynamic Efficiency Index (DEI) that combines turbulence intensity, optical flow, and the proportion of dynamic regions to accurately quantify video dynamic intensity under varying turbulence conditions, and we provide a high-dynamic turbulence training dataset. Additionally, we propose a Physical Model-Driven Multi-Stage Video Restoration (PMR) framework that consists of three stages: de-tilting for geometric stabilization, motion segmentation enhancement for dynamic region refinement, and de-blurring for quality restoration. PMR employs lightweight backbones and stage-wise joint training to ensure both efficiency and high restoration quality. Experimental results demonstrate that the proposed method effectively suppresses motion-trailing artifacts, restores edge details, and exhibits strong generalization, especially in real-world scenarios characterized by high turbulence and complex dynamics. We will make the code and datasets openly available.

IROS Conference 2025 Conference Paper

Simpler Is Better: Revisiting Doppler Velocity for Enhanced Moving Object Tracking with FMCW LiDAR

  • Yubin Zeng
  • Tao Wu
  • Shouzheng Qi
  • Junxiang Li
  • Xingyu Duan
  • Youjin Yu

Real-time and accurate perception of dynamic objects is crucial for autonomous driving. To better capture the motion information of objects, some methods now employ 4D Doppler point clouds collected by frequency-modulated continuous-wave (FMCW) LiDAR to enhance the detection and tracking of moving objects. Compared to standard time-of-flight (ToF) LiDAR, FMCW LiDAR can provide the relative radial velocity of each point through the Doppler effect, offering a more detailed understanding of an object’s motion state. However, despite the proven efficacy of these methods, ablation studies reveal that the direct contribution of Doppler velocity to tracking is limited, with performance gains often resulting from improved object recognition and labeling accuracy. Revisiting the role of Doppler velocity, this study proposes DopplerTrack, a simple yet effective learning-free tracking method tailored for FMCW LiDAR. DopplerTrack harnesses Doppler velocity for efficient point cloud preprocessing and object detection with O(N) complexity. Furthermore, by exploring the potential motion directions of objects, it reconstructs the full velocity vector, enabling more direct and precise motion prediction. Extensive experiments on four datasets demonstrate that DopplerTrack outperforms existing learning-free and learning-based methods, achieving state-of-the-art tracking performance with strong generalization across diverse scenarios. Moreover, DopplerTrack runs efficiently at 120 Hz on a mobile CPU, making it highly practical for real-world deployment. The code and datasets have been released at https://github.com/12w2/DopplerTrack.

AAAI Conference 2024 Conference Paper

CR-SAM: Curvature Regularized Sharpness-Aware Minimization

  • Tao Wu
  • Tie Luo
  • Donald C. Wunsch II

The capacity to generalize to unseen future data stands as one of the most crucial attributes of deep neural networks. Sharpness-Aware Minimization (SAM) aims to enhance generalizability by minimizing the worst-case loss, using one-step gradient ascent as an approximation. However, as training progresses, the non-linearity of the loss landscape increases, rendering one-step gradient ascent less effective; on the other hand, multi-step gradient ascent incurs a higher training cost. In this paper, we introduce a normalized Hessian trace to accurately measure the curvature of the loss landscape on both training and test sets. In particular, to counter excessive non-linearity of the loss landscape, we propose Curvature Regularized SAM (CR-SAM), which integrates the normalized Hessian trace into SAM as a regularizer. Additionally, we present an efficient way to compute the trace via finite differences with parallelism. Our theoretical analysis based on PAC-Bayes bounds establishes the regularizer's efficacy in reducing generalization error. Empirical evaluation on the CIFAR and ImageNet datasets shows that CR-SAM consistently enhances the classification performance of ResNet and Vision Transformer (ViT) models across various datasets. Our code is available at https://github.com/TrustAIoT/CR-SAM.

AAAI Conference 2024 Conference Paper

LRS: Enhancing Adversarial Transferability through Lipschitz Regularized Surrogate

  • Tao Wu
  • Tie Luo
  • Donald C. Wunsch II

The transferability of adversarial examples is of central importance to transfer-based black-box adversarial attacks. Previous works on generating transferable adversarial examples focus on attacking given pretrained surrogate models, while the connections between surrogate models and adversarial transferability have been overlooked. In this paper, we propose Lipschitz Regularized Surrogate (LRS) for transfer-based black-box attacks, a novel approach that transforms surrogate models toward favorable adversarial transferability. Using such transformed surrogate models, any existing transfer-based black-box attack can run without any change yet achieve much better performance. Specifically, we impose Lipschitz regularization on the loss landscape of surrogate models to enable a smoother and more controlled optimization process for generating more transferable adversarial examples. In addition, this paper sheds light on the connection between the inner properties of surrogate models and adversarial transferability, identifying three factors: a smaller local Lipschitz constant, a smoother loss landscape, and stronger adversarial robustness. We evaluate our proposed LRS approach by attacking state-of-the-art standard deep neural networks and defense models. The results demonstrate significant improvements in attack success rates and transferability. Our code is available at https://github.com/TrustAIoT/LRS.

IJCAI Conference 2024 Conference Paper

ReinforceNS: Reinforcement Learning-based Multi-start Neighborhood Search for Solving the Traveling Thief Problem

  • Tao Wu
  • Huachao Cui
  • Tao Guan
  • Yuesong Wang
  • Yan Jin

The Traveling Thief Problem (TTP) is a challenging combinatorial optimization problem with broad practical applications. TTP combines two NP-hard problems: the Traveling Salesman Problem (TSP) and the Knapsack Problem (KP). While a number of machine learning and deep learning based algorithms have been developed for TSP and KP, there is limited research dedicated to TTP. In this paper, we present the first reinforcement learning based multi-start neighborhood search algorithm, denoted ReinforceNS, for solving TTP. To accelerate the search, we employ a pre-processing procedure for neighborhood reduction. A TSP routing and an iterated greedy packing procedure are independently used to construct a high-quality initial solution, which is further improved by a reinforcement learning based neighborhood search. Additionally, a post-optimization procedure is devised for continued solution improvement. We conduct extensive experiments on 60 commonly used benchmark instances from the literature, ranging from 76 to 33,810 cities. The experimental results demonstrate that our ReinforceNS algorithm outperforms three state-of-the-art algorithms in solution quality under the same time limit. In particular, ReinforceNS achieves new best results on 12 of the 18 instances publicly reported in a recent TTP competition. We also perform an additional experiment to validate the effectiveness of the reinforcement learning strategy.

AAAI Conference 2024 Conference Paper

SphereDiffusion: Spherical Geometry-Aware Distortion Resilient Diffusion Model

  • Tao Wu
  • Xuewei Li
  • Zhongang Qi
  • Di Hu
  • Xintao Wang
  • Ying Shan
  • Xi Li

Controllable spherical panoramic image generation holds substantial applicative potential across a variety of domains. However, it remains a challenging task due to the sphere's inherent distortion and geometry characteristics, which lead to low-quality content generation. In this paper, we introduce SphereDiffusion, a novel framework that addresses these unique challenges to better generate high-quality, precisely controllable spherical panoramic images. For the spherical distortion characteristic, we embed the semantics of the distorted object with text encoding, then explicitly construct the text-object correspondence to better exploit the pre-trained knowledge of planar images. Meanwhile, we employ a deformable technique to mitigate the semantic deviation in latent space caused by spherical distortion. For the spherical geometry characteristic, by virtue of spherical rotation invariance, we improve the data diversity and optimization objectives during training, enabling the model to better learn the spherical geometry characteristic. Furthermore, we enhance the denoising process of the diffusion model, enabling it to effectively use the learned geometric characteristic to ensure the boundary continuity of the generated images. With these techniques, experiments on the Structured3D dataset show that SphereDiffusion significantly improves the quality of controllable spherical image generation, with a relative FID reduction of around 35% on average.

IJCAI Conference 2023 Conference Paper

SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation

  • Xuewei Li
  • Tao Wu
  • Zhongang Qi
  • Gaoang Wang
  • Ying Shan
  • Xi Li

As an important and challenging problem in computer vision, PAnoramic Semantic Segmentation (PASS) gives complete scene perception based on an ultra-wide angle of view. Prevalent PASS methods with 2D panoramic image input focus on solving image distortions but lack consideration of the 3D properties of the original 360-degree data; consequently, their performance drops substantially when the input panoramic images contain 3D disturbance. To be more robust to 3D disturbance, we propose the Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation (SGAT4PASS), which incorporates 3D spherical geometry knowledge. Specifically, we propose a spherical geometry-aware framework for PASS comprising three modules, i.e., spherical geometry-aware image projection, spherical deformable patch embedding, and a panorama-aware loss, which respectively take input images with 3D disturbance into account, add a spherical geometry-aware constraint to the existing deformable patch embedding, and reflect the pixel density of the original 360-degree data. Experimental results on the Stanford2D3D Panoramic dataset show that SGAT4PASS significantly improves performance and robustness, with approximately a 2% increase in mIoU, and when small 3D disturbances occur in the data, the stability of our performance improves by an order of magnitude. Our code and supplementary material are available at https://github.com/TencentARC/SGAT4PASS.

AAAI Conference 2023 Conference Paper

SpatialFormer: Semantic and Target Aware Attentions for Few-Shot Learning

  • Jinxiang Lai
  • Siqian Yang
  • Wenlong Wu
  • Tao Wu
  • Guannan Jiang
  • Xi Wang
  • Jun Liu
  • Bin-Bin Gao

Recent Few-Shot Learning (FSL) methods emphasize generating discriminative embedding features to precisely measure the similarity between support and query sets. Current CNN-based cross-attention approaches generate discriminative representations by enhancing the mutually semantically similar regions of support and query pairs. However, they suffer from two problems: the CNN structure produces inaccurate attention maps based on local features, and mutually similar backgrounds cause distraction. To alleviate these problems, we design a novel SpatialFormer structure that generates more accurate attention regions based on global features. Unlike the traditional Transformer, which models intrinsic instance-level similarity and causes accuracy degradation in FSL, our SpatialFormer explores the semantic-level similarity between pair inputs to boost performance. We then derive two specific attention modules, named SpatialFormer Semantic Attention (SFSA) and SpatialFormer Target Attention (SFTA), to enhance the target object regions while reducing background distraction. In particular, SFSA highlights regions with the same semantic information between pair features, and SFTA finds potential foreground object regions of a novel feature that are similar to base categories. Extensive experiments show that our methods are effective and achieve new state-of-the-art results on few-shot classification benchmarks.

JBHI Journal 2022 Journal Article

Learning From Highly Confident Samples for Automatic Knee Osteoarthritis Severity Assessment: Data From the Osteoarthritis Initiative

  • Yifan Wang
  • Zhaori Bi
  • Yuxue Xie
  • Tao Wu
  • Xuan Zeng
  • Shuang Chen
  • Dian Zhou

Knee osteoarthritis (OA) is a chronic disease that considerably reduces patients' quality of life. Preventive therapies require early detection and lifetime monitoring of OA progression. In the clinical environment, the severity of OA is classified by the Kellgren and Lawrence (KL) grading system, ranging from KL-0 to KL-4. Recently, deep learning methods have been applied to OA severity assessment to improve accuracy and efficiency. However, this task remains challenging due to the ambiguity between adjacent grades, especially in early-stage OA. Low-confidence samples, which are less representative than typical ones, undermine the training process. Targeting the uncertainty in the OA dataset, we propose a novel learning scheme that dynamically separates the data into two sets according to their reliability, and we design a hybrid loss function to help the CNN learn from the two sets accordingly. With the proposed approach, we emphasize the typical samples and control the impact of low-confidence cases. Experiments are conducted in a five-fold manner on the five-class task and the early-stage OA task. Our method achieves a mean accuracy of 70.13% on the five-class OA assessment task, outperforming all other state-of-the-art methods. Although early-stage OA detection still benefits from human intervention in lesion region selection, our approach achieves superior performance on the KL-0 vs. KL-2 task. Moreover, we design an experiment to validate large-scale automatic data refining during training; the result verifies the ability to characterize low-confidence samples. The dataset used in this paper was obtained from the Osteoarthritis Initiative.

AAAI Conference 2022 Conference Paper

Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

  • Zhenzhi Wang
  • Limin Wang
  • Tao Wu
  • Tianhao Li
  • Gangshan Wu

Temporal grounding aims to localize a video moment that is semantically aligned with a given natural language query. Existing methods typically apply a detection or regression pipeline on the fused representation, with the research focus on designing complicated prediction heads or fusion strategies. Instead, viewing temporal grounding as a metric-learning problem, we present a Mutual Matching Network (MMN) to directly model the similarity between language queries and video moments in a joint embedding space. This new metric-learning framework enables fully exploiting negative samples from two new aspects: constructing negative cross-modal pairs in a mutual matching scheme and mining negative pairs across different videos. These new negative samples enhance the joint representation learning of the two modalities via cross-modal mutual matching to maximize their mutual information. Experiments show that our MMN achieves highly competitive performance compared with state-of-the-art methods on four video grounding benchmarks. Based on MMN, we present a winning solution for the HC-STVG challenge of the 3rd PIC workshop. This suggests that metric learning remains a promising approach for temporal grounding, capturing the essential cross-modal correlation in a joint embedding space. Code is available at https://github.com/MCG-NJU/MMN.

NeurIPS Conference 2016 Conference Paper

General Tensor Spectral Co-clustering for Higher-Order Data

  • Tao Wu
  • Austin Benson
  • David Gleich

Spectral clustering and co-clustering are well-known techniques in data analysis, and recent work has extended spectral clustering to square, symmetric tensors and hypermatrices derived from a network. We develop a new tensor spectral co-clustering method that simultaneously clusters the rows, columns, and slices of a nonnegative three-mode tensor and generalizes to tensors with any number of modes. The algorithm is based on a new random walk model, which we call the super-spacey random surfer. We show that our method outperforms state-of-the-art co-clustering methods on several synthetic datasets with ground-truth clusters and then use the algorithm to analyze several real-world datasets.

IJCAI Conference 2015 Conference Paper

Determining Expert Research Areas with Multi-Instance Learning of Hierarchical Multi-Label Classification Model

  • Tao Wu
  • Qifan Wang
  • Zhiwei Zhang
  • Luo Si

Automatically identifying the research areas of academic and industry researchers is an important task for building expertise organizations or search systems. In general, this task can be viewed as text classification that generates a set of research areas given evidence of a researcher's expertise, such as publication documents. However, the task is challenging because the evidence for a research area may exist in only a few documents rather than in all of them. Moreover, research areas are often organized in a hierarchy, which limits the effectiveness of existing text categorization methods. This paper proposes a novel approach, the Multi-instance Learning of Hierarchical Multi-label Classification Model (MIHML), which effectively identifies multiple research areas in a hierarchy from individual documents within a researcher's profile. An Expectation-Maximization (EM) optimization algorithm is designed to learn the model parameters. Extensive experiments demonstrate the superior performance of the proposed approach on a real-world application.

IROS Conference 2013 Conference Paper

Light-weight localization for vehicles using road markings

  • Ananth Ranganathan
  • David Ilstrup
  • Tao Wu

Traditional vision-based localization methods such as visual SLAM suffer from practical problems in outdoor environments, including unstable feature detection and an inability to perform location recognition under lighting, perspective, weather, and appearance changes. Additionally, large-scale map construction in these systems presents its own challenges. In this work, we present a novel method for precisely localizing vehicles on the road using signs marked on the road (road markings), which have the advantage of being distinct and easy to detect, with detection that is robust to changes in lighting and weather. Our method uses corners detected on road markings to perform localization in global coordinates. The method consists of two phases: a mapping phase, in which a high-quality GPS device is used to automatically survey road markings and add them to a light-weight “map” or database, and a localization phase, in which road marking detection and look-up in the map, combined with visual odometry, produce precise localization. We present experiments using a real-time implementation operating in a car that demonstrate the improved localization robustness and accuracy of our system even when using road markings alone. However, in this case the trajectory between road markings has to be filled in by visual odometry, which contributes drift. Hence, we also present a mechanism for combining road-marking-based maps with sparse feature-based maps that yields still greater accuracy. We see our use of road markings as a significant step in the general trend of using higher-level features to improve localization performance irrespective of environmental conditions.