Author name cluster

Yong Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

49 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

AdaCuRL: Adaptive Curriculum Reinforcement Learning with Invalid Sample Mitigation and Historical Revisiting

  • Renda Li
  • Hailang Huang
  • Fei Wei
  • Feng Xiong
  • Yong Wang
  • Xiangxiang Chu

Reinforcement learning (RL) has demonstrated considerable potential for enhancing reasoning in large language models (LLMs). However, existing methods suffer from Gradient Starvation and Policy Degradation when training directly on samples with mixed difficulty. To mitigate this, prior approaches leverage Chain-of-Thought (CoT) data, but the construction of high-quality CoT annotations remains labor-intensive. Alternatively, curriculum learning strategies have been explored but frequently encounter challenges, such as difficulty mismatch, reliance on manual curriculum design, and catastrophic forgetting. To address these issues, we propose AdaCuRL, an Adaptive Curriculum Reinforcement Learning framework that integrates coarse-to-fine difficulty estimation with adaptive curriculum scheduling. This approach dynamically aligns data difficulty with model capability and incorporates a data revisitation mechanism to mitigate catastrophic forgetting. Furthermore, AdaCuRL employs adaptive reference and sparse KL strategies to prevent Policy Degradation. Extensive experiments across diverse reasoning benchmarks demonstrate that AdaCuRL consistently achieves significant performance improvements on both LLMs and MLLMs.

AAAI Conference 2026 Conference Paper

CaT-Diff: Cascaded Text-enhanced Diffusion Model for Time-Series Imputation

  • Changjian Xu
  • Yong Wang
  • Ruizheng Huang
  • Zhicheng Zhang
  • Wen Yin
  • Kexin Li

Most state-of-the-art time series imputation methods can leverage textual information to improve imputation quality, but they often struggle because they fail to effectively filter noisy information from large language model (LLM) derived textual information. Some existing solutions only filter over the entire token set, which can introduce erroneous conditional constraints, extreme token frequency effects and increased computational complexity. To address this, we propose CaT-Diff, a novel cascaded text-enhanced diffusion model for probabilistic imputation of multivariate time series under Missing Not At Random (MNAR) scenarios. To suppress irrelevant semantics and focus on context most predictive of missing values, CaT-Diff introduces an innovative Hierarchical Semantic Filter (HSF) that collaborates with a Mixture-of-Experts (MoE) Network. The MoE projects heterogeneous text embeddings into the time series latent space, and the HSF cascade-filters text embeddings from the segment level to the token level, thereby avoiding the pitfalls of direct token-level filtering and reducing overhead. We also incorporate a lightweight Missing Mechanism Estimator, jointly optimized with the denoising network to explicitly capture MNAR missingness patterns. Extensive tests on nine domains show that CaT-Diff outperforms state-of-the-art baselines. Our work presents a new approach for selectively fusing LLM-derived textual information.

AAAI Conference 2026 Conference Paper

Shedding the Facades, Connecting the Domains: Detecting Shifting Multimodal Hate Video with Test-Time Adaptation

  • Jiao Li
  • Jian Lang
  • Xikai Tang
  • Wenzheng Shu
  • Ting Zhong
  • Qiang Gao
  • Yong Wang
  • Leiting Chen

Hate Video Detection (HVD) is crucial for online ecosystems. Existing methods assume identical distributions between training (source) and inference (target) data. However, hateful content often evolves into irregular and ambiguous forms to evade censorship, resulting in substantial semantic drift and rendering previously trained models ineffective. Test-Time Adaptation (TTA) offers a solution by adapting models during inference to narrow the cross-domain gap, but conventional TTA methods target mild distribution shifts and struggle with the severe semantic drift in HVD. To tackle these challenges, we propose SCANNER, the first TTA framework tailored for HVD. Motivated by the insight that, despite the evolving nature of hateful manifestations, their underlying cores remain largely invariant (i.e., targeting is still based on characteristics like gender, race, etc.), we leverage these stable cores as a bridge to connect the source and target domains. Specifically, SCANNER initially reveals the stable cores from the ambiguous layout in evolving hateful content via a principled centroid-guided alignment mechanism. To alleviate the impact of outlier-like samples that are weakly correlated with centroids during the alignment process, SCANNER enhances the prior by incorporating a sample-level adaptive centroid alignment strategy, promoting more stable adaptation. Furthermore, to mitigate semantic collapse from overly uniform outputs within clusters, SCANNER introduces an intra-cluster diversity regularization that encourages cluster-wise semantic richness. Experiments show that SCANNER outperforms all baselines, with an average gain of 4.69% in Macro-F1 over the best.

AAAI Conference 2026 Conference Paper

Where and What Matters: Sensitivity-Aware Task Vectors for Many-Shot Multimodal In-Context Learning

  • Ziyu Ma
  • Chenhui Gou
  • Yiming Hu
  • Yong Wang
  • Bohan Zhuang
  • Jianfei Cai

Large Multimodal Models (LMMs) have shown promising in-context learning (ICL) capabilities, but scaling to many-shot settings remains difficult due to limited context length and high inference cost. To address these challenges, task-vector-based methods have been explored by inserting compact representations of many-shot in-context demonstrations into model activations. However, existing task-vector-based methods either overlook the importance of where to insert task vectors or struggle to determine suitable values for each location. To this end, we propose a novel Sensitivity-aware Task Vector insertion framework (STV) to figure out where and what to insert. Our key insight is that activation deltas across query-context pairs exhibit consistent structural patterns, providing a reliable cue for insertion. Based on the identified sensitive-aware locations, we construct a pre-clustered activation bank for each location by clustering the activation values, and then apply reinforcement learning to choose the most suitable one to insert. We evaluate STV across a range of multimodal models (e.g., Qwen-VL, Idefics-2) and tasks (e.g., VizWiz, OK-VQA), demonstrating its effectiveness and showing consistent improvements over previous task-vector-based methods with strong generalization.

ICRA Conference 2025 Conference Paper

METDrive: Multimodal End-to-End Autonomous Driving with Temporal Guidance

  • Ziang Guo
  • Xinhao Lin
  • Zakhar Yagudin
  • Artem Lykov
  • Yong Wang
  • Yanqiang Li
  • Dzmitry Tsetserukou

Multimodal end-to-end autonomous driving has shown promising advancements in recent work. By embedding more modalities into end-to-end networks, the system's understanding of both static and dynamic aspects of the driving environment is enhanced, thereby improving the safety of autonomous driving. In this paper, we introduce METDrive, an end-to-end system that leverages temporal guidance from the embedded time series features of ego states, including rotation angles, steering, throttle signals, and waypoint vectors. The geometric features derived from the perception sensor data and the time series features of ego state data jointly guide the waypoint prediction with the proposed temporal guidance loss function. We evaluated METDrive on the CARLA leaderboard benchmarks, achieving a driving score of 70%, a route completion score of 94%, and an infraction score of 0.78.

JBHI Journal 2025 Journal Article

MiRNA-disease Association Prediction via Cosine Annealing and Multi-Head Self-Attention in HyperGCN

  • Rong Zhu
  • Jie Zheng
  • Zheng-Hua Chang
  • Jin-Xing Liu
  • Jun-Liang Shang
  • Yong Wang

Biological studies have demonstrated that understanding the association between miRNAs and disease is critical for disease prevention, assessment, and therapy. However, traditional experimental methods for inferring these connections are not only costly but also inefficient. Hence, there is a pressing need to develop novel methods to improve the accuracy and efficiency of forecasting. Currently, graph convolutional network (GCN) techniques are among the mainstream methods for predicting disease correlations. Nevertheless, traditional GCNs suffer from gradient vanishing and gradient explosion problems when dealing with long-range dependencies. To overcome these problems, we suggest a new approach called HGCMMDA, which relies on HyperGCN and combines a cosine annealing algorithm and a multi-head self-attention mechanism. In HGCMMDA, similarity networks for miRNAs and diseases are constructed, and GCN is used for feature extraction. A heterogeneity hypergraph is then built via HyperGCN for improved information propagation. Multi-head self-attention captures diverse node relations, while cosine annealing adjusts the learning rate. A combined BCE-Dice loss ensures accurate prediction. To evaluate the effectiveness of the proposed method, a comprehensive set of experiments was conducted using the Human microRNA Disease Database (HMDD v3.2). The method achieved a peak area under the receiver operating characteristic curve (AUC) of 0.9515, along with competitive performance in other evaluation metrics. The experimental findings indicate that HGCMMDA achieves notable enhancements over previously established approaches. These results strongly support the assertion that HGCMMDA serves as a dependable and effective framework for comprehensively exploring the intricate associations between microRNAs and human diseases.
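
The cosine-annealed learning rate mentioned in this abstract can be illustrated with the commonly used schedule below. This is a minimal sketch and not the HGCMMDA implementation: the function name `cosine_annealing_lr` and the values of `lr_max`, `lr_min`, and `total_steps` are assumptions chosen only for illustration, and the abstract does not say whether restarts or warm-up are used.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max=1e-3, lr_min=1e-5):
    """Hypothetical helper: standard cosine annealing without restarts.

    The rate starts at lr_max (step 0) and decays along half a cosine
    period to lr_min (step == total_steps).
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that out-of-range steps stay at the schedule's end points.
    step = min(max(step, 0), total_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / total_steps))
    return lr_min + (lr_max - lr_min) * cosine


if __name__ == "__main__":
    # Print a few points of an assumed 100-step schedule.
    for s in (0, 25, 50, 75, 100):
        print(s, cosine_annealing_lr(s, 100))
```

The schedule changes slowly near both ends and fastest in the middle, which is the usual motivation for preferring it over a linear decay.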

JBHI Journal 2025 Journal Article

Transformer-Based Weakly Supervised Learning for Whole Slide Lung Cancer Image Classification

  • Jianpeng An
  • Yong Wang
  • Qing Cai
  • Gang Zhao
  • Stephan Dooper
  • Geert Litjens
  • Zhongke Gao

Image analysis can play an important role in supporting histopathological diagnoses of lung cancer, with deep learning methods already achieving remarkable results. However, due to the large scale of whole-slide images (WSIs), creating manual pixel-wise annotations from expert pathologists is expensive and time-consuming. In addition, the heterogeneity of tumors and similarities in the morphological phenotype of tumor subtypes have caused inter-observer variability in annotations, which limits optimal performance. Effective use of weak labels could potentially alleviate these issues. In this paper, we propose a two-stage transformer-based weakly supervised learning framework called Simple Shuffle-Remix Vision Transformer (SSRViT). Firstly, we introduce a Shuffle-Remix Vision Transformer (SRViT) to retrieve discriminative local tokens and extract effective representative features. Then, the token features are selected and aggregated to generate sparse representations of WSIs, which are fed into a simple transformer-based classifier (SViT) for slide-level prediction. Experimental results demonstrate that the performance of our proposed SSRViT is significantly improved compared with other state-of-the-art methods in discriminating between adenocarcinoma, pulmonary sclerosing pneumocytoma and normal lung tissue (accuracy of 96.9% and AUC of 99.6%).

NeurIPS Conference 2024 Conference Paper

Classifier Clustering and Feature Alignment for Federated Learning under Distributed Concept Drift

  • Junbao Chen
  • Jingfeng Xue
  • Yong Wang
  • Zhenyan Liu
  • Lu Huang

Data heterogeneity is one of the key challenges in federated learning, and many efforts have been devoted to tackling this problem. However, distributed concept drift with data heterogeneity, where clients may additionally experience different concept drifts, is a largely unexplored area. In this work, we focus on real drift, where the conditional distribution $P(\mathcal{Y}|\mathcal{X})$ changes. We first study how distributed concept drift affects the model training and find that the local classifier plays a critical role in drift adaptation. Moreover, to address data heterogeneity, we study the feature alignment under distributed concept drift, and find two factors that are crucial for feature alignment: the conditional distribution $P(\mathcal{Y}|\mathcal{X})$ and the degree of data heterogeneity. Motivated by the above findings, we propose FedCCFA, a federated learning framework with classifier clustering and feature alignment. To enhance collaboration under distributed concept drift, FedCCFA clusters local classifiers at class-level and generates clustered feature anchors according to the clustering results. Assisted by these anchors, FedCCFA adaptively aligns clients' feature spaces based on the entropy of label distribution $P(\mathcal{Y})$, alleviating the inconsistency in feature space. Our results demonstrate that FedCCFA significantly outperforms existing methods under various concept drift settings. Code is available at https://github.com/Chen-Junbao/FedCCFA.

AAAI Conference 2024 Short Paper

Improving IP Geolocation With Target-Centric IP Graph (Student Abstract)

  • Kai Yang
  • Jiayang Li
  • Wenxin Tai
  • Zhenhui Li
  • Ting Zhong
  • Guangqiang Yin
  • Yong Wang

Accurate IP geolocation is indispensable for location-aware applications. While recent advances based on router-centric IP graphs are considered cutting-edge, one challenge remains: the prevalence of sparse IP graphs (14.24% with fewer than 10 nodes, 9.73% isolated) limits graph learning. To mitigate this issue, we designate the target host as the central node and aggregate multiple last-hop routers to construct the target-centric IP graph, instead of relying solely on the router with the smallest last-hop latency as in previous works. Experiments on three real-world datasets show that our method significantly improves the geolocation accuracy compared to existing baselines.

AAAI Conference 2024 Conference Paper

On the Concept Trustworthiness in Concept Bottleneck Models

  • Qihan Huang
  • Jie Song
  • Jingwen Hu
  • Haofei Zhang
  • Yong Wang
  • Mingli Song

Concept Bottleneck Models (CBMs), which break down the reasoning process into the input-to-concept mapping and the concept-to-label prediction, have garnered significant attention due to their remarkable interpretability achieved by the interpretable concept bottleneck. However, despite the transparency of the concept-to-label prediction, the mapping from the input to the intermediate concept remains a black box, giving rise to concerns about the trustworthiness of the learned concepts (i.e., these concepts may be predicted based on spurious cues). The issue of concept untrustworthiness greatly hampers the interpretability of CBMs, thereby hindering their further advancement. To conduct a comprehensive analysis of this issue, in this study we establish a benchmark to assess the trustworthiness of concepts in CBMs. A pioneering metric, referred to as concept trustworthiness score, is proposed to gauge whether the concepts are derived from relevant regions. Additionally, an enhanced CBM is introduced, enabling concept predictions to be made specifically from distinct parts of the feature map, thereby facilitating the exploration of their related regions. Besides, we introduce three modules, namely the cross-layer alignment (CLA) module, the cross-image alignment (CIA) module, and the prediction alignment (PA) module, to further enhance the concept trustworthiness within the elaborated CBM. The experiments on five datasets across ten architectures demonstrate that, without using any concept localization annotations during training, our model improves the concept trustworthiness by a large margin while achieving superior accuracy to the state of the art. Our code is available at https://github.com/hqhQAQ/ProtoCBM.

JBHI Journal 2023 Journal Article

A Real-Time Bionic Method Inspired by Neural Oscillators for Estimation and Extraction of Pathological Tremor

  • Feiyun Xiao
  • Biqing Zhong
  • Yanming Wu
  • Yong Wang

Parkinson's disease is the typical representative of pathological tremor. One pathogenic mechanism is that synchronized neural oscillations within and between brain areas are affected. Inspired by this, this work proposes a neural-oscillator-based algorithm, named RTBNO, to extract voluntary motion and estimate tremor motion in real time. The algorithm is composed of a linear combiner of multiple adaptive modified Hopf oscillators. The combiner is divided into two parts: one estimates tremor motion and the other estimates voluntary motion. Because it is updated iteratively in real time, the method has no phase delay. The performance of the proposed method was verified on simulated action tremor and on experimental data from twenty Parkinson's disease patients. For the patients' rest tremor signals, the mean Root Mean Square Error (RMSE) between the estimated signal and the actual signal was 0.0272±0.0077. The mean RMSE values between the estimated voluntary movement from action tremor and the actual voluntary movement were 0.0360±0.0097 (pick and put motion) and 0.0380±0.0083 (drawing motion). The execution time for the corresponding 10 seconds of data was 0.0478 s. Comparison with existing methods demonstrated the effectiveness of the proposed method.

NeurIPS Conference 2023 Conference Paper

Optimized Covariance Design for AB Test on Social Network under Interference

  • Qianyi Chen
  • Bo Li
  • Lu Deng
  • Yong Wang

Online A/B tests have become increasingly popular and important for social platforms. However, accurately estimating the global average treatment effect (GATE) has proven to be challenging due to network interference, which violates the Stable Unit Treatment Value Assumption (SUTVA) and poses a great challenge to experimental design. Existing network experimental design research was mostly based on the unbiased Horvitz-Thompson (HT) estimator with substantial data trimming to ensure unbiasedness at the price of high resultant estimation variance. In this paper, we strive to balance the bias and variance in designing randomized network experiments. Under a potential outcome model with 1-hop interference, we derive the bias and variance of the standard HT estimator and reveal their relation to the network topological structure and the covariance of the treatment assignment vector. We then propose to formulate the experimental design problem as optimizing the covariance matrix of the treatment assignment vector to achieve the bias and variance balance by minimizing the mean squared error (MSE) of the estimator. An efficient projected gradient descent algorithm is presented to implement the desired randomization scheme. Finally, we carry out extensive simulation studies to demonstrate the advantages of our proposed method over other existing methods in many settings, with different levels of model misspecification.

AAAI Conference 2022 Short Paper

Large-Scale IP Usage Identification via Deep Ensemble Learning (Student Abstract)

  • Zhiyuan Wang
  • Fan Zhou
  • Kunpeng Zhang
  • Yong Wang

Understanding users’ behavior via IP addresses is essential for numerous practical IP-based applications such as online content delivery, fraud prevention, and many others. Among these, IP address profiling, such as IP geolocation and anomaly detection, has been extensively studied. However, less is known about the scenario of an IP address, e.g., dedicated enterprise network or home broadband. In this work, we make the first attempt to address a large-scale IP scenario prediction problem. Specifically, we collect IP scenario data from four regions and propose a novel deep ensemble learning-based model to learn IP assignment rules and complex feature interactions. Extensive experiments support that our method can make accurate IP scenario identification and generalize from data in one region to another.

AAAI Conference 2020 Conference Paper

Go From the General to the Particular: Multi-Domain Translation with Domain Transformation Networks

  • Yong Wang
  • Longyue Wang
  • Shuming Shi
  • Victor O.K. Li
  • Zhaopeng Tu

The key challenge of multi-domain translation lies in simultaneously encoding both the general knowledge shared across domains and the particular knowledge distinctive to each domain in a unified model. Previous work shows that the standard neural machine translation (NMT) model, trained on mixed-domain data, generally captures the general knowledge, but misses the domain-specific knowledge. In response to this problem, we augment the NMT model with additional domain transformation networks to transform the general representations to domain-specific representations, which are subsequently fed to the NMT decoder. To guarantee the knowledge transformation, we also propose two complementary supervision signals by leveraging the power of knowledge distillation and adversarial learning. Experimental results on several language pairs, covering both balanced and unbalanced multi-domain translation, demonstrate the effectiveness and universality of the proposed approach. Encouragingly, the proposed unified model achieves comparable results with the fine-tuning approach that requires multiple models to preserve the particular knowledge. Further analyses reveal that the domain transformation networks successfully capture the domain-specific knowledge as expected.

IJCAI Conference 2020 Conference Paper

Lexical-Constraint-Aware Neural Machine Translation via Data Augmentation

  • Guanhua Chen
  • Yun Chen
  • Yong Wang
  • Victor O. K. Li

Leveraging lexical constraints is extremely significant in domain-specific machine translation and interactive machine translation. Previous studies mainly focus on extending the beam search algorithm or augmenting the training corpus by replacing source phrases with the corresponding target translation. These methods either suffer from heavy computation cost during inference or depend on the quality of the bilingual dictionary pre-specified by the user or constructed with statistical machine translation. In response to these problems, we present a conceptually simple and empirically effective data augmentation approach for lexically constrained neural machine translation. Specifically, we make constraint-aware training data by first randomly sampling phrases of the reference as constraints, and then packing them together into the source sentence with a separation symbol. Extensive experiments on several language pairs demonstrate that our approach achieves superior translation results over the existing systems, improving translation of constrained sentences without hurting the unconstrained ones.
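
The constraint-aware data construction described in this abstract (sample reference phrases, then pack them into the source with a separation symbol) can be sketched as follows. This is an illustrative assumption, not the authors' code: the separator string `<sep>`, the function name `add_constraints`, whitespace tokenization, and the limits `max_constraints` and `max_len` are all hypothetical choices that the abstract does not specify.

```python
import random

SEP = "<sep>"  # hypothetical separation symbol; the abstract does not name one


def add_constraints(source, reference, max_constraints=2, max_len=3, rng=None):
    """Hypothetical helper: build one constraint-aware training source.

    Randomly samples contiguous phrases from the reference translation and
    appends them to the source sentence, each preceded by SEP.
    Returns the augmented source and the list of sampled constraints.
    """
    rng = rng or random.Random()
    ref_tokens = reference.split()
    constraints = []
    if ref_tokens:
        # Sampling zero constraints is allowed, so some examples stay plain.
        for _ in range(rng.randint(0, max_constraints)):
            length = rng.randint(1, min(max_len, len(ref_tokens)))
            start = rng.randint(0, len(ref_tokens) - length)
            constraints.append(" ".join(ref_tokens[start:start + length]))
    augmented = source
    for phrase in constraints:
        augmented += f" {SEP} {phrase}"
    return augmented, constraints


if __name__ == "__main__":
    rng = random.Random(0)
    print(add_constraints("ein kleiner Test", "a small test", rng=rng))
```

Because the constraints come from the reference itself, no bilingual dictionary is needed at training time, which is the dependency the abstract says earlier methods suffer from.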

AAAI Conference 2020 Conference Paper

PSENet: Psoriasis Severity Evaluation Network

  • Yi Li
  • Zhe Wu
  • Shuang Zhao
  • Xian Wu
  • Yehong Kuang
  • YangTian Yan
  • Shen Ge
  • Kai Wang

Psoriasis is a chronic skin disease which affects hundreds of millions of people around the world. This disease cannot be fully cured and requires lifelong caring. If the deterioration of Psoriasis is not detected and properly treated in time, it could cause serious complications or even lead to a life threat. Therefore, a quantitative measurement that can track the Psoriasis severity is necessary. Currently, PASI (Psoriasis Area and Severity Index) is the most frequently used measurement in clinical practices. However, PASI has the following disadvantages: (1) Time consuming: calculating PASI usually takes more than 30 minutes which poses a heavy burden on dermatologists; and (2) Inconsistency: due to the complexity of PASI calculation, different or even the same dermatologist could give different scores for the same case. To overcome these drawbacks, we propose PSENet which applies deep neural networks to estimate Psoriasis severity based on skin lesion images. Different from typical deep learning frameworks for image processing, PSENet has the following characteristics: (1) PSENet introduces a score refine module which is able to capture the visual features of skin at both coarse and fine-grained granularities; (2) PSENet uses siamese structure in training and accepts pairwise inputs, which reduces the dependency on large amount of training data; and (3) PSENet can not only estimate the severity, but also locate the skin lesion regions from the input image. To train and evaluate PSENet, we work with professional dermatologists from a top hospital and spend years in building a golden dataset. The experimental results show that PSENet can achieve the mean absolute error of 2.21 and the accuracy of 77.87% in pair comparison, outperforming baseline methods. Overall, PSENet not only relieves dermatologists from the dull PASI calculation but also enables patients to track Psoriasis severity in a much more convenient manner.

IROS Conference 2019 Conference Paper

Local Pose optimization with an Attention-based Neural Network

  • Yiling Liu
  • Hesheng Wang 0001
  • Fan Xu 0004
  • Yong Wang
  • Weidong Chen 0001
  • Qirong Tang

In this paper, we propose a novel pose optimizer which can be inserted into either supervised or unsupervised end-to-end visual odometry for the purpose of local pose optimization. The pose optimizer is an analogue of the pose graph optimization used in traditional VSLAM algorithms. Local pose optimization is performed by an attention-based neural network which iteratively refines the predicted pose estimates of an image snippet. Instead of a complicated graph convolutional network, an attention mechanism based on the geometric consistency of the trajectory constraint is utilized, because pose features whose spatial distribution is not important can be flattened to vectors and then processed. The pose optimizer is aimed at improving pose estimation accuracy by redistributing errors of pose estimates. Quantitative and qualitative evaluation of the proposed approach on the KITTI Odometry dataset [1] is presented to demonstrate its effectiveness in improving pose estimation accuracy and minimizing pose drift.

AAAI Conference 2018 Conference Paper

Search Engine Guided Neural Machine Translation

  • Jiatao Gu
  • Yong Wang
  • Kyunghyun Cho
  • Victor O.K. Li

In this paper, we extend an attention-based neural machine translation (NMT) model by allowing it to access an entire training set of parallel sentence pairs even after training. The proposed approach consists of two stages. In the first stage (the retrieval stage), an off-the-shelf, black-box search engine is used to retrieve a small subset of sentence pairs from a training set given a source sentence. These pairs are further filtered using a fuzzy matching score based on edit distance. In the second stage (the translation stage), a novel translation model, called search engine guided NMT (SEG-NMT), seamlessly uses both the source sentence and a set of retrieved sentence pairs to perform the translation. Empirical evaluation on three language pairs (En-Fr, En-De, and En-Es) shows that the proposed approach significantly outperforms the baseline approach and the improvement is more significant when more relevant sentence pairs were retrieved.
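
The filtering step in the retrieval stage can be illustrated with a small sketch. The abstract only says the score is "based on edit distance"; the normalization used here (one minus the token-level edit distance divided by the longer sentence length), the threshold of 0.5, and the names `edit_distance`, `fuzzy_match`, and `filter_pairs` are assumptions for illustration, not details confirmed by the source.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_match(query, candidate):
    """Assumed score in [0, 1]; 1.0 means identical token sequences."""
    q, c = query.split(), candidate.split()
    longest = max(len(q), len(c))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(q, c) / longest


def filter_pairs(source, retrieved, threshold=0.5):
    """Keep retrieved (source, target) pairs scoring at least `threshold`,
    returned as (score, source, target) tuples, best match first."""
    scored = [(fuzzy_match(source, src), src, tgt) for src, tgt in retrieved]
    return sorted((s for s in scored if s[0] >= threshold), reverse=True)


if __name__ == "__main__":
    pairs = [("the dog sat", "le chien s'est assis"),
             ("completely different words here", "des mots différents")]
    print(filter_pairs("the cat sat", pairs))
```

Scoring at the token level rather than the character level keeps near-duplicate sentences with one swapped word close to the query, which is the kind of pair a translation memory lookup is meant to surface.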

IS Journal 2014 Journal Article

An Energy-Efficient and Swarm Intelligence-Based Routing Protocol for Next-Generation Sensor Networks

  • Yong Wang
  • Changle Li
  • Yulong Duan
  • Jin Yang
  • Xiang Cheng

After providing a brief overview of routing protocols for next-generation sensor networks (NGSNs), the authors propose Bee-Sensor-C, an energy-efficient, swarm intelligence-based, and scalable multipath routing protocol that integrates dynamic clustering, multipath routing, and bee-inspired routing to meet the performance requirements of NGSNs. A performance evaluation is also provided.

ICRA Conference 2011 Conference Paper

Hybrid map-based navigation for intelligent wheelchair

  • Yong Wang
  • Weidong Chen

A navigation system based on a hybrid map for an intelligent wheelchair is presented. The system consists of hybrid map building, localization, path planning and trajectory following. The hybrid map includes a series of small probabilistic grid maps (PGM) and a global topological map (GTM). They are built simultaneously and easily using the human-guided method. On the hybrid map, the localization and real-time path planning algorithms are then realized effectively. Experimental and application results show that the human-guided method integrates the computer's modeling ability with the human's sensory perception of the environment. The hybrid map makes it easy to solve the loop-closure and doorway problems, which enhances robustness against sensor uncertainty. It can also improve efficiency in large-scale SLAM. Furthermore, we conducted activities of daily living (ADL) testing at an elderly home and demonstrated the wheelchair system at Shanghai Expo 2010 by offering trial rides to visitors.

IJCAI Conference 2011 Conference Paper

Local and Structural Consistency for Multi-Manifold Clustering

  • Yong Wang
  • Yuan Jiang
  • Yi Wu
  • Zhi-Hua Zhou

Data sets containing multi-manifold structures are ubiquitous in real-world tasks, and effective grouping of such data is an important yet challenging problem. Though there have been many studies on this problem, it is not clear how to design principled methods for the grouping of multiple hybrid manifolds. In this paper, we show that spectral methods are potentially helpful for hybrid-manifold clustering when the neighborhood graph is constructed to connect the neighboring samples from the same manifold. However, traditional algorithms which identify neighbors according to Euclidean distance will easily connect samples belonging to different manifolds. To handle this drawback, we propose a new criterion, i.e., the local and structural consistency criterion, which considers the neighboring information as well as the structural information implied by the samples. Based on this criterion, we develop a simple yet effective algorithm, named Local and Structural Consistency (LSC), for clustering with multiple hybrid manifolds. Experiments show that LSC achieves promising performance.

AAAI Conference 2011 Conference Paper

Localized K-Flats

  • Yong Wang
  • Yuan Jiang
  • Yi Wu
  • Zhi-Hua Zhou

K-flats is a model-based linear manifold clustering algorithm which has been successfully applied in many real-world scenarios. Though some previous works have shown that K-flats doesn’t always provide good performance, little effort has been devoted to analyzing its inherent deficiency. In this paper, we address this challenge by showing that the deteriorated performance of K-flats can be attributed to the usual reconstruction error measure and the infinitely extending representations of linear models. Then we propose the Localized K-flats algorithm (LKF), which introduces localized representations of linear models and a new distortion measure, to remove confusion among different clusters. Experiments on both synthetic and real-world data sets demonstrate the efficiency of the proposed algorithm. Moreover, preliminary experiments show that LKF has the potential to group manifolds with nonlinear structure.