Arrow Research search

Author name cluster

Fei Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

108 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

APT: Affine Prototype-Timestamp for Time Series Forecasting Under Distribution Shift

  • Yujie Li
  • Zezhi Shao
  • Chengqing Yu
  • Yisong Fu
  • Tao Sun
  • Yongjun Xu
  • Fei Wang

Time series forecasting under distribution shift remains challenging, as existing deep learning models often rely on local statistical normalization (e.g., mean and variance) that fails to capture global distribution shift. Methods like RevIN and its variants attempt to decouple distribution and pattern but still struggle with missing values, noisy observations, and invalid channel-wise affine transformation. To address these limitations, we propose Affine Prototype-Timestamp (APT), a lightweight and flexible plug-in module that injects global distribution features into the normalization–forecasting pipeline. By leveraging timestamp-conditioned prototype learning, APT dynamically generates affine parameters that modulate both input and output series, enabling the backbone to learn from self-supervised, distribution-aware clustered instances. APT is compatible with arbitrary forecasting backbones and normalization strategies while introducing minimal computational overhead. Extensive experiments across six benchmark datasets and multiple backbone-normalization combinations demonstrate that APT significantly improves forecasting performance under distribution shift.

JBHI Journal 2026 Journal Article

Bridging Stepwise Lab-Informed Pretraining and Knowledge-Guided Learning for Diagnostic Reasoning

  • Pengfei Hu
  • Chang Lu
  • Fei Wang
  • Yue Ning

Despite the growing use of Electronic Health Records (EHR) for AI-assisted diagnosis prediction, most data-driven models struggle to incorporate clinically meaningful medical knowledge. They often rely on limited ontologies, lacking structured reasoning capabilities and comprehensive coverage. This raises an important research question: Will medical knowledge improve predictive models to support stepwise clinical reasoning as performed by human doctors? To address this problem, we propose DuaLK, a dual-expertise framework that combines two complementary sources of information. For external knowledge, we construct a Diagnosis Knowledge Graph (KG) that encodes both hierarchical and semantic relations enriched by large language models (LLM). To align with patient data, we further introduce a lab-informed proxy task that guides the model to follow a clinically consistent, stepwise reasoning process based on lab test signals. Experimental results on two public EHR datasets demonstrate that DuaLK consistently outperforms existing baselines across four clinical prediction tasks. These findings highlight the potential of combining structured medical knowledge with individual-level clinical signals to achieve more accurate and interpretable diagnostic predictions. The source code is publicly available on https://github.com/humphreyhuu/DuaLK.

JBHI Journal 2026 Journal Article

Efficient Sleep Staging With Bayesian Uncertainty-Guided Active Learning

  • Tianyou Yu
  • Rui Huang
  • Fei Wang
  • Jun Zhang
  • Wei Wu
  • Zhuliang Yu
  • Yuanqing Li
  • Jun Xiao

Automated sleep staging is essential for large-scale and home-based sleep monitoring; however, in routine clinical practice, sleep annotation remains largely dependent on experienced experts performing time-consuming and labor-intensive manual scoring. Existing automatic systems often struggle to adapt reliably to new subjects, limiting their clinical adoption and reinforcing the reliance on expert review. This creates a strong demand for adaptive and efficient sleep staging systems that can substantially reduce annotation workload while preserving expert-level accuracy. We propose BayesSleepNet, a novel framework that integrates Bayesian uncertainty quantification with active learning for adaptive sleep staging. BayesSleepNet employs principled Bayesian modeling by placing distributions over network weights and performing Monte Carlo sampling at inference, enabling explicit quantification of model (epistemic) uncertainty. These uncertainty estimates drive a two-stage sample selection strategy that first fine-tunes the model using representative epochs and subsequently prioritizes persistently uncertain samples for expert review. Across four public sleep datasets, BayesSleepNet consistently improves performance (by 7.60% in accuracy, 8.27% in macro-F1, and 0.104 in Cohen's $\kappa$) while requiring manual annotation of only 20% of data from new subjects. Despite its adaptive learning capability, BayesSleepNet remains computationally lightweight, using substantially fewer parameters than representative high-capacity state-of-the-art models. These results demonstrate the clinical promise of uncertainty-aware active learning as a practical and cost-efficient paradigm for semi-automated sleep staging. Code is available at https://github.com/yuty2009/bayesugal.
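
As a rough illustration of how Monte Carlo samples can drive sample selection, the sketch below scores each sleep epoch by the entropy of its averaged class probabilities and picks the highest-scoring epochs for expert review. Predictive entropy is a commonly used stand-in chosen here for simplicity; the abstract does not state which uncertainty score BayesSleepNet uses, and the function names, the `budget` parameter, and the toy data are all hypothetical.

```python
import math


def predictive_entropy(mc_probs):
    """Entropy of the class distribution averaged over Monte Carlo passes.

    mc_probs: list of probability vectors, one per stochastic forward pass.
    """
    n_samples = len(mc_probs)
    n_classes = len(mc_probs[0])
    mean = [sum(p[c] for p in mc_probs) / n_samples for c in range(n_classes)]
    return -sum(q * math.log(q) for q in mean if q > 0.0)


def select_for_review(epoch_mc_probs, budget):
    """Return the indices of the `budget` epochs with the highest entropy."""
    scores = [predictive_entropy(p) for p in epoch_mc_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Hypothetical data: 3 epochs, 2 Monte Carlo passes each, 2 sleep stages.
epochs = [
    [[0.9, 0.1], [0.95, 0.05]],  # confident prediction
    [[0.5, 0.5], [0.4, 0.6]],    # uncertain prediction
    [[0.8, 0.2], [0.7, 0.3]],    # in between
]
picked = select_for_review(epochs, budget=1)
```

With this toy input the second epoch is the one flagged for review. Note that predictive entropy mixes model and data uncertainty, so it is only an approximation of the epistemic uncertainty the abstract refers to.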

JBHI Journal 2025 Journal Article

An Efficient Transfer Learning With Prompt Learning for Brain Disorders Diagnosis

  • Liuzeng Zhang
  • Lanting Li
  • Peng Cao
  • Jinzhu Yang
  • Osmar R. Zaiane
  • Fei Wang

The limited availability of training data significantly restricts the performance of deep supervised models for brain disease diagnosis. It is crucial to develop a learning framework through cross-disease transfer learning that can extract more information from the limited data. To address this challenge, we concentrate on prompt learning and endeavor to extend its application to the brain networks. Specifically, we propose a novel prompt learning framework called BPformer, which integrates knowledge transferred across diseases via specific prompts while keeping the original architecture of BPformer unchanged. The specific prompts incorporate 1) a mask prompt to determine whether the edges are noisy or discriminating, 2) disorder prompts for modeling consistent and disorder-specific knowledge, and 3) adaptive instance-level prompts to account for inter-individual variations. We evaluate BPformer on the private center Nanjing Medical University dataset, the public Autism Brain Imaging Data Exchange dataset, and the public Alzheimer's Disease Neuroimaging Initiative dataset. We demonstrate the effectiveness of the proposed model across various disease classification tasks, including major depressive disorder, bipolar disorder, Alzheimer's disease, and autism spectrum disorder diagnoses. In addition, the proposed method enables disease interpretability and subtype analysis, empowering physicians to provide patients with more accurate and fine-grained treatment plans.

IJCAI Conference 2025 Conference Paper

Conditional Causal Representation Learning for Heterogeneous Single-cell RNA Data Integration and Prediction

  • Jiayi Dong
  • Jiahao Li
  • Fei Wang

Single-cell sequencing technology provides deep insights into gene activity at the individual cell level, facilitating the study of gene regulatory mechanisms. However, observed gene expression is often influenced by confounding factors such as batch effects, perturbations, and spatial position, which obscure the true gene regulatory network that governs the cell’s intrinsic state. To address these challenges, we propose scConCRL, a novel conditionally causal representation learning framework designed to extract the true gene regulatory relationships independent of confounding information. By considering both fine-grained molecular gene variables and coarse-grained latent domain variables, scConCRL not only uncovers the intrinsic biological signals but also models the complex relationships between these variables. This dual function enables the separation of genuine cellular states from domain information, providing valuable insights for downstream analyses and biological discovery. We demonstrate the effectiveness of our model on multi-domain datasets from different platforms and perturbation conditions, showing its ability to accurately disentangle confounding influences and discover novel gene relationships. Extensive comparisons across various scenarios illustrate the superior performance of scConCRL in several tasks compared to existing methods.

JBHI Journal 2025 Journal Article

Deep Self-Reinforced Multi-View Subspace Clustering for Cancer Subtyping

  • Cheng Liu
  • Baoyuan Zheng
  • Jiaojiao Wang
  • Xibiao Wang
  • Hang Gao
  • Fei Wang
  • Wenjun Shen
  • Si Wu

Identifying cancer subtypes is crucial for understanding disease progression and guiding precision medicine. With advances in high-throughput experimental technologies, the integration of multiple types of omics data for cancer subtype identification has become increasingly feasible. However, despite the promising performance of existing integrative cancer subtyping methods, efficiently integrating and clustering multi-omics datasets remains challenging due to the high levels of noise inherent in omics data, which impede the accurate characterization of relationships among samples. To address these challenges, we propose a novel deep multi-view subspace clustering model that incorporates a self-reinforced learning strategy. This strategy iteratively improves the quality of self-representation, which is critical for accurately capturing sample relationships and enabling effective clustering. Specifically, during model training, the proposed method learns a highly reliable self-representation through a good-neighbor learning mechanism, allowing it to model more accurate and robust inter-sample relationships. Building upon this reliable self-representation, we further develop a learnable view-graph fusion framework that integrates complementary information across multiple omics views to derive a consensus representation for clustering, thereby guiding the overall learning process. In addition, we introduce a local graph-guided learning mechanism based on an initial graph constructed from the raw data. This mechanism serves as an effective regularization strategy to prevent the model from converging to suboptimal solutions, thereby enhancing stability and robustness during training. Extensive experimental results demonstrate that the proposed method consistently outperforms several state-of-the-art approaches, validating its effectiveness and robustness for cancer subtype identification.

NeurIPS Conference 2025 Conference Paper

Democratizing Clinical Risk Prediction with Cross-Cohort Cross-Modal Knowledge Transfer

  • Qiannan Zhang
  • Manqi Zhou
  • Zilong Bai
  • Chang Su
  • Fei Wang

Clinical risk prediction plays a crucial role in early disease detection and personalized intervention. While recent models increasingly incorporate multimodal data, their development typically assumes access to large-scale, multimodal datasets and substantial computational resources. In practice, however, most clinical sites operate under resource constraints, with access limited to EHR data alone and insufficient capacity to train complicated models. This gap highlights the urgent need to democratize clinical risk prediction by enabling effective deployment in data- and resource-limited local clinical settings. In this work, we propose a cross-cohort cross-modal knowledge transfer framework that leverages the multimodal model trained on a nationwide cohort and adapts it to local cohorts with only EHR data. We focus on EHR and genetic data as representative multimodal inputs and address two key challenges. First, to mitigate the influence of noisy or less informative biological signals, we propose a novel mixture-of-aggregations design to enhance the modeling of informative and relevant genetic features. Second, to support rapid model adaptation in low-resource sites, we develop a lightweight graph-guided fine-tuning method that adapts pretrained phenotypical EHR representations to target cohorts using limited patient data. Extensive experiments on real-world clinical data validate the effectiveness of our proposed model.

AAAI Conference 2025 Conference Paper

Editing Memories Through Few Targeted Neurons

  • Wei Zhou
  • Wei Wei
  • Guibang Cao
  • Fei Wang

Model editing is a novel research topic in large language models (LLMs), aimed at efficiently handling various knowledge editing tasks. Since irrelevant knowledge is difficult to measure, existing editing methods often lack explicit ways to preserve it, especially editing methods based on the fine-tuning paradigm. They generally control the locality performance of model editing by constraining the range of changes in model parameters. However, their performance improvements are not always ideal, and may even lead to a decrease in editing reliability. In this paper, we explore effective editing locality control methods based on the relationship between the stored knowledge and the strongly associated model components. Building on the discovery of "knowledge neurons" and sufficient experimental results, we further explore the potential characteristics linking knowledge and model components, and confirm that (1) only 1% of neurons contribute significantly to the storage of specific knowledge, and (2) these targeted neurons often overlap heavily for knowledge with similar relational descriptions, which means that knowledge with similar relationships may be severely affected when these targeted neurons are modified. Based on these findings, we propose Targeted Neurons Fine-tuning with Data Augmentation (TNF-DA), which performs data augmentation based on the relational representation of edited knowledge to improve editing locality. By freezing most of the model parameters and only fine-tuning the highly contributing neurons corresponding to the edited knowledge, we obtain desirable results in terms of generalization and specificity compared with previous fine-tuning-based methods. Extensive experiments have demonstrated the superior editing performance achieved by our proposed method.

TMLR Journal 2025 Journal Article

FraGNNet: A Deep Probabilistic Model for Tandem Mass Spectrum Prediction

  • Adamo Young
  • Fei Wang
  • David Wishart
  • Bo Wang
  • Russell Greiner
  • Hannes Rost

Compound identification from tandem mass spectrometry (MS/MS) data is a critical step in the analysis of complex mixtures. Typical solutions for the MS/MS spectrum to compound (MS2C) problem involve comparing the unknown spectrum against a library of known spectrum-molecule pairs, an approach that is limited by incomplete library coverage. Compound to MS/MS spectrum (C2MS) models can improve retrieval rates by augmenting real libraries with predicted MS/MS spectra. Unfortunately, many existing C2MS models suffer from problems with mass accuracy, generalization, or interpretability. We develop a new probabilistic method for C2MS prediction, FraGNNet, that can efficiently and accurately simulate MS/MS spectra with high mass accuracy. Our approach formulates the C2MS problem as learning a distribution over molecule fragments. FraGNNet achieves state-of-the-art performance in terms of prediction error and surpasses existing C2MS models as a tool for retrieval-based MS2C.

IROS Conference 2025 Conference Paper

Human-guided robotic-assistance handheld continuum medical robot system

  • Fei Wang
  • Changhao Luo
  • Zexi Zhao
  • Pingyu Xiang
  • Ke Qiu
  • Yufei Wei
  • Yue Wang 0020
  • Rong Xiong

Laparoscopic surgery procedures currently face a trade-off between expensive, complex robotic systems and manual instruments with limited functionality. Fully robotic solutions offer precision but lack portability and intuitive control, while manual tools rely solely on the surgeon’s dexterity, limiting maneuverability and depth perception in confined spaces. To bridge this gap, we propose a Human-Guided Robotic-Assistance Handheld Continuum Medical Robot System (HRHC). This system simulates intuitive manual operation with robotic precision, extending the surgeon’s capabilities while maintaining portability. Additionally, a stereo vision system enhances real-time depth perception, improving spatial awareness in minimally invasive procedures.

JBHI Journal 2025 Journal Article

Incremental Classification for High-Dimensional EEG Manifold Representation Using Bidirectional Dimensionality Reduction and Prototype Learning

  • Dongxu Liu
  • Qichuan Ding
  • Chenyu Tong
  • Jinshuo Ai
  • Fei Wang

In brain-computer interface (BCI) systems, the symmetric positive definite (SPD) manifold within Riemannian space has been frequently utilized to extract spatial features from electroencephalogram (EEG) signals. However, the intrinsic high dimensionality of SPD matrices introduces a computational burden heavy enough to hinder real-time applications of such BCIs, especially in dynamic tasks such as incremental learning. Directly reducing the dimensionality of SPD matrices with conventional dimensionality reduction (DR) methods will alter the fundamental properties of SPD matrices. Moreover, current DR methods for incremental learning always necessitate retaining old data to update their representations under the new mapping. To this end, a bidirectional two-dimensional principal component analysis for SPD manifold (B2DPCA-SPD) is proposed to reduce the dimensionality of SPD matrices in such a way that the reduced matrices remain on the SPD manifold. Afterwards, B2DPCA-SPD is extended to adapt to incremental learning tasks without saving old data. The incremental B2DPCA-SPD can be seamlessly integrated with the matrix-formed growing neural gas network (MF-GNG) to achieve incremental EEG classification, where the new low-dimensional representations of the prototypes in old classifiers can be easily recalculated with the updated projection matrix. Extensive experiments are conducted on two public datasets to perform the EEG classification. The results demonstrate that our method significantly reduces computation time by 38.53% and 35.96%, and outperforms conventional methods in classification accuracy by 4.21% to 19.59%.

NeurIPS Conference 2025 Conference Paper

Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation

  • Fei Wang
  • Li Shen
  • Liang Ding
  • Chao Xue
  • Ye Liu
  • Changxing Ding

Large Language Models (LLMs) excel at natural language processing tasks, but their massive size leads to high computational and storage demands. Recent works have sought to reduce their model size through layer-wise structured pruning. However, they tend to ignore retaining the capabilities in the pruned part. In this work, we re-examine structured pruning paradigms and uncover several key limitations: 1) notable performance degradation due to direct layer removal, 2) incompetent linear weighted layer aggregation, and 3) the lack of effective post-training recovery mechanisms. To address these limitations, we propose CoMe, including a progressive layer pruning framework with a Concatenation-based Merging technology and a hierarchical distillation post-training process. Specifically, we introduce a channel sensitivity metric that utilizes activation intensity and weight norms for fine-grained channel selection. Subsequently, we employ a concatenation-based layer merging method to fuse the most critical channels in the adjacent layers, enabling a progressive model size reduction. Finally, we propose a hierarchical distillation protocol, which leverages the correspondences between the original and pruned model layers established during pruning, enabling efficient knowledge transfer. Experiments on seven benchmarks show that CoMe achieves state-of-the-art performance; when pruning 30% of LLaMA-2-7b's parameters, the pruned model retains 83% of its original average accuracy.

NeurIPS Conference 2025 Conference Paper

LayerIF: Estimating Layer Quality for Large Language Models using Influence Functions

  • Hadi Askari
  • Shivanshu Gupta
  • Fei Wang
  • Anshuman Chhabra
  • Muhao Chen

Pretrained Large Language Models (LLMs) achieve strong performance across a wide range of tasks, yet exhibit substantial variability in the various layers' training quality with respect to specific downstream applications, limiting their downstream performance. It is therefore critical to estimate layer-wise training quality in a manner that accounts for both model architecture and training data. However, existing approaches predominantly rely on model-centric heuristics (such as spectral statistics, outlier detection, or uniform allocation) while overlooking the influence of data. To address these limitations, we propose LayerIF, a data-driven framework that leverages Influence Functions to quantify the training quality of individual layers in a principled and task-sensitive manner. By isolating each layer's gradients and measuring the sensitivity of the validation loss to training examples by computing layer-wise influences, we derive data-driven estimates of layer importance. Notably, our method produces task-specific layer importance estimates for the same LLM, revealing how layers specialize for different test-time evaluation tasks. We demonstrate the utility of our scores by leveraging them for two downstream applications: (a) expert allocation in LoRA-MoE architectures and (b) layer-wise sparsity distribution for LLM pruning. Experiments across multiple LLM architectures demonstrate that our model-agnostic, influence-guided allocation leads to consistent gains in task performance.

AAAI Conference 2025 Conference Paper

Local Causal Discovery for Structural Evidence of Direct Discrimination

  • Jacqueline Maasch
  • Kyra Gan
  • Violet Chen
  • Agni Orfanoudaki
  • Nil-Jana Akpinar
  • Fei Wang

Identifying the causal pathways of unfairness is a critical objective for improving policy design and algorithmic decision-making. Prior work in causal fairness analysis often requires knowledge of the causal graph, hindering practical applications in complex or low-knowledge domains. Moreover, global discovery methods that learn causal structure from data can display unstable performance on finite samples, preventing robust fairness conclusions. To mitigate these challenges, we introduce local discovery for direct discrimination (LD3): a method that uncovers structural evidence of direct unfairness by identifying the causal parents of an outcome variable. LD3 performs a linear number of conditional independence tests relative to variable set size, and allows for latent confounding under the sufficient condition that all parents of the outcome are observed. We show that LD3 returns a valid adjustment set (VAS) under a new graphical criterion for the weighted controlled direct effect, a qualitative indicator of direct discrimination. LD3 limits unnecessary adjustment, providing interpretable VAS for assessing unfairness. We use LD3 to analyze causal fairness in two complex decision systems: criminal recidivism prediction and liver transplant allocation. LD3 was more time-efficient and returned more plausible results on real-world data than baselines, which took 46× to 5870× longer to execute.

AAAI Conference 2025 Conference Paper

MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models

  • Weilun Feng
  • Haotong Qin
  • Chuanguang Yang
  • Zhulin An
  • Libo Huang
  • Boyu Diao
  • Fei Wang
  • Renshuai Tao

Diffusion models have received wide attention in generation tasks. However, the expensive computation cost prevents the application of diffusion models in resource-constrained scenarios. Quantization emerges as a practical solution that significantly saves storage and computation by reducing the bit-width of parameters. However, the existing quantization methods for diffusion models still cause severe degradation in performance, especially under extremely low bit-widths (2-4 bit). The primary decrease in performance comes from the significant discretization of activation values at low bit quantization. Too few activation candidates are unfriendly for outlier significant weight channel quantization, and the discretized features prevent stable learning over different time steps of the diffusion model. This paper presents MPQ-DM, a Mixed-Precision Quantization method for Diffusion Models. The proposed MPQ-DM mainly relies on two techniques: (1) To mitigate the quantization error caused by outlier severe weight channels, we propose an Outlier-Driven Mixed Quantization (OMQ) technique that uses Kurtosis to quantify outlier salient channels and apply optimized intra-layer mixed-precision bit-width allocation to recover accuracy performance within target efficiency. (2) To robustly learn representations crossing time steps, we construct a Time-Smoothed Relation Distillation (TRD) scheme between the quantized diffusion model and its full-precision counterpart, transferring discrete and continuous latent to a unified relation space to reduce the representation inconsistency. Comprehensive experiments demonstrate that MPQ-DM achieves significant accuracy gains under extremely low bit-widths compared with SOTA quantization methods. MPQ-DM achieves a 58% FID decrease under W2A4 setting compared with baseline, while all other methods even collapse.
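
To make the idea of kurtosis-guided bit allocation concrete, here is a minimal sketch that scores each weight channel by its kurtosis and gives the heaviest-tailed channels a higher bit-width. The top-`n_high` rule, the function names, and the toy channels are assumptions for illustration only; the abstract describes an optimized intra-layer allocation whose details are not given here.

```python
def kurtosis(values):
    """Population kurtosis (fourth standardized moment); about 3 for a Gaussian."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    if var == 0.0:
        return 0.0  # constant channel: treat as having no outliers
    fourth = sum((v - mean) ** 4 for v in values) / n
    return fourth / var ** 2


def allocate_bits(channels, low_bits, high_bits, n_high):
    """Give `high_bits` to the `n_high` channels with the largest kurtosis."""
    scores = [kurtosis(c) for c in channels]
    ranked = sorted(range(len(channels)), key=lambda i: scores[i], reverse=True)
    high = set(ranked[:n_high])
    return [high_bits if i in high else low_bits for i in range(len(channels))]


# Hypothetical weight channels: the second one contains a single large outlier.
channels = [
    [-1.0, 1.0, -1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 8.0],
]
bits = allocate_bits(channels, low_bits=2, high_bits=3, n_high=1)
```

In this toy case the outlier-heavy channel receives the higher bit-width, while the evenly spread channel stays at the low setting.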

ICRA Conference 2025 Conference Paper

Neural Ranging Inertial Odometry

  • Si Wang
  • Bingqi Shen
  • Fei Wang
  • Yanjun Cao
  • Rong Xiong
  • Yue Wang 0020

Ultra-wideband (UWB) has shown promising potential in GPS-denied localization thanks to its lightweight and drift-free characteristics, while the accuracy is limited in real scenarios due to its sensitivity to sensor arrangement and non-Gaussian pattern induced by multi-path or multi-signal interference, which commonly occurs in many typical applications like long tunnels. We introduce a novel neural fusion framework for ranging inertial odometry which involves a graph attention UWB network and a recurrent neural inertial network. Our graph net learns scene-relevant ranging patterns and adapts to any number of anchors or tags, realizing accurate positioning without calibration. Additionally, the integration of least squares and the incorporation of nominal frame enhance overall performance and scalability. The effectiveness and robustness of our methods are validated through extensive experiments on both public and self-collected datasets, spanning indoor, outdoor, and tunnel environments. The results demonstrate the superiority of our proposed IR-ULSG in handling challenging conditions, including scenarios outside the convex envelope and cases where only a single anchor is available.

TMLR Journal 2025 Journal Article

Offset Unlearning for Large Language Models

  • James Y. Huang
  • Wenxuan Zhou
  • Fei Wang
  • Fred Morstatter
  • Sheng Zhang
  • Hoifung Poon
  • Muhao Chen

Despite the strong capabilities of Large Language Models (LLMs) to acquire knowledge from their training corpora, the memorization of sensitive information in the corpora such as copyrighted, biased, and private content has led to ethical and legal concerns. In response to these challenges, unlearning has emerged as a potential remedy for LLMs affected by problematic training data. However, previous unlearning techniques are either not applicable to black-box LLMs due to required access to model internal weights, or violate data protection principles by retaining sensitive data for inference-time correction. We propose $\delta$-unlearning, an offset unlearning framework for black-box LLMs. Instead of tuning the black-box LLM itself, $\delta$-unlearning learns the logit offset needed for unlearning by contrasting the logits from a pair of smaller models. Experiments demonstrate that $\delta$-unlearning can effectively unlearn target data while maintaining similar or even stronger performance on general out-of-forget-scope tasks. $\delta$-unlearning also effectively incorporates different unlearning algorithms, making our approach a versatile solution to adapting various existing unlearning algorithms to black-box LLMs.
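
The logit-offset idea can be sketched in a few lines. The example assumes the offset is the difference between the logits of an unlearned small model and its untouched base copy, and that it is added to the black-box model's logits before the softmax; the abstract only says the offset is learned by contrasting a pair of smaller models, so this combination rule, the function names, and the toy numbers are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def offset_logits(large_logits, small_tuned_logits, small_base_logits):
    """Add the small-model logit difference to the black-box logits."""
    return [
        large + (tuned - base)
        for large, tuned, base in zip(
            large_logits, small_tuned_logits, small_base_logits
        )
    ]


# Hypothetical 3-token vocabulary; token 0 stands for content to forget.
large = [4.0, 1.0, 0.5]            # black-box model logits (read-only)
small_base = [3.0, 1.0, 1.0]       # small model before unlearning
small_unlearned = [0.0, 1.0, 1.0]  # small model after unlearning

before = softmax(large)
after = softmax(offset_logits(large, small_unlearned, small_base))
```

Only the small model's weights would be tuned in such a setup; the large model is queried for logits and never modified, which is what makes the approach applicable to black-box LLMs.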

NeurIPS Conference 2025 Conference Paper

On the Integration of Spatial-Temporal Knowledge: A Lightweight Approach to Atmospheric Time Series Forecasting

  • Yisong Fu
  • Fei Wang
  • Zezhi Shao
  • Boyu Diao
  • Lin Wu
  • Zhulin An
  • Chengqing Yu
  • Yujie Li

Transformers have gained attention in atmospheric time series forecasting (ATSF) for their ability to capture global spatial-temporal correlations. However, their complex architectures lead to excessive parameter counts and extended training times, limiting their scalability to large-scale forecasting. In this paper, we revisit ATSF from a theoretical perspective of atmospheric dynamics and uncover a key insight: spatial-temporal position embedding (STPE) can inherently model spatial-temporal correlations even without attention mechanisms. Its effectiveness arises from integrating geographical coordinates and temporal features, which are intrinsically linked to atmospheric dynamics. Based on this, we propose STELLA, a Spatial-Temporal knowledge Embedded Lightweight modeL for ATSF, utilizing only STPE and an MLP architecture in place of Transformer layers. With 10k parameters and one hour of training, STELLA achieves superior performance on five datasets compared to other advanced methods. The paper emphasizes the effectiveness of spatial-temporal knowledge integration over complex architectures, providing novel insights for ATSF.

JBHI Journal 2025 Journal Article

Paradigm-Shifting Attention-Based Hybrid View Learning for Enhanced Mammography Breast Cancer Classification With Multi-Scale and Multi-View Fusion

  • Haoran Zhao
  • Chengwei Zhang
  • Jiong Chen
  • Zhaotong Li
  • Fei Wang
  • Song Gao

Breast cancer poses a serious threat to women’s health, and its early detection is crucial for enhancing patient survival rates. While deep learning has significantly advanced mammographic image analysis, existing methods struggle to balance view consistency with input adaptability. Furthermore, current models face challenges in accurately capturing multi-scale features, especially when subtle lesion variations across different scales are involved. To address these challenges, this paper proposes a Hybrid View Learning (HVL) paradigm that unifies traditional Single-View and Multi-View Learning approaches. The core component of this paradigm, our Attention-based Hybrid View Learning (AHVL) framework, incorporates two essential attention mechanisms: Contrastive Switch Attention (CSA) and Selective Pooling Attention (SPA). The CSA mechanism flexibly alternates between self-attention and cross-attention based on data integrity, integrating a pre-trained language model for contrastive learning to enhance model stability. Meanwhile, the SPA module employs multi-scale feature pooling and selection to capture critical features from mammographic images, overcoming the limitations of traditional models that struggle with fine-grained lesion detection. Experimental validation on the INbreast and CBIS-DDSM datasets shows that the AHVL framework outperforms both single-view and multi-view methods, especially under extreme view missing conditions. Even with an 80% missing rate on both datasets, AHVL maintains the highest accuracy and experiences the smallest performance decline in metrics like F1 score and AUC-PR, demonstrating its robustness and stability. This study redefines mammographic image analysis by leveraging attention-based hybrid view processing, setting a new standard for precise and efficient breast cancer diagnosis.

NeurIPS Conference 2025 Conference Paper

Selective Learning for Deep Time Series Forecasting

  • Yisong Fu
  • Zezhi Shao
  • Chengqing Yu
  • Yujie Li
  • Zhulin An
  • Qi Wang
  • Yongjun Xu
  • Fei Wang

Benefiting from high capacity for capturing complex temporal patterns, deep learning (DL) has significantly advanced time series forecasting (TSF). However, deep models tend to suffer from severe overfitting due to the inherent vulnerability of time series to noise and anomalies. The prevailing DL paradigm uniformly optimizes all timesteps through the MSE loss and fits uncertain and anomalous timesteps indiscriminately, ultimately resulting in overfitting. To address this, we propose a novel selective learning strategy for deep TSF. Specifically, selective learning screens a subset of all timesteps to calculate the MSE loss in optimization, guiding the model to focus on generalizable timesteps while disregarding non-generalizable ones. Our framework introduces a dual-mask mechanism to target timesteps: (1) an uncertainty mask leveraging residual entropy to filter uncertain timesteps, and (2) an anomaly mask employing residual lower bound estimation to exclude anomalous timesteps. Extensive experiments across eight real-world datasets demonstrate that selective learning can significantly improve the predictive performance for typical state-of-the-art deep models, including 37.4% MSE reduction for Informer, 8.4% for TimesNet, and 6.5% for iTransformer.
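
The core idea of computing the MSE loss over only a screened subset of timesteps can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract describes masks built from residual entropy and residual lower bound estimation, whereas the `residual_threshold_mask` helper below is a hypothetical stand-in that simply drops timesteps with large residuals, and all values are made up.

```python
def masked_mse(pred, target, mask):
    """Mean squared error over the timesteps whose mask entry is 1."""
    kept = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    if not kept:
        return 0.0  # nothing selected, so nothing to penalize
    return sum(kept) / len(kept)


def residual_threshold_mask(pred, target, max_abs_residual):
    """Hypothetical stand-in mask: keep timesteps with small residuals.

    The paper's uncertainty and anomaly masks are more involved; this
    helper only illustrates where a mask plugs into the loss.
    """
    return [1 if abs(p - t) <= max_abs_residual else 0
            for p, t in zip(pred, target)]


# Illustrative values: the last target looks like an anomaly.
pred = [1.0, 2.0, 3.0, 4.0]
target = [1.5, 2.0, 3.0, 14.0]

mask = residual_threshold_mask(pred, target, max_abs_residual=1.0)
selective_loss = masked_mse(pred, target, mask)
plain_loss = masked_mse(pred, target, [1] * len(pred))
```

With these toy values the anomalous last timestep dominates the plain MSE, while the masked loss reflects only the remaining three timesteps.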

NeurIPS Conference 2025 Conference Paper

SMARTraj$^2$: A Stable Multi-City Adaptive Method for Multi-View Spatio-Temporal Trajectory Representation Learning

  • Tangwen Qian
  • Junhe Li
  • Yile Chen
  • Gao Cong
  • Zezhi Shao
  • Jun Zhang
  • Tao Sun
  • Fei Wang

Spatio-temporal trajectory representation learning plays a crucial role in various urban applications such as transportation systems, urban planning, and environmental monitoring. Existing methods can be divided into single-view and multi-view approaches, with the latter offering richer representations by integrating multiple sources of spatio-temporal data. However, these methods often struggle to generalize across diverse urban scenes due to multi-city structural heterogeneity, which arises from the disparities in road networks, grid layouts, and traffic regulations across cities, and the amplified seesaw phenomenon, where optimizing for one city, view, or task can degrade performance in others. These challenges hinder the deployment of trajectory learning models across multiple cities, limiting their real-world applicability. In this work, we propose SMARTraj$^2$, a novel stable multi-city adaptive method for multi-view spatio-temporal trajectory representation learning. Specifically, we introduce a feature disentanglement module to separate domain-invariant and domain-specific features, and a personalized gating mechanism to dynamically stabilize the contributions of different views and tasks. Our approach achieves superior generalization across heterogeneous urban scenes while maintaining robust performance across multiple downstream tasks. Extensive experiments on benchmark datasets demonstrate the effectiveness of SMARTraj$^2$ in enhancing cross-city generalization and outperforming state-of-the-art methods. See our project website at https://github.com/GestaltCogTeam/SMARTraj.

JBHI Journal 2024 Journal Article

A Coarse-Fine Collaborative Learning Model for Three Vessel Segmentation in Fetal Cardiac Ultrasound Images

  • Shan Ling
  • Laifa Yan
  • Rongsong Mao
  • Jizhou Li
  • Haoran Xi
  • Fei Wang
  • Xiaolin Li
  • Min He

Congenital heart disease (CHD) is the most frequent birth defect and a leading cause of infant mortality, emphasizing the crucial need for its early diagnosis. Ultrasound is the primary imaging modality for prenatal CHD screening. As a complement to the four-chamber view, the three-vessel view (3VV) plays a vital role in detecting anomalies in the great vessels. However, the interpretation of fetal cardiac ultrasound images is subjective and relies heavily on operator experience, leading to variability in CHD detection rates, particularly in resource-constrained regions. In this study, we propose an automated method for segmenting the pulmonary artery, ascending aorta, and superior vena cava in the 3VV using a novel deep learning network named CoFi-Net. Our network incorporates a coarse-fine collaborative strategy with two parallel branches dedicated to simultaneous global localization and fine segmentation of the vessels. The coarse branch employs a partial decoder to leverage high-level semantic features, enabling global localization of objects and suppression of irrelevant structures. The fine branch utilizes attention-parameterized skip connections to enhance feature representations and improve boundary information. The outputs of the two branches are fused to generate accurate vessel segmentations. Extensive experiments conducted on a collected dataset demonstrate the superiority of CoFi-Net compared to state-of-the-art segmentation models for 3VV segmentation, indicating its great potential for enhancing CHD diagnostic efficiency in clinical practice. Furthermore, CoFi-Net outperforms other deep learning models in breast lesion segmentation on a public breast ultrasound dataset, despite not being specifically designed for this task, demonstrating its potential and robustness for various segmentation tasks.

IJCAI Conference 2024 Conference Paper

A Density-driven Iterative Prototype Optimization for Transductive Few-shot Learning

  • Jingcong Li
  • Chunjin Ye
  • Fei Wang
  • Jiahui Pan

Few-shot learning (FSL) poses a considerable challenge since it aims to improve the model generalization ability with limited labeled data. Previous works usually attempt to construct class-specific prototypes and then predict novel classes using these prototypes. However, the feature distribution represented by the limited labeled data is coarse-grained, leading to a large information gap between the labeled and unlabeled data as well as biases in the prototypes. In this paper, we investigate the correlation between sample quality and density, and propose a Density-driven Iterative Prototype Optimization to acquire high-quality prototypes and further improve few-shot learning performance. Specifically, the proposed method consists of two optimization strategies. The similarity-evaluating strategy captures the information gap between the labeled and unlabeled data by reshaping the feature manifold for the novel feature distribution. The density-driven strategy iteratively refines the prototypes in the direction of density growth. The proposed method could reach or even exceed the state-of-the-art performance on four benchmark datasets, including mini-ImageNet, tiered-ImageNet, CUB, and CIFAR-FS. The code will be available soon at https://github.com/tailofcat/DIPO.
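
The prototype-based baseline that the abstract refers to (build one prototype per class, then assign a query to the nearest prototype) can be sketched as below. This is only the generic baseline under simple assumptions (mean-of-support prototypes, Euclidean distance); it does not implement the density-driven refinement or the similarity-evaluating strategy, and the feature vectors and labels are invented for illustration.

```python
import math


def class_prototypes(support):
    """Map each label to the mean of its support feature vectors.

    `support` is a dict: label -> list of equal-length feature vectors.
    """
    prototypes = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        prototypes[label] = [
            sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)
        ]
    return prototypes


def predict(query, prototypes):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


# Illustrative 2-way, 2-shot support set with 2-D features.
support = {
    "cat": [[0.0, 0.0], [0.0, 2.0]],
    "dog": [[4.0, 4.0], [6.0, 4.0]],
}
prototypes = class_prototypes(support)
label_a = predict([1.0, 1.0], prototypes)
label_b = predict([4.0, 3.0], prototypes)
```

With so few support samples per class, each mean is a coarse estimate, which is the bias in the prototypes that the abstract sets out to reduce.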

ICRA Conference 2024 Conference Paper

Cross-Modal Registration Using Adaptive Modeling in Infrastructure-based Vehicle Localization

  • Fei Wang
  • Yuesheng He
  • Hanyang Zhuang
  • Chenxi Yang 0002
  • Ming Yang 0002

Infrastructure-based vehicle localization, in comparison to single-agent approaches, offers several advantages including reduced system cost, extended perception range, enhanced data fusion capabilities, and energy savings. Many conventional approaches impose limitations on the types of objects due to the need for specific object-end modifications, such as applying perceptual markers like color-labeled plates and reflective balls. LiDAR presents a solution in terms of object arbitrariness, as it addresses the challenges of feature-free object modeling and continuous registration. However, achieving complete environmental coverage with LiDAR remains prohibitively expensive, particularly in extensive areas. Hence, this study proposes a cross-modal localization approach using adaptive modeling, employing LiDAR for object modeling and cost-effective sensor cameras for object tracking through image-point-cloud registration. Accurate correspondence between the model and observation can be estimated in real-time. The experiments are conducted in a typical scenario that requires adaptive modeling: Autonomous Valet Parking (AVP). Results demonstrate that the proposed system achieves comparable performance with significantly reduced system costs, highlighting its potential for large-scale deployment.

AAAI Conference 2024 Conference Paper

EulerMormer: Robust Eulerian Motion Magnification via Dynamic Filtering within Transformer

  • Fei Wang
  • Dan Guo
  • Kun Li
  • Meng Wang

Video Motion Magnification (VMM) aims to break the resolution limit of human visual perception capability and reveal the imperceptible minor motion that contains valuable information in the macroscopic domain. However, challenges arise in this task due to photon noise inevitably introduced by photographic devices and spatial inconsistency in amplification, leading to flickering artifacts in static fields and motion blur and distortion in dynamic fields in the video. Existing methods focus on explicit motion modeling without emphasizing prioritized denoising during the motion magnification process. This paper proposes a novel dynamic filtering strategy to achieve static-dynamic field adaptive denoising. Specifically, based on Eulerian theory, we separate texture and shape to extract motion representation through inter-frame shape differences, expecting to leverage these subdivided features to solve this task finely. Then, we introduce a novel dynamic filter that eliminates noise cues and preserves critical features in the motion magnification and amplification generation phases. Overall, our unified framework, EulerMormer, is a pioneering effort to bring the Transformer into learning-based VMM. The core of the dynamic filter lies in a global dynamic sparse cross-covariance attention mechanism that explicitly removes noise while preserving vital information, coupled with a multi-scale dual-path gating mechanism that selectively regulates the dependence on different frequency features to reduce spatial attenuation and complement motion boundaries. Extensive experiments demonstrate that EulerMormer achieves more robust video motion magnification from the Eulerian perspective, significantly outperforming state-of-the-art methods. The source code is available at https://github.com/VUT-HFUT/EulerMormer.

NeurIPS Conference 2024 Conference Paper

MassSpecGym: A benchmark for the discovery and identification of molecules

  • Roman Bushuiev
  • Anton Bushuiev
  • Niek F. de Jonge
  • Adamo Young
  • Fleming Kretschmer
  • Raman Samusevich
  • Janne Heirman
  • Fei Wang

The discovery and identification of molecules in biological and environmental samples is crucial for advancing biomedical and chemical sciences. Tandem mass spectrometry (MS/MS) is the leading technique for high-throughput elucidation of molecular structures. However, decoding a molecular structure from its mass spectrum is exceptionally challenging, even when performed by human experts. As a result, the vast majority of acquired MS/MS spectra remain uninterpreted, thereby limiting our understanding of the underlying (bio)chemical processes. Despite decades of progress in machine learning applications for predicting molecular structures from MS/MS spectra, the development of new methods is severely hindered by the lack of standard datasets and evaluation protocols. To address this problem, we propose MassSpecGym, the first comprehensive benchmark for the discovery and identification of molecules from MS/MS data. Our benchmark comprises the largest publicly available collection of high-quality MS/MS spectra and defines three MS/MS annotation challenges: de novo molecular structure generation, molecule retrieval, and spectrum simulation. It includes new evaluation metrics and a generalization-demanding data split, therefore standardizing the MS/MS annotation tasks and rendering the problem accessible to the broad machine learning community. MassSpecGym is publicly available at https://github.com/pluskal-lab/MassSpecGym.

JBHI Journal 2024 Journal Article

Radial Undersampled MRI Reconstruction Using Deep Learning With Mutual Constraints Between Real and Imaginary Components of K-Space

  • Zhaotong Li
  • Sha Li
  • Zeru Zhang
  • Fei Wang
  • Fengliang Wu
  • Song Gao

The deep learning method is an efficient solution for improving the quality of undersampled magnetic resonance (MR) image reconstruction while reducing lengthy data acquisition. Most deep learning methods neglect the mutual constraints between the real and imaginary components of complex-valued k-space data. In this paper, a new complex-valued convolutional neural network, namely, Dense-U-Dense Net (DUD-Net), is proposed to interpolate the undersampled k-space data and reconstruct MR images. The proposed network comprises dense layers, U-Net, and other dense layers in sequence. The dense layers are used to simulate the mutual constraints between real and imaginary components, and U-Net performs feature sparsity and interpolation estimation for k-space data. Two MRI datasets were used to evaluate the proposed method: brain magnitude-only MR images and knee complex-valued k-space data. Several operations were conducted for data preprocessing. First, the complex-valued MR images were synthesized by phase modulation on magnitude-only images. Second, a radial trajectory based on the golden angle was used for k-space undersampling, whereby a reversible normalization method was proposed to balance the distribution of positive and negative values in k-space data. The optimal performance of DUD-Net was demonstrated based on a quantitative evaluation of inter-method and intra-method comparisons. Compared with other methods, significant improvements were achieved: PSNRs increased by 10.78 and 5.74 dB, whereas RMSEs decreased by 71.53% and 30.31% for magnitude and phase images, respectively. It is concluded that DUD-Net significantly improves the performance of MR image reconstruction.
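
A golden-angle radial trajectory of the kind mentioned above can be sketched as follows. The abstract does not state the angular increment, so this sketch assumes the commonly used convention of 180° divided by the golden ratio (about 111.246°); the function names, the normalized k-space extent `k_max`, and the sample counts are illustrative assumptions rather than details from the paper.

```python
import math

GOLDEN_RATIO = (1 + math.sqrt(5)) / 2
# Assumed convention: successive spokes are rotated by 180 / golden ratio degrees.
GOLDEN_ANGLE_DEG = 180.0 / GOLDEN_RATIO


def golden_angle_spokes(num_spokes):
    """Spoke angles in degrees, wrapped to [0, 180)."""
    return [(k * GOLDEN_ANGLE_DEG) % 180.0 for k in range(num_spokes)]


def spoke_samples(angle_deg, num_samples, k_max=0.5):
    """(kx, ky) sample points along one spoke through the k-space centre.

    Radii run uniformly from -k_max to +k_max; num_samples must be >= 2.
    """
    theta = math.radians(angle_deg)
    points = []
    for i in range(num_samples):
        radius = -k_max + 2.0 * k_max * i / (num_samples - 1)
        points.append((radius * math.cos(theta), radius * math.sin(theta)))
    return points


angles = golden_angle_spokes(8)
first_spoke = spoke_samples(angles[0], num_samples=3)
```

Because each new spoke falls into the largest remaining angular gap, any contiguous run of spokes covers k-space fairly evenly, which is why this ordering is popular for undersampling.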

NeurIPS Conference 2024 Conference Paper

Unified Insights: Harnessing Multi-modal Data for Phenotype Imputation via View Decoupling

  • Qiannan Zhang
  • Weishen Pan
  • Zilong Bai
  • Chang Su
  • Fei Wang

Phenotype imputation plays a crucial role in improving comprehensive and accurate medical evaluation, which in turn can optimize patient treatment and bolster the reliability of clinical research. Despite the adoption of various techniques, multi-modal biological data, which can provide crucial insights into a patient's overall health, is often overlooked. With multi-modal biological data, patient characterization can be enriched from two distinct views: the biological view and the phenotype view. However, the heterogeneity and imprecise nature of the multimodal data still pose challenges in developing an effective method to model from two views. In this paper, we propose a novel framework to incorporate multi-modal biological data via view decoupling. Specifically, we segregate the modeling of biological data from phenotype data in a graph-based learning framework. From the biological view, the latent factors in biological data are discovered to model patient correlation. From the phenotype view, phenotype co-occurrence can be modeled to reveal patterns across patients. Then patients are encoded from these two distinct views. To mitigate the influence of noise and irrelevant information in biological data, we devise a cross-view contrastive knowledge distillation aimed at distilling insights from the biological view to enhance phenotype imputation. We show that phenotype imputation with the proposed model significantly outperforms the state-of-the-art models on the real-world biomedical database.

JBHI Journal 2023 Journal Article

A Comparative Effectiveness Study on Opioid Use Disorder Prediction Using Artificial Intelligence and Existing Risk Models

  • Sajjad Fouladvand
  • Jeffery Talbert
  • Linda P. Dwoskin
  • Heather Bush
  • Amy L. Meadows
  • Lars E. Peterson
  • Yash R. Mishra
  • Steven K. Roggenkamp

Opioid use disorder (OUD) is a leading cause of death in the United States, placing a tremendous burden on patients, their families, and health care systems. Artificial intelligence (AI) can be harnessed with available healthcare data to produce automated OUD prediction tools. In this retrospective study, we developed AI-based models for OUD prediction and showed that AI can predict OUD more effectively than existing clinical tools including the unweighted opioid risk tool (ORT). Data include 474,208 patients' data over 10 years; 269,748 were females with an average age of 56.78 years. Cases are prescription opioid users with at least one diagnosis of OUD or at least one prescription for buprenorphine or methadone. Controls are prescription opioid users with no OUD diagnoses or buprenorphine or methadone prescriptions. On 100 randomly selected test sets including 47,396 patients, our proposed transformer-based AI model can predict OUD more efficiently (AUC = 0.742 $\pm$ 0.021) compared to logistic regression (AUC = 0.651 $\pm$ 0.025), random forest (AUC = 0.679 $\pm$ 0.026), xgboost (AUC = 0.690 $\pm$ 0.027), long short-term memory model (AUC = 0.706 $\pm$ 0.026), transformer (AUC = 0.725 $\pm$ 0.024), and unweighted ORT model (AUC = 0.559 $\pm$ 0.025). Our results show that embedding AI algorithms into clinical care may assist clinicians in risk stratification and management of patients receiving opioid therapy.
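
For readers unfamiliar with the AUC metric reported above, the sketch below computes it from its pairwise definition: the probability that a randomly chosen case receives a higher score than a randomly chosen control, with ties counted as one half. The labels and scores are invented for illustration and have no connection to the study's data or models.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    labels: 1 for cases, 0 for controls; scores: higher means more likely a case.
    """
    case_scores = [s for y, s in zip(labels, scores) if y == 1]
    control_scores = [s for y, s in zip(labels, scores) if y == 0]
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in case_scores:
        for n in control_scores:
            if c > n:
                wins += 1.0   # case ranked above control
            elif c == n:
                wins += 0.5   # tie counts as half
    return wins / (len(case_scores) * len(control_scores))


# Illustrative toy data: two controls followed by two cases.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the reported values between 0.559 and 0.742.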

JBHI Journal 2023 Journal Article

An Enhanced EEG Microstate Recognition Framework Based on Deep Neural Networks: An Application to Parkinson's Disease

  • Chunguang Chu
  • Zhen Zhang
  • Zhenxi Song
  • Zifan Xu
  • Jiang Wang
  • Fei Wang
  • Wei Liu
  • Liying Lu

Variations in brain activity patterns reveal impairments of motor and cognitive functions in the human brain. Electroencephalogram (EEG) microstates embody brain activity patterns at a microscopic time scale. However, current microstate analysis methods can recognize less than 90% of EEG signals per subject, which severely limits the characterization of dynamic brain activity. As an application to early Parkinson's disease (PD), we propose an enhanced EEG microstate recognition framework based on deep neural networks, which yields recognition rates from 90% to 99%, accompanied by a strong anti-artifact property. Additionally, gradient-weighted class activation mapping, as a visualization technique, is employed to locate the activated functional brain regions of each microstate class. We find that each microstate class corresponds to a particular activated brain region. Finally, based on the improved identification of microstate sequences, we explore the EEG microstate characteristics and their clinical associations. We show that the decreased occurrences of a particular microstate class reflect the degree of cognitive decline in early PD, and reduced transitions between certain microstates suggest injury in motor-related brain regions. The novel EEG microstate recognition framework paves the way to revealing more effective biomarkers for early PD.

NeurIPS Conference 2023 Conference Paper

FairLISA: Fair User Modeling with Limited Sensitive Attributes Information

  • Zheng Zhang
  • Qi Liu
  • Hao Jiang
  • Fei Wang
  • Yan Zhuang
  • Le Wu
  • Weibo Gao
  • Enhong Chen

User modeling techniques profile users' latent characteristics (e.g., preference) from their observed behaviors, and play a crucial role in decision-making. Unfortunately, traditional user models may unconsciously capture biases related to sensitive attributes (e.g., gender) from behavior data, even when this sensitive information is not explicitly provided. This can lead to unfair issues and discrimination against certain groups based on these sensitive attributes. Recent studies have been proposed to improve fairness by explicitly decorrelating user modeling results and sensitive attributes. However, most existing approaches assume that fully sensitive attribute labels are available in the training set, which is unrealistic due to collection limitations like privacy concerns, and hence bear the limitation of performance. In this paper, we focus on a practical situation with limited sensitive data and propose a novel FairLISA framework, which can efficiently utilize data with known and unknown sensitive attributes to facilitate fair model training. We first propose a novel theoretical perspective to build the relationship between data with both known and unknown sensitive attributes with the fairness objective. Then, based on this, we provide a general adversarial framework to effectively leverage the whole user data for fair user modeling. We conduct experiments on representative user modeling tasks including recommender system and cognitive diagnosis. The results demonstrate that our FairLISA can effectively improve fairness while retaining high accuracy in scenarios with different ratios of missing sensitive attributes.

JBHI Journal 2023 Journal Article

Graph Self-Supervised Learning With Application to Brain Networks Analysis

  • Guangqi Wen
  • Peng Cao
  • Lingwen Liu
  • Jinzhu Yang
  • Xizhe Zhang
  • Fei Wang
  • Osmar R. Zaiane

Limited training data and insufficient supervision restrict the performance of deep supervised models for brain disease diagnosis. It is therefore important to construct a learning framework that can capture more information from limited data and insufficient supervision. To address these issues, we focus on self-supervised learning and aim to generalize it to brain networks, which are non-Euclidean graph data. More specifically, we propose an ensemble masked graph self-supervised framework named BrainGSLs, which incorporates 1) a local topological-aware encoder that takes the partially visible nodes as input and learns these latent representations, 2) a node-edge bi-decoder that reconstructs the masked edges by the representations of both the masked and visible nodes, 3) a signal representation learning module for capturing temporal representations from BOLD signals and 4) a classifier used for the classification. We evaluate our model on three real medical clinical applications: diagnosis of Autism Spectrum Disorder (ASD), diagnosis of Bipolar Disorder (BD) and diagnosis of Major Depressive Disorder (MDD). The results suggest that the proposed self-supervised training has led to remarkable improvement and outperforms state-of-the-art methods. Moreover, our method is able to identify the biomarkers associated with the diseases, which is consistent with the previous studies. We also explore the correlation of these three diseases and find a strong association between ASD and BD. To the best of our knowledge, our work is the first attempt to apply the idea of self-supervised learning with a masked autoencoder to brain network analysis.

NeurIPS Conference 2023 Conference Paper

Knowledge Diffusion for Distillation

  • Tao Huang
  • Yuan Zhang
  • Mingkai Zheng
  • Shan You
  • Fei Wang
  • Chen Qian
  • Chang Xu

The representation gap between teacher and student is an emerging topic in knowledge distillation (KD). To reduce the gap and improve the performance, current methods often resort to complicated training schemes, loss functions, and feature alignments, which are task-specific and feature-specific. In this paper, we state that the essence of these methods is to discard the noisy information and distill the valuable information in the feature, and propose a novel KD method dubbed DiffKD, to explicitly denoise and match features using diffusion models. Our approach is based on the observation that student features typically contain more noise than teacher features due to the smaller capacity of the student model. To address this, we propose to denoise student features using a diffusion model trained by teacher features. This allows us to perform better distillation between the refined clean feature and teacher feature. Additionally, we introduce a light-weight diffusion model with a linear autoencoder to reduce the computation cost and an adaptive noise matching module to improve the denoising performance. Extensive experiments demonstrate that DiffKD is effective across various types of features and achieves state-of-the-art performance consistently on image classification, object detection, and semantic segmentation tasks. Code is available at https://github.com/hunto/DiffKD.

IJCAI Conference 2023 Conference Paper

Learning Attention from Attention: Efficient Self-Refinement Transformer for Face Super-Resolution

  • Guanxin Li
  • Jingang Shi
  • Yuan Zong
  • Fei Wang
  • Tian Wang
  • Yihong Gong

Recently, Transformer-based architectures have been introduced into the face super-resolution task due to their advantage in capturing long-range dependencies. However, these approaches tend to integrate global information in a large searching region, which neglects the most relevant information and induces blurry effects from irrelevant textures. Some improved methods simply constrain self-attention in a local window to suppress the useless information, but this also limits the capability of recovering high-frequency details when flat areas dominate the local searching window. To address these issues, we propose a novel self-refinement mechanism which can adaptively achieve texture-aware reconstruction in a coarse-to-fine procedure. Generally, the primary self-attention is first conducted to reconstruct the coarse-grained textures and detect the fine-grained regions that require further compensation. Then, region selection attention is performed to refine the textures on these key regions. Since self-attention considers the channel information on tokens equally, we employ a dual-branch feature integration module to privilege the important channels in feature extraction. Furthermore, we design a wavelet fusion module which integrates shallow-layer structure and deep-layer detailed features to recover realistic face images in the frequency domain. Extensive experiments demonstrate the effectiveness on a variety of datasets.

JBHI Journal 2023 Journal Article

Personality in Daily Life: Multi-Situational Physiological Signals Reflect Big-Five Personality Traits

  • Xinyu Shui
  • Yiling Chen
  • Xin Hu
  • Fei Wang
  • Dan Zhang

The popularity of wearable physiological recording devices has opened up new possibilities for the assessment of personality traits in everyday life. Compared with traditional questionnaires or laboratory assessments, wearable device-based measurements can collect rich data about individual physiological activities in real-life situations without interfering with normal life, enabling a more comprehensive description of individual differences. The present study aimed to explore the assessment of individuals’ Big-Five personality traits by physiological signals in daily life situations. A commercial bracelet was used to track the heart rate (HR) data from eighty college students (all male) enrolled in a special training program with a strictly-controlled daily schedule for ten consecutive working days. Their HR activities were divided into five daily situations (morning exercise, morning classes, afternoon classes, free time in the evening, and self-study situations) according to their daily schedule. Regression analyses with HR-based features in these five situations averaged across the ten days revealed significant cross-validated quantitative prediction correlations of 0.32 and 0.26 for the dimensions of Openness and Extraversion, with the prediction correlation trending toward significance for Conscientiousness and Neuroticism. Moreover, the multi-situation HR-based results were in general superior to those based on single-situation HR-based features, as well as those based on the multi-situation self-reported emotion ratings. Together, our findings demonstrate the link between personality and daily HR measures using state-of-the-art commercial devices and could shed light on the development of Big-Five personality assessment based on daily multi-situation physiological measures.

NeurIPS Conference 2022 Conference Paper

Contrastive Graph Structure Learning via Information Bottleneck for Recommendation

  • Chunyu Wei
  • Jian Liang
  • Di Liu
  • Fei Wang

Graph convolution networks (GCNs) for recommendations have emerged as an important research topic due to their ability to exploit higher-order neighbors. Despite their success, most of them suffer from the popularity bias brought by a small number of active users and popular items. Also, a real-world user-item bipartite graph contains many noisy interactions, which may hamper the sensitive GCNs. Graph contrastive learning shows promising performance for solving the above challenges in recommender systems. Most existing works typically perform graph augmentation to create multiple views of the original graph by randomly dropping edges/nodes or relying on predefined rules, and these augmented views always serve as an auxiliary task by maximizing their correspondence. However, we argue that the graph structures generated from these vanilla approaches may be suboptimal, and maximizing their correspondence will force the representation to capture information irrelevant for the recommendation task. Here, we propose a Contrastive Graph Structure Learning via Information Bottleneck (CGI) for recommendation, which adaptively learns whether to drop an edge or node to obtain optimized graph structures in an end-to-end manner. Moreover, we innovatively introduce the Information Bottleneck into the contrastive learning process to avoid capturing irrelevant information among different views and help enrich the final representation for recommendation. Extensive experiments on public datasets are provided to show that our model significantly outperforms strong baselines.

AAAI Conference 2022 Conference Paper

Deep Recurrent Neural Network with Multi-Scale Bi-directional Propagation for Video Deblurring

  • Chao Zhu
  • Hang Dong
  • Jinshan Pan
  • Boyang Liang
  • Yuhao Huang
  • Lean Fu
  • Fei Wang

The success of the state-of-the-art video deblurring methods stems mainly from implicit or explicit estimation of alignment among the adjacent frames for latent video restoration. However, due to the influence of the blur effect, estimating the alignment information from the blurry adjacent frames is not a trivial task. Inaccurate estimations will interfere with the subsequent frame restoration. Instead of estimating alignment information, we propose a simple and effective deep Recurrent Neural Network with Multi-scale Bi-directional Propagation (RNN-MBP) to effectively propagate and gather the information from unaligned neighboring frames for better video deblurring. Specifically, we build a Multi-scale Bi-directional Propagation (MBP) module with two U-Net RNN cells which can directly exploit the interframe information from unaligned neighboring hidden states by integrating them in different scales. Moreover, to better evaluate the proposed algorithm and existing state-of-the-art methods on real-world blurry scenes, we also create a Real-World Blurry Video Dataset (RBVD) by a well-designed Digital Video Acquisition System (DVAS) and use it as the training and evaluation dataset. Extensive experimental results demonstrate that the proposed RBVD dataset effectively improves the performance of existing algorithms on real-world blurry videos, and the proposed algorithm performs favorably against the state-of-the-art methods on three typical benchmarks. The code is available at https://github.com/XJTU-CVLAB-LOWLEVEL/RNN-MBP.

NeurIPS Conference 2022 Conference Paper

Green Hierarchical Vision Transformer for Masked Image Modeling

  • Lang Huang
  • Shan You
  • Mingkai Zheng
  • Fei Wang
  • Chen Qian
  • Toshihiko Yamasaki

We present an efficient approach for Masked Image Modeling (MIM) with hierarchical Vision Transformers (ViTs), allowing the hierarchical ViTs to discard masked patches and operate only on the visible ones. Our approach consists of three key designs. First, for window attention, we propose a Group Window Attention scheme following the Divide-and-Conquer strategy. To mitigate the quadratic complexity of the self-attention w.r.t. the number of patches, group attention encourages a uniform partition that visible patches within each local window of arbitrary size can be grouped with equal size, where masked self-attention is then performed within each group. Second, we further improve the grouping strategy via the Dynamic Programming algorithm to minimize the overall computation cost of the attention on the grouped patches. Third, as for the convolution layers, we convert them to the Sparse Convolution that works seamlessly with the sparse data, i.e., the visible patches in MIM. As a result, MIM can now work on most, if not all, hierarchical ViTs in a green and efficient way. For example, we can train the hierarchical ViTs, e.g., Swin Transformer and Twins Transformer, about 2.7× faster and reduce the GPU memory usage by 70%, while still enjoying competitive performance on ImageNet classification and the superiority on downstream COCO object detection benchmarks.

IJCAI Conference 2022 Conference Paper

IDPT: Interconnected Dual Pyramid Transformer for Face Super-Resolution

  • Jingang Shi
  • Yusi Wang
  • Songlin Dong
  • Xiaopeng Hong
  • Zitong Yu
  • Fei Wang
  • Changxin Wang
  • Yihong Gong

The face super-resolution (FSR) task aims to generate high-resolution (HR) face images from the corresponding low-resolution (LR) inputs, and has received a lot of attention because of its wide application prospects. However, due to the diversity of facial texture and the difficulty of reconstructing detailed content from degraded images, FSR is still far from being solved. In this paper, we propose a novel and effective face super-resolution framework based on Transformer, namely Interconnected Dual Pyramid Transformer (IDPT). Instead of directly stacking cascaded feature reconstruction blocks, the proposed IDPT designs the pyramid encoder/decoder Transformer architecture to extract coarse and detailed facial textures respectively, while the relationship between the dual pyramid Transformers is further explored by a bottom pyramid feature extractor. The pyramid encoder/decoder structure is devised to adapt to various characteristics of textures in different spatial spaces hierarchically. A novel fusing modulation module is inserted in each spatial layer to guide the refinement of detailed texture by the corresponding coarse texture, while fusing the shallow-layer coarse feature and corresponding deep-layer detailed feature simultaneously. Extensive experiments and visualizations on various datasets demonstrate the superiority of the proposed method for face super-resolution tasks.

NeurIPS Conference 2022 Conference Paper

Knowledge Distillation from A Stronger Teacher

  • Tao Huang
  • Shan You
  • Fei Wang
  • Chen Qian
  • Chang Xu

Unlike existing knowledge distillation methods, which focus on the baseline settings where the teacher models and training strategies are not as strong and competitive as state-of-the-art approaches, this paper presents a method dubbed DIST to distill better from a stronger teacher. We empirically find that the discrepancy of predictions between the student and a stronger teacher tends to be considerably more severe. As a result, the exact match of predictions in KL divergence would disturb the training and make existing methods perform poorly. In this paper, we show that simply preserving the relations between the predictions of teacher and student would suffice, and propose a correlation-based loss to capture the intrinsic inter-class relations from the teacher explicitly. Besides, considering that different instances have different semantic similarities to each class, we also extend this relational match to the intra-class level. Our method is simple yet practical, and extensive experiments demonstrate that it adapts well to various architectures, model sizes and training strategies, and can achieve state-of-the-art performance consistently on image classification, object detection, and semantic segmentation tasks. Code is available at: https://github.com/hunto/DIST_KD.
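
The correlation-based matching described in this abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the use of Pearson correlation on softmax outputs, the equal weighting of the two terms, and the function names `pearson_rows` and `relation_loss` are assumptions made here for illustration.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def pearson_rows(a, b, eps=1e-8):
    # Pearson correlation between matching rows of a and b.
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    num = (a * b).sum(axis=1)
    den = np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1)) + eps
    return num / den

def relation_loss(student_logits, teacher_logits):
    # Hypothetical loss: 1 - correlation, instead of an exact KL match.
    p_s = softmax(student_logits)
    p_t = softmax(teacher_logits)
    # Inter-class: one correlation per sample, taken across classes.
    inter = 1.0 - pearson_rows(p_s, p_t).mean()
    # Intra-class: one correlation per class, taken across the batch.
    intra = 1.0 - pearson_rows(p_s.T, p_t.T).mean()
    return float(inter + intra)
```

Because correlation ignores shifts and positive rescalings of a prediction vector, a student can score well under this sketch without reproducing the teacher's probabilities exactly, which is the relaxation the abstract argues for.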

AAAI Conference 2022 Conference Paper

Uncertainty-Aware Learning against Label Noise on Imbalanced Datasets

  • Yingsong Huang
  • Bing Bai
  • Shengwei Zhao
  • Kun Bai
  • Fei Wang

Learning against label noise is a vital topic to guarantee a reliable performance for deep neural networks. Recent research usually refers to dynamic noise modeling with model output probabilities and loss values, and then separates clean and noisy samples. These methods have gained notable success. However, unlike cherry-picked data, existing approaches often cannot perform well when facing imbalanced datasets, a common scenario in the real world. We thoroughly investigate this phenomenon and point out two major issues that hinder the performance, i.e., inter-class loss distribution discrepancy and misleading predictions due to uncertainty. The first issue is that existing methods often perform class-agnostic noise modeling. However, loss distributions show a significant discrepancy among classes under class imbalance, and class-agnostic noise modeling can easily get confused with noisy samples and samples in minority classes. The second issue is that models may output misleading predictions due to epistemic uncertainty and aleatoric uncertainty, thus existing methods that rely solely on the output probabilities may fail to distinguish confident samples. Inspired by our observations, we propose an Uncertainty-aware Label Correction framework (ULC) to handle label noise on imbalanced datasets. First, we perform epistemic uncertainty-aware class-specific noise modeling to identify trustworthy clean samples and refine/discard highly confident true/corrupted labels. Then, we introduce aleatoric uncertainty in the subsequent learning process to prevent noise accumulation in the label noise modeling process. We conduct experiments on several synthetic and real-world datasets. The results demonstrate the effectiveness of the proposed method, especially on imbalanced datasets.

NeurIPS Conference 2021 Conference Paper

Addressing Algorithmic Disparity and Performance Inconsistency in Federated Learning

  • Sen Cui
  • Weishen Pan
  • Jian Liang
  • Changshui Zhang
  • Fei Wang

Federated learning (FL) has gained growing interest for its capability of learning from distributed data sources collectively without the need to access the raw data samples across different sources. So far, FL research has mostly focused on improving performance; how algorithmic disparity is affected in a model learned with FL, and how algorithmic disparity affects utility inconsistency, remain largely unexplored. In this paper, we propose an FL framework to jointly consider performance consistency and algorithmic fairness across different local clients (data sources). We derive our framework from a constrained multi-objective optimization perspective, in which we learn a model satisfying fairness constraints on all clients with consistent performance. Specifically, we treat the algorithm prediction loss at each local client as an objective and maximize the worst-performing client with fairness constraints through optimizing a surrogate maximum function with all objectives involved. A gradient-based procedure is employed to achieve the Pareto optimality of this optimization problem. Theoretical analysis is provided to prove that our method can converge to a Pareto solution that achieves the min-max performance with fairness constraints on all clients. Comprehensive experiments on synthetic and real-world datasets demonstrate the superiority of our approach over baselines and its effectiveness in achieving both fairness and consistency across all local clients.

AAAI Conference 2021 Conference Paper

Fast Multi-view Discrete Clustering with Anchor Graphs

  • Qianyao Qiang
  • Bin Zhang
  • Fei Wang
  • Feiping Nie

Generally, the existing graph-based multi-view clustering models consist of two steps: (1) graph construction; (2) eigendecomposition on the graph Laplacian matrix to compute a continuous cluster assignment matrix, followed by a postprocessing algorithm to get the discrete one. However, both the graph construction and eigendecomposition are time-consuming, and the two-stage process may deviate from directly solving the primal problem. To this end, we propose Fast Multi-view Discrete Clustering (FMDC) with anchor graphs, focusing on directly solving the spectral clustering problem with a small time cost. We efficiently generate representative anchors and construct anchor graphs on different views. The discrete cluster assignment matrix is directly obtained by performing clustering on the automatically aggregated graph. FMDC has a linear computational complexity with respect to the data scale, which is a significant improvement compared to the quadratic one. Extensive experiments on benchmark datasets demonstrate its efficiency and effectiveness.

JBHI Journal 2021 Journal Article

Human Protein Complex-Based Drug Signatures for Personalized Cancer Medicine

  • Fei Wang
  • Yulian Ding
  • Xiujuan Lei
  • Bo Liao
  • Fang-Xiang Wu

Disease signature-based drug repositioning approaches typically first identify a disease signature from gene expression profiles of disease samples to represent a particular disease. Then such a disease signature is connected with the drug-induced gene expression profiles to find potential drugs for the particular disease. In order to obtain reliable disease signatures, the size of disease samples should be large enough, which is not always the case in practice, especially for personalized medicine. On the other hand, the sample sizes of drug-induced gene expression profiles are generally large. In this study, we propose a new drug repositioning approach (HDgS), in which the drug signature is first identified from drug-induced gene expression profiles, and then connected to the gene expression profiles of disease samples to find the potential drugs for patients. In order to take the dependencies among genes into account, the human protein complexes (HPC) are used to define the drug signature. The proposed HDgS is applied to the drug-induced gene expression profiles in LINCS and several types of cancer samples. The results indicate that the HPC-based drug signature can effectively find drug candidates for patients and that the proposed HDgS can be applied for personalized medicine with even one patient sample.

JMLR Journal 2021 Journal Article

Method of Contraction-Expansion (MOCE) for Simultaneous Inference in Linear Models

  • Fei Wang
  • Ling Zhou
  • Lu Tang
  • Peter X.K. Song

Simultaneous inference after model selection is of critical importance to address scientific hypotheses involving a set of parameters. In this paper, we consider a high-dimensional linear regression model in which a regularization procedure such as LASSO is applied to yield a sparse model. To establish a simultaneous post-model selection inference, we propose a method of contraction and expansion (MOCE) along the line of debiasing estimation in that we investigate a desirable trade-off between model selection variability and sample variability by the means of forward screening. We establish key theoretical results for the inference from the proposed MOCE procedure. Once the expanded model is properly selected, the theoretical guarantees and simultaneous confidence regions can be constructed by the joint asymptotic normal distribution. In comparison with existing methods, our proposed method exhibits stable and reliable coverage at a nominal significance level and enjoys substantially less computational burden. Thus, our MOCE approach is trustworthy in solving real-world problems.

NeurIPS Conference 2021 Conference Paper

ReSSL: Relational Self-Supervised Learning with Weak Augmentation

  • Mingkai Zheng
  • Shan You
  • Fei Wang
  • Chen Qian
  • Changshui Zhang
  • Xiaogang Wang
  • Chang Xu

Self-supervised Learning (SSL), including the mainstream contrastive learning, has achieved great success in learning visual representations without data annotations. However, most methods mainly focus on the instance-level information (i.e., the different augmented images of the same instance should have the same feature or cluster into the same class), but there is a lack of attention on the relationships between different instances. In this paper, we introduce a novel SSL paradigm, which we term relational self-supervised learning (ReSSL), that learns representations by modeling the relationship between different instances. Specifically, our proposed method employs a sharpened distribution of pairwise similarities among different instances as a relation metric, which is then utilized to match the feature embeddings of different augmentations. Moreover, to boost the performance, we argue that weak augmentations matter to represent a more reliable relation, and leverage a momentum strategy for practical efficiency. Experimental results show that our proposed ReSSL significantly outperforms the previous state-of-the-art algorithms in terms of both performance and training efficiency.
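
One way to read the relation metric in this abstract is as a cross-entropy between two similarity distributions, with the target sharpened by a lower temperature. The sketch below follows that reading. The cosine similarity, the bank of other instance embeddings (`memory`), the temperature values, and the name `relation_loss` are assumptions for illustration, not details confirmed by the abstract.

```python
import numpy as np

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def softmax(sim, tau):
    # Temperature-scaled softmax over each row; smaller tau sharpens it.
    z = sim / tau
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def relation_loss(z_student, z_teacher, memory, tau_s=0.1, tau_t=0.04):
    # z_student / z_teacher: embeddings of two augmentations of one batch.
    # memory: embeddings of other instances (hypothetical bank).
    z_s, z_t, mem = (l2_normalize(v) for v in (z_student, z_teacher, memory))
    # Relations: similarity of each sample to every other instance.
    p_t = softmax(z_t @ mem.T, tau_t)          # sharpened target relation
    log_p_s = np.log(softmax(z_s @ mem.T, tau_s))
    # Cross-entropy pulls the student relation toward the target.
    return float(-(p_t * log_p_s).sum(axis=1).mean())
```

In this sketch the target is treated as a constant, so in a trained model gradients would flow only through the student branch.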

AAAI Conference 2020 Conference Paper

A Coarse-to-Fine Adaptive Network for Appearance-Based Gaze Estimation

  • Yihua Cheng
  • Shiyao Huang
  • Fei Wang
  • Chen Qian
  • Feng Lu

Human gaze is essential for various appealing applications. Aiming at more accurate gaze estimation, a series of recent works propose to utilize face and eye images simultaneously. Nevertheless, face and eye images only serve as independent or parallel feature sources in those works, and the intrinsic correlation between their features is overlooked. In this paper we make the following contributions: 1) We propose a coarse-to-fine strategy which estimates a basic gaze direction from face image and refines it with corresponding residual predicted from eye images. 2) Guided by the proposed strategy, we design a framework which introduces a bi-gram model to bridge gaze residual and basic gaze direction, and an attention component to adaptively acquire suitable fine-grained feature. 3) Integrating the above innovations, we construct a coarse-to-fine adaptive network named CA-Net and achieve state-of-the-art performances on MPIIGaze and EyeDiap.

NeurIPS Conference 2020 Conference Paper

Agree to Disagree: Adaptive Ensemble Knowledge Distillation in Gradient Space

  • Shangchen Du
  • Shan You
  • Xiaojie Li
  • Jianlong Wu
  • Fei Wang
  • Chen Qian
  • Changshui Zhang

Distilling knowledge from an ensemble of teacher models is expected to have a more promising performance than that from a single one. Current methods mainly adopt a vanilla average rule, i.e., to simply take the average of all teacher losses for training the student network. However, this approach treats teachers equally and ignores the diversity among them. When conflicts or competitions exist among teachers, which is common, the inner compromise might hurt the distillation performance. In this paper, we examine the diversity of teacher models in the gradient space and regard the ensemble knowledge distillation as a multi-objective optimization problem so that we can determine a better optimization direction for the training of student network. Besides, we also introduce a tolerance parameter to accommodate disagreement among teachers. In this way, our method can be seen as a dynamic weighting method for each teacher in the ensemble. Extensive experiments validate the effectiveness of our method for both logits-based and feature-based cases.

AAAI Conference 2020 Conference Paper

Federated Patient Hashing

  • Jie Xu
  • Zhenxing Xu
  • Peter Walker
  • Fei Wang

Privacy concerns on sharing sensitive data across institutions are particularly paramount for the medical domain, which hinders the research and development of many applications, such as cohort construction for cross-institution observational studies and disease surveillance. Not only that, the large volume and heterogeneity of the patient data pose great challenges for retrieval and analysis. To address these challenges, in this paper, we propose a Federated Patient Hashing (FPH) framework, which collaboratively trains a retrieval model stored in a shared memory while keeping all the patient-level information in local institutions. Specifically, the objective function is constructed by minimization of a similarity preserving loss and a heterogeneity digging loss, which preserves both inter-data and intra-data relationships. Then, by leveraging the concept of Bregman divergence, we implement optimization in a federated manner in both centralized and decentralized learning settings, without accessing the raw training data across institutions. In addition to this, we also analyze the convergence rate of the FPH framework. Extensive experiments on real-world clinical data set from critical care are provided to demonstrate the effectiveness of the proposed method on similar patient matching across institutions.

NeurIPS Conference 2020 Conference Paper

ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding

  • Yibo Yang
  • Hongyang Li
  • Shan You
  • Fei Wang
  • Chen Qian
  • Zhouchen Lin

Neural architecture search (NAS) aims to produce the optimal sparse solution from a high-dimensional space spanned by all candidate connections. Current gradient-based NAS methods commonly ignore the constraint of sparsity in the search phase, but project the optimized solution onto a sparse one by post-processing. As a result, the dense super-net for search is inefficient to train and has a gap with the projected architecture for evaluation. In this paper, we formulate neural architecture search as a sparse coding problem. We perform the differentiable search on a compressed lower-dimensional space that has the same validation loss as the original sparse solution space, and recover an architecture by solving the sparse coding problem. The differentiable search and architecture recovery are optimized in an alternate manner. By doing so, our network for search at each update satisfies the sparsity constraint and is efficient to train. In order to also eliminate the depth and width gap between the network in search and the target-net in evaluation, we further propose a method to search and evaluate in one stage under the target-net settings. When training finishes, architecture variables are absorbed into network weights. Thus we get the searched architecture and optimized parameters in a single run. In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search. Our one-stage method produces state-of-the-art performances on both CIFAR-10 and ImageNet at the cost of only evaluation time.

JBHI Journal 2020 Journal Article

MFB-CBRNN: A Hybrid Network for MI Detection Using 12-Lead ECGs

  • Wenhan Liu
  • Fei Wang
  • Qijun Huang
  • Sheng Chang
  • Hao Wang
  • Jin He

This paper proposes a novel hybrid network named multiple-feature-branch convolutional bidirectional recurrent neural network (MFB-CBRNN) for myocardial infarction (MI) detection using 12-lead ECGs. The model efficiently combines convolutional neural network-based and recurrent neural network-based structures. Each feature branch consists of several one-dimensional convolutional and pooling layers, corresponding to a certain lead. All the feature branches are independent from each other, which are utilized to learn the diverse features from different leads. Moreover, a bidirectional long short term memory network is employed to summarize all the feature branches. Its good ability of feature aggregation has been proved by the experiments. Furthermore, the paper develops a novel optimization method, lead random mask (LRM), to alleviate overfitting and implement an implicit ensemble like dropout. The model with LRM can achieve a more accurate MI detection. Class-based and subject-based fivefold cross validations are both carried out using Physikalisch-Technische Bundesanstalt diagnostic database. Totally, there are 148 MI and 52 healthy control subjects involved in the experiments. The MFB-CBRNN achieves an overall accuracy of 99.90% in class-based experiments, and an overall accuracy of 93.08% in subject-based experiments. Compared with other related studies, our algorithm achieves a comparable or even better result on MI detection. Therefore, the MFB-CBRNN has a good generalization capacity and is suitable for MI detection using 12-lead ECGs. It has a potential to assist the real-world MI diagnostics and reduce the burden of cardiologists.

AAAI Conference 2020 Conference Paper

Neural Cognitive Diagnosis for Intelligent Education Systems

  • Fei Wang
  • Qi Liu
  • Enhong Chen
  • Zhenya Huang
  • Yuying Chen
  • Yu Yin
  • Zai Huang
  • Shijin Wang

Cognitive diagnosis is a fundamental issue in intelligent education, which aims to discover the proficiency level of students on specific knowledge concepts. Existing approaches usually mine linear interactions of student exercising process by manual-designed function (e.g., logistic function), which is not sufficient for capturing complex relations between students and exercises. In this paper, we propose a general Neural Cognitive Diagnosis (NeuralCD) framework, which incorporates neural networks to learn the complex exercising interactions, for getting both accurate and interpretable diagnosis results. Specifically, we project students and exercises to factor vectors and leverage multiple neural layers for modeling their interactions, where the monotonicity assumption is applied to ensure the interpretability of both factors. Furthermore, we propose two implementations of NeuralCD by specializing the required concepts of each exercise, i.e., the NeuralCDM with traditional Q-matrix and the improved NeuralCDM+ exploring the rich text content. Extensive experimental results on real-world datasets show the effectiveness of NeuralCD framework with both accuracy and interpretability.

AAAI Conference 2020 Short Paper

Structure-Based Drug-Drug Interaction Detection via Expressive Graph Convolutional Networks and Deep Sets (Student Abstract)

  • Mengying Sun
  • Fei Wang
  • Olivier Elemento
  • Jiayu Zhou

In this work, we proposed a DDI detection method based on molecular structures using graph convolutional networks and deep sets. We proposed a more discriminative convolutional layer compared to conventional GCN and achieved permutation invariant prediction without losing the capability of capturing complicated interactions.

AAAI Conference 2019 Conference Paper

A Neural Multi-Task Learning Framework to Jointly Model Medical Named Entity Recognition and Normalization

  • Sendong Zhao
  • Ting Liu
  • Sicheng Zhao
  • Fei Wang

State-of-the-art studies have demonstrated the superiority of joint modeling over pipeline implementation for medical named entity recognition and normalization due to the mutual benefits between the two processes. To exploit these benefits in a more sophisticated way, we propose a novel deep neural multi-task learning framework with explicit feedback strategies to jointly model recognition and normalization. On one hand, our method benefits from the general representations of both tasks provided by multi-task learning. On the other hand, our method successfully converts hierarchical tasks into a parallel multi-task setting while maintaining the mutual supports between tasks. Both of these aspects improve the model performance. Experimental results demonstrate that our method performs significantly better than state-of-the-art approaches on two publicly available medical literature datasets.

AAAI Conference 2019 Conference Paper

Compressing Recurrent Neural Networks with Tensor Ring for Action Recognition

  • Yu Pan
  • Jing Xu
  • Maolin Wang
  • Jinmian Ye
  • Fei Wang
  • Kun Bai
  • Zenglin Xu

Recurrent Neural Networks (RNNs) and their variants, such as Long-Short Term Memory (LSTM) networks, and Gated Recurrent Unit (GRU) networks, have achieved promising performance in sequential data modeling. The hidden layers in RNNs can be regarded as the memory units, which are helpful in storing information in sequential contexts. However, when dealing with high dimensional input data, such as video and text, the input-to-hidden linear transformation in RNNs brings high memory usage and huge computational cost. This makes the training of RNNs very difficult. To address this challenge, we propose a novel compact LSTM model, named as TR-LSTM, by utilizing the low-rank tensor ring decomposition (TRD) to reformulate the input-to-hidden transformation. Compared with other tensor decomposition methods, TR-LSTM is more stable. In addition, TR-LSTM can complete an end-to-end training and also provide a fundamental building block for RNNs in handling large input data. Experiments on real-world action recognition datasets have demonstrated the promising performance of the proposed TR-LSTM compared with the tensor-train LSTM and other state-of-the-art competitors.

JBHI Journal 2019 Journal Article

Efficient Mining Template of Predictive Temporal Clinical Event Patterns From Patient Electronic Medical Records

  • Jianqiang Li
  • Xiyue Tan
  • Xi Xu
  • Fei Wang

Exploring the temporal relationship among events in patient electronic medical records (EMR) is an important problem in biomedical informatics and the results can reveal patients’ impending disease conditions. In this paper, we investigate the problem of mining patterns from a sequence of point events, i.e., we only have the information on when the event happens but no duration or numerical value available. We propose a whole pipeline, including event preprocessing, pattern mining, and outcome analysis to mine the patterns and evaluate their effectiveness and discriminative power. Finally, we treat those mined patterns as additional features and evaluate them in a predictive modeling task for the early detection of congestive heart failure. On a real-world EMR data warehouse, we found that by adding those sequential pattern features, the prediction performance could be significantly improved by approximately 0.1.

NeurIPS Conference 2019 Conference Paper

Glyce: Glyph-vectors for Chinese Character Representations

  • Yuxian Meng
  • Wei Wu
  • Fei Wang
  • Xiaoya Li
  • Ping Nie
  • Fan Yin
  • Muyu Li
  • Qinghong Han

It is intuitive that NLP tasks for logographic languages like Chinese should benefit from the use of the glyph information in those languages. However, due to the lack of rich pictographic evidence in glyphs and the weak generalization ability of standard computer vision models on character data, an effective way to utilize the glyph information remains to be found. In this paper, we address this gap by presenting Glyce, the glyph-vectors for Chinese character representations. We make three major innovations: (1) We use historical Chinese scripts (e.g., bronzeware script, seal script, traditional Chinese, etc.) to enrich the pictographic evidence in characters; (2) We design CNN structures (called tianzege-CNN) tailored to Chinese character image processing; and (3) We use image-classification as an auxiliary task in a multi-task learning setup to increase the model's ability to generalize. We show that glyph-based models are able to consistently outperform word/char ID-based models in a wide range of Chinese NLP tasks. When combined with BERT, we are able to set new state-of-the-art results for a variety of Chinese NLP tasks, including language modeling, tagging (NER, CWS, POS), sentence pair classification (BQ, LCQMC, XNLI, NLPCC-DBQA), single sentence classification tasks (ChnSentiCorp, the Fudan corpus, iFeng), dependency parsing, and semantic role labeling. For example, the proposed model achieves an F1 score of 81.6 on the OntoNotes dataset of NER, +1.5 over BERT; it achieves an almost perfect accuracy of 99.8% on the Fudan corpus for text classification.

AAAI Conference 2019 Conference Paper

Robustness Can Be Cheap: A Highly Efficient Approach to Discover Outliers under High Outlier Ratios

  • Siqi Wang
  • En Zhu
  • Xiping Hu
  • Xinwang Liu
  • Qiang Liu
  • Jianping Yin
  • Fei Wang

Efficient detection of outliers from massive data with a high outlier ratio is challenging but not explicitly discussed yet. In such a case, existing methods either suffer from poor robustness or require expensive computations. This paper proposes a Low-rank based Efficient Outlier Detection (LEOD) framework to achieve favorable robustness against high outlier ratios with much cheaper computations. Specifically, it is worth highlighting the following aspects of LEOD: (1) Our framework exploits the low-rank structure embedded in the similarity matrix and considers inliers/outliers equally based on this low-rank structure, which facilitates us to encourage satisfying robustness with low computational cost later; (2) A novel re-weighting algorithm is derived as a new general solution to the constrained eigenvalue problem, which is a major bottleneck for the optimization process. Instead of the high space and time complexity ($O((2n)^2)$/$O((2n)^3)$) required by the classic solution, our algorithm enjoys $O(n)$ space complexity and a faster optimization speed in the experiments; (3) A new alternative formulation is proposed for further acceleration of the solution process, where a cheap closed-form solution can be obtained. Experiments show that LEOD achieves strong robustness under an outlier ratio from 20% to 60%, while it is at most 100 times more memory efficient and 1000 times faster than its previous counterpart that attains comparable performance. The codes of LEOD are publicly available at https://github.com/demonzyj56/LEOD.

NeurIPS Conference 2018 Conference Paper

Backpropagation with Callbacks: Foundations for Efficient and Expressive Differentiable Programming

  • Fei Wang
  • James Decker
  • Xilun Wu
  • Gregory Essertel
  • Tiark Rompf

Training of deep learning models depends on gradient descent and end-to-end differentiation. Under the slogan of differentiable programming, there is an increasing demand for efficient automatic gradient computation for emerging network architectures that incorporate dynamic control flow, especially in NLP. In this paper we propose an implementation of backpropagation using functions with callbacks, where the forward pass is executed as a sequence of function calls, and the backward pass as a corresponding sequence of function returns. A key realization is that this technique of chaining callbacks is well known in the programming languages community as continuation-passing style (CPS). Any program can be converted to this form using standard techniques, and hence, any program can be mechanically converted to compute gradients. Our approach achieves the same flexibility as other reverse-mode automatic differentiation (AD) techniques, but it can be implemented without any auxiliary data structures besides the function call stack, and it can easily be combined with graph construction and native code generation techniques through forms of multi-stage programming, leading to a highly efficient implementation that combines the performance benefits of define-then-run software frameworks such as TensorFlow with the expressiveness of define-by-run frameworks such as PyTorch.
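
The callback idea in this abstract can be shown in a few lines of Python. This is a minimal sketch under stated assumptions, not the paper's implementation: the names `Num`, `add`, `mul`, and `grad` are chosen here for illustration, and only scalar addition and multiplication are covered. Each operator calls its continuation `k` for the forward pass, then updates gradients once `k` returns, so the function call stack plays the role of the tape.

```python
class Num:
    """A value paired with its accumulated gradient."""
    def __init__(self, x):
        self.x = x
        self.d = 0.0

def add(a, b, k):
    y = Num(a.x + b.x)
    k(y)            # forward pass: run the rest of the program
    a.d += y.d      # backward pass: runs as the callback returns
    b.d += y.d

def mul(a, b, k):
    y = Num(a.x * b.x)
    k(y)
    a.d += b.x * y.d
    b.d += a.x * y.d

def grad(f):
    # f takes an input Num and a continuation, in continuation-passing style.
    def df(x):
        z = Num(x)
        def seed(result):
            result.d = 1.0   # d(output)/d(output) = 1
        f(z, seed)
        return z.d
    return df

def f(x, k):
    # x * x + x written in continuation-passing style.
    mul(x, x, lambda y: add(y, x, k))

print(grad(f)(3.0))  # derivative 2*x + 1 at x = 3.0 is 7.0
```

No graph or tape object is built: the pending returns of `mul` and `add` hold exactly the state the backward pass needs.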

NeurIPS Conference 2018 Conference Paper

Boosted Sparse and Low-Rank Tensor Regression

  • Lifang He
  • Kun Chen
  • Wanwan Xu
  • Jiayu Zhou
  • Fei Wang

We propose a sparse and low-rank tensor regression model to relate a univariate outcome to a feature tensor, in which each unit-rank tensor from the CP decomposition of the coefficient tensor is assumed to be sparse. This structure is both parsimonious and highly interpretable, as it implies that the outcome is related to the features through a few distinct pathways, each of which may only involve subsets of feature dimensions. We take a divide-and-conquer strategy to simplify the task into a set of sparse unit-rank tensor regression problems. To make the computation efficient and scalable, for the unit-rank tensor regression, we propose a stagewise estimation procedure to efficiently trace out its entire solution path. We show that as the step size goes to zero, the stagewise solution paths converge exactly to those of the corresponding regularized regression. The superior performance of our approach is demonstrated on various real-world and synthetic examples.

IJCAI Conference 2018 Conference Paper

Drug Similarity Integration Through Attentive Multi-view Graph Auto-Encoders

  • Tengfei Ma
  • Cao Xiao
  • Jiayu Zhou
  • Fei Wang

Drug similarity has been studied to support downstream clinical tasks such as inferring novel properties of drugs (e.g., side effects, indications, interactions) from known properties. The growing availability of new types of drug features brings the opportunity of learning a more comprehensive and accurate drug similarity that represents the full spectrum of underlying drug relations. However, it is challenging to integrate this heterogeneous, noisy, and nonlinearly related information to learn accurate similarity measures, especially when labels are scarce. Moreover, there is a trade-off between accuracy and interpretability. In this paper, we propose to learn accurate and interpretable similarity measures from multiple types of drug features. In particular, we model the integration using multi-view graph auto-encoders, and add an attention mechanism to determine the weights for each view with respect to corresponding tasks and features for better interpretability. Our model has a flexible design for both semi-supervised and unsupervised settings. Experimental results demonstrated significant predictive accuracy improvement. Case studies also showed better model capacity (e.g., embedding node features) and interpretability.

IJCAI Conference 2018 Conference Paper

Pairwise-Ranking based Collaborative Recurrent Neural Networks for Clinical Event Prediction

  • Zhi Qiao
  • Shiwan Zhao
  • Cao Xiao
  • Xiang Li
  • Yong Qin
  • Fei Wang

Patient Electronic Health Records (EHR) data consist of sequences of patient visits over time. Sequential prediction of patients' future clinical events (e.g., diagnoses) from their historical EHR data is a core research task and motivates a series of predictive models, including deep learning. The existing research mainly adopts a classification framework, which treats the observed and unobserved events as positive and negative classes. However, this may not hold in real clinical settings, considering the high rate of missed diagnoses and human errors. In this paper, we propose to formulate the clinical event prediction problem as an event recommendation problem. An end-to-end pairwise-ranking based collaborative recurrent neural network (PacRNN) is proposed to solve it, which first embeds patient clinical contexts with an attention RNN, then uses Bayesian Personalized Ranking (BPR) regularized by disease co-occurrence to rank probabilities of patient-specific diseases, and uses a point process to provide simultaneous prediction of the occurrence time of these diagnoses. Experimental results on two real world EHR datasets demonstrate the robust performance, interpretability, and efficacy of PacRNN.
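
For readers unfamiliar with pairwise ranking, the standard BPR objective for a single (observed, unobserved) event pair can be sketched as follows. This is only the generic BPR term under assumed scalar scores; it does not include the co-occurrence regularization, the attention RNN, or the point process described in the abstract, and the function name `bpr_loss` is hypothetical.

```python
import math

def bpr_loss(score_pos, score_neg):
    """Generic BPR pair loss: -log(sigmoid(score_pos - score_neg))."""
    margin = score_pos - score_neg
    # -log(sigmoid(m)) = log(1 + exp(-m)); branch on the sign to avoid overflow
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

The loss is small when the observed event is scored above the unobserved one and grows roughly linearly as the ordering is violated, which is why unobserved events act as "less preferred" rather than strictly negative examples.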

AAAI Conference 2018 Conference Paper

Structural Deep Embedding for Hyper-Networks

  • Ke Tu
  • Peng Cui
  • Xiao Wang
  • Fei Wang
  • Wenwu Zhu

Network embedding has recently attracted a lot of attention in data mining. Existing network embedding methods mainly focus on networks with pairwise relationships. In the real world, however, the relationships among data points could go beyond pairwise, i.e., three or more objects are involved in each relationship represented by a hyperedge, thus forming hyper-networks. These hyper-networks pose great challenges to existing network embedding methods when the hyperedges are indecomposable, that is to say, any subset of nodes in a hyperedge cannot form another hyperedge. These indecomposable hyperedges are especially common in heterogeneous networks. In this paper, we propose a novel Deep Hyper-Network Embedding (DHNE) model to embed hyper-networks with indecomposable hyperedges. More specifically, we theoretically prove that any linear similarity metric in embedding space commonly used in existing methods cannot maintain the indecomposability property in hyper-networks, and thus propose a new deep model to realize a non-linear tuplewise similarity function while preserving both local and global proximities in the formed embedding space. We conduct extensive experiments on four different types of hyper-networks, including a GPS network, an online social network, a drug network and a semantic network. The empirical results demonstrate that our method can significantly and consistently outperform the state-of-the-art algorithms.

AAAI Conference 2017 Conference Paper

Adverse Drug Reaction Prediction with Symbolic Latent Dirichlet Allocation

  • Cao Xiao
  • Ping Zhang
  • W. Chaovalitwongse
  • Jianying Hu
  • Fei Wang

Adverse drug reaction (ADR) is a major burden for patients and the healthcare industry. It often causes preventable hospitalizations and deaths and is associated with huge costs. Traditional preclinical in vitro safety profiling and clinical safety trials are restricted by small scale, long duration, huge financial costs and limited statistical significance. The availability of large amounts of drug and ADR data potentially allows ADR predictions during the drugs’ early preclinical stage with data analytics methods to inform more targeted clinical safety tests. Despite their initial success, existing methods have trade-offs among interpretability, predictive power and efficiency. This urges us to explore methods that could have all these strengths and provide practical solutions for real world ADR predictions. We cast the ADR-drug relation structure into a three-layer hierarchical Bayesian model. We interpret each ADR as a symbolic word and apply latent Dirichlet allocation (LDA) to learn topics that may represent certain biochemical mechanisms that relate ADRs to drug structures. Based on LDA, we designed an equivalent regularization term to incorporate the hierarchical ADR domain knowledge. Finally, we developed a mixed input model leveraging a fast collapsed Gibbs sampling method in which the complexity of each iteration is proportional only to the number of positive ADRs. Experiments on real world data show our models achieved higher prediction accuracy and shorter running time than the state-of-the-art alternatives.

IJCAI Conference 2017 Conference Paper

Model Accuracy and Runtime Tradeoff in Distributed Deep Learning: A Systematic Study

  • Suyog Gupta
  • Wei Zhang
  • Fei Wang

Deep learning with a large number of parameters requires distributed training, where model accuracy and runtime are two important factors to be considered. However, there has been no systematic study of the tradeoff between these two factors during the model training process. This paper presents Rudra, a parameter server based distributed computing framework tuned for training large-scale deep neural networks. Using variants of the asynchronous stochastic gradient descent algorithm we study the impact of synchronization protocol, stale gradient updates, minibatch size, learning rates, and number of learners on runtime performance and model accuracy. We introduce a new learning-rate modulation strategy to counter the effect of stale gradients and propose a new synchronization protocol that can effectively bound the staleness in gradients, improve runtime performance and achieve good model accuracy. Our empirical investigation reveals a principled approach for distributed training of neural networks: the mini-batch size per learner should be reduced as more learners are added to the system to preserve the model accuracy. We validate this approach using commonly-used image classification benchmarks: CIFAR10 and ImageNet.

AAAI Conference 2017 Conference Paper

Multitask Dyadic Prediction and Its Application in Prediction of Adverse Drug-Drug Interaction

  • Bo Jin
  • Haoyu Yang
  • Cao Xiao
  • Ping Zhang
  • Xiaopeng Wei
  • Fei Wang

Adverse drug-drug interactions (DDIs) remain a leading cause of morbidity and mortality around the world. Identifying potential DDIs during the drug design process is critical in guiding targeted clinical drug safety testing. Although detection of adverse DDIs is conducted during Phase IV clinical trials, a large number of new DDIs are still found by accident after the drugs are put on the market. With the arrival of the big data era, more and more pharmaceutical research and development data are becoming available, providing an invaluable resource for digging insights that can potentially be leveraged in early prediction of DDIs. Many computational approaches have been proposed in recent years for DDI prediction. However, most of them focused on binary prediction (with or without DDI), despite the fact that each DDI is associated with a different type. Predicting the actual DDI type will help us better understand the DDI mechanism and identify proper ways to prevent it. In this paper, we formulate the DDI type prediction problem as a multitask dyadic regression problem, where the prediction of each specific DDI type is treated as a task. Compared with conventional matrix completion approaches, which can only impute the missing entries in the DDI matrix, our approach can directly regress those dyadic relationships (DDIs) and thus can be extended to new drugs more easily. We developed an effective proximal gradient method to solve the problem. Evaluation on real world datasets is presented to demonstrate the effectiveness of the proposed approach.

AAAI Conference 2017 Conference Paper

TaGiTeD: Predictive Task Guided Tensor Decomposition for Representation Learning from Electronic Health Records

  • Kai Yang
  • Xiang Li
  • Haifeng Liu
  • Jing Mei
  • Guotong Xie
  • Junfeng Zhao
  • Bing Xie
  • Fei Wang

With the better availability of healthcare data, such as Electronic Health Records (EHR), more and more data analytics methodologies are being developed to dig insights from them and improve the quality of care delivery. There are many challenges in analyzing EHR, such as high dimensionality and event sparsity. Moreover, unlike in other application domains, EHR analysis algorithms need to be highly interpretable to be clinically useful. This makes representation learning from EHRs of key importance. In this paper, we propose an algorithm called Predictive Task Guided Tensor Decomposition (TaGiTeD) to analyze EHRs. Specifically, TaGiTeD learns event interaction patterns that are highly predictive for certain tasks from EHRs with supervised tensor decomposition. Compared with unsupervised methods, TaGiTeD can learn effective EHR representations in a more focused way. This is crucial because most medical problems have very limited patient samples, which are not enough for unsupervised algorithms to learn meaningful representations from. We apply TaGiTeD to a real world EHR data warehouse and demonstrate that TaGiTeD can learn representations that are both interpretable and predictive.

AAAI Conference 2017 Conference Paper

Treatment Effect Estimation with Data-Driven Variable Decomposition

  • Kun Kuang
  • Peng Cui
  • Bo Li
  • Meng Jiang
  • Shiqiang Yang
  • Fei Wang

One fundamental problem in causal inference is treatment effect estimation in observational studies when variables are confounded. Controlling for confounding effects is generally handled by the propensity score. But it treats all observed variables as confounders and ignores the adjustment variables, which have no influence on treatment but are predictive of the outcome. Recently, it has been demonstrated that the adjustment variables are effective in reducing the variance of the estimated treatment effect. However, how to automatically separate the confounders and adjustment variables in observational studies is still an open problem, especially in scenarios with high dimensional variables, which are common in the big data era. In this paper, we propose a Data-Driven Variable Decomposition (D²VD) algorithm, which can 1) automatically separate confounders and adjustment variables with a data driven approach, and 2) simultaneously estimate treatment effect in observational studies with high dimensional variables. Under standard assumptions, we show experimentally that our D²VD algorithm can automatically separate the variables precisely, and estimate treatment effect more accurately and with tighter confidence intervals than the state-of-the-art methods on both synthetic data and a real online advertising dataset.

IS Journal 2016 Journal Article

Online Behavioral Analysis and Modeling [Guest Editorial]

  • Peng Cui
  • Huan Liu
  • Charu Aggarwal
  • Fei Wang

Online behavioral analysis and modeling has aroused considerable interest from closely related research fields such as data mining, machine learning, and information retrieval. This special issue provides a forum for researchers in behavior analysis to review pressing needs, discuss challenging research issues, and showcase state-of-the-art research and development in modern Web platforms.

IS Journal 2016 Journal Article

Uncovering and Predicting Human Behaviors

  • Peng Cui
  • Huan Liu
  • Charu Aggarwal
  • Fei Wang

This installment of Trends & Controversies provides an array of perspectives on the latest research in modeling user behavior. Peng Cui, Huan Liu, Charu Aggarwal, and Fei Wang introduce the field in "Uncovering and Predicting Human Behaviors." The essays included are "Computational Modeling of Complex User Behaviors: Challenges and Opportunities," by Peng Cui, Huan Liu, Charu Aggarwal, and Fei Wang; "Non-IID Recommendation Theories and Systems," by Longbing Cao and Philip S. Yu; "User Behavior Modeling and Fraud Detection," by Alex Beutel and Christos Faloutsos; and "Transfer Learning for Behavior Prediction," by Weike Pan and Qiang Yang.

AAAI Conference 2015 Conference Paper

Perceiving Group Themes from Collective Social and Behavioral Information

  • Peng Cui
  • Tianyang Zhang
  • Fei Wang
  • Peng He

Collective social and behavioral information commonly exists in nature. There is a widespread intuitive sense that the characteristics of this social and behavioral information are to some extent related to the themes (or semantics) of the activities or targets. In this paper, we explicitly validate the interplay of collective social behavioral information and group themes using a large scale real dataset of online groups, and demonstrate the possibility of perceiving group themes from collective social and behavioral information. We propose a REgularized miXEd Regression (REXER) model based on matrix factorization to infer hierarchical semantics (including both group category and group labels) from collective social and behavioral information of group members. We extensively evaluate the proposed method on a large scale real online group dataset. For the prediction of group themes, the proposed REXER achieves satisfactory performance on various criteria. More specifically, we can predict the category of a group (among 6 categories) purely based on the collective social and behavioral information of the group with a Precision@1 of 55.16%, without any assistance from group labels or conversation contents. We also show, perhaps counterintuitively, that the collective social and behavioral information is more reliable than the titles and labels of groups for inferring the group categories.

AAAI Conference 2015 Conference Paper

Probabilistic Attributed Hashing

  • Mingdong Ou
  • Peng Cui
  • Jun Wang
  • Fei Wang
  • Wenwu Zhu

Due to their simplicity and efficiency, many hashing methods have recently been developed for large-scale similarity search. Most of the existing hashing methods focus on mapping low-level features to binary codes, but neglect attributes that are commonly associated with data samples. Attribute data, such as image tag, product brand, and user profile, can represent human recognition better than low-level features. However, attributes have specific characteristics, including high-dimensional, sparse and categorical properties, which are hardly leveraged in existing hashing learning frameworks. In this paper, we propose a hashing learning framework, Probabilistic Attributed Hashing (PAH), to integrate attributes with low-level features. The connections between attributes and low-level features are built through sharing a common set of latent binary variables, i.e., hash codes, through which attributes and features can complement each other. Finally, we develop an efficient iterative learning algorithm, which is generally feasible for large-scale applications. Extensive experiments and a comparison study are conducted on two public datasets, i.e., DBLP and NUS-WIDE. The results clearly demonstrate that the proposed PAH method substantially outperforms the peer methods.

JBHI Journal 2015 Journal Article

PSF: A Unified Patient Similarity Evaluation Framework Through Metric Learning With Weak Supervision

  • Fei Wang
  • Jimeng Sun

Patient similarity is an important analytic operation in healthcare applications. At its core, patient similarity takes an index patient as the input and retrieves a ranked list of similar patients that are relevant in a specific clinical context. It takes patient information such as electronic health records as input and computes the distance between a pair of patients based on that information. To construct a clinically valid similarity measure, physician input often needs to be incorporated. However, obtaining physicians’ input is difficult and expensive. As a result, typically only limited physician feedback can be obtained on a small portion of patients. How can we leverage all unlabeled patient data and limited supervision information from physicians to construct a clinically meaningful distance metric? In this paper, we present a patient similarity framework (PSF) that unifies and significantly extends existing supervised patient similarity metric learning methods. PSF is a general framework that can learn an appropriate distance metric through supervised and unsupervised information. Within the PSF framework, we propose a novel patient similarity algorithm that uses local spline regression to capture the unsupervised information. To speed up the incorporation of physician feedback or newly available clinical information, we introduce a general online update algorithm for an existing PSF distance metric.

ICRA Conference 2014 Conference Paper

Design of driving fatigue detection system based on hybrid measures using wavelet-packets transform

  • Fei Wang
  • Shaonan Wang
  • Xihui Wang
  • Ying Peng
  • Yiding Yang

With the rapid development of urbanization and motorization in China, fatigued driving has become an increasingly serious road traffic problem. Driving fatigue affects drivers' alertness, decreasing an individual's ability to operate a vehicle safely and increasing the risk of human error that could lead to fatalities; it has been widely recognized as a critical safety issue that cuts across all modes of the transportation industry. In this paper, first, using a virtual driving system we developed, driving simulation experiments were designed to collect subjects' electroencephalogram (EEG) signals and mental fatigue data. To detect drivers' mental state in real time, the wavelet-packets transform (WPT) was selected to extract continuous features; then, subjective evaluation combined with video monitoring was used to accurately evaluate the driver's mental state during the experiment. Finally, with fatigue features as the input and fatigue state as the output, a driving fatigue detection model can be constructed by classification methods. In this paper, a Support Vector Machine (SVM) was used to build the driving fatigue detection model to estimate mental fatigue state from EEG signal features, and the binary classification accuracy reaches up to 88.6207%.

IJCAI Conference 2013 Conference Paper

Active Learning from Relative Queries

  • Buyue Qian
  • Xiang Wang
  • Fei Wang
  • Hongfei Li
  • Jieping Ye
  • Ian Davidson

Active learning has been extensively studied and shown to be useful in solving real problems. The typical setting of traditional active learning methods is querying labels from an oracle. This is only possible if an expert exists, which may not be the case in many real world applications. In this paper, we focus on designing easier questions that can be answered by a non-expert. These questions poll relative information as opposed to absolute information and can even be generated from side information. We propose an active learning approach that queries the ordering of the importance of an instance’s neighbors rather than its label. We explore our approach on real datasets and make several interesting discoveries including that querying neighborhood information can be an effective question to ask and sometimes can even yield better performance than querying labels.

IJCAI Conference 2013 Conference Paper

Listening to the Crowd: Automated Analysis of Events via Aggregated Twitter Sentiment

  • Yuheng Hu
  • Fei Wang
  • Subbarao Kambhampati

Individuals often express their opinions on social media platforms like Twitter and Facebook during public events such as the U.S. Presidential debate and the Oscar awards ceremony. Gleaning insights from these posts is of importance to analyzing the impact of the event. In this work, we consider the problem of identifying the segments and topics of an event that garnered praise or criticism, according to aggregated Twitter responses. We propose a flexible factorization framework, SOCSENT, to learn factors about segments, topics, and sentiments. To regulate the learning process, several constraints based on prior knowledge on sentiment lexicon, sentiment orientations (on a few tweets) as well as tweets alignments to the event are enforced. We implement our approach using simple update rules to get the optimal solution. We evaluate the proposed method both quantitatively and qualitatively on two large-scale tweet datasets associated with two events from different domains to show that it improves significantly over baseline models.

AAAI Conference 2012 Conference Paper

ET-LDA: Joint Topic Modeling for Aligning Events and their Twitter Feedback

  • Yuheng Hu
  • Ajita John
  • Fei Wang
  • Subbarao Kambhampati

During broadcast events such as the Superbowl, the U.S. Presidential and Primary debates, etc., Twitter has become the de facto platform for crowds to share perspectives and commentaries about them. Given an event and an associated large-scale collection of tweets, there are two fundamental research problems that have been receiving increasing attention in recent years. One is to extract the topics covered by the event and the tweets; the other is to segment the event. So far these problems have been viewed separately and studied in isolation. In this work, we argue that these problems are in fact inter-dependent and should be addressed together. We develop a joint Bayesian model that performs topic modeling and event segmentation in one unified framework. We evaluate the proposed model both quantitatively and qualitatively on two large-scale tweet datasets associated with two events from different domains to show that it improves significantly over baseline models.

AAAI Conference 2011 Conference Paper

Automatic Group Sparse Coding

  • Fei Wang
  • Noah Lee
  • Jimeng Sun
  • Jianying Hu
  • Shahram Ebadollahi

Sparse Coding (SC), which models the data vectors as sparse linear combinations over basis vectors (i.e., dictionary), has been widely applied in machine learning, signal processing and neuroscience. Recently, one specific SC technique, Group Sparse Coding (GSC), has been proposed to learn a common dictionary over multiple different groups of data, where the data groups are assumed to be pre-defined. In practice, this may not always be the case. In this paper, we propose Automatic Group Sparse Coding (AutoGSC), which can (1) discover the hidden data groups; (2) learn a common dictionary over different data groups; and (3) learn an individual dictionary for each data group. Finally, we conduct experiments on both synthetic and real world data sets to demonstrate the effectiveness of AutoGSC, and compare it with traditional sparse coding and Nonnegative Matrix Factorization (NMF) methods.

IROS Conference 2011 Conference Paper

Controlling telerobotic operations adaptive to quality of teleoperator and task dexterity

  • Yunyi Jia
  • Ning Xi 0001
  • Fei Wang
  • Yunxia Wang
  • Xin Li

Telerobotic systems have been researched for decades due to their extensive applications in many civilian and military areas. Most research mainly focused on either the stability or telepresence of the telerobotic systems. Few studies have investigated the effects of the confidence of the teleoperator on the performance of the teleoperation. The confidence of the teleoperator is of significant importance to the efficiency and safety of the telerobotic systems. This paper proposes a concept named quality of teleoperator (QoT) to represent the confidence of the decisions and commands generated by the teleoperator. The value of QoT is computed based on a set of mental states of the teleoperator. Based on the QoT, a control adjustment mechanism is designed to enhance the efficiency and safety of the telerobotic systems. Experiments were implemented on a manipulator to demonstrate the effectiveness of the proposed method.

AAAI Conference 2011 Conference Paper

Item-Level Social Influence Prediction with Probabilistic Hybrid Factor Matrix Factorization

  • Peng Cui
  • Fei Wang
  • Shiqiang Yang
  • Lifeng Sun

Social influence has become the essential factor which drives the dynamic evolution process of social network structure and user behaviors. Previous research often focuses on social influence analysis at the network level or topic level. In this paper, we concentrate on predicting item-level social influence to reveal the users’ influences at a more fine-grained level. We formulate the social influence prediction problem as the estimation of a user-post matrix, where each entry in the matrix represents the social influence strength the corresponding user has given the corresponding web post. To deal with the sparsity and complex factor challenges in the research, we model the problem by extending the probabilistic matrix factorization method to incorporate rich prior knowledge on both user dimension and web post dimension, and propose the Probabilistic Hybrid Factor Matrix Factorization (PHF-MF) approach. Intensive experiments are conducted on a real world online social network to demonstrate the advantages and characteristics of the proposed method.

AAAI Conference 2011 Conference Paper

Towards Evolutionary Nonnegative Matrix Factorization

  • Fei Wang
  • Hanghang Tong
  • Ching-Yung Lin

Nonnegative Matrix Factorization (NMF) techniques have aroused considerable interest from the field of artificial intelligence in recent years because of their good interpretability and computational efficiency. However, in many real world applications, the data features usually evolve smoothly over time. In this case, it would be very expensive in both computation and storage to rerun the whole NMF procedure each time the data features change. In this paper, we propose Evolutionary Nonnegative Matrix Factorization (eNMF), which aims to incrementally update the factorized matrices in a computation and space efficient manner as the data matrix varies. We devise such an evolutionary procedure for both asymmetric and symmetric NMF. Finally, we conduct experiments on several real world data sets to demonstrate the efficacy and efficiency of eNMF.

IJCAI Conference 2009 Conference Paper

  • Fei Wang
  • Xin Wang
  • Tao Li

Clustering aggregation has emerged as an important extension of the classical clustering problem. It refers to the situation in which a number of different (input) clusterings have been obtained for a particular data set and it is desired to aggregate those clustering results to get a better clustering solution. In this paper, we propose a unified framework to solve the clustering aggregation problem, where the aggregated clustering result is obtained by minimizing the (weighted) sum of the Bregman divergence between it and all the input clusterings. Moreover, under our algorithm framework, we also propose a novel cluster aggregation problem where some must-link and cannot-link constraints are given in addition to the input clusterings. Finally the experimental results on some real world data sets are presented to show the effectiveness of our method.

IJCAI Conference 2009 Conference Paper

  • Dan Zhang
  • Fei Wang
  • Luo Si
  • Tao Li

Clustering, classification, and regression are three major research topics in machine learning. So far, much work has been conducted in solving multiple instance classification and multiple instance regression problems, where supervised training patterns are given as bags and each bag consists of some instances. But the research on unsupervised multiple instance clustering is still limited. This paper formulates a novel Maximum Margin Multiple Instance Clustering (M³IC) problem for the multiple instance clustering task. To avoid solving a nonconvex optimization problem directly, M³IC is further relaxed, which enables an efficient optimization solution with a combination of the Constrained Concave-Convex Procedure (CCCP) and the Cutting Plane method. Furthermore, this paper analyzes some important properties of the proposed method and the relationship between the proposed method and some other related ones. An extensive set of empirical results demonstrates the advantages of the proposed method over existing research in both effectiveness and efficiency.

IJCAI Conference 2009 Conference Paper

  • Fei Wang
  • Bin Zhang
  • Ta-Hsin Li
  • Wen Jun Yin
  • Jin Dong
  • Tao Li

In this paper, we consider a general problem of semi-supervised preference learning, in which we assume that we have the information of the extreme cases and some ordered constraints; our goal is to learn the unknown preferences of the other places. Taking the potential housing place selection problem as an example, we have many candidate places together with their associated information (e.g., position, environment), we know some extreme examples (i.e., several places are perfect for building a house, and several places are the worst, where a house cannot be built), and we know some partially ordered constraints (i.e., for two places, which place is better). How, then, can we judge the preference of one potential place whose preference is unknown beforehand? We propose a Bayesian framework based on Gaussian processes to tackle this problem, from which we solve not only for the unknown preferences but also for the hyperparameters contained in our model.

AAAI Conference 2008 Conference Paper

On Discriminative Semi-Supervised Classification

  • Fei Wang

Recent years have witnessed a surge of interest in semi-supervised learning methods. A common strategy for these algorithms is to require that the predicted data labels be sufficiently smooth with respect to the intrinsic data manifold. In this paper, we argue that rather than penalizing the label smoothness, we can directly punish the discriminality of the classification function to achieve a more powerful predictor, and we derive two specific algorithms: Semi-Supervised Discriminative Regularization (SSDR) and Semi-parametric Discriminative Semi-supervised Classification (SDSC). Finally, many experimental results are presented to show the effectiveness of our method.

AAAI Conference 2008 Conference Paper

Semi-supervised Classification Using Local and Global Regularization

  • Fei Wang
  • Gang Wang

In this paper, we propose a semi-supervised learning (SSL) algorithm based on local and global regularization. In the local regularization part, our algorithm constructs a regularized classifier for each data point using its neighborhood, while the global regularization part adopts a Laplacian regularizer to smooth the data labels predicted by those local classifiers. We show that some existing SSL algorithms can be derived from our framework. Finally, we present some experimental results to show the effectiveness of our method.
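
As a rough illustration of what a Laplacian regularizer does to labels, the sketch below smooths label scores over a small weighted graph while keeping the labeled nodes fixed. It is not the algorithm from the paper: the local classifiers are omitted, and the helper name `propagate_labels`, the graph representation, and the toy chain graph are all assumptions made for this example.

```python
def propagate_labels(adj, labeled, iters=200):
    """Smooth label scores over a weighted graph (hypothetical helper).

    adj:     list where adj[i] is a list of (neighbor, weight) pairs
    labeled: dict mapping node index -> known label value (kept fixed)
    Assumes every unlabeled node has at least one neighbor.
    """
    n = len(adj)
    f = [labeled.get(i, 0.0) for i in range(n)]
    for _ in range(iters):
        new_f = list(f)
        for i in range(n):
            if i in labeled:
                continue  # labeled nodes keep their given value
            total = sum(w for _, w in adj[i])
            # Weighted neighbor average: a Jacobi step that reduces f^T L f,
            # where L = D - W is the graph Laplacian.
            new_f[i] = sum(w * f[j] for j, w in adj[i]) / total
        f = new_f
    return f


# Toy chain graph 0 - 1 - 2 - 3 - 4 with unit weights (made-up data).
chain = [[(1, 1.0)], [(0, 1.0), (2, 1.0)], [(1, 1.0), (3, 1.0)],
         [(2, 1.0), (4, 1.0)], [(3, 1.0)]]
scores = propagate_labels(chain, {0: 1.0, 4: -1.0})
```

On this chain, the scores of the unlabeled nodes interpolate between the two labeled endpoints, which is the smoothing behavior a Laplacian penalty encourages.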

AAAI Conference 2007 Conference Paper

Clustering with Local and Global Regularization

  • Fei Wang

Clustering is an old research topic in the data mining and machine learning communities. Most traditional clustering methods can be categorized as either local or global. In this paper, a novel clustering method that can explore both the local and global information in the dataset is proposed. The method, Clustering with Local and Global Regularization (CLGR), aims to minimize a cost function that properly trades off the local and global costs. We will show that such an optimization problem can be solved by the eigenvalue decomposition of a sparse symmetric matrix, which can be done efficiently by some iterative methods. Finally, the experimental results on several datasets are presented to show the effectiveness of our method.
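
To make the phrase "eigenvalue decomposition of a sparse symmetric matrix ... by some iterative methods" concrete, here is a minimal sketch that uses power iteration on a shifted graph Laplacian to find its second-smallest eigenvector and splits the nodes by sign. This is a generic spectral two-way partition, not the CLGR cost function; the helper `fiedler_vector`, the adjacency-list format, and the toy graph are assumptions for illustration only.

```python
import math


def fiedler_vector(adj, iters=500):
    """Approximate the second-smallest eigenvector of L = D - W.

    adj: list where adj[i] is a list of (neighbor, weight) pairs (symmetric).
    Only the nonzero entries are stored, so each iteration costs O(#edges).
    """
    n = len(adj)
    deg = [sum(w for _, w in adj[i]) for i in range(n)]
    shift = 2.0 * max(deg)  # Gershgorin bound on the largest eigenvalue of L
    v = [float(i) for i in range(n)]  # deterministic starting vector
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # project out the constant eigenvector
        # Apply (shift * I - L), whose top eigenvectors are L's bottom ones.
        lv = [deg[i] * v[i] - sum(w * v[j] for j, w in adj[i]) for i in range(n)]
        v = [shift * v[i] - lv[i] for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v


# Two triangles {0,1,2} and {3,4,5} joined by one weak edge (made-up data).
edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
         (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0), (2, 3, 0.1)]
graph = [[] for _ in range(6)]
for a, b, w in edges:
    graph[a].append((b, w))
    graph[b].append((a, w))

vec = fiedler_vector(graph)
labels = [0 if x < 0 else 1 for x in vec]
```

In practice a sparse eigensolver from a numerical library would replace the hand-written loop; the pure-Python version is kept only so the example is self-contained.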

IROS Conference 2007 Conference Paper

Generic and systematic evaluation of haptic interfaces based on testbeds

  • Evren Samur
  • Fei Wang
  • Ulrich Spaelter
  • Hannes Bleuler

The purpose of evaluation procedures is to achieve both qualitative and quantitative statements on haptic rendering realism and performance. Since a haptic interface provides an interaction between a user and a virtual environment, the fidelity of a haptic interface directly affects the performance. To our knowledge, a standard, generic and reusable validation method which comprehensively addresses all the attributes of haptic feedback has not been realized yet. Despite the large number of human factor studies, only a few of them have also been proposed for haptic interface performance measurements. For this reason, we review validation procedures for haptic rendering and propose an evaluation method based on testbeds to obtain a systematic haptic interface assessment. We integrated the approaches of human factor studies into the testbeds to obtain a simple yet complete measure of human-machine interaction performance. The testbeds were tested on a haptic interface, the IHP of Xitact SA, and performance results are presented. In the testbeds, performance metrics for generic haptic interaction tasks are expressed in terms of information transfer (bits) and sensory thresholds, which are device-specific benchmark metrics. Thus, the suitability of a haptic interface for a defined task can be verified, device comparisons become possible, and the obtained information can be used to identify possible improvements.