Arrow Research search

Author name cluster

Yiping Ke

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
2 author rows

Possible papers


EAAI Journal 2026 Journal Article

Model-based speech enhancement with spectral envelope correction using stacked autoencoders

  • Wenhao Lu
  • Zhenya Zang
  • Feng Qin
  • Xia Dong
  • Jie Han
  • Zuozhou Pan
  • Yiping Ke

Speech enhancement aims to improve the intelligibility and perceptual quality of noisy speech signals. Many deep learning-based denoising approaches have been developed in the past decade. Their training objectives are to minimize the overall error between predicted and target signals with various mathematical metrics. However, enhancing the perceptual quality is more dependent on preserving the inherent speech characteristics than on overall signal matching. Neglecting this aspect may limit the improvements. Hence, we propose a speech enhancement system that combines the Harmonic Noise Model (HNM) with Stacked Autoencoder (SAE)-based spectral envelope correction. The HNM framework reconstructs the harmonic structure, which is a key spectral feature that contributes to timbre and perceived loudness. Since the parameters used for HNM reconstruction are corrupted by background noise, we design spectral envelope correction modules for restoration. These modules adopt a cluster-specific training strategy. Input data with similar characteristics are first grouped to guide the neural network in learning specific feature representations. Then, within each cluster, the associated SAE builds a robust mapping between clean and noisy parameters by mitigating redundancy and random perturbations in the data. Experimental results verify the effectiveness of our scheme across various noise types and input signal-to-noise levels.

JBHI Journal 2026 Journal Article

Multi-Atlas Brain Network Classification Through Consistency Distillation and Complementary Information Fusion

  • Jiaxing Xu
  • Mengcheng Lan
  • Xia Dong
  • Kai He
  • Wei Zhang
  • Qingtian Bian
  • Yiping Ke

Brain network analysis plays a crucial role in identifying distinctive patterns associated with neurological disorders. Functional magnetic resonance imaging (fMRI) enables the construction of brain networks by analyzing correlations in blood-oxygen-level-dependent (BOLD) signals across different brain regions, known as regions of interest (ROIs). These networks are typically constructed using atlases that parcellate the brain based on various hypotheses of functional and anatomical divisions. However, there is no standard atlas for brain network classification, leading to limitations in detecting abnormalities in disorders. Recent methods leveraging multiple atlases fail to ensure consistency across atlases and lack effective ROI-level information exchange, limiting their efficacy. To address these challenges, we propose the Atlas-Integrated Distillation and Fusion network (AIDFusion), a novel framework designed to enhance brain network classification using fMRI data. AIDFusion introduces a disentangle Transformer to filter out inconsistent atlas-specific information and distill meaningful cross-atlas connections. Additionally, it enforces subject- and population-level consistency constraints to improve cross-atlas coherence. To further enhance feature integration, AIDFusion incorporates an inter-atlas message-passing mechanism that facilitates the fusion of complementary information across brain regions. We evaluate AIDFusion on four resting-state fMRI datasets encompassing different neurological disorders. Experimental results demonstrate its superior classification performance and computational efficiency compared to state-of-the-art methods. Furthermore, a case study highlights AIDFusion’s ability to extract interpretable patterns that align with established neuroscience findings, reinforcing its potential as a robust tool for multi-atlas brain network analysis.
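
The correlation-based network construction mentioned in this abstract can be illustrated with a minimal sketch. It assumes each ROI is already summarized as one BOLD time series and uses plain Pearson correlation; the function names and toy data are hypothetical and are not taken from AIDFusion.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long time series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)

def correlation_network(signals):
    """Build a symmetric ROI-by-ROI matrix of pairwise correlations.

    `signals` is a list with one time series per ROI (illustrative input).
    """
    k = len(signals)
    return [
        [1.0 if i == j else pearson(signals[i], signals[j]) for j in range(k)]
        for i in range(k)
    ]

# Toy example with three made-up ROI time series.
toy_signals = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1]]
network = correlation_network(toy_signals)
```

A different atlas changes how voxels are grouped into ROIs, and therefore the size and entries of this matrix, which is why results can differ across atlases.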

ICLR Conference 2025 Conference Paper

BrainOOD: Out-of-distribution Generalizable Brain Network Analysis

  • Jiaxing Xu
  • Yongqiang Chen
  • Xia Dong
  • Mengcheng Lan
  • Tiancheng Huang
  • Qingtian Bian
  • James Cheng
  • Yiping Ke

In neuroscience, identifying distinct patterns linked to neurological disorders, such as Alzheimer's and Autism, is critical for early diagnosis and effective intervention. Graph Neural Networks (GNNs) have shown promise in analyzing brain networks, but there are two major challenges in using GNNs: (1) distribution shifts in multi-site brain network data, leading to poor Out-of-Distribution (OOD) generalization, and (2) limited interpretability in identifying key brain regions critical to neurological disorders. Existing graph OOD methods, while effective in other domains, struggle with the unique characteristics of brain networks. To bridge these gaps, we introduce BrainOOD, a novel framework tailored for brain networks that enhances GNNs' OOD generalization and interpretability. The BrainOOD framework consists of a feature selector and a structure extractor, which incorporates various auxiliary losses including an improved Graph Information Bottleneck (GIB) objective to recover causal subgraphs. By aligning structure selection across brain networks and filtering noisy features, BrainOOD offers reliable interpretations of critical brain regions. Our approach outperforms 16 existing methods and improves generalization to OOD subjects by up to 8.5%. Case studies highlight the scientific validity of the extracted patterns, which align with findings in the neuroscience literature. We also propose the first OOD brain network benchmark, which provides a foundation for future research in this field. Our code is available at https://github.com/AngusMonroe/BrainOOD.

ICLR Conference 2025 Conference Paper

Text4Seg: Reimagining Image Segmentation as Text Generation

  • Mengcheng Lan
  • Chaofeng Chen
  • Yue Zhou 0005
  • Jiaxing Xu
  • Yiping Ke
  • Xinjiang Wang
  • Litong Feng
  • Wayne Zhang 0001

Multimodal Large Language Models (MLLMs) have shown exceptional capabilities in vision-language tasks; however, effectively integrating image segmentation into these models remains a significant challenge. In this paper, we introduce Text4Seg, a novel text-as-mask paradigm that casts image segmentation as a text generation problem, eliminating the need for additional decoders and significantly simplifying the segmentation process. Our key innovation is semantic descriptors, a new textual representation of segmentation masks where each image patch is mapped to its corresponding text label. This unified representation allows seamless integration into the auto-regressive training pipeline of MLLMs for easier optimization. We demonstrate that representing an image with $16\times16$ semantic descriptors yields competitive segmentation performance. To enhance efficiency, we introduce the Row-wise Run-Length Encoding (R-RLE), which compresses redundant text sequences, reducing the length of semantic descriptors by 74% and accelerating inference by $3\times$, without compromising performance. Extensive experiments across various vision tasks, such as referring expression segmentation and comprehension, show that Text4Seg achieves state-of-the-art performance on multiple datasets by fine-tuning different MLLM backbones. Our approach provides an efficient, scalable solution for vision-centric tasks within the MLLM framework.
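
The idea of compressing a grid of per-patch labels row by row can be sketched as follows. This is only an illustration of run-length encoding applied per row; the `label *count` text layout and the `|` separator are assumptions made for the example, not the exact serialization used by Text4Seg.

```python
from itertools import groupby

def rrle_encode(grid):
    """Compress each row of a label grid into (label, run_length) pairs."""
    return [
        [(label, len(list(run))) for label, run in groupby(row)]
        for row in grid
    ]

def rrle_decode(encoded):
    """Expand (label, run_length) pairs back into the full label grid."""
    return [
        [label for label, count in row for _ in range(count)]
        for row in encoded
    ]

def to_text(encoded):
    """Serialize runs as text; this layout is hypothetical."""
    return "\n".join(
        " | ".join(f"{label} *{count}" for label, count in row)
        for row in encoded
    )

# Toy 2x4 grid of patch labels (made-up data).
grid = [
    ["sky", "sky", "sky", "tree"],
    ["sky", "dog", "dog", "tree"],
]
encoded = rrle_encode(grid)
print(to_text(encoded))
```

Encoding each row separately keeps runs from spanning row boundaries, so the decoded text still maps back to a fixed-width grid.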

AAAI Conference 2024 Conference Paper

Union Subgraph Neural Networks

  • Jiaxing Xu
  • Aihu Zhang
  • Qingtian Bian
  • Vijay Prakash Dwivedi
  • Yiping Ke

Graph Neural Networks (GNNs) are widely used for graph representation learning in many application domains. The expressiveness of vanilla GNNs is upper-bounded by the 1-dimensional Weisfeiler-Leman (1-WL) test as they operate on rooted subtrees through iterative message passing. In this paper, we empower GNNs by injecting neighbor-connectivity information extracted from a new type of substructure. We first investigate different kinds of connectivities existing in a local neighborhood and identify a substructure called union subgraph, which is able to capture the complete picture of the 1-hop neighborhood of an edge. We then design a shortest-path-based substructure descriptor that possesses three nice properties and can effectively encode the high-order connectivities in union subgraphs. By infusing the encoded neighbor connectivities, we propose a novel model, namely Union Subgraph Neural Network (UnionSNN), which is proven to be strictly more powerful than 1-WL in distinguishing non-isomorphic graphs. Additionally, the local encoding from union subgraphs can also be injected into arbitrary message-passing neural networks (MPNNs) and Transformer-based models as a plugin. Extensive experiments on 18 benchmarks of both graph-level and node-level tasks demonstrate that UnionSNN outperforms state-of-the-art baseline models, with competitive computational efficiency. The injection of our local encoding to existing models is able to boost the performance by up to 11.09%. Our code is available at https://github.com/AngusMonroe/UnionSNN.

NeurIPS Conference 2023 Conference Paper

Data-Driven Network Neuroscience: On Data Collection and Benchmark

  • Jiaxing Xu
  • Yunhan Yang
  • David Huang
  • Sophi Shilpa Gururajapathy
  • Yiping Ke
  • Miao Qiao
  • Alan Wang
  • Haribalan Kumar

This paper presents a comprehensive, high-quality collection of functional human brain network data for potential research in the intersection of neuroscience, machine learning, and graph analytics. Anatomical and functional MRI images have been used to understand the functional connectivity of the human brain and are particularly important in identifying underlying neurodegenerative conditions such as Alzheimer's, Parkinson's, and Autism. Recently, the study of the brain in the form of brain networks using machine learning and graph analytics has become increasingly popular, especially to predict the early onset of these conditions. A brain network, represented as a graph, retains rich structural and positional information that traditional examination methods are unable to capture. However, the lack of publicly accessible brain network data prevents researchers from data-driven explorations. One of the main difficulties lies in the complicated domain-specific preprocessing steps and the exhaustive computation required to convert the data from MRI images into brain networks. We bridge this gap by collecting a large amount of MRI images from public databases and a private source, working with domain experts to make sensible design choices, and preprocessing the MRI images to produce a collection of brain network datasets. The datasets originate from 6 different sources, cover 4 brain conditions, and consist of a total of 2,702 subjects. We test our graph datasets on 12 machine learning models to provide baselines and validate the data quality on a recent graph analysis model. To lower the barrier to entry and promote research in this interdisciplinary field, we release our brain network data and complete preprocessing details including code at https://doi.org/10.17608/k6.auckland.21397377 and https://github.com/brainnetuoa/data_driven_network_neuroscience.

NeurIPS Conference 2023 Conference Paper

SmooSeg: Smoothness Prior for Unsupervised Semantic Segmentation

  • Mengcheng Lan
  • Xinjiang Wang
  • Yiping Ke
  • Jiaxing Xu
  • Litong Feng
  • Wayne Zhang

Unsupervised semantic segmentation is a challenging task that segments images into semantic groups without manual annotation. Prior works have primarily focused on leveraging prior knowledge of semantic consistency or priori concepts from self-supervised learning methods, which often overlook the coherence property of image segments. In this paper, we demonstrate that the smoothness prior, asserting that close features in a metric space share the same semantics, can significantly simplify segmentation by casting unsupervised semantic segmentation as an energy minimization problem. Under this paradigm, we propose a novel approach called SmooSeg that harnesses self-supervised learning methods to model the closeness relationships among observations as smoothness signals. To effectively discover coherent semantic segments, we introduce a novel smoothness loss that promotes piecewise smoothness within segments while preserving discontinuities across different segments. Additionally, to further enhance segmentation quality, we design an asymmetric teacher-student style predictor that generates smoothly updated pseudo labels, facilitating an optimal fit between observations and labeling outputs. Thanks to the rich supervision cues of the smoothness prior, our SmooSeg significantly outperforms STEGO in terms of pixel accuracy on three datasets: COCOStuff (+14.9%), Cityscapes (+13.0%), and Potsdam-3 (+5.7%).

AAAI Conference 2020 Conference Paper

Tweedie-Hawkes Processes: Interpreting the Phenomena of Outbreaks

  • Tianbo Li
  • Yiping Ke

Self-exciting event sequences, in which the occurrence of an event increases the probability of triggering subsequent ones, are common in many disciplines. In this paper, we propose a Bayesian model called Tweedie-Hawkes Processes (THP), which is able to model the outbreaks of events and identify the dominant factors behind them. THP leverages the Tweedie distribution to capture various excitation effects. A variational EM algorithm is developed for model inference. Some theoretical properties of THP, including sub-criticality, convergence of the learning algorithm, and the kernel selection method, are discussed. Applications to epidemiology and information diffusion analysis demonstrate the versatility of our model in various disciplines. Evaluations on real-world datasets show that THP outperforms the rival state-of-the-art baselines in the task of forecasting future events.
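
Self-excitation and sub-criticality can be made concrete with a textbook Hawkes process. The sketch below assumes an exponential kernel with hypothetical parameters `mu`, `alpha`, and `beta`; it is the classical model, not the Tweedie-based THP described in the abstract.

```python
import math

def hawkes_intensity(t, history, mu, alpha, beta):
    """Conditional intensity of a classical exponential-kernel Hawkes process.

    lambda(t) = mu + sum over past events t_i < t of alpha * exp(-beta * (t - t_i))
    """
    return mu + sum(
        alpha * math.exp(-beta * (t - t_i)) for t_i in history if t_i < t
    )

def branching_ratio(alpha, beta):
    """Expected number of events directly triggered by one event.

    For the exponential kernel this is the integral of alpha * exp(-beta * s)
    over s >= 0, i.e. alpha / beta; values below 1 mean the process is
    sub-critical and does not explode.
    """
    return alpha / beta

# Made-up event times and parameters, for illustration only.
events = [1.0, 1.2, 1.3]
baseline = hawkes_intensity(0.5, events, mu=0.2, alpha=0.8, beta=1.0)
after_burst = hawkes_intensity(1.4, events, mu=0.2, alpha=0.8, beta=1.0)
```

Here `after_burst` exceeds `baseline` because each recent event adds a decaying contribution to the intensity, which is the outbreak-like behavior the abstract refers to.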

NeurIPS Conference 2019 Conference Paper

Thinning for Accelerating the Learning of Point Processes

  • Tianbo Li
  • Yiping Ke

This paper discusses one of the most fundamental questions about point processes: what is the best sampling method for them? We propose thinning as a downsampling method for accelerating the learning of point processes. We find that the thinning operation preserves the structure of the intensity and makes it possible to estimate parameters in less time and without much loss of accuracy. Theoretical results including intensity, parameter and gradient estimation on a thinned history are presented for point processes with decouplable intensities. A stochastic optimization algorithm based on the thinned gradient is proposed. Experimental results on synthetic and real-world datasets validate the effectiveness of thinning in the tasks of parameter and gradient estimation, as well as stochastic optimization.
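
A minimal sketch of thinning as downsampling, under the simplifying assumption of a homogeneous Poisson process: each event is kept independently with probability `p`, and the rate is then estimated from the thinned history by rescaling with `p`. The function names are hypothetical, and the paper's results cover a broader class of processes than this special case.

```python
import random

def simulate_poisson(rate, horizon, rng):
    """Simulate event times of a homogeneous Poisson process on [0, horizon]."""
    t, events = 0.0, []
    while True:
        t += rng.expovariate(rate)  # exponential inter-arrival times
        if t > horizon:
            return events
        events.append(t)

def thin(events, p, rng):
    """p-thinning: keep each event independently with probability p."""
    return [t for t in events if rng.random() < p]

def poisson_rate_estimate(events, horizon, p=1.0):
    """Maximum-likelihood rate estimate, rescaled for a p-thinned history."""
    return len(events) / (p * horizon)

rng = random.Random(0)
full_history = simulate_poisson(rate=5.0, horizon=200.0, rng=rng)
thinned_history = thin(full_history, p=0.5, rng=rng)

full_estimate = poisson_rate_estimate(full_history, 200.0)
thinned_estimate = poisson_rate_estimate(thinned_history, 200.0, p=0.5)
```

The thinned estimate uses roughly half of the events, so it is cheaper to compute, at the cost of a somewhat larger variance than the estimate from the full history.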

UAI Conference 2017 Conference Paper

A Fast Algorithm for Matrix Eigen-decomposition

  • Zhiqiang Xu 0003
  • Yiping Ke
  • Xin Gao 0001

We propose a fast stochastic Riemannian gradient eigensolver for a real and symmetric matrix, and prove its local, eigengap-dependent and linear convergence. The fast convergence comes from deploying the variance reduction technique, which was originally developed for Euclidean strongly convex problems. In this paper, this technique is generalized to Riemannian manifolds for solving the geodesically non-convex problem of finding a group of top eigenvectors of such a matrix. We first propose the general variance reduction form of the stochastic Riemannian gradient, giving rise to the stochastic variance reduced Riemannian gradient method (SVRRG). It turns out that the operation of vector transport is necessary in addition to using Riemannian gradients and retraction operations. We then specialize it to the problem in question, resulting in our SVRRGEIGS algorithm. We are among the first to propose and analyze the generalization of the stochastic variance reduced gradient (SVRG) to Riemannian manifolds. As an extension of the linearly convergent VR-PCA, it is significant and nontrivial for the proposed algorithm to theoretically achieve a further speedup and empirically make a difference, owing to its respect for the inherent geometry of the problem.

ICML Conference 2017 Conference Paper

Source-Target Similarity Modelings for Multi-Source Transfer Gaussian Process Regression

  • Pengfei Wei 0001
  • Ramón Sagarna
  • Yiping Ke
  • Yew-Soon Ong
  • Chi-Keong Goh

A key challenge in multi-source transfer learning is to capture the diverse inter-domain similarities. In this paper, we study different approaches based on Gaussian process models to solve the multi-source transfer regression problem. Precisely, we first investigate the feasibility and performance of a family of transfer covariance functions that represent the pairwise similarity of each source and the target domain. We theoretically show that using such a transfer covariance function for general Gaussian process modelling can only capture the same similarity coefficient for all the sources, and thus may result in unsatisfactory transfer performance. This leads us to propose TC$_{MS}$Stack, an integrated strategy incorporating the benefits of the transfer covariance function and stacking. Extensive experiments on one synthetic and two real-world datasets, with learning settings of up to 11 sources for the latter, demonstrate the effectiveness of our proposed TC$_{MS}$Stack.

IJCAI Conference 2016 Conference Paper

Deep Nonlinear Feature Coding for Unsupervised Domain Adaptation

  • Pengfei Wei
  • Yiping Ke
  • Chi Keong Goh

Deep feature learning has recently emerged with demonstrated effectiveness in domain adaptation. In this paper, we propose a Deep Nonlinear Feature Coding framework (DNFC) for unsupervised domain adaptation. DNFC builds on the marginalized stacked denoising autoencoder (mSDA) to extract rich deep features. We introduce two new elements to mSDA: domain divergence minimization by Maximum Mean Discrepancy (MMD), and nonlinear coding by kernelization. These two elements are essential for domain adaptation as they ensure that the extracted deep features have a small distribution discrepancy and encode data nonlinearity. The effectiveness of DNFC is verified by extensive experiments on benchmark datasets. Specifically, DNFC attains much higher prediction accuracy than state-of-the-art domain adaptation methods. Compared to its basis mSDA, DNFC is able to achieve remarkable prediction improvement while converging much faster with a small number of stacked layers.
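
The Maximum Mean Discrepancy used here as a domain divergence measure can be illustrated with a generic biased sample estimate. The RBF kernel, its `gamma` value, and the toy source and target samples are assumptions chosen for the example; the sketch shows only the statistic itself, not how DNFC integrates it into mSDA.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd_squared(source, target, kernel=rbf_kernel):
    """Biased sample estimate of squared MMD between two sets of vectors.

    MMD^2 = mean k(s, s') + mean k(t, t') - 2 * mean k(s, t)
    """
    def mean_kernel(a, b):
        return sum(kernel(u, v) for u in a for v in b) / (len(a) * len(b))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )

# Made-up source and target features; the target is a shifted copy.
source_features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
target_features = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0]]

same_domain = mmd_squared(source_features, source_features)
shifted_domain = mmd_squared(source_features, target_features)
```

A value near zero means the two samples look alike under the chosen kernel, so minimizing this quantity pushes source and target features toward a shared distribution.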