Arrow Research search

Author name cluster

Jing Gao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

16 papers
1 author row

Possible papers

TIST Journal 2026 Journal Article

LKAFormer: A Lightweight Kolmogorov-Arnold Transformer Model for Image Semantic Segmentation

  • Shoulin Yin
  • Liguo Wang
  • Tao Chen
  • Huafei Huang
  • Jing Gao
  • Jianing Zhang
  • Meng Liu
  • Peng Li

Transformer-based semantic segmentation methods have demonstrated outstanding performance by leveraging global self-attention to capture long-range dependence. However, two issues remain in existing work: (1) Most methods use full-rank weight matrices in the self-attention mechanism and feed-forward network to model long-range dependence between patches/pixels, resulting in a high computational cost during both training and inference. (2) Most methods ignore information interactions between high-level semantics and low-level structures during image resolution recovery, which degrades performance when segmenting objects with complex boundaries. To tackle these challenges, a lightweight Kolmogorov-Arnold Transformer model (LKAFormer) is proposed for image semantic segmentation, containing a two-stream lightweight Transformer encoder and a graph feature pyramid aggregation KAN-decoder. The former constructs a hierarchical cross-scale feature fusion pipeline that obtains semantics with comprehensive multi-scale information by setting up coarse-grained and fine-grained streams over image patches of different sizes. In that pipeline, lightweight feature focusing modules model complex, long-range dependence across patches/pixels to refine image semantics at lower computational cost, using lightweight multi-head self-attention and lightweight feed-forward network designs. The latter leverages the learnable nonlinear transformation mechanism of the Kolmogorov-Arnold Transformer architecture to adaptively capture the spatial structure dependence of distinct image sub-regions. It then jointly performs intra-scale and cross-scale graph fusion during image resolution recovery to enhance information interactions between high-level semantics and low-level structures, which yields robust boundary localization and texture refinement of segmented objects. Finally, extensive experiments are conducted on three challenging datasets, and the results show that LKAFormer sets a new baseline in the image segmentation task in comparison with 11 methods.

YNIMG Journal 2026 Journal Article

Standardized quantification of [18F]Florbetazine amyloid PET with the Centiloid scale

  • Meiqi Wu
  • Menglin Liang
  • Chenhui Mao
  • Liling Dong
  • Qi Ge
  • Yuying Li
  • Jingnan Wang
  • Chao Ren

… [11C]PiB across different image-processing pipelines and effective image resolutions (EIRs). METHODS: … [11C]PiB SUVR were evaluated under different EIRs. RESULTS: … [18F]FBZ SUVR were observed across EIRs with the SPM pipeline, whereas regression parameters varied across EIRs with the FreeSurfer pipeline. CONCLUSION: [18F]FBZ demonstrated equal or improved quantification precision, supporting its broader use in clinical and research Aβ imaging.
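For readers unfamiliar with the Centiloid scale named in the title: it linearly rescales an SUVR so that a young-control group averages 0 and a typical-AD group averages 100. The sketch below shows only that anchoring step. The function name and the anchor values are illustrative assumptions, not figures from this paper, and for a non-PiB tracer the SUVR is usually first converted to a PiB-equivalent value by a linear regression that is omitted here.

```python
def suvr_to_centiloid(suvr: float, suvr_young_control: float, suvr_typical_ad: float) -> float:
    """Linearly rescale an SUVR so the two anchor group means map to 0 and 100."""
    if suvr_typical_ad == suvr_young_control:
        raise ValueError("anchor SUVRs must differ")
    return 100.0 * (suvr - suvr_young_control) / (suvr_typical_ad - suvr_young_control)


# Hypothetical anchors: young controls average 1.0, typical AD averages 2.0.
example_cl = suvr_to_centiloid(1.5, suvr_young_control=1.0, suvr_typical_ad=2.0)
```

With these made-up anchors, an SUVR halfway between the two group means lands at the midpoint of the scale.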

NeurIPS Conference 2024 Conference Paper

Counterfactual Fairness by Combining Factual and Counterfactual Predictions

  • Zeyu Zhou
  • Tianci Liu
  • Ruqi Bai
  • Jing Gao
  • Murat Kocaoglu
  • David I. Inouye

In high-stakes domains such as healthcare and hiring, the role of machine learning (ML) in decision-making raises significant fairness concerns. This work focuses on Counterfactual Fairness (CF), which posits that an ML model's outcome on any individual should remain unchanged if they had belonged to a different demographic group. Previous works have proposed methods that guarantee CF. Notwithstanding, their effects on the model's predictive performance remain largely unclear. To fill this gap, we provide a theoretical study on the inherent trade-off between CF and predictive performance in a model-agnostic manner. We first propose a simple but effective method to cast an optimal but potentially unfair predictor into a fair one with a minimal loss of performance. By analyzing the excess risk incurred by perfect CF, we quantify this inherent trade-off. Further analysis on our method's performance with access to only incomplete causal knowledge is also conducted. Built upon this, we propose a practical algorithm that can be applied in such scenarios. Experiments on both synthetic and semi-synthetic datasets demonstrate the validity of our analysis and methods.

YNIMG Journal 2024 Journal Article

Evaluation of a novel PET tracer [18F]-Florbetazine for Alzheimer's disease diagnosis and β-amyloid deposition quantification

  • Meiqi Wu
  • Chao Ren
  • Chenhui Mao
  • Liling Dong
  • Bo Li
  • Xueqian Yang
  • Zhenghai Huang
  • Haiqiong Zhang

… ([18F]-92) is a selective PET tracer for β-amyloid (Aβ) depositions with a novel diaryl-azine scaffold to reduce lipophilicity and to achieve higher gray-to-white matter contrast. We aimed to assess its diagnostic value in Alzheimer's disease (AD) and its pharmacokinetic characteristics in human subjects. METHODS: … [18F]-Florbetazine and a structural MRI scan. The time-activity curves (TACs) for volumes of interest (VOIs) in cerebral cortex, cerebellar cortex and cerebral white matter were depicted, and their standardized uptake value ratios (SUVRs) with cerebellar cortex as reference were compared between HCs and AD patients. The cerebral gray-to-white matter SUV ratio (GWR) was also calculated. RESULTS: In HCs, radioactivity in the cerebral cortex VOIs was homogeneously low and at the same level as in cerebellar cortex, while in AD patients, cortical VOIs expected to contain Aβ exhibited high radioactivity. Cerebral cortex SUVRs remained relatively low in HCs, while they kept increasing over time in AD patients. After 15 min, the cerebral cortex SUVRs became significantly higher in AD patients compared to HCs, with 100% discrimination accuracy. In AD patients, GWR remained over 1.3 for all time intervals and visual inspection showed lower uptake in cerebral white matter compared to cerebral cortex. CONCLUSION: [18F]-Florbetazine can potentially be used for detection and quantification of Aβ depositions in the living human brain.
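The SUVR used throughout this abstract is a simple ratio: mean uptake in a target VOI divided by mean uptake in the reference region (cerebellar cortex here). A minimal sketch follows; the function name and the voxel values are made up for illustration.

```python
from statistics import fmean


def suvr(target_uptake: list[float], reference_uptake: list[float]) -> float:
    """Mean uptake in the target VOI divided by mean uptake in the reference VOI."""
    reference_mean = fmean(reference_uptake)
    if reference_mean == 0:
        raise ValueError("reference region has zero mean uptake")
    return fmean(target_uptake) / reference_mean


# Hypothetical SUV samples, not data from the study.
cerebral_cortex = [2.0, 3.0, 1.0]
cerebellar_cortex = [1.0, 1.0, 1.0]
example_suvr = suvr(cerebral_cortex, cerebellar_cortex)
```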

NeurIPS Conference 2024 Conference Paper

FIARSE: Model-Heterogeneous Federated Learning via Importance-Aware Submodel Extraction

  • Feijie Wu
  • Xingchen Wang
  • Yaqing Wang
  • Tianci Liu
  • Lu Su
  • Jing Gao

In federated learning (FL), accommodating clients' varied computational capacities poses a challenge, often limiting the participation of those with constrained resources in global model training. To address this issue, the concept of model heterogeneity through submodel extraction has emerged, offering a tailored solution that aligns the model's complexity with each client's computational capacity. In this work, we propose Federated Importance-Aware Submodel Extraction (FIARSE), a novel approach that dynamically adjusts submodels based on the importance of model parameters, thereby overcoming the limitations of previous static and dynamic submodel extraction methods. Compared to existing works, the proposed method offers a theoretical foundation for the submodel extraction and eliminates the need for additional information beyond the model parameters themselves to determine parameter importance, significantly reducing the overhead on clients. Extensive experiments are conducted on various datasets to showcase the superior performance of the proposed FIARSE.
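To make the idea of extracting a submodel from parameter importance concrete, here is a minimal sketch. It assumes importance is measured by absolute parameter magnitude, which is an illustrative choice consistent with needing no information beyond the parameters themselves, and not necessarily FIARSE's exact rule. The function name and the weights are hypothetical.

```python
def magnitude_mask(weights: list[float], keep_fraction: float) -> list[int]:
    """Return a 0/1 mask keeping the largest-magnitude weights (assumed importance rule)."""
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    keep_count = max(1, int(len(weights) * keep_fraction))
    # Rank parameter indices from most to least important by |w|.
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    kept = set(ranked[:keep_count])
    return [1 if i in kept else 0 for i in range(len(weights))]


# Hypothetical global parameters shared by all clients.
weights = [0.05, -1.2, 0.4, -0.01, 0.9, 0.3]
small_client_mask = magnitude_mask(weights, 0.5)  # constrained client trains half the model
large_client_mask = magnitude_mask(weights, 1.0)  # capable client trains the full model
```

Because the mask is recomputed from the current weights, the submodel changes as training moves the parameters, which is the sense in which such extraction is dynamic.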

NeurIPS Conference 2024 Conference Paper

Model LEGO: Creating Models Like Disassembling and Assembling Building Blocks

  • Jiacong Hu
  • Jing Gao
  • Jingwen Ye
  • Yang Gao
  • Xingen Wang
  • Zunlei Feng
  • Mingli Song

With the rapid development of deep learning, the increasing complexity and scale of parameters make training a new model increasingly resource-intensive. In this paper, we start from the classic convolutional neural network (CNN) and explore a paradigm that does not require training to obtain new models. Similar to the birth of CNN inspired by receptive fields in the biological visual system, we draw inspiration from the information subsystem pathways in the biological visual system and propose Model Disassembling and Assembling (MDA). During model disassembling, we introduce the concept of relative contribution and propose a component locating technique to extract task-aware components from trained CNN classifiers. For model assembling, we present the alignment padding strategy and parameter scaling strategy to construct a new model tailored for a specific task, utilizing the disassembled task-aware components. The entire process is akin to playing with LEGO bricks, enabling arbitrary assembly of new models, and providing a novel perspective for model creation and reuse. Extensive experiments showcase that task-aware components disassembled from CNN classifiers or new models assembled using these components closely match or even surpass the performance of the baseline, demonstrating its promising results for model reuse. Furthermore, MDA exhibits diverse potential applications, with comprehensive experiments exploring model decision route analysis, model compression, knowledge distillation, and more.

AAAI Conference 2023 Conference Paper

SimFair: A Unified Framework for Fairness-Aware Multi-Label Classification

  • Tianci Liu
  • Haoyu Wang
  • Yaqing Wang
  • Xiaoqian Wang
  • Lu Su
  • Jing Gao

Recent years have witnessed increasing concerns about unfair decisions made by machine learning algorithms. To improve fairness in model decisions, various fairness notions have been proposed and many fairness-aware methods have been developed. However, most existing definitions and methods focus only on single-label classification. Fairness for multi-label classification, where each instance is associated with more than one label, is yet to be established. To fill this gap, we study fairness-aware multi-label classification in this paper. We start by extending Demographic Parity (DP) and Equalized Opportunity (EOp), two popular fairness notions, to multi-label classification scenarios. Through a systematic study, we show that on multi-label data, because of unevenly distributed labels, EOp usually fails to construct a reliable estimate on labels with few instances. We then propose a new framework named Similarity s-induced Fairness (sγ-SimFair). This new framework utilizes data that have similar labels when estimating fairness on a particular label group for better stability, and can unify DP and EOp. Theoretical analysis and experimental results on real-world datasets together demonstrate the advantage of sγ-SimFair over existing methods on multi-label classification tasks.
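The two fairness notions named above can be sketched for a single binary label and a binary sensitive attribute. This is the standard single-label form, not the paper's multi-label extension; the function names and the toy data are illustrative assumptions.

```python
def positive_rate(preds: list[int], mask: list[bool]) -> float:
    """Fraction of positive predictions among the selected instances."""
    selected = [p for p, m in zip(preds, mask) if m]
    if not selected:
        raise ValueError("no instances selected; the rate is undefined")
    return sum(selected) / len(selected)


def demographic_parity_gap(preds: list[int], groups: list[int]) -> float:
    """|P(pred=1 | group=0) - P(pred=1 | group=1)|."""
    rate_0 = positive_rate(preds, [g == 0 for g in groups])
    rate_1 = positive_rate(preds, [g == 1 for g in groups])
    return abs(rate_0 - rate_1)


def equal_opportunity_gap(preds: list[int], labels: list[int], groups: list[int]) -> float:
    """Same gap, restricted to instances whose true label is positive."""
    rate_0 = positive_rate(preds, [g == 0 and y == 1 for g, y in zip(groups, labels)])
    rate_1 = positive_rate(preds, [g == 1 and y == 1 for g, y in zip(groups, labels)])
    return abs(rate_0 - rate_1)


# Toy data: six instances, three per group.
preds = [1, 0, 1, 1, 0, 0]
labels = [1, 0, 1, 1, 1, 0]
groups = [0, 0, 0, 1, 1, 1]
dp_gap = demographic_parity_gap(preds, groups)
eop_gap = equal_opportunity_gap(preds, labels, groups)
```

The EOp estimate conditions on positive instances only, so for a rare label each group's rate is computed from very few samples. That is the instability the abstract describes.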

AAAI Conference 2023 Conference Paper

Towards Reliable Item Sampling for Recommendation Evaluation

  • Dong Li
  • Ruoming Jin
  • Zhenming Liu
  • Bin Ren
  • Jing Gao
  • Zhi Liu

Since Rendle and Krichene argued that commonly used sampling-based evaluation metrics are "inconsistent" with respect to the global metrics (even in expectation), there have been a few studies on sampling-based recommender system evaluation. Existing methods try either mapping the sampling-based metrics to their global counterparts or, more generally, learning the empirical rank distribution to estimate the top-K metrics. However, despite existing efforts, there is still a lack of rigorous theoretical understanding of the proposed metric estimators, and basic item sampling also suffers from the "blind spot" issue, i.e., the error in recovering the top-K metrics when K is small can still be rather substantial. In this paper, we provide an in-depth investigation into these problems and make two innovative contributions. First, we propose a new item-sampling estimator that explicitly optimizes the error with respect to the ground truth, and theoretically highlight its subtle difference from prior work. Second, we propose a new adaptive sampling method that aims to deal with the "blind spot" problem and also demonstrate that the expectation-maximization (EM) algorithm can be generalized for such a setting. Our experimental results confirm our statistical analysis and the superiority of the proposed methods. This study helps lay the theoretical foundation for adopting item sampling metrics for recommendation evaluation and provides strong evidence for making item sampling a powerful and reliable tool for recommendation evaluation.

AAAI Conference 2021 Conference Paper

On Estimating Recommendation Evaluation Metrics under Sampling

  • Ruoming Jin
  • Dong Li
  • Benjamin Mudrak
  • Jing Gao
  • Zhi Liu

Since the recent study by Krichene and Rendle on the sampling-based top-k evaluation metric for recommendation, there has been much debate on the validity of using sampling to evaluate recommendation algorithms. Though their work and the recent work by Li et al. have proposed some basic approaches for mapping the sampling-based metrics to their global counterparts, which rank the entire set of items, there is still a lack of understanding and consensus on how sampling should be used for recommendation evaluation. The proposed approaches either are rather uninformative (linking sampling to metric evaluation) or can only work on simple metrics, such as Recall/Precision. In this paper, we introduce a new research problem on learning the empirical rank distribution, and a new approach based on the estimated rank distribution, to estimate the top-k metrics. Since this question is closely related to the underlying mechanism of sampling for recommendation, tackling it can help better understand the power of sampling and can help resolve the questions of whether and how we should use sampling for evaluating recommendation. We introduce two approaches based on MLE (Maximal Likelihood Estimation) and its weighted variants, and ME (Maximal Entropy) principles to recover the empirical rank distribution, and then utilize them for metrics estimation. The experimental results show the advantages of using the new approaches for evaluating recommendation algorithms based on top-k metrics.
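The gap between sampled and global top-k metrics can be seen in a small deterministic sketch: the same target item misses the global top 3 but lands in the top 3 when ranked against only a few sampled items. The scores and the sample are made up, and the helper names are hypothetical.

```python
def rank_of_target(target_score: float, other_scores: list[float]) -> int:
    """1-based rank of the target among the other items (higher score ranks first)."""
    return 1 + sum(1 for s in other_scores if s > target_score)


def hit_at_k(rank: int, k: int) -> int:
    """1 if the target appears in the top k, else 0 (per-user Recall@k with one relevant item)."""
    return 1 if rank <= k else 0


target_score = 0.5
all_other_scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]  # every non-target item
sampled_scores = [0.9, 0.2, 0.1]                   # a fixed, hypothetical sample of three

global_rank = rank_of_target(target_score, all_other_scores)
sampled_rank = rank_of_target(target_score, sampled_scores)
global_hit = hit_at_k(global_rank, 3)
sampled_hit = hit_at_k(sampled_rank, 3)
```

The sampled rank is optimistic because most higher-scoring items were not drawn, which is why a mapping or an estimate of the rank distribution is needed to recover the global metric.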

AAAI Conference 2020 Conference Paper

Weak Supervision for Fake News Detection via Reinforcement Learning

  • Yaqing Wang
  • Weifeng Yang
  • Fenglong Ma
  • Jin Xu
  • Bin Zhong
  • Qiang Deng
  • Jing Gao

Today social media has become the primary source for news. Via social media platforms, fake news travels at unprecedented speeds, reaches global audiences and puts users and communities at great risk. Therefore, it is extremely important to detect fake news as early as possible. Recently, deep learning based approaches have shown improved performance in fake news detection. However, the training of such models requires a large amount of labeled data, and manual annotation is time-consuming and expensive. Moreover, due to the dynamic nature of news, annotated samples may become outdated quickly and cannot represent the news articles on newly emerged events. Therefore, how to obtain fresh and high-quality labeled samples is the major challenge in employing deep learning models for fake news detection. In order to tackle this challenge, we propose a reinforced weakly-supervised fake news detection framework, i.e., WeFEND, which can leverage users’ reports as weak supervision to enlarge the amount of training data for fake news detection. The proposed framework consists of three main components: the annotator, the reinforced selector and the fake news detector. The annotator can automatically assign weak labels for unlabeled news based on users’ reports. The reinforced selector, using reinforcement learning techniques, chooses high-quality samples from the weakly labeled data and filters out low-quality ones that may degrade the detector’s prediction performance. The fake news detector aims to identify fake news based on the news content. We tested the proposed framework on a large collection of news articles published via WeChat official accounts and associated user reports. Extensive experiments on this dataset show that the proposed WeFEND model achieves the best performance compared with the state-of-the-art methods.

IJCAI Conference 2019 Conference Paper

Data Poisoning Attack against Knowledge Graph Embedding

  • Hengtong Zhang
  • Tianhang Zheng
  • Jing Gao
  • Chenglin Miao
  • Lu Su
  • Yaliang Li
  • Kui Ren

Knowledge graph embedding (KGE) is a technique for learning continuous embeddings for entities and relations in the knowledge graph. Due to its benefit to a variety of downstream tasks such as knowledge graph completion, question answering and recommendation, KGE has gained significant attention recently. Despite its effectiveness in a benign environment, KGE's robustness to adversarial attacks is not well-studied. Existing attack methods on graph data cannot be directly applied to attack the embeddings of knowledge graph due to its heterogeneity. To fill this gap, we propose a collection of data poisoning attack strategies, which can effectively manipulate the plausibility of arbitrary targeted facts in a knowledge graph by adding or deleting facts on the graph. The effectiveness and efficiency of the proposed attack strategies are verified by extensive evaluations on two widely-used benchmarks.
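The "plausibility" that a poisoning attack tries to shift is the score a KGE model assigns to a (head, relation, tail) triple. The sketch below uses TransE-style scoring purely as a familiar example; the abstract does not say which embedding models are attacked, and the two-dimensional embeddings are made up.

```python
import math


def transe_score(head: list[float], relation: list[float], tail: list[float]) -> float:
    """Negative L2 distance between head + relation and tail; higher means more plausible."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


# Hypothetical embeddings for illustration only.
paris = [1.0, 0.0]
capital_of = [0.0, 1.0]
france = [1.0, 1.0]
japan = [3.0, 5.0]

true_fact_score = transe_score(paris, capital_of, france)
false_fact_score = transe_score(paris, capital_of, japan)
```

Adding or deleting training facts changes the learned embeddings, and through them the score of a targeted triple.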

IJCAI Conference 2019 Conference Paper

Metric Learning on Healthcare Data with Incomplete Modalities

  • Qiuling Suo
  • Weida Zhong
  • Fenglong Ma
  • Ye Yuan
  • Jing Gao
  • Aidong Zhang

Utilizing multiple modalities to learn a good distance metric is of vital importance for various clinical applications. However, in healthcare datasets it is common for modalities to be incomplete for some patients due to various technical and practical reasons. Existing metric learning methods cannot directly learn the distance metric on such data with missing modalities. Nevertheless, the incomplete data contain valuable information to characterize patient similarity and modality relationships, and they should not be ignored during the learning process. To tackle the aforementioned challenges, we propose a metric learning framework to perform missing modality completion and multi-modal metric learning simultaneously. Employing generative adversarial networks, we incorporate both complete and incomplete data to learn the mapping relationship between modalities. After completing the missing modalities, we use the nonlinear representations extracted by the discriminator to learn the distance metric among patients. Through jointly training the adversarial generation part and metric learning, the similarity among patients can be learned on data with missing modalities. Experimental results show that the proposed framework learns a more accurate distance metric on real-world healthcare datasets with incomplete modalities, compared with the state-of-the-art approaches. Meanwhile, the quality of the generated modalities can be preserved.

IJCAI Conference 2019 Conference Paper

On the Estimation of Treatment Effect with Text Covariates

  • Liuyi Yao
  • Sheng Li
  • Yaliang Li
  • Hongfei Xue
  • Jing Gao
  • Aidong Zhang

Estimating the treatment effect benefits decision making in various domains as it can provide the potential outcomes of different choices. Existing work mainly focuses on covariates with numerical values, while how to handle covariates with textual information for treatment effect estimation is still an open question. One major challenge is how to filter out the nearly instrumental variables, which are the variables more predictive of the treatment than of the outcome. Conditioning on those variables to estimate the treatment effect would amplify the estimation bias. To address this challenge, we propose a conditional treatment-adversarial learning based matching method (CTAM). CTAM incorporates treatment-adversarial learning to filter out the information related to nearly instrumental variables when learning the representations, and then it performs matching among the learned representations to estimate the treatment effects. The conditional treatment-adversarial learning helps reduce the bias of treatment effect estimation, which is demonstrated by our experimental results on both semi-synthetic and real-world datasets.

NeurIPS Conference 2018 Conference Paper

Representation Learning for Treatment Effect Estimation from Observational Data

  • Liuyi Yao
  • Sheng Li
  • Yaliang Li
  • Mengdi Huai
  • Jing Gao
  • Aidong Zhang

Estimating individual treatment effect (ITE) is a challenging problem in causal inference, due to the missing counterfactuals and the selection bias. Existing ITE estimation methods mainly focus on balancing the distributions of control and treated groups, but ignore the local similarity information that is helpful. In this paper, we propose a local similarity preserved individual treatment effect (SITE) estimation method based on deep representation learning. SITE preserves local similarity and balances data distributions simultaneously, by focusing on several hard samples in each mini-batch. Experimental results on synthetic and three real-world datasets demonstrate the advantages of the proposed SITE method, compared with the state-of-the-art ITE estimation methods.

IJCAI Conference 2017 Conference Paper

A Correlated Topic Model Using Word Embeddings

  • Guangxu Xun
  • Yaliang Li
  • Wayne Xin Zhao
  • Jing Gao
  • Aidong Zhang

Conventional correlated topic models are able to capture correlation structure among latent topics by replacing the Dirichlet prior with the logistic normal distribution. Word embeddings have been proven to be able to capture semantic regularities in language. Therefore, the semantic relatedness and correlations between words can be directly calculated in the word embedding space, for example, via cosine values. In this paper, we propose a novel correlated topic model using word embeddings. The proposed model enables us to exploit the additional word-level correlation information in word embeddings and directly model topic correlation in the continuous word embedding space. In the model, words in documents are replaced with meaningful word embeddings, topics are modeled as multivariate Gaussian distributions over the word embeddings and topic correlations are learned among the continuous Gaussian topics. A Gibbs sampling solution with data augmentation is given to perform inference. We evaluate our model on the 20 Newsgroups dataset and the Reuters-21578 dataset qualitatively and quantitatively. The experimental results show the effectiveness of our proposed model.
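The cosine value mentioned above as a measure of relatedness between word embeddings can be computed as follows. The three-dimensional vectors are made up for illustration and are not real embeddings.

```python
import math


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical embeddings: the first two point the same way, the third is orthogonal.
king = [1.0, 2.0, 0.0]
queen = [2.0, 4.0, 0.0]
banana = [0.0, 0.0, 3.0]

related = cosine_similarity(king, queen)
unrelated = cosine_similarity(king, banana)
```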

NeurIPS Conference 2009 Conference Paper

Graph-based Consensus Maximization among Multiple Supervised and Unsupervised Models

  • Jing Gao
  • Feng Liang
  • Wei Fan
  • Yizhou Sun
  • Jiawei Han

Little work has been done to directly combine the outputs of multiple supervised and unsupervised models. However, it can increase the accuracy and applicability of ensemble methods. First, we can boost the diversity of classification ensemble by incorporating multiple clustering outputs, each of which provides grouping constraints for the joint label predictions of a set of related objects. Secondly, ensemble of supervised models is limited in applications which have no access to raw data but to the meta-level model outputs. In this paper, we aim at calculating a consolidated classification solution for a set of objects by maximizing the consensus among both supervised predictions and unsupervised grouping constraints. We seek a global optimal label assignment for the target objects, which is different from the result of traditional majority voting and model combination approaches. We cast the problem into an optimization problem on a bipartite graph, where the objective function favors smoothness in the conditional probability estimates over the graph, as well as penalizes deviation from initial labeling of supervised models. We solve the problem through iterative propagation of conditional probability estimates among neighboring nodes, and interpret the method as conducting a constrained embedding in a transformed space, as well as a ranking on the graph. Experimental results on three real applications demonstrate the benefits of the proposed method over existing alternatives.
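The iterative propagation described above can be sketched on a tiny bipartite graph of objects and groups, where a group is either a class predicted by a supervised model or a cluster from an unsupervised one. The sketch assumes a quadratic objective (smoothness across object-group edges plus a penalty of weight `alpha` for supervised groups drifting from their initial labels) and alternates the closed-form minimizers of that assumed objective. Function and variable names, `alpha`, and the toy data are illustrative, not taken from the paper.

```python
def consensus_propagation(affiliation, group_labels, is_supervised, alpha=2.0, iterations=100):
    """Alternate group and object updates; returns per-object class probabilities."""
    n_objects = len(affiliation)
    n_groups = len(group_labels)
    n_classes = len(group_labels[0])
    # Start every object and group from a uniform class distribution.
    object_probs = [[1.0 / n_classes] * n_classes for _ in range(n_objects)]
    group_probs = [[1.0 / n_classes] * n_classes for _ in range(n_groups)]
    for _ in range(iterations):
        for j in range(n_groups):
            members = [i for i in range(n_objects) if affiliation[i][j]]
            # Only supervised groups are pulled toward their initial label.
            penalty = alpha if is_supervised[j] else 0.0
            denom = len(members) + penalty
            group_probs[j] = [
                (sum(object_probs[i][c] for i in members) + penalty * group_labels[j][c]) / denom
                for c in range(n_classes)
            ]
        for i in range(n_objects):
            groups = [j for j in range(n_groups) if affiliation[i][j]]
            # Each object averages the estimates of the groups it belongs to.
            object_probs[i] = [
                sum(group_probs[j][c] for j in groups) / len(groups)
                for c in range(n_classes)
            ]
    return object_probs


# Toy setup: groups 0-1 are a classifier's two classes, groups 2-3 are two clusters.
affiliation = [
    [1, 0, 1, 0],  # object 0: predicted class 0, in cluster A
    [1, 0, 0, 1],  # object 1: predicted class 0, in cluster B
    [0, 1, 0, 1],  # object 2: predicted class 1, in cluster B
]
group_labels = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
is_supervised = [True, True, False, False]
consensus = consensus_propagation(affiliation, group_labels, is_supervised)
```

In this toy graph, object 1 shares a cluster with object 2, so its class-0 probability ends up lower than object 0's even though the classifier gave both the same label.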