Author name cluster

Li Guo

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

25 papers
1 author row

Possible papers

TMLR Journal 2025 Journal Article

Cross Entropy versus Label Smoothing: A Neural Collapse Perspective

  • Li Guo
  • George Andriopoulos
  • Zifan Zhao
  • Zixuan Dong
  • Shuyang Ling
  • Keith W. Ross

Label smoothing loss is a widely adopted technique to mitigate overfitting in deep neural networks. This paper studies label smoothing from the perspective of Neural Collapse (NC), a powerful empirical and theoretical framework which characterizes model behavior during the terminal phase of training. We first show empirically that models trained with label smoothing converge faster to neural collapse solutions and attain a stronger level of neural collapse compared to those trained with cross-entropy loss. Furthermore, we show that at the same level of NC1, models under label smoothing loss exhibit intensified NC2. These findings provide valuable insights into the impact of label smoothing on model performance and calibration. Then, leveraging the unconstrained feature model, we derive closed-form solutions for the global minimizers under both label smoothing and cross-entropy losses. We show that models trained with label smoothing have a lower condition number and, therefore, theoretically converge faster. Our study, combining empirical evidence and theoretical results, not only provides nuanced insights into the differences between label smoothing and cross-entropy losses, but also serves as an example of how the powerful neural collapse framework can be used to improve our understanding of DNNs.
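
For readers unfamiliar with the two losses compared in this abstract, the sketch below shows the commonly used formulation of label smoothing, in which the one-hot target is mixed with a uniform distribution over classes. It is an illustration only and is not taken from the paper; the smoothing parameter name `delta`, the helper names, and the example logits are all assumptions.

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def smoothed_targets(true_class, num_classes, delta):
    # Assumed standard form: (1 - delta) * one_hot + delta / num_classes.
    # delta = 0 recovers the ordinary one-hot target of cross-entropy.
    uniform = delta / num_classes
    return [
        (1.0 - delta) * (1.0 if k == true_class else 0.0) + uniform
        for k in range(num_classes)
    ]

def cross_entropy(logits, targets):
    # Cross-entropy between a target distribution and softmax(logits).
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(targets, probs))

# Hypothetical logits for a 3-class problem whose true class is 0.
logits = [2.0, 0.5, -1.0]
ce_loss = cross_entropy(logits, smoothed_targets(0, 3, 0.0))
ls_loss = cross_entropy(logits, smoothed_targets(0, 3, 0.1))
```

With `delta = 0` the function reduces to plain cross-entropy, so the same code path can be used to compare the two training objectives.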

AAAI Conference 2025 Conference Paper

Focus on Local: Finding Reliable Discriminative Regions for Visual Place Recognition

  • Changwei Wang
  • Shunpeng Chen
  • Yukun Song
  • Rongtao Xu
  • Zherui Zhang
  • Jiguang Zhang
  • Haoran Yang
  • Yu Zhang

Visual Place Recognition (VPR) aims to predict the location of a query image by referencing a database of geotagged images. In the VPR task, a few discriminative local regions in an image often have the most important effect, while mundane background regions contribute nothing or even cause perceptual aliasing because they easily overlap. However, existing methods lack precise modeling and full exploitation of these discriminative regions. In addition, the lack of pixel-level correspondence supervision in VPR datasets hinders further improvement of the local feature matching capability in the re-ranking stage. In this paper, we propose the Focus on Local (FoL) approach to improve the performance of image retrieval and re-ranking in VPR simultaneously by mining and exploiting reliable discriminative local regions in images and introducing pseudo-correlation supervision. First, we design two losses, Extraction-Aggregation Spatial Alignment Loss (SAL) and Foreground-Background Contrast Enhancement Loss (CEL), to explicitly model reliable discriminative local regions and use them to guide the generation of global representations and efficient re-ranking. Second, we introduce a weakly-supervised local feature training strategy based on pseudo-correspondences obtained from aggregating global features to alleviate the lack of local correspondence ground truth for the VPR task. Third, we propose an efficient and precise re-ranking pipeline based on discriminative region guidance. Finally, experimental results show that our FoL achieves state-of-the-art results on multiple VPR benchmarks in both the image retrieval and re-ranking stages and also significantly outperforms existing two-stage VPR methods in terms of computational efficiency.

JBHI Journal 2024 Journal Article

PSTNet: Enhanced Polyp Segmentation With Multi-Scale Alignment and Frequency Domain Integration

  • Wenhao Xu
  • Rongtao Xu
  • Changwei Wang
  • Xiuli Li
  • Shibiao Xu
  • Li Guo

Accurate segmentation of colorectal polyps in colonoscopy images is crucial for effective diagnosis and management of colorectal cancer (CRC). However, current deep learning-based methods primarily rely on fusing RGB information across multiple scales, leading to limitations in accurately identifying polyps due to restricted RGB domain information and challenges in feature misalignment during multi-scale aggregation. To address these limitations, we propose the Polyp Segmentation Network with Shunted Transformer (PSTNet), a novel approach that integrates both RGB and frequency domain cues present in the images. PSTNet comprises three key modules: the Frequency Characterization Attention Module (FCAM) for extracting frequency cues and capturing polyp characteristics, the Feature Supplementary Alignment Module (FSAM) for aligning semantic information and reducing misalignment noise, and the Cross Perception localization Module (CPM) for synergizing frequency cues with high-level semantics to achieve efficient polyp segmentation. Extensive experiments on challenging datasets demonstrate PSTNet's significant improvement in polyp segmentation accuracy across various metrics, consistently outperforming state-of-the-art methods. The integration of frequency domain cues and the novel architectural design of PSTNet contribute to advancing computer-assisted polyp segmentation, facilitating more accurate diagnosis and management of CRC.

AAAI Conference 2024 Conference Paper

Spectral Prompt Tuning: Unveiling Unseen Classes for Zero-Shot Semantic Segmentation

  • Wenhao Xu
  • Rongtao Xu
  • Changwei Wang
  • Shibiao Xu
  • Li Guo
  • Man Zhang
  • Xiaopeng Zhang

Recently, CLIP has found practical utility in the domain of pixel-level zero-shot segmentation tasks. The present landscape features two-stage methodologies beset by issues such as intricate pipelines and elevated computational costs. While current one-stage approaches alleviate these concerns and incorporate Visual Prompt Training (VPT) to uphold CLIP's generalization capacity, they still fall short in fully harnessing CLIP's potential for pixel-level unseen class demarcation and precise pixel predictions. To further stimulate CLIP's zero-shot dense prediction capability, we propose SPT-SEG, a one-stage approach that improves CLIP's adaptability from image to pixel. Specifically, we initially introduce Spectral Prompt Tuning (SPT), incorporating spectral prompts into the CLIP visual encoder's shallow layers to capture structural intricacies of images, thereby enhancing comprehension of unseen classes. Subsequently, we introduce the Spectral Guided Decoder (SGD), utilizing both high and low-frequency information to steer the network's spatial focus towards more prominent classification features, enabling precise pixel-level prediction outcomes. Through extensive experiments on two public datasets, we demonstrate the superiority of our method over state-of-the-art approaches, performing well across all classes and particularly excelling in handling unseen classes.

NeurIPS Conference 2024 Conference Paper

The Prevalence of Neural Collapse in Neural Multivariate Regression

  • George Andriopoulos
  • Zixuan Dong
  • Li Guo
  • Zifan Zhao
  • Keith Ross

Recently it has been observed that neural networks exhibit Neural Collapse (NC) during the final stage of training for the classification problem. We empirically show that multivariate regression, as employed in imitation learning and other applications, exhibits Neural Regression Collapse (NRC), a new form of neural collapse: (NRC1) The last-layer feature vectors collapse to the subspace spanned by the $n$ principal components of the feature vectors, where $n$ is the dimension of the targets (for univariate regression, $n=1$); (NRC2) The last-layer feature vectors also collapse to the subspace spanned by the last-layer weight vectors; (NRC3) The Gram matrix for the weight vectors converges to a specific functional form that depends on the covariance matrix of the targets. After empirically establishing the prevalence of (NRC1)-(NRC3) for a variety of datasets and network architectures, we provide an explanation of these phenomena by modeling the regression task in the context of the Unconstrained Feature Model (UFM), in which the last layer feature vectors are treated as free variables when minimizing the loss function. We show that when the regularization parameters in the UFM model are strictly positive, then (NRC1)-(NRC3) also emerge as solutions in the UFM optimization problem. We also show that if the regularization parameters are equal to zero, then there is no collapse. To our knowledge, this is the first empirical and theoretical study of neural collapse in the context of regression. This extension is significant not only because it broadens the applicability of neural collapse to a new category of problems but also because it suggests that the phenomena of neural collapse could be a universal behavior in deep learning.

IJCAI Conference 2023 Conference Paper

Accurate MRI Reconstruction via Multi-Domain Recurrent Networks

  • Jinbao Wei
  • Zhijie Wang
  • Kongqiao Wang
  • Li Guo
  • Xueyang Fu
  • Ji Liu
  • Xun Chen

In recent years, deep convolutional neural networks (CNNs) have become dominant in MRI reconstruction from undersampled k-space. However, most existing CNN methods reconstruct the undersampled images either in the spatial domain or in the frequency domain, neglecting the correlation between these two domains. This hinders further improvement in reconstruction performance. To tackle this issue, in this work, we propose a new multi-domain recurrent network (MDR-Net) with multi-domain learning (MDL) blocks as its basic units to reconstruct the undersampled MR image progressively. Specifically, the MDL block interactively processes the local spatial features and the global frequency information to facilitate complementary learning, leading to fine-grained feature generation. Furthermore, we introduce an effective frequency-based loss to narrow the frequency spectrum gap, compensating for the over-smoothness caused by the widely used spatial reconstruction loss. Extensive experiments on public fastMRI datasets demonstrate that our MDR-Net consistently outperforms other competitive methods and is able to provide more details.

AAAI Conference 2020 Conference Paper

A Knowledge-Aware Attentional Reasoning Network for Recommendation

  • Qiannan Zhu
  • Xiaofei Zhou
  • Jia Wu
  • Jianlong Tan
  • Li Guo

Knowledge-graph-aware recommendation systems have recently attracted increasing attention in both industry and academia. Many existing knowledge-aware recommendation methods, which usually perform recommendation by reasoning on the paths between users and items in knowledge graphs, have achieved good performance. However, they ignore the users’ personal clicked history sequences, which can better reflect users’ preferences within a period of time for recommendation. In this paper, we propose a knowledge-aware attentional reasoning network, KARN, that incorporates the users’ clicked history sequences and path connectivity between users and items for recommendation. The proposed KARN develops not only an attention-based RNN to capture the user’s history interests from the user’s clicked history sequences, but also a hierarchical attentional neural network to reason on paths between users and items for inferring the potential user intents on items. Based on both the user’s history interest and potential intent, KARN can predict the clicking probability of the user with respect to a candidate item. We conduct experiments on the Amazon review dataset, and the experimental results demonstrate the superiority and effectiveness of our proposed KARN model.

IJCAI Conference 2020 Conference Paper

A Relation-Specific Attention Network for Joint Entity and Relation Extraction

  • Yue Yuan
  • Xiaofei Zhou
  • Shirui Pan
  • Qiannan Zhu
  • Zeliang Song
  • Li Guo

Joint extraction of entities and relations is an important task in natural language processing (NLP), which aims to capture all relational triplets from plain texts. This is a big challenge because some of the triplets extracted from one sentence may have overlapping entities. Most existing methods perform entity recognition followed by relation detection between every possible entity pair, which usually suffers from numerous redundant operations. In this paper, we propose a relation-specific attention network (RSAN) to handle the issue. Our RSAN utilizes a relation-aware attention mechanism to construct specific sentence representations for each relation, and then performs sequence labeling to extract its corresponding head and tail entities. Experiments on two public datasets show that our model can effectively extract overlapping triplets and achieve state-of-the-art performance.

AAAI Conference 2020 Conference Paper

Distilling Knowledge from Well-Informed Soft Labels for Neural Relation Extraction

  • Zhenyu Zhang
  • Xiaobo Shu
  • Bowen Yu
  • Tingwen Liu
  • Jiapeng Zhao
  • Quangang Li
  • Li Guo

Extracting relations from plain text is an important task with wide application. Most existing methods formulate it as a supervised problem and utilize one-hot hard labels as the sole target in training, neglecting the rich semantic information among relations. In this paper, we aim to explore supervision with soft labels in relation extraction, which makes it possible to integrate prior knowledge. Specifically, a bipartite graph is first devised to discover type constraints between entities and relations based on the entire corpus. Then, we combine such type constraints with neural networks to achieve a knowledgeable model. Furthermore, this model is regarded as a teacher to generate well-informed soft labels and guide the optimization of a student network via knowledge distillation. Besides, a multi-aspect attention mechanism is introduced to help the student mine latent information from text. In this way, the enhanced student inherits the dark knowledge (e.g., type constraints and relevance among relations) from the teacher, and directly serves the testing scenarios without any extra constraints. We conduct extensive experiments on the TACRED and SemEval datasets, and the experimental results justify the effectiveness of our approach.
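
The teacher-student setup described above can be illustrated with a generic knowledge-distillation objective, in which the student is trained on a weighted mix of the hard-label loss and a divergence from the teacher's soft labels. This is a minimal sketch of that general idea, not the paper's actual objective; the mixing weight `alpha`, the function names, and the example numbers are assumptions.

```python
import math

def softmax(logits):
    # Subtract the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    # KL(p || q); terms with p_i == 0 contribute nothing.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def distillation_loss(student_logits, teacher_probs, true_class, alpha=0.5):
    # alpha weights the hard-label term, (1 - alpha) the soft-label term.
    student_probs = softmax(student_logits)
    hard_term = -math.log(student_probs[true_class])
    soft_term = kl_divergence(teacher_probs, student_probs)
    return alpha * hard_term + (1.0 - alpha) * soft_term

# Hypothetical teacher soft labels over three relation types.
teacher_probs = [0.7, 0.2, 0.1]
student_logits = [1.5, 0.3, -0.4]
loss = distillation_loss(student_logits, teacher_probs, true_class=0)
```

Setting `alpha = 1.0` falls back to ordinary hard-label training, which makes it easy to check how much the soft labels change the objective.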

AAAI Conference 2019 Conference Paper

DAN: Deep Attention Neural Network for News Recommendation

  • Qiannan Zhu
  • Xiaofei Zhou
  • Zeliang Song
  • Jianlong Tan
  • Li Guo

With the rapid information explosion of news, making personalized news recommendations for users has become an increasingly challenging problem. Many existing recommendation methods that regard the recommendation procedure as a static process have achieved good recommendation performance. However, they usually fail to handle the dynamic diversity of news and users’ interests, or ignore the importance of the sequential information in users’ click selections. In this paper, taking full advantage of convolutional neural networks (CNN), recurrent neural networks (RNN), and attention mechanisms, we propose a deep attention neural network, DAN, for news recommendation. Our DAN model uses an attention-based parallel CNN to aggregate users’ interest features and an attention-based RNN to capture richer hidden sequential features of users’ clicks, and combines these features for news recommendation. We conduct experiments on real-world news data sets, and the experimental results demonstrate the superiority and effectiveness of our proposed DAN model.

IJCAI Conference 2019 Conference Paper

Deep Active Learning for Anchor User Prediction

  • Anfeng Cheng
  • Chuan Zhou
  • Hong Yang
  • Jia Wu
  • Lei Li
  • Jianlong Tan
  • Li Guo

Predicting pairs of anchor users plays an important role in cross-network analysis. Because labeling anchor users for training prediction models is expensive, we consider in this paper the problem of minimizing the number of user pairs across multiple networks that are labeled, so as to improve the accuracy of the prediction. To this end, we present a deep active learning model for anchor user prediction (DALAUP for short). However, active learning for anchor user sampling faces the challenges of non-i.i.d. user pair data caused by network structures and the correlation among anchor or non-anchor user pairs. To solve these challenges, DALAUP uses a couple of neural networks with shared parameters to obtain the vector representations of user pairs, and ensembles three query strategies to select the most informative user pairs for labeling and model training. Experiments on real-world social network data demonstrate that DALAUP outperforms the state-of-the-art approaches.

IJCAI Conference 2019 Conference Paper

Neighborhood-Aware Attentional Representation for Multilingual Knowledge Graphs

  • Qiannan Zhu
  • Xiaofei Zhou
  • Jia Wu
  • Jianlong Tan
  • Li Guo

Multilingual knowledge graphs constructed by entity alignment are indispensable resources for numerous AI-related applications. Most existing entity alignment methods use only triplet-based knowledge to find the aligned entities across multilingual knowledge graphs; they usually ignore the neighborhood subgraph knowledge of entities, which carries richer information for aligning entities. In this paper, we incorporate neighborhood subgraph-level information of entities, and propose a neighborhood-aware attentional representation method, NAEA, for multilingual knowledge graphs. NAEA devises an attention mechanism to learn neighbor-level representations by aggregating neighbors' representations with a weighted combination. The attention mechanism enables entities not only to capture the different impacts of their neighbors on themselves, but also to attend over their neighbors' feature representations with different importance. We evaluate our model on two real-world datasets, DBP15K and DWY100K, and the experimental results show that the proposed model NAEA significantly and consistently outperforms state-of-the-art entity alignment models.

AAAI Conference 2018 Conference Paper

Knowledge Graph Embedding With Iterative Guidance From Soft Rules

  • Shu Guo
  • Quan Wang
  • Lihong Wang
  • Bin Wang
  • Li Guo

Embedding knowledge graphs (KGs) into continuous vector spaces is a focus of current research. Combining such an embedding model with logic rules has recently attracted increasing attention. Most previous attempts made a one-time injection of logic rules, ignoring the interactive nature between embedding learning and logical inference. And they focused only on hard rules, which always hold with no exception and usually require extensive manual effort to create or validate. In this paper, we propose Rule-Guided Embedding (RUGE), a novel paradigm of KG embedding with iterative guidance from soft rules. RUGE enables an embedding model to learn simultaneously from 1) labeled triples that have been directly observed in a given KG, 2) unlabeled triples whose labels are going to be predicted iteratively, and 3) soft rules with various confidence levels extracted automatically from the KG. In the learning process, RUGE iteratively queries rules to obtain soft labels for unlabeled triples, and integrates such newly labeled triples to update the embedding model. Through this iterative procedure, knowledge embodied in logic rules may be better transferred into the learned embeddings. We evaluate RUGE in link prediction on Freebase and YAGO. Experimental results show that: 1) with rule knowledge injected iteratively, RUGE achieves significant and consistent improvements over state-of-the-art baselines; and 2) despite their uncertainties, automatically extracted soft rules are highly beneficial to KG embedding, even those with moderate confidence levels. The code and data used for this paper can be obtained from https://github.com/iieir-km/RUGE.

AAAI Conference 2018 Conference Paper

Social Recommendation with an Essential Preference Space

  • Chun-Yi Liu
  • Chuan Zhou
  • Jia Wu
  • Yue Hu
  • Li Guo

Social recommendation, which aims to exploit social information to improve the quality of a recommender system, has attracted an increasing amount of attention in recent years. A large portion of existing social recommendation models are based on the tractable assumption that users consider the same factors to make decisions in both recommender systems and social networks. However, this assumption is not in concert with real-world situations, since users usually show different preferences in different scenarios. In this paper, we investigate how to exploit the differences between user preference in recommender systems and that in social networks, with the aim to further improve the social recommendation. In particular, we assume that the user preferences in different scenarios are results of different linear combinations from a more underlying user preference space. Based on this assumption, we propose a novel social recommendation framework, called social recommendation with an essential preferences space (SREPS), which simultaneously models the structural information in the social network, the rating and the consumption information in the recommender system under the capture of essential preference space. Experimental results on four real-world datasets demonstrate the superiority of the proposed SREPS model compared with seven state-of-the-art social recommendation methods.

AAAI Conference 2016 Conference Paper

On the Minimum Differentially Resolving Set Problem for Diffusion Source Inference in Networks

  • Chuan Zhou
  • Wei-Xue Lu
  • Peng Zhang
  • Jia Wu
  • Yue Hu
  • Li Guo

In this paper we theoretically study the minimum Differentially Resolving Set (DRS) problem derived from the classical sensor placement optimization problem in network source locating. A DRS of a graph G = (V, E) is defined as a subset S ⊆ V where any two elements in V can be distinguished by their different differential characteristic sets defined on S. The minimum DRS problem aims to find a DRS S in the graph G with minimum total weight Σ_{v∈S} w(v). In this paper we establish a group of Integer Linear Programming (ILP) models as the solution. By the weighted set cover theory, we propose an approximation algorithm with the Θ(ln n) approximability for the minimum DRS problem on general graphs, where n is the graph size.
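
The abstract appeals to weighted set cover theory for its approximation result. As background, the sketch below shows the classical greedy heuristic for weighted set cover, which repeatedly picks the subset with the lowest weight per newly covered element and is known to give a logarithmic approximation guarantee. It is not the paper's algorithm, and the reduction from the DRS problem to set cover is not shown; the function name, subset names, and weights are hypothetical.

```python
def greedy_weighted_set_cover(universe, subsets, weights):
    """Classical greedy heuristic: pick the cheapest subset per new element.

    universe: iterable of elements to cover
    subsets:  dict mapping a subset name to a set of elements
    weights:  dict mapping a subset name to a positive weight
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best_name = None
        best_ratio = float("inf")
        for name, members in subsets.items():
            gain = len(uncovered & members)
            if gain == 0:
                continue  # this subset covers nothing new
            ratio = weights[name] / gain
            if ratio < best_ratio:
                best_name, best_ratio = name, ratio
        if best_name is None:
            raise ValueError("the subsets cannot cover the universe")
        chosen.append(best_name)
        uncovered -= subsets[best_name]
    return chosen

# Hypothetical instance with five elements and four candidate subsets.
universe = {1, 2, 3, 4, 5}
subsets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5}, "D": {1, 2, 3, 4, 5}}
weights = {"A": 3.0, "B": 1.0, "C": 2.0, "D": 10.0}
cover = greedy_weighted_set_cover(universe, subsets, weights)
total_weight = sum(weights[name] for name in cover)
```

On this instance the heuristic selects three inexpensive subsets rather than the single expensive subset `D` that covers everything at once.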

IJCAI Conference 2015 Conference Paper

Knowledge Base Completion Using Embeddings and Rules

  • Quan Wang
  • Bin Wang
  • Li Guo

Knowledge bases (KBs) are often greatly incomplete, necessitating a demand for KB completion. A promising approach is to embed KBs into latent spaces and make inferences by learning and operating on latent representations. Such embedding models, however, do not make use of any rules during inference and hence have limited accuracy. This paper proposes a novel approach which incorporates rules seamlessly into embedding models for KB completion. It formulates inference as an integer linear programming (ILP) problem, with the objective function generated from embedding models and the constraints translated from rules. Solving the ILP problem results in a number of facts which 1) are the most preferred by the embedding models, and 2) comply with all the rules. By incorporating rules, our approach can greatly reduce the solution space and significantly improve the inference accuracy of embedding models. We further provide a slacking technique to handle noise in KBs, by explicitly modeling the noise with slack variables. Experimental results on two publicly available data sets show that our approach significantly and consistently outperforms state-of-the-art embedding models in KB completion. Moreover, the slacking technique is effective in identifying erroneous facts and ambiguous entities, with a precision higher than 90%.

AAAI Conference 2014 Conference Paper

Combining Heterogenous Social and Geographical Information for Event Recommendation

  • Zhi Qiao
  • Peng Zhang
  • Yanan Cao
  • Chuan Zhou
  • Li Guo
  • Binxing Fang

With the rapid growth of event-based social networks (EBSNs) like Meetup, the demand for event recommendation becomes increasingly urgent. In EBSNs, event recommendation plays a central role in recommending the most relevant events to users who are likely to participate. Different from traditional recommendation problems, event recommendation encounters three new types of information, i.e., heterogeneous online+offline social relationships, geographical features of events, and implicit rating data from users. Yet combining the three types of data for offline event recommendation has not been considered. Therefore, we present a Bayesian latent factor model that can unify these data for event recommendation. Experimental results on real-world data sets show the performance of our method.

AAAI Conference 2014 Conference Paper

Event Recommendation in Event-Based Social Networks

  • Zhi Qiao
  • Peng Zhang
  • Chuan Zhou
  • Yanan Cao
  • Li Guo
  • Yanchuan Zhang

With the rapid growth of event-based social networks, the demand for event recommendation becomes increasingly important. Different from classic recommendation problems, event recommendation generally faces the problems of heterogeneous online and offline social relationships among users and implicit feedback data. In this paper, we present a Bayesian probability model that can fully unleash the power of heterogeneous social relations and efficiently handle the implicit feedback characteristic for event recommendation. Experimental results on several real-world datasets demonstrate the utility of our method.