Author name cluster

Feijiang Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

15 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

Beyond MSE: Ordinal Cross-Entropy for Probabilistic Time Series Forecasting

  • Jieting Wang
  • Huimei Shi
  • Feijiang Li
  • Xiaolei Shang

Time series forecasting is an important task that involves analyzing temporal dependencies and underlying patterns (such as trends, cyclicality, and seasonality) in historical data to predict future values or trends. Current deep learning-based forecasting models primarily employ Mean Squared Error (MSE) loss functions for regression modeling. Despite enabling direct value prediction, this method offers no uncertainty estimation and exhibits poor outlier robustness. To address these limitations, we propose OCE-TS, a novel ordinal classification approach for time series forecasting that replaces MSE with Ordinal Cross-Entropy (OCE) loss, preserving prediction order while quantifying uncertainty through probability output. Specifically, OCE-TS begins by discretizing observed values into ordered intervals and deriving their probabilities via a parametric distribution as supervision signals. Using a simple linear model, we then predict probability distributions for each timestep. The OCE loss is computed between the cumulative distributions of predicted and ground-truth probabilities, explicitly preserving ordinal relationships among forecasted values. Through theoretical analysis using influence functions, we establish that cross-entropy (CE) loss exhibits superior stability and outlier robustness compared to MSE loss. Empirically, we compared OCE-TS with five baseline models—Autoformer, DLinear, iTransformer, TimeXer, and TimeBridge—on seven public time series datasets. Using MSE and Mean Absolute Error (MAE) as evaluation metrics, the results demonstrate that OCE-TS consistently outperforms benchmark models.
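
The discretize-then-compare-cumulative-distributions idea can be illustrated with a small sketch. This is not the authors' implementation: the Gaussian soft target, the `sigma` parameter, the bin edges, and the binary cross-entropy over cumulative probabilities are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def soft_bin_targets(value, edges, sigma=0.5):
    """Probability mass that a Gaussian centred on `value` (an assumed
    parametric choice) assigns to each ordered interval [edges[i], edges[i+1])."""
    cdf = [normal_cdf(e, value, sigma) for e in edges]
    mass = [cdf[i + 1] - cdf[i] for i in range(len(edges) - 1)]
    total = sum(mass)
    return [m / total for m in mass]  # renormalise over the covered range

def cumulative(probs):
    """Running sum of a probability vector."""
    out, running = [], 0.0
    for p in probs:
        running += p
        out.append(running)
    return out

def ordinal_cross_entropy(pred_probs, target_probs, eps=1e-9):
    """Mean binary cross-entropy between cumulative distributions.
    The last cumulative entry is always 1, so it is dropped."""
    pred_cdf = cumulative(pred_probs)[:-1]
    target_cdf = cumulative(target_probs)[:-1]
    loss = 0.0
    for p, t in zip(pred_cdf, target_cdf):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep the logs finite
        loss -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return loss / len(pred_cdf)
```

Because the comparison is made on cumulative probabilities, a prediction that peaks two bins away from the target is penalised more than one that peaks one bin away, which plain cross-entropy over unordered classes would not guarantee.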

AAAI Conference 2026 Conference Paper

RI-Loss: A Learnable Residual-Informed Loss for Time Series Forecasting

  • Jieting Wang
  • Xiaolei Shang
  • Feijiang Li
  • Furong Peng

Time series forecasting relies on predicting future values from historical data, yet most state-of-the-art approaches—including transformer and multilayer perceptron-based models—optimize using Mean Squared Error (MSE), which has two fundamental weaknesses: its point-wise error computation fails to capture temporal relationships, and it does not account for inherent noise in the data. To overcome these limitations, we introduce the Residual-Informed Loss (RI-Loss), a novel objective function based on the Hilbert-Schmidt Independence Criterion (HSIC). RI-Loss explicitly models noise structure by enforcing dependence between the residual sequence and a random time series, enabling more robust, noise-aware representations. Theoretically, we derive the first non-asymptotic HSIC bound with explicit double-sample complexity terms, achieving optimal convergence rates through Bernstein-type concentration inequalities and Rademacher complexity analysis. This provides rigorous guarantees for RI-Loss optimization while precisely quantifying kernel space interactions. Empirically, experiments across eight real-world benchmarks and five leading forecasting models demonstrate improvements in predictive performance, validating the effectiveness of our approach.
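
For readers unfamiliar with HSIC, the following sketch computes the standard biased empirical estimator, trace(KHLH)/(n-1)^2, for two scalar sequences. It illustrates only the dependence measure itself, not the RI-Loss objective; the RBF kernel, the `bandwidth` parameter, and the function names are assumptions made for this example.

```python
import math

def rbf_gram(xs, bandwidth=1.0):
    """Gram matrix of a Gaussian (RBF) kernel over a scalar sequence."""
    return [[math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2)) for b in xs]
            for a in xs]

def center(gram):
    """Double-centre a symmetric Gram matrix, i.e. compute H K H
    with H = I - (1/n) * ones."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    # The matrix is symmetric, so column means equal row means.
    return [[gram[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]

def hsic(xs, ys, bandwidth=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = len(xs)
    k_centered = center(rbf_gram(xs, bandwidth))
    l_gram = rbf_gram(ys, bandwidth)
    total = sum(k_centered[i][j] * l_gram[i][j]
                for i in range(n) for j in range(n))
    return total / (n - 1) ** 2
```

In this sketch, a residual sequence would play the role of `xs` and a reference sequence the role of `ys`; a constant `ys` carries no information and yields a value of zero up to rounding.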

IJCAI Conference 2025 Conference Paper

Frequency-Aware Deep Depth from Focus

  • Tao Yan
  • Yingying Wang
  • Jiangfeng Zhang
  • Yuhua Qian
  • Jieru Jia
  • Lu Chen
  • Feijiang Li

In large aperture imaging, the shallow depth of field (DoF) phenomenon requires capturing multiple images at different focal levels, allowing us to infer depth information using depth from focus (DFF) techniques. However, most previous works design convolutional neural networks from a time domain perspective, often leading to blurred fine details in depth estimation. In this work, we propose a frequency-aware deep DFF network (FAD) that couples multi-scale spatial domain local features with frequency domain global structural features. Our main innovations include two key points: First, we introduce a frequency domain feature extraction module that uses the Fourier transform to transfer latent focus features into the frequency domain. This module adaptively captures essential frequency information for focus changes through element-wise multiplication, enhancing fine details in depth results while preserving global structural integrity. Second, the time-frequency joint module of FAD improves the consistency of depth information in sparse texture regions and the continuity in transition areas from both local and global complementary perspectives. Comprehensive experiments demonstrate that our model achieves compelling generalization and state-of-the-art depth prediction across various datasets. Additionally, it can be quickly adapted to real-world applications as a pre-trained model.
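
The "Fourier transform followed by element-wise multiplication" step can be sketched in one dimension. This is a simplified illustration, not the FAD module: the paper operates on learned focus features in a network, whereas this example uses a naive discrete Fourier transform on a plain list, and `weights` stands in for whatever per-frequency coefficients a model would learn. All names are hypothetical.

```python
import cmath

def dft(signal):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of `dft`."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def frequency_filter(signal, weights):
    """Move to the frequency domain, scale each frequency bin
    element-wise, and transform back, keeping the real part."""
    spectrum = dft(signal)
    filtered = [s * w for s, w in zip(spectrum, weights)]
    return [value.real for value in idft(filtered)]
```

With all weights set to 1 the signal is returned unchanged; keeping only the zero-frequency bin collapses it to its mean. This shows how a single multiplication in the frequency domain acts on global structure rather than on a local neighbourhood.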

AAAI Conference 2025 Conference Paper

k-HyperEdge Medoids for Clustering Ensemble

  • Feijiang Li
  • Jieting Wang
  • Liuya Zhang
  • Yuhua Qian
  • Shuai Jin
  • Tao Yan
  • Liang Du

Clustering ensemble has been a popular research topic in data science due to its ability to improve the robustness of a single clustering method. Many clustering ensemble methods have been proposed, most of which can be categorized into clustering-view and sample-view methods. The clustering-view method is generally efficient, but it can be affected by unreliable base clustering results. The sample-view method shows good performance, but constructing the pairwise sample relation is time-consuming. In this paper, the clustering ensemble is formulated as a k-HyperEdge Medoids discovery problem, and a clustering ensemble method based on k-HyperEdge Medoids that combines the characteristics of the above two types of clustering ensemble methods is proposed. In the method, a set of hyperedges is selected from the clustering view efficiently, then the hyperedges are diffused and adjusted from the sample view guided by a hyperedge loss function to construct an effective k-HyperEdge Medoid set. The loss function is mainly reduced by assigning samples to the hyperedge with the highest degree of belonging. Theoretical analyses show that the solution can approximate the optimal, the assignment method can gradually reduce the loss function, and the estimation of the belonging degree is statistically reasonable. Experiments on artificial data show the working mechanism of the proposed method. The convergence of the method is verified by experimental analysis of twenty data sets. The effectiveness and efficiency of the proposed method are also verified on these data, with nine representative clustering ensemble algorithms as reference.

ICML Conference 2025 Conference Paper

Robust Automatic Modulation Classification with Fuzzy Regularization

  • Xinyan Liang
  • Ruijie Sang
  • Yuhua Qian
  • Qian Guo 0005
  • Feijiang Li
  • Liang Du 0003

Automatic Modulation Classification (AMC) serves as a foundational pillar for cognitive radio systems, enabling critical functionalities including dynamic spectrum allocation, non-cooperative signal surveillance, and adaptive waveform optimization. However, practical deployment of AMC faces a fundamental challenge: prediction ambiguity arising from intrinsic similarity among modulation schemes and exacerbated under low signal-to-noise ratio (SNR) conditions. This phenomenon manifests as near-identical probability distributions across confusable modulation types, significantly degrading classification reliability. To address this, we propose Fuzzy Regularization-enhanced AMC (FR-AMC), a novel framework that integrates uncertainty quantification into the classification pipeline. The proposed FR has three features: (1) explicit modeling of prediction ambiguity during backpropagation, (2) dynamic sample reweighting through adaptive loss scaling, and (3) encouragement of margin maximization between confusable modulation clusters. Experimental results on benchmark datasets demonstrate that FR achieves superior classification accuracy and robustness compared with the baseline methods, making it a promising solution for real-world spectrum management and communication applications.

AAAI Conference 2025 Conference Paper

Semi-Supervised Multi-View Multi-Label Learning with View-Specific Transformer and Enhanced Pseudo-Label

  • Quanjiang Li
  • Tingjin Luo
  • Mingdie Jiang
  • Zhangqi Jiang
  • Chenping Hou
  • Feijiang Li

Multi-view multi-label learning has become a research focus for describing objects with rich expressions and annotations. However, real-world data often contains numerous unlabeled instances, due to the high cost and technical limitations of manual labeling. This crucial problem involves three main challenges: i) How to extract advanced semantics from available views? ii) How to build a refined classification framework with limited labeled space? iii) How to provide more high-quality supervisory information? To address these problems, we propose a Semi-Supervised Multi-View Multi-Label Learning Method with View-Specific Transformer and Enhanced Pseudo-Label named SMVTEP. Specifically, Generative Adversarial Networks are employed to extract informative shared and specific representations and their consistency and distinctiveness are ensured through the adversarial mechanism and information theory based contrastive learning. Then we build specific classifiers for each extracted feature and apply instance-level manifold constraints to reduce bias across classifiers. Moreover, we design a transformer-style fusion approach that simultaneously captures the imbalance of expressive power among views, mapping effects on specific labels, and label dependencies by incorporating confidence scores and category semantics into the self-attention mechanism. Furthermore, after using Mixup for data augmentation, category-enhanced pseudo-labels are leveraged to improve the reliability of additional annotations by aligning the label distribution of unlabeled samples with the true distribution. Finally, extensive experimental results validate the effectiveness of SMVTEP against state-of-the-art methods.

AAAI Conference 2025 Conference Paper

Sharper Error Bounds in Late Fusion Multi-view Clustering with Eigenvalue Proportion Optimization

  • Liang Du
  • Henghui Jiang
  • Xiaodong Li
  • Yiqing Guo
  • Yan Chen
  • Feijiang Li
  • Peng Zhou
  • Yuhua Qian

Multi-view clustering (MVC) aims to integrate complementary information from multiple views to enhance clustering performance. Late Fusion Multi-View Clustering (LFMVC) has shown promise by synthesizing diverse clustering results into a unified consensus. However, current LFMVC methods struggle with noisy and redundant partitions and often fail to capture high-order correlations across views. To address these limitations, we present a novel theoretical framework for analyzing the generalization error bounds of multiple kernel k-means, leveraging local Rademacher complexity and principal eigenvalue proportions. Our analysis establishes a convergence rate of O(1/n), significantly improving upon the existing rate in the order of O(sqrt(k/n)). Building on this insight, we propose a low-pass graph filtering strategy within a multiple linear K-means framework to mitigate noise and redundancy, further refining the principal eigenvalue proportion and enhancing clustering accuracy. Experimental results on benchmark datasets confirm that our approach outperforms state-of-the-art methods in clustering performance and robustness.

ICML Conference 2025 Conference Paper

Stabilizing Sample Similarity in Representation via Mitigating Random Consistency

  • Jieting Wang
  • Zelong Zhang
  • Feijiang Li
  • Yuhua Qian
  • Xinyan Liang

Deep learning excels at capturing complex data representations, yet quantifying the discriminative quality of these representations remains challenging. While unsupervised metrics often assess pairwise sample similarity, classification tasks fundamentally require class-level discrimination. To bridge this gap, we propose a novel loss function that evaluates representation discriminability via the Euclidean distance between the learned similarity matrix and the true class adjacency matrix. We identify random consistency, an inherent bias in Euclidean distance metrics, as a key obstacle to reliable evaluation, affecting both fairness and discrimination. To address this, we derive the expected Euclidean distance under uniformly distributed label permutations and introduce its closed-form solution, the Pure Square Euclidean Distance (PSED), which provably eliminates random consistency. Theoretically, we demonstrate that PSED satisfies heterogeneity and unbiasedness guarantees, and establish its generalization bound via the exponential Orlicz norm, confirming its statistical learnability. Empirically, our method surpasses conventional loss functions across multiple benchmarks, achieving significant improvements in accuracy, $F_1$ score, and class-structure differentiation. (Code is available at https://github.com/FeijiangLi/ICML2025-PSED)

ICML Conference 2025 Conference Paper

Trusted Multi-View Classification with Expert Knowledge Constraints

  • Xinyan Liang
  • Shijie Wang
  • Yuhua Qian
  • Qian Guo 0005
  • Liang Du 0003
  • Bingbing Jiang 0001
  • Tingjin Luo
  • Feijiang Li

Multi-view classification (MVC) based on the Dempster-Shafer theory has gained significant recognition for its reliability in safety-critical applications. However, existing methods predominantly focus on providing confidence levels for decision outcomes without explaining the reasoning behind these decisions. Moreover, the reliance on first-order statistical magnitudes of belief masses often inadequately captures the intrinsic uncertainty within the evidence. To address these limitations, we propose a novel framework termed Trusted Multi-view Classification Constrained with Expert Knowledge (TMCEK). TMCEK integrates expert knowledge to enhance feature-level interpretability and introduces a distribution-aware subjective opinion mechanism to derive more reliable and realistic confidence estimates. The theoretical superiority of the proposed uncertainty measure over conventional approaches is rigorously established. Extensive experiments conducted on three multi-view datasets for sleep stage classification demonstrate that TMCEK achieves state-of-the-art performance while offering interpretability at both the feature and decision levels. These results position TMCEK as a robust and interpretable solution for MVC in safety-critical domains. The code is available at https://github.com/jie019/TMCEK_ICML2025.
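
As background for the phrase "first-order statistical magnitudes of belief masses", the sketch below shows the conventional subjective-logic construction commonly used in evidential classification, in which belief masses and an uncertainty mass are derived from non-negative per-class evidence. It is assumed here as the baseline the abstract contrasts against; it is not TMCEK's distribution-aware mechanism, and the function name is hypothetical.

```python
def opinion_from_evidence(evidence):
    """Conventional subjective opinion from non-negative class evidence.

    With K classes and Dirichlet strength S = sum(evidence) + K:
      belief_k    = evidence_k / S
      uncertainty = K / S
    so the beliefs and the uncertainty mass sum to 1.
    """
    num_classes = len(evidence)
    strength = sum(evidence) + num_classes
    beliefs = [e / strength for e in evidence]
    uncertainty = num_classes / strength
    return beliefs, uncertainty
```

Note that the uncertainty mass here depends only on the total amount of evidence, not on how that evidence is distributed across classes.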

IJCAI Conference 2025 Conference Paper

View-Association-Guided Dynamic Multi-View Classification

  • Xinyan Liang
  • Li Lv
  • Qian Guo
  • Bingbing Jiang
  • Feijiang Li
  • Liang Du
  • Lu Chen

In multi-view classification tasks, integrating information from multiple views effectively is crucial for improving model performance. However, most existing methods fail to fully leverage the complex relationships between views, often treating them independently or using static fusion strategies. In this paper, we propose a View-Association-Guided Dynamic Multi-View Classification method (AssoDMVC) to address these limitations. Our approach dynamically models and incorporates the relationships between different views during the classification process. Specifically, we introduce a view-relation-guided mechanism that captures the dependencies and interactions between views, allowing for more flexible and adaptive feature fusion. This dynamic fusion strategy ensures that each view contributes optimally based on its contextual relevance and the inter-view relationships. Extensive experiments on multiple benchmark datasets demonstrate that our method outperforms traditional multi-view classification techniques, offering a more robust and efficient solution for tasks involving complex multi-view data.

AAAI Conference 2024 Conference Paper

Cross-Domain Contrastive Learning for Time Series Clustering

  • Furong Peng
  • Jiachen Luo
  • Xuan Lu
  • Sheng Wang
  • Feijiang Li

Most deep learning-based time series clustering models concentrate on data representation in a separate process from clustering. As a result, the clustering loss cannot guide feature extraction. Moreover, most methods solely analyze data from the temporal domain, disregarding the potential within the frequency domain. To address these challenges, we introduce a novel end-to-end Cross-Domain Contrastive learning model for time series Clustering (CDCC). Firstly, it integrates the clustering process and feature extraction using contrastive constraints at both cluster-level and instance-level. Secondly, the data is encoded simultaneously in both temporal and frequency domains, leveraging contrastive learning to enhance within-domain representation. Thirdly, cross-domain constraints are proposed to align the latent representations and category distribution across domains. With the above strategies, CDCC not only achieves end-to-end output but also effectively integrates the frequency domain. Extensive experiments and visualization analysis are conducted on 40 time series datasets from UCR, demonstrating the superior performance of the proposed model.

IJCAI Conference 2024 Conference Paper

Deep Embedding Clustering Driven by Sample Stability

  • Zhanwen Cheng
  • Feijiang Li
  • Jieting Wang
  • Yuhua Qian

Deep clustering methods improve the performance of clustering tasks by jointly optimizing deep representation learning and clustering. While numerous deep clustering algorithms have been proposed, most of them rely on artificially constructed pseudo targets for performing clustering. This construction process requires some prior knowledge, and it is challenging to determine a suitable pseudo target for clustering. To address this issue, we propose a deep embedding clustering algorithm driven by sample stability (DECS), which eliminates the requirement of pseudo targets. Specifically, we start by constructing the initial feature space with an autoencoder and then learn the cluster-oriented embedding feature constrained by sample stability. The sample stability aims to explore the deterministic relationship between samples and all cluster centroids, pulling samples to their respective clusters and keeping them away from other clusters with high determinacy. We analyzed the convergence of the loss using Lipschitz continuity in theory, which verifies the validity of the model. The experimental results on five datasets illustrate that the proposed method achieves superior performance compared to state-of-the-art clustering approaches.

NeurIPS Conference 2024 Conference Paper

Neural Collapse To Multiple Centers For Imbalanced Data

  • Hongren Yan
  • Yuhua Qian
  • Furong Peng
  • Jiachen Luo
  • Zheqing Zhu
  • Feijiang Li

Neural Collapse (NC) is a recently discovered phenomenon in which the output features and the classifier weights of the neural network converge to optimal geometric structures at the Terminal Phase of Training (TPT) under various losses. However, the relationship between these optimal structures at TPT and the classification performance remains elusive, especially in imbalanced learning. Even though it has been noticed that fixing the classifier to an optimal structure can mitigate the minority collapse problem, the performance is still not comparable to the classical imbalanced learning methods with a learnable classifier. In this work, we find that the optimal structure can be designed to represent a better classification rule, and thus achieve better performance. In particular, we justify that, to achieve better classification, the features from the minor classes should align with more directions. This justification then yields a decision rule called the Generalized Classification Rule (GCR), and we term these directions the centers of the classes. Then we study the NC under an MSE-type loss via the Unconstrained Features Model (UFM) framework where (1) the features from a class tend to collapse to the mean of the corresponding centers of that class (named Neural Collapse to Multiple Centers (NCMC)) at the global optimum, and (2) the original classifier approximates a surrogate to GCR when NCMC occurs. Based on the analysis, we develop a strategy for determining the number of centers and propose a Cosine Loss function for the fixed classifier that induces NCMC. Our experiments show that the Cosine Loss can induce NCMC and has performance on long-tail classification comparable to the classical imbalanced learning methods.

AAAI Conference 2021 Conference Paper

GoT: a Growing Tree Model for Clustering Ensemble

  • Feijiang Li
  • Yuhua Qian
  • Jieting Wang

The clustering ensemble technique that integrates multiple clustering results can improve the accuracy and robustness of the final clustering. In many clustering ensemble algorithms, the co-association matrix (CA matrix), which reflects the frequency of any two samples being partitioned into the same cluster, plays an important role. However, the CA matrix is generally highly sparse with low value density, which may limit the performance of an algorithm based on it. To handle these issues, in this paper, we propose a growing tree model (GoT). In this model, the CA matrix is first refined by the shortest path technique so that its sparsity is mitigated. Then, a set of representative prototype examples is discovered. Finally, to handle the low value density of the CA matrix, the prototypes gradually connect to their neighborhood, like a set of trees growing. The rationality of the discovered prototype examples is illustrated by theoretical analysis and experimental analysis. The working mechanism of the GoT is visually shown on synthetic data sets. Experimental analyses on eight UCI data sets and eight image data sets show that the GoT outperforms nine representative clustering ensemble algorithms.
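
The co-association matrix described above can be computed directly from its definition. The sketch below covers only that definition (the fraction of base clusterings that place two samples in the same cluster); it does not implement the shortest-path refinement or the tree-growing steps of GoT, and the function name is hypothetical.

```python
def co_association(base_clusterings):
    """Co-association (CA) matrix of an ensemble.

    `base_clusterings` is a list of label lists, one per base clustering,
    all over the same n samples. Entry [i][j] is the fraction of base
    clusterings in which samples i and j share a cluster label.
    """
    num_clusterings = len(base_clusterings)
    n = len(base_clusterings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in base_clusterings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / num_clusterings for c in row] for row in counts]
```

With only a few base clusterings, each entry can take only a handful of distinct values, and many pairs never co-occur at all; this is the sparsity and low value density that the abstract refers to.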