Arrow Research search

Author name cluster

Jieting Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers
2 author rows

Possible papers

8

AAAI Conference 2026 Conference Paper

Beyond MSE: Ordinal Cross-Entropy for Probabilistic Time Series Forecasting

  • Jieting Wang
  • Huimei Shi
  • Feijiang Li
  • Xiaolei Shang

Time series forecasting is an important task that involves analyzing temporal dependencies and underlying patterns (such as trends, cyclicality, and seasonality) in historical data to predict future values or trends. Current deep learning-based forecasting models primarily employ Mean Squared Error (MSE) loss functions for regression modeling. Despite enabling direct value prediction, this method offers no uncertainty estimation and exhibits poor outlier robustness. To address these limitations, we propose OCE-TS, a novel ordinal classification approach for time series forecasting that replaces MSE with Ordinal Cross-Entropy (OCE) loss, preserving prediction order while quantifying uncertainty through probability output. Specifically, OCE-TS begins by discretizing observed values into ordered intervals and deriving their probabilities via a parametric distribution as supervision signals. Using a simple linear model, we then predict probability distributions for each timestep. The OCE loss is computed between the cumulative distributions of predicted and ground-truth probabilities, explicitly preserving ordinal relationships among forecasted values. Through theoretical analysis using influence functions, we establish that cross-entropy (CE) loss exhibits superior stability and outlier robustness compared to MSE loss. Empirically, we compared OCE-TS with five baseline models—Autoformer, DLinear, iTransformer, TimeXer, and TimeBridge—on seven public time series datasets. Using MSE and Mean Absolute Error (MAE) as evaluation metrics, the results demonstrate that OCE-TS consistently outperforms benchmark models.
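The cumulative-distribution form of the loss can be illustrated in a few lines. This is a minimal NumPy sketch, assuming OCE takes the common ordinal form of binary cross-entropy between the target and predicted CDFs at each bin threshold; the paper's exact discretization and parametric supervision scheme are not reproduced here.

```python
import numpy as np

def ordinal_cross_entropy(p_true, p_pred, eps=1e-12):
    """Binary cross-entropy between the cumulative distributions of the
    target and predicted bin probabilities, summed over the K-1 thresholds."""
    F_true = np.cumsum(p_true)[:-1]                    # target CDF at thresholds
    F_pred = np.clip(np.cumsum(p_pred)[:-1], eps, 1 - eps)
    return -np.sum(F_true * np.log(F_pred) + (1 - F_true) * np.log(1 - F_pred))

# Target mass concentrated in bin 2 of 5; two candidate predictions.
target = np.array([0.0, 0.1, 0.8, 0.1, 0.0])
near   = np.array([0.0, 0.2, 0.6, 0.2, 0.0])   # mass near the true bin
far    = np.array([0.6, 0.2, 0.0, 0.2, 0.0])   # mass in a distant bin

assert ordinal_cross_entropy(target, near) < ordinal_cross_entropy(target, far)
```

Because the loss compares CDFs rather than per-bin probabilities, a prediction whose mass sits near the true interval is penalized less than one concentrated in a distant interval, which is what preserves the ordinal structure.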

AAAI Conference 2026 Conference Paper

RI-Loss: A Learnable Residual-Informed Loss for Time Series Forecasting

  • Jieting Wang
  • Xiaolei Shang
  • Feijiang Li
  • Furong Peng

Time series forecasting relies on predicting future values from historical data, yet most state-of-the-art approaches—including transformer and multilayer perceptron-based models—optimize using Mean Squared Error (MSE), which has two fundamental weaknesses: its point-wise error computation fails to capture temporal relationships, and it does not account for inherent noise in the data. To overcome these limitations, we introduce the Residual-Informed Loss (RI-Loss), a novel objective function based on the Hilbert-Schmidt Independence Criterion (HSIC). RI-Loss explicitly models noise structure by enforcing dependence between the residual sequence and a random time series, enabling more robust, noise-aware representations. Theoretically, we derive the first non-asymptotic HSIC bound with explicit double-sample complexity terms, achieving optimal convergence rates through Bernstein-type concentration inequalities and Rademacher complexity analysis. This provides rigorous guarantees for RI-Loss optimization while precisely quantifying kernel space interactions. Empirically, experiments across eight real-world benchmarks and five leading forecasting models demonstrate improvements in predictive performance, validating the effectiveness of our approach.
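RI-Loss is built on HSIC, whose empirical estimator is short enough to sketch. Below is the standard biased estimator with Gaussian kernels; how the paper pairs the residual sequence with a random time series is not reproduced, and `sigma` is an illustrative bandwidth choice.

```python
import numpy as np

def rbf_gram(x, sigma=1.0):
    """Gaussian (RBF) kernel Gram matrix of a 1-D sample."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n-1)^2 with centering H."""
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(rbf_gram(x, sigma) @ H @ rbf_gram(y, sigma) @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=200)
# A nonlinear dependence (y = x^2) scores well above an independent draw.
assert hsic(x, x**2) > hsic(x, rng.normal(size=200))
```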

AAAI Conference 2025 Conference Paper

k-HyperEdge Medoids for Clustering Ensemble

  • Feijiang Li
  • Jieting Wang
  • Liuya Zhang
  • Yuhua Qian
  • Shuai Jin
  • Tao Yan
  • Liang Du

Clustering ensemble has been a popular research topic in data science due to its ability to improve the robustness of single clustering methods. Many clustering ensemble methods have been proposed, most of which can be categorized into clustering-view and sample-view methods. The clustering-view method is generally efficient, but it can be affected by unreliability in the base clustering results. The sample-view method shows good performance, but constructing the pairwise sample relation is time-consuming. In this paper, the clustering ensemble is formulated as a k-HyperEdge Medoids discovery problem, and a clustering ensemble method based on k-HyperEdge Medoids is proposed that combines the strengths of the above two types of methods. In this method, a set of hyperedges is first selected efficiently from the clustering view; the hyperedges are then diffused and adjusted from the sample view, guided by a hyperedge loss function, to construct an effective k-HyperEdge Medoid set. The loss function is reduced mainly by assigning samples to the hyperedge with the highest degree of belonging. Theoretical analyses show that the solution approximates the optimal one, that the assignment method gradually reduces the loss function, and that the estimation of the belonging degree is statistically reasonable. Experiments on artificial data illustrate the working mechanism of the proposed method. The convergence of the method is verified by experimental analysis on twenty data sets. The effectiveness and efficiency of the proposed method are also verified on these data, with nine representative clustering ensemble algorithms as reference.

ICML Conference 2025 Conference Paper

Stabilizing Sample Similarity in Representation via Mitigating Random Consistency

  • Jieting Wang
  • Zelong Zhang
  • Feijiang Li
  • Yuhua Qian
  • Xinyan Liang

Deep learning excels at capturing complex data representations, yet quantifying the discriminative quality of these representations remains challenging. While unsupervised metrics often assess pairwise sample similarity, classification tasks fundamentally require class-level discrimination. To bridge this gap, we propose a novel loss function that evaluates representation discriminability via the Euclidean distance between the learned similarity matrix and the true class adjacency matrix. We identify random consistency—an inherent bias in Euclidean distance metrics—as a key obstacle to reliable evaluation, affecting both fairness and discrimination. To address this, we derive the expected Euclidean distance under uniformly distributed label permutations and introduce its closed-form solution, the Pure Square Euclidean Distance (PSED), which provably eliminates random consistency. Theoretically, we demonstrate that PSED satisfies heterogeneity and unbiasedness guarantees, and establish its generalization bound via the exponential Orlicz norm, confirming its statistical learnability. Empirically, our method surpasses conventional loss functions across multiple benchmarks, achieving significant improvements in accuracy, $F_1$ score, and class-structure differentiation. (Code is published at https://github.com/FeijiangLi/ICML2025-PSED)
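The debiasing idea can be illustrated numerically. The sketch below is an assumption-laden analogue, not the paper's closed form: it estimates the permutation-expected squared distance by Monte Carlo sampling (the paper derives it analytically) and, by analogy with the authors' related PHSIC work, subtracts it from the raw squared distance.

```python
import numpy as np

def sq_dist(S, A):
    """Squared Euclidean distance between two matrices."""
    return float(np.sum((S - A) ** 2))

def permutation_mean(S, labels, n_perm=500, seed=0):
    """Monte Carlo stand-in for the closed-form expectation of the squared
    distance under uniformly permuted labels."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_perm):
        perm = rng.permutation(labels)
        A = (perm[:, None] == perm[None, :]).astype(float)
        total += sq_dist(S, A)
    return total / n_perm

def psed_estimate(S, labels, **kw):
    """Raw squared distance minus its permutation-expected value."""
    A = (labels[:, None] == labels[None, :]).astype(float)
    return sq_dist(S, A) - permutation_mean(S, labels, **kw)

labels = np.array([0, 0, 1, 1])
S = (labels[:, None] == labels[None, :]).astype(float)  # similarity matching classes
assert psed_estimate(S, labels) < 0   # below the random baseline
```

A similarity matrix that matches the class structure scores below the permutation baseline, so the corrected distance has a meaningful zero point rather than an arbitrary chance-level offset.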

IJCAI Conference 2024 Conference Paper

Deep Embedding Clustering Driven by Sample Stability

  • Zhanwen Cheng
  • Feijiang Li
  • Jieting Wang
  • Yuhua Qian

Deep clustering methods improve the performance of clustering tasks by jointly optimizing deep representation learning and clustering. While numerous deep clustering algorithms have been proposed, most of them rely on artificially constructed pseudo targets for performing clustering. This construction process requires some prior knowledge, and it is challenging to determine a suitable pseudo target for clustering. To address this issue, we propose a deep embedding clustering algorithm driven by sample stability (DECS), which eliminates the requirement of pseudo targets. Specifically, we start by constructing the initial feature space with an autoencoder and then learn the cluster-oriented embedding feature constrained by sample stability. The sample stability aims to explore the deterministic relationship between samples and all cluster centroids, pulling samples to their respective clusters and keeping them away from other clusters with high determinacy. We theoretically analyze the convergence of the loss via Lipschitz continuity, which verifies the validity of the model. The experimental results on five datasets illustrate that the proposed method achieves superior performance compared to state-of-the-art clustering approaches.

IJCAI Conference 2024 Conference Paper

PHSIC against Random Consistency and Its Application in Causal Inference

  • Jue Li
  • Yuhua Qian
  • Jieting Wang
  • Saixiong Liu

The Hilbert-Schmidt Independence Criterion (HSIC) based on kernel functions is capable of detecting nonlinear dependencies between variables, making it a common method for association relationship mining. However, in situations with small samples, high dimensions, or noisy data, it may generate spurious associations, causing two unrelated variables to receive non-negligible scores. To address this issue, we propose a novel criterion, named the Pure Hilbert-Schmidt Independence Criterion (PHSIC). PHSIC is obtained by subtracting the mean HSIC under random conditions from the original HSIC value. We demonstrate three significant advantages of PHSIC through theoretical analysis and simulation experiments: (1) PHSIC has a baseline of zero, enhancing the interpretability of HSIC. (2) Compared to HSIC, PHSIC exhibits lower bias. (3) PHSIC enables a fairer comparison across different samples and dimensions. To validate the effectiveness of PHSIC, we apply it to multiple causal inference tasks to measure the independence between cause and residual. Experimental results demonstrate that the causal model based on PHSIC performs well compared to other methods in scenarios with small sample sizes and noisy data, on both real and simulated datasets.
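The debiasing step the abstract describes can be sketched directly: estimate the mean HSIC under random permutations of one variable and subtract it. Gaussian kernels, the biased HSIC estimator, and the permutation count below are illustrative choices, not the paper's exact construction.

```python
import numpy as np

def rbf_gram(x, sigma=1.0):
    """Gaussian (RBF) kernel Gram matrix of a 1-D sample."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n-1)^2 with centering H."""
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(rbf_gram(x, sigma) @ H @ rbf_gram(y, sigma) @ H) / (n - 1) ** 2

def phsic(x, y, n_perm=200, seed=0, sigma=1.0):
    """Raw HSIC minus its mean over random permutations of y."""
    rng = np.random.default_rng(seed)
    null = np.mean([hsic(x, rng.permutation(y), sigma) for _ in range(n_perm)])
    return hsic(x, y, sigma) - null

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = rng.normal(size=100)                  # independent of x
assert abs(phsic(x, y)) < phsic(x, x**2)  # near-zero baseline vs. real dependence
```

Subtracting the permutation mean moves the score of independent variables toward zero, which is exactly the zero-baseline property the abstract highlights.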

AAAI Conference 2021 Conference Paper

GoT: a Growing Tree Model for Clustering Ensemble

  • Feijiang Li
  • Yuhua Qian
  • Jieting Wang

The clustering ensemble technique, which integrates multiple clustering results, can improve the accuracy and robustness of the final clustering. In many clustering ensemble algorithms, the co-association matrix (CA matrix), which reflects the frequency with which any two samples are partitioned into the same cluster, plays an important role. However, the CA matrix is generally highly sparse with low value density, which may limit the performance of algorithms based on it. To handle these issues, we propose a growing tree model (GoT). In this model, the CA matrix is first refined by a shortest-path technique to mitigate its sparsity. Then, a set of representative prototype examples is discovered. Finally, to handle the low value density of the CA matrix, the prototypes gradually connect to their neighborhoods, like a set of growing trees. The rationality of the discovered prototype examples is illustrated by theoretical and experimental analysis. The working mechanism of GoT is shown visually on synthetic data sets. Experimental analyses on eight UCI data sets and eight image data sets show that GoT outperforms nine representative clustering ensemble algorithms.
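The sparsity-mitigation step can be illustrated on a toy ensemble. The refinement below is a hedged stand-in for the paper's shortest-path technique: it runs Floyd-Warshall on distances D = -log(CA), so a zero CA entry can inherit a positive similarity through a chain of intermediate samples; the paper's exact edge weights may differ.

```python
import numpy as np

def co_association(partitions):
    """CA[i, j] = fraction of base clusterings placing samples i and j together."""
    partitions = np.asarray(partitions)
    m, n = partitions.shape
    CA = np.zeros((n, n))
    for p in partitions:
        CA += (p[:, None] == p[None, :])
    return CA / m

def refine_by_shortest_path(CA):
    """Floyd-Warshall on D = -log(CA), mapped back to similarities."""
    with np.errstate(divide="ignore"):
        D = -np.log(CA)          # CA == 0 -> inf (no direct edge)
    n = len(D)
    for k in range(n):
        D = np.minimum(D, D[:, k][:, None] + D[k, :][None, :])
    return np.exp(-D)

# Two base clusterings of four samples; samples 0 and 2 never co-cluster
# directly, but both co-cluster with sample 1.
parts = [[0, 0, 1, 1],
         [0, 1, 1, 2]]
CA = co_association(parts)
R = refine_by_shortest_path(CA)
assert CA[0, 2] == 0.0 and R[0, 2] > 0.0   # sparsity mitigated via sample 1
```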

AIJ Journal 2019 Journal Article

Clustering ensemble based on sample's stability

  • Feijiang Li
  • Yuhua Qian
  • Jieting Wang
  • Chuangyin Dang
  • Liping Jing

The objective of clustering ensemble is to find the underlying structure of data based on a set of clustering results. It has been observed that samples can change clusters across different clustering results. This change shows that samples may contribute differently to the detection of the underlying structure. However, existing clustering ensemble methods treat all samples equally. To tackle this deficiency, we introduce the stability of a sample to quantify its contribution and present a methodology to determine this stability. We propose two formulas consistent with this methodology to calculate a sample's stability. We then develop a clustering ensemble algorithm based on the sample's stability. With either formula, this algorithm divides a data set into two classes: the cluster core and the cluster halo. The proposed algorithm then discovers a clear structure using the samples in the cluster core and gradually assigns the samples in the cluster halo to that structure. Experiments on eight synthetic data sets illustrate how the proposed algorithm works. The algorithm statistically outperforms twelve state-of-the-art clustering ensemble algorithms on ten real data sets from UCI and six document data sets. An experimental analysis of an image segmentation case shows that the cluster cores discovered via stability are rational.