Arrow Research search

Author name cluster

Weiwei Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
1 author row

Possible papers

14

AAAI Conference 2026 Conference Paper

Learning Label Distribution with Dirichlet Process Mixture Model

  • Minglong Wang
  • Weiwei Li
  • Yunan Lu
  • Xiuyi Jia

Label Distribution Learning (LDL) is an effective machine learning paradigm for addressing label ambiguity, where each sample is annotated with a distribution that conveys rich semantic information. However, during the actual annotation process of label distributions, annotators often exhibit divergent labeling preferences for the same sample. Most existing LDL methods overlook this heterogeneity, assuming that the observed label distribution originates from a single labeling pattern. Such an assumption limits their capacity to manage inter-annotator disagreement and constrains the generalization of the resulting models. To address this issue, we propose, for the first time, a Dirichlet process mixture model (DPMM)-based framework for LDL. This framework leverages nonparametric Bayesian methods to adaptively uncover diverse latent labeling patterns from the data and to accurately model annotator heterogeneity. Specifically, the ground-truth label distribution of each sample is modeled as a weighted mixture of multiple latent components, where a feature-conditioned gating mechanism adaptively controls the contribution of each component. Experimental results demonstrate that the proposed model consistently achieves competitive performance on several widely-used benchmark datasets.
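The weighted-mixture idea in the abstract can be sketched numerically. The snippet below is a minimal illustrative sketch, not the authors' implementation: `mixture_label_distribution`, `W_gate`, and the random toy components are all hypothetical names, the gating is a plain softmax over features standing in for the paper's feature-conditioned gating mechanism, and the number of components is fixed here, whereas the DPMM infers it nonparametrically.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mixture_label_distribution(x, W_gate, components):
    """Combine K latent label-distribution components with
    feature-conditioned gating weights (illustrative only)."""
    gates = softmax(x @ W_gate)                     # shape (K,): weight per component
    return gates @ components                       # weighted mixture over labels

rng = np.random.default_rng(0)
K, d, L = 3, 5, 4                                   # components, feature dim, labels
x = rng.normal(size=d)                              # a toy feature vector
W_gate = rng.normal(size=(d, K))                    # hypothetical gating parameters
components = rng.dirichlet(np.ones(L), size=K)      # K latent label distributions
y = mixture_label_distribution(x, W_gate, components)
assert np.isclose(y.sum(), 1.0)                     # result is a valid label distribution
```

Because the gates and each component both sum to one, the mixture is itself a valid label distribution for the instance.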

EAAI Journal 2025 Journal Article

A credibility and consistency-oriented stochastic aggregation framework for heterogeneous multi-attribute large-scale group decision making with several attribute sets

  • Weiwei Li
  • Pingtao Yi
  • Danning Zhang

In large-scale group decision making, experts usually provide individual preferences on attribute values as well as on attribute sets. The coexistence of large-scale heterogeneous information poses a significant challenge to the credibility and consistency of decision making. To address this issue, the paper proposes a credibility and consistency-oriented stochastic aggregation framework comprising four main contributions. First, a simplified transformation method is developed to convert heterogeneous attribute values into individual attribute superiority-probability-based pairwise comparison matrices (IA-SPMs), which reduces transformation cost and provides abundant references for credibility analysis. Second, deviation-based credibility measures and credibility-based weighting methods are proposed. Third, a cluster-based aggregation operator is introduced that considers group-consistent preferences on attribute selection to obtain the collective SPM (C-SPM). Fourth, the ranking probability matrix (RPM) and the possibility ranking result are computed from the C-SPM. Experimental and application analyses illustrate that the proposed methods enhance both the credibility and the stability of decision outcomes. This research provides technical support for effectively fusing heterogeneous information from multiple sources in the artificial intelligence (AI) era, with broad application potential in large-scale democratic decision making and the pre-evaluation of engineering projects.

AAAI Conference 2025 Conference Paper

Adaptive-Grained Label Distribution Learning

  • Yunan Lu
  • Weiwei Li
  • Dun Liu
  • Huaxiong Li
  • Xiuyi Jia

Label polysemy, where an instance can be associated with multiple labels, is common in real-world tasks. Label distribution learning (LDL) is an effective learning paradigm for handling label polysemy, where each instance is associated with a label distribution. Although numerous LDL algorithms have been proposed and achieve satisfactory performance on most existing datasets, they are typically trained directly on the collected label distributions, which often lack quality guarantees in real-world tasks due to annotator subjectivity and algorithmic assumptions. Consequently, direct learning from such uncertain label distributions can lead to unpredictable generalization performance. To address this problem, we propose an adaptive-grained label distribution learning framework whose main idea is to extract relatively reliable supervision information from unreliable label distributions. The label distribution learning task is thus decomposed into three subtasks: coarsening label distributions, learning coarse-grained labels, and refining coarse-grained labels. In this framework, we design an adaptive label coarsening algorithm to extract optimal coarse-grained labels and a label refining function to refine the coarse-grained labels into the final label distributions. Finally, we conduct extensive experiments on real-world datasets to demonstrate the advantages of our proposal.
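The coarsen-then-refine decomposition can be illustrated with a toy example. This is a simplified stand-in, not the paper's method: `coarsen` merely pools labels below a fixed threshold into one coarse group (the paper's coarsening is adaptive and learned), and `refine` redistributes the pooled mass according to a hypothetical prior.

```python
import numpy as np

def coarsen(dist, threshold=0.1):
    """Merge labels with low description degrees into one coarse group
    (a simplified stand-in for adaptive coarsening)."""
    fine = dist >= threshold
    coarse = np.append(dist[fine], dist[~fine].sum())  # kept labels + pooled rest
    return coarse, fine

def refine(coarse, fine_mask, prior):
    """Spread the pooled coarse mass back over its member labels in
    proportion to a prior (illustrative refining function)."""
    out = np.zeros_like(prior)
    out[fine_mask] = coarse[:-1]                       # kept labels pass through
    rest = ~fine_mask
    w = prior[rest] / prior[rest].sum()                # normalized prior weights
    out[rest] = coarse[-1] * w                         # redistribute pooled mass
    return out

d = np.array([0.55, 0.30, 0.08, 0.05, 0.02])           # a toy label distribution
c, mask = coarsen(d)
prior = np.full(5, 0.2)                                # hypothetical uniform prior
r = refine(c, mask, prior)
assert np.isclose(r.sum(), 1.0)                        # refined result stays a distribution
```

Coarsening discards unreliable fine-grained mass assignments while preserving total probability, so the refining step can recover a full distribution from more reliable coarse supervision.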

NeurIPS Conference 2025 Conference Paper

Towards a Pairwise Ranking Model with Orderliness and Monotonicity for Label Enhancement

  • Yunan Lu
  • Xixi Zhang
  • Yaojin Lin
  • Weiwei Li
  • Lei Yang
  • Xiuyi Jia

In recent years, label distributions have been applied to a diverse array of complex decision-making tasks. To address their limited availability, label enhancement has been established as an effective learning paradigm that aims to automatically infer label distributions from readily available multi-label data, e.g., logical labels. Recently, numerous works have demonstrated that label ranking is significantly beneficial to label enhancement. However, these works still exhibit deficiencies in representing the probabilistic relationships between label distributions and label rankings, or fail to accommodate scenarios where multiple labels are equally important for a given instance. Therefore, we propose PROM, a pairwise ranking model with orderliness and monotonicity, to explain the probabilistic relationship between label distributions and label rankings. Specifically, we propose monotonicity and orderliness assumptions for the probabilities of different ranking relationships and derive the mass functions for PROM, which are theoretically guaranteed to preserve monotonicity and orderliness. Further, we propose a generative label enhancement algorithm based on PROM, which directly learns a label distribution predictor from readily available multi-label data. Finally, extensive experiments demonstrate the efficacy of our proposed model.

AAAI Conference 2024 Conference Paper

Generative Calibration of Inaccurate Annotation for Label Distribution Learning

  • Liang He
  • Yunan Lu
  • Weiwei Li
  • Xiuyi Jia

Label distribution learning (LDL) is an effective learning paradigm for handling label ambiguity. Applying LDL typically requires datasets annotated with label distributions. However, obtaining supervised data for LDL is a challenging task. Due to the randomness of label annotation, annotators can produce inaccurate annotations for instances, affecting the accuracy and generalization ability of the LDL model. To address this problem, we propose a generative approach that calibrates inaccurate annotations for LDL using variational inference techniques. Specifically, we assume that instances with similar features share similar latent label distributions. The feature vectors and label distributions are generated by a Gaussian mixture and a Dirichlet mixture, respectively. The relationship between them is established through a shared categorical variable, which effectively utilizes the label distributions of instances with similar features and achieves a more accurate label distribution through the generative approach. Furthermore, we use a confusion matrix to model the factors that contribute to inaccuracy during the annotation process, capturing the relationship between ground-truth and inaccurate label distributions. Finally, the label distribution is used to calibrate the available information in the noisy dataset to obtain the ground-truth label distribution.

AAAI Conference 2023 Conference Paper

Generative Label Enhancement with Gaussian Mixture and Partial Ranking

  • Yunan Lu
  • Liang He
  • Fan Min
  • Weiwei Li
  • Xiuyi Jia

Label distribution learning (LDL) is an effective learning paradigm for dealing with label ambiguity. Applying LDL typically requires datasets annotated with label distributions (i.e., real-valued vectors akin to a probability distribution). Unfortunately, most existing datasets only contain logical labels, and manual annotation with label distributions is costly. To address this problem, we treat the label distribution as a latent vector and infer its posterior by variational Bayes. Specifically, we propose a generative label enhancement model to encode the process of generating feature vectors and logical label vectors from label distributions in a principled way. In terms of features, we assume that the feature vector is generated by a Gaussian mixture dominated by the label distribution, which captures the one-to-many relationship from the label distribution to the feature vector and thus reduces the feature generation error. In terms of logical labels, we design a probability distribution to generate the logical label vector from a label distribution, which captures partial label ranking in the logical label vector and thus provides more accurate guidance for inferring the label distribution. Besides, to approximate the posterior of the label distribution, we design an inference model and derive the variational learning objective. Finally, extensive experiments on real-world datasets validate our proposal.

IJCAI Conference 2023 Conference Paper

Label Enhancement via Joint Implicit Representation Clustering

  • Yunan Lu
  • Weiwei Li
  • Xiuyi Jia

Label distribution is an effective label form for portraying label polysemy (i.e., cases where an instance can be described by multiple labels simultaneously). However, the expensive cost of annotating label distributions limits their application to a wider range of practical tasks. Therefore, LE (label enhancement) techniques are extensively studied to solve this problem. Existing LE algorithms mostly estimate label distributions from the instance relation or the label relation. However, they suffer from biased instance relations, limited model capabilities, or suboptimal local label correlations. Therefore, in this paper, we propose a deep generative model called JRC to simultaneously learn and cluster the joint implicit representations of both features and labels, which can be used to improve any existing LE algorithm involving the instance relation or local label correlations. In addition, we develop a novel label distribution recovery module and integrate it with the JRC model, thus constituting a novel generative label enhancement model that utilizes the learned joint implicit representations and instance clusters in a principled way. Finally, extensive experiments validate our proposal.

JMLR Journal 2021 Journal Article

Subspace Clustering through Sub-Clusters

  • Weiwei Li
  • Jan Hannig
  • Sayan Mukherjee

The problem of dimension reduction is of increasing importance in modern data analysis. In this paper, we consider modeling the collection of points in a high dimensional space as a union of low dimensional subspaces. In particular we propose a highly scalable sampling based algorithm that clusters the entire data via first spectral clustering of a small random sample followed by classifying or labeling the remaining out-of-sample points. The key idea is that this random subset borrows information across the entire dataset and that the problem of clustering points can be replaced with the more efficient problem of "clustering sub-clusters". We provide theoretical guarantees for our procedure. The numerical results indicate that for large datasets the proposed algorithm outperforms other state-of-the-art subspace clustering algorithms with respect to accuracy and speed.
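The sample-then-extend skeleton described in the abstract can be sketched on toy data. This is only a structural illustration under simplifying assumptions: `sample_then_extend` is a hypothetical name, the in-sample step here is a tiny k-means rather than the paper's spectral clustering, and the out-of-sample step is nearest-centroid assignment rather than the paper's sub-cluster-based classification.

```python
import numpy as np

def sample_then_extend(X, n_sample, n_clusters, rng):
    """Cluster a small random sample, then label the remaining points
    by nearest sample-cluster centroid (simplified sketch of the
    sample-then-extend strategy)."""
    idx = rng.choice(len(X), size=n_sample, replace=False)
    sample = X[idx]
    # farthest-point seeding for a stable toy initialization
    centers = [sample[0]]
    for _ in range(n_clusters - 1):
        d = np.min([((sample - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(sample[d.argmax()])
    centers = np.array(centers)
    # a few Lloyd steps stand in for spectral clustering of the sample
    for _ in range(10):
        d = ((sample[:, None] - centers[None]) ** 2).sum(-1)
        assign = d.argmin(1)
        for k in range(n_clusters):
            if (assign == k).any():
                centers[k] = sample[assign == k].mean(0)
    # out-of-sample extension: label every point by its nearest centroid
    d_all = ((X[:, None] - centers[None]) ** 2).sum(-1)
    return d_all.argmin(1)

rng = np.random.default_rng(0)
# two well-separated toy blobs standing in for two subspaces
X = np.vstack([rng.normal(0, 0.1, (100, 2)), rng.normal(5, 0.1, (100, 2))])
labels = sample_then_extend(X, n_sample=20, n_clusters=2, rng=rng)
```

The scalability argument is visible even in this toy: the expensive clustering step touches only the 20 sampled points, while the remaining 180 are labeled by a cheap per-point assignment.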

IJCAI Conference 2020 Conference Paper

Privileged label enhancement with multi-label learning

  • Wenfang Zhu
  • Xiuyi Jia
  • Weiwei Li

Label distribution learning has attracted increasing attention owing to its more general ability to express label ambiguity. However, obtaining label distribution information for data is far more expensive than obtaining logical labels. Thus, label enhancement is proposed to recover label distributions from logical labels. In this paper, we propose a novel label enhancement method that uses privileged information. We first apply a multi-label learning model to implicitly capture the complex structural information between instances and generate the privileged information. Second, we adopt the LUPI (learning using privileged information) paradigm to utilize the privileged information and employ RSVM+ as the prediction model. Finally, comparison experiments on 12 datasets demonstrate that our proposal better fits the ground-truth label distributions.

IJCAI Conference 2020 Conference Paper

Tensor-based multi-view label enhancement for multi-label learning

  • Fangwen Zhang
  • Xiuyi Jia
  • Weiwei Li

Label enhancement (LE) is the procedure of recovering label distributions from the logical labels in multi-label data, with the goal of better representing and mining label ambiguity through the label distribution form. Existing LE work mainly concentrates on leveraging the topological information of the feature space and the correlations among labels, and is all based on single-view data. Given that multi-view data, which can provide richer semantic information from different perspectives, are common in real-world applications, this paper first presents a multi-view label enhancement problem and proposes a tensor-based multi-view label enhancement method, named TMV-LE. Firstly, we introduce tensor factorization to obtain a common subspace that contains the high-order relationships among different views. Secondly, we use the common representation together with the multiple views to jointly mine a more comprehensive topological structure in the dataset. Finally, the topological structure of the feature space is migrated to the label space to obtain the label distributions. Extensive comparative studies validate that multi-view multi-label learning performance can be improved significantly with TMV-LE.

IJCAI Conference 2019 Conference Paper

Label Distribution Learning with Label Correlations via Low-Rank Approximation

  • Tingting Ren
  • Xiuyi Jia
  • Weiwei Li
  • Shu Zhao

Label distribution learning (LDL) can be viewed as a generalization of multi-label learning. This paradigm focuses on the relative importance of different labels to a particular instance. Most previous LDL methods either ignore the correlation among labels or exploit label correlations only in a global way. In this paper, we utilize both global and local relevance among labels to provide more information for model training and propose a novel label distribution learning algorithm. In particular, a label correlation matrix based on low-rank approximation is applied to capture the global label correlations. In addition, label correlations among local samples are adopted to modify the label correlation matrix. Experimental results on real-world datasets show that the proposed algorithm outperforms state-of-the-art LDL methods.

IJCAI Conference 2019 Conference Paper

Label distribution learning with label-specific features

  • Tingting Ren
  • Xiuyi Jia
  • Weiwei Li
  • Lei Chen
  • Zechao Li

Label distribution learning (LDL) is a novel machine learning paradigm to deal with label ambiguity issues by placing more emphasis on how relevant each label is to a particular instance. Many LDL algorithms have been proposed and most of them concentrate on the learning models, while few of them focus on the feature selection problem. All existing LDL models are built on a simple feature space in which all features are shared by all the class labels. However, this kind of traditional data representation strategy tends to select features that are distinguishable for all labels, but ignores label-specific features that are pertinent and discriminative for each class label. In this paper, we propose a novel LDL algorithm by leveraging label-specific features. The common features for all labels and specific features for each label are simultaneously learned to enhance the LDL model. Moreover, we also exploit the label correlations in the proposed LDL model. The experimental results on several real-world data sets validate the effectiveness of our method.

AAAI Conference 2018 Conference Paper

Label Distribution Learning by Exploiting Label Correlations

  • Xiuyi Jia
  • Weiwei Li
  • Junyu Liu
  • Yu Zhang

Label distribution learning (LDL) is a recently emerged machine learning method that has been increasingly studied in recent years. In theory, LDL can be seen as a generalization of multi-label learning. Previous studies have shown that LDL is an effective approach to the label ambiguity problem. However, the dramatic increase in the number of possible label sets poses a performance challenge for LDL. In this paper, we propose a novel label distribution learning algorithm to address this issue. The key idea is to exploit correlations between different labels. We encode the label correlation into a distance that measures the similarity of any two labels. Moreover, we construct a distance-mapping function from the label set to the parameter matrix. Experimental results on eight real-world label distribution datasets demonstrate that the proposed algorithm performs remarkably better than both state-of-the-art LDL methods and multi-label learning methods.

AAAI Conference 2018 Conference Paper

Label Distribution Learning by Exploiting Sample Correlations Locally

  • Xiang Zheng
  • Xiuyi Jia
  • Weiwei Li

Label distribution learning (LDL) is a novel multi-label learning paradigm proposed in recent years for solving label ambiguity. Existing approaches typically exploit label correlations globally to improve the effectiveness of label distribution learning, by assuming that the label correlations are shared by all instances. However, different instances may share different label correlations, and few correlations are globally applicable in real-world applications. In this paper, we propose a new label distribution learning algorithm by exploiting sample correlations locally (LDL-SCL). To encode the influence of local samples, we design a local correlation vector for each instance based on the clustered local samples. Then we predict the label distribution for an unseen instance based on the original features and the local correlation vector simultaneously. Experimental results demonstrate that LDL-SCL can effectively deal with the label distribution problems and perform remarkably better than the state-of-the-art LDL methods.