Author name cluster

Jianan Wei

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers

2 author rows

EAAI Journal 2026 Journal Article

Few-shot learning perfected: The efficacy and simplicity of Meta-baseline++

Lianyang Zhou
Haisong Huang
Jianan Wei

As one of the most extensively researched few-shot methods, meta-learning has garnered significant attention. However, increased algorithmic complexity does not necessarily translate into commensurate improvements in few-shot accuracy. Moreover, these methods tend to involve complex modules that can compromise the embedding features learned by the backbone model. This paper introduces Meta-Baseline++ to address these limitations, enhancing the original Meta-Baseline model. Unlike previous methods that rely on attention mechanisms, Meta-Baseline++ avoids adding complex modules, thereby preserving the original features while enhancing the extracted basic ones. It also rectifies the loss of semantic features during feature extraction, a common problem in the original Meta-Baseline model, which employs a simple averaging approach. To facilitate more effective learning of sample features and better model parameter updates during training, we introduce an anchor-based classification loss, $L_{\text{anchor}}$. We evaluate our proposed method on the miniImageNet, tieredImageNet, CUB-200-2011, and CIFAR-FS datasets and compare our results with those of the previous Meta-Baseline model. Our new baseline model achieves accuracy improvements of 2.26% and 1.58% on miniImageNet and tieredImageNet, respectively. Moreover, our model outperforms some complex meta-learning algorithms, achieving 80.56% and 70.43% accuracy on CUB-200-2011 and CIFAR-FS, respectively. Our findings set a new benchmark for few-shot baseline models and prompt a re-evaluation of some methods in few-shot learning.


ICLR Conference 2025 Conference Paper

Learning Clustering-based Prototypes for Compositional Zero-Shot Learning

Hongyu Qu
Jianan Wei
Xiangbo Shu
Wenguan Wang

Learning primitive (i.e., attribute and object) concepts from seen compositions is the primary challenge of Compositional Zero-Shot Learning (CZSL). Existing CZSL solutions typically rely on oversimplified data assumptions, e.g., modeling each primitive with a single centroid primitive representation, ignoring the natural diversities of the attribute (resp. object) when coupled with different objects (resp. attribute). In this work, we develop ClusPro, a robust clustering-based prototype mining framework for CZSL that defines the conceptual boundaries of primitives through a set of diversified prototypes. Specifically, ClusPro conducts within-primitive clustering on the embedding space for automatically discovering and dynamically updating prototypes. To learn high-quality embeddings for discriminative prototype construction, ClusPro repaints a well-structured and independent primitive embedding space, ensuring intra-primitive separation and inter-primitive decorrelation through prototype-based contrastive learning and decorrelation learning. Moreover, ClusPro effectively performs prototype clustering in a non-parametric fashion without the introduction of additional learnable parameters or computational budget during testing. Experiments on three benchmarks demonstrate ClusPro outperforms various top-leading CZSL solutions under both closed-world and open-world settings. Our code is available at CLUSPRO.


NeurIPS Conference 2025 Conference Paper

Learning Human-Object Interaction as Groups

Jiajun Hong
Jianan Wei
Wenguan Wang

Human-Object Interaction Detection (HOI-DET) aims to localize human-object pairs and identify their interactive relationships. To aggregate contextual cues, existing methods typically propagate information across all detected entities via self-attention mechanisms, or establish message passing between humans and objects with bipartite graphs. However, they primarily focus on pairwise relationships, overlooking that interactions in real-world scenarios often emerge from collective behaviors ($\textit{i.e.}$, multiple humans and objects engaging in joint activities). In light of this, we revisit relation modeling from a $\textit{group}$ view and propose GroupHOI, a framework that propagates contextual information in terms of $\textit{geometric proximity}$ and $\textit{semantic similarity}$. To exploit the geometric proximity, humans and objects are grouped into distinct clusters using a learnable proximity estimator based on spatial features derived from bounding boxes. In each group, a soft correspondence is computed via self-attention to aggregate and dispatch contextual cues. To incorporate the semantic similarity, we enhance the vanilla transformer-based interaction decoder with local contextual cues from HO-pair features. Extensive experiments on HICO-DET and V-COCO benchmarks demonstrate the superiority of GroupHOI over the state-of-the-art methods. It also exhibits leading performance on the more challenging Nonverbal Interaction Detection (NVI-DET) task, which involves varied forms of higher-order interactions within groups.


NeurIPS Conference 2025 Conference Paper

OmniGaze: Reward-inspired Generalizable Gaze Estimation in the Wild

Hongyu Qu
Jianan Wei
Xiangbo Shu
Yazhou Yao
Wenguan Wang
Jinhui Tang

Current 3D gaze estimation methods struggle to generalize across diverse data domains, primarily due to $\textbf{i)}$ $\textit{the scarcity of annotated datasets}$, and $\textbf{ii)}$ $\textit{the insufficient diversity of labeled data}$. In this work, we present OmniGaze, a semi-supervised framework for 3D gaze estimation, which utilizes large-scale unlabeled data collected from diverse and unconstrained real-world environments to mitigate domain bias and generalize gaze estimation in the wild. First, we build a diverse collection of unlabeled facial images, varying in facial appearances, background environments, illumination conditions, head poses, and eye occlusions. In order to leverage unlabeled data spanning a broader distribution, OmniGaze adopts a standard pseudo-labeling strategy and devises a reward model to assess the reliability of pseudo labels. Beyond pseudo labels as 3D direction vectors, the reward model also incorporates visual embeddings extracted by an off-the-shelf visual encoder and semantic cues from gaze perspective generated by prompting a Multimodal Large Language Model to compute confidence scores. Then, these scores are utilized to select high-quality pseudo labels and weight them for loss computation. Extensive experiments demonstrate that OmniGaze achieves state-of-the-art performance on five datasets under both in-domain and cross-domain settings. Furthermore, we also evaluate the efficacy of OmniGaze as a scalable data engine for gaze estimation, which exhibits robust zero-shot generalization on four unseen datasets.


NeurIPS Conference 2023 Conference Paper

Neural-Logic Human-Object Interaction Detection

Liulei Li
Jianan Wei
Wenguan Wang
Yi Yang

The interaction decoder utilized in prevalent Transformer-based HOI detectors typically accepts pre-composed human-object pairs as inputs. Though achieving remarkable performance, such a paradigm lacks feasibility and cannot explore novel combinations over entities during decoding. We present LogicHOI, a new HOI detector that leverages neural-logic reasoning and Transformer to infer feasible interactions between entities. Specifically, we modify the self-attention mechanism in the vanilla Transformer, enabling it to reason over the ⟨human, action, object⟩ triplet and constitute novel interactions. Meanwhile, such a reasoning process is guided by two crucial properties for understanding HOI: affordances (the potential actions an object can facilitate) and proxemics (the spatial relations between humans and objects). We formulate these two properties in first-order logic and ground them into continuous space to constrain the learning process of our approach, leading to improved performance and zero-shot generalization capabilities. We evaluate LogicHOI on V-COCO and HICO-DET under both normal and zero-shot setups, achieving significant improvements over existing methods.


EAAI Journal 2023 Journal Article

Review of resampling techniques for the treatment of imbalanced industrial data classification in equipment condition monitoring

Yage Yuan
Jianan Wei
Haisong Huang
Weidong Jiao
Jiaxin Wang
Hualin Chen

In an actual industrial scenario, machines typically operate normally for the majority of the time, with malfunctions occurring only occasionally. As a result, there is very little recorded data on defects. Consequently, the fault diagnostic dataset becomes imbalanced, with a significantly lower number of fault samples compared to normal samples. Furthermore, with the rapid development of the manufacturing industry, the increasing complexity of machines and equipment leads to various challenges in collecting fault data, such as noise, within-class imbalance, multi-class imbalance, and time series imbalance. It is worth noting that this study is the first to comprehensively summarize these four specific challenges. Therefore, addressing these issues has become a critical research focus and a pain point in the field of fault diagnosis, and numerous solutions have emerged. This study provides a comprehensive overview of these solutions at three levels: data preprocessing, feature extraction, and classifier improvement. It also describes the applications of imbalanced data classification methods, including pure resampling techniques, as well as sampling techniques that combine resampling algorithms with feature extraction and classifier improvement in industrial scenarios. Finally, we summarize the challenges facing imbalanced data classification research and suggest potential directions for future studies.


EAAI Journal 2020 Journal Article

New imbalanced fault diagnosis framework based on Cluster-MWMOTE and MFO-optimized LS-SVM using limited and complex bearing data

Jianan Wei
Haisong Huang
Liguo Yao
Yao Hu
Qingsong Fan
Dong Huang

Due to the complexity of their working conditions, historical rolling bearing datasets are mostly limited and imbalanced. The fault data may be composed of multiple subclusters; that is, the historical rolling bearing data have both between-class and within-class imbalances. While support vector machines (e.g., least squares support vector machines (LS-SVMs)) offer advantages when dealing with limited data, traditional fault diagnosis using an LS-SVM tends to fail on complex imbalanced data and depends heavily on the classifier hyperparameters. Therefore, this paper presents a new imbalanced fault diagnosis framework based on a cluster-majority weighted minority oversampling technique (Cluster-MWMOTE) and a moth-flame optimization (MFO)-based LS-SVM classifier. As an extension of MWMOTE, our proposed Cluster-MWMOTE combines a clustering algorithm, represented by agglomerative hierarchical clustering (AHC), with MWMOTE. Unlike MWMOTE, Cluster-MWMOTE avoids ignoring small subclusters of faulty (minority) instances far from normal (majority) instances. That is, Cluster-MWMOTE further improves the adaptation to within-class imbalances. As a novel heuristic intelligent algorithm, MFO exhibits faster convergence and higher precision than traditional optimization algorithms (e.g., particle swarm optimization (PSO) and genetic algorithm (GA)). Therefore, we utilize MFO to optimize the hyperparameters (σ and γ) of the LS-SVM classifier for the first time. The fault diagnosis results on the CWRU and IMS bearing data suggest that the proposed framework provides higher fault diagnosis recognition rates and algorithm robustness than 16 existing algorithms.
