Author name cluster

Yumeng Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers

1 author row

JBHI Journal 2026 Journal Article

A VR-based Automated Strabismus Diagnosis System with Progressive Semi-Supervised Learning

Dehui Qiu
Bowei Ma
Ze Xiong
Yuhao Wang
Liguo Deng
Longfei Zhou
Xiaojie Cao
Weiwei Chen

Strabismus is a prevalent ocular disorder that can impair visual development and cause psychological issues if not diagnosed early. Conventional clinical diagnosis primarily relies on the prism cover test (PCT), which is subjective, requires patient cooperation, and lacks standardization. Recent advances in virtual reality (VR) and deep learning offer promising solutions for automated and standardized diagnosis. However, practical deployment faces three key challenges: realistic VR simulation of clinical exams, addressing image degradation (reflections/occlusions) with limited annotated data, and precise quantification of ocular deviations. In this study, we propose a novel VR-based automated strabismus diagnosis system by leveraging semi-supervised deep learning, and introduce a new clinical dataset, TongRenD. The framework incorporates five standardized clinical examination scenarios within a VR environment to ensure diagnostic consistency. We introduce ProgNet: an uncertainty-guided progressive semi-supervised segmentation network that integrates a Prototype-based Feature Representation Module (PFRM) to enhance robustness against visual noise and distortions under limited annotations. A dedicated 3D deviation estimation algorithm further enables accurate strabismus classification and angular measurement. Extensive experiments on the TongRenD and TEyeD datasets demonstrate that ProgNet outperforms state-of-the-art methods in segmentation accuracy. Clinical validation confirms that our system achieves high consistency with expert assessments, providing a standardized, non-invasive, and reliable solution for strabismus diagnosis.


AAAI Conference 2026 Conference Paper

ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs

Yin Xie
Kaicheng Yang
Peirou Liang
Xiang An
Yongle Zhao
Yumeng Wang
Ziyong Feng
Roy Miles

Large Multimodal Models (LMMs) often face a modality representation gap during pretraining: while language embeddings remain stable, visual representations are highly sensitive to contextual noise (e.g., background clutter). To address this issue, we introduce a visual comprehension stage, which we call ViCToR (Visual Comprehension via Token Reconstruction), a novel pretraining framework for LMMs. ViCToR employs a learnable visual token pool and utilizes the Hungarian matching algorithm to select semantically relevant tokens from this pool for visual token replacement. Furthermore, by integrating a visual token reconstruction loss with dense semantic supervision, ViCToR can learn tokens which retain high visual detail, thereby enhancing the large language model's (LLM's) understanding of visual information. After pretraining on 3 million publicly accessible images and captions, ViCToR achieves state-of-the-art results, improving over LLaVA-NeXT-8B by 10.4%, 3.2%, and 7.2% on the MMStar, SEEDI, and RealWorldQA benchmarks, respectively.
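As a rough illustration of the matching step mentioned in the abstract, the sketch below assigns each visual token to a distinct entry of a token pool so that total cosine similarity is maximized. This is the assignment problem that the Hungarian algorithm solves in polynomial time; for brevity the sketch uses exhaustive search instead, which gives the same answer on tiny inputs. The function names `cosine` and `match_tokens`, the use of cosine similarity as the matching score, and the toy vectors are all assumptions made for illustration, not details taken from the paper.

```python
from itertools import permutations
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_tokens(visual, pool):
    """Assign each visual token a distinct pool index (hypothetical helper).

    Maximizes the summed cosine similarity over all one-to-one assignments.
    Exhaustive search stands in for the Hungarian algorithm here, so this
    is only practical for very small inputs. Requires len(pool) >= len(visual).
    """
    sim = [[cosine(v, p) for p in pool] for v in visual]
    best, best_score = None, -math.inf
    for perm in permutations(range(len(pool)), len(visual)):
        score = sum(sim[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best, best_score = perm, score
    return list(best)


# Toy example: two visual tokens, three pool tokens (made-up values).
visual_tokens = [[1.0, 0.0], [0.0, 1.0]]
token_pool = [[0.0, 2.0], [3.0, 0.0], [1.0, 1.0]]
assignment = match_tokens(visual_tokens, token_pool)
```

In the toy example, the first visual token is paired with the pool entry pointing in the same direction (index 1) and the second with index 0. A practical implementation would likely replace the exhaustive loop with a polynomial-time assignment solver.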


IJCAI Conference 2025 Conference Paper

Leveraging Personalized PageRank and Higher-Order Topological Structures for Heterophily Mitigation in Graph Neural Networks

Yumeng Wang
Zengyi Wo
Wenjun Wang
Xingcheng Fu
Minglai Shao

Graph Neural Networks (GNNs) excel in node classification tasks but often assume homophily, where connected nodes share similar labels. This assumption does not hold in many real-world heterophilic graphs. Existing models for heterophilic graphs primarily rely on pairwise relationships, overlooking multi-scale information from higher-order structures. This leads to suboptimal performance, particularly under noise from conflicting class information across nodes. To address these challenges, we propose HPGNN, a novel model integrating Higher-order Personalized PageRank with Graph Neural Networks. HPGNN introduces an efficient high-order approximation of Personalized PageRank (PPR) to capture long-range and multi-scale node interactions. This approach reduces computational complexity and mitigates noise from surrounding information. By embedding higher-order structural information into convolutional networks, HPGNN effectively models key interactions across diverse graph dimensions. Extensive experiments on benchmark datasets demonstrate HPGNN’s effectiveness. The model achieves better performance than five out of seven state-of-the-art methods on heterophilic graphs in downstream tasks while maintaining competitive performance on homophilic graphs. HPGNN’s ability to balance multi-scale information and robustness to noise makes it a versatile solution for real-world graph learning challenges. Code is available at https://github.com/streetcorner/HPGNN.
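For readers unfamiliar with Personalized PageRank, the following minimal sketch computes standard PPR scores by power iteration on a small adjacency list. It is not the higher-order approximation the abstract describes; it only shows the basic quantity that approximation targets. The function name `personalized_pagerank`, the dictionary graph format, the restart probability of 0.15, and the toy path graph are assumptions chosen for illustration.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=100):
    """Standard PPR by power iteration (illustrative sketch).

    adj   : dict mapping each node to a list of its neighbours
    seed  : node the random walk restarts from
    alpha : restart (teleport) probability, an assumed default
    iters : number of power-iteration steps
    """
    nodes = list(adj)
    scores = {v: 0.0 for v in nodes}
    scores[seed] = 1.0  # all probability mass starts at the seed
    for _ in range(iters):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            nbrs = adj[u]
            if not nbrs:
                # Dangling node: send its walking mass back to the seed.
                nxt[seed] += (1 - alpha) * scores[u]
                continue
            share = (1 - alpha) * scores[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        nxt[seed] += alpha  # restart mass always returns to the seed
        scores = nxt
    return scores


# Toy undirected path graph 0 - 1 - 2 - 3 (made-up example).
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
ppr = personalized_pagerank(path_graph, seed=0)
```

The scores sum to one, and nodes near the seed receive more mass than the far end of the path, which is the locality property that makes PPR useful for weighting long-range neighbours.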
