Author name cluster

Ping Zhong

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers

1 author row

AAAI Conference 2026 Conference Paper

Perspective from a Broader Context: Can Room Style Knowledge Help Visual Floorplan Localization?

Bolei Chen
Shengsheng Yan
Yongzheng Cui
Jiaxu Kang
Ping Zhong
Jianxin Wang

Since a building's floorplan remains consistent over time and is inherently robust to changes in visual appearance, visual Floorplan Localization (FLoc) has received increasing attention from researchers. However, as a compact and minimalist representation of the building's layout, a floorplan contains many repetitive structures (e.g., hallways and corners), which easily leads to ambiguous localization. Existing methods either pin their hopes on matching 2D structural cues in floorplans or rely on 3D geometry-constrained visual pre-training, ignoring the richer contextual information provided by visual images. In this paper, we suggest using broader visual scene context to empower FLoc algorithms with scene layout priors to eliminate localization uncertainty. In particular, we propose an unsupervised learning technique with clustering constraints to pre-train a room discriminator on self-collected unlabeled room images. Such a discriminator can empirically extract the hidden room type of the observed image and distinguish it from other room types. By injecting the scene context information summarized by the discriminator into an FLoc algorithm, the room style knowledge is effectively exploited to guide definite visual FLoc. We conducted extensive comparative studies on two standard visual FLoc benchmarks. Our experiments show that our approach outperforms state-of-the-art methods and achieves significant improvements in robustness and accuracy.

AAAI Conference 2026 Conference Paper

Towards Ultrasound-based Reliable Disease Diagnosis Using Causal Inference

Bolei Chen
Jiaxu Kang
Haonan Yang
Ping Zhong
Yixiong Liang
Rui Fan
Jianxin Wang

Aligning the decision-making process of deep learning models with that of experienced sonographers is essential for reliable ultrasound-based disease diagnosis. Although existing methods have made significant progress in this respect, their alignments are primarily associational rather than causal, leading to pseudo-correlations between features and diagnostic results. Such a biased diagnosis blindly models the sonographer's diagnostic skills and attention to specific patterns, which we argue hardly produces an AI diagnoser that is comparable to human experts. To address this issue, we propose a causality-based diagnostic framework to align the model's diagnostic behaviors with those of experts. Specifically, by delving into both conspicuous and inconspicuous confounders within the ultrasound images, the back-door and front-door adjustment causal learning modules are proposed to promote unbiased learning by mitigating potential pseudo-correlations. In addition, we integrate causal inference into a well-designed dual-branch model with feature interaction bridges for compatibility with multimodal ultrasound inputs. To fully evaluate our method, we conduct comparative studies on different diseases and ultrasound modalities. In particular, we publish a carefully constructed multimodal ultrasound dataset for breast lesion diagnosis and segmentation. Extensive comparative and ablation studies on this dataset show that our method outperforms state-of-the-art methods.

AAAI Conference 2025 Conference Paper

Dive into Aerial Remote Sensing Underwater Depth Estimation with Hyperspectral Imagery

Jiahao Qi
Xingyue Liu
Chen Chen
Dehui Zhu
Kangcheng Bin
Ping Zhong

Visible spectrum images capture limited information from just three discrete bands, often resulting in suboptimal performance in underwater depth estimation (UDE) due to significant information loss from water absorption. In contrast, hyperspectral images (HSIs), which include hundreds of continuous bands, provide abundant spectral information that offers greater resilience against the adverse effects of water absorption. In this paper, we conduct a comprehensive study to investigate how spectral information can enhance remote sensing UDE through two key aspects: the benchmark dataset and the general framework. For the benchmark dataset, we construct a real-world hyperspectral UDE (HUDE) dataset, ATR-HUDE, comprising approximately 500 synchronized hyperspectral and LiDAR data pairs collected from diverse coastal scenes and flight altitudes. Regarding the general framework, we integrate recent advances in state space models and physical imaging models to design a novel HUDE framework named HUDEMamba that estimates underwater depth using both model-driven and data-driven approaches. Experimental results on the constructed benchmark dataset validate the potential of HUDE and the effectiveness of HUDEMamba.

EAAI Journal 2024 Journal Article

SemNav-HRO: A target-driven semantic navigation strategy with human–robot–object ternary fusion

Bolei Chen
Siyi Lu
Ping Zhong
Yongzheng Cui
Yixiong Liang
Jianxin Wang

Target-Driven Semantic Navigation (TDSN) shows great potential to be applied in intelligent domestic assistants supporting humans with daily activities. Although numerous methods have been explored to achieve efficient static TDSN, socially aware TDSN in dynamic and crowded scenarios remains challenging and has not been adequately investigated. The main challenges come from the complex human–robot interaction mechanisms and the semantic relation exploitation, which requires the robot to understand the surroundings and perform foresighted behaviors. In this paper, a TDSN strategy named SemNav-HRO is proposed by considering Human–Robot–Object (HRO) ternary feature fusion. Specifically, a Deep Reinforcement Learning (DRL) based Dual-Channel Value Estimation Network (DCVEN) is first proposed by integrating multi-granularity map features and social awareness to learn crowded TDSN strategies. Meanwhile, the tricky and socially aware TDSN problem is slackened by eliminating the dependence on costly features (e.g., pedestrian speed) and introducing a pedestrian trajectory predictor. For the learning and evaluation of crowded TDSN strategies, a novel and semantic-rich simulator with complex layouts is constructed based on realistic domestic scenes, instead of employing the previous naive simulation settings. Experimental results show that our method relatively improves the navigation success rates by 12.8% to 25.5% and 14.6% to 19.2% compared to the baselines on the MP3D and Gibson datasets, respectively. Furthermore, we experimentally verify the promising generalization and interpretability of our method.

TIST Journal 2023 Journal Article

STExplorer: A Hierarchical Autonomous Exploration Strategy with Spatio-temporal Awareness for Aerial Robots

Bolei Chen
Yongzheng Cui
Ping Zhong
Wang Yang
Yixiong Liang
Jianxin Wang

The autonomous exploration task we consider requires Unmanned Aerial Vehicles (UAVs) to actively navigate through unknown environments with the goal of fully perceiving and mapping the environments. Some existing exploration strategies suffer from rough cost budgets, ambiguous Information Gain (IG), and unnecessary backtracking exploration caused by Fragmented Regions (FRs). In our work, a hierarchical spatio-temporal-aware exploration framework is proposed to alleviate these problems. At the local exploration level, the Asymmetrical Traveling Salesman Problem (ATSP) is solved by comprehensively considering exploration time, IG, and heading consistency to avoid blindly exploring. Specifically, the exploration time is reasonably budgeted by fast marching in an artificial potential field. Meanwhile, a transformer-based map occupancy predictor is designed to assist in IG calculation by imagining spatial clues out of the Field of View (FoV), facilitating the prescient exploration. We verify that our local exploration is effective in alleviating the unnecessary back-and-forth movements caused by FRs and the interference of potential obstacle occlusion on the IG calculation. At the global exploration level, the classical Next Best View Points (NBVP) are generalized to Next Best Sub-Regions (NBSR) to choose informative sub-regions for further forward-looking exploration based on a well-designed utility function. Safe flight paths and dynamically feasible trajectories are reasonably generated throughout the exploration process by fast marching and B-spline curve optimization. Comparative simulations and benchmark tests demonstrate that our proposed exploration strategy is quite competitive in terms of exploration path length, total exploration time, and exploration ratio.

EAAI Journal 2022 Journal Article

Block-based multi-view classification via view-based L2,p sparse representation and adaptive view fusion

Zhi Wang
Qiang Lin
Yingyi Chen
Ping Zhong

To date, many efficient classification methods have been presented by utilizing the multi-view data’s rich information. Nevertheless, they commonly construct models by concatenating entire views into high-dimensional vectors while ignoring the individuality and relationship of views. Also, they often use fixed labels to perform classification, ignoring the requirement of a large margin between distinct classes. To address the above problems, we propose a new block-based multi-view classification model via view-based L2,p sparse representation and adaptive view fusion. Specifically, the model establishes the L2,p regularization in each view space to excavate the individuality information of views. Meanwhile, a newly proposed shared loss term across views is combined in the model to learn the complementarity and consistency information of views. The adaptive weighting is introduced to measure the contribution of distinct views while performing adaptive view fusion. The model also adopts slack labels to increase the distance between distinct classes. Furthermore, an Alternating Direction Method of Multipliers (ADMM) based algorithm is designed to solve the model rapidly through block-wise computation. A rigorous theoretical proof of its convergence is also provided. Extensive experiments demonstrate that the proposed method achieves superior performance.
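
For readers unfamiliar with the L2,p penalty named in this abstract, the sketch below computes one common definition of the mixed norm of a matrix: take the Euclidean norm of each row, then the p-norm of those row norms. This is an illustrative assumption about the general form only; the function name `l2p_norm` and the default `p` are hypothetical, and the paper's exact regularizer (for example, whether it uses rows or columns, or the p-th power without the outer root) is not stated in the abstract.

```python
import math


def l2p_norm(rows, p=0.5):
    """Mixed L2,p norm: (sum_i ||row_i||_2 ** p) ** (1 / p).

    `rows` is a list of equal-length lists of numbers (a matrix given
    row by row). Values of p at or below 1 encourage whole rows to be zero.
    """
    # Euclidean (L2) norm of each row.
    row_norms = [math.sqrt(sum(x * x for x in row)) for row in rows]
    # p-norm taken over the vector of row norms.
    return sum(n ** p for n in row_norms) ** (1.0 / p)
```

With `p=1` this reduces to the familiar L2,1 norm, the sum of the row norms.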

IS Journal 2021 Journal Article

Attention-Enhanced Gradual Machine Learning for Entity Resolution

Ping Zhong
Zhanhuai Li
Qun Chen
Boyi Hou

Recent work has shown that entity resolution (ER) can be effectively performed by gradual machine learning (GML). GML begins with some automatically labeled easy instances and, then, gradually labels more challenging instances by iterative factor graph inference without human intervention. In GML, shared features serve as the medium for knowledge conveyance between easy instances and more challenging ones. The existing GML solution supposes that features play independent roles in gradual inference. However, in real scenarios, this assumption may be untenable since features are usually correlated with each other. To address this limitation, this article proposes an attention-enhanced approach to improve the accuracy of gradual inference. We first propose a method of spectral feature representation to map correlated features to close points in the same vector space and, then, present a model of attention neural network to learn the decisive features, given arbitrary combinations of features for improved feature weighting. Finally, our extensive experiments on real benchmark data have validated the efficacy of the proposed approach.

EAAI Journal 2021 Journal Article

Multi-task support vector machine with pinball loss

Yunhao Zhang
Jiajun Yu
Xinyi Dong
Ping Zhong

With the boom in machine learning, the support vector machine (SVM) is widely employed in pattern recognition. However, most SVM models concentrate on single-task learning, and multi-task learning has been largely neglected. Compared with single-task learning, multi-task learning can achieve good performance for each task by mining the shared information among tasks. In addition, the loss function also plays an important role in the accuracy of SVM. Inspired by multi-task learning and the SVM with pinball loss (pin-SVM), we propose two novel multi-task support vector machines with pinball loss for binary classification, named MTL-pin-SVM I and MTL-pin-SVM II. Both methods maximize the quantile distance for each task, which makes them less sensitive to noise and more stable under re-sampling. Moreover, MTL-pin-SVM II can use different combinations of kernel functions for different tasks, and can achieve better performance than other multi-task models when suitable combinations are chosen. Both methods also include the multi-task SVM with hinge loss as special cases, denoted MTL-C-SVM I and MTL-C-SVM II. Extensive experiments on multi-task datasets validate the effectiveness of the proposed models.
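
As background for the loss named in this abstract, the sketch below shows the pinball loss as it is usually defined for pin-SVM on the margin residual u = 1 - y·f(x): the loss is u when u is non-negative and -τu otherwise, so correctly classified points far from the boundary are also penalized. It is a minimal single-task illustration, not the MTL-pin-SVM formulation from the paper; the function names and the parameter `tau` are assumptions made for this example.

```python
def pinball_loss(y, score, tau=0.5):
    """Pinball loss on the margin residual u = 1 - y * score.

    y is the label (+1 or -1), score is the classifier output f(x),
    and tau in [0, 1] controls the penalty on points beyond the margin.
    """
    u = 1.0 - y * score
    return u if u >= 0 else -tau * u


def hinge_loss(y, score):
    """Standard hinge loss, which the pinball loss reduces to at tau = 0."""
    return max(0.0, 1.0 - y * score)
```

Setting `tau=0` removes the penalty on points beyond the margin, which is consistent with the abstract's remark that the hinge-loss models arise as special cases.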
