Author name cluster

Haoming Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers

1 author row

AAAI Conference 2026 Conference Paper

A Geometric Perspective on Optimizing Vector Quantized Latent Diffusion Model for Image Restoration

Chen Hang
Haoming Chen
Xuwei Fang
Weisheng Xie
Xiangxiang Gao
Faming Fang
Guixu Zhang
Haichuan Song

In this paper, we investigate the limitations of the Vector Quantized Latent Diffusion Model (VQ-LDM) in restoration tasks. We identify a performance gap between the Vector Quantization (VQ) and Diffusion Model components, manifested as a significant discrepancy between the reconstruction quality of ground truth images processed via VQ autoregression and degraded images restored by VQ-LDM. Through experiments, we attribute this gap primarily to the lack of robustness in the mapped points of VQ within the original VQ-LDM framework. To address this issue, we propose a geometry-based optimization approach. First, we introduce a simple yet effective method, termed interpolation-based latent initial state optimization, which mitigates the performance gap by replacing the original mapped points with interpolated values and is supported by theoretical analysis. Here, the latent initial state refers specifically to the input of the diffusion model. Building upon this, we propose a Chebyshev center-based latent initial state optimization, an elegant theoretical solution from a geometric perspective that further enhances restoration performance. Our improvements consistently achieve superior results across nine benchmark datasets.
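For readers unfamiliar with the term, the Chebyshev center of a point set is the center of the smallest ball that encloses every point. The abstract does not say which set the paper takes the center of or how it computes it, so the following is only a generic sketch. It uses the Badoiu-Clarkson iteration (a swapped-in textbook approximation, not the paper's method), and the names `chebyshev_center` and `covering_radius` are hypothetical.

```python
import math


def chebyshev_center(points, iters=1000):
    """Approximate the center of the smallest ball enclosing `points`.

    Badoiu-Clarkson iteration: repeatedly step toward the farthest point
    with a shrinking step size 1 / (k + 1).
    """
    center = list(points[0])
    for k in range(1, iters + 1):
        farthest = max(points, key=lambda p: math.dist(p, center))
        center = [c + (f - c) / (k + 1) for c, f in zip(center, farthest)]
    return center


def covering_radius(points, center):
    """Largest distance from `center` to any point in `points`."""
    return max(math.dist(p, center) for p in points)


# Hypothetical 2-D stand-ins for candidate latent points.
candidates = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.5)]
center = chebyshev_center(candidates)
centroid = [sum(axis) / len(candidates) for axis in zip(*candidates)]

print("approximate Chebyshev center:", center)
print("covering radius at center:   ", covering_radius(candidates, center))
print("covering radius at centroid: ", covering_radius(candidates, centroid))
```

The Chebyshev center minimizes the worst-case distance to the set, which is why its covering radius is no larger than that of the centroid in this toy example.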


AAAI Conference 2026 Conference Paper

MS-PPO: Mean Standard Deviation Proximal Policy Optimization for Reliable Parking Space Search in Structured Environments

Haoming Chen
Hongliang Guo

This paper investigates the reliable parking space search problem in structured environments, with the objective of minimizing a linear combination of the mean and standard deviation (mean-std) of the parking space search time. While canonical parking space search algorithms usually target the minimal expected search time, we argue that risk-averse users would like to trade expected search time against its variance, leading to the reliable parking space search problem, which minimizes the mean-std search time. However, the non-additive nature of standard deviation makes the reliable parking space search problem difficult to solve with canonical search algorithms. To address this challenge, we propose a model-free reinforcement learning algorithm, namely MS-PPO, which simultaneously estimates the mean and standard deviation of the current decision-making policy's search time, and performs policy optimization via clipped mean-std advantage function maximization. MS-PPO is compared with several baseline parking space search algorithms as well as canonical reinforcement learning algorithms in a range of representative parking lot networks, and achieves the best overall performance in terms of the mean-std parking space search time. We also validate the effectiveness of MS-PPO in a real parking garage by deploying it to an autonomous vehicle testbed.
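The mean-std objective and the non-additivity it mentions can be illustrated with a small sketch. This is not MS-PPO itself; the function name `mean_std_cost`, the `risk_weight` parameter, and all sample values are assumptions made for illustration.

```python
import statistics


def mean_std_cost(times, risk_weight=1.0):
    """Mean search time plus risk_weight times its population std. deviation."""
    return statistics.fmean(times) + risk_weight * statistics.pstdev(times)


# Two hypothetical policies with the same expected search time (seconds).
steady = [50.0, 50.0, 50.0, 50.0]
risky = [20.0, 80.0, 20.0, 80.0]

print("steady policy cost:", mean_std_cost(steady))
print("risky policy cost: ", mean_std_cost(risky))

# Non-additivity: the std. deviation of a whole trip is generally not the
# sum of the std. deviations of its legs, so per-step costs cannot simply
# be accumulated the way expected times can.
leg_a = [0.0, 2.0]
leg_b = [2.0, 0.0]
trip = [a + b for a, b in zip(leg_a, leg_b)]

print("std of whole trip:", statistics.pstdev(trip))
print("sum of leg stds:  ", statistics.pstdev(leg_a) + statistics.pstdev(leg_b))
```

A risk-averse user would prefer the steady policy here even though both policies have the same mean, which is the trade-off the mean-std objective encodes.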


NeurIPS Conference 2024 Conference Paper

Exploring Fixed Point in Image Editing: Theoretical Support and Convergence Optimization

Chen Hang
Zhe Ma
Haoming Chen
Xuwei Fang
Weisheng Xie
Faming Fang
Guixu Zhang
Hongbin Wang

In image editing, Denoising Diffusion Implicit Models (DDIM) inversion has become a widely adopted method and is used extensively in various image editing approaches. The core concept of DDIM inversion stems from the deterministic sampling technique of DDIM, which allows the DDIM process to be viewed as a reversible Ordinary Differential Equation (ODE) process. This enables the prediction of corresponding noise from a reference image, ensuring that the image restored from this noise remains consistent with the reference image. Image editing exploits this property by modifying the cross-attention between text and images to edit specific objects while preserving the remaining regions. However, in DDIM inversion, using the $t-1$ time step to approximate the noise prediction at time step $t$ introduces errors between the restored image and the reference image. Recent approaches have modeled each step of the DDIM inversion process as a fixed-point problem for an implicit function. This approach significantly mitigates the error in the restored image but lacks theoretical support regarding the existence of such fixed points. Therefore, this paper focuses on the study of fixed points in DDIM inversion and provides theoretical support. Based on the obtained theoretical insights, we further optimize the loss function for the convergence of fixed points in the original DDIM inversion, improving the visual quality of the edited image. Finally, we extend fixed-point based image editing to unsupervised image dehazing, introducing a novel text-based approach for unsupervised dehazing.
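The fixed-point view of one inversion step can be sketched on a scalar toy problem. Nothing here comes from the paper: the update form `z_t = a * z_prev + b * eps(z_t)`, the coefficients `a` and `b`, and the stand-in noise predictor `toy_eps` are all assumptions chosen so that the iteration is a contraction and visibly converges.

```python
import math


def fixed_point_inversion_step(z_prev, eps_fn, a, b, iters=50, tol=1e-10):
    """Solve z = a * z_prev + b * eps_fn(z) by fixed-point iteration.

    The first guess evaluates eps_fn at z_prev, which mirrors the
    approximation used by plain inversion; later iterations refine it.
    """
    z = a * z_prev + b * eps_fn(z_prev)
    for _ in range(iters):
        z_next = a * z_prev + b * eps_fn(z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


def toy_eps(z):
    """Hypothetical smooth noise predictor, used only for this sketch."""
    return math.tanh(z)


def residual(z, z_prev, a, b):
    """How far z is from satisfying the implicit equation."""
    return abs(z - (a * z_prev + b * toy_eps(z)))


z_prev, a, b = 1.0, 1.02, 0.1
naive = a * z_prev + b * toy_eps(z_prev)
refined = fixed_point_inversion_step(z_prev, toy_eps, a, b)

print("residual of naive step:  ", residual(naive, z_prev, a, b))
print("residual of refined step:", residual(refined, z_prev, a, b))
```

In this toy setting convergence is guaranteed because the map has slope at most `b < 1`. Whether a fixed point exists for a real noise-prediction network is the kind of question the abstract says the paper addresses.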


JBHI Journal 2023 Journal Article

CXR-Net: A Multitask Deep Learning Network for Explainable and Accurate Diagnosis of COVID-19 Pneumonia From Chest X-Ray Images

Xin Zhang
Liangxiu Han
Tam Sobeih
Lianghao Han
Nina Dempsey
Symeon Lechareas
Ascanio Tridente
Haoming Chen

Accurate and rapid detection of COVID-19 pneumonia is crucial for optimal patient treatment. Chest X-Ray (CXR) is the first-line imaging technique for COVID-19 pneumonia diagnosis as it is fast, cheap and easily accessible. Many deep learning (DL) models have been proposed to detect COVID-19 pneumonia from CXR images. Unfortunately, these deep classifiers lack transparency in interpreting findings, which may limit their application in clinical practice. Existing explanation methods produce results that are either too noisy or too imprecise, and hence are unsuitable for diagnostic purposes. In this work, we propose a novel explainable CXR deep neural Network (CXR-Net) for accurate COVID-19 pneumonia detection with an enhanced pixel-level visual explanation using CXR images. An Encoder-Decoder-Encoder architecture is proposed, in which an extra encoder is added after the encoder-decoder structure to ensure the model can be trained on category samples. The method has been evaluated on real-world CXR datasets from both public and private sources, including healthy, bacterial pneumonia, viral pneumonia and COVID-19 pneumonia cases. The results demonstrate that the proposed method can achieve satisfactory accuracy and provide fine-resolution activation maps for visual explanation in lung disease detection. Compared to current state-of-the-art visual explanation methods, the proposed method can provide a more detailed, high-resolution visual explanation of the classification results. It can be deployed in various computing environments, including cloud, CPU and GPU environments. It has great potential to be used in clinical practice for COVID-19 pneumonia diagnosis.
