Arrow Research search

Author name cluster

Chen Hang

Papers in Arrow that may be associated with this exact author name. This page groups case-insensitive exact-name matches and is not a full identity-disambiguation profile.

3 papers
1 author row

Possible papers (3)

AAAI 2026 · Conference Paper

A Geometric Perspective on Optimizing Vector Quantized Latent Diffusion Model for Image Restoration

  • Chen Hang
  • Haoming Chen
  • Xuwei Fang
  • Weisheng Xie
  • Xiangxiang Gao
  • Faming Fang
  • Guixu Zhang
  • Haichuan Song

In this paper, we investigate the limitations of the Vector Quantized Latent Diffusion Model (VQ-LDM) in restoration tasks. We identify a performance gap between the Vector Quantization (VQ) and Diffusion Model components, manifested as a significant discrepancy between the reconstruction quality of ground-truth images processed via VQ autoregression and that of degraded images restored by VQ-LDM. Through experiments, we attribute this gap primarily to the lack of robustness of the mapped points of VQ within the original VQ-LDM framework. To address this issue, we propose a geometry-based optimization approach. First, we introduce a simple yet effective method, termed interpolation-based latent initial state optimization, which mitigates the performance gap by replacing the original mapped points with interpolated values, supported by theoretical analysis. Here, the latent initial state refers specifically to the input of the diffusion model. Building on this, we further propose Chebyshev center-based latent initial state optimization, a theoretically grounded solution from a geometric perspective that further enhances restoration performance. Our improvements consistently achieve superior results across nine benchmark datasets.
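The geometric idea behind the second method can be sketched in isolation: the Chebyshev center of a point set is the center of its minimum enclosing ball, i.e. the point minimizing the maximum distance to the set. The sketch below (not the paper's code; the Bâdoiu–Clarkson iteration is an assumption standing in for whatever solver the authors use) approximates that center for a finite set of latent points.

```python
# Illustrative sketch: approximate the Chebyshev center of a finite point
# set with the Badoiu-Clarkson iteration. The center of the minimum
# enclosing ball minimizes the worst-case distance to the set, which is
# the robustness idea behind a Chebyshev-center latent initial state.

def chebyshev_center(points, iters=1000):
    """Approximate the center of the minimum enclosing ball of `points`."""
    c = list(points[0])
    for t in range(1, iters + 1):
        # Find the point currently farthest from the candidate center.
        far = max(points, key=lambda p: sum((pi - ci) ** 2
                                            for pi, ci in zip(p, c)))
        # Move 1/(t+1) of the way toward it; step sizes shrink, so the
        # iterate converges toward the enclosing-ball center.
        c = [ci + (fi - ci) / (t + 1) for ci, fi in zip(c, far)]
    return c
```

For the corners of the unit square, the iterate approaches the obvious center `(0.5, 0.5)`, the point whose maximum distance to the four corners is smallest.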

AAAI 2026 · Conference Paper

Talon: Breaking the Synchronization Barrier in Speculative Decoding with Hybrid Model-based and Retrieve-based Drafting

  • Xiangxiang Gao
  • Weisheng Xie
  • Lixin
  • Xuwei Fang
  • Chen Hang
  • Changqun Li
  • Yuhan Lin
  • Xiaolong Xu

Large Language Models face fundamental deployment challenges due to the computational demands of auto-regressive token-by-token generation. While speculative decoding has emerged as a promising acceleration technique through its draft-then-verify framework, current implementations suffer from two critical limitations: (1) a mutual-waiting problem caused by sequential dependencies between the draft-generation and verification phases, and (2) constrained token acceptance rates, where retrieval-based drafting methods underperform in general domains while model-based drafting approaches show reduced efficacy in knowledge-intensive scenarios. To address these challenges, we propose Talon, a parallel inference architecture featuring two key innovations: (1) **an asynchronous execution paradigm** that decouples draft generation from verification, effectively eliminating synchronization bottlenecks, and (2) **an adaptive hybrid drafting strategy** that dynamically combines model-based and retrieval-based approaches to improve token acceptance rates across diverse domains. Extensive evaluations on standard benchmarks (MT-Bench, HumanEval, GSM8K, Alpaca, CNN/DM) demonstrate Talon's strong performance, achieving 4.04x–6.52x acceleration across multiple model families, including the Vicuna, DeepSeek, and LLaMA series. These results represent a significant advancement over existing speculative decoding methods (EAGLE 1–3, Hydra, Medusa, Lookahead, SPS, and PLD), establishing a new paradigm for speculative decoding.
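The draft-then-verify loop underlying speculative decoding can be shown as a toy sketch. This is illustrative only: Talon's asynchronous scheduler and hybrid drafter are not modeled, and `draft`/`target` are hypothetical stand-in callables that map a token prefix to its next token.

```python
# Toy sketch of one draft-then-verify step in speculative decoding.
# `draft` is a cheap model proposing tokens; `target` is the expensive
# model whose distribution must be matched (here, greedy agreement).

def speculative_step(prefix, draft, target, k=4):
    """Draft k tokens, then accept the longest prefix the target agrees with."""
    drafted = []
    for _ in range(k):
        drafted.append(draft(prefix + drafted))
    accepted = []
    for tok in drafted:
        if target(prefix + accepted) == tok:
            accepted.append(tok)
        else:
            break
    # If a drafted token was rejected, emit one token from the target
    # instead, so every step makes progress.
    if len(accepted) < k:
        accepted.append(target(prefix + accepted))
    return accepted
```

When the drafter agrees with the target, a single verification pass yields up to `k` tokens; when it disagrees, the step degrades gracefully to one target-model token, which is why acceptance rate drives the speedup.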

NeurIPS 2024 · Conference Paper

Exploring Fixed Point in Image Editing: Theoretical Support and Convergence Optimization

  • Chen Hang
  • Zhe Ma
  • Haoming Chen
  • Xuwei Fang
  • Weisheng Xie
  • Faming Fang
  • Guixu Zhang
  • Hongbin Wang

In image editing, Denoising Diffusion Implicit Models (DDIM) inversion has become a widely adopted technique and is used extensively in image editing approaches. The core concept of DDIM inversion stems from the deterministic sampling of DDIM, which allows the DDIM process to be viewed as a reversible Ordinary Differential Equation (ODE) process. This enables the prediction of the corresponding noise from a reference image, ensuring that the image restored from this noise remains consistent with the reference image. Image editing exploits this property by modifying the cross-attention between text and images to edit specific objects while preserving the remaining regions. However, in DDIM inversion, using the $t-1$ time step to approximate the noise prediction at time step $t$ introduces errors between the restored image and the reference image. Recent approaches model each step of the DDIM inversion process as solving a fixed-point problem of an implicit function. This significantly mitigates the error in the restored image but lacks theoretical support regarding the existence of such fixed points. Therefore, this paper focuses on the study of fixed points in DDIM inversion and provides that theoretical support. Based on the obtained theoretical insights, we further optimize the loss function for the convergence of fixed points in the original DDIM inversion, improving the visual quality of the edited image. Finally, we extend fixed-point-based image editing to unsupervised image dehazing, introducing a novel text-based approach for unsupervised dehazing.
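The numerical pattern behind treating each inversion step as an implicit equation $z = f(z)$ is plain fixed-point iteration. The sketch below is generic (the `f` used in the usage note is a simple contraction, `math.cos`, not a diffusion model), illustrating the existence/convergence question the paper studies: iteration converges when $f$ is a contraction near the fixed point.

```python
import math

# Generic fixed-point iteration: repeatedly apply f until the iterate
# stops moving. Each DDIM inversion step can be posed as solving
# z = f(z) for an implicit function f; this is the solver skeleton.

def fixed_point(f, x0, tol=1e-8, max_iter=1000):
    """Iterate x <- f(x) from x0 until |f(x) - x| < tol."""
    x = x0
    for _ in range(max_iter):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x  # best iterate if tolerance was not reached
```

For example, `fixed_point(math.cos, 1.0)` converges to the unique solution of $x = \cos x$ (about 0.739), because $\cos$ is a contraction on the relevant interval; when no such contraction property holds, existence of the fixed point needs a separate argument, which is the gap the paper addresses.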