Author name cluster

Zhiying Jiang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers
1 author row

Possible papers (10)

AAAI Conference 2026 Conference Paper

Conditional Prompt Learning via Degradation Perception for Underwater Image Enhancement

  • Mingze Yao
  • Zhiying Jiang
  • Xianping Fu
  • Huibing Wang

Underwater Image Enhancement (UIE) focuses on improving the visual quality of images captured in various underwater scenes. Existing methods simplistically treat diverse degradations as homogeneous, disregarding their intrinsic connections and causing models to learn blindly, which results in conflicting optimization goals and visual distortions. To address these limitations, we propose a Conditional Prompt Learning via Degradation Perception (CPLDP) model, which employs conditional prompts as degradation-perception priors to guide underwater image enhancement. Specifically, we show that natural language prompts not only help distinguish differently degraded images but also aid in recovering details with semantic information. Our method therefore generates five key degradation prompts (green, blue, and green-blue color casts, uneven illumination, and haze) via conditional prompt learning. Subsequently, considering the intrinsic relationships among different degradations, we employ the degradation perceptions as priors and fine-tune the learning strategy to enhance underwater images. During training, an adaptive multi-degradation loss function is designed to effectively handle task conflicts among multiple underwater degradations. Additionally, we construct a human-vision-based underwater dataset with various degradation types via subjective statistics. Extensive experiments on both full-reference and no-reference datasets demonstrate that CPLDP achieves better visual results and outperforms state-of-the-art UIE methods across various degradation scenarios.
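
A minimal sketch of the core idea, scoring an underwater image against natural-language degradation prompts with an off-the-shelf CLIP model. The prompt wording and model choice are assumptions for illustration, not the CPLDP implementation:

```python
# Sketch: zero-shot degradation perception via text prompts (assumed
# prompts/model, not the authors' code).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# The five degradation types named in the abstract.
prompts = [
    "an underwater photo with a green color cast",
    "an underwater photo with a blue color cast",
    "an underwater photo with a green-blue color cast",
    "an underwater photo with uneven illumination",
    "a hazy underwater photo",
]

image = Image.open("underwater.jpg").convert("RGB")  # placeholder path
inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image        # shape: (1, 5)
weights = logits.softmax(dim=-1)                     # soft degradation prior
print({p: round(w.item(), 3) for p, w in zip(prompts, weights[0])})
```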

AAAI Conference 2026 Conference Paper

Domain Adaptation Guided Infrared and Visible Image Fusion

  • Tianwei Guan
  • Haozhen Wei
  • Yuhan Zhou
  • Jun Ma
  • Zecheng Xu
  • Zhiying Jiang
  • Jinyuan Liu
  • Xingyuan Li

Infrared and Visible Image Fusion (IVIF) integrates complementary information from distinct modalities to enhance image quality. However, its effectiveness declines under unseen conditions such as novel weather or scenes, due to domain shifts arising primarily from distribution variations in the visible modality, while the infrared modality remains relatively stable. To overcome domain shifts caused by this imbalance between modalities, we propose a Domain Adaptation Guided Infrared and Visible Image Fusion method, termed DAFusion, which leverages a dual-rank domain adapter to enable fast adaptation to diverse adverse conditions during image fusion. Specifically, trainable low-rank and high-rank embedding spaces are used to capture, respectively, knowledge common across domains (domain-shared) and knowledge unique to target domains (domain-specific). To leverage the dual-rank adapter more effectively, we develop a homeostatic knowledge allotment strategy that dynamically integrates the two types of knowledge based on the uncertainty of the target domain. Because domain adaptation typically optimizes for feature alignment across domains and emphasizes invariance, whereas the fusion objective requires retaining discriminative and complementary features, a conflict arises between the two modules. To reconcile this, we further adopt a bi-level optimization framework that structurally decouples the two objectives, enabling the fusion module to steer the adaptation process while in turn benefiting from domain-aligned representations. Experimental results on three benchmarks demonstrate that our method significantly outperforms state-of-the-art approaches, improving both fusion quality and performance on subsequent high-level tasks.
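
A minimal sketch of a dual-rank adapter in the spirit of the abstract: a low-rank branch for domain-shared knowledge, a high-rank branch for domain-specific knowledge, blended by a target-domain uncertainty score. The blend rule and dimensions are illustrative assumptions, not the DAFusion code:

```python
# Sketch: dual-rank residual adapter with an uncertainty-weighted blend.
import torch
import torch.nn as nn

class DualRankAdapter(nn.Module):
    def __init__(self, dim: int, rank_low: int = 4, rank_high: int = 64):
        super().__init__()
        self.low_down = nn.Linear(dim, rank_low, bias=False)
        self.low_up = nn.Linear(rank_low, dim, bias=False)
        self.high_down = nn.Linear(dim, rank_high, bias=False)
        self.high_up = nn.Linear(rank_high, dim, bias=False)
        nn.init.zeros_(self.low_up.weight)   # start as an identity residual
        nn.init.zeros_(self.high_up.weight)

    def forward(self, x: torch.Tensor, uncertainty: torch.Tensor) -> torch.Tensor:
        shared = self.low_up(self.low_down(x))       # domain-shared update
        specific = self.high_up(self.high_down(x))   # domain-specific update
        # Toy allotment: lean on shared knowledge when the target domain
        # is uncertain, on specific knowledge when it is familiar.
        u = uncertainty.clamp(0.0, 1.0)
        return x + u * shared + (1.0 - u) * specific

feats = torch.randn(2, 16, 256)
adapter = DualRankAdapter(dim=256)
out = adapter(feats, uncertainty=torch.tensor(0.3))
```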

AAAI Conference 2026 Conference Paper

HATIR: Heat-Aware Diffusion for Turbulent Infrared Video Super-Resolution

  • Yang Zou
  • Xingyue Zhu
  • Kaiqi Han
  • Jun Ma
  • Xingyuan Li
  • Zhiying Jiang
  • Jinyuan Liu

Infrared video is of great interest for visual tasks in challenging environments, but it often suffers from severe atmospheric turbulence and compression degradation. Existing video super-resolution (VSR) methods either neglect the inherent modality gap between infrared and visible images or fail to restore turbulence-induced distortions. Directly cascading turbulence mitigation (TM) algorithms with VSR methods leads to error propagation and accumulation due to the decoupled modeling of turbulence and resolution degradations. We introduce HATIR, a Heat-Aware Diffusion for Turbulent InfraRed Video Super-Resolution, which injects heat-aware deformation priors into the diffusion sampling path to jointly model the inverse process of turbulent degradation and structural detail loss. Specifically, HATIR constructs a Phasor-Guided Flow Estimator, rooted in the physical principle that thermally active regions exhibit consistent phasor responses over time, enabling reliable turbulence-aware flow to guide the reverse diffusion process. To ensure faithful structural recovery under nonuniform distortions, a Turbulence-Aware Decoder is proposed to selectively suppress unstable temporal cues and enhance edge-aware feature aggregation via turbulence gating and structure-aware attention. We build FLIR-IVSR, the first dataset for turbulent infrared VSR, comprising paired LR-HR sequences captured with a FLIR T1050sc camera (1024 × 768) across 640 diverse scenes with varying camera and object motion, which we hope will encourage future research in infrared VSR.
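
A toy sketch of one ingredient, a turbulence gate that down-weights temporally unstable pixels (high frame-to-frame variance, typical of turbulence jitter) before temporal aggregation. This is a generic stand-in under assumed inputs, not the Turbulence-Aware Decoder:

```python
# Sketch: suppress unstable temporal cues via a variance-based gate.
import torch

def turbulence_gate(frames: torch.Tensor, alpha: float = 8.0) -> torch.Tensor:
    """frames: (T, H, W) grayscale clip -> per-pixel gate in [0, 1]."""
    var = frames.var(dim=0)                     # temporal variance map
    var = var / (var.mean() + 1e-8)             # normalize scale
    return torch.sigmoid(-alpha * (var - 1.0))  # low variance -> gate near 1

clip = torch.rand(8, 64, 64)            # T=8 frames
gate = turbulence_gate(clip)            # (H, W)
aggregated = (clip * gate).mean(dim=0)  # gated temporal aggregation
```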

NeurIPS Conference 2025 Conference Paper

Depth-Supervised Fusion Network for Seamless-Free Image Stitching

  • Zhiying Jiang
  • Ruhao Yan
  • Zengxi Zhang
  • Bowei Zhang
  • Jinyuan Liu

Image stitching synthesizes images captured from multiple perspectives into a single image with a broader field of view. Significant variations in object depth often lead to large parallax, resulting in ghosting and misalignment in the stitched results. To address this, we propose a depth-consistency-constrained seamless-free image stitching method. First, to tackle the multi-view alignment difficulties caused by parallax, a multi-stage mechanism combined with global depth regularization constraints is developed to improve the alignment accuracy of the same apparent target across different depth ranges. Second, during multi-view image fusion, an optimal stitching seam is determined through low-cost graph-based computation, and a soft-seam region is diffused around it to precisely locate transition areas, effectively mitigating parallax-induced alignment errors and achieving natural, seamless stitching results. Furthermore, to reduce the computational overhead of the shift regression process, a reparameterization strategy is incorporated into the structural design, significantly improving efficiency while maintaining optimal performance. Extensive experiments demonstrate the superior performance of the proposed method over existing methods. Code is available at https://github.com/DLUT-YRH/DSFN.
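
A minimal sketch of the classic low-cost seam idea the abstract refers to: dynamic programming over the overlap region to find the vertical path with the smallest accumulated photometric difference, around which a soft transition would then be feathered. This illustrates the general technique, not the DSFN implementation:

```python
# Sketch: minimum-cost stitching seam via dynamic programming.
import numpy as np

def find_seam(img_a: np.ndarray, img_b: np.ndarray) -> np.ndarray:
    """img_a, img_b: (H, W) aligned grayscale overlaps -> seam column per row."""
    cost = (img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2
    H, W = cost.shape
    acc = cost.copy()                      # accumulated path cost
    for y in range(1, H):
        left = np.roll(acc[y - 1], 1)
        left[0] = np.inf                   # no neighbor beyond the border
        right = np.roll(acc[y - 1], -1)
        right[-1] = np.inf
        acc[y] += np.minimum(np.minimum(left, acc[y - 1]), right)
    seam = np.empty(H, dtype=np.int64)     # backtrack the cheapest path
    seam[-1] = int(np.argmin(acc[-1]))
    for y in range(H - 2, -1, -1):
        x = seam[y + 1]
        lo, hi = max(x - 1, 0), min(x + 2, W)
        seam[y] = lo + int(np.argmin(acc[y, lo:hi]))
    return seam
```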

NeurIPS Conference 2025 Conference Paper

Enhancing Infrared Vision: Progressive Prompt Fusion Network and Benchmark

  • Jinyuan Liu
  • Zihang Chen
  • Zhu Liu
  • Zhiying Jiang
  • Long Ma
  • Xin Fan
  • Risheng Liu

We address the relatively underexplored task of thermal infrared image enhancement. Existing infrared image enhancement methods primarily tackle individual degradations, such as noise, low contrast, and blurring, making it difficult to handle coupled degradations. Meanwhile, all-in-one enhancement methods, commonly designed for RGB sensors, often show limited effectiveness due to significant differences in the imaging model. In light of this, we first revisit the thermal imaging mechanism and introduce a Progressive Prompt Fusion Network (PPFN). Specifically, the PPFN initially establishes prompt pairs based on the thermal imaging process. For each type of degradation, we fuse the corresponding prompt pairs to modulate the model's features, providing adaptive guidance that enables the model to better address specific degradations under single or multiple conditions. In addition, a selective recurrent training mechanism is introduced to gradually refine the model's handling of composite cases and align the enhancement process, which not only allows the model to remove camera noise and retain key structural details but also enhances the overall contrast of the thermal image. Furthermore, we introduce the most comprehensive high-quality infrared benchmark to date, covering a wide range of scenarios. Extensive experiments substantiate that our approach not only delivers promising visual results under specific degradations but also significantly improves performance on complex degradation scenes, achieving a notable 8.76% improvement.
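
A minimal sketch of prompt-based feature modulation of the kind the abstract describes: a degradation prompt embedding predicts per-channel scale and shift applied to image features (FiLM-style). A generic stand-in under assumed dimensions, not the PPFN code:

```python
# Sketch: prompt embedding modulates feature maps via scale/shift.
import torch
import torch.nn as nn

class PromptModulation(nn.Module):
    def __init__(self, prompt_dim: int, channels: int):
        super().__init__()
        self.to_scale_shift = nn.Linear(prompt_dim, 2 * channels)

    def forward(self, feats: torch.Tensor, prompt: torch.Tensor) -> torch.Tensor:
        """feats: (B, C, H, W); prompt: (B, prompt_dim)."""
        scale, shift = self.to_scale_shift(prompt).chunk(2, dim=-1)
        scale = scale[:, :, None, None]    # broadcast over spatial dims
        shift = shift[:, :, None, None]
        return feats * (1 + scale) + shift

mod = PromptModulation(prompt_dim=512, channels=64)
out = mod(torch.randn(2, 64, 32, 32), torch.randn(2, 512))
```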

NeurIPS Conference 2025 Conference Paper

Image Stitching in Adverse Condition: A Bidirectional-Consistency Learning Framework and Benchmark

  • Zengxi Zhang
  • Junchen Ge
  • Zhiying Jiang
  • Miao Zhang
  • Jinyuan Liu

Deep learning-based image stitching methods have achieved promising performance on conventional stitching datasets. However, real-world scenarios introduce challenges such as complex weather conditions, illumination variations, and dynamic scene motion, which severely degrade image quality and lead to significant misalignment in stitching results. To solve this problem, we propose an adverse-condition-tolerant image stitching network, dubbed ACDIS. We first introduce a bidirectional consistency learning framework, which ensures reliable alignment through an iterative optimization paradigm that integrates differentiable image restoration and Gaussian-distribution-encoded homography estimation. Subsequently, we incorporate motion constraints into the seamless composition network to produce robust stitching results without interference from moving scenes. We further propose the first adverse-scene image stitching dataset, covering diverse parallax and scenes under low-light, haze, and underwater environments. Extensive experiments show that the proposed method generates visually pleasing stitched images under adverse conditions, outperforming state-of-the-art methods.
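
A minimal sketch of one generic form a bidirectional consistency constraint can take for homography estimation: the forward and backward homographies should compose to the identity. The loss form is an illustrative assumption, not the ACDIS objective:

```python
# Sketch: forward/backward homographies should compose to identity.
import torch

def bidirectional_consistency(h_ab: torch.Tensor, h_ba: torch.Tensor) -> torch.Tensor:
    """h_ab, h_ba: (B, 3, 3) homographies, normalized so H[2, 2] = 1."""
    composed = torch.bmm(h_ab, h_ba)
    composed = composed / composed[:, 2:3, 2:3]   # fix projective scale
    eye = torch.eye(3, device=h_ab.device).expand_as(composed)
    return (composed - eye).abs().mean()

loss = bidirectional_consistency(torch.eye(3).unsqueeze(0),
                                 torch.eye(3).unsqueeze(0))  # -> 0
```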

IJCAI Conference 2025 Conference Paper

TextMEF: Text-guided Prompt Learning for Multi-exposure Image Fusion

  • Jinyuan Liu
  • Qianjun Huang
  • Guanyao Wu
  • Di Wang
  • Zhiying Jiang
  • Long Ma
  • Risheng Liu
  • Xin Fan

Multi-exposure image fusion (MEF) aims to integrate a set of low dynamic range images into a single image with a higher dynamic range than any individual input. Despite significant advancements, current MEF approaches still struggle with extremely over- or under-exposed conditions, producing unsatisfactory visual effects such as hallucinated details and distorted color tones. In this regard, we propose TextMEF, a text-guided, prompt-learning-driven method for multi-exposure image fusion. Specifically, we learn a set of prompts based on text-image similarity between negative and positive samples (over- and under-exposed images versus well-exposed ones). These learned prompts are seamlessly integrated into the loss function, providing high-level guidance for constraining non-uniformly exposed regions. Furthermore, we develop an attention Mamba module that effectively translates over-/under-exposed regional features into an exposure-invariant space and builds efficient long-range dependencies with the high dynamic range image. Extensive experimental results on three publicly available benchmarks demonstrate that TextMEF significantly outperforms state-of-the-art approaches in both visual inspection and objective analysis.
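
A minimal sketch of a text-guided exposure loss in this spirit: pull the fused image toward a "well-exposed" prompt and away from over-/under-exposed prompts in CLIP space. The prompt wording and loss form are illustrative assumptions, not the TextMEF objective:

```python
# Sketch: prompt-based exposure loss using frozen CLIP features.
import torch
import torch.nn.functional as F
from transformers import CLIPModel, CLIPTokenizer

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

prompts = ["a well-exposed photo",            # positive (index 0)
           "an over-exposed photo",           # negatives
           "an under-exposed photo"]
tokens = tokenizer(prompts, padding=True, return_tensors="pt")
with torch.no_grad():
    text_feats = F.normalize(model.get_text_features(**tokens), dim=-1)

def exposure_prompt_loss(fused_pixels: torch.Tensor) -> torch.Tensor:
    """fused_pixels: (B, 3, 224, 224), CLIP-normalized; gradients flow to it."""
    img_feats = F.normalize(model.get_image_features(pixel_values=fused_pixels), dim=-1)
    logits = 100.0 * img_feats @ text_feats.T                 # (B, 3)
    target = torch.zeros(logits.shape[0], dtype=torch.long)   # positive prompt
    return F.cross_entropy(logits, target)
```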

AAAI Conference 2024 Conference Paper

Enhancing Neural Radiance Fields with Adaptive Multi-Exposure Fusion: A Bilevel Optimization Approach for Novel View Synthesis

  • Yang Zou
  • Xingyuan Li
  • Zhiying Jiang
  • Jinyuan Liu

Neural Radiance Fields (NeRF) have made significant strides in the modeling and rendering of 3D scenes. However, due to the complexity of luminance information, existing NeRF methods often struggle to produce satisfactory renderings from high- and low-exposure images. To address this issue, we propose an approach capable of effectively modeling and rendering images under multiple exposure conditions. Our method adaptively learns the characteristics of images under different exposure conditions through an unsupervised evaluator-simulator structure for HDR (High Dynamic Range) fusion. This enhances NeRF's comprehension and handling of light variations, leading to images with appropriate brightness. Simultaneously, we present a bilevel optimization method tailored for novel view synthesis, aiming to harmonize the luminance information of input images while preserving their structural and content consistency. This facilitates the concurrent, unsupervised optimization of multi-exposure correction and novel view synthesis. Through comprehensive experiments on the LOM and LOL datasets, our approach surpasses existing methods, markedly improving novel view synthesis in multi-exposure environments and attaining state-of-the-art results. The source code can be found at https://github.com/Archer-204/AME-NeRF.
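
A toy sketch of the alternating, bilevel-style training pattern the abstract describes: an inner step tunes an exposure-correction module, an outer step tunes the scene model against the corrected targets. The modules and losses here are placeholders assumed purely for illustration:

```python
# Sketch: alternating inner/outer optimization with placeholder modules.
import torch

corrector = torch.nn.Conv2d(3, 3, 1)   # stand-in exposure-correction module
scene = torch.nn.Conv2d(3, 3, 1)       # stand-in scene/rendering model
opt_inner = torch.optim.Adam(corrector.parameters(), lr=1e-4)
opt_outer = torch.optim.Adam(scene.parameters(), lr=1e-3)

for step in range(100):
    imgs = torch.rand(4, 3, 32, 32)    # multi-exposure batch
    # Inner step: harmonize luminance (toy target: mid-gray mean brightness).
    inner_loss = (corrector(imgs).mean(dim=(1, 2, 3)) - 0.5).pow(2).mean()
    opt_inner.zero_grad()
    inner_loss.backward()
    opt_inner.step()
    # Outer step: fit the scene model to the harmonized images.
    outer_loss = (scene(imgs) - corrector(imgs).detach()).pow(2).mean()
    opt_outer.zero_grad()
    outer_loss.backward()
    opt_outer.step()
```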

AAAI Conference 2024 Conference Paper

Towards Robust Image Stitching: An Adaptive Resistance Learning against Compatible Attacks

  • Zhiying Jiang
  • Xingyuan Li
  • Jinyuan Liu
  • Xin Fan
  • Risheng Liu

Image stitching seamlessly integrates images captured from varying perspectives into a single wide field-of-view image. Such integration not only broadens the captured scene but also augments holistic perception in computer vision applications. Given a pair of captured images, subtle perturbations and distortions that go unnoticed by the human visual system can attack the correspondence matching, impairing the performance of image stitching algorithms. In light of this challenge, this paper presents the first attempt to improve the robustness of image stitching against adversarial attacks. Specifically, we introduce a stitching-oriented attack (SoA) tailored to amplify the alignment loss within overlapping regions, thereby targeting the feature matching procedure. To establish an attack-resistant model, we investigate the robustness of the stitching architecture and develop an adaptive adversarial training (AAT) scheme to balance attack resistance with stitching precision. In this way, we narrow the gap between routine adversarial training and benign models, ensuring resilience without compromising quality. Comprehensive evaluation across real-world and synthetic datasets confirms that SoA degrades stitching performance, while AAT emerges as a more robust solution against adversarial perturbations, delivering superior stitching results. Code is available at: https://github.com/Jzy2017/TRIS.
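
A minimal sketch of a PGD-style attack in the spirit of SoA, ascending on an alignment loss over the overlap region under an L-infinity budget. `alignment_loss` is a hypothetical placeholder; this illustrates the general attack pattern, not the paper's exact procedure:

```python
# Sketch: projected gradient ascent that maximizes an alignment loss.
import torch

def pgd_attack(img, alignment_loss, eps=8 / 255, alpha=2 / 255, steps=10):
    """img: (B, 3, H, W) in [0, 1]; alignment_loss maps images -> scalar."""
    adv = img.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = alignment_loss(adv)        # misalignment in the overlap region
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv + alpha * grad.sign()            # ascend: worsen alignment
            adv = img + (adv - img).clamp(-eps, eps)   # project to L_inf ball
            adv = adv.clamp(0.0, 1.0)                  # stay a valid image
    return adv.detach()
```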

NeurIPS Conference 2022 Conference Paper

Few-Shot Non-Parametric Learning with Deep Latent Variable Model

  • Zhiying Jiang
  • Yiqin Dai
  • Ji Xin
  • Ming Li
  • Jimmy Lin

Most real-world problems that machine learning algorithms are expected to solve involve (1) unknown data distributions, (2) little domain-specific knowledge, and (3) datasets with limited annotation. We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV), a learning framework for any dataset with abundant unlabeled data but very few labeled examples. By training only a generative model in an unsupervised way, the framework utilizes the data distribution to build a compressor. Using a compressor-based distance metric derived from Kolmogorov complexity, together with the few labeled examples, NPC-LV classifies without further training. We show that NPC-LV outperforms supervised methods on image classification on all three datasets in the low-data regime and even outperforms semi-supervised learning methods on CIFAR-10. We demonstrate how and when the negative evidence lower bound (nELBO) can be used as an approximate compressed length for classification. By revealing the correlation between compression rate and classification accuracy, we illustrate how, under NPC-LV, improvements in generative models can enhance downstream classification accuracy.
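
A minimal sketch of compressor-based nearest-neighbor classification with the normalized compression distance (NCD), the Kolmogorov-style metric this line of work builds on. Here gzip stands in for the learned latent-variable compressor, whose code length NPC-LV approximates via the nELBO:

```python
# Sketch: 1-NN classification under the normalized compression distance.
import gzip

def clen(b: bytes) -> int:
    """Compressed length; NPC-LV would use nELBO of a generative model."""
    return len(gzip.compress(b))

def ncd(x: bytes, y: bytes) -> float:
    """NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = clen(x), clen(y)
    return (clen(x + y) - min(cx, cy)) / max(cx, cy)

def classify(query: bytes, labeled: list[tuple[bytes, str]]) -> str:
    """Label of the nearest labeled example under NCD; no training needed."""
    return min(labeled, key=lambda pair: ncd(query, pair[0]))[1]

train = [(b"the cat sat on the mat", "animals"),
         (b"stock prices fell sharply", "finance")]
print(classify(b"a dog slept on the rug", train))
```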