Arrow Research search

Author name cluster

Junwei Zhu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

8 papers
2 author rows

Possible papers

8

AAAI Conference 2026 Conference Paper

TRT: Harnessing Tensor Ring Transformer for Hyperspectral Image Super-Resolution

  • Honghui Xu
  • Junwei Zhu
  • Yubin Gu
  • Yueqian Quan
  • Chuangjie Fang
  • Hong Qiu
  • Jianwei Zheng

Deep unfolding networks (DUNs) have recently emerged as a promising approach for hyperspectral image super-resolution (HSISR) by combining the benefits of nonlinear deep learning architectures with interpretable optimization techniques. Despite their advantages, current DUNs face significant challenges, particularly in approximating degradation matrices across both the spatial and spectral dimensions, which results in complex and cumbersome model construction. By analyzing the difference between the upsampled low-resolution hyperspectral image (LRHS) and the true target image, we observed that the residual image exhibits strong sparsity, akin to noise. Leveraging this insight, we reformulate the HSISR problem as a robust principal component analysis (RPCA)-based denoising task, effectively eliminating the need for the complex approximation of the spatial degradation matrix and its transpose. In addition, we introduce a Tensor Ring Transformer based on multilinear products as the prior term, wherein tokens are mapped to a tensor ring factor domain and the traditional dot product is replaced with a multilinear tensor ring product. This significantly reduces the computational complexity of the Transformer model, from \( \mathcal{O}(N^2d) \) to \( \mathcal{O}(Nr^2) \), with \( r<
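The abstract's complexity claim can be made concrete with a back-of-the-envelope operation count. The sketch below is not the paper's implementation; the per-token cost model for the tensor ring product (roughly \( r^2 \) multiplies per token) is an assumption inferred from the stated \( \mathcal{O}(Nr^2) \) bound, and the function names are illustrative only.

```python
# Rough multiply counts for the two attention variants named in the abstract.
def dot_product_attention_ops(N, d):
    # Standard attention: the QK^T score matrix alone costs N * N * d multiplies.
    return N * N * d

def tensor_ring_attention_ops(N, r):
    # Assumption: a rank-r multilinear tensor ring product costs ~r^2 per token.
    return N * r * r

N, d, r = 4096, 64, 8
print(dot_product_attention_ops(N, d))   # 1073741824
print(tensor_ring_attention_ops(N, r))   # 262144
```

For a modest sequence length the quadratic term already dominates by several orders of magnitude, which is the motivation the abstract gives for moving to the tensor ring factor domain.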

ECAI Conference 2025 Conference Paper

Gradient Selection Tuning via Information Bottleneck

  • Xiaoxu Lin
  • Wei Li 0034
  • Junwei Zhu
  • Ni Xu
  • Honghui Xu 0002
  • Jianwei Zheng 0001

Pre-trained visual models enjoy strong representations, yet shifting their massive parameter sets for downstream practice is costly. Many parameter-efficient fine-tuning methods have been proposed, most requiring only 1% additional parameters to achieve comparable results. However, current solutions either treat all feature channels equally or detect saliency within individual layers, leaving many redundancies in place. To address these issues, this paper proposes a new parameter-efficient fine-tuning method named “Gradient Selection Tuning” (GST), which leverages gradients capable of capturing the cascading effects across successive channels. Instead of saliency detection, we compress the redundancies for channel selection, since the computed gradient values exhibit much lower mutual information. Building on GST, we further elaborate an Information-Guided Adapter following information bottleneck theory, effectively performing parameter compression while preserving task-specific features. Experimental results demonstrate that our method outperforms the baseline methods while adding only 0.075M parameters to a ViT-B backbone. On domain generalization, our proposal also enjoys strong performance in low-parameter scenarios.
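The abstract does not spell out GST's exact selection rule, but the core idea of ranking channels by gradient signal can be sketched in a few lines. Everything below is illustrative: the function name, the `keep_ratio` parameter, and the gradient-energy criterion are assumptions, not the paper's method.

```python
import numpy as np

def select_channels_by_gradient(grads, keep_ratio=0.01):
    """Rank feature channels by accumulated gradient magnitude and keep
    only the most informative fraction for tuning (a hypothetical rule).

    grads: array of shape (num_channels, ...) holding gradients w.r.t.
           the channel parameters. Returns indices of selected channels.
    """
    # Accumulate gradient energy per channel across all remaining axes.
    energy = np.abs(grads).reshape(grads.shape[0], -1).sum(axis=1)
    k = max(1, int(round(keep_ratio * grads.shape[0])))
    # Channels with the largest gradient energy are tuned; the rest stay frozen.
    return np.argsort(energy)[::-1][:k]

rng = np.random.default_rng(0)
grads = rng.normal(size=(768, 16, 16))  # e.g. ViT-B hidden width of 768
selected = select_channels_by_gradient(grads, keep_ratio=0.05)
print(selected.shape)  # (38,)
```

Selecting a small fraction of channels this way is what keeps the added parameter budget tiny (on the order of the 0.075M figure the abstract reports for ViT-B).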

NeurIPS Conference 2025 Conference Paper

Solving Partial Differential Equations via Radon Neural Operator

  • Wenbin Lu
  • Yihan Chen
  • Junnan Xu
  • Wei Li
  • Junwei Zhu
  • Jianwei Zheng

Neural operators are considered a popular data-driven alternative to traditional partial differential equation (PDE) solvers. However, most current solutions, whether performing computations in the frequency, Laplacian, or wavelet domain, deviate far from the intrinsic PDE space. Even with meticulously elaborated network architectures, this deviation often leads to biased accuracy. To address the issue, we open a new avenue that pioneers leveraging the Radon transform to decompose the input space, yielding a novel Radon neural operator (RNO) to solve PDEs in infinite-dimensional function space. Distinct from previous solutions, we project the input data into the sinogram domain, shrinking the multi-dimensional transformations to a reduced-dimensional counterpart that fits compactly with the PDE space. Theoretically, we prove that RNO obeys a property of bi-Lipschitz strong monotonicity under diffeomorphism, providing deeper guarantees of the desired accuracy than typical discrete invariance or continuous-discrete equivalence. Within the sinogram domain, we further show that different angles contribute unequally to the overall space, and thus engineer a reweighting technique to enable more effective PDE solutions. On that basis, a sinogram-domain convolutional layer is crafted, which operates on a fixed $\theta$-grid decoupled from the PDE space, further enjoying a natural guarantee of discrete invariance. Extensive experiments demonstrate that RNO sets new state-of-the-art (SOTA) scores across massive standard benchmarks, along with superior generalization performance. Code is available.
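The sinogram projection the abstract builds on can be illustrated with a toy discrete Radon transform: rotate the field and sum along one axis, once per angle on a fixed $\theta$-grid. This is only a sketch of the underlying transform, not RNO's operator layers; the use of `scipy.ndimage.rotate` with bilinear interpolation and the toy input field are assumptions.

```python
import numpy as np
from scipy.ndimage import rotate

def radon_sinogram(image, angles):
    """Project a 2D field into the sinogram domain: one 1D projection per
    angle, obtained by rotating the field and summing along its rows."""
    return np.stack(
        [rotate(image, theta, reshape=False, order=1).sum(axis=0)
         for theta in angles]
    )  # shape: (num_angles, width)

field = np.zeros((64, 64))
field[24:40, 24:40] = 1.0  # a toy input field
angles = np.linspace(0.0, 180.0, 32, endpoint=False)  # fixed theta-grid
sino = radon_sinogram(field, angles)
print(sino.shape)  # (32, 64)
```

Each row of the sinogram is one-dimensional, which is the dimensionality reduction the abstract refers to; a convolution applied row-wise on this $(\theta, s)$ grid would be decoupled from the original PDE discretization.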

NeurIPS Conference 2024 Conference Paper

Alias-Free Mamba Neural Operator

  • Jianwei Zheng
  • Wei Li
  • Ni Xu
  • Junwei Zhu
  • Xiaoxu Lin
  • Xiaoqin Zhang

Benefiting from booming deep learning techniques, neural operators (NO) are considered an ideal alternative to the tradition of solving partial differential equations (PDE) at expensive cost. Yet despite the remarkable progress, current solutions pay little attention to holistic function features--both global and local information--during the process of solving PDEs. Besides, a kernel integration meticulously designed to meet desirable performance often suffers from a severe computational burden, such as GNO with $O(N(N-1))$, FNO with $O(N\log N)$, and Transformer-based NO with $O(N^2)$. To counteract this dilemma, we propose a Mamba neural operator with $O(N)$ computational complexity, namely MambaNO. Functionally, MambaNO achieves a clever balance between global integration, facilitated by Mamba's state space model that scans the entire function, and local integration, handled by an alias-free architecture. We prove a property of continuous-discrete equivalence to show the capability of MambaNO in approximating operators arising from universal PDEs to desired accuracy. MambaNOs are evaluated on a diverse set of benchmarks with possibly multi-scale solutions and set new state-of-the-art scores, yet with fewer parameters and better efficiency.
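The $O(N)$ claim comes from the state-space recurrence underlying Mamba: a single sequential scan over the $N$ inputs. The minimal sketch below shows only that recurrence, not MambaNO's architecture; the diagonal dynamics matrix and scalar input/output are simplifying assumptions.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    """Linear-time state-space scan: h_t = A h_{t-1} + B x_t, y_t = C h_t.
    A single pass over the N inputs, so the cost is O(N) in sequence length."""
    n, d_state = len(x), A.shape[0]
    h = np.zeros(d_state)
    y = np.empty(n)
    for t in range(n):
        h = A @ h + B * x[t]  # recurrent state update
        y[t] = C @ h          # readout
    return y

rng = np.random.default_rng(1)
A = np.diag(rng.uniform(0.5, 0.9, size=4))  # stable diagonal dynamics (assumed)
B = rng.normal(size=4)
C = rng.normal(size=4)
y = ssm_scan(rng.normal(size=128), A, B, C)
print(y.shape)  # (128,)
```

Because the state `h` is a fixed-size summary of everything scanned so far, each step costs a constant amount of work, in contrast to attention's pairwise $O(N^2)$ interactions.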

NeurIPS Conference 2024 Conference Paper

DF40: Toward Next-Generation Deepfake Detection

  • Zhiyuan Yan
  • Taiping Yao
  • Shen Chen
  • Yandan Zhao
  • Xinghe Fu
  • Junwei Zhu
  • Donghao Luo
  • Chengjie Wang

We propose a new comprehensive benchmark to advance the current deepfake detection field to its next generation. Predominantly, existing works identify top-notch detection algorithms and models by adhering to the common practice: training detectors on one specific dataset (e.g., FF++) and testing them on other prevalent deepfake datasets. This protocol is often regarded as a "golden compass" for navigating SoTA detectors. But can these stand-out "winners" be truly applied to tackle the myriad of realistic and diverse deepfakes lurking in the real world? If not, what underlying factors contribute to this gap? In this work, we found the dataset (both train and test) can be the "primary culprit" due to the following: (1) forgery diversity: deepfake techniques commonly encompass both face forgery (face-swapping and face-reenactment) and entire image synthesis (AIGC, especially faces). Most existing datasets contain only some of these types, with limited forgery methods implemented (e.g., 2 swapping and 2 reenactment methods in FF++); (2) forgery realism: the dominant training dataset, FF++, contains out-of-date forgery techniques from the past four years. "Honing skills" on these forgeries makes it difficult to guarantee effective detection generalization toward today's SoTA deepfakes; (3) evaluation protocol: most detection works evaluate on one type only, e.g., face-swapping, which hinders the development of universal deepfake detectors. To address this dilemma, we construct a highly diverse and large-scale deepfake detection dataset called DF40, which comprises 40 distinct deepfake techniques (10 times larger than FF++). We then conduct comprehensive evaluations using 4 standard evaluation protocols and 8 representative detection methods, resulting in over 2,000 evaluations. Through these evaluations, we provide an extensive analysis from various perspectives, leading to 7 new insightful findings contributing to the field. We also open up 4 valuable yet previously underexplored research questions to inspire future works. We release our dataset, code, and pre-trained weights at https://github.com/YZY-stack/DF40.

IJCAI Conference 2022 Conference Paper

HifiHead: One-Shot High Fidelity Neural Head Synthesis with 3D Control

  • Feida Zhu
  • Junwei Zhu
  • Wenqing Chu
  • Ying Tai
  • Zhifeng Xie
  • Xiaoming Huang
  • Chengjie Wang

We propose HifiHead, a high fidelity neural talking head synthesis method, which can well preserve the source image's appearance and control the motion (e.g., pose, expression, gaze) flexibly with 3D morphable face model (3DMM) parameters derived from a driving image or indicated by users. Existing head synthesis works mainly focus on low-resolution inputs. Instead, we exploit the powerful generative prior embedded in StyleGAN to achieve high-quality head synthesis and editing. Specifically, we first extract the source image's appearance and the driving image's motion to construct 3D face descriptors, which are employed as latent style codes for the generator. Meanwhile, hierarchical representations are extracted from the source and rendered 3D images respectively to provide faithful appearance and shape guidance. Considering the appearance representations need high-resolution flow fields for spatial transform, we propose a coarse-to-fine style-based generator consisting of a series of feature alignment and refinement (FAR) blocks. Each FAR block updates the dense flow fields and refines RGB outputs simultaneously for efficiency. Extensive experiments show that our method blends source appearance and target motion more accurately along with more photo-realistic results than previous state-of-the-art approaches.

IJCAI Conference 2021 Conference Paper

Context-Aware Image Inpainting with Learned Semantic Priors

  • Wendong Zhang
  • Junwei Zhu
  • Ying Tai
  • Yunbo Wang
  • Wenqing Chu
  • Bingbing Ni
  • Chengjie Wang
  • Xiaokang Yang

Recent advances in image inpainting have shown impressive results for generating plausible visual details on rather simple backgrounds. However, for complex scenes, it is still challenging to restore reasonable contents, as the contextual information within the missing regions tends to be ambiguous. To tackle this problem, we introduce pretext tasks that are semantically meaningful for estimating the missing contents. In particular, we perform knowledge distillation on pretext models and adapt the features to image inpainting. The learned semantic priors ought to be partially invariant between the high-level pretext task and low-level image inpainting, which not only helps to understand the global context but also provides structural guidance for the restoration of local textures. Based on the semantic priors, we further propose a context-aware image inpainting model, which adaptively integrates global semantics and local features in a unified image generator. The semantic learner and the image generator are trained in an end-to-end manner. We name the model SPL to highlight its ability to learn and leverage semantic priors. It achieves the state of the art on the Places2, CelebA, and Paris StreetView datasets.

IJCAI Conference 2021 Conference Paper

HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping

  • Yuhan Wang
  • Xu Chen
  • Junwei Zhu
  • Wenqing Chu
  • Ying Tai
  • Chengjie Wang
  • Jilin Li
  • Yongjian Wu

In this work, we propose a high fidelity face swapping method, called HifiFace, which can well preserve the face shape of the source face and generate photo-realistic results. Unlike other existing face swapping works that only use a face recognition model to keep the identity similarity, we propose 3D shape-aware identity to control the face shape with geometric supervision from a 3DMM and a 3D face reconstruction method. Meanwhile, we introduce the Semantic Facial Fusion module to optimize the combination of encoder and decoder features and make adaptive blending, which makes the results more photo-realistic. Extensive experiments on faces in the wild demonstrate that our method can preserve better identity, especially the face shape, and can generate more photo-realistic results than previous state-of-the-art methods. Code is available at: https://johann.wang/HifiFace