Author name cluster

Chuming Lin

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers

1 author row

AAAI Conference 2023 Conference Paper

High-Resolution GAN Inversion for Degraded Images in Large Diverse Datasets

Yanbo Wang
Chuming Lin
Donghao Luo
Ying Tai
Zhizhong Zhang
Yuan Xie

The last decades are marked by massive and diverse image data, which shows increasingly high resolution and quality. However, some images we obtained may be corrupted, affecting the perception and the application of downstream tasks. A generic method for generating a high-quality image from the degraded one is in demand. In this paper, we present a novel GAN inversion framework that utilizes the powerful generative ability of StyleGAN-XL for this problem. To ease the inversion challenge with StyleGAN-XL, Clustering & Regularize Inversion (CRI) is proposed. Specifically, the latent space is first divided into finer-grained sub-spaces by clustering. Instead of initializing the inversion with the average latent vector, we approximate a centroid latent vector from the clusters, which generates an image close to the input image. Then, an offset with a regularization term is introduced to keep the inverted latent vector within a certain range. We validate our CRI scheme on multiple restoration tasks (i.e., inpainting, colorization, and super-resolution) of complex natural images, and show preferable quantitative and qualitative results. We further demonstrate our technique is robust in terms of data and different GAN models. To the best of our knowledge, we are the first to adopt StyleGAN-XL for generating high-quality natural images from diverse degraded inputs. Code is available at https://github.com/Booooooooooo/CRI.
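The two steps named in the abstract (start from a cluster centroid rather than the average latent, then optimize a regularized offset) can be illustrated with a toy sketch. This is not the authors' implementation: `toy_generator` is a hypothetical identity map standing in for StyleGAN-XL, the squared-L2 penalty and all parameter names are assumptions, and the gradient is written by hand for that identity map only.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def toy_generator(latent):
    """Hypothetical stand-in for a real generator: the identity map."""
    return list(latent)


def pick_initial_centroid(centroids, target):
    """Choose the centroid whose generated output is closest to the target."""
    return min(centroids, key=lambda c: squared_distance(toy_generator(c), target))


def invert_with_regularized_offset(centroid, target, reg_weight=1.0, lr=0.1, steps=200):
    """Gradient descent on ||G(c + d) - t||^2 + reg_weight * ||d||^2 over the offset d.

    The gradient below is only valid for the identity toy_generator.
    """
    offset = [0.0] * len(centroid)
    for _ in range(steps):
        latent = [c + d for c, d in zip(centroid, offset)]
        output = toy_generator(latent)
        grad = [2 * (o - t) + 2 * reg_weight * d
                for o, t, d in zip(output, target, offset)]
        offset = [d - lr * g for d, g in zip(offset, grad)]
    return offset
```

In this toy setting the penalty shrinks the offset toward zero, so the inverted latent `centroid + offset` stays near the chosen centroid instead of moving all the way to the target; a larger `reg_weight` keeps it closer.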


NeurIPS Conference 2021 Conference Paper

Spectrum-to-Kernel Translation for Accurate Blind Image Super-Resolution

Guangpin Tao
Xiaozhong Ji
Wenzhuo Wang
Shuo Chen
Chuming Lin
Yun Cao
Tong Lu
Donghao Luo

Deep-learning based Super-Resolution (SR) methods have exhibited promising performance under non-blind setting where blur kernel is known; however, blur kernels of Low-Resolution (LR) images in different practical applications are usually unknown. It may lead to a significant performance drop when degradation process of training images deviates from that of real images. In this paper, we propose a novel blind SR framework to super-resolve LR images degraded by arbitrary blur kernel with accurate kernel estimation in frequency domain. To the best of our knowledge, this is the first deep learning method which conducts blur kernel estimation in frequency domain. Specifically, we first demonstrate that feature representation in frequency domain is more conducive for blur kernel reconstruction than in spatial domain. Next, we present a Spectrum-to-Kernel (S$^2$K) network to estimate general blur kernels in diverse forms. We use a conditional GAN (CGAN) combined with SR-oriented optimization target to learn the end-to-end translation from degraded images' spectra to unknown kernels. Extensive experiments on both synthetic and real-world images demonstrate that our proposed method sufficiently reduces blur kernel estimation error, thus enables the off-the-shelf non-blind SR methods to work under blind setting effectively, and achieves superior performance over state-of-the-art blind SR methods, on average by 1.39dB, 0.48dB (Gaussian kernels) and 6.15dB, 4.57dB (motion kernels) for scales $2\times$ and $4\times$ respectively.
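One general reason a blur kernel leaves a readable trace in the frequency domain is the convolution theorem: blurring by (circular) convolution multiplies the image spectrum by the kernel spectrum. The sketch below checks that identity on a 1-D signal with the standard library only. It illustrates the background fact, not the S$^2$K network; the function names and the circular boundary handling are assumptions made for the example.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
        for f in range(n)
    ]


def circular_convolve(signal, kernel):
    """Blur a signal with a kernel using wrap-around (circular) boundaries."""
    n = len(signal)
    return [
        sum(signal[(i - j) % n] * kernel[j] for j in range(len(kernel)))
        for i in range(n)
    ]


def pad_kernel(kernel, length):
    """Zero-pad the kernel so its spectrum has the same size as the signal's."""
    return list(kernel) + [0.0] * (length - len(kernel))


signal = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
kernel = [0.25, 0.5, 0.25]

blurred_spectrum = dft(circular_convolve(signal, kernel))
product_spectrum = [
    s * k for s, k in zip(dft(signal), dft(pad_kernel(kernel, len(signal))))
]
# The two spectra agree up to floating-point error.
max_gap = max(abs(a - b) for a, b in zip(blurred_spectrum, product_spectrum))
```

Because the blurred spectrum is the clean spectrum scaled by the kernel spectrum at every frequency, the kernel's shape modulates the spectrum of the degraded image directly, which is the kind of structure a spectrum-to-kernel mapping can exploit.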


AAAI Conference 2020 Conference Paper

Fast Learning of Temporal Action Proposal via Dense Boundary Generator

Chuming Lin
Jian Li
Yabiao Wang
Ying Tai
Donghao Luo
Zhipeng Cui
Chengjie Wang
Jilin Li

Generating temporal action proposals remains a very challenging problem, where the main issue lies in predicting precise temporal proposal boundaries and reliable action confidence in long and untrimmed real-world videos. In this paper, we propose an efficient and unified framework to generate temporal action proposals named Dense Boundary Generator (DBG), which draws inspiration from boundary-sensitive methods and implements boundary classification and action completeness regression for densely distributed proposals. In particular, the DBG consists of two modules: Temporal boundary classification (TBC) and Action-aware completeness regression (ACR). The TBC aims to provide two temporal boundary confidence maps by low-level two-stream features, while the ACR is designed to generate an action completeness score map by high-level action-aware features. Moreover, we introduce a dual stream BaseNet (DSB) to encode RGB and optical flow information, which helps to capture discriminative boundary and actionness features. Extensive experiments on popular benchmarks ActivityNet-1.3 and THUMOS14 demonstrate the superiority of DBG over the state-of-the-art proposal generators (e.g., MGG and BMN).
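To make "densely distributed proposals" concrete, the sketch below enumerates every (start, end) pair and ranks the pairs using three score maps. The abstract does not say how the two boundary confidence maps and the completeness map are combined or indexed, so the `[start][end]` layout and the product fusion used here are assumptions for illustration, and `rank_proposals` is a hypothetical helper rather than part of DBG.

```python
def rank_proposals(start_conf, end_conf, completeness, top_k=3):
    """Score every (start, end) pair with start < end and return the top_k.

    Each map is a square list of lists indexed as [start][end].
    The product fusion of the three scores is an assumption.
    """
    n = len(start_conf)
    proposals = []
    for s in range(n):
        for e in range(s + 1, n):
            score = start_conf[s][e] * end_conf[s][e] * completeness[s][e]
            proposals.append((score, s, e))
    # Highest fused score first.
    proposals.sort(reverse=True)
    return [(s, e, score) for score, s, e in proposals[:top_k]]


# Toy maps over three temporal locations; entries with start >= end are unused.
start_conf = [[0.0, 0.9, 0.2], [0.0, 0.0, 0.5], [0.0, 0.0, 0.0]]
end_conf = [[0.0, 0.8, 0.9], [0.0, 0.0, 0.4], [0.0, 0.0, 0.0]]
completeness = [[0.0, 0.5, 0.9], [0.0, 0.0, 0.1], [0.0, 0.0, 0.0]]

ranked = rank_proposals(start_conf, end_conf, completeness)
```

Scoring all pairs at once, rather than first picking candidate boundaries and then pairing them, is what makes the proposal set dense; the cost grows quadratically with the number of temporal locations.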


AAAI Conference 2019 Conference Paper

Frame and Feature-Context Video Super-Resolution

Bo Yan
Chuming Lin
Weimin Tan

For video super-resolution, current state-of-the-art approaches either process multiple low-resolution (LR) frames to produce each output high-resolution (HR) frame separately in a sliding window fashion or recurrently exploit the previously estimated HR frames to super-resolve the following frame. The main weaknesses of these approaches are: 1) separately generating each output frame may obtain high-quality HR estimates while resulting in unsatisfactory flickering artifacts, and 2) combining previously generated HR frames can produce temporally consistent results in the case of short information flow, but it will cause significant jitter and jagged artifacts because the previous super-resolving errors are constantly accumulated to the subsequent frames. In this paper, we propose a fully end-to-end trainable frame and feature-context video super-resolution (FFCVSR) network that consists of two key sub-networks: local network and context network, where the first one explicitly utilizes a sequence of consecutive LR frames to generate local feature and local SR frame, and the other combines the outputs of local network and the previously estimated HR frames and features to super-resolve the subsequent frame. Our approach takes full advantage of the inter-frame information from multiple LR frames and the context information from previously predicted HR frames, producing temporally consistent high-quality results while maintaining real-time speed by directly reusing previous features and frames. Extensive evaluations and comparisons demonstrate that our approach produces state-of-the-art results on a standard benchmark dataset, with advantages in terms of accuracy, efficiency, and visual quality over the existing approaches.
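The data flow described above (a local network over a window of LR frames, then a context network that also consumes the previous HR frame and feature) can be sketched as a simple recurrence. Everything here is a toy assumption: frames are single floats, `local_net` and `context_net` are hypothetical averaging stand-ins for the learned sub-networks, window edges are clamped, and the first frame falls back to the local output because no previous estimate exists.

```python
def local_net(window):
    """Hypothetical stand-in: summarize consecutive LR frames into a feature and a local SR frame."""
    feature = sum(window) / len(window)
    local_sr = feature
    return feature, local_sr


def context_net(local_feature, local_sr, prev_feature, prev_hr):
    """Hypothetical stand-in: blend local outputs with the previous feature and HR frame."""
    if prev_hr is None:
        # No history yet: use the local outputs directly (assumption).
        return local_feature, local_sr
    feature = 0.5 * (local_feature + prev_feature)
    hr = 0.5 * (local_sr + prev_hr)
    return feature, hr


def super_resolve(frames, radius=1):
    """Run the local/context recurrence over a sequence of toy scalar frames."""
    outputs = []
    prev_feature, prev_hr = None, None
    last = len(frames) - 1
    for t in range(len(frames)):
        # Clamp indices so the window is always 2 * radius + 1 frames long.
        window = [frames[min(max(i, 0), last)] for i in range(t - radius, t + radius + 1)]
        local_feature, local_sr = local_net(window)
        prev_feature, prev_hr = context_net(local_feature, local_sr, prev_feature, prev_hr)
        outputs.append(prev_hr)
    return outputs
```

For example, `super_resolve([0.0, 3.0, 6.0])` carries each output forward into the next step, so every estimate after the first depends on both its own window and the history; reusing `prev_feature` instead of recomputing it is the part of the design the abstract credits for speed.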
