Arrow Research

Author name cluster

Luoqi Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers · 1 author row

Possible papers (5)

AAAI Conference 2025 · Conference Paper

Memory Efficient Matting with Adaptive Token Routing

  • Yiheng Lin
  • Yihan Hu
  • Chenyi Zhang
  • Ting Liu
  • Xiaochao Qu
  • Luoqi Liu
  • Yao Zhao
  • Yunchao Wei

Transformer-based models have recently achieved outstanding performance in image matting. However, their application to high-resolution images remains challenging due to the quadratic complexity of global self-attention. To address this issue, we propose MEMatte, a memory-efficient matting framework for processing high-resolution images. MEMatte incorporates a router before each global attention block, directing informative tokens to the global attention while routing the remaining tokens to a Lightweight Token Refinement Module (LTRM). Specifically, the router employs a local-global strategy to predict the routing probability of each token, and the LTRM utilizes efficient modules to simulate global attention. Additionally, we introduce a Batch-constrained Adaptive Token Routing (BATR) mechanism, which allows each router to dynamically route tokens based on the image content and the stage of the attention block within the network. Furthermore, we construct an ultra-high-resolution image matting dataset, UHR-395, comprising 35,500 training images and 1,000 test images with an average resolution of 4872 × 6017. This dataset is created by compositing 395 different alpha mattes across 11 categories onto various backgrounds, all with high-quality manual annotation. Extensive experiments demonstrate that MEMatte outperforms existing methods on both high-resolution and real-world datasets, reducing memory usage by approximately 88% and latency by 50% on the Composition-1K benchmark.
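
A minimal sketch of the token-routing idea described in the abstract, not the authors' released code: a router scores each token, the top-scoring fraction passes through global self-attention, and the rest takes a cheap refinement path. The keep ratio, the MLP standing in for the LTRM, and all module names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RoutedAttentionBlock(nn.Module):
    """Sketch: route informative tokens to global self-attention and the
    rest to a lightweight refinement path. Hypothetical reconstruction."""

    def __init__(self, dim: int, num_heads: int = 8, keep_ratio: float = 0.5):
        super().__init__()
        self.router = nn.Linear(dim, 1)                    # per-token routing score
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ltrm = nn.Sequential(                         # cheap stand-in for the LTRM
            nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )
        self.keep_ratio = keep_ratio

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (B, N, C)
        B, N, C = x.shape
        scores = self.router(x).squeeze(-1)                # (B, N)
        k = max(1, int(N * self.keep_ratio))
        top = scores.topk(k, dim=1).indices                # indices of informative tokens
        idx = top.unsqueeze(-1).expand(-1, -1, C)
        picked = x.gather(1, idx)                          # tokens for global attention
        attended, _ = self.attn(picked, picked, picked)
        out = self.ltrm(x)                                 # all tokens get the cheap path
        return out.scatter(1, idx, attended)               # attended tokens overwrite it


# Quick shape check on a dummy 256-token sequence.
block = RoutedAttentionBlock(dim=64)
print(block(torch.randn(2, 256, 64)).shape)                # torch.Size([2, 256, 64])
```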

NeurIPS Conference 2023 · Conference Paper

Towards Consistent Video Editing with Text-to-Image Diffusion Models

  • Zicheng Zhang
  • Bonan Li
  • Xuecheng Nie
  • Congying Han
  • Tiande Guo
  • Luoqi Liu

Existing works have advanced Text-to-Image (TTI) diffusion models for video editing in a one-shot learning manner. Despite their low requirements for data and computation, these methods may produce results that are inconsistent with both the text prompt and the temporal sequence, limiting their applications in the real world. In this paper, we propose to address these issues with a novel EI$^2$ model for Enhancing vIdeo Editing consIstency of TTI-based frameworks. Specifically, we find through analysis that the inconsistency problem is caused by the modules newly added to TTI models for learning temporal information: these modules lead to covariate shift in the feature space, which harms the editing capability. We therefore design EI$^2$ to tackle these drawbacks with two modules: a Shift-restricted Temporal Attention Module (STAM) and a Fine-coarse Frame Attention Module (FFAM). First, through theoretical analysis, we demonstrate that covariate shift is closely related to Layer Normalization, so STAM replaces it with an Instance Centering layer to preserve the distribution of temporal features. STAM also employs an attention layer with a normalized mapping to transform temporal features while constraining the variance shift. Second, we combine STAM with the novel FFAM, which efficiently leverages fine-coarse spatial information across all frames to further enhance temporal consistency. Extensive experiments demonstrate the superiority of the proposed EI$^2$ model.
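
The abstract's key observation is that Layer Normalization's variance rescaling contributes to the covariate shift, so STAM centers features without normalizing their spread. A minimal sketch of that contrast, with the InstanceCentering layer reconstructed from the abstract's description rather than taken from the authors' code:

```python
import torch
import torch.nn as nn

class InstanceCentering(nn.Module):
    """Sketch: subtract the per-token feature mean but, unlike LayerNorm,
    do not rescale by the variance, so the spread of the temporal feature
    distribution is preserved. Hypothetical reconstruction."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:    # x: (..., C)
        return x - x.mean(dim=-1, keepdim=True)


# LayerNorm collapses the variance of its input; centering alone does not.
x = 3.0 * torch.randn(4, 16, 128)                           # variance ~9
print(nn.LayerNorm(128)(x).var().item())                    # ~1.0: variance collapsed
print(InstanceCentering()(x).var().item())                  # ~9.0: variance preserved
```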

IJCAI Conference 2018 · Conference Paper

High Resolution Feature Recovering for Accelerating Urban Scene Parsing

  • Rui Zhang
  • Sheng Tang
  • Luoqi Liu
  • Yongdong Zhang
  • Jintao Li
  • Shuicheng Yan

Both accuracy and speed are equally important in urban scene parsing. Most existing methods focus on improving parsing accuracy, ignoring the low inference speed caused by large-sized input and high-resolution feature maps. To tackle this issue, we propose a High Resolution Feature Recovering (HRFR) framework to accelerate a given parsing network. A Super-Resolution Recovering module is employed to recover features of the large original-sized images from the features of the down-sampled input. Our framework can therefore combine the advantages of (1) the fast speed of networks with down-sampled input and (2) the high accuracy of networks with large original-sized input. Additionally, we employ auxiliary intermediate supervision and boundary region re-weighting to facilitate the optimization of the network. Extensive experiments on two challenging datasets, Cityscapes and CamVid, demonstrate the effectiveness of the proposed HRFR framework, which accelerates scene parsing inference by about 3.0× from 1/2-down-sampled input with negligible accuracy reduction.
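
A rough sketch of the Super-Resolution Recovering idea, assuming a 2× PixelShuffle head (an illustrative choice, not the paper's exact design): feature maps computed on a down-sampled image are upsampled back toward the feature resolution the original-sized input would have produced.

```python
import torch
import torch.nn as nn

class FeatureSRModule(nn.Module):
    """Sketch: recover high-resolution feature maps from features
    computed on a down-sampled input. Illustrative assumption only."""

    def __init__(self, channels: int, scale: int = 2):
        super().__init__()
        self.recover = nn.Sequential(
            nn.Conv2d(channels, channels * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),                 # rearranges channels into space
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, low_res_feat: torch.Tensor) -> torch.Tensor:
        return self.recover(low_res_feat)


# Features from a 1/2-down-sampled input recovered to full feature resolution.
sr = FeatureSRModule(channels=256)
feat = torch.randn(1, 256, 64, 64)                  # from the down-sampled branch
print(sr(feat).shape)                               # torch.Size([1, 256, 128, 128])
```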

IJCAI Conference 2017 · Conference Paper

Salient Object Detection with Semantic Priors

  • Tam V. Nguyen
  • Luoqi Liu

Salient object detection has increasingly become a popular topic in cognitive and computational sciences, including computer vision and artificial intelligence research. In this paper, we propose integrating semantic priors into the salient object detection process. Our algorithm consists of three basic steps. First, the explicit saliency map is obtained from the semantic segmentation, refined by explicit saliency priors learned from the data. Next, the implicit saliency map is computed by a trained model that maps the implicit saliency priors embedded in regional features to saliency values. Finally, the explicit and implicit saliency maps are adaptively fused to form a pixel-accurate saliency map that uniformly covers the objects of interest. We evaluate the proposed framework on two challenging datasets, namely ECSSD and HKU-IS. The extensive experimental results demonstrate that our method outperforms other state-of-the-art methods.
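
A toy sketch of the final fusion step: the confidence weighting below is an illustrative assumption standing in for the paper's learned adaptive fusion.

```python
import numpy as np

def fuse_saliency(explicit: np.ndarray, implicit: np.ndarray) -> np.ndarray:
    """Sketch: adaptively fuse an explicit (semantics-driven) and an
    implicit (regional-feature-driven) saliency map. The confidence
    weighting is an illustrative assumption, not the paper's scheme."""
    # Weight each map by how far it is from a non-committal 0.5 response,
    # so the more confident map dominates at each pixel.
    w_e = np.abs(explicit - 0.5)
    w_i = np.abs(implicit - 0.5)
    fused = (w_e * explicit + w_i * implicit) / (w_e + w_i + 1e-8)
    return np.clip(fused, 0.0, 1.0)


# Two dummy maps in [0, 1]; the fused map stays in [0, 1].
rng = np.random.default_rng(0)
e, i = rng.random((240, 320)), rng.random((240, 320))
print(fuse_saliency(e, i).shape, fuse_saliency(e, i).max() <= 1.0)
```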

TIST Journal 2017 · Journal Article

Visual Classification of Furniture Styles

  • Zhenzhen Hu
  • Yonggang Wen
  • Luoqi Liu
  • Jianguo Jiang
  • Richang Hong
  • Meng Wang
  • Shuicheng Yan

Furniture style describes the discriminative appearance characteristics of furniture and plays an important role in real-world indoor decoration. In this article, we explore furniture style features and study the problem of furniture style classification. Unlike traditional object classification, furniture style classification aims at classifying furniture by the “style” that describes its appearance (e.g., American, Gothic, or Rococo style) rather than the “kind” that relates to its functional structure (e.g., bed or desk). To pursue effective furniture style features, we construct a novel dataset of furniture styles containing 16 common style categories and implement three strategies spanning two categories of classification: handcrafted and learning-based. First, we follow the typical image classification pipeline, extracting handcrafted features and training a support vector machine classifier. Then we use a convolutional neural network to extract learning-based features from the training images. To obtain comprehensive furniture style features, we finally combine the handcrafted classification pipeline with the learning-based network. We experimentally evaluate the handcrafted and learning-based features under each strategy; the results show the superiority of learning-based features as well as the comprehensiveness of handcrafted features.
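
A small sketch of the combined third strategy, assuming hypothetical feature dimensions and random stand-in data: handcrafted descriptors and CNN features are concatenated into one joint vector, on which a single SVM is trained.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Sketch: concatenate handcrafted descriptors with CNN features and train
# one SVM on the joint vector. Dimensions, labels, and data are dummies.
n_samples, n_hand, n_cnn, n_styles = 800, 128, 512, 16
rng = np.random.default_rng(0)
handcrafted = rng.random((n_samples, n_hand))      # e.g. color/texture descriptors
cnn_features = rng.random((n_samples, n_cnn))      # e.g. penultimate-layer activations
labels = rng.integers(0, n_styles, n_samples)      # one of 16 style categories

combined = np.hstack([handcrafted, cnn_features])  # joint feature vector
clf = make_pipeline(StandardScaler(), LinearSVC())
clf.fit(combined, labels)
print(clf.score(combined, labels))                 # training accuracy on dummy data
```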