Arrow Research

Author name cluster

Liang Qiao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
2 author rows

Possible papers (6)

JBHI Journal 2026 Journal Article

Dual-Cross Tri-Level Routing Transformer Based Metric Learning Network for Epileptic Seizure Prediction Using a Single-Channel iEEG

  • Yifan Wang
  • Weidong Yan
  • Yulan Ma
  • Liang Qiao
  • Tao Yu
  • Jingyu Liu

With the development of deep brain stimulation techniques, seizure prediction based on single-channel intracranial electroencephalography (iEEG) is an urgently needed tool for closed-loop neuromodulation in epilepsy. However, previous prediction methods based on multi-channel scalp signals relied heavily on spatial information, failing to fully exploit the interdependencies between temporal scales and spectral rhythms of single-channel iEEG. Additionally, current contrastive learning strategies can lead to model overfitting by excessively learning feature distances on small samples, limiting the precision of seizure prediction. To tackle the above issues, we propose a novel dual-cross tri-level routing transformer based metric learning network (DC-TRT-MLNet) for epileptic seizure prediction from single-channel iEEG. First, a scale-rhythm dual-cross (DC) graph attention network is introduced to construct the dependency relationships across multi-scale temporal and multi-rhythm spectral features. Second, we design a tri-level routing transformer (TRT) network to comprehensively refine the most seizure-relevant routing features while eliminating redundant information. Finally, a hard triplet optimization based metric learning (ML) strategy is developed to iteratively optimize the intra-class and inter-class distances of inter-ictal and pre-ictal routing features. Competitive experimental results on a private Xuanwu Single-Channel iEEG dataset validate the effectiveness of our proposed method, demonstrating the superior prediction performance of our DC-TRT-MLNet compared with state-of-the-art methods. Our study may offer a new solution for intracranial single-channel seizure prediction.
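
To make the hard-triplet metric learning idea concrete, here is a minimal batch-hard triplet loss sketch in PyTorch. The batch layout, label encoding, margin value, and function name are illustrative assumptions, not the authors' implementation.

```python
import torch

def hard_triplet_loss(features, labels, margin=0.2):
    """Batch-hard triplet loss sketch: for each anchor, take its farthest
    same-class sample and its closest different-class sample, then push the
    gap between those two distances past the margin.
    features: (N, D) embeddings (e.g. routing features); labels: (N,) class
    ids (e.g. 0 = inter-ictal, 1 = pre-ictal -- an assumed encoding)."""
    dists = torch.cdist(features, features)            # (N, N) pairwise L2 distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)  # same-class mask (incl. diagonal)

    hardest_pos = dists.masked_fill(~same, 0.0).max(dim=1).values           # farthest positive
    hardest_neg = dists.masked_fill(same, float("inf")).min(dim=1).values   # closest negative

    return torch.relu(hardest_pos - hardest_neg + margin).mean()
```

Minimizing this loss shrinks intra-class distances and enlarges inter-class distances, which is the role the metric learning strategy plays in the abstract above.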

ICRA Conference 2024 Conference Paper

SM³: Self-supervised Multi-task Modeling with Multi-view 2D Images for Articulated Objects

  • Haowen Wang 0001
  • Zhen Zhao
  • Zhao Jin
  • Zhengping Che
  • Liang Qiao
  • Yakun Huang
  • Zhipeng Fan
  • Xiuquan Qiao

Reconstructing real-world objects and estimating their movable joint structures are pivotal technologies within the field of robotics. Previous research has predominantly focused on supervised approaches, relying on annotated datasets to model articulated objects within limited categories. However, these approaches fall short of effectively addressing the diversity present in the real world. To tackle this issue, we propose a self-supervised interaction perception method, referred to as SM³, which leverages multi-view RGB images captured before and after interaction to model articulated objects, identify the movable parts, and infer the parameters of their rotating joints. By constructing 3D geometries and textures from the captured 2D images, SM³ achieves integrated optimization of movable part and joint parameters during the reconstruction process, obviating the need for annotations. Furthermore, we introduce the MMArt dataset, an extension of PartNet-Mobility, encompassing multi-view and multi-modal data of articulated objects spanning diverse categories. Evaluations demonstrate that SM³ surpasses existing benchmarks across various categories and objects, and its adaptability in real-world scenarios has been thoroughly validated.
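
As a rough illustration of the joint-parameter estimation the abstract describes, the sketch below recovers a revolute joint's rotation axis and angle from the relative rotation of a movable part between its before- and after-interaction poses. The pose inputs and function name are assumptions for illustration, not SM³'s actual pipeline.

```python
import numpy as np

def revolute_joint_from_poses(R_before, R_after):
    """Given 3x3 rotation matrices of a movable part before and after an
    interaction (assumed known here), return the unit rotation axis and the
    rotation angle of the underlying revolute joint."""
    R_rel = R_after @ R_before.T                 # relative rotation of the part
    angle = np.arccos(np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0))
    # Rotation axis from the skew-symmetric part of R_rel (valid for angle != 0, pi).
    axis = np.array([R_rel[2, 1] - R_rel[1, 2],
                     R_rel[0, 2] - R_rel[2, 0],
                     R_rel[1, 0] - R_rel[0, 1]])
    axis = axis / (2.0 * np.sin(angle))
    return axis, angle
```

This is the standard axis-angle extraction; in the self-supervised setting the part poses themselves would come from the multi-view reconstruction rather than being given.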

IJCAI Conference 2023 Conference Paper

Divide Rows and Conquer Cells: Towards Structure Recognition for Large Tables

  • Huawen Shen
  • Xiang Gao
  • Jin Wei
  • Liang Qiao
  • Yu Zhou
  • Qiang Li
  • Zhanzhan Cheng

Recent advanced Table Structure Recognition (TSR) models adopt image-to-text solutions to parse table structure. These methods can be formulated as an image captioning problem, i.e., taking a single-table image as input and outputting a table structure description in a specific text format, e.g., HTML. With the impressive success of Transformers in text generation tasks, these methods use a Transformer architecture to predict HTML table text in an autoregressive manner. However, tables come in a large variety of shapes and sizes. Autoregressive models usually suffer from the error accumulation problem as the length of the predicted text increases, which results in unsatisfactory performance for large tables. In this paper, we propose a novel image-to-text based TSR method that relieves the error accumulation problem and improves performance noticeably. At the core of our method is a cascaded two-step decoder architecture, with the former decoder predicting HTML table row tags non-autoregressively and the latter predicting the HTML table cell tags of each row in a semi-autoregressive manner. Compared with existing methods that predict HTML text autoregressively, the superiority of our row-to-cell progressive table parsing is twofold: (1) it generates the HTML tag sequence with a vertical-and-horizontal two-step 'scanning', which better fits the inherent 2D structure of image data, and (2) it performs substantially better for large tables (long sequence prediction) since it alleviates the error accumulation problem specific to autoregressive models. Extensive experiments demonstrate that our method achieves competitive performance on three public benchmarks.
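
A minimal sketch of the row-then-cell decoding order described above, written as a decoding loop over two pluggable decoders. The decoder interfaces, the end-of-row token, and the safety cap are illustrative assumptions, not the paper's API.

```python
from typing import Callable, List

def two_step_decode(image_feats,
                    row_decoder: Callable[[object], List[object]],
                    cell_decoder: Callable[[object, object, List[str]], str],
                    eos: str = "</row>") -> List[List[str]]:
    """Step 1: row_decoder predicts all row states in one shot (non-autoregressive).
    Step 2: cell_decoder emits cell tags for one row at a time, conditioned on the
    tags produced so far in that row (semi-autoregressive)."""
    table = []
    row_states = row_decoder(image_feats)          # all rows in parallel
    for row_state in row_states:                   # cells, row by row
        cells: List[str] = []
        while True:
            tag = cell_decoder(image_feats, row_state, cells)
            if tag == eos or len(cells) > 512:     # stop at end-of-row or a safety cap
                break
            cells.append(tag)
        table.append(cells)
    return table
```

Because errors can only accumulate within a single row rather than across the whole table sequence, long (large-table) predictions degrade more gracefully than with a fully autoregressive decoder.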

AAAI Conference 2021 Conference Paper

MANGO: A Mask Attention Guided One-Stage Scene Text Spotter

  • Liang Qiao
  • Ying Chen
  • Zhanzhan Cheng
  • Yunlu Xu
  • Yi Niu
  • Shiliang Pu
  • Fei Wu

Recently, end-to-end scene text spotting has become a popular research topic due to its advantages of global optimization and high maintainability in real applications. Most methods attempt to develop various region of interest (RoI) operations to concatenate the detection part and the sequence recognition part into a two-stage text spotting framework. However, in such a framework, the recognition part is highly sensitive to the detected results (e.g., the compactness of text contours). To address this problem, in this paper we propose a novel Mask AttentioN Guided One-stage text spotting framework named MANGO, in which character sequences can be directly recognized without RoI operations. Concretely, a position-aware mask attention module is developed to generate attention weights on each text instance and its characters. It allows different text instances in an image to be allocated to different feature map channels, which are further grouped as a batch of instance features. Finally, a lightweight sequence decoder is applied to generate the character sequences. It is worth noting that MANGO inherently adapts to arbitrary-shaped text spotting and can be trained end-to-end with only coarse position information (e.g., a rectangular bounding box) and text annotations. Experimental results show that the proposed method achieves competitive and even new state-of-the-art performance on both regular and irregular text spotting benchmarks, i.e., ICDAR 2013, ICDAR 2015, Total-Text, and SCUT-CTW1500.
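
To illustrate the grouping idea behind mask attention, the sketch below pools spatial features into per-instance features using one attention map per text-instance slot. The tensor shapes and softmax normalization are assumptions in the spirit of the module described above, not MANGO's exact design.

```python
import torch

def mask_attention_pool(feat_map, attn_logits):
    """feat_map: (B, C, H, W) backbone features.
    attn_logits: (B, K, H, W) one attention map per text-instance slot.
    Returns one pooled feature vector per instance slot."""
    attn = attn_logits.flatten(2).softmax(dim=-1)            # (B, K, H*W) weights over positions
    feats = feat_map.flatten(2)                              # (B, C, H*W)
    # Weighted sum over spatial positions -> a batch of instance features.
    inst_feats = torch.einsum("bkn,bcn->bkc", attn, feats)   # (B, K, C)
    return inst_feats
```

A lightweight sequence decoder can then read character sequences directly from these instance features, which is what lets the framework skip explicit RoI cropping.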

AAAI Conference 2020 Conference Paper

Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting

  • Liang Qiao
  • Sanli Tang
  • Zhanzhan Cheng
  • Yunlu Xu
  • Yi Niu
  • Shiliang Pu
  • Fei Wu

Many approaches have recently been proposed to detect irregular scene text and have achieved promising results. However, their localization results may not serve the subsequent text recognition part well, mainly for two reasons: 1) recognizing arbitrary-shaped text is still a challenging task, and 2) prevalent non-trainable pipeline strategies between text detection and text recognition lead to suboptimal performance. To handle this incompatibility problem, in this paper we propose an end-to-end trainable text spotting approach named Text Perceptron. Concretely, Text Perceptron first employs an efficient segmentation-based text detector that learns the latent text reading order and boundary information. Then a novel Shape Transform Module (abbr. STM) is designed to transform the detected feature regions into regular morphologies without extra parameters. It unites text detection and the following recognition part into a single framework and helps the whole network achieve global optimization. Experiments show that our method achieves competitive performance on two standard text benchmarks, i.e., ICDAR 2013 and ICDAR 2015, and clearly outperforms existing methods on the irregular text benchmarks SCUT-CTW1500 and Total-Text.
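
As a rough sketch of what transforming an irregular region into a regular morphology can look like, the code below rectifies a curved text region into a fixed-size strip by blending its top and bottom boundary points into a sampling grid. The point format and sampling scheme are illustrative assumptions, not the paper's STM.

```python
import torch
import torch.nn.functional as F

def rectify_region(feat_map, top_pts, bot_pts, out_h=8, out_w=32):
    """feat_map: (1, C, H, W) features.
    top_pts / bot_pts: (P, 2) boundary points along the top/bottom edge of the
    text region, in normalized [-1, 1] coordinates (assumed input format)."""
    # Resample each boundary curve to out_w evenly spaced points.
    def resample(pts, n):
        pts = pts.T.unsqueeze(0)                                              # (1, 2, P)
        return F.interpolate(pts, size=n, mode="linear", align_corners=True)[0].T  # (n, 2)

    top = resample(top_pts, out_w)                        # (out_w, 2)
    bot = resample(bot_pts, out_w)
    # Blend top and bottom boundaries to build a regular out_h x out_w sampling grid.
    t = torch.linspace(0, 1, out_h).view(out_h, 1, 1)     # vertical blending weights
    grid = (1 - t) * top.unsqueeze(0) + t * bot.unsqueeze(0)                  # (out_h, out_w, 2)
    return F.grid_sample(feat_map, grid.unsqueeze(0), align_corners=True)     # (1, C, out_h, out_w)
```

Because the sampling is differentiable, gradients from the recognizer can flow back into the detector, which is what makes joint, globally optimized training of the two parts possible.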