IJCAI 2025

Exploring Efficient and Effective Sequence Learning for Visual Object Tracking

Conference Paper Agent-based and Multi-agent Systems Artificial Intelligence

Abstract

Sequence-learning-based tracking frameworks are popular in the tracking community. In practice, however, their auto-regressive sequence generation leads to inferior performance and high latency compared with the latest advanced trackers. To mitigate this issue, we propose an efficient and effective sequence-to-sequence tracking framework named FastSeqTrack. FastSeqTrack differs from previous sequence-learning-based trackers in its token initialization and in how it generates the sequence. Four tracking tokens are appended to the patch embeddings and generated in the encoder as initial guesses for the bounding box sequence, which improves tracking accuracy compared with randomly initialized tokens. The tracking tokens are then fed into the decoder in parallel in a single pass, which greatly boosts forward inference speed compared with auto-regressive generation. Inspired by the early-exit mechanism, we inject internal classifiers after each decoder layer to terminate forward inference early when the softmax confidence is sufficiently reliable. On easy tracking frames, early exits avoid network overthinking and unnecessary computation. Extensive experiments on multiple benchmarks demonstrate that FastSeqTrack runs at over 100 fps and achieves superior performance against state-of-the-art trackers. Code and models are available at https://github.com/vision4drones/FastSeqTrack.
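
The early-exit idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`early_exit_forward`, `double`, `identity`), the toy layers, and the 0.9 confidence threshold are all assumptions chosen for illustration, and the abstract does not say how confidence is aggregated across the four tracking tokens. The sketch only shows the control flow: run one decoder layer, query the internal classifier attached to it, and stop as soon as the maximum softmax probability reaches the threshold.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> Vector:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_forward(
    tokens: Vector,
    layers: Sequence[Callable[[Vector], Vector]],
    heads: Sequence[Callable[[Vector], Vector]],
    threshold: float = 0.9,  # assumed value, not taken from the paper
) -> Tuple[int, float, int]:
    """Run layers in order and stop once an internal head is confident.

    Returns (predicted_bin, confidence, layers_used). If no head reaches
    the threshold, the prediction of the last layer is returned.
    """
    if len(layers) != len(heads) or not layers:
        raise ValueError("need one head per layer and at least one layer")
    state = tokens
    for depth, (layer, head) in enumerate(zip(layers, heads), start=1):
        state = layer(state)              # one decoder layer (toy stand-in)
        probs = softmax(head(state))      # internal classifier after the layer
        confidence = max(probs)
        if confidence >= threshold:       # confident enough: exit early
            break
    return probs.index(confidence), confidence, depth


# Hypothetical stand-ins for decoder layers and classifier heads.
def double(state: Vector) -> Vector:
    return [2.0 * x for x in state]


def identity(state: Vector) -> Vector:
    return list(state)


# An "easy" input becomes confident after two layers; a "hard" one uses all four.
easy = early_exit_forward([0.0, 1.0, 0.0], [double] * 4, [identity] * 4)
hard = early_exit_forward([0.0, 0.1, 0.0], [double] * 4, [identity] * 4)
print(easy[2], hard[2])
```

In this toy setup each layer sharpens the logits, so the easy input exits before the final layer while the hard input runs the full stack, mirroring the claim that early exits save computation on easy frames.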

Keywords

  • Computer Vision: CV: Motion and tracking
  • Computer Vision: CV: Video analysis and understanding

Context

Venue
International Joint Conference on Artificial Intelligence