AAAI Conference 2026 Conference Paper
Learning to LEAP: Efficient Dense Point Tracking by Focusing Where It Matters
- Chenzhi Zhao
- Wufan Wang
- Bo Zhang
- Wendong Wang
Tracking Any Point (TAP) is a foundational task in computer vision with broad applicability. The state-of-the-art self-supervised TAP method leverages a global matching transformer and contrastive random walks to learn point correspondences. However, its dense all-pairs attention and correlation volume computation tend to introduce irrelevant features and produce less informative training signals, degrading both learning efficiency and tracking accuracy. To address these limitations, we introduce LEAP-Track, a self-supervised TAP approach that computes the attention matrices and correlation volume over adaptively selected sparse pairs. It consists of two core designs: (1) Curriculum-based Sparse Attention (CSA), which dynamically focuses on the most relevant keys, promoting the learning of discriminative features; and (2) Progressive k-NN Transition (PkT), which reformulates the contrastive random walk to operate on an increasingly sparse k-NN affinity graph to reinforce the learning of the most informative correspondences. By integrating the above two designs into a two-stage training paradigm, LEAP-Track is shown both theoretically and empirically to effectively boost learning efficiency, achieving superior tracking accuracy over existing self-supervised TAP methods.