SatCLIP: Global, General-Purpose Location Embeddings with Satellite Imagery

Konstantin Klemmer; Esther Rolf; Caleb Robinson; Lester Mackey; Marc Rußwurm

doi:10.1609/aaai.v39i4.32457

AAAI 2025

Conference Paper AAAI Technical Track on Computer Vision III Artificial Intelligence

Abstract

Geographic information is essential for modeling tasks in fields ranging from ecology to epidemiology. However, extracting relevant location characteristics for a given task can be challenging, often requiring expensive data fusion or distillation from massive global imagery datasets. To address this challenge, we introduce Satellite Contrastive Location-Image Pretraining (SatCLIP). This global, general-purpose geographic location encoder learns an implicit representation of locations by matching CNN- and ViT-inferred visual patterns of openly available satellite imagery with their geographic coordinates. The resulting SatCLIP location encoder efficiently summarizes the characteristics of any given location for convenient use in downstream tasks. In our experiments, we use SatCLIP embeddings to improve performance on nine diverse geospatial prediction tasks, including temperature prediction, animal recognition, and population density estimation. Across tasks, SatCLIP consistently outperforms alternative location encoders and shows promise for improving geographic domain adaptation. These results demonstrate the potential of vision-location models to learn meaningful representations of our planet from the vast, varied, and largely untapped modalities of geospatial data.
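
The matching of image-derived features with coordinates described above can be illustrated with a generic CLIP-style symmetric contrastive objective. The following is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the loss, so the function names, the cosine-similarity logits, and the temperature value of 0.07 are all hypothetical choices made for illustration. The inputs stand in for the outputs of a location encoder and an image encoder on a batch of paired samples.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(location_embs, image_embs, temperature=0.07):
    """CLIP-style loss over a batch of paired (location, image) embeddings.

    Assumption: row i of location_embs and row i of image_embs describe the
    same place; every other pairing in the batch is treated as a negative.
    The temperature is a hypothetical default, not a value from the paper.
    """
    loc = [l2_normalize(v) for v in location_embs]
    img = [l2_normalize(v) for v in image_embs]
    # Cosine similarity between every location and every image, scaled.
    logits = [
        [sum(a * b for a, b in zip(l, m)) / temperature for m in img]
        for l in loc
    ]
    logits_t = [list(col) for col in zip(*logits)]  # image-to-location view
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_t))


# Toy batch of two pairs: aligned embeddings versus deliberately swapped ones.
aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
swapped = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the aligned pairs yield a loss near zero while the swapped pairs yield a much larger one, which is the signal that would push a location encoder to place each coordinate close to the imagery observed there.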
