## Introduction to TerraTorch

Geospatial Foundation Models (GFMs) are often used for three core workflows: classification, segmentation, and embedding extraction. TerraTorch provides a no-/low-code interface to fine-tune and evaluate GFMs via configuration files and simple commands, which makes it ideal for quickly exploring a task before writing custom code.
## Quick environment check

Use this cell to confirm your runtime and whether `terratorch` is available. If it is not installed, see the optional install cell below.
```python
import sys, platform

print(f"Python: {sys.version.split()[0]}")
print(f"Platform: {platform.platform()}")

try:
    import torch
    print(f"PyTorch: {torch.__version__}; cuda={torch.cuda.is_available()}")
except Exception as e:
    print("PyTorch not available:", e)

try:
    import terratorch
    print("TerraTorch is installed.")
except Exception as e:
    print("TerraTorch not available:", e)
```
Output:

```
Python: 3.11.13
Platform: macOS-15.6-x86_64-i386-64bit
PyTorch: 2.7.1; cuda=False
TerraTorch is installed.
```
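Optional install cell (a minimal sketch: it assumes TerraTorch is distributed on PyPI under the package name `terratorch`; substitute your preferred environment manager as needed):

```bash
# Assumes the package is published on PyPI as "terratorch"
pip install terratorch
```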
## No-code: Land cover classification (single-label)
Intent: Show how a configuration file can fine-tune a pretrained backbone on a standard classification dataset with no Python coding.
1) Example configuration
```yaml
# terratorch-configs/classification_eurosat.yaml
task: classification
data:
  dataset: geobench.eurosat_rgb   # Example GEO-Bench dataset key
  split: standard                 # Use library-provided split
  batch_size: 64
  num_workers: 4
model:
  backbone: prithvi-100m          # Example backbone identifier
  pretrained: true
  head: linear                    # Linear classifier head
  num_classes: 10                 # EuroSAT RGB has 10 classes
trainer:
  max_epochs: 5
  precision: 16
  accelerator: auto
optim:
  name: adamw
  lr: 3.0e-4
  weight_decay: 0.01
outputs:
  dir: runs/classification_eurosat
```
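2) Run training from the command line (choose one). The command names below depend on how TerraTorch is installed in your environment; treat them as example patterns rather than guaranteed entry points:

```bash
# Option A: dedicated CLI (if provided by your TerraTorch install)
terratorch-train --config terratorch-configs/classification_eurosat.yaml

# Option B: Python module entry point (Hydra-style)
python -m terratorch.train --config terratorch-configs/classification_eurosat.yaml
```

3) Evaluate or predict (typical patterns, with the same caveat about command names):

```bash
terratorch-eval --run runs/classification_eurosat
terratorch-predict --run runs/classification_eurosat --images path/to/*.tif --out preds/
```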
What to notice:

- Data, model, and trainer sections are declarative: change `dataset`, `backbone`, or `max_epochs` to iterate rapidly.
- Outputs are organized under `runs/` for easy comparison across experiments.
## No-code: Semantic segmentation (pixel-wise)
Intent: Demonstrate swapping the task and head while reusing a pretrained backbone.
1) Example configuration
```yaml
# terratorch-configs/segmentation_floods.yaml
task: segmentation
data:
  dataset: geobench.floods_s2   # Example placeholder for a flood dataset
  split: standard
  batch_size: 4                 # Larger images → smaller batch
  num_workers: 4
model:
  backbone: satmae-base
  pretrained: true
  head: unet                    # Use a UNet-style decoder
  num_classes: 2                # water vs. non-water (example)
trainer:
  max_epochs: 10
  precision: 16
  accelerator: auto
optim:
  name: adamw
  lr: 1.0e-4
  weight_decay: 0.01
outputs:
  dir: runs/segmentation_floods
```
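2) Train and visualize predictions (command names follow the same pattern as the classification example; adjust them to your install):

```bash
terratorch-train --config terratorch-configs/segmentation_floods.yaml
terratorch-predict --run runs/segmentation_floods --images path/to/patches/*.tif --out preds/
```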
What to notice:

- Relative to the classification config, the task-defining keys (`task`, `head`, `num_classes`) are what change; the dataset, batch size, and learning rate are simply retuned for the new data.
- A pretrained backbone can be reused across very different downstream tasks, whether you keep `prithvi-100m` or swap in another supported GFM such as `satmae-base`.
## No-/Low-code: Embedding extraction for retrieval or clustering
Intent: Extract patch-level embeddings from a pretrained GFM for downstream analytics (nearest neighbors, clustering, or few-shot learning).
1) Example configuration
```yaml
# terratorch-configs/embeddings_satellite.yaml
task: embeddings
data:
  dataset: geobench.eurosat_rgb
  split: train
  batch_size: 128
  num_workers: 4
model:
  backbone: prithvi-100m
  pretrained: true
  pooling: gap   # global average pool token embeddings
outputs:
  dir: runs/embeddings_eurosat
```
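2) Extract and save features. This is the "embedding command" referenced by the code below; as with the other commands, the exact CLI name depends on your TerraTorch install:

```bash
terratorch-embed --config terratorch-configs/embeddings_satellite.yaml
```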
3) Low-code: Load saved features and inspect neighbors
```python
# Example: toy post-processing of saved features (replace with your run path)
import os
import numpy as np

run_dir = "runs/embeddings_eurosat"  # adjust to your path
features_path = os.path.join(run_dir, "features.npy")
labels_path = os.path.join(run_dir, "labels.npy")

if os.path.exists(features_path) and os.path.exists(labels_path):
    feats = np.load(features_path)
    labels = np.load(labels_path)
    print("features:", feats.shape, "labels:", labels.shape)

    # Cosine similarities to the first sample
    a = feats[0:1]
    sims = (feats @ a.T) / (np.linalg.norm(feats, axis=1, keepdims=True) * np.linalg.norm(a))
    topk = np.argsort(-sims.squeeze())[:5]
    print("Top-5 nearest neighbors to sample 0:", topk.tolist())
else:
    print("Feature files not found. Run the embedding command first (see above).")
```
Output:

```
Feature files not found. Run the embedding command first (see above).
```
What to notice:

- Embeddings provide a versatile representation for retrieval, clustering, and few-shot tasks.
- You can mix no-code extraction with simple, custom analytics, as in the clustering sketch below.
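As a small illustration of that mix, the sketch below clusters the saved embeddings with k-means. It assumes the `features.npy` file produced by the extraction step above and that scikit-learn is installed; neither the file layout nor the extra dependency is guaranteed by the configs shown here.

```python
# Toy clustering of saved embeddings (assumes features.npy from the extraction
# step above and that scikit-learn is available; adjust paths to your run).
import os
import numpy as np

run_dir = "runs/embeddings_eurosat"
features_path = os.path.join(run_dir, "features.npy")

if os.path.exists(features_path):
    from sklearn.cluster import KMeans

    feats = np.load(features_path)
    # L2-normalize so Euclidean k-means behaves like cosine-based grouping
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    cluster_ids = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(feats)
    print("Samples per cluster:", np.bincount(cluster_ids, minlength=10).tolist())
else:
    print("Feature files not found. Run the embedding command first (see above).")
```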
## Tips for adapting configs

- Change `data.dataset` to switch benchmarks or point at your own dataset key.
- Swap `model.backbone` among supported GFMs (e.g., `prithvi-100m`, `satmae-base`).
- Choose an appropriate head for the task: `linear` (classification), `unet` (segmentation), or a `pooling` option (embeddings).
- Keep `trainer.max_epochs` small for quick sanity checks, then scale up (see the example variant below).
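For example, a quick sanity-check variant of the segmentation config might change only a handful of keys. The file name and values below are illustrative (the dataset and backbone identifiers are the same example keys used earlier), not settings that ship with TerraTorch:

```yaml
# terratorch-configs/segmentation_floods_smoke.yaml  (illustrative variant)
task: segmentation
data:
  dataset: geobench.floods_s2     # or your own dataset key
  split: standard
  batch_size: 2
  num_workers: 2
model:
  backbone: prithvi-100m          # swapped backbone, same task and head
  pretrained: true
  head: unet
  num_classes: 2
trainer:
  max_epochs: 1                   # tiny run for a quick sanity check
  precision: 16
  accelerator: auto
outputs:
  dir: runs/segmentation_floods_smoke
```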
## Why this matters (reflection)
No-/low-code workflows let you validate feasibility and surface bottlenecks quickly (data quality, class imbalance, resolution). Once you see promising signals, you can transition to custom training loops or integrate advanced augmentations—while keeping the same pretrained backbone and dataset.