Detecting Objects From Text Prompts with RasterFlow

Private Preview

The following content is a read-only preview of an executable Jupyter notebook. To run this notebook interactively:

Go to Wherobots Cloud.
Start a runtime.
Open the notebook.
In the Jupyter Launcher:
1. Click File > Open Path.
2. Paste the following path to access this notebook: examples/Analyzing_Data/RasterFlow_SAM3.ipynb
3. Click Enter.

This notebook walks through detecting objects in aerial imagery using text prompts, powered by Wherobots RasterFlow and Meta’s Segment Anything Model 3 (SAM3). You will learn how to run geometry inference on an area of interest, work with the detected geometries, and visualize the results in WherobotsDB.

SAM3

SAM3 is a text-prompted geometry inference model that detects objects in imagery based on natural language descriptions. Given a text prompt like "building" or "roofs", the model produces georeferenced vector geometries (bounding boxes or polygons) for each detected object, along with a confidence score. Unlike segmentation models that produce raster outputs, SAM3 directly outputs vector geometries, making it straightforward to integrate results into geospatial workflows. We will demonstrate results using 30cm resolution data from the National Agriculture Imagery Program (NAIP).

Preview: model inputs and outputs

The interactive map linked below shows the input imagery and model output for this notebook’s example AOI. Toggle layers in the sidebar to compare the input imagery with the model’s PMTiles output side-by-side. Layers:

SAM3 input mosaic: RGB NAIP aerial imagery the model runs on
SAM3 PM Tiles: polygons output directly by the text-prompted model, delivered as PMTiles for fast rendering at scale (SAM3 is a geometry inference model, so there is no intermediate raster output)

View the interactive map here.

Selecting an Area of Interest (AOI)

To start, we will choose an Area of Interest (AOI) for our analysis where 30cm resolution NAIP data is available: College Park, Maryland. The National Agriculture Imagery Program (NAIP) provides aerial imagery for the United States, capturing high-resolution images during the agricultural growing seasons. To try other AOIs, be sure to choose a region where 30cm resolution imagery is available. See this map for more details.

import wkls
import geopandas as gpd
import os

# Generate a geometry for College Park, Maryland using Well-Known Locations (https://github.com/wherobots/wkls)
gdf = gpd.read_file(wkls.us.md.collegepark.geojson())

# Save the geometry to a parquet file in the user's S3 path
aoi_path = os.getenv("USER_S3_PATH") + "collegepark.parquet"
gdf.to_parquet(aoi_path)
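
The recipe used later in this notebook accepts the AOI as GeoParquet or GeoJSON. If your region is not available through Well-Known Locations, one option is to describe it as a bounding box and write it out as GeoJSON. The following is a minimal sketch using only the Python standard library; the helper name `bbox_to_geojson`, the coordinates, and the output file name are illustrative assumptions and are not part of the notebook.

```python
import json
import os
import tempfile


def bbox_to_geojson(min_lon, min_lat, max_lon, max_lat):
    """Return a GeoJSON FeatureCollection with one rectangular polygon.

    Coordinates are longitude/latitude in WGS84, as GeoJSON (RFC 7946) expects.
    """
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("bounding box must have positive width and height")
    # Exterior ring: counter-clockwise, first and last positions identical.
    ring = [
        [min_lon, min_lat],
        [max_lon, min_lat],
        [max_lon, max_lat],
        [min_lon, max_lat],
        [min_lon, min_lat],
    ]
    return {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "properties": {"name": "custom_aoi"},
                "geometry": {"type": "Polygon", "coordinates": [ring]},
            }
        ],
    }


# Hypothetical bounding box roughly around College Park, Maryland.
aoi_geojson = bbox_to_geojson(-76.96, 38.96, -76.91, 39.01)

# Written to a local temporary directory so the sketch runs anywhere.
custom_aoi_path = os.path.join(tempfile.gettempdir(), "custom_aoi.geojson")
with open(custom_aoi_path, "w") as f:
    json.dump(aoi_geojson, f)
```

To use a file like this with RasterFlow, you would presumably need to place it somewhere the service can read, such as under your `USER_S3_PATH`, and confirm that 30cm imagery covers the area.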

Initializing the RasterFlow client

from datetime import datetime

from rasterflow_remote import RasterflowClient
from rasterflow_remote.data_models import GeometryModelRecipes

rf_client = RasterflowClient()

Running geometry inference

RasterFlow has pre-defined recipes that simplify orchestration of the processing steps for geometry inference. These steps include:

Ingesting imagery for the specified Area of Interest (AOI)
Generating a seamless mosaic from multiple image tiles
Running text-prompted geometry inference with the SAM3 model

The output is a GeoDataFrame of detected geometries with confidence scores.

Note: The GeometryModelRecipes.SAM3_TEXT_GEOMETRY and GeometryModelRecipes.SAM3_TEXT_BBOX recipes always resize each patch to 1008x1008, whatever patch_size is configured by InferenceConfig. This means you can control the amount of spatial context passed to SAM3 in each pass, but a patch size larger than 1008x1008 is downsampled to a coarser resolution, while a smaller one is upsampled.
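
To make the resizing concrete: if a square patch of `patch_size` pixels is resized to 1008 pixels per side, the ground distance represented by each pixel the model sees scales by `patch_size / 1008`. The sketch below works through that arithmetic for 30cm imagery. The function and constant names are assumptions for illustration and are not part of RasterFlow.

```python
MODEL_INPUT_SIZE = 1008  # side length, in pixels, that the SAM3 recipes resize to
NATIVE_RESOLUTION_M = 0.3  # 30cm NAIP imagery


def effective_resolution_m(patch_size, native_resolution_m=NATIVE_RESOLUTION_M):
    """Ground distance per pixel after a patch_size patch is resized to 1008."""
    return native_resolution_m * patch_size / MODEL_INPUT_SIZE


for patch_size in (504, 1008, 2016):
    footprint_m = patch_size * NATIVE_RESOLUTION_M  # ground width of one patch
    res = effective_resolution_m(patch_size)
    print(f"patch_size={patch_size}: covers {footprint_m:.1f} m, "
          f"model sees {res:.2f} m/pixel")
```

Under these assumptions, a 2016-pixel patch covers twice the ground width of a 1008-pixel patch but reaches the model at 0.6 m per pixel, while a 504-pixel patch is enlarged to 0.15 m per pixel without adding real detail.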

Note: This step will take approximately 22 minutes to complete the first time it is run.

model_output = rf_client.predict_mosaic_geometries_recipe(
    # Path to our AOI in GeoParquet or GeoJSON format
    aoi=aoi_path,

    # Date range for imagery to be used by the model
    start=datetime(2023, 1, 1),
    end=datetime(2024, 1, 1),

    # Coordinate Reference System EPSG code for the output
    target_crs=3857,

    # The model recipe and text prompt for object detection
    model_recipe=GeometryModelRecipes.SAM3_TEXT_BBOX,
    # You can also pass multiple prompts to detect several object types at once,
    # e.g. text_prompt=["roofs", "swimming pools"]
    text_prompt="roofs",
    confidence_threshold=0.7,
)

detections_gdf = gpd.read_parquet(model_output.uri)
detections_gdf

Explore the detected geometries

The geometry inference output is a GeoDataFrame where each row is a detected object. The columns include:

geometry: the georeferenced polygon for the detection
label: the text prompt category (e.g. "roofs")
bbox_score: confidence score for the detection
bbox: bounding box coordinates
time: timestamp of the source imagery

print(f"Total detections: {len(detections_gdf)}")
print(f"Columns: {list(detections_gdf.columns)}")
print("\nConfidence score stats:")
detections_gdf["bbox_score"].describe()
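
Because each detection carries a `bbox_score`, it can help to see how many detections would survive stricter thresholds before settling on a cutoff. The sketch below uses a small list of made-up scores so that it runs on its own; the scores and the helper name are assumptions for illustration only.

```python
# Made-up confidence scores standing in for detections_gdf["bbox_score"].
# All are above 0.7, in line with the confidence_threshold used earlier.
sample_scores = [0.71, 0.74, 0.78, 0.82, 0.85, 0.88, 0.91, 0.93, 0.96, 0.98]


def count_at_or_above(scores, threshold):
    """Number of scores that meet or exceed the threshold."""
    return sum(1 for score in scores if score >= threshold)


survivors = {t: count_at_or_above(sample_scores, t) for t in (0.7, 0.8, 0.9)}
for threshold, count in survivors.items():
    print(f"threshold {threshold}: {count} of {len(sample_scores)} detections kept")
```

On the real output, the equivalent filter is a boolean mask on the GeoDataFrame, for example `detections_gdf[detections_gdf["bbox_score"] >= 0.9]`.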

Save the results to the catalog

We can persist the GeoParquet results by writing them to a catalog table with WherobotsDB.

from sedona.spark import *
from pyspark.sql.functions import expr

config = SedonaContext.builder().getOrCreate()
sedona = SedonaContext.create(config)

sedona.sql("CREATE DATABASE IF NOT EXISTS examples_temp.sam3_db")

df = sedona.read.format("geoparquet").load(model_output.uri)
df = df.withColumnRenamed("label", "layer")
df.writeTo("examples_temp.sam3_db.sam3_roofs").createOrReplace()

Visualize the detected geometries

We can filter the detections by area to remove noise, then visualize the results.

df_filtered = df.withColumn(
    "area_m2",
    expr("ST_AreaSpheroid(geometry)")
).filter("area_m2 > 10")

df_filtered.show()
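
The recipe call earlier requested EPSG:3857 (Web Mercator) as the target CRS. Planar areas measured in Web Mercator units are inflated away from the equator, which is one reason to threshold on a spheroidal area in square meters rather than a planar one. As a rough illustration (a spherical approximation, not the calculation Sedona performs), Web Mercator stretches lengths by about 1/cos(latitude) and therefore areas by about 1/cos²(latitude). The function name and latitude below are assumptions for illustration.

```python
import math


def web_mercator_area_inflation(latitude_deg):
    """Approximate factor by which EPSG:3857 planar area exceeds ground area.

    Uses the spherical Mercator scale factor 1 / cos(latitude), squared for area.
    """
    return 1.0 / math.cos(math.radians(latitude_deg)) ** 2


# Approximate latitude of College Park, Maryland (an assumption for illustration).
college_park_lat = 38.99
inflation = web_mercator_area_inflation(college_park_lat)

# A roof with a true footprint of 10 m^2 would measure larger in planar 3857 units.
true_area_m2 = 10.0
planar_area_m2 = true_area_m2 * inflation
print(f"inflation factor: {inflation:.2f}, planar area: {planar_area_m2:.1f}")
```

At this latitude the factor falls between 1.6 and 1.7, so a planar 10 m² cutoff in Web Mercator would correspond to only about 6 m² on the ground. Spheroidal area functions generally interpret coordinates as longitude and latitude on WGS84, so it is worth confirming the CRS of the `geometry` column before relying on the 10 m² cutoff.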

from sedona.spark.maps.SedonaKepler import SedonaKepler

# Use a name other than `map` to avoid shadowing the Python builtin
kepler_map = SedonaKepler.create_map(df=df_filtered, name="SAM3 roof detections")
kepler_map

Generate PMTiles for visualization

To keep visualization responsive when there are many geometries, we can use Wherobots' built-in high-performance PMTiles generator.

from wherobots import vtiles

full_tiles_path = os.getenv("USER_S3_PATH") + "sam3_roofs_tiles.pmtiles"
vtiles.generate_pmtiles(df_filtered, full_tiles_path)

vtiles.show_pmtiles(full_tiles_path)
