RasterflowClient
1. export RASTERFLOW_CONFIG_FILE=/path/to/rasterflow.yaml
2. or create at ~/.config/rasterflow.yaml
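The two configuration options above can be read as a lookup order. The sketch below is only an illustration of that idea: `resolve_config_path` is a hypothetical helper, not part of the client, and the assumption that the environment variable takes precedence over the file in `~/.config` is not confirmed by this page.

```python
import os
from pathlib import Path


def resolve_config_path(env=None):
    """Return the config file path for a given environment mapping.

    Assumption: RASTERFLOW_CONFIG_FILE, when set, wins over the
    default location ~/.config/rasterflow.yaml.
    """
    env = os.environ if env is None else env
    override = env.get("RASTERFLOW_CONFIG_FILE")
    if override:
        return Path(override)
    # Fall back to the default location named in the docs.
    return Path.home() / ".config" / "rasterflow.yaml"


if __name__ == "__main__":
    print(resolve_config_path())
```

Passing a plain dictionary instead of `os.environ` makes the lookup easy to check without changing the real environment.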
build_and_predict_mosaic_recipe()
Build a mosaic and run inference using a pre-configured recipe for a specific model. This convenience method combines mosaic building and inference into a single workflow using a model recipe that defines the input datasets, model, and inference parameters.
Parameters
Remote URL path to the area of interest GeoDataFrame.
Start date for the temporal range of the mosaic.
End date for the temporal range of the mosaic.
Pre-configured recipe that defines the datasets, model, and inference parameters to use. Available recipes are defined in the ModelRecipes enum.
EPSG code for the coordinate reference system of the output. Consider using a UTM-specific EPSG code (e.g., EPSG:32610 for UTM Zone 10N) to avoid reprojection/resampling if your AOI falls within a single UTM zone and the model recipe uses a UTM-projected dataset. If None, defaults to the native CRS of the underlying datasets for the model recipe, which are often UTM-projected. Default is None.
Compute resources to allocate for the workflow execution. Options defined in RuntimeEnum (e.g., SMALL, MEDIUM, LARGE). Default is RuntimeEnum.SMALL.
Additional environment variables to pass to the workflow execution. Default is None.
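The CRS parameter above suggests a UTM-specific EPSG code when the AOI sits inside one UTM zone. As a rough sketch, the standard zone formula gives the code for a representative point of the AOI. `utm_epsg_for_point` is a hypothetical helper, not part of the client, and it ignores the special zone exceptions around Norway and Svalbard.

```python
import math


def utm_epsg_for_point(lon, lat):
    """Return the WGS84 / UTM EPSG code covering a lon/lat point.

    Northern-hemisphere zones map to 326xx, southern ones to 327xx.
    """
    if not -180.0 <= lon <= 180.0 or not -90.0 <= lat <= 90.0:
        raise ValueError("lon/lat out of range")
    # Zones are 6 degrees wide and numbered 1..60 starting at 180W.
    zone = min(int(math.floor((lon + 180.0) / 6.0)) + 1, 60)
    base = 32600 if lat >= 0 else 32700
    return base + zone


if __name__ == "__main__":
    # A point near San Francisco falls in UTM Zone 10N.
    print(utm_epsg_for_point(-122.4, 37.8))  # 32610
```

If the corners of the AOI bounding box return different codes, the AOI spans more than one zone and the single-zone advice no longer applies.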
Response
Output URI to a GeoDataFrame artifact with a location column containing inferred mosaic Zarr store URIs.
build_gti_mosaic()
Build a Zarr mosaic from a GDAL Tile Index (GTI) vector file. See https://gdal.org/en/latest/drivers/raster/gti.html for more information.
Note: The GTI tile index must:
1. Have a geometry column corresponding to the spatial extent of each tile
2. Have a column pointing to the remote URL of each GeoTIFF/COG
3. Contain homogeneous bands across all entries
Additional recommended metadata, as described in the GDAL docs, will improve the performance or quality of the mosaic.
Parameters
Remote URL path to the tile index GeoDataFrame.
Remote URL path to the area of interest GeoDataFrame.
List of band names to include in the mosaic. Must exist in all tiles.
Column name in the GTI that contains the path/URL to each GeoTIFF/COG.
EPSG code for the coordinate reference system to use for the output mosaic. Default is 3857 (Web Mercator).
GTI column used to group entries into mosaic time intervals. Defaults to None. If set to None, all rows are grouped into a single NaT time slice. For example, a year column containing two distinct values would lead to a time dimension of length 2. This lets the user specify any time resolution based on properties of the underlying raster data. Keep the cardinality of the time column low, since each distinct value adds a time slice.
Whether to skip the XY coordinates when building the mosaic. This is useful for very large mosaics. Defaults to False.
Chunk size in pixels to use for the X and Y dimensions when building the mosaic. Default is 512.
The size in bytes that determines the shard size for the Zarr store and the partition size for each task. Default is 3.5GB in bytes.
Pandas query string to filter the GTI before processing. Uses DataFrame.query() syntax. See https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.query.html. Default is None.
If True, enables requester pays for accessing cloud-stored tiles. The requester’s account will be charged for data transfer. Default is False.
Additional environment variables to pass to the workflow execution. Default is None.
Column name in the GTI to sort entries by before building the mosaic. Sorting can affect which tiles take precedence in overlapping areas. Default is None.
Resampling method to use when building the mosaic. Default is ResamplingMethod.NEAREST.
Spatial resolution for the output mosaic in the units of the target CRS. If None, uses the native resolution of the input tiles. Default is None.
Nodata value to assign to the output mosaic. If None, attempts to use the nodata value from the source tiles and raises an error if the tiles lack one. Default is None.
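The time-column grouping described in the parameters above can be pictured with plain Python. This is a sketch of the idea only, not the workflow's implementation: `group_rows_by_time`, the `location` and `year` column names, and the example URLs are all hypothetical, and `None` stands in for the single NaT slice.

```python
from collections import defaultdict


def group_rows_by_time(rows, time_column=None):
    """Group tile-index rows into time slices keyed by a column value.

    With no time column, every row lands in one slice keyed by None.
    """
    groups = defaultdict(list)
    for row in rows:
        key = None if time_column is None else row[time_column]
        groups[key].append(row)
    return dict(groups)


# Hypothetical tile-index rows; real rows would also carry a geometry.
rows = [
    {"location": "s3://example-bucket/a.tif", "year": 2020},
    {"location": "s3://example-bucket/b.tif", "year": 2020},
    {"location": "s3://example-bucket/c.tif", "year": 2021},
]

if __name__ == "__main__":
    print(len(group_rows_by_time(rows, "year")))  # two distinct years
    print(len(group_rows_by_time(rows)))  # one slice without a time column
```

Counting the distinct values of a candidate time column this way is a quick check that its cardinality is reasonable before building the mosaic.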
Response
Output URI to a GeoDataFrame artifact with a location column pointing to built mosaic Zarr store URIs.
build_mosaic()
Execute the mosaic building workflow.
Parameters
List of datasets to include in the mosaic. Available datasets are defined in DatasetEnum.
Area of interest as any file format supported by GeoPandas read_file or read_parquet (e.g., GeoJSON, GeoParquet, Shapefile) via remote URL.
Start date for the temporal range of the mosaic.
End date for the temporal range of the mosaic.
EPSG code for the coordinate reference system to use for the output mosaic. Default is 3857 (Web Mercator).
Chunk size in pixels to use for the X and Y dimensions when building the mosaic. Larger values use more memory but may be faster. Default is 512.
The size in bytes that determines the shard size for the Zarr store and the partition size for each task. Default is 3.5GB in bytes.
Additional environment variables to pass to the workflow execution. Default is None.
Spatial resolution for the output mosaic in the units of the target CRS. If None, uses the native resolution of the input datasets. Default is None.
Whether to skip the XY coordinates when building the mosaic. This is useful for very large mosaics. Defaults to False.
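To see why larger chunk sizes use more memory, the uncompressed size of one chunk grows with the square of the chunk size. The sketch below assumes 4 bytes per pixel (for example float32) and a single band and time step per chunk. Both are assumptions, since the actual data type and chunk layout depend on the datasets. `chunk_nbytes` is a hypothetical helper.

```python
def chunk_nbytes(chunk_size=512, bytes_per_pixel=4):
    """Return the uncompressed size in bytes of one square XY chunk."""
    if chunk_size <= 0 or bytes_per_pixel <= 0:
        raise ValueError("sizes must be positive")
    return chunk_size * chunk_size * bytes_per_pixel


if __name__ == "__main__":
    print(chunk_nbytes(512))   # 1048576 bytes, i.e. 1 MiB
    print(chunk_nbytes(1024))  # 4194304 bytes, i.e. 4 MiB
```

Doubling the chunk size quadruples the bytes held per chunk, which is the trade-off to weigh against any speed gain.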
Response
Output URI to a GeoDataFrame artifact with a location column pointing to built mosaic Zarr store URIs.
build_zarr_multiscales()
Build an optimized multiscale Zarr store from an unoptimized Zarr store. This workflow creates a new Zarr store with multiple resolution levels (overviews) and optional histogram statistics, suitable for efficient visualization and analysis.
Parameters
URI of the input Zarr store to process (e.g., an S3 path like
Response
Output URI to the generated multiscale Zarr store.
predict_mosaic()
Run inference on a single mosaic Zarr store using a specified model.
Parameters
URI of the input mosaic Zarr store to run inference on.
Path (local or remote URL) to the model file. Should be compatible with the specified inference actor.
Size of the patches to be used during inference.
Size in pixels to clip from patch edges before merging predictions. Helps reduce edge artifacts in overlapping regions. Must be less than patch_size.
Device to run the model on. Options: “cuda” for GPU, “cpu” for CPU.
List of feature (band) names from the mosaic to use as model inputs. Must exist in the input Zarr store.
List of output label names that the model produces. These will be the band names in the output store.
The inference actor to use for running the model. Available options are defined in InferenceActorEnum.
Maximum number of patches to process in a single batch during inference.
Method for merging predictions from overlapping patches. Options defined in MergeModeEnum.
Subset of labels to extract from model output. If None, all labels produced by the model are saved. Default is None.
Multiplier on Zarr chunks. Larger values process bigger blocks (groups of chunks) at once. Default is 4.
Optional chunk size for output mosaics. If None, the workflow default is used.
Compute resources to allocate for the workflow execution. Options defined in RuntimeEnum (e.g., SMALL, MEDIUM, LARGE). Default is RuntimeEnum.SMALL.
Additional environment variables to pass to the workflow execution. Default is None.
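Two constraints stated in the parameters above are easy to check before submitting a workflow: the clip size must be less than `patch_size`, and the device must be "cuda" or "cpu". The sketch below is a hypothetical pre-flight helper, not part of the client. The names `clip_size` and `device` are assumed for illustration, since this page does not show the actual parameter names.

```python
def check_patch_settings(patch_size, clip_size, device):
    """Raise ValueError if the settings break the documented constraints."""
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    # The clip size must be less than patch_size.
    if not 0 <= clip_size < patch_size:
        raise ValueError("clip_size must be in [0, patch_size)")
    # The documented device options are "cuda" and "cpu".
    if device not in ("cuda", "cpu"):
        raise ValueError("device must be 'cuda' or 'cpu'")


if __name__ == "__main__":
    check_patch_settings(256, 32, "cuda")  # passes silently
```

Failing fast on the client side avoids waiting for a remote workflow to reject the same settings.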
Response
Output URI to a GeoDataFrame artifact with a location column updated to point to inference-result Zarr store URIs.
vectorize_mosaic()
Convert raster predictions to vector geometries through polygonization. Supports thresholding and then vectorizing float values. Typically these are confidence scores from semantic segmentation workflows.
Parameters
URI of the input mosaic Zarr store to vectorize.
List of feature (band) names from the mosaic to vectorize. Typically these represent the model predictions from predict_mosaic. Each feature is vectorized separately.
Threshold value for binarizing continuous predictions before vectorization. Pixels with values greater than or equal to this threshold are considered foreground (1), while values below are background (0).
Vectorization method to use. Available methods are defined in VectorizeMethodEnum.
The configuration for the vectorize_method.
Multiplier on Zarr chunks. Larger values process bigger blocks (groups of chunks) at once. Default is 4.
Target coordinate reference system for the output geometries in EPSG format (e.g., “EPSG:4326”). If None, geometries remain in the CRS of the input mosaic. Default is “EPSG:4326” (WGS84 lat/lon).
Compute resources to allocate for the workflow execution. Options defined in RuntimeEnum (e.g., SMALL, MEDIUM, LARGE). Default is RuntimeEnum.SMALL.
Additional environment variables to pass to the workflow execution. Default is None.
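The threshold rule described in the parameters above (values greater than or equal to the threshold become foreground) can be sketched on a small grid of confidence scores. `binarize` is a hypothetical illustration using nested lists. The workflow itself operates on Zarr chunks, and its implementation is not shown on this page.

```python
def binarize(scores, threshold):
    """Map each score to 1 (foreground) if >= threshold, else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in scores]


if __name__ == "__main__":
    scores = [
        [0.2, 0.5],
        [0.9, 0.49],
    ]
    # A score exactly equal to the threshold counts as foreground.
    print(binarize(scores, 0.5))  # [[0, 1], [1, 0]]
```

The resulting binary mask is what a polygonization step would then trace into geometries.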
Response
Output URI to the merged parquet directory containing vectorization results, or None when no vector features were produced.
