RasterflowClient

RasterflowClient(
    rasterflow_version: str = 'v1.40.2',
    rasterflow_domain: str | None = None,
    mosaics_version: str = 'v0.18.0',
    mosaics_domain: str | None = None,
    dry_run: bool = False,
    cache: bool = True
)
High-level client for executing RasterFlow workflows using FlyteRemote. First, obtain an API key for RasterFlow. Then create a YAML config file and make it available in one of two ways:
  1. Point to it with an environment variable: export RASTERFLOW_CONFIG_FILE=/path/to/rasterflow.yaml
  2. Or create it at ~/.config/rasterflow.yaml
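
The two config locations above can be sketched as a small lookup helper. This is an illustration only, not the client's actual lookup code: the function name resolve_config_path is hypothetical, and the assumption that the environment variable takes precedence over the file in ~/.config is not stated in the source.

```python
import os
from pathlib import Path

# Assumption: the environment variable, when set, wins over the default file.
# The source lists both options but does not state their precedence.
DEFAULT_CONFIG = Path.home() / ".config" / "rasterflow.yaml"


def resolve_config_path(env=None):
    """Return the path where a RasterFlow YAML config would be looked up."""
    env = os.environ if env is None else env
    override = env.get("RASTERFLOW_CONFIG_FILE")
    return Path(override).expanduser() if override else DEFAULT_CONFIG


print(resolve_config_path())
```

Passing a plain dict as env makes the helper easy to check without touching the real environment.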

build_and_predict_mosaic_recipe()

Build a mosaic and run model inference using a pre-configured recipe associated with a specific model. This convenience method combines mosaic building and inference into a single workflow using a model recipe that defines the input datasets, model, and inference parameters.
def build_and_predict_mosaic_recipe(self, aoi: str)

Parameters

aoi
str
required
Remote URL path to the area of interest GeoDataFrame.
start
datetime
required
Start date for the temporal range of the mosaic.
end
datetime
required
End date for the temporal range of the mosaic.
model_recipe
ModelRecipes
required
Pre-configured recipe that defines the datasets, model, and inference parameters to use. Available recipes are defined in the ModelRecipes enum.
crs_epsg
int | None
EPSG code for the coordinate reference system of the output. Consider using a UTM-specific EPSG code (e.g., EPSG:32610 for UTM Zone 10N) to avoid reprojection/resampling if your AOI falls within a single UTM zone and the model recipe uses a UTM-projected dataset. If None, defaults to the native CRS of the underlying datasets for the model recipe, which are often UTM-projected. Default is None.
runtime
RuntimeEnum
Compute resources to allocate for the workflow execution. Options defined in RuntimeEnum (e.g., SMALL, MEDIUM, LARGE). Default is RuntimeEnum.SMALL.
envs
dict[str, str] | None
Additional environment variables to pass to the workflow execution. Default is None.

Response

return
str
URI of the output Zarr store containing the inference results.
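
The crs_epsg note above suggests a UTM-specific EPSG code when the AOI sits inside one UTM zone. A minimal sketch of deriving that code from a representative longitude/latitude follows. The helper utm_epsg is hypothetical and not part of RasterFlow; it uses the standard 6-degree zone layout and ignores the special zones around Norway and Svalbard.

```python
import math


def utm_epsg(lon, lat):
    """Return the WGS 84 / UTM EPSG code for a longitude/latitude in degrees."""
    if not -180.0 <= lon <= 180.0 or not -90.0 <= lat <= 90.0:
        raise ValueError("lon/lat out of range")
    # Standard 6-degree zones, numbered 1..60 eastward from 180 W.
    zone = min(int(math.floor((lon + 180.0) / 6.0)) + 1, 60)
    # EPSG 326xx covers the northern hemisphere, 327xx the southern.
    return (32600 if lat >= 0 else 32700) + zone


print(utm_epsg(-122.4, 37.8))  # 32610, UTM Zone 10N
```

The returned integer could then be passed as crs_epsg, provided the whole AOI really falls within that zone.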

build_gti_mosaic()

Build a Zarr mosaic from a GDAL Tile Index (GTI) vector file. See https://gdal.org/en/latest/drivers/raster/gti.html for more information. Note: The GTI must:
  1. Have a geometry column corresponding to the spatial extent of each tile
  2. Have a column pointing to the remote URL of each GeoTIFF/COG
  3. Contain homogeneous bands across all entries
Including the additional recommended metadata described in the GDAL docs can improve the performance or quality of the mosaic.
def build_gti_mosaic(self, gti: str, aoi: str, bands: list[str])

Parameters

gti
str
required
Remote URL path to the tile index GeoDataFrame.
aoi
str
required
Remote URL path to the area of interest GeoDataFrame.
bands
list[str]
required
List of band names to include in the mosaic. Must exist in all tiles.
location_field
str
required
Column name in the GTI that contains the path/URL to each GeoTIFF/COG.
crs_epsg
int
EPSG code for the coordinate reference system to use for the output mosaic. Default is 3857 (Web Mercator).
time
datetime | None
Optional timestamp to assign to the mosaic for temporal context. If None, no time dimension is added. Default is None.
skip_xy_coords
bool
Whether to skip the XY coordinates when building the mosaic. This is useful for very large mosaics. Default is False.
xy_chunksize
int
Chunk size in pixels to use for the X and Y dimensions when building the mosaic. Default is 512.
max_shard_size
float
The size in bytes that determines the shard size for the Zarr store and the partition size for each task. Default is 3.5 GB, expressed in bytes.
query
str | None
Pandas query string to filter the GTI before processing. Uses DataFrame.query() syntax. See https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.query.html. Default is None.
requester_pays
bool
If True, enables requester pays for accessing cloud-stored tiles. The requester’s account will be charged for data transfer. Default is False.
envs
dict[str, str] | None
Additional environment variables to pass to the workflow execution. Default is None.
sort_field
str | None
Column name in the GTI to sort entries by before building the mosaic. Sorting can affect which tiles take precedence in overlapping areas. Default is None.
buffer
float
Buffer distance to apply around the AOI in the units of the AOI’s CRS. Default is 0.0.
resampling
ResamplingMethod
Resampling method to use when building the mosaic. Default is ResamplingMethod.NEAREST.
resolution
float | None
Spatial resolution for the output mosaic in the units of the target CRS. If None, uses the native resolution of the input tiles. Default is None.
nodata
float | None
Nodata value to assign to the output mosaic. If None, attempts to use the nodata value from the source tiles and raises an error if the tiles lack one. Default is None.

Response

return
str
URI of the output Zarr store where the mosaic is saved.
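
As a rough sketch of how the parameters above fit together, the following collects keyword arguments for build_gti_mosaic. The helper gti_mosaic_kwargs, the URLs, the band names, and the GTI column names are all hypothetical placeholders. Only the parameter names come from this page, and the final call is left as a comment because it needs a configured client.

```python
def gti_mosaic_kwargs(gti, aoi, bands, location_field, **optional):
    """Collect keyword arguments for build_gti_mosaic, dropping unset options."""
    kwargs = {
        "gti": gti,
        "aoi": aoi,
        "bands": list(bands),
        "location_field": location_field,
    }
    # Leave out options set to None so the documented defaults apply.
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs


kwargs = gti_mosaic_kwargs(
    gti="s3://example-bucket/index.parquet",  # placeholder URL
    aoi="s3://example-bucket/aoi.geojson",    # placeholder URL
    bands=["red", "green", "blue"],           # hypothetical band names
    location_field="location",                # hypothetical GTI column
    crs_epsg=32610,
    query="cloud_cover < 20",                 # hypothetical GTI column
    sort_field=None,                          # omitted, so the default applies
)
# With a configured client: uri = client.build_gti_mosaic(**kwargs)
```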

build_mosaic()

Execute the mosaic building workflow.
def build_mosaic()

Parameters

datasets
list[DatasetEnum]
required
List of datasets to include in the mosaic. Available datasets are defined in DatasetEnum.
aoi
str
required
Area of interest as any file format supported by GeoPandas read_file or read_parquet (e.g., GeoJSON, GeoParquet, Shapefile) via remote URL.
start
datetime
required
Start date for the temporal range of the mosaic.
end
datetime
required
End date for the temporal range of the mosaic.
crs_epsg
int
EPSG code for the coordinate reference system to use for the output mosaic. Default is 3857 (Web Mercator).
xy_chunksize
int
Chunk size in pixels to use for the X and Y dimensions when building the mosaic. Larger values use more memory but may be faster. Default is 512.
max_shard_size
float
The size in bytes that determines the shard size for the Zarr store and the partition size for each task. Default is 3.5 GB, expressed in bytes.
envs
dict[str, str] | None
Additional environment variables to pass to the workflow execution. Default is None.
buffer
float
Buffer distance to apply around the AOI in the units of the AOI’s CRS. Useful for ensuring complete coverage at boundaries. Default is 0.0.
resolution
float | None
Spatial resolution for the output mosaic in the units of the target CRS. If None, uses the native resolution of the input datasets. Default is None.

Response

return
str
URI of the output Zarr store where the mosaic is saved.
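
Because start and end are datetime values, a small helper for building a one-month range can be handy. The sketch below is illustrative: month_range and the AOI URL are hypothetical, and the source does not say whether end is treated as inclusive or exclusive, so the first day of the following month is an assumption.

```python
from datetime import datetime


def month_range(year, month):
    """Return (start, end) datetimes spanning one calendar month."""
    start = datetime(year, month, 1)
    # Roll over to January of the next year after December.
    end = datetime(year + month // 12, month % 12 + 1, 1)
    return start, end


start, end = month_range(2024, 12)
mosaic_kwargs = {
    "aoi": "s3://example-bucket/aoi.geojson",  # placeholder URL
    "start": start,
    "end": end,
    "crs_epsg": 3857,  # the documented default, shown explicitly
}
# With a configured client and a list of DatasetEnum members:
#   uri = client.build_mosaic(datasets=datasets, **mosaic_kwargs)
```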

predict_mosaic()

Run inference on a single mosaic Zarr store using a specified model.
def predict_mosaic(
    self,
    store: str,
    model_path: str,
    patch_size: int
)

Parameters

store
str
required
URI of the input mosaic Zarr store to run inference on.
model_path
str
required
Path (local or remote URL) to the model file. Should be compatible with the specified inference actor.
patch_size
int
required
Size of the patches to be used during inference.
clip_size
int
required
Size in pixels to clip from patch edges before merging predictions. Helps reduce edge artifacts in overlapping regions. Must be less than patch_size.
device
str
required
Device to run the model on. Options: “cuda” for GPU, “cpu” for CPU.
features
list[str]
required
List of feature (band) names from the mosaic to use as model inputs. Must exist in the input Zarr store.
labels
list[str]
required
List of output label names that the model produces. These will be the band names in the output store.
actor
InferenceActorEnum
required
The inference actor to use for running the model. Available options are defined in InferenceActorEnum.
max_batch_size
int
required
Maximum number of patches to process in a single batch during inference.
merge_mode
MergeModeEnum
required
Method for merging predictions from overlapping patches. Options defined in MergeModeEnum.
selected_labels
list[str] | None
Subset of labels to extract from model output. If None, all labels produced by the model are saved. Default is None.
xy_block_multiplier
int
Multiplier on Zarr chunks. Larger values process bigger blocks (groups of chunks) at once. Default is 1.
runtime
RuntimeEnum
Compute resources to allocate for the workflow execution. Options defined in RuntimeEnum (e.g., SMALL, MEDIUM, LARGE). Default is RuntimeEnum.SMALL.
envs
dict[str, str] | None
Additional environment variables to pass to the workflow execution. Default is None.

Response

return
str
URI of the output Zarr store containing the inference results.
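
Two constraints stated above (clip_size must be less than patch_size, and device is either "cuda" or "cpu") are cheap to check before submitting a workflow. The following is a minimal sketch with a hypothetical helper name. The extra checks that patch_size is positive and clip_size is non-negative are assumptions, not documented behavior.

```python
def check_inference_params(patch_size, clip_size, device):
    """Raise ValueError if the arguments break the documented constraints."""
    # Assumption: a patch must span at least one pixel.
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    # Documented: clip_size must be less than patch_size.
    # Assumption: negative clip sizes are not meaningful.
    if not 0 <= clip_size < patch_size:
        raise ValueError("clip_size must be in [0, patch_size)")
    # Documented options: "cuda" for GPU, "cpu" for CPU.
    if device not in ("cuda", "cpu"):
        raise ValueError('device must be "cuda" or "cpu"')


check_inference_params(patch_size=256, clip_size=32, device="cuda")  # passes
```

Failing fast on the caller's side avoids launching a remote execution that would be rejected later.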

vectorize_mosaic()

Convert raster predictions to vector geometries through polygonization. Supports thresholding and then vectorizing float values. Typically these are confidence scores from semantic segmentation workflows.
def vectorize_mosaic(self, store: str, features: list[str])

Parameters

store
str
required
URI of the input mosaic Zarr store to vectorize.
features
list[str]
required
List of feature (band) names from the mosaic to vectorize. Typically these represent the model predictions from predict_mosaic. Each feature is vectorized separately.
threshold
float
required
Threshold value for binarizing continuous predictions before vectorization. Pixels with values greater than or equal to this threshold are considered foreground (1), while values below are background (0).
vectorize_method
VectorizeMethodEnum
required
Vectorization method to use. Available methods are defined in VectorizeMethodEnum.
vectorize_config
VECTOR_CONFIG_TYPES
required
The configuration for the vectorize_method.
xy_block_multiplier
int
Multiplier on Zarr chunks. Larger values process bigger blocks (groups of chunks) at once. Default is 1.
dst_crs
str | None
Target coordinate reference system for the output geometries in EPSG format (e.g., “EPSG:4326”). If None, geometries remain in the CRS of the input mosaic. Default is “EPSG:4326” (WGS84 lat/lon).
runtime
RuntimeEnum
Compute resources to allocate for the workflow execution. Options defined in RuntimeEnum (e.g., SMALL, MEDIUM, LARGE). Default is RuntimeEnum.SMALL.
envs
dict[str, str] | None
Additional environment variables to pass to the workflow execution. Default is None.

Response

return
str
URI of the output store containing the vectorized geometries.
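
The threshold rule described above (values greater than or equal to the threshold become foreground) can be illustrated on a tiny grid in pure Python. This sketch only shows the binarization step that precedes polygonization. It is not RasterFlow's implementation, and binarize is a hypothetical name.

```python
def binarize(scores, threshold):
    """Map a 2D grid of float scores to 1 (foreground) or 0 (background)."""
    # Values greater than or equal to the threshold count as foreground.
    return [[1 if value >= threshold else 0 for value in row] for row in scores]


scores = [
    [0.10, 0.50, 0.92],
    [0.49, 0.75, 0.05],
]
mask = binarize(scores, threshold=0.5)
print(mask)  # [[0, 1, 1], [0, 1, 0]]
```

Note that a score exactly equal to the threshold (0.50 here) lands in the foreground.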