
RasterFlow workflows can be submitted as Job Runs using the Wherobots Runs REST API. This lets you run RasterFlow workflows as automated, standalone jobs outside of a notebook, which is useful for production pipelines, scheduled processing, and CI/CD integration.

Benefits

Running RasterFlow workflows as jobs provides several advantages over interactive notebook execution:
  • Schedule and trigger RasterFlow workflows programmatically.
  • Run workflows in a managed environment without interactive notebooks.
  • Incorporate RasterFlow into existing data pipelines and CI/CD systems.
  • Track job status and results through the Workload History dashboard.

Before you start

Before submitting RasterFlow jobs, ensure you have the following:
  • An account within a Professional or Enterprise Edition Organization. For more information, see Create a Wherobots Account.
  • A Wherobots API key. For more information, see API keys.
  • Access to Wherobots Managed Storage for uploading scripts and storing results.
  • RasterFlow enabled for your organization.
RasterFlow is currently in Private Preview. Wherobots is rolling out RasterFlow to a select group of Organizations. If you are interested in gaining early access to these new capabilities and helping shape the future of the product, register your interest here.

Overview

Submitting a RasterFlow workflow as a job involves three steps:
  1. Write a job script: Create a standalone Python script containing your RasterFlow workflow.
  2. Upload the script: Upload the script to Wherobots Managed Storage so it can be referenced by the Job Run.
  3. Submit the job: Send the job to the Runs REST API.
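Before walking through each step, it helps to see the shape of the request body that step 3 sends. The following sketch builds the same payload that the submission script later in this guide constructs; the field names mirror that script, and the S3 path passed in the example is a placeholder, not a real location.

```python
# Minimal sketch of the Runs REST API request body used to submit a
# Python job script. Field names mirror the submission script in this
# guide; the S3 path below is a placeholder.
def build_run_payload(script_s3_path: str,
                      runtime: str = "micro",
                      name: str = "rasterflow-ftw-job",
                      timeout_seconds: int = 3600) -> dict:
    return {
        "runtime": runtime,                    # Wherobots runtime size
        "name": name,                          # display name in Workload History
        "runPython": {"uri": script_s3_path},  # the uploaded job script
        "timeoutSeconds": timeout_seconds,
        "environment": {"dependencies": []},   # extra dependencies, if any
    }

payload = build_run_payload("s3://your-managed-storage/rasterflow_ftw_job.py")
print(payload["runPython"]["uri"])
```

The payload is small and predictable, which makes it easy to generate from a scheduler or CI/CD system.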

Write the job script

Create a Python script that contains your RasterFlow workflow. This script will be executed by the Wherobots Job Run environment. It should be self-contained—all imports, configuration, and processing logic must be included in the script. The following example runs the Fields of the World (FTW) model on Haskell County, Kansas using RasterFlow.
This is a subset of the code included in the Fields of the World solution notebook. See the FTW tutorial for more details on the model and its outputs.
rasterflow_ftw_job.py
#!/usr/bin/env python3
"""
RasterFlow FTW Job Script - Extract field boundaries using Fields of the World model

This script runs the Fields of the World (FTW) model on Haskell County, Kansas
and vectorizes the results using Wherobots RasterFlow.
"""

import os
import wkls
import geopandas as gpd
from datetime import datetime
from rasterflow_remote import RasterflowClient
from rasterflow_remote.data_models import ModelRecipes, VectorizeMethodEnum

def main():
    print("Starting RasterFlow FTW Job...")

    # Initialize RasterFlow client
    rf_client = RasterflowClient(cache=False)

    # Generate AOI for Haskell County, Kansas
    print("Generating AOI for Haskell County, Kansas...")
    gdf = gpd.read_file(wkls['us']['ks']['Haskell County'].geojson())
    aoi_path = os.getenv("USER_S3_PATH") + "haskell_job.parquet"
    gdf.to_parquet(aoi_path)
    print(f"AOI saved to: {aoi_path}")

    # Run FTW model
    print("Running Fields of the World model...")
    model_outputs = rf_client.build_and_predict_mosaic_recipe(
        aoi=aoi_path,
        start=datetime(2023, 1, 1),
        end=datetime(2024, 1, 1),
        crs_epsg=3857,
        model_recipe=ModelRecipes.FTW,
    )
    print(f"Model outputs saved to: {model_outputs}")

    print("RasterFlow FTW Job completed successfully!")
    print("Results:")
    print(f"  Model outputs: {model_outputs}")

    return {
         "model_outputs": model_outputs,
    }

if __name__ == "__main__":
    main()

Upload the script

Upload the job script to your Wherobots Managed Storage so the Job Run can reference it. If you wrote and saved the job script (in this example, rasterflow_ftw_job.py) in the Wherobots Cloud notebook environment, upload it to Managed Storage using s3fs:
  1. Start a new notebook in Wherobots Cloud.
  2. Paste the following code snippet into a cell of a notebook saved in the same directory as rasterflow_ftw_job.py.
    import os
    import s3fs
    
    fs = s3fs.S3FileSystem(profile="default")
    
    # Define the destination path on S3
    s3_script_path = os.getenv("USER_S3_PATH") + "rasterflow_ftw_job.py"
    fs.put("rasterflow_ftw_job.py", s3_script_path)
    
  3. Run the cell to upload rasterflow_ftw_job.py. Go to Storage in the Wherobots Cloud interface to confirm that the file is uploaded to your Managed Storage.

Submit the job

Once rasterflow_ftw_job.py is uploaded to Managed Storage, submit it as a Job Run using the Runs REST API.
  1. Create the following environment variables:
    Never hardcode your API key in scripts or source code. Always load it from an environment variable or a secrets manager. The example below reads the key from the WHEROBOTS_API_KEY environment variable.
      • WHEROBOTS_API_KEY: Your Wherobots API key. See API keys for how to create one.
      • USER_S3_PATH: Your Wherobots Managed Storage path. Available as a built-in environment variable in Wherobots notebooks.
  2. Run the following script to submit a job that references your uploaded rasterflow_ftw_job.py.
    The submission script uses the requests library. If it is not already installed in your environment, install it with pip install requests.
    submit_rasterflow_job.py
    #!/usr/bin/env python3
    """
    Submit a RasterFlow workflow as a Job Run using the Wherobots Runs REST API.
    
    This script submits a job that runs the Fields of the World (FTW) model
    on Haskell County, Kansas.
    
    This is a subset of the code included in the Solution Notebook for the Fields of the World model.
    See https://cloud.wherobots.com/model-hub/fields-of-the-world for more details.
    """
    
    import os
    import requests
    from datetime import datetime
    
    
    def submit_rasterflow_job(
        api_key: str,  # Loaded from WHEROBOTS_API_KEY env var in main()
        script_s3_path: str,
        region: str = "aws-us-west-2",
        runtime: str = "micro",
        job_name: str = "rasterflow-ftw-job",
        timeout_seconds: int = 3600
    ) -> dict:
        """
        Submit a RasterFlow workflow using the Wherobots Runs REST API.
    
        Args:
            api_key: Wherobots API key
            script_s3_path: S3 path to the Python script file
            region: Compute region (default: aws-us-west-2)
            runtime: Wherobots runtime size (default: micro). RasterFlow manages its own compute, so use micro unless also running WherobotsDB workloads.
            job_name: Name for the job run
            timeout_seconds: Job timeout in seconds (default: 3600)
    
        Returns:
            dict: API response containing job run details
        """
    
        # API endpoint
        url = f"https://api.cloud.wherobots.com/runs?region={region}"
    
        # Headers
        headers = {
            "accept": "application/json",
            "X-API-Key": api_key,
            "Content-Type": "application/json"
        }
    
        # Job payload
        payload = {
            "runtime": runtime,
            "name": job_name,
            "runPython": {
                "uri": script_s3_path
            },
            "timeoutSeconds": timeout_seconds,
            "environment": {
                "dependencies": [
                ]
            }
        }
    
        # Make the request
        print(f"Submitting job to: {url}")
        print(f"Job name: {job_name}")
        print(f"Runtime: {runtime}")
        print(f"Script path: {script_s3_path}")
    
        response = requests.post(url, headers=headers, json=payload)
    
        # Handle response
        if response.status_code in (200, 201):
            result = response.json()
            print("Job submitted successfully!")
            print(f"Job ID: {result.get('id')}")
            print(f"Status: {result.get('status')}")
            return result
        else:
            print("Job submission failed!")
            print(f"Status Code: {response.status_code}")
            print(f"Response: {response.text}")
            response.raise_for_status()
    
    
    def main():
        """Main function to run the job submission script."""
    
        # Configuration - Update these values
        API_KEY = os.getenv("WHEROBOTS_API_KEY")
        USER_S3_PATH = os.getenv("USER_S3_PATH")
    
        # Validate required environment variables
        if API_KEY is None or API_KEY == "":
            raise ValueError(
                "WHEROBOTS_API_KEY environment variable is required. "
                "Get your API key from https://cloud.wherobots.com/settings/api-keys"
            )
    
        if USER_S3_PATH is None or USER_S3_PATH == "":
            raise ValueError(
                "USER_S3_PATH environment variable is required. "
                "This should be your Wherobots managed storage path."
            )
    
        SCRIPT_S3_PATH = USER_S3_PATH + "rasterflow_ftw_job.py"
    
        print("=== RasterFlow FTW Job Submission ===")
        print("This script will submit a job to run the Fields of the World model")
        print("on Haskell County, Kansas using Wherobots RasterFlow.\n")
    
        print(f"Make sure you have uploaded rasterflow_ftw_job.py to: {SCRIPT_S3_PATH}")
        print("The job script should already be available in this directory.\n")
    
        # Submit the job
        result = submit_rasterflow_job(
            api_key=API_KEY,
            script_s3_path=SCRIPT_S3_PATH,
            job_name=f"rasterflow-ftw-{datetime.now().strftime('%Y%m%d-%H%M%S')}"
        )
    
        print(f"\nJob submission completed!")
        print(f"Monitor your job at: https://cloud.wherobots.com/job-runs")
    
    
    if __name__ == "__main__":
        main()
    

Monitor the job

After submitting a job, you can monitor its progress in the following ways:
  • Workload History: Use the Workload History dashboard for a broader view of all workloads across your organization.
  • REST API: Query the job status programmatically using the Runs REST API.
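As a sketch of the REST API option, the loop below polls a run until it reaches a final state. The GET /runs/{id} endpoint path, the status field name, and the set of terminal status values are assumptions based on the submission response fields used earlier; check the Runs REST API reference for the exact contract.

```python
import time
import requests

API_BASE = "https://api.cloud.wherobots.com"
# Assumed terminal statuses; consult the Runs REST API reference for the real set.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED"}


def is_terminal(status: str) -> bool:
    """Return True once a run status represents a final state."""
    return status.upper() in TERMINAL_STATES


def poll_run(run_id: str, api_key: str, interval_seconds: int = 30) -> dict:
    """Poll a Job Run by id until it finishes, printing each observed status."""
    headers = {"accept": "application/json", "X-API-Key": api_key}
    while True:
        response = requests.get(f"{API_BASE}/runs/{run_id}", headers=headers)
        response.raise_for_status()
        run = response.json()
        status = run.get("status", "")
        print(f"Run {run_id}: {status}")
        if is_terminal(status):
            return run
        time.sleep(interval_seconds)
```

Pass the id returned by the submission script, for example poll_run(result["id"], os.getenv("WHEROBOTS_API_KEY")), to block until the job completes or fails.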

Next steps

The following resources will help you further explore RasterFlow and its related capabilities:
  • See all available built-in models in RasterFlow Models.
  • Review all available parameters for the Runs REST API.
  • Set up automated pipelines with the Wherobots Apache Airflow Provider using the WherobotsRunOperator.
  • Learn about Runtimes to choose the right compute resources for your workflows.

API reference

For detailed RasterFlow API documentation, see:
  • RasterflowClient methods: see Client API Reference.
  • Enums and configuration objects: see Data Models Reference.
  • Working with model registries: see Model Registry Reference.
  • Error handling: see Exceptions Reference.