Bring Your Own Model
Overview¶
WherobotsAI Raster Inference supports running your own machine learning models on raster imagery using the Machine Learning Model Extension Specification (MLM). MLM is a standard for discovering, sharing, and running machine learning models on geospatial data.
Capabilities¶
WherobotsAI Raster Inference currently supports:
- The following computer vision tasks:
- Single-label scene classification
- Object detection
- Semantic segmentation
- Workloads with a single input tensor and a single output tensor
- NVIDIA GPU acceleration
- TorchScript models
Job Runs¶
You can complete raster inference with WherobotsAI within a Job Run or in a Wherobots Notebook.
This tutorial discusses how to complete raster inference within a Wherobots Notebook. To run this code as a Job Run, combine the code samples from the following sections of this tutorial into a single Python file and execute it as a Job Run.
For more information on creating Job Runs in Wherobots, see WherobotsRunOperator.
You can also export a notebook into a Python file. For more information, see Export a Python Notebook in the Wherobots Jupyter Notebook Management Documentation.
What to expect from this tutorial¶
This tutorial guides you through preparing your own machine learning model for use with raster inference on the Wherobots Cloud platform.
Learn how to:
- Save your model using TorchScript.
- Choose and utilize an S3 bucket for model storage on Wherobots Cloud.
- Generate and upload your MLM JSON file.
- Execute raster inference on Wherobots Cloud.
How to use this tutorial¶
This tutorial can be used in two ways:
- Interactive: Log in to Wherobots Cloud to follow along in a Wherobots Notebook by opening
examples/python/wherobots-ai/gpu/bring_your_own_model_tutorial.ipynb
- Adaptation: Apply the steps to your own model in a new notebook.
Before you start¶
Before attempting to use your own machine learning model in WherobotsAI Raster Inference, ensure that you have the following:
- A Professional or Enterprise Edition Wherobots Organization.
  - WherobotsAI Raster Inference requires a GPU-Optimized runtime. To access GPU-Optimized runtimes, sign up for a Paid Organization Edition and file a Compute Request. For additional instructions, see Access a GPU-Optimized runtime.
- A PyTorch model file.
- An Integrated Amazon S3 Bucket or a Wherobots Managed Storage resource for storing your MLM JSON file.
Note
If you add the S3 storage integration after starting the notebook, you must restart the notebook in order to access the newly added storage integration.
Access a GPU-Optimized runtime¶
This notebook requires a GPU-Optimized runtime. For more information on GPU-Optimized runtimes, see Runtime types.
To access this runtime category, do the following:
- Sign up for a paid Wherobots Organization Edition (Professional or Enterprise).
- Submit a Compute Request for a GPU-Optimized runtime.
Save and upload your model¶
Save your model¶
Save your model checkpoint using TorchScript. For more information, see Saving and Loading Models in the PyTorch documentation.
PyTorch model support only
WherobotsAI Raster Inference currently only supports PyTorch models.
The following TorchScript model checkpoint saving methods are supported:
Artifact Type | Description | File Extension
---|---|---
torch.jit.script | A model artifact obtained by TorchScript. | .pt
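As a sketch of that saving step, the model architecture and file name below are hypothetical stand-ins for your own model:

```python
import torch
import torch.nn as nn

# Hypothetical toy model, a stand-in for your own architecture.
class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        x = self.pool(self.conv(x)).flatten(1)
        return self.fc(x)

model = TinyClassifier().eval()

# Compile to TorchScript and save the .pt artifact you will upload.
scripted = torch.jit.script(model)
scripted.save("tiny_classifier.pt")

# Sanity check: reload the artifact and run a dummy 3-band tile through it.
reloaded = torch.jit.load("tiny_classifier.pt")
out = reloaded(torch.randn(1, 3, 64, 64))
```

Reloading with `torch.jit.load` before uploading is a cheap way to confirm the artifact is self-contained and does not depend on your local Python class definitions.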
Checkpoint
At this point, you have completed the following step:
- Save your model using TorchScript.
The remaining steps are:
- Choose and utilize an S3 bucket for model storage on Wherobots Cloud.
- Generate and upload your MLM JSON file.
- Execute raster inference on Wherobots Cloud.
Upload your model¶
Store your model in an S3 bucket that's accessible to Wherobots Cloud.
You can choose to store your model in either of the following ways:
- Directly in Wherobots Managed Storage. For more information, see Wherobots storage and notebook guidance.
- By integrating your existing Amazon S3 storage with Wherobots. For more information on integrating a public or private S3 bucket with Wherobots Cloud, see S3 Storage Integration.
Wherobots Managed Storage example¶
To upload your model to Wherobots Managed Storage, do the following:
- Go to Storage.
- Navigate to your desired folder location.
- Click Upload.
- Upload your model .pt file.
For in-depth instructions, review the Managed Storage Documentation.
Uploaded model example¶
This tutorial uses a Wherobots-hosted model, but you can follow the same steps to store your own models in Wherobots Managed Storage.
The following image shows the solar_satlas_setinel2_model_pt2.pt model located in the data/customer-XXXX/bring-your-own-model directory.
Tip
You can click the clipboard icon to copy the model's URI. You'll need your model's URI to create an MLM JSON in the next step (Fill out Asset Form).
Checkpoint
At this point, you have completed the following steps:
- Save your model using TorchScript.
- Choose and utilize an S3 bucket for model storage on Wherobots Cloud.
The remaining steps are:
- Generate and upload your MLM JSON file.
- Run raster inference on Wherobots Cloud using a Notebook or Python file.
Create an MLM JSON for Your Model¶
MLM specification overview¶
The Machine Learning Model Extension Specification (MLM) is an extension of the SpatioTemporal Asset Catalog (STAC) standard. MLM defines a JSON format that specifies a model’s properties, its inputs and input-processing requirements, and its outputs and output-processing requirements.
MLM creates a standardized way to use your own models for inference by:
- Making custom models and their associated STAC datasets searchable.
- Recording the bands, parameters, model artifact locations, and high-level processing steps needed to deploy an inference service.
MLM specification forms¶
To create an MLM JSON file for your model, do the following:
- Fill out the Model Asset Form in the Asset Form tab.
- Fill out the Model Metadata form in the MLM Form tab.
Fill out Asset Form¶
To fill out the Model Asset Form, do the following:
- Go to the Machine Learning Model Metadata Form site.
- Go to the Asset Form tab.
- Fill in the MLM Model Asset Form with your model information in accordance with the following chart. For additional information, see Model Asset in Machine Learning Model Extension Specification.
Field Name | Type | Required or optional | Description
---|---|---|---
Title | string | Optional | Name of the model asset
URI | string | Required | S3 address where your compiled .pt TorchScript model file is stored.
type | string | Optional | The artifact’s media type. For more information, see Model Artifact Media-Type on the MLM extension GitHub.
roles | - | Required | Specify mlm:model. Can include ["mlm:weights", "mlm:checkpoint"] as applicable.
artifact_type | Artifact Type Enum | Optional | Specifies the kind of model artifact, typically related to a particular ML framework. For more information, see Artifact Type Enum on the MLM extension GitHub.
Fill out MLM form¶
To create the MLM JSON for your model, do the following:
- Within the Machine Learning Model Metadata Form site, go to the MLM Form tab. This form structures and organizes the information you provide so that it conforms to the MLM specification.
- Fill in the fields. A few are described below for reference. For a full breakdown of the inputs and definitions, see Item Properties and Collection Fields in the Machine Learning Model Extension Specification.
Note
While all fields in this form must be filled out, it's important to ensure the information you provide is accurate, as there is no automatic validation.
MLM metadata form field | Expected Input | Example Input
---|---|---
Is it pretrained? | true or false | true
Categories | List of classes for your model | “Solar panels”, “Wind farms”, “Forests”
Tasks | classification, object-detection, or segmentation | segmentation
Note
WherobotsAI Raster Inference only supports the following tasks: classification, object-detection, and segmentation.
- Click Download JSON to save the MLM JSON file.
Here is a reference MLM for the landcover-eurostat-sentinel2 Wherobots-hosted model.
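For orientation, a minimal MLM item can be assembled by hand; the metadata form generates the real file for you, and the field values below (model name, S3 URI, extension version) are hypothetical placeholders based on the MLM specification:

```python
import json

# All values here are illustrative placeholders, not a working model.
mlm_item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "stac_extensions": [
        # Pin to the MLM extension schema version you are targeting.
        "https://stac-extensions.github.io/mlm/v1.3.0/schema.json"
    ],
    "id": "my-segmentation-model",
    "properties": {
        "mlm:name": "my-segmentation-model",
        "mlm:tasks": ["segmentation"],
        "mlm:framework": "pytorch",
        "mlm:pretrained": True,
    },
    "assets": {
        "model": {
            # URI you filled into the Asset Form (placeholder).
            "href": "s3://my-bucket/bring-your-own-model/my_model.pt",
            "type": "application/octet-stream; application=pytorch",
            "roles": ["mlm:model"],
            "mlm:artifact_type": "torch.jit.script",
        }
    },
}

with open("my_model_mlm.json", "w") as f:
    json.dump(mlm_item, f, indent=2)
```

Comparing a form-generated JSON against a skeleton like this is a quick way to spot a missing `roles` entry or a wrong model URI before uploading.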
Upload your model’s MLM JSON¶
After completing the Create an MLM JSON file for Your Model steps, you'll have an MLM JSON file. Upload this file to the same Wherobots-accessible S3 bucket where you previously uploaded your TorchScript model's .pt file.
Tip
You can click the clipboard icon to copy the model's URI.
The path to your MLM JSON file will be represented by the variable user_mlm_uri in the following steps.
Checkpoint
At this point, you have completed the following steps:
- Save your model using TorchScript.
- Choose and utilize an S3 bucket for model storage on Wherobots Cloud.
- Generate and upload your MLM JSON file.
The remaining step is:
- Run raster inference on Wherobots Cloud using a Notebook or Python file.
Next steps¶
Run inference using your model on raster data¶
Currently, WherobotsAI Raster Inference supports running model inference on the following tasks:
- Single-label scene classification
- Object Detection
- Semantic Segmentation
The following chart details the WherobotsAI Raster Inference function calls to use for each Computer Vision task.
Computer Vision Task | SQL API | Python API | Notebook Tutorial
---|---|---|---
Image Classification | RS_CLASSIFY() | rs_classify() | Run inference for Classification
Object Detection | RS_DETECT_BBOXES() | rs_detect() | Run inference for Object Detection
Semantic Segmentation | RS_SEGMENT(), RS_SEGMENT_TO_GEOMS() | rs_segment() | Run inference for Semantic Segmentation
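To illustrate the shape of a SQL API call, here is a sketch in which the table name and URI are placeholders; on Wherobots Cloud you would pass the query to `sedona.sql`:

```python
# Placeholder URI for the MLM JSON uploaded in the earlier steps.
user_mlm_uri = "s3://my-bucket/bring-your-own-model/my_model_mlm.json"

# Hypothetical query: segment every raster in a registered `raster_table`.
query = f"""
SELECT
  rast,
  RS_SEGMENT('{user_mlm_uri}', rast) AS segment_result
FROM raster_table
"""

# On a Wherobots GPU-Optimized runtime you would then run:
# predictions = sedona.sql(query)
```

The classification and detection calls follow the same pattern, with RS_CLASSIFY() or RS_DETECT_BBOXES() in place of RS_SEGMENT().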
Start a notebook¶
To use your own model, create a Python notebook and reference the user_mlm_uri.
To start a notebook to run raster inference with WherobotsAI, do the following:
- Log in to Wherobots Cloud.
- Start a Wherobots notebook instance. We recommend using the GPU-Tiny runtime. It can take several minutes for a runtime to load.
Note
Running this notebook requires a GPU-Optimized runtime. To access GPU-Optimized runtimes, sign up for a Paid Organization Edition and file a Compute Request. For additional instructions, see Access a GPU-Optimized runtime.
- Open a Python notebook. For more information on notebook management, see Notebook Instance management and Jupyter Instance Management.
You can run any of the tutorials listed in Run inference using your model on raster data.
Create the URI variable¶
WherobotsAI Raster Inference relies on the MLM JSON's S3 URI to access essential model information and determine the appropriate model for inference execution.
To obtain the S3 URI of the MLM JSON:
- Navigate to the MLM JSON in Wherobots Cloud.
- Copy the location of the file and set it to user_mlm_uri.
The following code creates the SedonaContext and sets user_mlm_uri, the S3 URI of the MLM JSON that we created in Upload your model's MLM JSON.
You have completed this tutorial!
At this point, you have completed the following steps:
- Save your model using TorchScript.
- Choose and utilize an S3 bucket for model storage on Wherobots Cloud.
- Generate and upload your MLM JSON file.
- Execute raster inference on Wherobots Cloud.