
Bring your Own Model

Overview

WherobotsAI Raster Inference supports running your own machine learning models on raster images using the Machine Learning Model Extension Specification (MLM), a STAC extension that provides a standard for discovering, sharing, and running machine learning models on geospatial data.

Capabilities

WherobotsAI Raster Inference currently supports:

  • The following computer vision tasks:
    • Single-label scene classification
    • Object detection
    • Semantic segmentation
  • Workloads with a single input tensor and a single output tensor
  • NVIDIA GPU acceleration
  • TorchScript models

Job Runs

You can run raster inference with WherobotsAI within a Job Run or in a Wherobots Notebook.

This tutorial discusses how to complete raster inference within a Wherobots Notebook. To run this code as a Job Run, combine the code samples from the following sections of this tutorial into a single Python file and execute it as a Job Run.

For more information on creating Job Runs in Wherobots, see WherobotsRunOperator.

You can export a notebook into a Python file

You can also export a notebook into a Python file. For more information, see Export a Python Notebook in the Wherobots Jupyter Notebook Management Documentation.

What to expect from this tutorial

This tutorial guides you through preparing your own machine learning model for use with raster inference on the Wherobots Cloud platform.

Learn how to:

  • Save your model using TorchScript.
  • Choose and utilize an S3 bucket for model storage on Wherobots Cloud.
  • Generate and upload your MLM JSON file.
  • Execute raster inference on Wherobots Cloud.

How to use this tutorial

This tutorial can be used in two ways:

  • Interactive: Log in to Wherobots Cloud to follow along in a Wherobots Notebook by opening examples/python/wherobots-ai/gpu/bring_your_own_model_tutorial.ipynb
  • Adaptation: Apply the steps to your own model in a new notebook.

Before you start

Before attempting to use your own machine learning model in WherobotsAI Raster Inference, ensure that you have the following:

Access a GPU-Optimized runtime

This notebook requires a GPU-Optimized runtime. For more information on GPU-Optimized runtimes, see Runtime types.

To access this runtime category, do the following:

  1. Sign up for a paid Wherobots Organization Edition (Professional or Enterprise).
  2. Submit a Compute Request for a GPU-Optimized runtime.

Save and upload your model

Save your model

Save your model checkpoint using TorchScript. For more information, see Saving and Loading Models in the PyTorch documentation.

PyTorch model support only

WherobotsAI Raster Inference currently only supports PyTorch models.

The following TorchScript model checkpoint saving methods are supported:

  Artifact Type      Description                                 File Extension
  torch.jit.script   A model artifact produced by TorchScript.   .pt
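As a minimal sketch of this step, the following scripts a toy module with torch.jit.script, saves the .pt artifact, and verifies it loads back without the Python class definition. The TinySegmenter class is a hypothetical stand-in; substitute your own model.

```python
import torch
import torch.nn as nn

class TinySegmenter(nn.Module):
    """Toy stand-in for your real model."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 1, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.sigmoid(self.conv(x))

model = TinySegmenter().eval()
scripted = torch.jit.script(model)   # compile the model to TorchScript
scripted.save("tiny_segmenter.pt")   # the .pt artifact you upload later

# Verify the artifact loads and runs on its own.
reloaded = torch.jit.load("tiny_segmenter.pt")
out = reloaded(torch.rand(1, 3, 8, 8))
print(out.shape)
```

Loading the saved file back before uploading is a cheap way to confirm the checkpoint is a self-contained TorchScript artifact rather than a pickled state dict.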

Checkpoint

At this point, you have completed the first step of this tutorial:

  • Save your model using TorchScript.

Next, you'll upload the model to S3 storage that Wherobots Cloud can access.

Upload your model

Store your model in an S3 bucket that's accessible to Wherobots Cloud.

You can store your model in Wherobots Managed Storage or in your own S3 bucket that Wherobots Cloud can access.

Wherobots Managed Storage example

To upload your model to Wherobots Managed Storage, do the following:

  1. Go to Storage.
  2. Navigate to your desired folder location.
  3. Click Upload.
  4. Upload your model .pt file.

For in-depth instructions, see the Managed Storage documentation.

Uploaded model example

This tutorial uses a Wherobots-hosted model, but you can follow the same steps to store your own models in Wherobots Managed Storage.

The following image shows the solar_satlas_setinel2_model_pt2.pt model located in the data/customer-XXXX/bring-your-own-model directory.

Upload model

Tip

You can click the clipboard icon to copy the model's URI. You'll need your model's URI to create an MLM JSON in the next step (Fill out Asset Form).

Checkpoint

At this point, you have completed the following steps:

  • Save your model using TorchScript.
  • Choose and utilize an S3 bucket for model storage on Wherobots Cloud.

Next, you'll generate and upload your MLM JSON file.

Create an MLM JSON for Your Model

MLM specification overview

The Machine Learning Model Extension Specification (MLM) is an extension of the SpatioTemporal Asset Catalog (STAC) specification. MLM defines a JSON format that describes a model’s properties, its inputs and input processing requirements, and its outputs and output processing requirements.

MLM creates a standardized way to use your own models for inference by:

  • Making custom models and their associated STAC datasets searchable.
  • Recording all necessary bands, parameters, model artifact locations, and high-level processing steps needed to deploy an inference service.

MLM specification forms

To create an MLM JSON file for your model, do the following:

Fill out Asset Form

To fill out the Model Asset Form, do the following:

  1. Go to the Machine Learning Model Metadata Form site.
  2. Go to the Asset Form tab.
  3. Fill in the MLM Model Asset Form with your model information in accordance with the following chart. For additional information, see Model Asset in Machine Learning Model Extension Specification.

    • Title (string, optional): Name of the model asset.
    • URI (string, required): S3 address where your compiled .pt TorchScript model file is stored.
    • type (string, optional): The artifact’s media type. For more information, see Model Artifact Media-Type on the MLM extension GitHub.
    • roles (list of strings, required): Specify mlm:model. Can include "mlm:weights" and "mlm:checkpoint" as applicable.
    • artifact_type (Artifact Type Enum, optional): The kind of model artifact, typically tied to a particular ML framework. For more information, see Artifact Type Enum on the MLM extension GitHub.

Fill out MLM form

To create the MLM JSON for your model, do the following:

  1. Within the Machine Learning Model Metadata Form site, go to the MLM Form tab.

    • This form structures and organizes the information you provide so that it conforms to the MLM specification.
    • For clarity, we’ve specified a few fields for reference below. For a full breakdown of the inputs and definitions, see Item Properties and Collection Fields in the Machine Learning Model Extension Specification.

    Note

    All fields in this form must be filled out. Because there is no automatic validation, make sure the information you provide is accurate.

    • Is it pretrained?: true or false (for example, true)
    • Categories: the list of classes for your model (for example, “Solar panels”, “Wind farms”, “Forests”)
    • Tasks: classification, object-detection, or segmentation (for example, segmentation)

    Note

    WherobotsAI Raster Inference only supports the following tasks: classification, object-detection, or segmentation.

  2. Click Download JSON to save the MLM JSON file.

Here is a reference MLM for the landcover-eurostat-sentinel2 Wherobots hosted model.
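The downloaded file is plain JSON. As a rough illustration of its shape, the sketch below assembles a minimal MLM-style payload in Python; the model name, task list, and S3 hrefs here are hypothetical, and the authoritative field definitions live in the MLM specification.

```python
import json

# Illustrative, minimal MLM-style item. Field names follow the STAC MLM
# extension conventions; the concrete values are hypothetical placeholders.
mlm_item = {
    "type": "Feature",
    "stac_extensions": [
        "https://stac-extensions.github.io/mlm/v1.0.0/schema.json"
    ],
    "properties": {
        "mlm:name": "my-segmentation-model",   # hypothetical name
        "mlm:tasks": ["segmentation"],          # one of the supported tasks
        "mlm:framework": "pytorch",
        "mlm:pretrained": True,
    },
    "assets": {
        "model": {
            "href": "s3://my-bucket/my_model.pt",  # your model's S3 URI
            "roles": ["mlm:model"],                # as entered in the Asset Form
        }
    },
}

print(json.dumps(mlm_item, indent=2))
```

The form generates an equivalent document for you; this sketch is only to show how the Asset Form fields (URI, roles) and MLM Form fields (pretrained, tasks) end up in the final JSON.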

Upload your model’s MLM JSON

After completing the Create an MLM JSON for Your Model steps, you'll have an MLM JSON file. Upload this file to the same Wherobots-accessible S3 bucket where you previously uploaded your TorchScript model's .pt file.

Upload json

Tip

You can click the clipboard icon to copy the model's URI.

The path to your MLM JSON file will be represented by the variable user_mlm_uri in the following steps.
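Because a mistyped URI only fails later at inference time, a quick sanity check on the copied string can catch copy/paste mistakes early. A minimal sketch (the bucket and path below are hypothetical placeholders):

```python
from urllib.parse import urlparse

# Hypothetical example; replace with the URI copied from Wherobots Storage.
user_mlm_uri = "s3://wherobots-managed-storage/customer-XXXX/model_mlm.json"

parsed = urlparse(user_mlm_uri)
assert parsed.scheme == "s3", "expected an s3:// URI"
assert parsed.path.endswith(".json"), "expected the MLM JSON, not the .pt model"
print(parsed.netloc, parsed.path)
```

A common slip is pasting the model's .pt URI instead of the MLM JSON's URI; the second assertion guards against exactly that.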

Checkpoint

At this point, you have completed the following steps:

  • Save your model using TorchScript.
  • Choose and utilize an S3 bucket for model storage on Wherobots Cloud.
  • Generate and upload your MLM JSON file.

Next, you'll run raster inference on Wherobots Cloud using a Notebook or Python file.

Next steps

Run inference using your model on raster data

Currently, WherobotsAI Raster Inference supports running model inference on the following tasks:

  • Single-label scene classification
  • Object Detection
  • Semantic Segmentation

The following chart details the WherobotsAI Raster Inference function calls to use for each computer vision task.

  Computer Vision Task    SQL API                               Python API      Notebook Tutorial
  Image Classification    RS_CLASSIFY()                         rs_classify()   Run inference for Classification
  Object Detection        RS_DETECT_BBOXES()                    rs_detect()     Run inference for Object Detection
  Semantic Segmentation   RS_SEGMENT(), RS_SEGMENT_TO_GEOMS()   rs_segment()    Run inference for Semantic Segmentation
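As a sketch of how the SQL API is typically invoked for segmentation, the snippet below builds a query string around RS_SEGMENT. The table name df_raster, column name outdb_raster, and argument order are assumptions for illustration; confirm the exact signature in the Raster Inference SQL reference.

```python
# Hypothetical MLM URI; use the one copied from Wherobots Storage.
user_mlm_uri = "s3://my-bucket/model_mlm.json"

# df_raster / outdb_raster are assumed names, and the RS_SEGMENT argument
# order is a sketch -- check the SQL API reference for the exact signature.
query = f"""
SELECT
  RS_SEGMENT('{user_mlm_uri}', outdb_raster) AS segment_result
FROM df_raster
"""
print(query)
# In a Wherobots notebook you would then run:
#   predictions_df = sedona.sql(query)
```

The same pattern applies to the other tasks: swap in RS_CLASSIFY() or RS_DETECT_BBOXES() for classification or object detection, keeping the MLM URI as the model reference.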

Start a notebook

To use your own model, create a Python notebook and reference the user_mlm_uri of your uploaded MLM JSON.

To start a notebook to run raster inference with WherobotsAI, do the following:

  1. Log in to Wherobots Cloud.
  2. Start a Wherobots notebook instance. We recommend using the GPU-Tiny runtime. It can take several minutes for a runtime to load.

    Note

    Running this notebook requires a GPU-Optimized runtime. To access GPU-Optimized runtimes, sign up for a Paid Organization Edition and file a Compute Request. For additional instructions, see Access a GPU-Optimized runtime.

  3. Open a Python notebook. For more information on notebook management, see Notebook Instance management and Jupyter Instance Management.

You can run any of the tutorials listed in Run inference using your model on raster data.

Create the URI variable

WherobotsAI Raster Inference relies on the MLM JSON's S3 URI to access essential model information and determine the appropriate model for inference execution.

To obtain the S3 URI of the MLM JSON:

  1. Navigate to the MLM JSON in Wherobots Cloud.
  2. Copy the file's URI and assign it to user_mlm_uri.

    Copy MLM URI

The following code creates the SedonaContext and sets user_mlm_uri, the S3 URI of the MLM JSON created in Upload your model's MLM JSON.

import warnings
warnings.filterwarnings('ignore')
import os

from wherobots.inference.data.io import read_raster_table
from sedona.spark import SedonaContext
from pyspark.sql.functions import expr

config = SedonaContext.builder().appName('segmentation-batch-inference')\
    .getOrCreate()

sedona = SedonaContext.create(config)
user_mlm_uri = "[PATH-TO-MLM-JSON]"  # replace with your MLM JSON's S3 URI

You have completed this tutorial!

At this point, you have completed the following steps:

  • Save your model using TorchScript.
  • Choose and utilize an S3 bucket for model storage on Wherobots Cloud.
  • Generate and upload your MLM JSON file.
  • Execute raster inference on Wherobots Cloud.