Bring Your Own Model
Overview¶
WherobotsAI Raster Inference supports running your own machine learning models on raster imagery using the Machine Learning Model Extension Specification (MLM). MLM is a standard for discovering, sharing, and running machine learning models on geospatial data.
Capabilities¶
WherobotsAI Raster Inference currently supports:
- The following computer vision tasks:
- Single-label scene classification
- Object detection
- Semantic segmentation
- Workloads with a single input tensor and a single output tensor
- NVIDIA GPU acceleration
- TorchScript models
Job Runs¶
You can complete raster inference with WherobotsAI within a Job Run or in a Wherobots Notebook.
This tutorial discusses how to complete raster inference within a Wherobots Notebook. To run this code as a Job Run, combine the code samples from the following sections of this tutorial into a single Python file and execute it as a Job Run.
For more information on creating Job Runs in Wherobots, see WherobotsRunOperator.
You can also export a notebook into a Python file. For more information, see Export a Python Notebook in the Wherobots Jupyter Notebook Management Documentation.
What to expect from this tutorial¶
This tutorial guides you through preparing your own machine learning model for use with raster inference on the Wherobots Cloud platform.
Learn how to:
- Save your model using TorchScript.
- Choose and utilize an S3 bucket for model storage on Wherobots Cloud.
- Generate and upload your MLM JSON file.
- Execute raster inference on Wherobots Cloud.
How to use this tutorial¶
This tutorial can be used in two ways:
- Interactive: Log in to Wherobots Cloud to follow along in a Wherobots Notebook by opening
examples/python/wherobots-ai/gpu/bring_your_own_model_tutorial.ipynb
- Adaptation: Apply the steps to your own model in a new notebook.
Before you start¶
Before attempting to use your own machine learning model in WherobotsAI Raster Inference, ensure that you have the following:
- A Professional or Enterprise Edition Wherobots Organization.
  - WherobotsAI Raster Inference requires a GPU-Optimized runtime. To access GPU-Optimized runtimes, sign up for a Paid Organization Edition and file a Compute Request. For additional instructions, see Access a GPU-Optimized runtime.
- A PyTorch model file.
- An Integrated Amazon S3 Bucket or a Wherobots Managed Storage resource for storing your MLM JSON file.
Note
If you add the S3 storage integration after starting the notebook, you must restart the notebook in order to access the newly added storage integration.
Access a GPU-Optimized runtime¶
This notebook requires a GPU-Optimized runtime. For more information on GPU-Optimized runtimes, see Runtime types.
To access this runtime category, do the following:
- Sign up for a paid Wherobots Organization Edition (Professional or Enterprise).
- Submit a Compute Request for a GPU-Optimized runtime.
Save and upload your model¶
Save your model¶
Save your model checkpoint using TorchScript. For more information, see Saving and Loading Models in the PyTorch documentation.
PyTorch model support only
WherobotsAI Raster Inference currently only supports PyTorch models.
The following TorchScript model checkpoint saving methods are supported:
Artifact Type | Description | File Extension
---|---|---
torch.jit.script | A model artifact obtained by TorchScript. | .pt
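As a sketch of that saving step, the model architecture and file name below are hypothetical stand-ins for your own model:

```python
import torch
import torch.nn as nn

# Hypothetical toy model, a stand-in for your own architecture.
class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        x = self.pool(self.conv(x)).flatten(1)
        return self.fc(x)

model = TinyClassifier().eval()

# Compile to TorchScript and save the .pt artifact you will upload.
scripted = torch.jit.script(model)
scripted.save("tiny_classifier.pt")

# Sanity check: reload the artifact and run a dummy 3-band tile through it.
reloaded = torch.jit.load("tiny_classifier.pt")
out = reloaded(torch.randn(1, 3, 64, 64))
```

Reloading with `torch.jit.load` before uploading is a cheap way to confirm the artifact is self-contained and does not depend on your local Python class definitions.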
Checkpoint
At this point, you have completed the following step:
- Save your model using TorchScript.
The remaining steps are:
- Choose and utilize an S3 bucket for model storage on Wherobots Cloud.
- Generate and upload your MLM JSON file.
- Execute raster inference on Wherobots Cloud.
Upload your model¶
Store your model in an S3 bucket that's accessible to Wherobots Cloud.
You can choose to store your model in either of the following ways:
- Directly in Wherobots Managed Storage. For more information, see Wherobots storage and notebook guidance.
- By integrating your existing Amazon S3 storage with Wherobots. For more information on integrating a public or private S3 bucket with Wherobots Cloud, see S3 Storage Integration.
Wherobots Managed Storage example¶
To upload your model to Wherobots Managed Storage, do the following:
- Go to Storage.
- Navigate to your desired folder location.
- Click Upload.
- Upload your model .pt file.
For in-depth instructions, review the Managed Storage Documentation.
Uploaded model example¶
This tutorial uses a Wherobots-hosted model, but you can follow the same steps to store your own models in Wherobots Managed Storage.
The following image shows the solar_satlas_setinel2_model_pt2.pt model located in the data/customer-XXXX/bring-your-own-model directory.
Tip
You can click the clipboard icon to copy the model's URI. You'll need your model's URI to create an MLM JSON in the next step (Fill out Asset Form).
Checkpoint
At this point, you have completed the following steps:
- Save your model using TorchScript.
- Choose and utilize an S3 bucket for model storage on Wherobots Cloud.
The remaining steps are:
- Generate and upload your MLM JSON file.
- Run raster inference on Wherobots Cloud using a Notebook or Python file.
Create an MLM JSON for Your Model¶
MLM specification overview¶
The Machine Learning Model Extension Specification (MLM) is an extension of the SpatioTemporal Asset Catalog (STAC) standard. MLM defines a JSON format that specifies a model’s properties, its inputs and input-processing requirements, and its outputs and output-processing requirements.
MLM creates a standardized way to use your own models for inference by:
- Making custom models and their associated STAC datasets searchable.
- Recording the bands, parameters, model artifact locations, and high-level processing steps needed to deploy an inference service.
MLM specification forms¶
To create an MLM JSON file for your model, do the following:
- Fill out the Model Asset Form in the Asset Form tab.
- Fill out the Model Metadata form in the MLM Form tab.
Fill out Asset Form¶
To fill out the Model Asset Form, do the following:
- Go to the Machine Learning Model Metadata Form site.
- Go to the Asset Form tab.
- Fill in the MLM Model Asset Form with your model information in accordance with the following chart. For additional information, see Model Asset in Machine Learning Model Extension Specification.
Field Name | Type | Required or optional | Description
---|---|---|---
Title | string | Optional | Name of the model asset
URI | string | Required | S3 address where your compiled .pt TorchScript model file is stored.
type | string | Optional | The artifact’s media type. For more information, see Model Artifact Media-Type on the MLM extension GitHub.
roles | - | Required | Specify mlm:model. Can include ["mlm:weights", "mlm:checkpoint"] as applicable.
artifact_type | Artifact Type Enum | Optional | Specifies the kind of model artifact, typically related to a particular ML framework. For more information, see Artifact Type Enum on the MLM extension GitHub.
Fill out MLM form¶
To create the MLM JSON for your model, do the following:
- Within the Machine Learning Model Metadata Form site, go to the MLM Form tab. This form structures and organizes the information you provide so that it conforms to the MLM specification.
- Fill in the fields. A few are described below for reference. For a full breakdown of the inputs and definitions, see Item Properties and Collection Fields in the Machine Learning Model Extension Specification.
Note
While all fields in this form must be filled out, it's important to ensure the information you provide is accurate, as there is no automatic validation.
MLM metadata form field | Expected Input | Example Input
---|---|---
Is it pretrained? | true or false | true
Categories | List of classes for your model | “Solar panels”, “Wind farms”, “Forests”
Tasks | classification, object-detection, or segmentation | segmentation
Note
WherobotsAI Raster Inference only supports the following tasks: classification, object-detection, and segmentation.
- Click Download JSON to save the MLM JSON file.
Here is a reference MLM for the landcover-eurostat-sentinel2 Wherobots-hosted model.
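For orientation, a minimal MLM item can be assembled by hand; the metadata form generates the real file for you, and the field values below (model name, S3 URI, extension version) are hypothetical placeholders based on the MLM specification:

```python
import json

# All values here are illustrative placeholders, not a working model.
mlm_item = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "stac_extensions": [
        # Pin to the MLM extension schema version you are targeting.
        "https://stac-extensions.github.io/mlm/v1.3.0/schema.json"
    ],
    "id": "my-segmentation-model",
    "properties": {
        "mlm:name": "my-segmentation-model",
        "mlm:tasks": ["segmentation"],
        "mlm:framework": "pytorch",
        "mlm:pretrained": True,
    },
    "assets": {
        "model": {
            # URI you filled into the Asset Form (placeholder).
            "href": "s3://my-bucket/bring-your-own-model/my_model.pt",
            "type": "application/octet-stream; application=pytorch",
            "roles": ["mlm:model"],
            "mlm:artifact_type": "torch.jit.script",
        }
    },
}

with open("my_model_mlm.json", "w") as f:
    json.dump(mlm_item, f, indent=2)
```

Comparing a form-generated JSON against a skeleton like this is a quick way to spot a missing `roles` entry or a wrong model URI before uploading.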
Upload your model’s MLM JSON¶
After completing the Create an MLM JSON file for Your Model steps, you'll have an MLM JSON file. Upload this file to the same Wherobots-accessible S3 bucket where you previously uploaded your TorchScript model's .pt file.
Tip
You can click the clipboard icon to copy the model's URI.
The path to your MLM JSON file will be represented by the variable user_mlm_uri in the following steps.
Checkpoint
At this point, you have completed the following steps:
- Save your model using TorchScript.
- Choose and utilize an S3 bucket for model storage on Wherobots Cloud.
- Generate and upload your MLM JSON file.
The remaining step is:
- Run raster inference on Wherobots Cloud using a Notebook or Python file.
Next steps¶
Run inference using your model on raster data¶
Currently, WherobotsAI Raster Inference supports running model inference on the following tasks:
- Single-label scene classification
- Object Detection
- Semantic Segmentation
The following chart details the WherobotsAI Raster Inference function calls to use for each Computer Vision task.
Computer Vision Task | SQL API | Python API | Notebook Tutorial
---|---|---|---
Image Classification | RS_CLASSIFY() | rs_classify() | Run inference for Classification
Object Detection | RS_DETECT_BBOXES() | rs_detect() | Run inference for Object Detection
Semantic Segmentation | RS_SEGMENT(), RS_SEGMENT_TO_GEOMS() | rs_segment() | Run inference for Semantic Segmentation
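To illustrate the shape of a SQL API call, here is a sketch in which the table name and URI are placeholders; on Wherobots Cloud you would pass the query to `sedona.sql`:

```python
# Placeholder URI for the MLM JSON uploaded in the earlier steps.
user_mlm_uri = "s3://my-bucket/bring-your-own-model/my_model_mlm.json"

# Hypothetical query: segment every raster in a registered `raster_table`.
query = f"""
SELECT
  rast,
  RS_SEGMENT('{user_mlm_uri}', rast) AS segment_result
FROM raster_table
"""

# On a Wherobots GPU-Optimized runtime you would then run:
# predictions = sedona.sql(query)
```

The classification and detection calls follow the same pattern, with RS_CLASSIFY() or RS_DETECT_BBOXES() in place of RS_SEGMENT().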
Start a notebook¶
To use your own model, create a Python notebook and reference the user_mlm_uri.
To start a notebook to run raster inference with WherobotsAI, do the following:
- Log in to Wherobots Cloud.
- Start a Wherobots notebook instance. We recommend using the GPU-Tiny runtime. It can take several minutes for a runtime to load.
Note
Running this notebook requires a GPU-Optimized runtime. To access GPU-Optimized runtimes, sign up for a Paid Organization Edition and file a Compute Request. For additional instructions, see Access a GPU-Optimized runtime.
- Open a Python notebook. For more information on notebook management, see Notebook Instance management and Jupyter Instance Management.
You can run any of the tutorials listed in Run inference using your model on raster data.
Create the URI variable¶
WherobotsAI Raster Inference relies on the MLM JSON's S3 URI to access essential model information and determine the appropriate model for inference execution.
To obtain the S3 URI of the MLM JSON:
- Navigate to the MLM JSON in Wherobots Cloud.
- Copy the location of the file and set it to user_mlm_uri.
The following code creates the SedonaContext and sets user_mlm_uri, the S3 URI of the MLM JSON that we created in Upload your model's MLM JSON.
You have completed this tutorial!
At this point, you have completed the following steps:
- Save your model using TorchScript.
- Choose and utilize an S3 bucket for model storage on Wherobots Cloud.
- Generate and upload your MLM JSON file.
- Execute raster inference on Wherobots Cloud.