The following content is a read-only preview of an executable Jupyter notebook. To run this notebook interactively:
  1. Go to Wherobots Cloud.
  2. Start a runtime.
  3. Open the notebook.
  4. In the Jupyter Launcher:
    1. Click File > Open Path.
    2. Paste the following path to access this notebook: examples/scala/Getting_Started.ipynb
    3. Click Enter.
This notebook will get you hands-on with geospatial analysis in Wherobots using Scala. Scala in Wherobots notebooks is similar to Python, but has a few key differences we will highlight here.

Set up your Sedona context

A SedonaContext connects your code to the Wherobots Cloud compute environment, and it works the same way in Scala as in Python. First, build the configuration for your compute environment, then use that configuration to launch the Sedona context. We’ll use the default configuration in this notebook, but you can learn about configuring the context in our documentation.
import org.apache.sedona.spark.SedonaContext

val config = SedonaContext.builder().getOrCreate()
val sedona = SedonaContext.create(config)

Reading Data

If you are familiar with Spark code, reading data in Wherobots will look familiar. Wherobots includes readers for spatial data formats like GeoParquet and GeoJSON.
GeoParquet is an open, efficient format for storing geospatial data, perfect for large-scale geospatial workflows. (Docs: Loading GeoParquet)
// URI of sample data in an S3 bucket
val geoparquet_uri = "s3://wherobots-examples/data/onboarding_1/nyc_buildings.parquet"

// Load from S3 into a Sedona DataFrame
val buildings = sedona.read.format("geoparquet").load(geoparquet_uri)

buildings.printSchema()
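After loading, it can help to sanity-check the data before querying it. The snippet below is a minimal sketch that assumes the dataset's geometry column is named `geom` (confirm the actual name from the `printSchema()` output above); it counts the rows and computes each building's footprint area with `ST_Area`.

```scala
// Sketch: inspect the loaded buildings DataFrame.
// Assumes the geometry column is named "geom" -- check printSchema() above.
import org.apache.spark.sql.sedona_sql.expressions.st_functions.ST_Area
import org.apache.spark.sql.functions.col

// Row count gives a quick sense of dataset size
println(buildings.count())

// Footprint area of each building, in the units of the data's CRS
buildings
  .withColumn("area", ST_Area(col("geom")))
  .select("geom", "area")
  .show(5, false)
```

Note that `ST_Area` returns areas in the units of the data's coordinate reference system, so for longitude/latitude data the values are in square degrees unless you reproject first.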

Importing ST Functions

Wherobots in Scala supports the full DataFrame API, and you can also run SQL queries. Here is how to import the ST functions commonly used in spatial queries. More on the DataFrame API is available in the Wherobots documentation.
import org.apache.spark.sql.sedona_sql.expressions.st_constructors.ST_MakePoint
import org.apache.spark.sql.functions.lit

// Add a constant point column and preview it
buildings.withColumn("geom2", ST_MakePoint(lit(1.2), lit(3.4)))
  .select("geom2")
  .show(5, false)
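The same ST functions are also available through SQL. As a sketch, you can register the DataFrame as a temporary view and run the equivalent query with `sedona.sql` (the view name `buildings` here is arbitrary):

```scala
// Register the DataFrame as a temporary view so it can be queried with SQL
buildings.createOrReplaceTempView("buildings")

// Same constant-point expression as above, written in SQL
sedona.sql("""
  SELECT geom, ST_MakePoint(1.2, 3.4) AS geom2
  FROM buildings
  LIMIT 5
""").show(false)
```

Whether you prefer the DataFrame API or SQL is largely a matter of taste; both compile to the same Sedona spatial expressions.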