Skip to main content
WherobotsDB provides various SQL APIs to work with vector and raster data. For vector data, it follows SQL/MM Part3 Spatial SQL Standard. WherobotsDB maintains support for the existing public APIs found in Apache Sedona. To make existing Sedona users feel comfortable, our documentation still uses the word Sedona in the context of WherobotsDB.

Create SedonaContext

To start using WherobotsDB, we need to launch a SedonaContext in Wherobots cloud:
  • Python
  • Scala
  • Java
from sedona.spark import *

config = SedonaContext.builder().getOrCreate()
sedona = SedonaContext.create(config)
Alternatively, expr and selectExpr can be used:
myDataFrame.withColumn("geometry", expr("ST_*")).selectExpr("ST_*")
myDataFrame.withColumn("raster", expr("RS_*")).selectExpr("RS_*")
Detailed Spatial SQL APIs for vector and raster can be found at Vector SQL APIs and Raster SQL APIs respectively. References of Havasu table format can be found at Havasu APIs

Quick start

After starting SedonaContext, users can start working with WherobotsDB on Wherobots cloud without any setup.

Create a WherobotsDB Havasu table

Wherobots cloud provides a set of default configurations for the users to store data under the S3 prefix allocated for the users. Simply use wherobots as the catalog name when creating a Havasu table.
  • Python
  • Scala
  • Java
sedona.sql("CREATE DATABASE IF NOT EXISTS org_catalog.test_db")
sedona.sql("""
CREATE TABLE IF NOT EXISTS org_catalog.test_db.taxi (
  pickup GEOMETRY,
  Trip_Pickup_DateTime STRING,
  Payment_Type STRING,
  Fare_Amt DECIMAL
) USING havasu.iceberg
""")

Read data from external storage

User can ingest and query data using WherobotsDB. For example, users can load an example CSV file:
  • Python
  • Scala
  • Java
taxidf = sedona.read.format('csv').option("header","true").option("delimiter", ",").load("s3://wherobots-examples/data/nyc-taxi-data.csv")
taxidf = taxidf.selectExpr('ST_Point(CAST(Start_Lon AS Decimal(24,20)), CAST(Start_Lat AS Decimal(24,20))) AS pickup', 'Trip_Pickup_DateTime', 'Payment_Type', 'CAST(Fare_Amt AS DECIMAL)')
taxidf = taxidf.filter(col("pickup").isNotNull())

Insert data to a WherobotsDB Havasu table

User can insert the data above to an existing WherobotsDB table.
  • Python
  • Scala
  • Java
taxidf.writeTo("org_catalog.test_db.taxi").append()

Read from a WherobotsDB Havasu table

User can immediately query the data stored in Havasu table.
  • Python
  • Scala
  • Java
df_result = sedona.sql("SELECT ST_Buffer(pickup, 0.001) pickup_buf FROM org_catalog.test_db.taxi WHERE ST_Within(pickup, ST_PolygonFromEnvelope(-74, 40, -73, 41))")
df_result.show()

Feature Limitations

  • MOR Tables: Havasu does not support MOR tables. We will add MOR support to Havasu in the future.
  • File Formats: Havasu only supports storing geometry or raster values in parquet files. Other formats such as ORC or Avro are not supported.