The following content is a read-only preview of an executable Jupyter notebook. To run this notebook interactively:
- Go to Wherobots Cloud.
- Start a runtime.
- Open the notebook.
- In the Jupyter Launcher:
  - Click File > Open Path.
  - Paste the following path to access this notebook:
    examples/Reading_and_Writing_Data/Loading_Common_Spatial_File_Types.ipynb
  - Press Enter.
📖 Introduction
In this notebook, we will demonstrate how to load geospatial data into Wherobots using the following formats:
- GeoParquet
- GeoJSON and Shapefiles
- Raster Data (GeoTIFF)
- Overture Maps Data
- Data from S3
Each section will walk through the necessary steps with annotated code and provide links to relevant Wherobots documentation.
🗂 Step 1: Loading GeoParquet Files
What you’ll learn:
- How to load GeoParquet files into a DataFrame.
- How to perform basic spatial queries.
# Import necessary libraries
from sedona.spark import *
from pyspark.sql import SparkSession
# Initialize Sedona and Spark session
config = SparkSession.builder \
    .appName("Dataset Loader") \
    .getOrCreate()
sedona = SedonaContext.create(config)
# Load GeoParquet data
gdf = sedona.read.format("geoparquet").load("s3://wherobots-examples/data/mini/es_cn.parquet")
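The cell above stops at the load, so here is a minimal sketch of a basic spatial query to run next. It assumes the geometry column is named `geometry` (confirm with `gdf.printSchema()`); the helper name `summarize_geometries` is hypothetical. It relies only on the DataFrame's `selectExpr` method and the Sedona SQL functions `ST_GeometryType`, `ST_Envelope`, and `ST_AsText`.

```python
def summarize_geometries(df, geom_col="geometry"):
    """Return each row's geometry type and bounding envelope as WKT.

    Assumes `df` is a Sedona/Spark DataFrame whose geometry column is
    named `geom_col` (an assumption; verify with df.printSchema()).
    """
    return df.selectExpr(
        f"ST_GeometryType({geom_col}) AS geom_type",
        f"ST_AsText(ST_Envelope({geom_col})) AS envelope_wkt",
    )

# Usage in the notebook, after loading `gdf`:
# summarize_geometries(gdf).show(5, truncate=False)
```

Keeping the column name as a parameter makes the same helper reusable for the other DataFrames in this notebook if their geometry column is named differently.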
📄 Documentation Reference: Loading GeoParquet
🌍 Step 2: Loading GeoJSON and Shapefiles
What you’ll learn:
- How to ingest GeoJSON and Shapefiles.
# Load GeoJSON file
geojson_df = sedona.read.format("geojson").load("s3://wherobots-examples/data/mini/2015_Tree_Census.geojson")
import pyspark.sql.functions as f
# Promote selected GeoJSON properties to top-level columns,
# then drop the nested "properties" column and the "type" column
df = sedona.read.format("geojson").load("s3://wherobots-examples/data/mini/2015_Tree_Census.geojson") \
    .withColumn("address", f.expr("properties['address']")) \
    .withColumn("spc_common", f.expr("properties['spc_common']")) \
    .drop("properties").drop("type")
df.printSchema()
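To make the flattening step concrete, here is a plain-Python sketch of the same idea applied to a single GeoJSON Feature: selected keys are lifted out of `properties` into top-level fields, and `properties` and `type` are dropped. The sample feature values and the helper name `flatten_feature` are hypothetical and are not taken from the tree census file.

```python
import json

def flatten_feature(feature, keys):
    """Lift selected `properties` keys to the top level and drop
    `properties` and `type`, analogous to the withColumn/drop chain."""
    props = feature.get("properties") or {}
    flat = {k: v for k, v in feature.items() if k not in ("properties", "type")}
    for key in keys:
        flat[key] = props.get(key)  # None when the key is absent (in this sketch)
    return flat

# Hypothetical feature, for illustration only
sample = json.loads("""
{
  "type": "Feature",
  "geometry": {"type": "Point", "coordinates": [-73.98, 40.75]},
  "properties": {"address": "1 Example St", "spc_common": "red maple", "status": "Alive"}
}
""")

flat = flatten_feature(sample, ["address", "spc_common"])
print(sorted(flat))
```

Properties that are not listed (here, `status`) are discarded, which is also what happens to unlisted fields when the Spark version drops the `properties` column.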
# Load Shapefile
shapefile_df = sedona.read.format("shapefile").load("s3://wherobots-examples/data/mini/HurricaneSandy/geo_export_2ca210ed-d8b2-4fe6-81eb-53cc96311073.shp")
# Inspect the schema
shapefile_df.printSchema()
📄 Documentation Reference: Ingesting GeoJSON
🖼️ Step 3: Loading Raster Data (GeoTIFF)
What you’ll learn:
- How to load raster datasets and inspect metadata.
# Load a GeoTIFF raster file
raster_df = sedona.read.format("binaryFile").load("s3://wherobots-examples/data/mini/NYC_3ft_Landcover.tif")
# Convert binary content to a raster object
raster_df = raster_df.selectExpr("RS_FromGeoTiff(content) as raster")
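The cell above builds the raster column but does not yet inspect it. A minimal sketch of a metadata query follows, assuming the column is named `raster` as in the `selectExpr` call; the helper name `raster_metadata` is hypothetical. It uses the Sedona raster SQL functions `RS_Width`, `RS_Height`, `RS_NumBands`, and `RS_SRID`.

```python
def raster_metadata(raster_df, raster_col="raster"):
    """Return basic metadata for each raster in `raster_df`.

    Assumes `raster_df` is a Sedona/Spark DataFrame with a raster
    column named `raster_col`, as produced by RS_FromGeoTiff.
    """
    return raster_df.selectExpr(
        f"RS_Width({raster_col}) AS width",          # pixels per row
        f"RS_Height({raster_col}) AS height",        # pixels per column
        f"RS_NumBands({raster_col}) AS num_bands",   # number of bands
        f"RS_SRID({raster_col}) AS srid",            # spatial reference ID
    )

# Usage in the notebook, after building `raster_df`:
# raster_metadata(raster_df).show()
```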
📄 Documentation Reference: Loading Raster Data
🗺️ Step 4: Loading Overture Maps Data
What you’ll learn:
- How to load and query datasets provided by Overture Maps.
# Load Overture Maps building dataset
buildings_df = sedona.read.format("iceberg").load("wherobots_open_data.overture_maps_foundation.buildings_building")
# Keep only buildings whose geometry intersects a bounding box
bbox_wkt = "POLYGON((-122.5 37.0, -122.5 37.5, -121.5 37.5, -121.5 37.0, -122.5 37.0))"
# `f` is pyspark.sql.functions, imported in Step 2
bbox_geom = f.expr(f"ST_GeomFromText('{bbox_wkt}')")
buildings_filtered = buildings_df.where(ST_Intersects("geometry", bbox_geom))
# Show results
buildings_filtered.show()
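Hand-writing the bounding-box polygon is error-prone because the ring must list its corners in order and repeat the first point at the end. A small sketch of a helper that builds the WKT from the four extents follows; the name `bbox_to_wkt` is hypothetical and not part of Sedona.

```python
def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Build a closed WKT POLYGON for a lon/lat bounding box."""
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("min values must be strictly less than max values")
    # Walk the four corners and repeat the first point to close the ring
    ring = [
        (min_lon, min_lat),
        (min_lon, max_lat),
        (max_lon, max_lat),
        (max_lon, min_lat),
        (min_lon, min_lat),
    ]
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"

# Reproduces the polygon used in the filter above
bbox_wkt = bbox_to_wkt(-122.5, 37.0, -121.5, 37.5)
print(bbox_wkt)
```

The resulting string can be passed to `ST_GeomFromText` exactly as in the filter above.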
🔮 Next Steps
In this notebook, we demonstrated how to:
- Load GeoParquet, GeoJSON, Shapefiles, and raster data into Wherobots.
- Query spatial data using basic spatial operations.
- Integrate datasets directly from S3 and Overture Maps.
What’s next?
- Explore spatial transformations like buffering or intersecting geometries.
- Perform spatial joins for more advanced analytics.
- Visualize query results with SedonaKepler or SedonaPyDeck.
For further details, check out the Wherobots Documentation.