# Loading Geospatial Data with Wherobots

<Tip>
  The following content is a read-only preview of an executable Jupyter notebook.

  To run this notebook interactively:

  1. Go to [**Wherobots Cloud**](https://cloud.wherobots.com).
  2. Start a runtime.
  3. Open the notebook.
  4. In the Jupyter Launcher:
     1. Click **File > Open Path**.
     2. Paste the following path to access this notebook: `examples/Reading_and_Writing_Data/Loading_Common_Spatial_File_Types.ipynb`
     3. Press **Enter**.
</Tip>

## 📖 Introduction

In this notebook, we demonstrate how to load geospatial data into Wherobots from the following formats and sources:

1. **GeoParquet**
2. **GeoJSON and Shapefiles**
3. **Raster Data (GeoTIFF)**
4. **Overture Maps Data**
5. **Data from S3**

Each section walks through the necessary steps with annotated code and links to the relevant Wherobots documentation.

## 🗂 Step 1: Loading GeoParquet Files

### What you'll learn:

* How to load GeoParquet files into a DataFrame.
* How to inspect the schema before running spatial queries.

```python theme={"system"}
# Import necessary libraries
from sedona.spark import *
from pyspark.sql import SparkSession
```

```python theme={"system"}
# Create (or reuse) a Spark session, then wrap it in a Sedona context
config = SparkSession.builder \
    .appName("Dataset Loader") \
    .getOrCreate()
sedona = SedonaContext.create(config)
```

```python theme={"system"}
# Load GeoParquet data
gdf = sedona.read.format("geoparquet").load("s3://wherobots-examples/data/mini/es_cn.parquet")
```

```python theme={"system"}
gdf.printSchema()
```
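The schema printed above is the starting point for a first spatial query. As a minimal sketch, assuming the geometry column is named `geometry` (confirm this in the `printSchema()` output) and using a hypothetical view name `es_cn`, the helpers below build a Sedona SQL statement that counts rows per geometry type. The function names are illustrative and not part of any Wherobots API.

```python
def geometry_type_query(view_name: str, geom_col: str = "geometry") -> str:
    """Build a SQL string that counts rows per geometry type.

    `geom_col` is an assumed column name; confirm it with printSchema().
    """
    return (
        f"SELECT ST_GeometryType({geom_col}) AS geom_type, COUNT(*) AS n "
        f"FROM {view_name} GROUP BY ST_GeometryType({geom_col})"
    )


def count_geometry_types(session, df, view_name: str = "es_cn"):
    """Register `df` as a temporary view and run the query through `session`."""
    df.createOrReplaceTempView(view_name)
    return session.sql(geometry_type_query(view_name))


# Building the query string needs no Spark session.
query = geometry_type_query("es_cn")
print(query)
```

In the notebook, `count_geometry_types(sedona, gdf).show()` would run the query against the DataFrame loaded above.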

📄 **Documentation Reference**: [Loading GeoParquet](https://docs.wherobots.com/#geoparquet-loading)

## 🌍 Step 2: Loading GeoJSON and Shapefiles

### What you'll learn:

* How to ingest GeoJSON and Shapefiles.

```python theme={"system"}
# Load GeoJSON file
geojson_df = sedona.read.format("geojson").load("s3://wherobots-examples/data/mini/2015_Tree_Census.geojson")
```

```python theme={"system"}
geojson_df.printSchema()
```

```python theme={"system"}
import pyspark.sql.functions as f

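# Promote two attributes from the nested `properties` column to top-level columns,
# then drop the columns that are no longer needed
df = sedona.read.format("geojson").load("s3://wherobots-examples/data/mini/2015_Tree_Census.geojson") \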
    .withColumn("address", f.expr("properties['address']")) \
    .withColumn("spc_common", f.expr("properties['spc_common']")) \
    .drop("properties").drop("type")

df.printSchema()
```
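The `properties['address']` expressions above work because each GeoJSON feature keeps its attributes in a nested `properties` object next to `geometry`. The sketch below mirrors that flattening on a single hand-written feature using only the standard library. The attribute values are made up for illustration and are not taken from the Tree Census file, and `flatten_feature` is a hypothetical helper.

```python
import json

# A hand-written GeoJSON Feature; the values are illustrative only.
feature_text = """
{
  "type": "Feature",
  "geometry": {"type": "Point", "coordinates": [-73.98, 40.75]},
  "properties": {"address": "1 Example St", "spc_common": "red maple"}
}
"""


def flatten_feature(feature: dict, keys: list) -> dict:
    """Promote selected properties to top-level fields and drop the rest."""
    flat = {"geometry": feature["geometry"]}
    for key in keys:
        # .get() yields None for a missing attribute, much like a SQL NULL.
        flat[key] = feature["properties"].get(key)
    return flat


row = flatten_feature(json.loads(feature_text), ["address", "spc_common"])
print(row)
```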

```python theme={"system"}
# Load Shapefile
shapefile_df = sedona.read.format("shapefile").load("s3://wherobots-examples/data/mini/HurricaneSandy/geo_export_2ca210ed-d8b2-4fe6-81eb-53cc96311073.shp")
```

```python theme={"system"}
# Inspect the schema
shapefile_df.printSchema()
```
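A shapefile is a set of sibling files rather than a single file: the `.shp` geometry file is accompanied by a `.shx` index and a `.dbf` attribute table, and often by a `.prj` file describing the coordinate reference system. A small, hypothetical helper like the one below lists the siblings to look for next to a `.shp` path before loading. Which of them the reader strictly requires is not stated in this notebook, and the bucket path is a placeholder.

```python
import posixpath

# Companion extensions from the shapefile format; .prj is optional.
COMPANION_EXTENSIONS = (".shx", ".dbf", ".prj")


def shapefile_companions(shp_path: str) -> list:
    """Return the sibling paths expected next to a .shp file."""
    stem, ext = posixpath.splitext(shp_path)
    if ext.lower() != ".shp":
        raise ValueError(f"expected a .shp path, got {shp_path!r}")
    return [stem + companion for companion in COMPANION_EXTENSIONS]


# Placeholder path, not a real dataset.
paths = shapefile_companions("s3://example-bucket/data/roads.shp")
print(paths)
```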

📄 **Documentation Reference**: [Ingesting GeoJSON](https://docs.wherobots.com/#geojson-loading)

## 🖼️ Step 3: Loading Raster Data (GeoTIFF)

### What you'll learn:

* How to load raster datasets and inspect metadata.

```python theme={"system"}
# Load a GeoTIFF raster file
raster_df = sedona.read.format("binaryFile").load("s3://wherobots-examples/data/mini/NYC_3ft_Landcover.tif")
```

```python theme={"system"}
# Convert binary content to a raster object
raster_df = raster_df.selectExpr("RS_FromGeoTiff(content) as raster")
```
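To inspect the metadata of the raster created above, Sedona's raster SQL functions can be applied to the `raster` column. The sketch below only assembles the expression strings, so it runs without a Spark session. `RS_Width`, `RS_Height`, `RS_NumBands`, and `RS_SRID` are Sedona functions, while the helper name and the column aliases are illustrative.

```python
def raster_metadata_exprs(raster_col: str = "raster") -> list:
    """Return selectExpr-style strings that describe a raster column."""
    functions = {
        "width": "RS_Width",         # raster width in pixels
        "height": "RS_Height",       # raster height in pixels
        "num_bands": "RS_NumBands",  # number of bands
        "srid": "RS_SRID",           # spatial reference identifier
    }
    return [f"{fn}({raster_col}) AS {alias}" for alias, fn in functions.items()]


exprs = raster_metadata_exprs()
print(exprs)
```

In the notebook, `raster_df.selectExpr(*exprs).show()` would evaluate these expressions against the loaded GeoTIFF.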

📄 **Documentation Reference**: [Loading Raster Data](https://docs.wherobots.com/#raster-loading)

## 🗺️ Step 4: Loading Overture Maps Data

### What you'll learn:

* How to load and query datasets provided by Overture Maps.

```python theme={"system"}
# Load Overture Maps building dataset
buildings_df = sedona.read.format("iceberg").load("wherobots_open_data.overture_maps_foundation.buildings_building")
```

```python theme={"system"}
# Filter by geometry (example: buildings that intersect a bounding box)
# Relies on `f` (pyspark.sql.functions), imported in Step 2
bbox_wkt = '''POLYGON((-122.5 37.0, -122.5 37.5, -121.5 37.5, -121.5 37.0, -122.5 37.0))'''
buildings_filtered = buildings_df.where(ST_Intersects("geometry", f.expr(f'''ST_GeomFromText('{bbox_wkt}')''')))
```
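The bounding-box polygon above lists its corners by hand, and the ring has to close on its starting point. As a sketch, a hypothetical helper such as `bbox_to_wkt` can derive the same WKT string from the four bounds, which makes it harder to mistype a corner:

```python
def bbox_to_wkt(min_lon: float, min_lat: float, max_lon: float, max_lat: float) -> str:
    """Return a closed WKT polygon for an axis-aligned bounding box."""
    if min_lon > max_lon or min_lat > max_lat:
        raise ValueError("minimum bounds must not exceed maximum bounds")
    corners = [
        (min_lon, min_lat),
        (min_lon, max_lat),
        (max_lon, max_lat),
        (max_lon, min_lat),
        (min_lon, min_lat),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON(({ring}))"


bbox_wkt = bbox_to_wkt(-122.5, 37.0, -121.5, 37.5)
print(bbox_wkt)
```

The resulting string matches the hand-written `bbox_wkt` and can be passed to `ST_GeomFromText` in the same way.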

```python theme={"system"}
# Show results
buildings_filtered.show()
```

## 🔮 Next Steps

In this notebook, we demonstrated how to:

1. Load GeoParquet, GeoJSON, Shapefiles, and raster data into Wherobots.
2. Query spatial data using basic spatial operations.
3. Integrate datasets directly from S3 and Overture Maps.

### What’s next?

* Explore **spatial transformations** like buffering or intersecting geometries.
* Perform **spatial joins** for more advanced analytics.
* Visualize query results with **SedonaKepler** or **SedonaPyDeck**.

For further details, check out the [Wherobots Documentation](https://docs.wherobots.com).
