Generating PMTiles
Vector tiles provide performant rendering of map data for large vector feature datasets across large regions and zoom levels. Here’s why, and when, they should be used:
- Vector tiles are designed for use in web maps, mobile apps, and desktop GIS software.
- WherobotsDB makes it easy and affordable to generate vector tiles at a planetary scale.
- Rendering vector tiles directly gives a more responsive and scalable interactive map for large datasets than rendering feature formats (e.g., GeoJSON), and it lets developers customize the display, which is not possible with raster tiles.
In this tutorial we will create PMTiles vector tiles for rendering maps using building and road data from the Overture Maps dataset in the Wherobots Open Data Catalog.
Start a WBC Notebook¶
Follow these instructions to open a notebook in Wherobots Cloud. Once a notebook is started, you can either open the tile_generation_example notebook, located in notebooks_examples/python or notebooks_examples/scala in JupyterLab, or create a new notebook tab in JupyterLab and select the desired kernel (Python or Scala) to follow this tutorial.
Start a SedonaContext¶
As always, begin by starting a Sedona context:
from sedona.spark import SedonaContext
config = SedonaContext.builder().getOrCreate()
sedona = SedonaContext.create(config)
import org.apache.sedona.spark.SedonaContext
val config = SedonaContext.builder().getOrCreate()
val sedona = SedonaContext.create(config)
Load Feature Data¶
Create a Spatial DataFrame with a geometry column and a layer column. The geometry column contains the features to render in the map. The layer column is a string naming the group each feature belongs to. Records within the same layer can be styled together, independently of other layers. In this example, features that represent buildings are in the buildings layer and those that represent roads are in the roads layer.
The first cell below defines variables that control the area for which we generate tiles. The default is Issaquah, a small town in Washington.
from sedona.spark import *
import pyspark.sql.functions as f
# Set to False to generate tiles for the entire dataset, True to generate only for region_wkt area
filter = True
region_wkt = "POLYGON ((-122.097931 47.538528, -122.048836 47.566566, -121.981888 47.510012, -122.057076 47.506302, -122.097931 47.538528))"
filter_expression = ST_Intersects(f.col("geometry"), ST_GeomFromText(f.lit(region_wkt)))
import org.apache.spark.sql.sedona_sql.expressions.st_constructors.ST_GeomFromText
import org.apache.spark.sql.sedona_sql.expressions.st_predicates.ST_Intersects
import org.apache.spark.sql.functions.{lit, col}
// Set to false to generate tiles for the entire dataset, true to generate only for the regionWkt area
val filter = true
val regionWkt = "POLYGON ((-122.097931 47.538528, -122.048836 47.566566, -121.981888 47.510012, -122.057076 47.506302, -122.097931 47.538528))"
val filterExpression = ST_Intersects(col("geometry"), ST_GeomFromText(lit(regionWkt)))
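Before filtering a planet-scale table, it can help to sanity-check the region polygon. The following is a minimal sketch using only the Python standard library. `wkt_polygon_bbox` is a hypothetical helper, not part of Sedona or WherobotsDB, and it assumes a simple single-ring `POLYGON` with plain decimal coordinates like the one above.

```python
import re
from typing import Tuple


def wkt_polygon_bbox(wkt: str) -> Tuple[float, float, float, float]:
    """Return (min_lon, min_lat, max_lon, max_lat) for a single-ring WKT POLYGON.

    Hypothetical helper for illustration only; it does not validate the WKT.
    """
    # Pull every "x y" coordinate pair out of the WKT text.
    pairs = re.findall(r"(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)", wkt)
    if not pairs:
        raise ValueError("no coordinates found in WKT")
    lons = [float(x) for x, _ in pairs]
    lats = [float(y) for _, y in pairs]
    return min(lons), min(lats), max(lons), max(lats)


# Same region as in the tutorial cell above.
region_wkt = "POLYGON ((-122.097931 47.538528, -122.048836 47.566566, -121.981888 47.510012, -122.057076 47.506302, -122.097931 47.538528))"
min_lon, min_lat, max_lon, max_lat = wkt_polygon_bbox(region_wkt)
print(f"longitude: {min_lon} to {max_lon}")
print(f"latitude:  {min_lat} to {max_lat}")
```

If the printed ranges do not look like longitudes followed by latitudes for the intended area, the coordinate order in the WKT is probably swapped.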
Next, we create the buildings Spatial DataFrame using the Overture buildings table from the Wherobots Open Data Catalog.
buildings_df = (
    sedona.table("wherobots_open_data.overture_2024_02_15.buildings_building")
    .select(
        f.col("geometry"),
        f.lit("buildings").alias("layer"),
        f.element_at(f.col("sources"), 1).dataset.alias("source")
    )
)
buildings_df.show()
import org.apache.spark.sql.functions.element_at
val buildingsDf = sedona.table("wherobots_open_data.overture_2024_02_15.buildings_building")
  .select(
    col("geometry"),
    lit("buildings").alias("layer"),
    element_at(col("sources"), 1)("dataset").alias("source")
  )
buildingsDf.show()
Next, we create a Spatial DataFrame for our road features using the Overture transportation segment table.
roads_df = (
    sedona.table("wherobots_open_data.overture_2024_02_15.transportation_segment")
    .select(
        f.col("geometry"),
        f.lit("roads").alias("layer"),
        f.element_at(f.col("sources"), 1).dataset.alias("source")
    )
)
roads_df.show()
val roadsDf = sedona.table("wherobots_open_data.overture_2024_02_15.transportation_segment")
  .select(
    col("geometry"),
    lit("roads").alias("layer"),
    element_at(col("sources"), 1)("dataset").alias("source")
  )
roadsDf.show()
Next, we prepare a single spatial DataFrame combining our roads and buildings features.
features_df = roads_df.union(buildings_df)
if filter:
    # Keep only features that intersect the region defined earlier
    features_df = features_df.filter(filter_expression)
features_df.count()
var featuresDf = roadsDf.union(buildingsDf)
featuresDf = if (filter) featuresDf.filter(filterExpression) else featuresDf
featuresDf.count()
Create DataFrame of Tiles¶
Once we have the Spatial DataFrame ready for tile generation, we can use the vtiles.generate method to create a DataFrame containing the encoded tiles. A GenerationConfig object can optionally be provided as a second argument if a non-default configuration is desired.
from wherobots import vtiles
tiles_df = vtiles.generate(features_df)
tiles_df.show(3, 150, True)
import com.wherobots.VTiles
val tilesDf = VTiles.generate(featuresDf)
tilesDf.show(3, 150, true)
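Each encoded tile is addressed by a zoom level and an x/y index. As a rough illustration of how such an address relates to a location, here is a minimal sketch of the standard Web Mercator XYZ tiling formula. It assumes the generated tiles follow that common scheme, which this tutorial does not state explicitly, and `lonlat_to_tile` is a hypothetical helper rather than part of the vtiles API.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int):
    """Return the (x, y) index of the Web Mercator XYZ tile containing a point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the outer edges of the map stay within range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


# A point inside the Issaquah region used in this tutorial.
for zoom in (0, 8, 12):
    print(zoom, lonlat_to_tile(-122.04, 47.53, zoom))
```

Comparing the printed indices with the tile coordinates shown by `tiles_df.show` can be a quick way to check that tiles were generated for the expected area.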
Write Tiles to File Storage¶
WherobotsDB creates vector tiles in the PMTiles format by default. PMTiles is a simple, performant single-file archive format for tiles, and WherobotsDB provides an interface for configuring the archive through a PMTilesConfig object. In this example, we use a WherobotsDB feature that configures the archive directly from the feature DataFrame used to generate the tiles:
import os
full_tiles_path = os.getenv("USER_S3_PATH") + "tiles.pmtiles"
vtiles.write_pmtiles(tiles_df, full_tiles_path, features_df=features_df)
val fullTilesPath = sys.env("USER_S3_PATH") + "tiles.pmtiles"
VTiles.writePMTiles(tilesDf, fullTilesPath, featuresDf = Some(featuresDf))
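The path concatenation above relies on `USER_S3_PATH` being set and ending with a slash. The following is a minimal sketch of a more defensive way to build the path. `user_tiles_path` is a hypothetical helper, and the bucket name in the demo is a placeholder that is only used when the variable is not already set.

```python
import os


def user_tiles_path(filename: str, env_var: str = "USER_S3_PATH") -> str:
    """Join a file name onto the base path stored in an environment variable."""
    base = os.getenv(env_var)
    if not base:
        # Fail early with a clear message instead of a TypeError on None + str.
        raise RuntimeError(f"{env_var} is not set")
    # Normalize so exactly one slash separates the base path and the file name.
    return base.rstrip("/") + "/" + filename


# Placeholder value for running outside Wherobots Cloud; an existing value is kept.
os.environ.setdefault("USER_S3_PATH", "s3://example-bucket/example-prefix/")
full_tiles_path = user_tiles_path("tiles.pmtiles")
print(full_tiles_path)
```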
Visualizing Vector Tiles with leafmap¶
Leafmap is a popular geospatial visualization tool for Jupyter notebooks, and WherobotsDB provides a function that makes it easy to view the vector tiles we just generated with it. The function creates a signed URL, styles the tiles, and returns a Leafmap object. It can be used as follows:
vtiles.show_pmtiles(full_tiles_path)
Note
If you are not using the Wherobots provided bucket to store tiles, ensure CORS is enabled on the bucket. Learn how to configure this in the CORS documentation.
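As a rough illustration of what such a CORS setup might look like for an S3 bucket, the sketch below builds a configuration in the shape accepted by the S3 PutBucketCors API. It assumes that the map client reads the archive with HTTP range requests, so `GET`, `HEAD`, and the `Range` header are allowed. The origin and bucket name are placeholders, `pmtiles_cors_config` and `apply_cors` are hypothetical helpers, and the CORS documentation remains the authoritative source for the required settings.

```python
import json


def pmtiles_cors_config(allowed_origins):
    """Build an S3 CORS configuration that permits ranged reads (illustrative values)."""
    return {
        "CORSRules": [
            {
                "AllowedMethods": ["GET", "HEAD"],
                "AllowedOrigins": list(allowed_origins),
                "AllowedHeaders": ["Range", "If-Match"],
                "ExposeHeaders": ["ETag"],
                "MaxAgeSeconds": 3000,
            }
        ]
    }


def apply_cors(bucket: str, config: dict) -> None:
    """Apply the configuration with boto3 (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the rest of the sketch runs without it

    boto3.client("s3").put_bucket_cors(Bucket=bucket, CORSConfiguration=config)


# Placeholder origin; replace with the origin that serves your map.
config = pmtiles_cors_config(["https://example.com"])
print(json.dumps(config, indent=2))
# apply_cors("example-bucket", config)  # uncomment to apply to a real bucket
```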
Here is an example visualizing the PMTiles we generated. The roads layer is shown in purple and the buildings layer in green.
Visualization is only available in Python.
Quick Generation of Tiles¶
Sometimes you want to quickly visualize a massive dataset. For this purpose, WherobotsDB provides a function that generates, saves, and displays tiles in one step. In testing, this function processed 100 million features in less than 5 minutes on a Wherobots Cloud Cairo runtime. This is accomplished by limiting the features processed to 100 million and generating fewer zoom levels at a higher resolution. At high zooms, the low precision that results from the low maximum zoom may be evident.
This feature can be used as follows:
sample_tiles_path = os.getenv("USER_S3_PATH") + "sampleTiles.pmtiles"
vtiles.generate_quick_pmtiles(features_df, sample_tiles_path)
The Scala/Java API exposes the getQuickConfig method, which provides the same GenerationConfig. This can be passed to the VTiles.generate or VTiles.generatePMTiles methods for the same tile generation functionality.
val sampleTilesPath = sys.env("USER_S3_PATH") + "sampleTiles.pmtiles"
VTiles.generatePMTiles(featuresDf, sampleTilesPath, VTiles.getQuickConfig)
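To see why generating fewer zoom levels speeds things up, note that a full tile pyramid has 4^z tiles at zoom level z, so each additional zoom level roughly quadruples the number of tiles. The sketch below works through that arithmetic. The zoom ranges shown are illustrative only, since the actual range used by the quick configuration is not stated here, and the function names are hypothetical.

```python
def tiles_at_zoom(zoom: int) -> int:
    """Number of tiles in a full world pyramid at one zoom level (2^z by 2^z)."""
    return 4 ** zoom


def tiles_in_pyramid(min_zoom: int, max_zoom: int) -> int:
    """Total number of tiles across an inclusive zoom range."""
    return sum(tiles_at_zoom(z) for z in range(min_zoom, max_zoom + 1))


# Illustrative maximum zooms; not the values used by WherobotsDB.
for max_zoom in (6, 10, 14):
    print(f"zooms 0-{max_zoom}: up to {tiles_in_pyramid(0, max_zoom):,} tiles")
```

These figures are upper bounds for a worldwide pyramid. Only tiles that contain features need to be produced, so real counts depend on the data.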
As a comprehensive map application toolbox, WherobotsDB provides many scalable off-the-shelf tools. This tutorial covers only a minimal example. A detailed explanation of each tool can be found in the References.