
Wherobots Open Data

Wherobots collects and maintains open datasets from various data sources for use by Wherobots Cloud users. These datasets are cleaned and transformed into the Havasu format for fast and efficient analytics with WherobotsDB in Wherobots Cloud.

These datasets are available for free within Wherobots Cloud, with a subset of them reserved for our Professional or Enterprise Edition users. If you are interested in upgrading your plan, see Upgrade Organization or Wherobots pricing.

Open data catalogs

Wherobots open data is available through two catalogs: wherobots_open_data for Community Edition datasets (available to all, including Professional Edition users), and wherobots_pro_data for Professional Edition datasets.

| Dataset name | Availability in Wherobots | Type | Count | Description |
|---|---|---|---|---|
| Overture Maps buildings/building | Community Edition | Polygon | 785 million | Any human-made structures with roofs or interior spaces |
| Overture Maps places/place | Community Edition | Point | 59 million | Any business or point of interest in the world |
| Overture Maps admins/administrativeBoundary | Community Edition | LineString | 96 thousand | Any officially defined border between two Administrative Localities |
| Overture Maps admins/locality | Community Edition | Point | 2948 | Countries and hierarchical subdivisions of countries |
| Overture Maps transportation/connector | Community Edition | Point | 330 million | Points of physical connection between two or more segments |
| Overture Maps transportation/segment | Community Edition | LineString | 294 million | Center-line of a path which may be traveled |
| Google & Microsoft open buildings | Professional or Enterprise Edition | Polygon | 2.5 billion | Google & Microsoft Open Buildings, combined by VIDA |
| LandSAT surface temperature | Professional or Enterprise Edition | Raster (GeoTIFF) | 166K images, 10 TB | The temperature of the Earth's surface in Kelvin, from Aug 2023 to Oct 2023 |
| US Census ZCTA codes | Professional or Enterprise Edition | Polygon | 33144 | ZIP Code Tabulation Areas defined in 2018 |
| NYC TLC taxi trip records | Professional or Enterprise Edition | Point | 200 million | NYC TLC taxi trip pickup and dropoff records per trip |
| Open Street Maps all nodes | Professional or Enterprise Edition | Point | 8 billion | All the nodes of the OpenStreetMap Planet dataset |
| Open Street Maps postal codes | Professional or Enterprise Edition | Polygon | 154 thousand | Boundaries of postal code areas as defined in OpenStreetMap |
| Weather events | Professional or Enterprise Edition | Point | 8.6 million | Events such as rain, snow, and storms, from 2016 to 2022 |
| Wild fires | Professional or Enterprise Edition | Point | 1.8 million | Wildfires that occurred in the United States from 1992 to 2015 |

Accessing open data

Catalogs for the open data your account has access to are automatically configured in your environment's SedonaContext, and can be directly referenced by the following format: CATALOG_NAME.DATABASE_NAME.TABLE_NAME.

You can read these tables by passing the fully qualified name as a string: sedona.table("CATALOG_NAME.DATABASE_NAME.TABLE_NAME").show().
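The three-part naming format can be wrapped in a small helper. The following is a minimal sketch, not part of the Wherobots API: the function names qualified_table_name and preview_table are hypothetical, and sedona is assumed to be the SedonaContext that is preconfigured in your Wherobots environment.

```python
def qualified_table_name(catalog: str, database: str, table: str) -> str:
    """Join the name parts into CATALOG_NAME.DATABASE_NAME.TABLE_NAME."""
    parts = (catalog, database, table)
    for part in parts:
        # Each part must be a single, non-empty identifier without dots.
        if not part or "." in part:
            raise ValueError(f"invalid name part: {part!r}")
    return ".".join(parts)


def preview_table(sedona, catalog: str, database: str, table: str, rows: int = 5) -> None:
    """Print the first `rows` rows of an open data table.

    `sedona` is assumed to be the preconfigured SedonaContext (hypothetical
    parameter name; nothing here creates or configures the context).
    """
    sedona.table(qualified_table_name(catalog, database, table)).show(rows)
```

In a Wherobots notebook, calling preview_table(sedona, "wherobots_pro_data", "weather", "weather_events") would read the table and print its first five rows, provided your account has access to that catalog.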

Inspecting open data catalogs

You can inspect the existing databases and tables in a catalog as follows:

Show database names

sedona.sql("SHOW SCHEMAS IN wherobots_pro_data").show()
+----------------+
|       namespace|
+----------------+
|google_microsoft|
|         landsat|
|        nyc_taxi|
|             osm|
|       us_census|
|         weather|
+----------------+

Show table names

Using the weather database as an example:

sedona.sql("SHOW TABLES IN wherobots_pro_data.weather").show()
+---------+--------------+-----------+
|namespace|     tableName|isTemporary|
+---------+--------------+-----------+
|  weather|weather_events|      false|
|  weather|    wild_fires|      false|
+---------+--------------+-----------+
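The two statements above can be combined to list every table in a catalog at once. This is a sketch under assumptions: list_catalog_tables is a hypothetical helper, sedona is assumed to be the preconfigured SedonaContext, and the column names namespace and tableName are taken from the outputs shown above.

```python
from typing import List


def list_catalog_tables(sedona, catalog: str) -> List[str]:
    """Return the fully qualified names of all tables in `catalog`, sorted."""
    names = []
    # SHOW SCHEMAS returns one row per database, in a column named "namespace".
    schemas = sedona.sql(f"SHOW SCHEMAS IN {catalog}").collect()
    for schema_row in schemas:
        database = schema_row["namespace"]
        # SHOW TABLES returns one row per table, with the name in "tableName".
        tables = sedona.sql(f"SHOW TABLES IN {catalog}.{database}").collect()
        for table_row in tables:
            names.append(f"{catalog}.{database}.{table_row['tableName']}")
    return sorted(names)
```

Each returned name can be passed directly to sedona.table(...). The helper issues one SHOW TABLES statement per database, which is fine for catalogs with a handful of databases like the ones shown here.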

Show table schema and content

Using weather.weather_events as an example:

sedona.table("wherobots_pro_data.weather.weather_events").printSchema()
root
 |-- EventId: string (nullable = true)
 |-- Type: string (nullable = true)
 |-- Severity: string (nullable = true)
 |-- StartTime(UTC): string (nullable = true)
 |-- EndTime(UTC): string (nullable = true)
 |-- Precipitation(in): string (nullable = true)
 |-- TimeZone: string (nullable = true)
 |-- AirportCode: string (nullable = true)
 |-- LocationLat: string (nullable = true)
 |-- LocationLng: string (nullable = true)
 |-- City: string (nullable = true)
 |-- County: string (nullable = true)
 |-- State: string (nullable = true)
 |-- ZipCode: string (nullable = true)
 |-- geometry: geometry (nullable = true)
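In this schema every attribute column is a string, and several column names contain parentheses, such as Precipitation(in). In Spark SQL such names must be quoted with backticks. The sketch below builds a query that casts selected columns to numeric types; quote_identifier, build_cast_query, and the alias names are hypothetical, and whether a given cast succeeds depends on the actual values in the table, which are not shown here.

```python
from typing import Iterable, Sequence, Tuple


def quote_identifier(name: str) -> str:
    """Backtick-quote a column name so Spark SQL accepts characters like parentheses."""
    return "`" + name.replace("`", "``") + "`"


def build_cast_query(
    table: str,
    keep: Iterable[str],
    casts: Sequence[Tuple[str, str, str]],
) -> str:
    """Build a SELECT that keeps some columns as-is and casts others.

    `casts` holds (column, sql_type, alias) tuples; the aliases are
    illustrative names chosen by the caller, not columns in the dataset.
    """
    select_items = [quote_identifier(column) for column in keep]
    for column, sql_type, alias in casts:
        select_items.append(f"CAST({quote_identifier(column)} AS {sql_type}) AS {alias}")
    return f"SELECT {', '.join(select_items)} FROM {table}"


query = build_cast_query(
    "wherobots_pro_data.weather.weather_events",
    keep=["EventId", "Type", "geometry"],
    casts=[
        ("LocationLat", "DOUBLE", "lat"),
        ("LocationLng", "DOUBLE", "lng"),
        ("Precipitation(in)", "DOUBLE", "precipitation_in"),
    ],
)
```

The resulting string can be passed to sedona.sql(query) in an environment where the preconfigured SedonaContext is available. The geometry column is kept unchanged so that spatial functions can still be applied to the result.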

Use case notebooks

We provide use case notebooks that demonstrate how you can link your data to the physical world and drive insights. Professional Edition users can execute these notebooks on Wherobots Cloud.

Overviews of these notebooks are as follows.