Save as txt files
To save a Spatial DataFrame to permanent storage such as Hive tables or HDFS, convert each geometry in the Geometry type column back to a plain String and save the resulting plain DataFrame wherever you want. Use ST_AsText to convert the Geometry column in a DataFrame back to a WKT string column. ST_AsGeoJSON is also available. We would like to invite you to contribute more functions.
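As a minimal sketch of this conversion in Python, assuming a Sedona-enabled Spark session in which ST_AsText and ST_AsGeoJSON are registered. The helper names, the CSV target, and the output path are hypothetical and only serve the illustration:

```python
def geometry_to_text_exprs(columns, geometry_col, fmt="wkt"):
    """Build selectExpr strings that replace one Geometry column with a string column."""
    functions = {"wkt": "ST_AsText", "geojson": "ST_AsGeoJSON"}
    func = functions[fmt]
    return [
        f"{func}({name}) AS {name}" if name == geometry_col else name
        for name in columns
    ]


def save_as_text(df, geometry_col, path):
    """Convert the Geometry column to WKT, then write a plain CSV (assumed target)."""
    plain_df = df.selectExpr(*geometry_to_text_exprs(df.columns, geometry_col))
    plain_df.write.mode("overwrite").option("header", "true").csv(path)
```

Calling `save_as_text(df, "geom", "/tmp/output_wkt")` would write every column unchanged except `geom`, which is stored as WKT text. Column names containing special characters would need backtick quoting, which this sketch does not handle.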
Save GeoParquet
WherobotsDB can directly save a DataFrame with the Geometry column as a GeoParquet file. You need to specify geoparquet as the write format. The Geometry type will be preserved in the GeoParquet file.
CRS Metadata
WherobotsDB supports writing GeoParquet files with a custom GeoParquet spec version and crs. The default GeoParquet spec version is 1.0.0 and the default crs is null. You can specify the GeoParquet spec version and crs with the geoparquet.version and geoparquet.crs options.
The value of geoparquet.crs and geoparquet.crs.<column_name> can be one of the following:
- "null": Explicitly sets the crs field to null. This is the default behavior.
- "" (empty string): Omits the crs field. This implies that the CRS is OGC:CRS84 for CRS-aware implementations.
- "{...}" (PROJJSON string): The crs field will be set as the PROJJSON object representing the Coordinate Reference System (CRS) of the geometry.
The GeoParquet reader does not set the SRID of geometries from the crs metadata, so you will have to set it with ST_SetSRID after reading the file.
The GeoParquet writer does not use the SRID field of a geometry, so you always have to set the geoparquet.crs option manually when writing the file if you want to write a meaningful CRS field.
For the same reason, the WherobotsDB GeoParquet reader and writer do NOT check the axis order (lon/lat or lat/lon) and assume that users handle it themselves when writing and reading the files. You can always use ST_FlipCoordinates to swap the axis order of your geometries.
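The CRS options described in this section can be sketched in Python as follows. The helper names are hypothetical, and the PROJJSON value is whatever complete PROJJSON object describes your data (it is not validated here). The sketch only builds the option map and passes it to the geoparquet writer:

```python
import json


def _crs_value(crs):
    # A dict is serialized to a PROJJSON string; the strings "null" and ""
    # are passed through unchanged.
    return json.dumps(crs) if isinstance(crs, dict) else crs


def geoparquet_crs_options(version="1.0.0", default_crs=None, column_crs=None):
    """Build writer options for the spec version and CRS metadata."""
    options = {"geoparquet.version": version}
    if default_crs is not None:
        options["geoparquet.crs"] = _crs_value(default_crs)
    # Per-column overrides use the geoparquet.crs.<column_name> form.
    for name, crs in (column_crs or {}).items():
        options[f"geoparquet.crs.{name}"] = _crs_value(crs)
    return options


def save_geoparquet(df, path, options):
    """Write the DataFrame with the geoparquet format and the given options."""
    df.write.format("geoparquet").options(**options).save(path)
```

Because the writer does not derive the CRS from the SRID, the PROJJSON has to be supplied explicitly, for example `save_geoparquet(df, path, geoparquet_crs_options(default_crs=my_projjson_dict))`, where `my_projjson_dict` is a placeholder for your own PROJJSON.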
Covering Metadata
WherobotsDB supports writing the covering field to geometry column metadata. The covering field specifies a bounding box column to help accelerate spatial data retrieval. The bounding box column should be a top-level struct column containing xmin, ymin, xmax, ymax columns. If the DataFrame you are writing contains such a column, you can specify the .option("geoparquet.covering.<geometryColumnName>", "<coveringColumnName>") option to write covering metadata to GeoParquet files.
If the DataFrame has only one geometry column, you can specify the geoparquet.covering option and omit the geometry column name.
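A minimal Python sketch of writing covering metadata, assuming the Sedona functions ST_XMin, ST_YMin, ST_XMax, and ST_YMax are registered in the session. The helper names and the default column names `geometry` and `bbox` are hypothetical:

```python
def bbox_struct_expr(geometry_col, bbox_col):
    """SQL expression for a top-level struct column with xmin, ymin, xmax, ymax."""
    return (
        f"named_struct('xmin', ST_XMin({geometry_col}), 'ymin', ST_YMin({geometry_col}), "
        f"'xmax', ST_XMax({geometry_col}), 'ymax', ST_YMax({geometry_col})) AS {bbox_col}"
    )


def covering_option(bbox_col, geometry_col=None):
    """Return the (key, value) pair for the covering option.

    With geometry_col=None the short form is used, which only applies when
    the DataFrame has a single geometry column.
    """
    if geometry_col is None:
        return "geoparquet.covering", bbox_col
    return f"geoparquet.covering.{geometry_col}", bbox_col


def save_with_covering(df, path, geometry_col="geometry", bbox_col="bbox"):
    """Add a bounding box column and register it as the covering column."""
    with_bbox = df.selectExpr("*", bbox_struct_expr(geometry_col, bbox_col))
    key, value = covering_option(bbox_col, geometry_col)
    with_bbox.write.format("geoparquet").option(key, value).save(path)
```

If the DataFrame already carries a suitable bounding box struct column, the `selectExpr` step can be skipped and only the option is needed.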
Sort then Save GeoParquet
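A minimal Python sketch of sorting by geohash before writing, assuming ST_GeoHash is registered in the session. The helper names, the temporary column name `_geohash`, and the precision of 5 are illustrative assumptions, not recommended values:

```python
def geohash_expr(geometry_col, precision, alias="_geohash"):
    """SQL expression that computes a geohash for each geometry."""
    return f"ST_GeoHash({geometry_col}, {precision}) AS {alias}"


def sort_by_geohash_then_save(df, path, geometry_col="geometry", precision=5):
    """Sort rows by geohash so nearby geometries are stored close together."""
    sorted_df = (
        df.selectExpr("*", geohash_expr(geometry_col, precision))
        .orderBy("_geohash")
        .drop("_geohash")  # the helper column is only needed for ordering
    )
    sorted_df.write.format("geoparquet").save(path)
```

Keeping spatially close rows in the same row groups narrows the bounding box of each group, which is what lets a spatial filter skip more data.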
To maximize the performance of WherobotsDB GeoParquet filter pushdown, we suggest that you sort the data by geohash value (see ST_GeoHash) and then save it as a GeoParquet file.
Save as GeoJSON
The GeoJSON data source in WherobotsDB can be used to save a Spatial DataFrame to a single-line JSON file, with geometries written in GeoJSON format.
- If there’s a column named “geometry” with geometry type, Sedona will use this column
- Otherwise, Sedona will use the first geometry column found in the root schema
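The two column-selection rules above can be illustrated with a short Python sketch. The function `pick_geojson_geometry_column` is a hypothetical restatement of the rules and not the library's own code, and the `geojson` format name is assumed to match the data source name used by Apache Sedona:

```python
def pick_geojson_geometry_column(schema_fields):
    """Apply the selection rules to root-level (name, type_name) pairs.

    Returns the chosen column name, or None when there is no geometry column.
    """
    geometry_cols = [name for name, type_name in schema_fields if type_name == "geometry"]
    # Rule 1: a geometry-typed column named "geometry" wins.
    if "geometry" in geometry_cols:
        return "geometry"
    # Rule 2: otherwise the first geometry column in the root schema is used.
    return geometry_cols[0] if geometry_cols else None


def save_as_geojson(df, path):
    """Write the DataFrame through the GeoJSON data source (assumed format name)."""
    df.write.format("geojson").save(path)
```

If a different column should be used, renaming it to `geometry` before writing is one way to make the first rule apply.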
Save to PostGIS
Unfortunately, the Spark SQL JDBC data source doesn’t support creating geometry types in PostGIS using the ‘createTableColumnTypes’ option. Only the Spark built-in types are recognized. This means that you’ll need to manage your PostGIS schema separately from Spark. One way to do this is to create the table with the correct geometry column before writing data to it with Spark. Alternatively, you can write your data to the table using Spark and then manually alter the column to be a geometry type afterward. PostGIS uses EWKB to serialize geometries. If you convert your geometries to EWKB format in WherobotsDB, you don’t have to do any additional conversion in PostGIS.
Save to GeoPandas
A WherobotsDB DataFrame can be directly converted to a GeoPandas DataFrame.
dataframe_to_arrow method in Sedona.
Save to Snowflake / other data warehouses
Save to an AWS RDS PostGIS instance
WherobotsDB DataFrames can be saved to PostGIS tables hosted in an AWS RDS instance for persistent storage. A map of configuration and context options must be passed to establish a connection with the RDS instance. If you’re unable to establish a connection with the RDS instance, double-check that the instance is accessible from the server running this code. For more information on intra or inter VPC connection with the RDS instance, consult here. The SedonaContext object also needs to be passed a SaveMode parameter using mode, which specifies how to handle collisions with existing tables, if any.
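A minimal Python sketch of such a write through the Spark JDBC data source. It assumes that the PostgreSQL JDBC driver is on the classpath, that ST_AsEWKB is registered in the session, and that the target table already exists with a geometry column. The helper names and all connection values are hypothetical placeholders:

```python
def postgis_jdbc_options(host, port, database, table, user, password):
    """Build the option map for the Spark JDBC writer."""
    return {
        "url": f"jdbc:postgresql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
    }


def ewkb_exprs(columns, geometry_col):
    """Replace the geometry column with its EWKB form, which PostGIS uses natively."""
    return [
        f"ST_AsEWKB({name}) AS {name}" if name == geometry_col else name
        for name in columns
    ]


def save_to_postgis(df, options, geometry_col="geom", mode="append"):
    """Convert geometries to EWKB and write; mode decides how an existing table is handled."""
    ewkb_df = df.selectExpr(*ewkb_exprs(df.columns, geometry_col))
    ewkb_df.write.format("jdbc").options(**options).mode(mode).save()
```

The `append` mode is used here so that Spark does not recreate the pre-existing table. In practice the credentials would come from a secret store rather than literals.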

