Skip to content

Havasu (Iceberg) table management

Create Table using CREATE TABLE Command

To create a geospatial table in WherobotsDB, use sedona.sql(...) to run a CREATE TABLE command:

sedona.sql("CREATE DATABASE IF NOT EXISTS wherobots.test_db")
sedona.sql("CREATE TABLE wherobots.test_db.test_table (id bigint, data string, geom geometry)")
sedona.sql("CREATE DATABASE IF NOT EXISTS wherobots.test_db")
sedona.sql("CREATE TABLE wherobots.test_db.test_table (id bigint, data string, geom geometry)")
sedona.sql("CREATE DATABASE IF NOT EXISTS wherobots.test_db");
sedona.sql("CREATE TABLE wherobots.test_db.test_table (id bigint, data string, geom geometry)");

This will create an empty table. Notice that Havasu introduced a new data type geometry to represent geospatial data. Data in columns with geometry type will be loaded as GeometryUDT values in WherobotsDB, user can use any ST_ functions provided by WherobotsDB to manipulate the geospatial data.

Create Table using DataFrame

User can also create a table using a DataFrame. The geometry column in the DataFrame will goes into the GEOMETRY column in the created geospatial table.

df.writeTo("wherobots.test_db.test_table").create()
df.writeTo("wherobots.test_db.test_table").create()
df.writeTo("wherobots.test_db.test_table").create();

Havasu supports a full range of SQL DDL and DML commands which Apache Iceberg supports, such as ALTER TABLE, DROP TABLE, etc.

Writing Geometry Data

Once your table is created, you can insert data using INSERT INTO:

sedona.sql("INSERT INTO wherobots.test_db.test_table VALUES (1, 'a', ST_GeomFromText('POINT (1 2)')), (2, 'b', ST_Point(2, 3))")
sedona.sql("INSERT INTO wherobots.test_db.test_table VALUES (1, 'a', ST_GeomFromText('POINT (1 2)')), (2, 'b', ST_Point(2, 3))")
sedona.sql("INSERT INTO wherobots.test_db.test_table VALUES (1, 'a', ST_GeomFromText('POINT (1 2)')), (2, 'b', ST_Point(2, 3))");

You may have noticed that we use ST_GeomFromText and ST_Point provided by WherobotsDB to construct geometry values and directly inserted them into the Havasu table.

Users can also write a DataFrame containing geometry data to a Havasu table:

df.writeTo("wherobots.db_name.table_name").create()
df.writeTo("wherobots.db_name.table_name").create()
df.writeTo("wherobots.db_name.table_name").create();

Or insert data into an existing Havasu table:

df.writeTo("wherobots.db_name.table_name").append()
df.writeTo("wherobots.db_name.table_name").append()
df.writeTo("wherobots.db_name.table_name").append();

Havasu also supports a range of table update operations, such as UPDATE, DELETE, MERGE INTO, etc. Please refer to Apache Iceberg - Spark Writes for details.

Reading Geometry Table

Havasu tables can be queried using SQL SELECT statements:

df = sedona.sql("SELECT * FROM wherobots.db_name.table_name")
df = sedona.sql("SELECT * FROM wherobots.db_name.table_name")
Dataset<Row> df = sedona.sql("SELECT * FROM wherobots.db_name.table_name");

Or using DataFrame API:

df = sedona.table("wherobots.db_name.table_name")
df = sedona.table("wherobots.db_name.table_name")
Dataset<Row> df = sedona.table("wherobots.db_name.table_name");

Havasu also supports time travel and inspecting metadata of table. Please refer to Apache Iceberg - Queries for details.

Processing Geometry Data using Spatial SQL

Users can use any ST_ functions provided by WherobotsDB to manipulate geometry data read from Havasu.

# Read and manipulate geospatial data in a Havasu table
df = sedona.table("wherobots.db_name.table_name")
df.withColumn("wkt", expr("ST_AsText(geom)")).show()
df2 = df.withColumn("geom", expr("ST_Buffer(geom, 0.01)"))

# Write geospatial data back to a Havasu table
df2.writeTo("wherobots.db_name.table_name").append()
// Read and manipulate geospatial data in a Havasu table
val df = sedona.table("wherobots.db_name.table_name")
df.withColumn("wkt", expr("ST_AsText(geom)")).show()
val df2 = df.withColumn("geom", expr("ST_Buffer(geom, 0.01)"))

// Write geospatial data back to a Havasu table
df2.writeTo("wherobots.db_name.table_name").append()
// Read and manipulate geospatial data in a Havasu table
Dataset<Row> df = sedona.table("wherobots.db_name.table_name");
df.withColumn("wkt", expr("ST_AsText(geom)")).show();
Dataset<Row> df2 = df.withColumn("geom", expr("ST_Buffer(geom, 0.01)"));

// Write geospatial data back to a Havasu table
df2.writeTo("wherobots.db_name.table_name").append();

Further Reading

For more information, please refer to Geometry Support in Havasu.


Last update: May 20, 2024 07:48:29