Catalog Management¶
Catalogs are the top level namespace in Havasu. Catalogs are used to store schemas and Havasu tables in a user-specified location. Wherobots Cloud provides a default catalog named wherobots
, which is pre-configured to store data under the S3 bucket allocated for the user. Users can configure their own catalogs to store data in their own S3 buckets.
A catalog is created and named by adding a property spark.sql.catalog.(catalog-name)
with its accompanying properties.
Property | Description | Example Value |
---|---|---|
spark.sql.catalog.(catalog-name) |
The class name of the catalog implementation. | org.apache.iceberg.spark.SparkCatalog |
spark.sql.catalog.(catalog-name).type |
The type of the catalog implementation. | hadoop |
spark.sql.catalog.(catalog-name).warehouse |
The location of the catalog. | s3://path/to/your/warehouse |
spark.sql.catalog.(catalog-name).io-impl |
The class name of the file I/O implementation. | org.apache.iceberg.aws.s3.S3FileIO |
Catalog configurations are identical to the catalog configuration of Apache Iceberg on Spark. Please refer to Spark Configuration for details.
It is also possible to use the S3 bucket of your own AWS account as the table storage. Please refer to Access Your Own S3 Buckets for details.
SQL References¶
This section describes the SQL commands for exploring catalogs and schemas.
SHOW CATALOGS¶
Lists the catalogs that match an optionally supplied regular expression pattern. If no pattern is supplied then the command lists all catalogs in the metastore.
Syntax: SHOW CATALOGS [ [ LIKE ] regex_pattern ]
Example:
-- List all catalogs
> SHOW CATALOGS
+-------------+
| catalog|
+-------------+
|spark_catalog|
| wherobots|
+-------------+
-- Lists catalogs with name starting with string pattern `whero`
> SHOW CATALOGS LIKE 'whero*'
+---------+
| catalog|
+---------+
|wherobots|
+---------+
SHOW SCHEMAS¶
Lists the schemas that match an optionally supplied regular expression pattern. If no pattern is supplied then the command lists all the schemas in the catalog.
Syntax: SHOW SCHEMAS [ { FROM | IN } catalog_name ] [ [ LIKE ] regex_pattern ]
Example:
-- List all databases in catalog named `wherobots`
> SHOW SCHEMAS IN wherobots
+---------+
|namespace|
+---------+
| db_name|
| test_db|
+---------+
SHOW DATABASES¶
SHOW DATABASES
is an alias of SHOW SCHEMAS
.
SHOW TABLES¶
Returns all the tables for an optionally specified schema. Additionally, the output of this statement may be filtered by an optional matching pattern.
Syntax: SHOW TABLES [ { FROM | IN } schema_name ] [ [ LIKE ] regex_pattern ]
Example:
> SHOW TABLES IN wherobots.test_db
+---------+----------+-----------+
|namespace| tableName|isTemporary|
+---------+----------+-----------+
| test_db|test_table| false|
| test_db| taxi| false|
| test_db| table| false|
+---------+----------+-----------+
DESCRIBE TABLE¶
Returns the basic metadata information of a table. The metadata information includes column name, column type and column comment.
Syntax: { DESC | DESCRIBE } [ TABLE ] [ EXTENDED | FORMATTED ] table_name
Example:
> DESCRIBE TABLE wherobots.test_db.taxi
+--------------------+-------------+-------+
|col_name |data_type |comment|
+--------------------+-------------+-------+
|pickup |geometry |null |
|Trip_Pickup_DateTime|string |null |
|Payment_Type |string |null |
|Fare_Amt |decimal(10,0)|null |
+--------------------+-------------+-------+
SHOW CREATE TABLE¶
Returns the CREATE TABLE statement or CREATE VIEW statement that was used to create a given table or view.
Syntax: SHOW CREATE TABLE [ catalog_name. ] [ db_name. ] table_name
Example:
> SHOW CREATE TABLE wherobots.test_db.taxi
CREATE TABLE wherobots.test_db.taxi (
pickup BINARY,
Trip_Pickup_DateTime STRING,
Payment_Type STRING,
Fare_Amt DECIMAL(10,0))
USING iceberg
LOCATION 's3://path/to/warehouse/test_db/taxi'
TBLPROPERTIES (
'current-snapshot-id' = '1979130433016581613',
'format' = 'iceberg/parquet',
'format-version' = '1')