Uploading data
You can upload files to Wherobots Cloud for later use within your
Jupyter notebooks and jobs. You can upload individual files directly
through your browser using the file browser, or use the AWS CLI and
the aws s3 cp command to upload files directly to Wherobots Cloud's
data warehouse. The AWS CLI approach is recommended if you need to
upload multiple files, large files, or more complex folder structures
such as partitioned Parquet files.
To upload a file from the browser, navigate to the desired folder, click the "Upload" button, and select the file you want to upload.
To upload data using the AWS CLI:
- Click the "Upload" button and follow the steps to request temporary AWS ingest credentials. A short-lived set of AWS credentials (access key ID, access key secret, and session token) will be generated for you.
- Configure your local environment with those credentials in your
~/.aws/credentials
file. Refer to the AWS CLI documentation for more information on the configuration of AWS credentials for the CLI. - Navigate to the parent folder you want to upload to, and copy the full
S3 path of the target folder by clicking the "Copy" icon on its right
hand side. The path should look like
s3://wbts-wbc-XXXX/XXXX/data/customer-XXXX/
. - From your command line, upload your files with
aws s3 cp
:
$ cat ~/.aws/credentials
[wherobots]
aws_access_key_id = ...
aws_secret_access_key = ...
aws_session_token = ...
$ aws --profile=wherobots s3 cp --recursive my-data/ s3://wbts-wbc-XXXX/XXXX/data/customer-XXXX/
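To verify the upload, you can list the objects under the same prefix with the AWS CLI's aws s3 ls command; the path below mirrors the placeholder target from the example above:
$ aws --profile=wherobots s3 ls --recursive s3://wbts-wbc-XXXX/XXXX/data/customer-XXXX/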
The data directory is accessible from within the Jupyter notebook environment via predefined environment variables:
- USER_S3_PATH: points to /data/customer-XXXX
- USER_S3_SHARED_PATH: points to /data/shared
- USER_WAREHOUSE_PATH: points to /data/customer-XXXX/warehouse
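As a minimal sketch of how these variables can be used from a notebook, the Python snippet below reads a Parquet dataset from your private prefix. It assumes Apache Sedona's SedonaContext entry point and a hypothetical my-data/ folder like the one uploaded in the CLI example above:
import os
from sedona.spark import SedonaContext

# Create (or attach to) a Sedona-enabled Spark session.
config = SedonaContext.builder().getOrCreate()
sedona = SedonaContext.create(config)

# USER_S3_PATH resolves to your private prefix, e.g. s3://wbts-wbc-XXXX/XXXX/data/customer-XXXX
data_path = os.environ["USER_S3_PATH"] + "/my-data/"  # hypothetical folder uploaded earlier

df = sedona.read.format("parquet").load(data_path)
df.show(5)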