TileDB Cloud URIs
Introduction
TileDB Cloud provides a secure and convenient way to access data stored in remote object stores, with Amazon S3 being the preferred option for TileDB Cloud SaaS. It uses a specialized URI (Uniform Resource Identifier) scheme that can be used to create, access, and manage any of your assets. With these URIs, TileDB Cloud handles authentication automatically, so users do not need to configure AWS IAM roles directly. This document provides an overview of TileDB Cloud URIs and how to use them.
URI format for asset creation
When creating a new array, group, or SOMA experiment on TileDB Cloud, you need to use a special creation URI in the following format:
tiledb://<namespace>/s3://<bucket>/<name>
where:
- <namespace> is your TileDB username or organization name.
- <bucket> is the S3 bucket where the asset will be physically stored.
- <name> is the name of the asset.
The following examples use tiledb-inc as the namespace and demo-data as the S3 bucket. In practice, you would replace these with your actual values.
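As a minimal sketch, the creation URI can be assembled from its three parts with plain string formatting. The helper name build_creation_uri is hypothetical and not part of any TileDB API; it only illustrates the format described above.

```python
def build_creation_uri(namespace: str, bucket: str, name: str) -> str:
    """Assemble a TileDB Cloud creation URI.

    Hypothetical helper for illustration; not part of any TileDB library.
    """
    for label, value in (("namespace", namespace), ("bucket", bucket), ("name", name)):
        if not value:
            raise ValueError(f"{label} must be a non-empty string")
    # Format: tiledb://<namespace>/s3://<bucket>/<name>
    return f"tiledb://{namespace}/s3://{bucket}/{name}"


creation_uri = build_creation_uri("tiledb-inc", "demo-data", "pbmc3k")
print(creation_uri)
```

The resulting string can be passed wherever a creation URI is expected, such as the experiment_uri argument in the example that follows.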
Example:
To create a new SOMA experiment named pbmc3k, you could use the following command:
    import tiledbsoma.io

    # pbmc3k is assumed to be an AnnData object loaded beforehand
    tiledbsoma.io.from_anndata(
        experiment_uri="tiledb://tiledb-inc/s3://demo-data/pbmc3k",
        measurement_name="RNA",
        anndata=pbmc3k,
    )
This command will:
- Physically create the new TileDB SOMA experiment at s3://demo-data/pbmc3k.
- Register the experiment on TileDB Cloud under the tiledb-inc namespace and assign it a UUID.
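To make the two roles of a creation URI concrete, the sketch below splits one into the namespace it is registered under and the physical storage location. The names CreationUriParts and split_creation_uri are hypothetical; this is plain string handling for illustration, not how TileDB Cloud itself parses URIs.

```python
from typing import NamedTuple


class CreationUriParts(NamedTuple):
    namespace: str    # TileDB Cloud namespace the asset is registered under
    storage_uri: str  # physical location of the asset in the object store


def split_creation_uri(uri: str) -> CreationUriParts:
    """Split tiledb://<namespace>/s3://<bucket>/<name> into its two roles.

    Hypothetical helper for illustration; not part of any TileDB library.
    """
    prefix = "tiledb://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a TileDB Cloud URI: {uri!r}")
    # Everything up to the first "/" is the namespace; the rest is the storage URI.
    namespace, _, storage_uri = uri[len(prefix):].partition("/")
    if not namespace or not storage_uri.startswith("s3://"):
        raise ValueError(f"not a creation URI: {uri!r}")
    return CreationUriParts(namespace, storage_uri)


parts = split_creation_uri("tiledb://tiledb-inc/s3://demo-data/pbmc3k")
print(parts.namespace)
print(parts.storage_uri)
```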
Accessing assets
After an asset is created and registered, it appears in TileDB Cloud’s data catalog under the specified namespace. On the asset’s landing page, you will find two URIs:
- tiledb://tiledb-inc/pbmc3k (assuming pbmc3k is unique in your namespace)
- tiledb://tiledb-inc/<uuid> (where <uuid> is a unique identifier for the asset)
You can access this experiment using either of these URIs, or with the original creation URI (i.e., tiledb://tiledb-inc/s3://demo-data/pbmc3k):
    import tiledbsoma

    # Name-based URI
    with tiledbsoma.Experiment.open("tiledb://tiledb-inc/pbmc3k") as exp:
        exp.obs.read().concat()

    # UUID-based URI
    with tiledbsoma.Experiment.open(
        "tiledb://tiledb-inc/078007ae-49c0-46b8-8924-3cc625615ddb"
    ) as exp:
        exp.obs.read().concat()

    # S3-based (creation) URI
    with tiledbsoma.Experiment.open("tiledb://tiledb-inc/s3://demo-data/pbmc3k") as exp:
        exp.obs.read().concat()
While all three forms of TileDB Cloud URI offer secure access to the asset, each form has its own advantages. The following list suggests when to use each form:
- Name-based URI: A convenient, human-readable format.
- UUID-based URI: Necessary if your namespace contains more than one asset named pbmc3k.
- S3-based URI: Useful if you intend to add new nested assets to a TileDB group (this also includes SOMA experiments and their constituent collections).
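The three forms can be told apart from the string alone. The following is a rough sketch of such a check; classify_tiledb_uri is a hypothetical helper, and it is a heuristic based only on the shape of the URI. Whether a name is actually unique within a namespace is something only TileDB Cloud knows.

```python
import uuid


def classify_tiledb_uri(uri: str) -> str:
    """Return "s3", "uuid", or "name" for a TileDB Cloud URI.

    Hypothetical helper for illustration; not part of any TileDB library.
    """
    prefix = "tiledb://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a TileDB Cloud URI: {uri!r}")
    namespace, _, rest = uri[len(prefix):].partition("/")
    if not namespace or not rest:
        raise ValueError(f"missing namespace or asset: {uri!r}")
    if rest.startswith("s3://"):
        return "s3"
    try:
        parsed = uuid.UUID(rest)
    except ValueError:
        return "name"
    # uuid.UUID also accepts unhyphenated or braced input, so require
    # the canonical hyphenated form before calling it a UUID-based URI.
    return "uuid" if str(parsed) == rest.lower() else "name"


for example in (
    "tiledb://tiledb-inc/pbmc3k",
    "tiledb://tiledb-inc/078007ae-49c0-46b8-8924-3cc625615ddb",
    "tiledb://tiledb-inc/s3://demo-data/pbmc3k",
):
    print(example, "->", classify_tiledb_uri(example))
```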
Conclusion
Understanding the URI schemes in TileDB Cloud is essential for efficient and secure data management. Each URI variant serves a particular use case and has its advantages, from human-readability to disambiguation and organizational logic.