# Import necessary libraries
import os
import numpy as np
import tiledb.vector_search as vs
import tiledb
# You should set the appropriate environment variables with your keys.
# Get the keys from the environment variables.
= os.environ["AWS_ACCESS_KEY_ID"]
aws_access_key_id = os.environ["AWS_SECRET_ACCESS_KEY"]
aws_secret_access_key
# Get the bucket and region from environment variables
= os.environ["S3_BUCKET"]
s3_bucket = os.environ["S3_REGION"]
s3_region
# Set the AWS keys and region to the config of the default context
# This context initialization can be performed only once.
cfg = tiledb.Config(
    {
        "vfs.s3.aws_access_key_id": aws_access_key_id,
        "vfs.s3.aws_secret_access_key": aws_secret_access_key,
        "vfs.s3.region": s3_region,
    }
)
ctx = tiledb.Ctx(cfg)
# Set index URI
= "basic_s3"
index_name = s3_bucket + "/" + index_name
index_uri
# Clean up previous data
if tiledb.object_type(index_uri, ctx=ctx) == "group":
    with tiledb.Group(index_uri, "m") as g:
        g.delete(recursive=True)
Basic S3 Example
We recommend running this tutorial, as well as the other tutorials in the Tutorials section, inside TileDB Cloud. This will allow you to experiment quickly, avoiding all the installation, deployment, and configuration hassles. Sign up for the free tier, spin up a TileDB Cloud notebook with a Python kernel, and follow the tutorial instructions. If you wish to learn how to run tutorials locally on your machine, read the Tutorials: Running Locally tutorial.
This tutorial demonstrates how to use TileDB-Vector-Search to store a vector index on S3 and query it efficiently without needing to download it locally. For more information on how TileDB works efficiently on object stores, visit the Array Key Concepts: Object Stores section.
Setup
Working with vector indexes on S3 differs from working with local indexes in only two ways:
- Set the appropriate AWS credentials in environment variables and load them into a configuration object in a TileDB context.
- Use an s3:// URI instead of a local path for the index location.
Other than the above, all operations are identical to those on local indexes.
First, load the appropriate libraries, set the AWS credentials in a context, specify the index S3 URI, and delete any previously created index with the same URI, as shown in the setup code above.
Ingestion and querying
Create an empty IVF_FLAT index.
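Here is a minimal sketch of this step. It uses the ivf_flat_index module of TileDB-Vector-Search; the vector dimensionality, the vector type, and the config argument used to pass the S3 credentials are illustrative assumptions, not values prescribed by this tutorial.
# A minimal sketch: create an empty IVF_FLAT index at the S3 URI.
# The dimensionality and vector type below are illustrative assumptions.
index = vs.ivf_flat_index.create(
    uri=index_uri,
    dimensions=3,
    vector_type=np.dtype(np.float32),
    config=cfg.dict(),  # pass the S3 credentials to the index
)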
Add some vectors to the index.
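A possible sketch of this step, using the update_batch method of the index object created above; the sample vectors and external IDs are made up for illustration.
# Illustrative sample data: three 3-dimensional float32 vectors with explicit external IDs.
vectors = np.array(
    [[1.0, 1.1, 1.2], [2.0, 2.1, 2.2], [3.0, 3.1, 3.2]], dtype=np.float32
)
index.update_batch(vectors=vectors, external_ids=np.array([1, 2, 3], dtype=np.uint64))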
Query the index.
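A sketch of a query using the index's query method, which returns the distances and external IDs of the k nearest neighbors; the query vector and the value of k are arbitrary examples.
# Search for the 2 nearest neighbors of a single query vector.
distances, ids = index.query(np.array([[2.0, 2.1, 2.2]], dtype=np.float32), k=2)
print(ids)        # external IDs of the nearest vectors
print(distances)  # corresponding distances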
Clean up
Clean up in the end by removing the index:
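One way to do this is to reuse the same group-deletion approach as in the setup code at the top of this tutorial:
# Remove the index group and all the arrays it contains.
if tiledb.object_type(index_uri, ctx=ctx) == "group":
    with tiledb.Group(index_uri, "m") as g:
        g.delete(recursive=True)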