Work with Configuration Objects

Learn how to perform various operations on TileDB configuration objects.
How to run this tutorial

You can run this tutorial in two ways:

  1. Locally on your machine.
  2. On TileDB Cloud.

However, since TileDB Cloud has a free tier, we strongly recommend that you sign up and run everything there, as it requires no installation or deployment.

This tutorial explains how to set configuration parameters in a configuration object, as well as read them back. Configuration objects are passed to different TileDB objects, such as arrays, groups, and more.

Basic configuration

You can set, unset, and read configuration parameters as shown here.

  • Python
  • R
import tiledb

# Create a configuration object
config = tiledb.Config()

# Set a configuration parameter
config["sm.consolidation.buffer_size"] = 55000000

# Config objects may also be initialized with a dictionary
# containing multiple values
config = tiledb.Config(
    {"sm.consolidation.buffer_size": 45000000, "sm.tile_cache_size": 900000000}
)

# Remove a configuration parameter (resets to the default value)
# NOTE: configuration objects are *immutable* after application to a Context
del config["sm.consolidation.buffer_size"]

print("sm.tile_cache_size:", config["sm.tile_cache_size"])
print("sm.consolidation.buffer_size:", config["sm.consolidation.buffer_size"])
sm.tile_cache_size: 900000000
sm.consolidation.buffer_size: 50000000
library(tiledb)

# Create a configuration object
config <- tiledb_config()

# Set a configuration parameter
config["sm.consolidation.buffer_size"] <- "55000000"

# Config objects may also be initialized with a vector
# containing multiple values
config <- tiledb_config(
  config = c(
    "sm.consolidation.buffer_size" = "45000000",
    "sm.tile_cache_size" = "900000000"
  )
)

# Get a configuration parameter
tile_cache_size <- config["sm.tile_cache_size"]

# Unset a configuration parameter
invisible(tiledb_config_unset(config, "sm.consolidation.buffer_size"))

cat(paste("sm.tile_cache_size:", config["sm.tile_cache_size"]), "\n")
cat(
  paste(
    "sm.consolidation.buffer_size:",
    config["sm.consolidation.buffer_size"]
  )
)
sm.tile_cache_size: 900000000 
sm.consolidation.buffer_size: 50000000

Configuration iterators

You can iterate over all configuration parameters as follows.

  • Python
  • R
print("--- Print all parameters: ---\n")
for key, value in config.items():
    print(f"'{key}': '{value}'")

# keys may optionally be filtered by passing a prefix to `items()`
print("\n--- Print parameters starting with 'sm.': ---\n")
for key, value in config.items("sm."):
    print(f"'{key}': '{value}'")
cat("--- Print all parameters: ---\n")
for (n in names(as.vector(config))) {
  if (config[n] != "") {
    cat(sprintf("'%s': '%s'", n, config[n]), "\n")
  }
}

Save configuration to file

You can save your entire configuration to a file and then load it back.

  • Python
  • R
config_file = "tiledb_config.txt"

# Save to file
config = tiledb.Config()
config["sm.consolidation.buffer_size"] = 50000001
config.save(config_file)

# Load from file
config_load = tiledb.Config.load(config_file)
print(config_load["sm.consolidation.buffer_size"])

# Clean up
import os.path

if os.path.exists(config_file):
    os.remove(config_file)
50000001
config_file <- "tiledb_config.txt"

# Save to file
config <- tiledb_config()
config["sm.consolidation.buffer_size"] <- "50000001"
invisible(tiledb_config_save(config, config_file))

# Load from file
config_load <- tiledb_config_load(config_file)
cat(config_load["sm.consolidation.buffer_size"])

# Clean up
if (file.exists(config_file)) {
  invisible(file.remove(config_file))
}
50000001

Configuration parameters

The following table summarizes all TileDB configuration parameters. Any configuration parameter suffixed with an asterisk (*) is experimental and not yet supported for production use.

General

Parameter Default value Description
"config.env_var_prefix" "TILEDB_" Prefix of environmental variables for reading configuration parameters.
"config.logging_level" "1" if the --enable-verbose bootstrap flag is passed, "0" otherwise The logging level configured. Possible values include:
- 0: FATAL
- 1: ERROR
- 2: WARN
- 3: INFO
- 4: DEBUG
- 5: TRACE
"config.logging_format" "DEFAULT" The logging format configured (DEFAULT or JSON).
"filestore.buffer_size" "104857600" (100 MB) Specifies the size in bytes of the internal buffers used in the filestore API. The size should be larger than the minimum tile size the filestore supports, which is currently 1024 bytes.
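For example, the logging parameters above could be combined to produce verbose JSON logs. A sketch as a plain dictionary (note that all configuration values are strings):

```python
# Illustrative settings drawn from the General table: TRACE-level logging
# emitted as JSON.
verbose_logging = {
    "config.logging_level": "5",      # 5 = TRACE
    "config.logging_format": "JSON",  # "DEFAULT" or "JSON"
}

# Configuration values are plain strings, so numeric levels stay quoted.
assert all(isinstance(v, str) for v in verbose_logging.values())
```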

Storage manager

Parameter Default value Description
"sm.allow_separate_attribute_writes"* false Allow separate attribute write queries.
"sm.allow_updates_experimental"* false Allow update queries. Experimental, for testing purposes only; do not use in production.
"sm.dedup_coords" false If true, cells with duplicate coordinates will be removed during sparse fragment writes. Note that ties during deduplication are broken arbitrarily. Also note that this check means that it will take longer to perform the write operation.
"sm.check_coord_dups" true This applies only if sm.dedup_coords is false. If true, TileDB will throw an error if cells exist with duplicate coordinates during sparse fragment writes. If set to false and duplicates exist, TileDB will write the duplicates without errors. Note that this check is more lightweight than the coordinate deduplication check enabled by sm.dedup_coords.
"sm.check_coord_oob" true If true, TileDB will throw an error if a cell’s coordinates exist outside the domain during sparse fragment writes.
"sm.read_range_oob" "warn" If error, TileDB will throw an error when a read range falls outside the dimension's domain. If warn, the ranges will be capped at the dimension's domain and a warning logged.
"sm.check_global_order" true Checks if the coordinates obey the global array order. Applicable only to sparse writes in global order.
"sm.enable_signal_handlers" true Controls whether TileDB will install signal handlers.
"sm.compute_concurrency_level" # cores Upper-bound on number of threads to assign for compute-bound tasks.
"sm.io_concurrency_level" # cores Upper-bound on number of threads to assign for IO-bound tasks.
"sm.memory_budget" 5368709120 (5 GB) The memory budget for tiles of fixed-sized attributes (or offsets for var-sized attributes) to be fetched during reads.
"sm.memory_budget_var" 10737418240 (10 GB) The memory budget for tiles of var-sized attributes to be fetched during reads.
"sm.partial_tile_offsets_loading"* false If true, the readers can partially load and unload the tile offsets.
"sm.fragment_info.preload_mbrs" false If true, minimum bounding rectangles (MBRs) will be loaded at the same time as the rest of the fragment info; otherwise, they will be loaded lazily when the user requests MBR-related info.
"sm.skip_checksum_validation" false Skip checksum validation on reads for the md5 and sha256 filters.
"sm.encryption_key" "" The key for encrypted arrays.
"sm.encryption_type" "NO_ENCRYPTION" The type of encryption used for encrypted arrays.
"sm.enumerations_max_size" "10485760" (10 MB) Maximum in-memory size for an enumeration. If the enumeration is var-sized, the size will include the data and the offsets.
"sm.enumerations_max_total_size" "52428800" (50 MB) Maximum in-memory size for all enumerations. If the enumeration is var-sized, the size will include the data and the offsets.
"sm.max_tile_overlap_size" "314572800" (300 MB) Maximum size for the tile overlap structure which holds information about which tiles are covered by ranges. Only used in dense reads and legacy reads.
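As an illustration of the two encryption parameters above, a configuration for an encrypted array could be sketched as follows. This is a plain dictionary with a dummy key, and AES_256_GCM is assumed to be the supported encryption type name:

```python
# Sketch: pairing sm.encryption_type with a 256-bit (32-byte) key.
# The key below is a dummy value for illustration only.
key = "0123456789abcdeF0123456789abcdeF"

encryption_config = {
    "sm.encryption_type": "AES_256_GCM",  # assumption: AES-256-GCM type name
    "sm.encryption_key": key,
}

# AES-256 requires exactly 32 bytes of key material.
assert len(key.encode()) == 32
```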

Consolidation and vacuuming

Parameter Default value Description
"sm.consolidation.mode" "fragments" The consolidation mode, one of commits, fragments, fragment_meta, array_meta, or group_meta.
"sm.consolidation.amplification" 1.0 The factor by which TileDB can amplify the size of the dense fragment resulting from consolidating a set of fragments (having at least one dense fragment).
"sm.consolidation.max_fragment_size" UINT64_MAX The size (in bytes) of the maximum on-disk fragment size that will be created by consolidation. When it’s reached, consolidation will continue the operation in a new fragment. The result will be multiple fragments, but with separate MBRs.
"sm.consolidation.steps" "4294967295" (UINT32_MAX) The number of consolidation steps to be performed when executing the consolidation algorithm.
"sm.consolidation.purge_deleted_cells" false Whether to purge deleted cells from the consolidated fragment.
"sm.consolidation.step_min_frags" UINT32_MAX The minimum number of fragments to consolidate in a single step.
"sm.consolidation.step_max_frags" UINT32_MAX The maximum number of fragments to consolidate in a single step.
"sm.consolidation.step_size_ratio" 0.0 The size ratio of two (“adjacent”) fragments to be considered for consolidation in a single step.
"sm.consolidation.timestamp_start" 0 When set, an array will be consolidated between this value and sm.consolidation.timestamp_end (inclusive). Only for fragments, array_meta, and group_meta consolidation mode.
"sm.consolidation.timestamp_end" UINT64_MAX When set, an array will be consolidated between sm.consolidation.timestamp_start and this value (inclusive). Only for fragments, array_meta, and group_meta consolidation mode.
"sm.vacuum.mode" "fragments" The vacuuming mode, one of commits, fragments, fragment_meta, array_meta, or group_meta.
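To illustrate sm.consolidation.step_size_ratio, the check it describes can be read roughly as follows. This is an illustrative sketch of the parameter's meaning, not TileDB's exact internal algorithm:

```python
def adjacent_fragments_eligible(size_a: int, size_b: int, step_size_ratio: float) -> bool:
    """Two 'adjacent' fragments qualify for one consolidation step when the
    ratio of the smaller to the larger size meets the configured threshold."""
    small, large = sorted((size_a, size_b))
    return small / large >= step_size_ratio

# With the default ratio of 0.0, any pair qualifies:
assert adjacent_fragments_eligible(1_000, 1_000_000, 0.0)
# A ratio of 0.5 rejects very unbalanced pairs:
assert not adjacent_fragments_eligible(1_000, 1_000_000, 0.5)
```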

Variable-length offsets

Parameter Default value Description
"sm.var_offsets.bitsize" 64 The size of offsets in bits to be used for offset buffers of variable-length attributes.
"sm.var_offsets.extra_element" false Add an extra element to the end of the offsets buffer of var-length attributes which will point to the end of the values buffer.
"sm.var_offsets.mode" "bytes" The offsets format (bytes or elements) to be used for variable-length attributes.
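The offset parameters above can be pictured with a small example. For the variable-length strings red, green, and blue stored back to back (values illustrative):

```python
values = b"redgreenblue"

# "sm.var_offsets.mode" = "bytes": each offset is the byte position where a
# value starts.
offsets_bytes = [0, 3, 8]

# "sm.var_offsets.extra_element" = "true": one extra offset points past the
# end of the values buffer, so lengths can be computed by subtraction alone.
offsets_extra = offsets_bytes + [len(values)]
lengths = [offsets_extra[i + 1] - offsets_extra[i] for i in range(3)]
assert lengths == [3, 5, 4]  # "red", "green", "blue"
```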

Queries

Parameter Default value Description
"sm.query.dense.reader" "refactored" Which reader to use for dense queries (“refactored” or “legacy”).
"sm.query.sparse_global_order.reader" "refactored" Which reader to use for sparse global order queries (“refactored” or “legacy”).
"sm.query.sparse_unordered_with_dups.reader" "refactored" Which reader to use for sparse unordered with duplicates queries (“refactored” or “legacy”).
"sm.query.dense.qc_coords_mode" false Dense-query option that returns only the coordinates of the cells matching a query condition, without any attribute data.

Memory

Parameter Default value Description
"sm.mem.malloc_trim" true Should malloc_trim be called on context and query destruction? This can, in some situations, reduce residual memory usage.
"sm.mem.tile_upper_memory_limit"* 1073741824 (1 GB) The upper memory limit, in bytes, used when loading tiles. For now, it's only used in the dense reader, but eventually all readers will use it. Readers use this value to limit the amount of tile data brought into memory at once, so that you don't incur performance penalties during memory movement operations. It's a soft limit that TileDB might exceed if a single tile doesn't fit into memory; TileDB will load that tile as long as it still fits within sm.mem.total_budget.
"sm.mem.total_budget" 10737418240 (10 GB) Memory budget for readers and writers (in bytes).
"sm.mem.consolidation.buffers_weight" 1 Weight used to split sm.mem.total_budget and assign to the consolidation buffers. The budget is split across 3 values: sm.mem.consolidation.buffers_weight, sm.mem.consolidation.reader_weight, and sm.mem.consolidation.writer_weight.
"sm.mem.consolidation.reader_weight" 3 Weight used to split sm.mem.total_budget and assign to the reader query. The budget is split across 3 values: sm.mem.consolidation.buffers_weight, sm.mem.consolidation.reader_weight, and sm.mem.consolidation.writer_weight.
"sm.mem.consolidation.writer_weight" 2 Weight used to split sm.mem.total_budget and assign to the writer query. The budget is split across 3 values: sm.mem.consolidation.buffers_weight, sm.mem.consolidation.reader_weight, and sm.mem.consolidation.writer_weight.
"sm.mem.reader.sparse_global_order.ratio_coords" 0.5 Ratio of the budget allotted for coordinates in the sparse global order reader.
"sm.mem.reader.sparse_global_order.ratio_tile_ranges" 0.1 Ratio of the budget allotted for tile ranges in the sparse global order reader.
"sm.mem.reader.sparse_global_order.ratio_array_data" 0.1 Ratio of the budget allotted for array data in the sparse global order reader.
"sm.mem.reader.sparse_unordered_with_dups.ratio_coords" 0.5 Ratio of the budget allotted for coordinates in the sparse unordered with duplicates reader.
"sm.mem.reader.sparse_unordered_with_dups.ratio_tile_ranges" 0.1 Ratio of the budget allotted for tile ranges in the sparse unordered with duplicates reader.
"sm.mem.reader.sparse_unordered_with_dups.ratio_array_data" 0.1 Ratio of the budget allotted for array data in the sparse unordered with duplicates reader.
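The three consolidation weights above split sm.mem.total_budget across the buffers, reader, and writer. Assuming the split is proportional to the weights (a reasonable reading of the descriptions above), the defaults of 1, 3, and 2 work out as follows:

```python
total_budget = 10_737_418_240  # sm.mem.total_budget default (10 GB)
weights = {
    "sm.mem.consolidation.buffers_weight": 1,
    "sm.mem.consolidation.reader_weight": 3,
    "sm.mem.consolidation.writer_weight": 2,
}
total_weight = sum(weights.values())  # 6

# Each component receives its proportional share of the total budget.
shares = {name: total_budget * w // total_weight for name, w in weights.items()}
assert shares["sm.mem.consolidation.reader_weight"] == 5_368_709_120  # 5 GB for the reader
```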

Groups

Parameter Default value Description
"sm.group.timestamp_start" 0 The start timestamp used for opening the group.
"sm.group.timestamp_end" 18446744073709551615 (UINT64_MAX) The end timestamp used for opening the group. Also used for the write timestamp if set.

Virtual filesystem

Parameter Default value Description
"vfs.read_ahead_size" 102400 The maximum byte size to read-ahead from the backend.
"vfs.read_ahead_cache_size" 10485760 (10 MB) The total maximum size of the read-ahead cache (in bytes), which is a Least Recently Used (LRU) cache.
"vfs.min_parallel_size" 10485760 (10 MB) The minimum number of bytes in a parallel virtual filesystem (VFS) operation (except parallel S3 writes, which are controlled by vfs.s3.multipart_part_size).
"vfs.max_batch_size" "104857600" (100 MB) The maximum number of bytes in a VFS read operation.
"vfs.min_batch_size" 20971520 (20 MB) The minimum number of bytes in a VFS read operation.
"vfs.min_batch_gap" 512000 (500 KB) The minimum number of bytes between two VFS read batches.
"vfs.read_logging_mode" "" Log read operations at varying levels of verbosity. Possible values:
- "": An empty string disables read logging.
- "fragments": Log each fragment read.
- "fragment_files": Log each individual fragment file read.
- "all_files": Log all files read.
- "all_reads": Log all files with offset and length parameters.
- "all_reads_always": Log all files with offset and length parameters on every read, not just the first read. On large arrays, the read cache can get large, so this exchanges RAM usage for increased log verbosity.
"vfs.file.posix_file_permissions" 644 Permissions to use for POSIX file system with file creation.
"vfs.file.posix_directory_permissions" 755 Permissions to use for POSIX file system with directory creation.
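The batch parameters above govern how adjacent reads are coalesced into larger VFS operations. A rough sketch of gap-based coalescing (illustrative only, not TileDB's exact logic; the threshold value is hypothetical):

```python
def coalesce(reads, gap=512_000):
    """Merge sorted (offset, length) reads whose gap is below the threshold,
    in the spirit of vfs.min_batch_gap."""
    merged = [list(reads[0])]
    for off, length in reads[1:]:
        last_off, last_len = merged[-1]
        if off - (last_off + last_len) <= gap:
            merged[-1][1] = off + length - last_off  # extend the current batch
        else:
            merged.append([off, length])
    return [tuple(m) for m in merged]

# Nearby reads merge into one batch; distant ones stay separate.
assert coalesce([(0, 100), (200, 100), (2_000_000, 100)]) == [(0, 300), (2_000_000, 100)]
```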

S3

Parameter Default value Description
"vfs.s3.region" "us-east-1" The S3 region, if S3 is enabled.
"vfs.s3.aws_access_key_id" "" The AWS_ACCESS_KEY_ID.
"vfs.s3.aws_secret_access_key" "" The AWS_SECRET_ACCESS_KEY.
"vfs.s3.aws_session_token" "" The AWS_SESSION_TOKEN.
"vfs.s3.aws_role_arn" "" The role that you want to assume (AWS_ROLE_ARN).
"vfs.s3.aws_external_id" "" Third party access ID to your resources when assuming a role (AWS_EXTERNAL_ID).
"vfs.s3.aws_load_frequency" "" Session time limit when assuming a role (AWS_LOAD_FREQUENCY).
"vfs.s3.aws_session_name" "" (Optional) Session name when assuming a role (AWS_SESSION_NAME), which you can use for tracing and bookkeeping.
"vfs.s3.scheme" "https" The S3 scheme (http or https), if S3 is enabled.
"vfs.s3.endpoint_override" "" The S3 endpoint, if S3 is enabled.
"vfs.s3.use_virtual_addressing" true The S3 use of virtual addressing (true or false), if S3 is enabled.
"vfs.s3.skip_init" "false" Skip Aws::InitAPI for the S3 layer (true or false).
"vfs.s3.use_multipart_upload" "true" The S3 use of multi-part upload requests (true or false), if S3 is enabled.
"vfs.s3.max_parallel_ops" The value of sm.io_concurrency_level The maximum number of S3 backend parallel operations.
"vfs.s3.multipart_part_size" 5242880 (5 MB) The part size (in bytes) used in S3 multipart writes. Any uint64_t value is acceptable. Note: vfs.s3.multipart_part_size * vfs.s3.max_parallel_ops bytes will be buffered before issuing multipart uploads in parallel.
"vfs.s3.ca_file" "" Path to SSL/TLS certificate file to be used by cURL for S3 HTTPS encryption. Follows cURL conventions.
"vfs.s3.ca_path" "" Path to SSL/TLS certificate directory to be used by cURL for S3 HTTPS encryption. Follows cURL conventions.
"vfs.s3.connect_timeout_ms" 10800 The connection timeout in ms. Any long value is acceptable.
"vfs.s3.connect_max_tries" 5 The maximum tries for a connection. Any long value is acceptable.
"vfs.s3.connect_scale_factor" 25 The scale factor for exponential backoff when connecting to S3. Any long value is acceptable.
"vfs.s3.custom_headers.*" Optional. No default. (Optional) Prefix for custom headers on s3 requests. For each custom header, use "vfs.s3.custom_headers.header_key" = "header_value".
"vfs.s3.logging_level" "Off" The AWS SDK logging level. This is a process-global setting. The configuration of the most recently constructed context will set process state. Log files are written to the process working directory.
"vfs.s3.request_timeout_ms" 3000 The request timeout in ms. Any long value is acceptable.
"vfs.s3.requester_pays" false Whether the requester pays for the S3 access charges.
"vfs.s3.proxy_host" "" The S3 proxy host.
"vfs.s3.proxy_port" 0 The S3 proxy port.
"vfs.s3.proxy_scheme" "http" The S3 proxy scheme.
"vfs.s3.proxy_username" "" The S3 proxy username. Note: this parameter is not serialized by tiledb_config_save_to_file.
"vfs.s3.proxy_password" "" The S3 proxy password. Note: this parameter is not serialized by tiledb_config_save_to_file.
"vfs.s3.verify_ssl" true Enable HTTPS certificate verification.
"vfs.s3.no_sign_request" false Make unauthenticated requests to S3.
"vfs.s3.sse" "" The server-side encryption algorithm to use. Supported non-empty values are "aes256" and "kms" (AWS Key Management Service).
"vfs.s3.sse_kms_key_id" "" The server-side encryption key to use if "vfs.s3.sse" is set to "kms" (AWS Key Management Service).
"vfs.s3.storage_class" "NOT_SET" The storage class to use for the newly uploaded S3 objects. The set of accepted values is found in the Aws::S3::Model::StorageClass enumeration but are included here for reference:
- "NOT_SET"
- "STANDARD"
- "REDUCED_REDUNDANCY"
- "STANDARD_IA"
- "ONEZONE_IA"
- "INTELLIGENT_TIERING"
- "GLACIER"
- "DEEP_ARCHIVE"
- "OUTPOSTS"
- "GLACIER_IR"
- "SNOW"
- "EXPRESS_ONEZONE"
"vfs.s3.bucket_canned_acl" "NOT_SET" Names of values found in Aws::S3::Model::BucketCannedACL enumeration:
- "NOT_SET"
- "private_"
- "public_read"
- "public_read_write"
- "authenticated_read"
"vfs.s3.object_canned_acl" "NOT_SET" Names of values found in Aws::S3::Model::ObjectCannedACL enumeration (The first 5 are the same as for "vfs.s3.bucket_canned_acl"):
- "NOT_SET"
- "private_"
- "public_read"
- "public_read_write"
- "authenticated_read"
The following three items are found only in Aws::S3::Model::ObjectCannedACL:
- "aws_exec_read"
- "owner_read"
- "bucket_owner_full_control"
"vfs.s3.config_source" "auto" Force the S3 SDK to load config options only from a set source. The supported options are as follows:
- "auto" (TileDB config options are considered first, then SDK-defined precedence: environment variables, configuration files, EC2 metadata)
- "config_files" (forces the SDK to consider only options found in AWS config files)
- "sts_profile_with_web_identity" (forces the SDK to consider assume-role/STS credentials from config files, with support for web tokens, commonly used by EKS/ECS)
"vfs.s3.install_sigpipe_handler" true When set to true, the S3 SDK uses a handler that ignores SIGPIPE signals.
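As the note on vfs.s3.multipart_part_size above explains, upload buffering scales with the parallelism level. With the default part size and a hypothetical 8-core machine (so sm.io_concurrency_level, and thus vfs.s3.max_parallel_ops, resolves to 8):

```python
multipart_part_size = 5_242_880  # vfs.s3.multipart_part_size default (5 MB)
max_parallel_ops = 8             # assumed: vfs.s3.max_parallel_ops on an 8-core machine

# Bytes buffered before multipart uploads are issued in parallel.
buffered = multipart_part_size * max_parallel_ops
assert buffered == 41_943_040  # 40 MB
```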

Azure Blob Storage

Parameter Default value Description
"vfs.azure.storage_account_name" "" Set the name of the Azure Storage account to use.
"vfs.azure.storage_account_key" "" Set the Shared Key to authenticate to Azure Storage.
"vfs.azure.storage_sas_token" "" Set the Azure Storage SAS (shared access signature) token to use. If this option is set along with vfs.azure.blob_endpoint, the latter must not include a SAS token.
"vfs.azure.blob_endpoint" "" Set the default Azure Storage Blob endpoint.
If not specified, it will take a value of https://<account_name>.blob.core.windows.net, where <account_name> is the value of the vfs.azure.storage_account_name option. This means that at least one of these two options must be set (or both if shared key authentication is used).
"vfs.azure.block_list_block_size" "5242880" (5 MB) The block size (in bytes) used in Azure blob block list writes. Any uint64_t value is acceptable. Note: vfs.azure.block_list_block_size * vfs.azure.max_parallel_ops bytes will be buffered before issuing block uploads in parallel.
"vfs.azure.max_parallel_ops" The value of sm.io_concurrency_level The maximum number of Azure backend parallel operations.
"vfs.azure.use_block_list_upload" true Whether the Azure backend can use chunked block uploads.
"vfs.azure.max_retries" 5 The maximum number of times to retry an Azure network request.
"vfs.azure.retry_delay_ms" 800 The minimum permissible delay between Azure network request retries, in milliseconds.
"vfs.azure.max_retry_delay_ms" 60000 The maximum permissible delay between Azure network request retries, in milliseconds.
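One common way retry schedulers honor a minimum delay, a cap, and a retry count like the three Azure retry parameters above is exponential backoff. A sketch under that assumption (TileDB's exact retry schedule is not specified here):

```python
retry_delay_ms = 800         # vfs.azure.retry_delay_ms default
max_retry_delay_ms = 60_000  # vfs.azure.max_retry_delay_ms default
max_retries = 5              # vfs.azure.max_retries default

# Exponential growth from the minimum delay, capped at the maximum.
delays = [min(retry_delay_ms * 2**attempt, max_retry_delay_ms)
          for attempt in range(max_retries)]
assert delays == [800, 1600, 3200, 6400, 12800]
```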

Google Cloud Storage (GCS)

Parameter Default value Description
"vfs.gcs.project_id" "" Set the GCS project ID to create new buckets to. Not required unless you are going to use the VFS to create buckets.
"vfs.gcs.service_account_key"_ "" Set the JSON string with GCS service account key. Takes precedence over vfs.gcs.workload_identity_configuration if both are specified. If neither is specified, Application Default Credentials will be used.
"vfs.gcs.workload_identity_configuration"_ "" Set the JSON string with Workload Identity Federation configuration. vfs.gcs.service_account_key takes precedence over this if both are specified. If neither is specified, Application Default Credentials will be used.
"vfs.gcs.impersonate_service_account"_ "" Set the GCS service account to impersonate. You can set more than one impersonated account by separating each service account name with a comma.
"vfs.gcs.multi_part_size" 5242880 The part size (in bytes) used in GCS multi part writes. Any uint64_t value is acceptable. Note: vfs.gcs.multi_part_size _ vfs.gcs.max_parallel_opsbytes will be buffered before issuing part uploads in parallel.
"vfs.gcs.max_parallel_ops" The value ofsm.io_concurrency_level The maximum number of GCS backend parallel operations.
"vfs.gcs.use_multi_part_upload" true Controls whether the GCS backend can use chunked part uploads.
"vfs.gcs.request_timeout_ms" 3000 The maximum amount of time to retry network requests to GCS.
"vfs.gcs.max_direct_upload_size" 10737418240(10 GB) The maximum size in bytes of a direct upload to GCS. Ignored ifvfs.gcs.use_multi_part_upload is set to true.
"vfs.gcs.endpoint" | "" | The GCS endpoint. |

REST

Parameter Default value Description
"rest.server_address" "https://api.tiledb.com" URL of the REST server to use for remote arrays.
"rest.server_serialization_format" "CAPNP" Serialization format to use for remote array requests ("CAPNP" or "JSON").
"rest.username" "" Username for logging in to the REST server (not recommended, use "rest.token" instead).
"rest.password" "" Password for logging in to the REST server (not recommended, use "rest.token" instead).
"rest.token" "" Authentication token for REST server. Visit API Tokens for more information.
"rest.resubmit_incomplete" true If true, incomplete queries received from server are automatically resubmitted before returning to user control.
"rest.creation_access_credentials_name" No default set. The name of the registered access key to use for creation of the REST server.
"rest.retry_http_codes" "503" A comma-separated list of HTTP status codes to automatically retry a REST request for.
"rest.retry_count" "25" Number of times to retry failed REST requests.
"rest.retry_initial_delay_ms" 500 Initial delay in milliseconds to wait until retrying a REST request.
"rest.retry_delay_factor" 1.25 The delay factor to exponentially wait until further retries of a failed REST request.
"rest.curl.verbose" false Set curl to run in verbose mode for REST requests. curl will print to stdout with this option.
"rest.load_metadata_on_array_open" true If true, array metadata will be loaded and sent to server together with the open array.
"rest.load_non_empty_domain_on_array_open" true If true, array non empty domain will be loaded and sent to server together with the open array.
"rest.load_enumerations_on_array_open" false If true, enumerations will be loaded for the latest array schema and sent to server together with the open array.
"rest.load_enumerations_on_array_open_all_schemas" false If true, enumerations will be loaded for all array schemas and sent to the server together with the open array.
"rest.use_refactored_array_open" true If true, the new, experimental REST routes and APIs for opening an array will be used.
"rest.use_refactored_array_open_and_query_submit" true If true, the new, experimental REST routes and APIs for opening an array and submitting a query will be used.
"rest.curl.buffer_size" 524288 (512 KB) Set the curl buffer size for REST requests.
"rest.capnp_traversal_limit" 2147483648 (2 GB) The Cap’n Proto (CAPNP) traversal limit used in the deserialization of messages (in bytes).
"rest.custom_headers.*" Optional. No Default (Optional) Prefix for custom headers on REST requests. For each custom header, use "rest.custom_headers.header_key" = "header_value".
"rest.payer_namespace" No default set. The namespace that should be charged for the request.
"rest.http_compressor" "any" Compression used in HTTP requests.

SSL

Parameter Default value Description
"ssl.ca_path" "" The path to a directory with CA certificates to use when validating server certificates. Applies to all SSL/TLS connections. Platforms that have native certificate stores, such as Windows, might ignore this option.
"ssl.ca_file" "" The path to CA certificate to use when validating server certificates. Applies to all SSL/TLS connections. Platforms that have native certificate stores, such as Windows, might ignore this option.
"ssl.verify" true Whether to verify the server’s certificate. Applies to all SSL/TLS connections. Disabling verification is insecure and should only used for testing purposes.
Catching Errors
Basic S3 Example