Encryption

TileDB supports encrypting arrays using AES-256-GCM to protect your organization’s most sensitive data.

Note

It is strongly recommended to read the following sections before you learn about encryption.

How encryption works

TileDB allows you to encrypt your arrays at rest. It currently supports a single type of encryption, AES-256 in GCM mode, which is a symmetric, authenticated encryption algorithm. Because the algorithm is symmetric, you must provide the same 256-bit (32-byte) encryption key whenever you create, read, or write the array. The authenticated nature of the encryption scheme means that a message authentication code (MAC) is stored together with the encrypted data, which allows TileDB to verify that the persisted ciphertext was not modified.
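To make the key size and the role of the MAC concrete, here is a minimal sketch using only the Python standard library. It is not TileDB code and it does not perform AES-GCM: HMAC-SHA256 stands in for the authentication tag that GCM produces as part of encryption, and the names `new_key`, `tag`, and `verify` are hypothetical helpers introduced for illustration.

```python
import hashlib
import hmac
import secrets

KEY_BYTES = 32  # 256 bits, the key size AES-256 requires


def new_key() -> bytes:
    """Return a random 256-bit key from the operating system's CSPRNG."""
    return secrets.token_bytes(KEY_BYTES)


def tag(key: bytes, ciphertext: bytes) -> bytes:
    """Compute an authentication tag (HMAC-SHA256 here, not GCM's own tag)."""
    return hmac.new(key, ciphertext, hashlib.sha256).digest()


def verify(key: bytes, ciphertext: bytes, stored_tag: bytes) -> bool:
    """Check in constant time that the ciphertext still matches its tag."""
    return hmac.compare_digest(tag(key, ciphertext), stored_tag)


key = new_key()
stored = b"opaque ciphertext bytes"  # placeholder for persisted data
stored_tag = tag(key, stored)

tampered = bytes([stored[0] ^ 0x01]) + stored[1:]  # flip a single bit

print(verify(key, stored, stored_tag))        # True
print(verify(key, tampered, stored_tag))      # False: data was modified
print(verify(new_key(), stored, stored_tag))  # False: wrong key
```

The same two failure modes apply to an authenticated cipher in general: a modified ciphertext and an incorrect key both cause verification to fail rather than yield corrupted plaintext.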

Encryption is specified at the array level upon creation, and applies to each physical data tile across all attributes, coordinates, and offsets for variable-length attributes. TileDB further partitions each physical data tile into chunks of size typically equal to the L1 cache, which are streamed into the encryption process.
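The chunking step can be pictured with a short sketch. This is an illustration only, not TileDB's implementation: the 32 KiB chunk size is an assumption (a common L1 data cache size), and `iter_chunks` is a hypothetical helper.

```python
from typing import Iterator

CHUNK_SIZE = 32 * 1024  # assumed 32 KiB; actual L1 cache sizes vary by CPU


def iter_chunks(tile: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[memoryview]:
    """Yield consecutive zero-copy views over the tile, each at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    view = memoryview(tile)
    for start in range(0, len(view), chunk_size):
        yield view[start:start + chunk_size]


tile = bytes(100 * 1024)  # a hypothetical 100 KiB data tile
sizes = [len(chunk) for chunk in iter_chunks(tile)]
print(sizes)  # [32768, 32768, 32768, 4096]
```

Each chunk could then be handed to the encryption routine independently; the last chunk is simply shorter when the tile size is not a multiple of the chunk size.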

Encryption libraries used:

macOS and Linux: OpenSSL
Windows: Cryptography API: Next Generation (CNG)

Warning

By default, TileDB caches array data and metadata in main memory after opening and reading from arrays. For encrypted arrays, these caches hold decrypted (plaintext) array data. To reduce the amount of plaintext held in memory (at the cost of performance), you can disable the TileDB caches (visit Tutorials: Configuration).

Encryption key lifetime

TileDB never persists the encryption key, but TileDB does store a copy of the encryption key in main memory while an encrypted array is open. When the array is closed, TileDB will zero out the memory used to store its copy of the key, and free the associated memory.
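The lifetime described above (hold a copy while open, zero it on close) can be sketched in Python as a context manager. This is an illustration of the pattern under stated assumptions, not TileDB's code: `open_with_key` is a hypothetical name, and in Python only mutable buffers such as `bytearray` can be overwritten in place, so this offers no guarantee about other copies the runtime may hold.

```python
import secrets
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def open_with_key(key: bytes) -> Iterator[bytearray]:
    """Hold a private, mutable copy of the key while 'open'; zero it on close."""
    held = bytearray(key)  # in-memory copy, never written to disk
    try:
        yield held
    finally:
        held[:] = bytes(len(held))  # overwrite the copy in place with zeros


key = secrets.token_bytes(32)
with open_with_key(key) as held:
    assert bytes(held) == key  # the copy is usable while the array is open
    leaked = held              # keep a reference to inspect it after close

print(all(b == 0 for b in leaked))  # True
```

The caller's own `key` object is untouched; only the copy held for the duration of the open array is cleared.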

Performance

Due to the extra processing required to encrypt and decrypt array metadata and attribute data, you may experience lower performance on opening, reading, and writing for encrypted arrays.

To mitigate this, TileDB internally parallelizes encryption and decryption using a chunking strategy. Additionally, when compression or other filtering is configured on array metadata or attribute data, encryption occurs last, so it is the compressed (or, more generally, filtered) data that gets encrypted.
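A small experiment shows why it matters that encryption comes last. The sketch below assumes only the Python standard library: `zlib` stands in for whichever compression filter is configured, and random bytes from `os.urandom` stand in for ciphertext, since well-encrypted data is statistically similar to random data.

```python
import os
import zlib

plain = b"temperature=21.5;" * 4096   # repetitive, like much array data
random_like = os.urandom(len(plain))  # stand-in for encrypted bytes

# Compress first: the small result is what would then be encrypted.
compress_first = len(zlib.compress(plain))

# Encrypt first: compressing ciphertext-like bytes gains nothing.
encrypt_first = len(zlib.compress(random_like))

print(len(plain), compress_first, encrypt_first)
```

With the compress-then-encrypt order, the cipher also processes fewer bytes, which helps offset the cost of encryption.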

Finally, newer generations of some Intel and AMD processors offer instructions for hardware acceleration of encryption and decryption (for example, AES-NI). The encryption libraries that TileDB employs are configured to use hardware acceleration if it is available.
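If you want to check whether a Linux machine advertises AES instructions, one rough approach is to look for the `aes` flag in `/proc/cpuinfo`. The sketch below is a convenience check, not part of TileDB; `cpu_has_aes` is a hypothetical helper, and it returns `None` where the file or the flags line is unavailable (for example on macOS or Windows).

```python
from pathlib import Path
from typing import Optional


def cpu_has_aes(cpuinfo: str = "/proc/cpuinfo") -> Optional[bool]:
    """Return True/False if the CPU flags list is readable, else None."""
    path = Path(cpuinfo)
    if not path.is_file():
        return None  # not Linux, or procfs is not mounted
    for line in path.read_text(errors="replace").splitlines():
        name, _, value = line.partition(":")
        # x86 kernels label the line "flags"; ARM kernels use "Features".
        if name.strip().lower() in ("flags", "features"):
            return "aes" in value.split()
    return None


print(cpu_has_aes())
```

A result of `True` only says the instructions exist on the CPU; whether a given library build uses them depends on that library.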