Domain
The term “domain” in TileDB may refer to two things:
- The domain of a particular dimension, which defines the permissible values for the cell coordinates on that dimension
- All the dimensions and their domains collectively, which defines where multi-dimensional cells can be created
Unless stated otherwise, the term “domain” in this section refers to the latter (i.e., to the entire hyperspace of the array where cells can exist).
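The two meanings can be illustrated with a minimal Python sketch. It is a conceptual model only, not the TileDB API: the `Dimension` and `Domain` classes, their `contains` methods, and the dimension names are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    """Hypothetical stand-in for one dimension: a name plus an inclusive [lo, hi] range."""
    name: str
    lo: int
    hi: int

    def contains(self, coord: int) -> bool:
        # First meaning of "domain": the permissible coordinates on this one dimension.
        return self.lo <= coord <= self.hi


@dataclass(frozen=True)
class Domain:
    """Hypothetical stand-in for the second meaning: all dimensions taken together."""
    dims: tuple

    def contains(self, cell: tuple) -> bool:
        # A cell can exist only if every coordinate lies inside its dimension's range.
        return len(cell) == len(self.dims) and all(
            d.contains(c) for d, c in zip(self.dims, cell)
        )


rows = Dimension("rows", -2, 10)
cols = Dimension("cols", 20, 30)
domain = Domain((rows, cols))

print(domain.contains((-2, 20)))  # True
print(domain.contains((0, 31)))   # False: 31 is outside the "cols" range
```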
Note the following differences between a dense and a sparse array domain:
- In dense arrays, the domain must consist of dimensions of the same data type, which should be an integer basic data type (e.g., `int32` or `uint64`). The domain of each dimension can include negative values (if the data type supports them), and the domain range can start from a non-zero value. For example, dimension domains `[-2, 10]` and `[20, 30]` are both valid.
- In sparse arrays, the dimensions can be of different data types (e.g., one can be `int32` and the other `int64`). In addition, sparse arrays can have floating point or string dimensions. Sparse arrays may also have cell multiplicities (i.e., cells with the exact same coordinates in the domain).
Visit the Array Data Model section for more details.
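The dense and sparse rules above can be summarized as a small check. This is a sketch that encodes only the rules as stated in this section, not TileDB's own validation logic; the function name `check_domain`, the `INTEGER_TYPES` set, and the type-name strings are assumptions made for illustration.

```python
# Integer type names used for illustration; this list is an assumption,
# not an authoritative enumeration of TileDB data types.
INTEGER_TYPES = {
    "int8", "uint8", "int16", "uint16",
    "int32", "uint32", "int64", "uint64",
}


def check_domain(dim_types, sparse):
    """Return a list of rule violations for the given dimension data types.

    Dense: all dimensions share one data type, and that type is an integer type.
    Sparse: mixed, floating point, and string dimensions are all accepted.
    """
    problems = []
    if not sparse:
        if len(set(dim_types)) > 1:
            problems.append("dense dimensions must share one data type")
        if any(t not in INTEGER_TYPES for t in dim_types):
            problems.append("dense dimensions must be integer-typed")
    return problems


print(check_domain(["int32", "int32"], sparse=False))            # []
print(check_domain(["int32", "int64"], sparse=False))            # one violation
print(check_domain(["int32", "float64", "string"], sparse=True)) # []
```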
When you create an array, TileDB does not materialize any cell values, no matter how small or large the domain is. You can issue multiple write requests at any time, each populating a different part of the array (i.e., a subarray), so the array can be populated incrementally. The tightest hyper-rectangle that contains all non-empty cells (in both dense and sparse arrays) is called the non-empty domain. TileDB uses it internally for various optimizations in ingestion and reading operations.
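To make the non-empty domain concrete, the following sketch computes the tightest hyper-rectangle around a set of written cell coordinates and shows how it grows as writes arrive. It is a conceptual model in plain Python, not how TileDB computes or stores this information; the `non_empty_domain` helper and the `written` set are hypothetical.

```python
def non_empty_domain(cells):
    """Tightest hyper-rectangle around the given cell coordinates.

    Returns one inclusive (lo, hi) pair per dimension,
    or None if no cell has been written yet.
    """
    cells = list(cells)
    if not cells:
        return None
    # zip(*cells) regroups the coordinates per dimension.
    return [(min(axis), max(axis)) for axis in zip(*cells)]


written = set()
print(non_empty_domain(written))  # None: nothing is materialized at creation

# First write: the subarray covering rows 1..2 and columns 1..2.
written.update((r, c) for r in range(1, 3) for c in range(1, 3))
print(non_empty_domain(written))  # [(1, 2), (1, 2)]

# Second, later write: a single cell elsewhere in the domain.
written.add((4, 7))
print(non_empty_domain(written))  # [(1, 4), (1, 7)]
```

Note that the resulting rectangle can include cells that were never written, such as `(4, 1)` in this example: it bounds the non-empty cells rather than enumerating them.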