Aggregates
Aggregates are a powerful feature that allows you to perform operations on data directly in TileDB.
TileDB can compute aggregates over arrays, such as the following (the result data type is shown in parentheses):
- Count (UINT64)
- Sum (Signed fields: INT64, Unsigned fields: UINT64, Floating point fields: FLOAT64)
- Min/Max (same as input type)
- Null count (UINT64)
- Mean (FLOAT64)
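The result-type rules in the list above can be summarized as a small lookup. The following is only an illustrative sketch in plain Python, not TileDB's API: the function name `aggregate_result_type`, the lowercase operation names, and the sets of type names are assumptions made for this example.

```python
# Hypothetical helper that encodes the result-type rules listed above.
# The type-name sets and operation strings are assumptions for illustration.
SIGNED = {"INT8", "INT16", "INT32", "INT64"}
UNSIGNED = {"UINT8", "UINT16", "UINT32", "UINT64"}
FLOATING = {"FLOAT32", "FLOAT64"}


def aggregate_result_type(op: str, input_type: str) -> str:
    """Return the data type of an aggregate result for a given input type."""
    if op in ("count", "null_count"):
        return "UINT64"          # counts are always unsigned 64-bit
    if op == "mean":
        return "FLOAT64"         # mean is always a 64-bit float
    if op in ("min", "max"):
        return input_type        # min/max keep the input type
    if op == "sum":
        # Sums use the widest type of the same family.
        if input_type in SIGNED:
            return "INT64"
        if input_type in UNSIGNED:
            return "UINT64"
        if input_type in FLOATING:
            return "FLOAT64"
        raise ValueError(f"sum is not defined here for type {input_type!r}")
    raise ValueError(f"unknown aggregate operation {op!r}")


print(aggregate_result_type("sum", "INT32"))
print(aggregate_result_type("max", "UINT16"))
```

For example, summing an INT32 field yields an INT64 result, while taking the maximum of a UINT16 field yields a UINT16 result.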
The advantage of pushing aggregate computations down to TileDB (as opposed to only slicing array values and passing the result to another system for computations) is that TileDB provides extra optimizations that can significantly boost performance:
- TileDB performs aggregation over the qualifying array values in parallel using multi-threading.
- TileDB stores valuable tile metadata upon ingestion, which can help in efficiently computing the specified aggregate even without fetching and decompressing the relevant tiles from storage.
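To illustrate why per-tile metadata helps, the sketch below shows the general technique in plain Python. It is a simplified model, not TileDB's internal implementation: the `Tile` class, `build_tiles`, and `range_sum` are hypothetical names, and the model assumes a 1D array whose tiles each store a precomputed sum and cell count. Tiles fully covered by the query range contribute their stored sum without their values being read; only partially covered tiles are scanned.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Tile:
    """Hypothetical tile: cell values plus metadata computed at ingestion."""
    start: int                 # index of the first cell in this tile
    values: List[int]          # stand-in for compressed data on storage
    total: int = field(init=False)
    count: int = field(init=False)

    def __post_init__(self) -> None:
        # Metadata is computed once, when the tile is written.
        self.total = sum(self.values)
        self.count = len(self.values)


def build_tiles(data: List[int], tile_size: int) -> List[Tile]:
    """Split the data into consecutive fixed-size tiles."""
    return [Tile(i, data[i:i + tile_size]) for i in range(0, len(data), tile_size)]


def range_sum(tiles: List[Tile], lo: int, hi: int) -> Tuple[int, int]:
    """Sum cells with index in [lo, hi] (inclusive).

    Returns (sum, number of tiles whose values had to be read).
    """
    total = 0
    tiles_read = 0
    for tile in tiles:
        t_lo, t_hi = tile.start, tile.start + tile.count - 1
        if t_hi < lo or t_lo > hi:
            continue                      # tile does not overlap the range
        if lo <= t_lo and t_hi <= hi:
            total += tile.total           # fully covered: metadata only
        else:
            tiles_read += 1               # partially covered: scan the values
            a = max(lo, t_lo) - t_lo
            b = min(hi, t_hi) - t_lo
            total += sum(tile.values[a:b + 1])
    return total, tiles_read


data = list(range(100))
tiles = build_tiles(data, tile_size=10)
result, tiles_read = range_sum(tiles, 5, 84)
print(result, tiles_read)
```

In this model, the range [5, 84] overlaps nine tiles, but only the two tiles at the edges of the range need their values read; the seven tiles in between are answered from metadata alone.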
Aggregates work in conjunction with filter operations, such as dimension ranges and query conditions. Visit the Tutorials: Aggregates section for an example of performing aggregate queries in TileDB.