Biomedical Imaging Reads
This section covers the key concepts related to reading image data from a TileDB-BioImaging dataset. Visit the Arrays Key Concepts: Reads section for more foundational information about reading TileDB arrays.
Slice
As described in the storage format spec, each TileDB-BioImaging group includes multiple data arrays, one for each resolution level of the image. TileDB-BioImaging offers a set of methods to read these arrays that are familiar to people working with whole-slide bioimaging data, since they follow the OpenSlide API. For example, you can read a slice of the image at a specific resolution by specifying the following:
- A location tuple giving the top-left pixel of the region in the highest-resolution reference frame.
- A resolution level.
- A region size tuple.
The array data model of TileDB is optimized to perform this type of slicing, which is a powerful feature for data exploration and discovery.
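The three parameters above can be illustrated with a minimal sketch, assuming OpenSlide-style semantics: the location is given in the highest-resolution frame, while the size is given in pixels at the requested level. The function `region_to_slices` and the example level tables are hypothetical and are not part of TileDB-BioImaging; the sketch also clips the region to the image bounds, which is a simplification.

```python
from typing import Sequence, Tuple


def region_to_slices(
    location: Tuple[int, int],
    level: int,
    size: Tuple[int, int],
    level_downsamples: Sequence[float],
    level_dimensions: Sequence[Tuple[int, int]],
) -> Tuple[slice, slice]:
    """Translate a (location, level, size) request into (row, column) slices.

    Hypothetical helper for illustration only.
    """
    x0, y0 = location          # top-left pixel in the highest-resolution frame
    width, height = size       # region size in pixels at the requested level
    downsample = level_downsamples[level]
    level_width, level_height = level_dimensions[level]

    # Scale the origin from the highest-resolution frame to the requested level.
    x = int(x0 / downsample)
    y = int(y0 / downsample)

    # Clip the region to the bounds of the requested level (a simplification).
    x_end = min(x + width, level_width)
    y_end = min(y + height, level_height)
    return slice(y, y_end), slice(x, x_end)


# Assumed example pyramid: level 0 is 4000x3000, level 1 is downsampled by 4.
downsamples = (1.0, 4.0)
dimensions = ((4000, 3000), (1000, 750))

rows, cols = region_to_slices((400, 200), 1, (256, 256), downsamples, dimensions)
print(rows, cols)
```

The resulting slices select a contiguous block of rows and columns from the array of the chosen resolution level, which is the kind of rectangular read a dense array layout serves well.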
Visit the Basic Queries tutorial for examples of TileDB-BioImaging read queries.
You can slice an image stored as a TileDB-BioImaging dataset by using either the TileDB-Py API for groups and arrays or the dedicated TileDBOpenSlide API offered in TileDB-BioImaging. Both APIs are covered in the tutorials.
Scale with performance
For TileDB-BioImaging datasets with multiple resolution levels, it is more efficient to slice image data by using the distributed compute architecture of TileDB. Partitioning and distributing a read query across multiple compute nodes has these advantages:
- Each query accesses a subset of the total data, reducing the required compute resources (CPU and memory) and the query latency.
- The queries run in parallel, taking advantage of independent compute resources and reducing the total query latency.
TileDB offers full automation of scalable reads.
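The partitioning idea described in this section can be sketched in a few lines. This is a local illustration only, assuming a thread pool as a stand-in for independent compute nodes; it does not use or represent TileDB's distributed execution. The names `partition_region`, `read_partitioned`, and `fake_read_tile` are hypothetical, and `fake_read_tile` returns a pixel count instead of reading real image data.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

Tile = Tuple[int, int, int, int]  # (x, y, width, height)


def partition_region(x: int, y: int, width: int, height: int, tile_size: int) -> List[Tile]:
    """Split a rectangular region into tiles of at most tile_size x tile_size."""
    tiles = []
    for ty in range(y, y + height, tile_size):
        for tx in range(x, x + width, tile_size):
            # Edge tiles are shrunk so they do not extend past the region.
            tw = min(tile_size, x + width - tx)
            th = min(tile_size, y + height - ty)
            tiles.append((tx, ty, tw, th))
    return tiles


def read_partitioned(read_tile: Callable[[Tile], int], tiles: List[Tile], max_workers: int = 4) -> List[int]:
    """Run one read per tile in parallel and return results in tile order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(read_tile, tiles))


def fake_read_tile(tile: Tile) -> int:
    """Hypothetical stand-in for a real read: returns the tile's pixel count."""
    _, _, w, h = tile
    return w * h


tiles = partition_region(0, 0, 1000, 600, tile_size=512)
pixel_counts = read_partitioned(fake_read_tile, tiles)
print(len(tiles), sum(pixel_counts))
```

Each tile touches only a subset of the region, and the tiles are processed independently, which mirrors the two advantages listed above.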