Task Graphs
We recommend that you familiarize yourself with UDFs before reading this section, as task graphs make heavy use of that primitive.
TileDB Cloud allows you to build arbitrary directed acyclic task graphs that combine any number of different tasks into one workflow. You can combine serverless UDFs, SQL, and array access, and even run any function locally. Task graphs provide the foundation for building powerful workflows for your organization.
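To make the idea of a directed acyclic task graph concrete, here is a minimal sketch that uses only the Python standard library. It is not the TileDB Cloud API: the helper `run_task_graph` and the task names are hypothetical and exist only for illustration. Each task lists the tasks it depends on, and tasks whose dependencies are all complete run concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+


def run_task_graph(tasks, max_workers=4):
    """Run {name: (func, [dependency names])} as a directed acyclic graph.

    Hypothetical helper for illustration only. Each function receives the
    results of its dependencies as positional arguments, in listed order.
    """
    # Map each task to the set of tasks it depends on.
    sorter = TopologicalSorter(
        {name: set(deps) for name, (_, deps) in tasks.items()}
    )
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle

    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every task returned here has all of its dependencies finished,
            # so the whole group can be submitted at once.
            futures = {}
            for name in sorter.get_ready():
                func, deps = tasks[name]
                futures[name] = pool.submit(func, *(results[d] for d in deps))
            # Collect results and unlock the tasks that depend on them.
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results


# Two independent branches that merge into a single final task.
tasks = {
    "load_a": (lambda: [1, 2, 3], []),
    "load_b": (lambda: [10, 20, 30], []),
    "sum_a": (sum, ["load_a"]),
    "sum_b": (sum, ["load_b"]),
    "combine": (lambda a, b: a + b, ["sum_a", "sum_b"]),
}
results = run_task_graph(tasks)
print(results["combine"])  # prints 66
```

In this sketch the two `load_*` tasks run in parallel, then the two `sum_*` tasks, and `combine` runs last. A managed service schedules the same kind of graph across remote workers rather than local threads.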
TileDB’s task graphs provide several benefits:
- Scalability: Task graphs allow you to define workflows with hundreds of concurrent jobs and thousands of total tasks. Each task can define its own computational resources, allowing for efficient use of resources.
- Flexibility: Task graphs offer the ability to handle both large-scale BATCH operations as well as lower-latency REALTIME requests.
  - REALTIME: The default mode of operation, REALTIME, is designed to return results directly to the client, with an emphasis on low latency. REALTIME task graphs are scheduled and executed immediately and are well-suited for fast, distributed workloads.
  - BATCH: In contrast to REALTIME task graphs, BATCH task graphs are designed for large, resource-intensive, asynchronous workloads. BATCH task graphs are defined, uploaded, and scheduled for execution and are well-suited for ingestion-style workloads.
For all the details on how to use task graphs in TileDB, read the API Usage section.
For more information about managing task graph assets, visit Catalog: Task Graphs. Detailed logs are collected for every task graph that is run. These logs let you see who ran the task graph, what it cost, the logs associated with each task, and more. For more information on the logging available for tasks and task graphs, visit Analyze: Monitor.
Several examples of task graph usage are also available in the following sections: