Learn how to retrieve the information of a written fragment.
How to run this tutorial
You can run this tutorial in two ways:
Locally on your machine.
On TileDB Cloud.
However, since TileDB Cloud has a free tier, we strongly recommend that you sign up and run everything there, as that requires no installations or deployment.
This tutorial shows how to retrieve the information of a written fragment, which is particularly useful for time traveling. For more information on time traveling, see the related sections of the Academy.
First, import the necessary libraries, set the array URI (i.e., its path, which in this tutorial will be on local storage), and delete any previously created arrays with the same name.
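The cleanup part of this step could be sketched in Python as follows, using only the standard library. The directory name `fragment_info_demo` and the helper `reset_array_dir` are illustrative assumptions, not names from the tutorial. The Python snippets that follow additionally assume `import numpy as np` and `import tiledb`.

```python
import os
import shutil
import tempfile


def reset_array_dir(path):
    """Delete a previously created local array directory, if there is one."""
    if os.path.isdir(path):
        shutil.rmtree(path)
    return path


# Hypothetical local location for the array; any writable path works.
array_uri = reset_array_dir(
    os.path.join(tempfile.gettempdir(), "fragment_info_demo")
)
```

Removing the old directory first means the array creation step does not collide with an array left over from an earlier run.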
# Create the two dimensions
d1 = tiledb.Dim(name="d1", domain=(0, 3), tile=2, dtype=np.int32)
d2 = tiledb.Dim(name="d2", domain=(0, 3), tile=2, dtype=np.int32)

# Create a domain using the two dimensions
dom = tiledb.Domain(d1, d2)

# Create an attribute
a = tiledb.Attr(name="a", dtype=np.int32)

# Create the array schema with `sparse=True`.
sch = tiledb.ArraySchema(domain=dom, sparse=True, attrs=[a])

# Create the array on disk (it will initially be empty)
tiledb.Array.create(array_uri, sch)
# Create the two dimensions
d1 <- tiledb_dim("d1", c(0L, 3L), 2L, "INT32")
d2 <- tiledb_dim("d2", c(0L, 3L), 2L, "INT32")

# Create a domain using the two dimensions
dom <- tiledb_domain(dims = c(d1, d2))

# Create an attribute
a <- tiledb_attr("a", type = "INT32")

# Create the array schema, setting `sparse = TRUE`
sch <- tiledb_array_schema(dom, a, sparse = TRUE)

# Create the array on disk (it will initially be empty)
arr <- tiledb_array_create(array_uri, sch)
Populate the TileDB array using a set of 1D input arrays: one for the coordinates of each dimension, and one for the attribute values (TileDB sparse arrays expect the COO format). Observe that you can now retrieve the information of the fragment after the write operation.
# Prepare some data in numpy arrays
d1_data = np.array([2, 0, 3, 2, 0, 1], dtype=np.int32)
d2_data = np.array([0, 1, 1, 2, 3, 3], dtype=np.int32)
a_data = np.array([4, 1, 6, 5, 2, 3], dtype=np.int32)

# Open the array in write mode and write the data in COO format.
# NOTE: You can get the fragment info after the write
with tiledb.open(array_uri, "w") as A:
    A[d1_data, d2_data] = a_data
    print(A.last_write_info)
# Prepare some data in an array
d1_data <- c(2L, 0L, 3L, 2L, 0L, 1L)
d2_data <- c(0L, 1L, 1L, 2L, 3L, 3L)
a_data <- c(4L, 1L, 6L, 5L, 2L, 3L)

# Declare a timestamp
ts <- 1

# Open the array for writing and write data to the array
arr <- tiledb_array(
  uri = array_uri,
  query_type = "WRITE",
  keep_open = TRUE
)
arr[d1_data, d2_data] <- a_data

# Close the array
arr <- tiledb_array_close(arr)

# Get fragment info
fi <- tiledb_fragment_info(array_uri)
cat("Number of fragments:", tiledb_fragment_info_get_num(fi), "\n")
cat("Fragment URI:", tiledb_fragment_info_uri(fi, 0), "\n")
cat(
  "Fragment timestamp range:",
  format(
    as.POSIXct(
      tiledb_fragment_info_get_timestamp_range(fi, 0),
      tz = "UTC"
    )
  )
)
Number of fragments: 1
Fragment URI: file:///Users/nickv/write_at_a_timestamp_r/__fragments/__1721239751822_1721239751822_29706807e4f663883b3a8ded1afab3d1_21
Fragment timestamp range: 2024-07-17 14:09:11 2024-07-17 14:09:11
The array is a folder at the path specified in array_uri. Other sections of the Academy explain its contents. For now, note that the fragment name inside the __fragments directory matches the one printed above, and that the two numbers in the name are the start and end timestamps of the fragment.
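To make the link between the fragment name and its timestamps concrete, here is a minimal Python sketch that splits the fragment URI printed above into its parts and converts the timestamps to UTC. It assumes the four-field layout seen in that URI (start timestamp, end timestamp, identifier, trailing number) and that the timestamps are milliseconds since the Unix epoch. The helpers `parse_fragment_name` and `ms_to_utc` are hypothetical and not part of the TileDB API, and reading the trailing number as a format version is likewise an assumption.

```python
from datetime import datetime, timedelta, timezone


def parse_fragment_name(fragment_uri):
    """Split a fragment URI into (start_ms, end_ms, identifier, trailing number)."""
    # Keep only the last path component, e.g. "__<start>_<end>_<id>_<n>".
    name = fragment_uri.rstrip("/").rsplit("/", 1)[-1]
    start_ms, end_ms, identifier, version = name.lstrip("_").split("_")
    return int(start_ms), int(end_ms), identifier, int(version)


def ms_to_utc(ms):
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    # Integer arithmetic avoids floating-point rounding of the millisecond part.
    seconds, millis = divmod(ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


# Fragment URI copied from the output above.
fragment_uri = (
    "file:///Users/nickv/write_at_a_timestamp_r/__fragments/"
    "__1721239751822_1721239751822_29706807e4f663883b3a8ded1afab3d1_21"
)

start_ms, end_ms, identifier, version = parse_fragment_name(fragment_uri)
print(ms_to_utc(start_ms).isoformat())  # 2024-07-17T18:09:11.822000+00:00
print(start_ms == end_ms)               # True
```

The decoded value is 18:09:11 UTC, while the range printed above reads 14:09:11. The four-hour difference suggests the printed range was rendered in the local time zone of the machine that ran the example rather than in UTC.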