However, since TileDB Cloud has a free tier, we strongly recommend that you sign up and run everything there, as that requires no installation or deployment.
This tutorial shows how to retrieve information about all the fragments written to an array, which is particularly useful for time traveling. For more information on time traveling, visit the following sections:
First, import the necessary libraries, set the array URI (i.e., its path, which in this tutorial will be on local storage), and delete any previously created arrays with the same name.
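The cleanup part of this step can be sketched with the Python standard library alone. The folder name `fragment_info_array` and its location under the system temporary directory are assumptions made for illustration; any local path works.

```python
import os
import shutil
import tempfile

# Hypothetical array location on local storage (an assumption for this sketch)
array_uri = os.path.join(tempfile.gettempdir(), "fragment_info_array")

# Delete any array folder left over from a previous run
if os.path.isdir(array_uri):
    shutil.rmtree(array_uri)
```

The Python snippets in this tutorial additionally assume `import numpy as np` and `import tiledb`.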
# Create the two dimensions
d1 = tiledb.Dim(name="d1", domain=(0, 3), tile=2, dtype=np.int32)
d2 = tiledb.Dim(name="d2", domain=(0, 3), tile=2, dtype=np.int32)

# Create a domain using the two dimensions
dom = tiledb.Domain(d1, d2)

# Create an attribute
a = tiledb.Attr(name="a", dtype=np.int32)

# Create the array schema with `sparse=True`
sch = tiledb.ArraySchema(domain=dom, sparse=True, attrs=[a])

# Create the array on disk (it will initially be empty)
tiledb.Array.create(array_uri, sch)
# Create the two dimensions
d1 <- tiledb_dim("d1", c(0L, 3L), 2L, "INT32")
d2 <- tiledb_dim("d2", c(0L, 3L), 2L, "INT32")

# Create a domain using the two dimensions
dom <- tiledb_domain(dims = c(d1, d2))

# Create an attribute
a <- tiledb_attr("a", type = "INT32")

# Create the array schema with `sparse = TRUE`
sch <- tiledb_array_schema(dom, a, sparse = TRUE)

# Create the array on disk (it will initially be empty)
arr <- tiledb_array_create(array_uri, sch)
Populate the array using a set of 1D arrays, one for the coordinates of each dimension, and one for the attribute values (TileDB sparse arrays expect the COO format).
# Prepare some data in numpy arrays for the first write
d1_data = np.array([2, 0, 3, 2, 0, 1], dtype=np.int32)
d2_data = np.array([0, 1, 1, 2, 3, 3], dtype=np.int32)
a_data = np.array([4, 1, 6, 5, 2, 3], dtype=np.int32)

# Open the array in write mode and write the data in COO format
with tiledb.open(array_uri, "w") as A:
    A[d1_data, d2_data] = a_data
# Prepare some data in an array
d1_data <- c(2L, 0L, 3L, 2L, 0L, 1L)
d2_data <- c(0L, 1L, 1L, 2L, 3L, 3L)
a_data <- c(4L, 1L, 6L, 5L, 2L, 3L)

# Open the array for writing and write data to the array
arr <- tiledb_array(uri = array_uri, query_type = "WRITE")
arr[d1_data, d2_data] <- a_data

# Close the array
invisible(tiledb_array_close(arr))
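To make the COO layout concrete, the following sketch (plain Python, no TileDB calls) rebuilds the 4x4 grid that the three 1D arrays of the first write describe. The helper name `coo_to_grid` is hypothetical, and empty cells are shown as `None` purely for display.

```python
def coo_to_grid(d1_coords, d2_coords, values, shape=(4, 4)):
    """Place each value at its (d1, d2) coordinate; untouched cells stay None."""
    grid = [[None] * shape[1] for _ in range(shape[0])]
    for i, j, v in zip(d1_coords, d2_coords, values):
        grid[i][j] = v
    return grid

# Same coordinates and attribute values as the first write
d1_data = [2, 0, 3, 2, 0, 1]
d2_data = [0, 1, 1, 2, 3, 3]
a_data = [4, 1, 6, 5, 2, 3]

grid = coo_to_grid(d1_data, d2_data, a_data)
for row in grid:
    print(row)
```

Only six of the sixteen cells hold a value, which is why a sparse write lists the coordinates explicitly instead of passing a full grid.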
# Prepare some data in numpy arrays for the second write
d1_data = np.array([0], dtype=np.int32)
d2_data = np.array([3], dtype=np.int32)
a_data = np.array([10], dtype=np.int32)

# Open the array in write mode and write the data in COO format
with tiledb.open(array_uri, "w") as A:
    A[d1_data, d2_data] = a_data
# Prepare some data in an array
d1_data <- 0L
d2_data <- 3L
a_data <- 10L

# Open the array for writing and write data to the array
arr <- tiledb_array(uri = array_uri, query_type = "WRITE")
arr[d1_data, d2_data] <- a_data

# Close the array
invisible(tiledb_array_close(arr))
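The second write targets cell (0, 3), which the first write already populated. The following is a conceptual sketch in plain Python (a toy model, not the TileDB API) of how a read could resolve the two fragments, assuming the default sparse-array behavior in which the most recent fragment's value wins for identical coordinates. The logical timestamps 1 and 2 and the helper `read_at` are hypothetical.

```python
# Each fragment is modeled as (write timestamp, {(d1, d2): a})
fragments = [
    (1, {(2, 0): 4, (0, 1): 1, (3, 1): 6, (2, 2): 5, (0, 3): 2, (1, 3): 3}),
    (2, {(0, 3): 10}),
]

def read_at(fragments, timestamp):
    """Merge all fragments written at or before `timestamp`, oldest first."""
    cells = {}
    for ts, data in sorted(fragments, key=lambda f: f[0]):
        if ts <= timestamp:
            cells.update(data)  # later fragments override earlier cells
    return cells

print(read_at(fragments, 1)[(0, 3)])  # value visible after the first write only
print(read_at(fragments, 2)[(0, 3)])  # value visible after both writes
```

This is the reason fragment timestamps matter for time traveling: choosing a read timestamp decides which fragments take part in the result.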
The array is stored as a folder at the path specified in array_uri. Its contents are explained in other sections of the Academy, but notice that the two write operations created two fragments in the fragments directory.
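To inspect the fragments on disk, here is a minimal sketch that uses only the standard library. It assumes that the fragments live in a subfolder named `__fragments` and that each fragment folder is named `__<start>_<end>_<uuid>_<version>` with millisecond timestamps, as in recent TileDB format versions; older arrays may be laid out differently. The helper `list_fragment_dirs` is hypothetical.

```python
import os

def list_fragment_dirs(array_path, subdir="__fragments"):
    """Return (name, start_ms, end_ms) per fragment folder, sorted by start time."""
    frag_root = os.path.join(array_path, subdir)
    result = []
    for name in os.listdir(frag_root):
        if not os.path.isdir(os.path.join(frag_root, name)):
            continue
        # Assumed naming scheme: __<start>_<end>_<uuid>_<version>
        parts = name.lstrip("_").split("_")
        if len(parts) < 2 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue  # skip anything that does not match the assumed scheme
        result.append((name, int(parts[0]), int(parts[1])))
    return sorted(result, key=lambda item: item[1])
```

Under these assumptions, calling `list_fragment_dirs(array_uri)` after the two writes should return two entries, one per write.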
# Get fragment info
fragments_info = tiledb.array_fragments(array_uri)

# Number of fragments
print(len(fragments_info))

# URI of given fragment, with 0 <= idx < numfrag
print(fragments_info.uri[0])
print(fragments_info.uri[1])

# Timestamp range of given fragment, with 0 <= idx < numfrag
print(fragments_info.timestamp_range[0])
print(fragments_info.timestamp_range[1])
# Get fragment info
fragments_info <- tiledb_fragment_info(array_uri)

# Number of fragments
cat(tiledb_fragment_info_get_num(fragments_info), "\n")

# URI of given fragment, with 0 <= idx < numfrag
cat(tiledb_fragment_info_uri(fragments_info, 0), "\n")
cat(tiledb_fragment_info_uri(fragments_info, 1), "\n")

# Timestamp range of given fragment, with 0 <= idx < numfrag
cat(format(as.POSIXct(tiledb_fragment_info_get_timestamp_range(fragments_info, 0))), "\n")
cat(format(as.POSIXct(tiledb_fragment_info_get_timestamp_range(fragments_info, 1))), "\n")
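The timestamp ranges printed by the Python snippet are pairs of integers. Here is a small sketch for turning such a pair into readable UTC datetimes, assuming the values are milliseconds since the Unix epoch. The helper name `range_to_utc` and the sample value are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def range_to_utc(timestamp_range):
    """Convert a (start_ms, end_ms) pair into a pair of UTC datetimes."""
    start_ms, end_ms = timestamp_range
    return (
        EPOCH + timedelta(milliseconds=int(start_ms)),
        EPOCH + timedelta(milliseconds=int(end_ms)),
    )

# Hypothetical sample; a real pair would come from the fragment info object
sample = (1700000000000, 1700000000000)
start, end = range_to_utc(sample)
print(start.isoformat(), end.isoformat())
```

With the fragment info object from this tutorial, the call would look like `range_to_utc(fragments_info.timestamp_range[0])`.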