Manage Coordinate Spaces

Learn how to manage spatial coordinates in TileDB-SOMA.

This tutorial will guide you through the fundamental concepts of coordinate spaces and coordinate transformations within TileDB-SOMA. These concepts are essential for working with any of the spatial SOMA objects, including MultiscaleImage, PointCloudDataFrame, and GeometryDataFrame.

By the end of this tutorial, you’ll understand how to:

  • Define a coordinate space.
  • Associate coordinate spaces with spatial objects.
  • Create and use coordinate transformations.

Setup

First, import the necessary libraries:

  • Python
import tempfile
from typing import Optional, Sequence

import pyarrow as pa
import pyarrow.compute as pc
import tiledbsoma
import tiledbsoma.io
import tiledbsoma.io.spatial
from matplotlib import patches as mplp
from matplotlib import pyplot as plt
from matplotlib.collections import PatchCollection

tiledbsoma.show_package_versions()

Define the TileDB Cloud SaaS URI for a spatial SOMA experiment containing a 10X Visium dataset generated from a mouse brain coronal section:

  • Python
EXPERIMENT_URI = "tiledb://tiledb-inc/CytAssist_FFPE_Mouse_Brain_Rep2"

Open the experiment in read mode:

  • Python
exp = tiledbsoma.Experiment.open(EXPERIMENT_URI)
scene = exp.spatial["scene0"]
scene
<Scene 'tiledb://TileDB-Inc/8aec3f0f-4ec1-40cf-a180-a1ba8a80f49e' (open for 'r') (3 items)
    'img': 'tiledb://TileDB-Inc/fbf2da75-dbcf-4e12-9e0c-2e6516a58bf0' (unopened)
    'obsl': 'tiledb://TileDB-Inc/47b25792-b016-4157-8cf2-ee534b3a2fad' (unopened)
    'varl': 'tiledb://TileDB-Inc/ddd4ae97-9ef3-4579-a0e4-3fbb821134a3' (unopened)>

Create and Access Coordinate Spaces

A coordinate space defines the frame of reference for your spatial data. Think of it as the “grid” or “axes” upon which your data points, images, or shapes are positioned. Each coordinate space has a set of named axes, and each axis has a defined unit of measurement. This information is stored as metadata and is distinct from the coordinate values of any data positioned in that coordinate space.

This section demonstrates how to create coordinate spaces in TileDB-SOMA.

The most common scenario is a 2D space, often representing the X and Y dimensions of an image or tissue section. A 2D space with axes named “x” and “y”, using “pixels” as the unit, can be created as follows:

  • Python
coords2d = tiledbsoma.CoordinateSpace(
    axes=[
        tiledbsoma.Axis(name="x", unit="pixels"),
        tiledbsoma.Axis(name="y", unit="pixels"),
    ]
)

coords2d
CoordinateSpace(axes=(Axis(name='x', unit='pixels'), Axis(name='y', unit='pixels')))

The coordinate space defines the meaning of the coordinates, but it doesn’t store the coordinate values themselves. The actual values are stored within the spatial objects like PointCloudDataFrame, GeometryDataFrame, or DenseNDArray within a MultiscaleImage.
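
To make the separation between metadata and values concrete, the following sketch uses plain Python stand-ins rather than the tiledbsoma classes. The names `SketchAxis` and `SketchCoordinateSpace` are hypothetical and exist only for illustration. The point is that the space records axis names and units, while the coordinate values live elsewhere.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SketchAxis:
    """Hypothetical stand-in for an axis: a name and an optional unit."""

    name: str
    unit: Optional[str] = None


@dataclass(frozen=True)
class SketchCoordinateSpace:
    """Hypothetical stand-in for a coordinate space: an ordered set of axes."""

    axes: Tuple[SketchAxis, ...]

    @property
    def axis_names(self) -> Tuple[str, ...]:
        return tuple(axis.name for axis in self.axes)


# The space is metadata only: it names the axes and their units.
space = SketchCoordinateSpace(
    axes=(SketchAxis("x", "pixels"), SketchAxis("y", "pixels"))
)

# The coordinate values are stored separately, keyed by the same axis names.
points = {"x": [10, 20, 15], "y": [5, 5, 12]}

# Every axis in the space should have a matching column of values.
missing = [name for name in space.axis_names if name not in points]
print(space.axis_names, missing)
```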

The coordinate spaces are stored as metadata on the underlying TileDB groups and arrays. They are accessible via the coordinate_space attribute of any SOMA spatial object. The coordinate space of the scene in the experiment can be accessed as follows:

  • Python
scene.coordinate_space
CoordinateSpace(axes=(Axis(name='x', unit='pixels'), Axis(name='y', unit='pixels')))

Create Spatial Objects with Coordinate Spaces

Now that you know how to define coordinate spaces, this section shows how to associate them with the actual spatial data objects in TileDB-SOMA. You will primarily use the PointCloudDataFrame for these examples, as it’s the most straightforward to visualize, but the same principles apply to GeometryDataFrame and MultiscaleImage.

A small PointCloudDataFrame can be created to represent a few points in a 2D space. The coords2d object defined earlier (with “x” and “y” axes in pixels) will be used.

  • Python
tbl = pa.Table.from_pydict(
    {
        "x": pa.array([10, 20, 15]),
        "y": pa.array([5, 5, 12]),
        "intensity": pa.array([0.8, 0.2, 0.9]),
        "soma_joinid": pa.array([0, 1, 2]),
    }
)


with tiledbsoma.PointCloudDataFrame.create(
    uri=tempfile.mkdtemp(prefix="tiledb-soma-pcdf-"),
    schema=tbl.schema,
    coordinate_space=coords2d,
    domain=[[0, 20], [0, 20], [0, 2]],
) as pcdf:
    pcdf.write(tbl)

pcdf
<PointCloudDataFrame '/tmp/tiledb-soma-pcdf-qa37umnb' (CLOSED for 'w')>

Now open the PointCloudDataFrame and access the coordinate space information:

  • Python
with tiledbsoma.PointCloudDataFrame.open(pcdf.uri, mode="r") as pcdf:
    print(pcdf.coordinate_space)
CoordinateSpace(axes=(Axis(name='x', unit='pixels'), Axis(name='y', unit='pixels')))

The output confirms that the PointCloudDataFrame is associated with the 2D coordinate space defined earlier.
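
The domain passed to create() bounds the values that each index column may hold. The sketch below checks the example table against that domain in plain Python. It assumes that the three domain entries correspond to x, y, and soma_joinid in that order and that the bounds are inclusive. Both assumptions match the values used above but are not stated in the text, and `out_of_domain` is a hypothetical helper.

```python
# Values from the example table above.
columns = {
    "x": [10, 20, 15],
    "y": [5, 5, 12],
    "soma_joinid": [0, 1, 2],
}

# Assumed pairing of domain entries with columns (x, y, soma_joinid).
domain = {"x": (0, 20), "y": (0, 20), "soma_joinid": (0, 2)}


def out_of_domain(values, bounds):
    """Return the values that fall outside the inclusive (low, high) bounds."""
    low, high = bounds
    return [v for v in values if not (low <= v <= high)]


# An empty list for every column means the table fits inside the domain.
violations = {
    name: out_of_domain(values, domain[name]) for name, values in columns.items()
}
print(violations)
```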

Apply Coordinate Transforms

Coordinate transforms are essential for relating data that exists in different coordinate spaces. A coordinate transform defines a mapping from one coordinate space to another. In TileDB-SOMA, transforms are applied to data when you read it. They are not stored as part of the data itself. This allows the data to be written in its native coordinate system but read in other coordinate systems.

To see this in practice, you will project both the Visium spots and the image to a common coordinate space. First, retrieve the coordinate space of the SOMAScene accessed earlier:

  • Python
scene.coordinate_space
CoordinateSpace(axes=(Axis(name='x', unit='pixels'), Axis(name='y', unit='pixels')))

Retrieve the coordinate transform from the scene’s coordinate space to the PointCloudDataFrame’s coordinate space.

  • Python
scene.get_transform_to_point_cloud_dataframe("loc")
<IdentityTransform
  input axes: ('x', 'y')
  output axes: ('x', 'y')>

This returns an IdentityTransform, because the PointCloudDataFrame’s coordinate space is the same as the scene’s coordinate space.

Next, retrieve the coordinate transform from the scene’s coordinate space to the MultiscaleImage’s coordinate space.

  • Python
fullres_to_image = scene.get_transform_to_multiscale_image("tissue")
fullres_to_image
<ScaleTransform
  input axes: ('x', 'y')
  output axes: ('x', 'y')
  scales: [0.07127211 0.07127076]>

Here, TileDB-SOMA returns a ScaleTransform, because the MultiscaleImage is downsampled relative to the full-resolution image used to map the Visium spots.
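
Both kinds of transform seen so far can be pictured as matrices acting on homogeneous coordinates. The following NumPy sketch illustrates that view. It is not the library's implementation: `apply_augmented` is a hypothetical helper, and the example point is arbitrary. The scale factors are copied from the output above.

```python
import numpy as np


def apply_augmented(matrix, points):
    """Apply a 3x3 augmented (homogeneous) matrix to an (n, 2) array of points."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.column_stack([pts, np.ones(len(pts))])
    return (homogeneous @ np.asarray(matrix).T)[:, :2]


# Identity: coordinates are left unchanged.
identity = np.eye(3)

# Scale: each axis is multiplied by its own factor.
scales = np.array([0.07127211, 0.07127076])
scale = np.diag([scales[0], scales[1], 1.0])

# An arbitrary example point in the scene's full-resolution space.
point = np.array([[12000.0, 7500.0]])

unchanged = apply_augmented(identity, point)
scaled = apply_augmented(scale, point)
print(unchanged, scaled)
```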

You can use these transforms to project the Visium spots onto a region of interest in the high-resolution image. This will require you to do the following:

  • Read the region from the high-resolution level of the MultiscaleImage.
  • Read the same region from the PointCloudDataFrame storing the spot locations.
  • Use the transformation from the MultiscaleImage to scale and align the spot locations with the image.

First, specify the region of interest that encompasses the hippocampus.

  • Python
x_min, x_max = (12000, 19000)
y_min, y_max = (7500, 11500)

hippo_region = [x_min, y_min, x_max, y_max]

Use the bounding box to perform a spatial query on the loc dataframe to get the spots within the region of interest.

  • Python
region_spots = (
    scene.obsl["loc"].read_spatial_region(region=hippo_region).data.concat().to_pandas()
)

region_spots
         x      y  soma_joinid  in_tissue  array_row  array_col  spot_diameter_fullres
0    12108   7544          459          1         54         68             255.860716
1    12126   8180          220          1         52         68             255.860716
2    12145   8816          875          1         50         68             255.860716
3    12164   9452          729          1         48         68             255.860716
4    12182  10087         2072          1         46         68             255.860716
...    ...    ...          ...        ...        ...        ...                    ...
223  18932   9572          290          1         47        105             255.860716
224  18950  10208          285          1         45        105             255.860716
225  18777  10531         1569          1         44        104             255.860716
226  18988  11479          102          1         41        105             255.860716
227  18604  10854         2102          1         43        103             255.860716

228 rows × 7 columns
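
Conceptually, the spatial query keeps the points whose coordinates fall inside the bounding box. A rough NumPy equivalent is sketched below, using the first three spots from the output above plus one made-up point outside the region. Whether the library treats the box edges as inclusive is an assumption here.

```python
import numpy as np

x_min, x_max = 12000, 19000
y_min, y_max = 7500, 11500

# The first three rows are spots from the output above.
# The last row is a hypothetical point outside the region.
xy = np.array(
    [
        [12108, 7544],
        [12126, 8180],
        [12145, 8816],
        [20000, 7000],
    ]
)

# Keep points that lie within the box on both axes (edges assumed inclusive).
inside = (
    (xy[:, 0] >= x_min)
    & (xy[:, 0] <= x_max)
    & (xy[:, 1] >= y_min)
    & (xy[:, 1] <= y_max)
)
selected = xy[inside]
print(selected)
```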

Now, use the read_spatial_region() method to load only the part of the image that contains the hippocampus. When you pass the ScaleTransform object to the region_transform argument, the method automatically scales the selected region to the downsampled resolution.

  • Python
hires_read = scene.img["tissue"].read_spatial_region(
    level=0, region=hippo_region, region_transform=fullres_to_image
)
hires_read
SpatialRead(data=<pyarrow.Tensor>
type: uint8
shape: (287, 501, 3)
strides: (1503, 3, 1), data_coordinate_space=CoordinateSpace(axes=(Axis(name='x', unit='pixels'), Axis(name='y', unit='pixels'))), output_coordinate_space=CoordinateSpace(axes=(Axis(name='x', unit=None), Axis(name='y', unit=None))), coordinate_transform=<AffineTransform
  input axes: ('x', 'y')
  output axes: ('x', 'y')
  augmented matrix:
    [[1.40307329e+01 0.00000000e+00 1.19962766e+04]
     [0.00000000e+00 1.40310000e+01 7.49255400e+03]
     [0.00000000e+00 0.00000000e+00 1.00000000e+00]]>)

This returns a SpatialRead object that includes both the image data and the coordinate transform that scales and translates the returned data back to the coordinate space you queried. You can use the inverse of this transformation to convert the spot data into the coordinate space of the image data. Under the hood, the transforms are accessible directly as matrices:

  • Python
spot_to_hires_matrix = (
    hires_read.coordinate_transform.inverse_transform().augmented_matrix
)
spot_to_hires_matrix
array([[ 7.12721146e-02,  0.00000000e+00, -8.55000000e+02],
       [ 0.00000000e+00,  7.12707576e-02, -5.34000000e+02],
       [ 0.00000000e+00,  0.00000000e+00,  1.00000000e+00]])

You can use this matrix to manually project the data from the spot dataframe to the hires image’s resolution.

  • Python
scale_x = spot_to_hires_matrix[0, 0]
scale_y = spot_to_hires_matrix[1, 1]
offset_x = spot_to_hires_matrix[0, 2]
offset_y = spot_to_hires_matrix[1, 2]
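
The inverse can also be reproduced with NumPy as a cross-check. The sketch below copies the augmented matrix printed with the SpatialRead above, inverts it, and projects the first spot from region_spots into the pixel grid of the cropped image. It illustrates the arithmetic only and is not the library's implementation. The variable names are chosen for this example.

```python
import numpy as np

# Augmented matrix of the image-to-scene transform, copied from the output above.
image_to_scene = np.array(
    [
        [14.0307329, 0.0, 11996.2766],
        [0.0, 14.031, 7492.554],
        [0.0, 0.0, 1.0],
    ]
)

# Inverting the matrix gives the scene-to-image mapping.
scene_to_image = np.linalg.inv(image_to_scene)

# First spot from region_spots, in homogeneous coordinates (x, y, 1).
spot = np.array([12108.0, 7544.0, 1.0])
pixel_x, pixel_y, _ = scene_to_image @ spot

print(scene_to_image.round(6))
print(round(pixel_x, 2), round(pixel_y, 2))
```

The projected position should fall inside the (287, 501) pixel grid of the image read earlier, which is a quick sanity check that the scale and offsets were applied in the right direction.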

And now, you can plot the spots and image data together.

  • Python
radius = scene.obsl["loc"].metadata["soma_geometry"]
spot_patches = PatchCollection(
    [
        mplp.Ellipse(
            (scale_x * row["x"] + offset_x, scale_y * row["y"] + offset_y),
            # Ellipse takes the full width and height, so double the radius.
            width=2 * radius * scale_x,
            height=2 * radius * scale_y,
            fill=False,
            alpha=0.8,
        )
        for _, row in region_spots.iterrows()
    ]
)

fig, ax = plt.subplots()
ax.imshow(hires_read.data)
ax.add_collection(spot_patches)

plt.show()
