Preprocessing

Overview

The goal of preprocessing is to prepare a single-cell dataframe that can be provided as input to the Analysis class.

A processed single-cell dataframe will hold the following columns:

  • x (float) and y (float): spatial coordinates of the cell in the tissue. Coordinates are expected in standard units (meters), e.g. 1 micron = 1e-6.

  • division (bool): a binary label that marks a cell as “currently dividing”.

  • cell_type (str): the cell type (e.g. “Fibroblast”).

  • img_num (int, optional): identifier of the tissue sample.

  • subject_id (int | str, optional): identifier of the subject (patient).
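As a rough illustration of the column layout above, the following sketch builds a small dataframe with pandas. The raw column names (x_um, y_um), the sample values and the MICRON constant are assumptions made for this example; only the output column names and types come from the list above.

```python
import pandas as pd

# Hypothetical raw measurements; coordinates are recorded in microns here.
raw = pd.DataFrame({
    "x_um": [10.5, 250.0, 31.2],
    "y_um": [5.0, 120.3, 77.7],
    "cell_type": ["Fibroblast", "Macrophage", "Fibroblast"],
    "division": [0, 1, 0],
    "img_num": [0, 0, 1],
    "subject_id": ["P01", "P01", "P02"],
})

MICRON = 1e-6  # 1 micron expressed in standard units (meters)

# Assemble the columns listed above with the expected types.
single_cell_df = pd.DataFrame({
    "x": raw["x_um"] * MICRON,                  # float, meters
    "y": raw["y_um"] * MICRON,                  # float, meters
    "division": raw["division"].astype(bool),   # binary division label
    "cell_type": raw["cell_type"].astype(str),  # e.g. "Fibroblast"
    "img_num": raw["img_num"].astype(int),      # optional: tissue sample id
    "subject_id": raw["subject_id"],            # optional: subject (patient) id
})
```

Converting units once, at this stage, keeps every later step working in the same coordinate system.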

Preparing the single-cell dataframe

check_single_cell_df

Checks that single_cell_df is preprocessed correctly and provides hints in case it isn't.
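To give a feel for the kind of check involved, here is a minimal stand-alone sketch of a schema check that returns hints. The helper find_schema_problems is hypothetical and is not the implementation of check_single_cell_df; it only tests the required columns and types described in the overview.

```python
import pandas as pd
from pandas.api.types import is_bool_dtype, is_float_dtype

REQUIRED = ("x", "y", "division", "cell_type")


def find_schema_problems(df: pd.DataFrame) -> list[str]:
    """Hypothetical helper: return human-readable hints (empty if none)."""
    problems = []
    # Every required column must be present.
    for col in REQUIRED:
        if col not in df.columns:
            problems.append(f"missing required column '{col}'")
    # Coordinates must be floats.
    for col in ("x", "y"):
        if col in df.columns and not is_float_dtype(df[col]):
            problems.append(f"column '{col}' should be float")
    # The division label must be boolean, not 0/1 integers.
    if "division" in df.columns and not is_bool_dtype(df["division"]):
        problems.append("column 'division' should be bool; try .astype(bool)")
    return problems


# A dataframe with an integer division label and no cell_type column.
bad = pd.DataFrame({"x": [1e-6], "y": [2e-6], "division": [1]})
hints = find_schema_problems(bad)
```

Returning a list of hints rather than raising on the first problem lets all issues be fixed in one pass.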

Defining cell-division events

The tdm.preprocess.ki67 and tdm.plot.preprocess.ki67 modules provide tools for defining cell-division events based on raw Ki67 measurements. See Tutorial 1 for a detailed walk-through.

Find the background noise level

plot_marker_distributions

Plot the distribution of a marker over multiple cell types.

transform_ki67

Return a single-cell dataframe with standardized Ki67 values above noise. The transformed distributions should have similar shapes.
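The idea of standardizing values above a noise level can be sketched with plain pandas. Everything in this example is an assumption made for illustration: the ki67 column name, the NOISE_LEVEL value, and the choice of a log transform followed by a per-cell-type z-score. It is not the transformation that transform_ki67 applies.

```python
import numpy as np
import pandas as pd

# Hypothetical raw Ki67 intensities for two cell types.
cells = pd.DataFrame({
    "cell_type": ["Fibroblast"] * 4 + ["Macrophage"] * 4,
    "ki67": [0.1, 0.6, 1.2, 2.4, 0.3, 1.0, 4.0, 16.0],
})

NOISE_LEVEL = 0.5  # assumed background noise level, chosen for illustration

# Keep only measurements above the noise level.
above = cells[cells["ki67"] > NOISE_LEVEL].copy()

# Log-transform, then z-score within each cell type so the
# per-type distributions end up on a comparable scale.
above["log_ki67"] = np.log(above["ki67"])
grouped = above.groupby("cell_type")["log_ki67"]
above["ki67_std"] = (
    above["log_ki67"] - grouped.transform("mean")
) / grouped.transform("std")
```

After a per-type standardization like this one, each cell type has mean 0 and unit standard deviation, which makes it easier to compare shapes across types.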

Define binary cell-division labels

is_dividing

Compute a binary division label for each cell.
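A binary label of this kind can be illustrated as a simple threshold on a transformed marker value. The ki67 column name, the sample values and the THRESHOLD constant below are hypothetical, and the sketch does not reproduce the signature or the rule used by is_dividing.

```python
import pandas as pd

# Hypothetical transformed Ki67 values (e.g. after standardization).
cells = pd.DataFrame({
    "cell_type": ["Fibroblast", "Fibroblast", "Macrophage", "Macrophage"],
    "ki67": [-0.4, 1.3, 0.2, 2.1],
})

THRESHOLD = 0.5  # illustrative cutoff, not a recommended value

# A cell is labeled as dividing when its value exceeds the cutoff.
cells["division"] = cells["ki67"] > THRESHOLD
```

The resulting column is already of bool type, matching the division column described in the overview.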

Quality control

plot_fraction_of_dividing_cells

Display the fraction of dividing cells of each type.

plot_divisions_per_image

Display the fraction of dividing cells from each image.
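The quantities behind both quality-control plots can be computed directly with a pandas groupby, which is useful for a quick numeric check before plotting. This is a sketch on made-up data using the column names from the overview; it does not call the plotting functions above.

```python
import pandas as pd

# Small made-up single-cell dataframe.
single_cell_df = pd.DataFrame({
    "cell_type": ["Fibroblast"] * 4 + ["Macrophage"] * 2,
    "img_num": [0, 0, 1, 1, 0, 1],
    "division": [True, False, False, False, True, True],
})

# The mean of a boolean column is the fraction of True values.
fraction_by_type = single_cell_df.groupby("cell_type")["division"].mean()
fraction_by_image = single_cell_df.groupby("img_num")["division"].mean()
```

A cell type or image whose fraction is far from the rest is a natural candidate for closer inspection.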