tdm.dataset.Dataset

class tdm.dataset.Dataset

Base class for all datasets.

A dataset maps each cell type to its features and observations (labels).

Note

A dataset is typically constructed based on one of the following sources:
  • Tissue:

    Used for direct computations on tissue cells, such as counting neighbors (see: NeighborsDataset)

  • Dataset:

    Typically used for transforming features (see: PolynomialDataset)

  • A list of Datasets:

    Used for combining datasets (see: ConcatDataset)

__init__() → None

Initializes the Dataset with a dictionary mapping cell type to features and obs.

dataset_dict:
  • key:
    • cell_type (str)

  • value:
    • features: dataframe with shape (n_cells, n_features)

    • observations: dataframe with shape (n_cells, 2) holding observations. columns: division, death
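
As a minimal sketch of the layout described above, the following builds a dictionary of this shape with pandas. The cell type names `type_a` and `type_b`, the feature column names, and the use of a `(features, obs)` tuple as the value are illustrative assumptions, not taken from the library.

```python
import pandas as pd

# Hypothetical cell type names; real values come from
# tdm.tissue.cell_types.CELL_TYPES_ARRAY.
features_a = pd.DataFrame({"x0": [0.0, 1.0, 2.0], "x1": [3.0, 4.0, 5.0]})
obs_a = pd.DataFrame({"division": [1, 0, 0], "death": [0, 0, 1]})

features_b = pd.DataFrame({"x0": [6.0, 7.0], "x1": [8.0, 9.0]})
obs_b = pd.DataFrame({"division": [0, 1], "death": [0, 0]})

# Assumed value layout: a (features, observations) pair per cell type.
dataset_dict = {
    "type_a": (features_a, obs_a),
    "type_b": (features_b, obs_b),
}

for cell_type, (features, obs) in dataset_dict.items():
    # Features and observations describe the same cells, row for row.
    assert features.shape[0] == obs.shape[0]
    assert list(obs.columns) == ["division", "death"]
    print(cell_type, features.shape, obs.shape)
```

Each row of `features` and the row at the same position in `obs` refer to the same cell, which is why both dataframes share `n_cells` as their first dimension.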

cell_types() → list[str]

Returns the cell types in the dataset.

See: tdm.tissue.cell_types.CELL_TYPES_ARRAY for possible values.

construct_features_from_counts(cell_counts: dict[str, float | Sequence[float]], target_cell: str, **kwargs) → DataFrame

Constructs features compatible with construct_polynomial_features. Input is in raw values!

fetch(cell_type: str) → tuple[DataFrame, DataFrame]

Returns the features and observations associated with a cell type.

Parameters:

cell_type – a str from tdm.tissue.cell_types.CELL_TYPES_ARRAY

Returns:

  • features: dataframe with shape (n_cells, n_features)

  • observations: dataframe with shape (n_cells, 2) holding observations. columns: division, death

Return type:

features, observations (tuple)

fetch_all() → tuple[DataFrame, DataFrame]

Returns features and observations from all cell types, concatenated.

Returns:

  • features: dataframe with shape (n_cells, n_features)

  • observations: dataframe with shape (n_cells, 2) holding observations. columns: division, death

Return type:

features, observations (tuple)
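
The per-cell-type lookup of fetch and the concatenation of fetch_all can be illustrated on plain pandas objects. This is a sketch under stated assumptions, not the library's implementation: `fetch_sketch`, `fetch_all_sketch`, and the cell type names are hypothetical, and resetting the row index with `ignore_index=True` is a choice made here for clarity.

```python
import pandas as pd

# Hypothetical data in the documented layout: cell type -> (features, obs).
data = {
    "type_a": (
        pd.DataFrame({"x0": [0.0, 1.0], "x1": [2.0, 3.0]}),
        pd.DataFrame({"division": [1, 0], "death": [0, 1]}),
    ),
    "type_b": (
        pd.DataFrame({"x0": [4.0], "x1": [5.0]}),
        pd.DataFrame({"division": [0], "death": [0]}),
    ),
}


def fetch_sketch(cell_type):
    """Stand-in for fetch: return one cell type's (features, obs) pair."""
    return data[cell_type]


def fetch_all_sketch():
    """Stand-in for fetch_all: stack the rows of every cell type."""
    features = pd.concat([f for f, _ in data.values()], ignore_index=True)
    obs = pd.concat([o for _, o in data.values()], ignore_index=True)
    return features, obs


features_a, obs_a = fetch_sketch("type_a")
all_features, all_obs = fetch_all_sketch()
print(features_a.shape, all_features.shape)  # (2, 2) (3, 2)
```

The concatenated result has one row per cell across all cell types, so its row count equals the sum of the per-type row counts.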

n_cells(cell_type: str | None = None) → int

Returns the number of cells of cell_type in the dataset, or the total across all cell types if cell_type is None.

n_features() → int

Returns the number of features in the dataset.

Warning

Fails if cell types have different numbers of features.
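
The consistency requirement behind this warning can be sketched as a standalone check. The helper `n_features_sketch` is hypothetical, and raising `ValueError` is an assumption made for illustration; the source does not say how the real method fails.

```python
import pandas as pd


def n_features_sketch(features_by_type):
    """Return the shared feature count, or raise if cell types disagree."""
    counts = {ct: f.shape[1] for ct, f in features_by_type.items()}
    distinct = set(counts.values())
    if len(distinct) != 1:
        raise ValueError(f"inconsistent feature counts: {counts}")
    return distinct.pop()


# Hypothetical cell types with matching and mismatching feature columns.
consistent = {
    "type_a": pd.DataFrame({"x0": [0.0], "x1": [1.0]}),
    "type_b": pd.DataFrame({"x0": [2.0], "x1": [3.0]}),
}
mismatched = {
    "type_a": pd.DataFrame({"x0": [0.0], "x1": [1.0]}),
    "type_b": pd.DataFrame({"x0": [2.0]}),
}

print(n_features_sketch(consistent))  # 2
try:
    n_features_sketch(mismatched)
except ValueError as err:
    print("failed as expected:", err)
```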

n_obs(cell_type: str, obs: Literal['division', 'death']) → int

Returns the number of division or death events (selected by obs) for cell_type.
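
The counts returned by n_cells and n_obs can be pictured on a single observations dataframe. This sketch assumes the division and death columns are 0/1 indicators with one row per cell; the source does not state the encoding, so treat that as an assumption.

```python
import pandas as pd

# Hypothetical observations for one cell type; 1 marks an event, 0 none.
obs = pd.DataFrame({"division": [1, 0, 1, 0], "death": [0, 0, 0, 1]})

n_cells = len(obs)                         # one row per cell
n_divisions = int(obs["division"].sum())   # rows flagged as division
n_deaths = int(obs["death"].sum())         # rows flagged as death

print(n_cells, n_divisions, n_deaths)  # 4 2 1
```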

set_dataset(cell_type: str, features: DataFrame, obs: DataFrame)

Manually write the features and obs for a cell type.

Parameters:
  • cell_type (str) – string identifier of a cell type.

  • features (pd.DataFrame) – dataframe with shape (n_cells, n_features)

  • obs (pd.DataFrame) – dataframe with shape (n_cells, 2) holding observations. columns: division, death
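
Because the features and obs passed here are written manually, it can help to check them against the documented shapes first. The helper `check_pair` below is hypothetical and not part of the library; it is a minimal sketch of such a check, assuming column order in obs does not matter.

```python
import pandas as pd


def check_pair(features: pd.DataFrame, obs: pd.DataFrame) -> None:
    """Check a (features, obs) pair against the documented shapes."""
    if len(features) != len(obs):
        raise ValueError("features and obs must have the same number of rows")
    if sorted(obs.columns) != ["death", "division"]:
        raise ValueError("obs must have exactly the columns: division, death")


# Hypothetical data for a single cell type.
features = pd.DataFrame({"x0": [0.1, 0.2, 0.3]})
obs = pd.DataFrame({"division": [0, 1, 0], "death": [0, 0, 0]})

check_pair(features, obs)  # passes silently

try:
    check_pair(features, obs.iloc[:2])  # one row short
except ValueError as err:
    print("rejected:", err)
```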