tdm.dataset.Dataset
- class tdm.dataset.Dataset
Base class for all datasets.
A dataset maps each cell type to its features and observations (labels).
Note
- A dataset is typically constructed based on one of the following sources:
- Tissue:
Used for direct computations on tissue cells, such as counting neighbors (see: NeighborsDataset)
- Dataset:
Typically used for transforming features (see: PolynomialDataset)
- A list of Datasets:
Used for combining datasets (see: ConcatDataset)
- __init__() → None
Initializes the Dataset with a dictionary mapping each cell type to its features and observations.
- dataset_dict:
- key:
cell_type (str)
- value:
features: dataframe with shape (n_cells, n_features)
observations: dataframe with shape (n_cells, 2) whose columns are division and death
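The layout described above can be sketched with plain pandas objects. This is only an illustration of the documented shapes: the cell type keys, the feature column names, and the choice of a (features, observations) tuple as the dictionary value are assumptions, not tdm's confirmed internal representation.

```python
import pandas as pd

# Hypothetical cell types ("F", "M") and feature columns, for illustration only.
features_f = pd.DataFrame({"n_F": [3, 5, 2], "n_M": [1, 0, 4]})
obs_f = pd.DataFrame({"division": [1, 0, 0], "death": [0, 0, 1]})

features_m = pd.DataFrame({"n_F": [2, 6], "n_M": [3, 1]})
obs_m = pd.DataFrame({"division": [0, 1], "death": [0, 0]})

# Assumed layout: one (features, observations) pair per cell type.
dataset_dict = {
    "F": (features_f, obs_f),
    "M": (features_m, obs_m),
}

for cell_type, (features, obs) in dataset_dict.items():
    # Both frames describe the same cells, so their row counts must match.
    assert len(features) == len(obs)
    # Observations have exactly the two documented columns.
    assert list(obs.columns) == ["division", "death"]
```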
- cell_types() → list[str]
Returns the cell types in the dataset.
See: tdm.tissue.cell_types.CELL_TYPES_ARRAY for possible values.
- construct_features_from_counts(cell_counts: dict[str, float | Sequence[float]], target_cell: str, **kwargs) → DataFrame
Constructs features compatible with construct_polynomial_features. Note that the input is given in raw values.
- fetch(cell_type: str) → tuple[DataFrame, DataFrame]
Returns the features and observations associated with a cell type.
- Parameters:
cell_type – a str from tdm.tissue.cell_types.CELL_TYPES_ARRAY
- Returns:
features: dataframe with shape (n_cells, n_features)
observations: dataframe with shape (n_cells, 2) whose columns are division and death
- Return type:
tuple of (features, observations)
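A minimal sketch of how the returned pair might be consumed. ToyDataset is a hypothetical stand-in that mirrors only the documented fetch() contract; it is not tdm's implementation. The example also assumes that division and death are stored as 0/1 indicators, which the documentation does not state.

```python
import pandas as pd


class ToyDataset:
    """Hypothetical stand-in mirroring the documented fetch() contract."""

    def __init__(self, dataset_dict):
        self.dataset_dict = dataset_dict

    def fetch(self, cell_type):
        # Return the (features, observations) pair stored for this cell type.
        features, obs = self.dataset_dict[cell_type]
        return features, obs


ds = ToyDataset(
    {
        "F": (
            pd.DataFrame({"n_F": [3, 5, 2, 4]}),
            pd.DataFrame({"division": [1, 0, 0, 1], "death": [0, 0, 1, 0]}),
        )
    }
)

features, obs = ds.fetch("F")

# Assuming 0/1 indicators, the column mean is the fraction of dividing cells.
division_rate = obs["division"].mean()
```

Unpacking the tuple directly keeps the features and observations row-aligned, which is convenient when passing them on as model inputs and targets.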
- fetch_all() → tuple[DataFrame, DataFrame]
Returns features and observations from all cell types, concatenated.
- Returns:
features: dataframe with shape (n_cells, n_features)
observations: dataframe with shape (n_cells, 2) whose columns are division and death
- Return type:
tuple of (features, observations)
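The concatenation can be sketched with pandas.concat over per-cell-type frames. The data and the per_type name are made up for illustration, and resetting the row index with ignore_index=True is an assumption; the documentation does not say how fetch_all handles the index.

```python
import pandas as pd

# Hypothetical per-cell-type (features, observations) pairs.
per_type = {
    "F": (
        pd.DataFrame({"x": [1.0, 2.0]}),
        pd.DataFrame({"division": [1, 0], "death": [0, 0]}),
    ),
    "M": (
        pd.DataFrame({"x": [3.0]}),
        pd.DataFrame({"division": [0], "death": [1]}),
    ),
}

# Stack rows from every cell type, in the same order for both frames so that
# row i of the features still corresponds to row i of the observations.
all_features = pd.concat([f for f, _ in per_type.values()], ignore_index=True)
all_obs = pd.concat([o for _, o in per_type.values()], ignore_index=True)
```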
- n_cells(cell_type: str | None = None) → int
Returns the number of cells of the given cell_type in the dataset, or the total across all cell types if cell_type is None.
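The two modes of the count can be sketched as a free function over a plain dictionary. The function and the per_type dictionary are hypothetical stand-ins that follow the documented behaviour, not tdm's code.

```python
import pandas as pd


def n_cells(per_type, cell_type=None):
    """Count rows for one cell type, or across all types when cell_type is None."""
    if cell_type is None:
        return sum(len(features) for features, _ in per_type.values())
    features, _ = per_type[cell_type]
    return len(features)


# Hypothetical data: three cells of type "F" and two of type "M".
per_type = {
    "F": (
        pd.DataFrame({"x": [1, 2, 3]}),
        pd.DataFrame({"division": [0, 1, 0], "death": [0, 0, 0]}),
    ),
    "M": (
        pd.DataFrame({"x": [4, 5]}),
        pd.DataFrame({"division": [0, 0], "death": [1, 0]}),
    ),
}

count_f = n_cells(per_type, "F")
count_all = n_cells(per_type)
```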
- n_features() → int
Returns the number of features in the dataset.
Warning
Fails if different cell types have different numbers of features.
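The consistency requirement behind this warning can be sketched as follows. How the real method fails is not documented; raising ValueError here is an assumption, as are the function and variable names.

```python
import pandas as pd


def n_features(per_type):
    """Return the shared feature count, or raise if cell types disagree."""
    widths = {
        cell_type: features.shape[1]
        for cell_type, (features, _) in per_type.items()
    }
    distinct = set(widths.values())
    if len(distinct) != 1:
        # Assumed failure mode; the real method may fail differently.
        raise ValueError(f"inconsistent feature counts: {widths}")
    return distinct.pop()


obs = pd.DataFrame({"division": [0], "death": [0]})

# Both cell types have two feature columns.
consistent = {
    "F": (pd.DataFrame({"a": [1], "b": [2]}), obs),
    "M": (pd.DataFrame({"a": [3], "b": [4]}), obs),
}

# "M" has only one feature column, so no single count exists.
inconsistent = {
    "F": (pd.DataFrame({"a": [1], "b": [2]}), obs),
    "M": (pd.DataFrame({"a": [3]}), obs),
}
```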
- n_obs(cell_type: str, obs: Literal['division', 'death']) → int
Returns the number of division or death events for the given cell type.
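A minimal sketch of the event count, assuming that the division and death columns hold 0/1 indicators so that a column sum equals the number of events. That encoding is an assumption, and the function below is a hypothetical stand-in rather than tdm's implementation.

```python
import pandas as pd


def n_obs(per_type, cell_type, obs):
    """Count events of one kind for one cell type (assumes 0/1 indicators)."""
    if obs not in ("division", "death"):
        raise ValueError(f"obs must be 'division' or 'death', got {obs!r}")
    _, observations = per_type[cell_type]
    return int(observations[obs].sum())


# Hypothetical data: four cells of type "F".
per_type = {
    "F": (
        pd.DataFrame({"x": [1, 2, 3, 4]}),
        pd.DataFrame({"division": [1, 0, 1, 0], "death": [0, 0, 0, 1]}),
    )
}

divisions = n_obs(per_type, "F", "division")
deaths = n_obs(per_type, "F", "death")
```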
- set_dataset(cell_type: str, features: DataFrame, obs: DataFrame)
Manually sets the features and observations for a cell type.
- Parameters:
cell_type (str) – string identifier of a cell type.
features (pd.DataFrame) – dataframe with shape (n_cells, n_features)
obs (pd.DataFrame) – dataframe with shape (n_cells, 2) whose columns are division and death