tdm.analysis.Analysis

class tdm.analysis.Analysis(single_cell_df: DataFrame, cell_types_to_model: list[str] | None = None, neighborhood_size: float = 7.999999999999999e-05, neighborhood_mode: Literal['exclude', 'extrapolate'] = 'exclude', nds_class_kwargs: dict | None = None, allowed_neighbor_types: list[str] | None = None, enforce_max_density: bool = True, max_density_enforcer_power: int = 8, polynomial_dataset_kwargs: dict | None = None, model_class: type[Model] = type[tdm.model.logistic_regression.LogisticRegressionModel], model_kwargs: dict | None = None, end_phase: int = 5, supported_cell_types: list[str] | None = None, verbose: bool = False)
__init__(single_cell_df: DataFrame, cell_types_to_model: list[str] | None = None, neighborhood_size: float = 7.999999999999999e-05, neighborhood_mode: Literal['exclude', 'extrapolate'] = 'exclude', nds_class_kwargs: dict | None = None, allowed_neighbor_types: list[str] | None = None, enforce_max_density: bool = True, max_density_enforcer_power: int = 8, polynomial_dataset_kwargs: dict | None = None, model_class: type[Model] = type[tdm.model.logistic_regression.LogisticRegressionModel], model_kwargs: dict | None = None, end_phase: int = 5, supported_cell_types: list[str] | None = None, verbose: bool = False)

An analysis organizes and performs all of the steps required to fit a model to the data in single_cell_df.

Parameters:
  • single_cell_df (pd.DataFrame) – the single cell dataframe (see Preprocessing)

  • cell_types_to_model (list[str] | None, optional) – defines the axes of the state-space of the dynamical model. Defaults to all cell types.

  • neighborhood_size (float, optional) – radius of the neighborhood in which cells of each type are counted. Defaults to _80_microns (7.999999999999999e-05 in the signature above).

  • neighborhood_mode (Literal["exclude", "extrapolate"], optional) – how to handle cells whose neighborhood exceeds the tissue limits: “exclude” drops these cells, “extrapolate” corrects their counts for the unobserved fraction of the neighborhood. Defaults to “exclude”.

  • nds_class_kwargs (dict | None, optional) – keywords for the neighbors dataset class. Defaults to None.

  • allowed_neighbor_types (list[str] | None, optional) – exclude cells with neighbors outside this list. Defaults to None.

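  • xlim (tuple[float, float], optional) – x limits for plots. Defaults to the maximal density of the first cell type. Not listed in the constructor signature above; see the xlim property.

  • ylim (tuple[float, float], optional) – y limits for plots. Defaults to the maximal density of the second cell type. Not listed in the constructor signature above; see the ylim property.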

  • enforce_max_density (bool, optional) – whether to add a correction so that there is no net growth at maximal density. Defaults to True.

  • max_density_enforcer_power (int, optional) – power used by the maximal-density correction; higher powers produce smaller corrections. Defaults to 8.

  • polynomial_dataset_kwargs (dict | None, optional) – parameters for the tdm.dataset.PolynomialDataset. Defaults to None.

  • model_class (type[Model], optional) – type of model to fit. Defaults to type[LogisticRegressionModel].

  • model_kwargs (dict | None, optional) – keywords for the model class. Defaults to None.

  • end_phase (int, optional) – stop analysis at this phase (inclusive). Defaults to 5. For example, end_phase = 4 skips model fit.

  • supported_cell_types (list[str] | None, optional) – provide an explicit list of supported cell types when using a patient-level single_cell_df that might not have an instance of all types. Defaults to None.

  • verbose (bool, optional) – print stages of the analysis. Defaults to False.
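The two neighborhood_mode options can be pictured with a minimal sketch that does not use tdm. It assumes a rectangular tissue, coordinates in microns, and a grid estimate of the observed share of each neighborhood. All function names, and the choice to divide counts by the observed fraction, are illustrative assumptions rather than the library's implementation.

```python
import math

def count_neighbors(cells, center_idx, radius):
    """Count cells of each type within `radius` of the center cell."""
    cx, cy, _ = cells[center_idx]
    counts = {}
    for idx, (x, y, cell_type) in enumerate(cells):
        if idx == center_idx:
            continue  # the center cell is not its own neighbor
        if math.hypot(x - cx, y - cy) <= radius:
            counts[cell_type] = counts.get(cell_type, 0) + 1
    return counts

def observed_fraction(cx, cy, radius, width, height, n=200):
    """Grid estimate of the share of the disc inside [0, width] x [0, height]."""
    inside = total = 0
    step = 2 * radius / n
    for i in range(n):
        for j in range(n):
            px = cx - radius + (i + 0.5) * step
            py = cy - radius + (j + 0.5) * step
            if math.hypot(px - cx, py - cy) > radius:
                continue  # sample point lies outside the disc
            total += 1
            if 0 <= px <= width and 0 <= py <= height:
                inside += 1
    return inside / total

def neighborhood_counts(cells, center_idx, radius, width, height, mode="exclude"):
    """Hypothetical helper: apply one of the two modes to a single cell."""
    cx, cy, _ = cells[center_idx]
    counts = count_neighbors(cells, center_idx, radius)
    if mode == "exclude":
        # keep the cell only if its whole neighborhood lies inside the tissue
        fully_observed = (radius <= cx <= width - radius
                          and radius <= cy <= height - radius)
        return counts if fully_observed else None
    if mode == "extrapolate":
        # assumed correction: scale counts up by the unobserved share
        fraction = observed_fraction(cx, cy, radius, width, height)
        return {t: c / fraction for t, c in counts.items()}
    raise ValueError(f"unknown mode: {mode!r}")

# (x, y, type) in a 100 x 100 tissue; cell 3 sits on the left edge
cells = [(50, 50, "F"), (60, 50, "M"), (55, 50, "F"), (0, 50, "F"), (5, 50, "M")]
interior = neighborhood_counts(cells, 0, 20, 100, 100, mode="exclude")
edge_excluded = neighborhood_counts(cells, 3, 20, 100, 100, mode="exclude")
edge_extrapolated = neighborhood_counts(cells, 3, 20, 100, 100, mode="extrapolate")
```

In this toy setup the edge cell is dropped under "exclude", while under "extrapolate" its single observed neighbor is roughly doubled because about half of its neighborhood lies outside the tissue.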

Examples

>>> from tdm.analysis import Analysis
>>> # xlim and ylim are omitted: they are not in the constructor signature above
>>> ana = Analysis(
...     single_cell_df=single_cell_df,
...     cell_types_to_model=[FIBROBLAST, MACROPHAGE],
...     allowed_neighbor_types=[FIBROBLAST, MACROPHAGE, TUMOR, ENDOTHELIAL],
...     polynomial_dataset_kwargs={"degree": 2},
...     neighborhood_mode='extrapolate',
... )
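As a rough picture of what a degree-2 polynomial dataset over two cell types could contain, the sketch below expands a pair of neighbor counts into all monomials up to a given degree. The helper polynomial_features is hypothetical, and whether tdm.dataset.PolynomialDataset adds a bias term or rescales the counts is not stated here.

```python
from itertools import combinations_with_replacement

def polynomial_features(values, names, degree):
    """Return every monomial of the inputs from degree 1 up to `degree`."""
    features = {}
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(values)), d):
            name = "*".join(names[i] for i in combo)
            value = 1.0
            for i in combo:
                value *= values[i]
            features[name] = value
    return features

# two fibroblasts (F) and three macrophages (M) in a neighborhood
features = polynomial_features([2.0, 3.0], ["F", "M"], degree=2)
```

With two cell types and degree 2 this gives five features: F, M, F*F, F*M and M*M.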

Warning

Analysis infers the cell types from the single_cell_df:

>>> single_cell_df[CELL_TYPE_COL].unique()

Provide an explicit list of supported_cell_types when using a small single_cell_df (e.g., one patient) that might not have an instance of every cell type.
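The sketch below shows why the inferred list can come up short. It uses plain dictionaries instead of a dataframe; the helper name and the "cell_type" key are assumptions for illustration only.

```python
def infer_cell_types(rows, cell_type_key="cell_type", supported_cell_types=None):
    """Use the explicit list if given, else the unique types seen in the rows."""
    if supported_cell_types is not None:
        return list(supported_cell_types)
    # dict.fromkeys keeps first-seen order while dropping duplicates
    return list(dict.fromkeys(row[cell_type_key] for row in rows))

cohort = [
    {"cell_type": "Fibroblast"},
    {"cell_type": "Macrophage"},
    {"cell_type": "Tumor"},
]
one_patient = cohort[:2]  # this patient has no tumor cells

inferred = infer_cell_types(one_patient)
explicit = infer_cell_types(
    one_patient, supported_cell_types=["Fibroblast", "Macrophage", "Tumor"]
)
```

Here the inferred list silently omits the tumor type, whereas the explicit list keeps all three types.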

property cell_a: str
property cell_b: str
property cell_c: str
property cell_types: list[str]
dump(filename: str)

Caches the analysis object.

Warning

Overwrites existing files by default.

Examples

>>> ana.dump("fibros_and_macs_15-05-2024.pkl")
>>> ana = Analysis.load("fibros_and_macs_15-05-2024.pkl")

Parameters:

filename (str) – a descriptive name for the analysis, e.g., “fibros_and_macs_15-05-2024.pkl”.
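Because dump overwrites existing files, a caller may want a guard around it. The sketch below assumes a pickle-based cache, which the .pkl extension suggests but this page does not confirm. The helper names are hypothetical and a plain dictionary stands in for the analysis object.

```python
import os
import pickle
import tempfile

def dump_without_overwrite(obj, filename):
    """Pickle `obj` to `filename`, refusing to replace an existing file."""
    # mode "xb" raises FileExistsError if the file already exists
    with open(filename, "xb") as f:
        pickle.dump(obj, f)

def load_pickle(filename):
    """Load a previously pickled object."""
    with open(filename, "rb") as f:
        return pickle.load(f)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fibros_and_macs_15-05-2024.pkl")
    dump_without_overwrite({"end_phase": 5}, path)
    restored = load_pickle(path)
    try:
        dump_without_overwrite({"end_phase": 1}, path)
        overwritten = True
    except FileExistsError:
        overwritten = False
```

Checking os.path.exists(filename) before calling ana.dump gives similar protection without a wrapper, although it is not atomic.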

property enforce_max_density: bool
get_tissues_by_ids(subject_ids: str | float | int | list | ndarray)
static load(filename: str)

Loads the analysis object.

Examples

>>> ana.dump("fibros_and_macs_15-05-2024.pkl")
>>> ana = Analysis.load("fibros_and_macs_15-05-2024.pkl")

Parameters:

filename (str) – the name of a file written by dump, e.g., “fibros_and_macs_15-05-2024.pkl”.

property model: Model
property nds: NeighborsDataset
property ndss: list[NeighborsDataset]
property neighborhood_size: float
property pds: PolynomialDataset
property rnds: RestrictedNeighborsDataset
run(start_phase: int = 1, end_phase: int = 5, verbose: bool = True)

Run the analysis, executing the phases from start_phase through end_phase (both inclusive). The phases are:

  • 1: construct tissues

  • 2: count neighbors

  • 3: filter cell types

  • 4: transform features

  • 5: fit model

Parameters:
  • start_phase (int, optional) – run phases from here (inclusive). Defaults to 1.

  • end_phase (int, optional) – stop at this phase (inclusive). Defaults to 5.

  • verbose (bool, optional) – print stages of the analysis. Defaults to True.
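The inclusive phase bounds can be sketched as follows. The phase names come from the list above; the function phases_to_run is a hypothetical stand-in that only reports which phases a call would cover.

```python
PHASES = {
    1: "construct tissues",
    2: "count neighbors",
    3: "filter cell types",
    4: "transform features",
    5: "fit model",
}

def phases_to_run(start_phase=1, end_phase=5):
    """List the phases between start_phase and end_phase, both inclusive."""
    if not 1 <= start_phase <= end_phase <= 5:
        raise ValueError("expected 1 <= start_phase <= end_phase <= 5")
    return [PHASES[p] for p in range(start_phase, end_phase + 1)]

without_fit = phases_to_run(end_phase=4)    # stops before the model fit
fit_only = phases_to_run(start_phase=5)     # runs only the model fit
```

Under this reading, an analysis built with end_phase=4 could later be completed with run(start_phase=5), assuming the earlier phases' results are kept on the object.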

set_maximal_density_enforcer(model: Model, max_density_enforcer_power: int | None = None, max_density_enforcer_fixed_cell_counts: dict | None = None)

Adds a correction to a fitted model so that there is no net growth at maximal density (see enforce_max_density).

Warning

Modifies the passed model.

Parameters:
  • model (Model) – a fitted model.

  • max_density_enforcer_power (int | None) – uses self._max_density_enforcer_power if None.

  • max_density_enforcer_fixed_cell_counts (dict | None, optional) – Defaults to None.
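One way to picture a power-based correction is sketched below. The functional form is an assumption, not taken from tdm: the net growth predicted at maximal density is subtracted, weighted by (density / max_density) ** power. Unlike the method above, this hypothetical helper returns a new function instead of modifying its input.

```python
def enforce_max_density(growth, max_density, power):
    """Wrap `growth` so that the corrected net growth is zero at max_density."""
    excess = growth(max_density)  # net growth the raw model predicts at the cap

    def corrected(density):
        # assumed form: the weight rises from 0 to 1 as density nears the cap
        return growth(density) - excess * (density / max_density) ** power

    return corrected

def linear_growth(density):
    """Toy net-growth rate that is still positive at maximal density."""
    return 0.1 + 0.02 * density

corrected_p4 = enforce_max_density(linear_growth, max_density=10.0, power=4)
corrected_p8 = enforce_max_density(linear_growth, max_density=10.0, power=8)

# size of the correction halfway to the cap, for each power
correction_p4 = linear_growth(5.0) - corrected_p4(5.0)
correction_p8 = linear_growth(5.0) - corrected_p8(5.0)
```

In this form both corrected functions vanish at the cap, and the power-8 correction at intermediate densities is smaller than the power-4 one, which matches the statement that higher powers produce smaller corrections.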

tissue_states(scale_counts_to_common_radius: bool = True) → DataFrame

Returns the number of cells of each type in each tissue in the analysis.

Parameters:

scale_counts_to_common_radius (bool, optional) – scales the cell counts of each tissue to a shared circular area of radius self.neighborhood_size, so that tissues of different sizes are comparable. Defaults to True.

Returns:

the cell counts

Return type:

pd.DataFrame
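The scaling can be sketched as an area ratio. This assumes each tissue's area is known, that lengths are in meters, and that counts are multiplied by the area of a circle of the given radius divided by the tissue area. The helper below is hypothetical and is not the library's code.

```python
import math

def scale_counts_to_common_radius(counts, tissue_area, radius):
    """Rescale whole-tissue counts to a circle of the given radius."""
    factor = math.pi * radius ** 2 / tissue_area
    return {cell_type: n * factor for cell_type, n in counts.items()}

radius = 8e-05  # 80 microns, assuming lengths are expressed in meters

# two tissues with the same cell density but different areas (in m^2)
small = scale_counts_to_common_radius({"F": 1000, "M": 500}, 1e-06, radius)
large = scale_counts_to_common_radius({"F": 2000, "M": 1000}, 2e-06, radius)
```

After scaling, the two tissues report the same counts, so the states of tissues with different imaged areas can be compared directly.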

property tissues: list[Tissue]
property xlim: tuple[float, float]
property ylim: tuple[float, float]
property zlim: tuple[float, float]