tdm.analysis.Analysis#
- class tdm.analysis.Analysis(single_cell_df: DataFrame, cell_types_to_model: list[str] | None = None, neighborhood_size: float = 7.999999999999999e-05, neighborhood_mode: Literal['exclude', 'extrapolate'] = 'exclude', nds_class_kwargs: dict | None = None, allowed_neighbor_types: list[str] | None = None, enforce_max_density: bool = True, max_density_enforcer_power: int = 8, polynomial_dataset_kwargs: dict | None = None, model_class: type[Model] = type[tdm.model.logistic_regression.LogisticRegressionModel], model_kwargs: dict | None = None, end_phase: int = 5, supported_cell_types: list[str] | None = None, verbose: bool = False)[source]#
- __init__(single_cell_df: DataFrame, cell_types_to_model: list[str] | None = None, neighborhood_size: float = 7.999999999999999e-05, neighborhood_mode: Literal['exclude', 'extrapolate'] = 'exclude', nds_class_kwargs: dict | None = None, allowed_neighbor_types: list[str] | None = None, enforce_max_density: bool = True, max_density_enforcer_power: int = 8, polynomial_dataset_kwargs: dict | None = None, model_class: type[Model] = type[tdm.model.logistic_regression.LogisticRegressionModel], model_kwargs: dict | None = None, end_phase: int = 5, supported_cell_types: list[str] | None = None, verbose: bool = False)[source]#
An analysis organizes and performs all of the steps required to fit a model to the data in single_cell_df.
- Parameters:
single_cell_df (pd.DataFrame) – the single cell dataframe (see Preprocessing)
cell_types_to_model (list[str] | None, optional) – defines the axes of the state-space of the dynamical model. Defaults to all cell types.
neighborhood_size (float, optional) – radius within which neighboring cells of each type are counted. Defaults to _80_microns (7.999999999999999e-05 in the signature).
neighborhood_mode (Literal["exclude", "extrapolate"], optional) – "exclude" drops cells whose neighborhood exceeds tissue limits; "extrapolate" keeps them and corrects for the unobserved fraction. Defaults to "exclude".
nds_class_kwargs (dict | None, optional) – keywords for the neighbors dataset class. Defaults to None.
allowed_neighbor_types (list[str] | None, optional) – exclude cells with neighbors outside this list. Defaults to None.
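xlim (tuple[float, float], optional) – x limits for plots. Defaults to the maximal density of the first cell type. Not listed in the signature above; see the xlim property.
ylim (tuple[float, float], optional) – y limits for plots. Defaults to the maximal density of the second cell type. Not listed in the signature above; see the ylim property.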
enforce_max_density (bool, optional) – whether to add a correction so that there is no net growth at maximal density. Defaults to True.
max_density_enforcer_power (int, optional) – high powers produce smaller corrections. Defaults to 8.
polynomial_dataset_kwargs (dict | None, optional) – parameters for the tdm.dataset.PolynomialDataset. Defaults to None.
model_class (type[Model], optional) – type of model to fit. Defaults to type[LogisticRegressionModel].
end_phase (int, optional) – stop analysis at this phase (inclusive). Defaults to 5. For example, end_phase = 4 skips model fit.
supported_cell_types (list[str] | None, optional) – provide an explicit list of supported cell types when using a patient-level single_cell_df that might not have an instance of all types. Defaults to None.
verbose (bool, optional) – print stages of the analysis. Defaults to False.
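The two neighborhood_mode options can be illustrated with a small, self-contained sketch. It is not the library's implementation: the rectangular tissue, the grid-based area estimate, and the helper names (keep_in_exclude_mode, observed_fraction, extrapolated_count) are all assumptions made for illustration, and distances are written in microns for readability.

```python
def keep_in_exclude_mode(x, y, radius, width, height):
    """'exclude' (assumed behavior): keep a cell only if its whole
    neighborhood disc lies inside the rectangular tissue."""
    return radius <= x <= width - radius and radius <= y <= height - radius


def observed_fraction(x, y, radius, width, height, n=200):
    """Estimate the fraction of the disc around (x, y) that falls inside
    the tissue, using an n x n grid of sample points."""
    step = 2 * radius / n
    in_disc = in_tissue = 0
    for i in range(n):
        dx = (i + 0.5) * step - radius
        for j in range(n):
            dy = (j + 0.5) * step - radius
            if dx * dx + dy * dy > radius * radius:
                continue  # sample point lies outside the disc
            in_disc += 1
            if 0 <= x + dx <= width and 0 <= y + dy <= height:
                in_tissue += 1
    return in_tissue / in_disc


def extrapolated_count(raw_count, x, y, radius, width, height):
    """'extrapolate' (assumed behavior): rescale the observed neighbor
    count by the observed fraction of the neighborhood."""
    return raw_count / observed_fraction(x, y, radius, width, height)


# A cell sitting on the left edge of a 1000 x 1000 tissue sees about half
# of its 80-micron neighborhood, so 3 observed neighbors become roughly 6.
edge_estimate = extrapolated_count(3, 0.0, 500.0, 80.0, 1000.0, 1000.0)
```

Under these assumptions, "exclude" trades data for simplicity (edge cells are dropped), while "extrapolate" keeps edge cells at the cost of a noisier, rescaled count.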
Examples
>>> ana = Analysis(
...     single_cell_df=single_cell_df,
...     cell_types_to_model=[FIBROBLAST, MACROPHAGE],
...     allowed_neighbor_types=[FIBROBLAST, MACROPHAGE, TUMOR, ENDOTHELIAL],
...     polynomial_dataset_kwargs={"degree": 2},
...     xlim=(0, 7.1),
...     ylim=(0, 6.1),
...     neighborhood_mode='extrapolate',
... )
Warning
Analysis infers the cell types from the single_cell_df:
>>> single_cell_df[CELL_TYPE_COL].unique()
Provide an explicit list of supported_cell_types when using a small single_cell_df (e.g., one patient) that might not have an instance of every cell type.
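The warning above can be made concrete with a plain-Python stand-in for the dataframe. The records, cell type labels, and the resolve_cell_types helper below are hypothetical; the sketch only mimics the order-preserving behavior of unique() to show how a single-patient subset can silently lose a cell type.

```python
# Hypothetical (patient_id, cell_type) records standing in for single_cell_df.
ALL_TYPES = ["fibroblast", "macrophage", "tumor", "endothelial"]
cohort = [
    ("p1", "fibroblast"),
    ("p1", "macrophage"),
    ("p2", "tumor"),
    ("p2", "endothelial"),
    ("p2", "fibroblast"),
]


def inferred_types(rows):
    """Order-preserving unique values, similar to Series.unique()."""
    return list(dict.fromkeys(cell_type for _, cell_type in rows))


def resolve_cell_types(rows, supported_cell_types=None):
    """Use the explicit list when given, otherwise infer from the data."""
    if supported_cell_types is not None:
        return list(supported_cell_types)
    return inferred_types(rows)


patient_1 = [row for row in cohort if row[0] == "p1"]

inferred = resolve_cell_types(patient_1)             # misses tumor, endothelial
explicit = resolve_cell_types(patient_1, ALL_TYPES)  # all four types kept
```

Passing the explicit list keeps per-patient results aligned on the same set of cell types, even when a patient has zero cells of some type.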
- property cell_a: str#
- property cell_b: str#
- property cell_c: str#
- property cell_types: list[str]#
- dump(filename: str)[source]#
Caches the analysis object.
Warning
Overwrites existing files by default.
Examples
>>> ana.dump("fibros_and_macs_15-05-2024.pkl")
>>> ana = Analysis.load("fibros_and_macs_15-05-2024.pkl")
- Parameters:
filename (str) – a descriptive name for the analysis, e.g., "fibros_and_macs_15-05-2024.pkl".
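The .pkl extension suggests pickle-based caching, although the page does not say so. The sketch below assumes that and shows the round trip together with an overwrite guard; dump_state, load_state, and the overwrite flag are hypothetical helpers, not part of Analysis.

```python
import os
import pickle
import tempfile


def dump_state(state, filename, overwrite=True):
    """Pickle `state` to `filename`. With overwrite=False, refuse to
    replace an existing file instead of silently truncating it."""
    if not overwrite and os.path.exists(filename):
        raise FileExistsError(filename)
    with open(filename, "wb") as f:  # "wb" truncates any existing file
        pickle.dump(state, f)


def load_state(filename):
    """Read back an object written by dump_state."""
    with open(filename, "rb") as f:
        return pickle.load(f)


state = {"cell_types": ["fibroblast", "macrophage"], "neighborhood_size": 8e-05}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fibros_and_macs_15-05-2024.pkl")
    dump_state(state, path)
    restored = load_state(path)
    try:
        dump_state(state, path, overwrite=False)
        refused = False
    except FileExistsError:
        refused = True
```

Since dump overwrites by default, a dated filename as in the example above is a cheap way to avoid clobbering an earlier cache. Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.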
- property enforce_max_density: bool#
- static load(filename: str)[source]#
Loads the analysis object.
Examples
>>> ana.dump("fibros_and_macs_15-05-2024.pkl")
>>> ana = Analysis.load("fibros_and_macs_15-05-2024.pkl")
- Parameters:
filename (str) – a descriptive name for the analysis, e.g., "fibros_and_macs_15-05-2024.pkl".
- property nds: NeighborsDataset#
- property ndss: list[NeighborsDataset]#
- property neighborhood_size: float#
- property pds: PolynomialDataset#
- property rnds: RestrictedNeighborsDataset#
- run(start_phase: int = 1, end_phase: int = 5, verbose: bool = True)[source]#
Run the analysis.
1: construct tissues
2: count neighbors
3: filter cell types
4: transform features
5: fit model
- Parameters:
start_phase (int, optional) – run phases from here (inclusive). Defaults to 1.
end_phase (int, optional) – stop at this phase (inclusive). Defaults to 5.
verbose (bool, optional) – print stages of the analysis. Defaults to True.
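Because both bounds are inclusive, the selection of phases can be sketched as follows. The phase names come from the list above; the phases_to_run helper and its validation are assumptions for illustration, not the method's actual code.

```python
PHASES = {
    1: "construct tissues",
    2: "count neighbors",
    3: "filter cell types",
    4: "transform features",
    5: "fit model",
}


def phases_to_run(start_phase=1, end_phase=5):
    """Return the phase names from start_phase to end_phase, inclusive."""
    if not 1 <= start_phase <= end_phase <= 5:
        raise ValueError("expected 1 <= start_phase <= end_phase <= 5")
    return [PHASES[i] for i in range(start_phase, end_phase + 1)]


full_run = phases_to_run()               # all five phases
no_fit = phases_to_run(end_phase=4)      # stops before "fit model"
refit_only = phases_to_run(5, 5)         # only "fit model"
```

With inclusive bounds, end_phase=4 stops after the feature transform, which matches the note that end_phase = 4 skips the model fit.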
- set_maximal_density_enforcer(model: Model, max_density_enforcer_power: int | None = None, max_density_enforcer_fixed_cell_counts: dict | None = None)[source]#
Adds the maximal-density correction to a model, so that there is no net growth at maximal density.
Warning
Modifies the passed model.
- Parameters:
model (Model) – a fitted model.
max_density_enforcer_power (int | None) – uses self._max_density_enforcer_power if None.
max_density_enforcer_fixed_cell_counts (dict | None, optional) – Defaults to None.
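The page does not give the formula for the correction. As a purely illustrative sketch, one functional form that has both documented properties (zero net growth at maximal density, smaller corrections for higher powers) subtracts the growth at maximal density weighted by (density / max_density) ** power. The function names and the linear toy growth model are hypothetical.

```python
def correction(growth, density, max_density, power=8):
    """Assumed correction term: the growth at maximal density, weighted by
    (density / max_density) ** power. Not the library's formula."""
    return growth(max_density) * (density / max_density) ** power


def corrected_growth(growth, density, max_density, power=8):
    """Net growth after subtracting the assumed correction term."""
    return growth(density) - correction(growth, density, max_density, power)


def toy_growth(density):
    """Toy net growth rate that is still positive at the maximal density."""
    return 0.1 - 0.01 * density


MAX_DENSITY = 7.0

at_max = corrected_growth(toy_growth, MAX_DENSITY, MAX_DENSITY)
low_power = correction(toy_growth, 3.5, MAX_DENSITY, power=2)
high_power = correction(toy_growth, 3.5, MAX_DENSITY, power=8)
```

In this form the weight equals 1 at maximal density, so the corrected growth vanishes there, while at lower densities a higher power shrinks the weight and leaves the fitted model nearly unchanged.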
- tissue_states(scale_counts_to_common_radius: bool = True) → DataFrame[source]#
Returns the number of cells of each type in each tissue in the analysis.
- Parameters:
scale_counts_to_common_radius (bool, optional) – scales the cell counts to a shared area of radius self.neighborhood_size. Defaults to True.
- Returns:
the cell counts
- Return type:
pd.DataFrame
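One plausible reading of scale_counts_to_common_radius is that raw per-tissue counts are multiplied by the ratio of the neighborhood area to the tissue area, so that tissues of different sizes become comparable. The sketch below assumes exactly that; scale_counts and its arguments are hypothetical and the real method may differ.

```python
import math


def scale_counts(counts, tissue_area, radius):
    """Rescale raw cell counts to a disc of the given radius, assuming a
    uniform density over a tissue of area `tissue_area`."""
    factor = math.pi * radius ** 2 / tissue_area
    return {cell_type: n * factor for cell_type, n in counts.items()}


radius = 8e-05  # same length units as the tissue dimensions
tissue_area = 4 * math.pi * radius ** 2  # tissue four times the disc area

raw = {"fibroblast": 40, "macrophage": 12}
scaled = scale_counts(raw, tissue_area, radius)
```

Under this assumption the scaled values are expected counts per neighborhood, which puts them on the same scale as the neighbor counts that define the model's state space.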
- property xlim: tuple[float, float]#
- property ylim: tuple[float, float]#
- property zlim: tuple[float, float]#