tdm.dataset.PolynomialDataset
- class tdm.dataset.PolynomialDataset(ds: Dataset, degree: int = 2, feature_cell_types: list[str] | None = None, include_bias: bool = True, scale_features: bool = False, log_transform: bool = False)
Performs feature transformations and constructs polynomial features from a dataset.
- __init__(ds: Dataset, degree: int = 2, feature_cell_types: list[str] | None = None, include_bias: bool = True, scale_features: bool = False, log_transform: bool = False) → None
Performs feature transformations and constructs polynomial features from a dataset.
- Parameters:
ds (Dataset) – dataset whose features we transform
degree (int, optional) – order of polynomial interactions to construct. Defaults to 2.
feature_cell_types (list[str] | None, optional) – selects a subset of cell types to use as features. Defaults to None, which uses all cell types present.
include_bias (bool, optional) – add a bias term to the features. Defaults to True.
scale_features (bool, optional) – standardize features (subtract mean and divide by standard deviation). Defaults to False.
log_transform (bool, optional) – log2(1+x) transformation of features. Defaults to False.
Examples
>>> # nds is a NeighborsDataset instance
>>> pds = PolynomialDataset(nds, degree=2)
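To make the degree, include_bias and log_transform parameters concrete, here is a minimal sketch of a degree-2 expansion on a single row of log-transformed counts. It is written in plain Python to stay dependency-free and is not the library's implementation. The cell-type names "F" and "M", the helper polynomial_terms, the term naming, and the assumption that the log transform is applied before the expansion are all illustrative.

```python
import math
from itertools import combinations_with_replacement


def polynomial_terms(row, degree=2, include_bias=True):
    """Expand one row of named features into all monomials up to `degree`.

    Hypothetical helper for illustration only; not part of tdm.
    """
    names = sorted(row)
    terms = {"1": 1.0} if include_bias else {}
    for d in range(1, degree + 1):
        # every unordered combination of d feature names, repeats allowed
        for combo in combinations_with_replacement(names, d):
            value = 1.0
            for name in combo:
                value *= row[name]
            terms["*".join(combo)] = value
    return terms


# Hypothetical raw neighbor counts for two cell types.
raw_counts = {"F": 63, "M": 7}

# log2(1 + x), the transformation described for log_transform=True.
logged = {name: math.log2(1 + count) for name, count in raw_counts.items()}

# Bias term, linear terms, then squares and the pairwise interaction.
features = polynomial_terms(logged, degree=2, include_bias=True)
print(features)
```

With two input features and degree=2, the expansion yields six terms: the bias, two linear terms, two squares and one interaction. Setting include_bias=False drops the constant "1" term.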
- construct_features_from_counts(cell_counts: dict | DataFrame, target_cell: str, **kwargs) → DataFrame
Constructs features compatible with construct_polynomial_features. Cell values are given as raw counts (e.g. “64” cells, not the log-value 6).
- construct_polynomial_features(neighbor_features: DataFrame, scale_features: bool, fit_scaler: bool = False, target_cell: str | None = None) → tuple[DataFrame, StandardScaler]
- Parameters:
neighbor_features – output of fetch() from a NeighborsDataset
scale_features – if True, applies standard scaling to the resulting polynomial features
fit_scaler – if True, fits a new standard scaler to the resulting polynomial features
target_cell – cell we’re modelling, used to fetch the correct StandardScaler
- Returns:
(a polynomial features dataframe, a standard scaler)
- Return type:
tuple
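The difference between fitting a scaler and applying one can be sketched as follows. This is a dependency-free illustration under stated assumptions, not the library's code: TinyScaler is a hypothetical stand-in for a standard scaler, the column names are invented, and the reading that fit_scaler=False reuses a previously fitted scaler is an assumption based on the parameter descriptions above.

```python
from statistics import fmean, pstdev


class TinyScaler:
    """Hypothetical stand-in for a standard scaler (per-column mean and std)."""

    def fit(self, columns):
        self.mean_ = {k: fmean(v) for k, v in columns.items()}
        # population standard deviation; fall back to 1.0 for constant columns
        self.scale_ = {k: pstdev(v) or 1.0 for k, v in columns.items()}
        return self

    def transform(self, columns):
        return {
            k: [(x - self.mean_[k]) / self.scale_[k] for x in v]
            for k, v in columns.items()
        }


# Hypothetical polynomial features, stored column-wise.
train = {"F": [2.0, 4.0, 6.0], "F*F": [4.0, 16.0, 36.0]}
new = {"F": [4.0], "F*F": [16.0]}

# Analogous to fit_scaler=True: learn the statistics from these features.
scaler = TinyScaler().fit(train)
scaled_train = scaler.transform(train)

# Analogous to fit_scaler=False (assumed): reuse the stored statistics.
scaled_new = scaler.transform(new)
print(scaled_new)
```

Reusing the fitted statistics keeps new inputs on the same scale as the data the scaler was fitted on, which is why the scaler is returned alongside the features.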