{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# One-Shot Dynamics in 7 Minutes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Welcome!\n", "\n", "This tutorial will take you through a complete *One-Shot Tissue Dynamics Reconstruction* (OSDR) workflow by reproducing Figure 3 from [Somer et al. 2024](https://www.biorxiv.org/content/10.1101/2024.04.22.590503v1). We will download a dataset, estimate a dynamical model, and plot a 2D phase portrait of fibroblast-macrophage dynamics. \n", "\n", "Before we begin, let's download the dataset we'll use throughout the tutorial by running the following code block. \n", "\n", "This could take a few minutes the first time, so continue reading while the data is downloading.\n", "\n", "We'll work with the breast cancer IMC dataset by [Danenberg et al. 2022](https://www.nature.com/articles/s41588-022-01041-y). \n", "This dataset includes 793 spatial proteomics tissue sections from 717 breast cancer patients, totalling ~864K cells.\n", "\n", "Fibroblast-macrophage dynamics in breast cancer serve as a useful test case for our model because [Mayer et al. 2023](https://www.nature.com/articles/s41467-023-41518-w) experimentally discovered the dynamics between these two cell types." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "from tdm.raw.breast_mibi import read_single_cell_df\n", "\n", "single_cell_df = read_single_cell_df()" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(859710, 61)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "single_cell_df.shape" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview of OSDR\n", "\n", "OSDR uses statistical models to learn how the composition of a cell's neighborhood influences its division rate.\n", "\n", "Thus, the core input to OSDR consists of:\n", "\n", "* x,y positions of cells in the tissue\n", "* cell types\n", "* cell division labels (1 = dividing, 0 = not dividing)\n", "\n", "For datasets including samples from multiple patients or multiple tissue sections (\"images\") per patient, we also require:\n", "\n", "* subject id\n", "* image id\n", "\n", "## The Single-Cell Dataframe\n", "\n", "To use OSDR, we first prepare one large table with a row for each cell and a column for each input listed above. \n", "\n", "We call this table the \"single-cell dataframe\".\n", "\n", "The module `tdm.preprocess.single_cell_df` contains utilities for preparing and validating the table. 
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The raw data we downloaded includes many columns, including levels of various markers:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['index',\n", " 'ImageNumber',\n", " 'ObjectNumber',\n", " 'metabric_id',\n", " 'cellPhenotype',\n", " 'is_epithelial',\n", " 'is_tumour',\n", " 'is_normal',\n", " 'is_dcis',\n", " 'is_interface',\n", " 'is_perivascular',\n", " 'is_hotAggregate',\n", " 'Histone H3',\n", " 'SMA',\n", " 'CK5',\n", " 'CD38',\n", " 'HLA-DR',\n", " 'CK8-18',\n", " 'CD15',\n", " 'FSP1',\n", " 'CD163',\n", " 'ICOS',\n", " 'OX40',\n", " 'CD68',\n", " 'HER2 (3B5)',\n", " 'CD3',\n", " 'Podoplanin',\n", " 'CD11c',\n", " 'PD-1',\n", " 'GITR',\n", " 'CD16',\n", " 'HER2 (D8F12)',\n", " 'CD45RA',\n", " 'B2M',\n", " 'CD45RO',\n", " 'FOXP3',\n", " 'CD20',\n", " 'ER',\n", " 'CD8',\n", " 'CD57',\n", " 'Ki-67',\n", " 'PDGFRB',\n", " 'Caveolin-1',\n", " 'CD4',\n", " 'CD31-vWF',\n", " 'CXCL12',\n", " 'HLA-ABC',\n", " 'panCK',\n", " 'c-Caspase3',\n", " 'DNA1',\n", " 'DNA2',\n", " 'Location_Center_X',\n", " 'Location_Center_Y',\n", " 'AreaShape_Area',\n", " 'x',\n", " 'y',\n", " 'ki67',\n", " 'cell_type',\n", " 'img_id',\n", " 'subject_id',\n", " 'division']" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(single_cell_df.columns)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can use the function `restrict_df_to_required_columns` to select just the subset of columns OSDR requires. \n", "\n", "A preprocessed table looks like this:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "|   | x | y | division | cell_type | img_id | subject_id |\n",
"|---|---|---|---|---|---|---|\n",
"| 0 | 0.000121 | 0.000004 | False | Tu | 1 | MB-0282 |\n",
"| 1 | 0.000222 | 0.000005 | False | T | 1 | MB-0282 |\n",
"| 2 | 0.000354 | 0.000006 | False | Tu | 1 | MB-0282 |\n", "