- LDA: limiting dilution assay
- coop: (cellular) cooperation

The LDA is the gold standard for quantification of the clonogenic capacity of non-adherent cells. The most common method for calculating this capacity from an LDA experiment assumes implicitly independence of all cells. Interestingly, this assumption does not hold for many cell lines (i.e. cooperatively growing cells).

Cellular cooperation induces a non-linearity to the dose-response-curve (the (log) fraction of non-responding wells over the number of cells seeded) and thereby biases the estimate of the gold standard analysis method, which assumes linearity. In other words: when surrounded by many other cells, the combined clonogenic activity of 100 cells is higher than the activity of 100 separated single cells. In terms of single cell activity, 100 cells act as if they were more.

Thus, the gold standard analysis method is syntactically applicable but its result is biased (i.e. meaningless) for cooperatively growing cells.

Therefore, this package equips you with the tools you need to robustly quantify the clonogenic capacity of cell lines as well as to quantify the clonogenic survival of those cell lines after treatment.

As known from tools for quantification of the clonogenic capacity from LDA data, the input is a data frame (table, matrix) with entries for

- cells: the number of cells seeded in a well
- wells: the number of replicate wells
- response: the number of wells with positive response (growth)
- (optional) group: treatment group - e.g. irradiation dose

All commands to get tables with numbers or plots of the activity are explained below.

Let’s assume, you have conducted an experiment with a set of experimental conditions and biological replicates.

Your data will look something like this

```
#install.packages("LDAcoop")
library(LDAcoop)
data(LDAdata)
head(LDAdata)
#> name replicate Group S-value # Tested # Clonal growth
#> 1 MDA.MB321 1 0 32 12 12
#> 2 MDA.MB321 1 0 16 12 12
#> 3 MDA.MB321 1 0 8 12 12
#> 4 MDA.MB321 1 0 4 12 12
#> 5 MDA.MB321 1 0 2 12 10
#> 6 MDA.MB321 1 0 1 12 7
```

A table of clonogenic activities, cooperativity coefficients and survival fractions can be generated from such a data table as follows:

```
BT20 <- subset.data.frame(x = LDAdata,
subset = name == "BT.20")
BT20 <- BT20[,c("S-value","# Tested","# Clonal growth","Group","replicate")]
round(LDA_table(x = BT20),digits = 3)
#> reference class is 0
#> treatment act act.CI.lb act.CI.ub b b.pvalue SF SF.CI.lb SF.CI.ub
#> 1 0 29.613 24.943 35.339 1.135 0.176 NA NA NA
#> 2 1 33.420 28.330 39.632 1.220 0.043 0.886 0.696 1.129
#> 3 2 40.632 34.391 48.223 1.206 0.062 0.729 0.572 0.929
#> 4 4 72.997 62.611 85.477 1.434 0.003 0.406 0.321 0.513
#> 5 6 180.201 152.880 213.427 1.234 0.038 0.164 0.129 0.209
#> 6 8 423.447 375.448 482.003 2.266 0.000 0.070 0.057 0.086
```

In this table, we find

`treatment`

: the experimental ‘Group’`act`

: the calculated activity for the treatment`act.CI.lb`

and`act.CI.ub`

: its uncertainty (lower and upper bound of the 95%-confidence interval)`b`

: the cooperativity coefficient (1: no cooperativity)`b.pvalue`

: the p-value for test hypothesis of b=1`SF`

: the calculated survival fraction (SF)`SF.CI.lb`

and`SF.CI.ub`

: SF-uncertainty (lb: lower bound, ub: upper bound)

A plot of the experiment and the estimated survival fractions can be generated as follows:

In case the figure looks overcrowded,
`uncertainty.band = FALSE`

can switch off the confidence
bands and `xlim`

(e.g. `xlim = c(0,100)`

) can be
used, to adjust the plotted range of seeded cells.

Clonogenicity refers to the capacity of single cells to grow into new colonies. Since not every single cell will successfully form a new colony, in a given culture the fraction of cells that do so is often of interest. This fraction is then called clonogenic activity. Mathematically it is a probability \(p\). Often the activity is communicated by the reciprocal, which corresponds to a number of cells (e.g. activity \(p = 1/42\) written as \(a = 1/p = 42\) for ‘one active cell among \(a = 42\) non-active cells’).

In the LDA, distinct numbers of cells are seeded in wells. The number of seeded cells per well is often called dose. Since, we do not want to get confused with the irradiation dose, which is a frequent treatment in the field of radiation oncology, we denote the number of cells seeded with \(s\). The readout of each well in the LDA is dichotomous (growth / no growth). So, from the binomially distributed number of active cells in a well \(X\) where

\[P(X=k) = \left( \begin{matrix} s \\ k \end{matrix}\right) p^{k} (1-p)^{s-k} \]

only \(X=0\) and \(X \neq 0\) can be observed experimentally. Thus, it is about the frequency of positive wells as a function of the number of cells seeded (\(k/n = f(s)\)). For dealing with high cell numbers, the binomial distribution is approximated by the asymptotically identical (for \(s \to \infty\)) Poisson distribution (with \(\lambda = s \cdot p\)): \[P_{\lambda}(X=k) = \frac{\lambda^k}{k!} e^{-\lambda}.\]

The basic concept for estimating the clonogenic activity of a cell suspension through diluting the number of cells in different wells is the following:

- imagine a number of cells in a well, which is so high that we expect many active cells among them, then this well will be positive.
- imagine a number of cells so low

that the chance of an active cell among them is very low, then this well will be negative. - thus, between those two numbers there will be cell numbers, where no-one can say, whether a colony will grow or not. It is rather a matter of chance.
- from Poisson statistics we know, that a frequency of 37% negative wells will be observed, when the average number of active cells per well is one (\(e^{-1} \approx 0.368\)).

So taking the graph ‘observed fraction of positive wells over the number of cells per well’, we find the clonogenic activity as the number of cells (x-axis) just where the curve passes the expectation of 37% negative wells.

Let \(n\) be the number of wells, and let \(\mu\) denote the expected fraction of positive wells. Following the single-hit Poisson model (SHPM, \(P_{\lambda}(k=0) = e^{-p s}\)), the expectation is: \[\mu = 1-e^{-ps}.\]

The number of failures \(r\) is again binomially distributed

\[ P(Y=r) = \left( \begin{matrix} n \\ r \end{matrix}\right) (1-\mu)^{r} \mu^{n-r}. \]

We find (for \(\alpha = log(p)\), \(\xi = log(s)\))

\[\begin{align*} \Leftrightarrow 1-\mu &= e^{- p s}\\ \Leftrightarrow log(1-\mu) &= - p s \\ \Leftrightarrow -log(1-\mu) &= p \cdot s \\ \Leftrightarrow log(-log(1-\mu)) &= \alpha + \xi \end{align*}\]

Therefore, technically the identification of the required number of cells is done via a generalized linear regression model (binomial family, cloglog link function). The clonogenic activity is \(p = e^{\alpha}\).

In the presented model, \(p\) is independent from \(s\). So implicitly, it is assumed, that 100 single cells in single wells have the identical chance to result in at least one colony as a single well with those 100 cells would have.

But this is often not the case. Cells export or release substances into the medium and thereby communicate and cooperate. The combined chance of a group of cells to grow into a clone often exceeds the sum of the individual chances. (See CFAcoop, for a number of cells seeded \(S\) and a number of resulting colonies \(C\) we find \(C = a \cdot S^b\).)

Thus, allowing for the same communication and cooperativity, we replace \(\lambda = p \cdot s\) with \(\lambda = p \cdot s^b\) analogue to CFA. The modification reads in both cases as ‘a number of \(s\) cells shows the same activity as if they were \(s^b\) single cells’.

With this generalization we find

\[\begin{align*} P_{\lambda}(k=0) &= e^{-\lambda}\\ \Leftrightarrow 1-\mu &= e^{- p s^b}\\ \Leftrightarrow log(1-\mu) &= - p s^b \\ \Leftrightarrow log(-log(1-\mu)) &= \alpha + \xi \cdot b. \end{align*}\]

Interestingly, this equation is the same as in the special case of non-cooperativity, just without the restriction of \(b=1\), which corresponds to the non-cooperative case. The degree of cooperativity is quantified by the coefficient \(b\).

When the probability of a cell to grow into a colony is not independent from the number of surrounding cells, the earlier definition of clonogenic activity (the chance of an isolated single cell) is pointless. Thus, the approach to deal with cooperatively growing cells, requires a generalized definition for clonogenic activity.

For a fixed outcome of e.g. \(\lambda = 1\), which is equivalent with \(\mu \approx 0.63\), we still may calculate the required number of cells to be seeded from the regression coefficients \[s_{\lambda = 1} = e^{\frac{-\alpha}{b}} = \left(p^{-1}\right)^{\frac{1}{b}}.\] On average, one out of these cells will grow into a colony. Therefore, we define this number \(s_{\lambda=1}\) as the (generalized) clonogenic activity.

This is analogue to the CFA approach for cellular cooperation (which sets the reference at an expectation of \(20\) colonies.). Please note, the non-cooperative case (\(b=1\)) results in the identical activity values as from the previous definition.

Given the data of an LDA experiment, we find the clonogenic activity where expected. The only change in contrast to conventional LDA data analysis is the curve, which replaces the linear model.

```
cell.line <- unique(LDAdata$name)[2]
AD <- subset.data.frame(LDAdata, subset = (name==cell.line) &
(replicate==1) & (Group == 2))[,4:6]
LDA_plot(LDA_tab = AD,uncertainty = "act", uncertainty.band = T)
```

```
LDA_table(x = AD)
#> $`activity^-1 [N]`
#> [1] 37.432
#>
#> $`confidence interval`
#> [1] 25.742 54.771
#>
#> $`cooperativity coefficient b`
#> [1] 0.992
#>
#> $`p-value cooperativity`
#> [1] 0.962
full_model_fit <- LDA_activity_single(x = AD)
```

In case, there are biological replicates, the replicate may be indicated by the fifth column. Thereby the frequency of responding wells can be plotted separately.

```
AD <- subset.data.frame(LDAdata, subset = (name==cell.line) &
(Group == 2))[,c(4:6,3,2)]
LDA_plot(LDA_tab = AD,uncertainty = "act", uncertainty.band = T)
```

```
LDA_table(x = AD[,1:3],ref_class = 0)
#> $`activity^-1 [N]`
#> [1] 40.632
#>
#> $`confidence interval`
#> [1] 34.391 48.223
#>
#> $`cooperativity coefficient b`
#> [1] 1.206
#>
#> $`p-value cooperativity`
#> [1] 0.062
```

Please note that the starting motivation of a concept ‘one active cell among how many?’ is still the same. It is just restricted to the special outcome of 37% negative wells. The choice of ‘outcome of 37% negative wells’ is quite artificial and the activity for higher or lower active well ratios would change. But when it comes to survival fractions, this shift is mostly cut out and thereby this approach filters (in parts) a cooperativity-bias in those survival fractions.

Even though the concept of clonogenic activity is somewhat fuzzy in the presence of cellular cooperation, the effect of treatments on this clonogenic capacity can be investigated quite accurately by the same way it is done for the CFA.

The only requirement is a statement on the outcome of interest. At CFA, a number of 20 colonies can be agreed upon - and a comparison of the number of required cells seeded with and without treatment is reasonable.

Analogously, the ratio of cell numbers required to observe the same \(\mu\) (e.g. \(\mu = 1-e^{-1}\)) is of interest, when investigating the effect of certain treatments.

\[SF_x = \frac{s_0(\lambda_0 = 1)}{s_x(\lambda_x = 1)}\] can be calculated from the fitted GLM parameters as \[SF_x = exp\left( \frac{log(p_x)}{b_x} - \frac{log(p_0)}{b_0}\right)\]

Therefore, the interpretation of survival fractions is straight forward. If the number of cells needed to be seeded with treatment is ten times the number without treatment, the survival fraction is calculated as \(0.1\). Note that it is implicitly assumed, that those cells that are non-survivors of the treatment, do not contribute to the cooperative stimulation of the survivors. Otherwise the treatment effect will be underestimated.

Uncertainties for the parameters of the generalized linear model are
returned by the fitting procedure and can be transferred to the
clonogenic activity via the general `predict.glm`

function.

Thus, uncertainties of clonogenic activities are calculated the same way as the activities are (cutting \(y=-1\)).

Uncertainties of the survival fractions can be generally assessed by two ways

- robust approximation by combination of the 84%-confidence-intervals of the activity (method (a), see Austin et al. (2002), Payton et al. (2003), Knol et al. (2011))
- calculation following the law of error propagation through first order Taylor series expansion (method (b))

In cases of extreme cooperativity, method (b) can be numerically unstable and therefore we recommend the use of method (a) instead.

Analysis of the calculated uncertainty bounds of the survival fractions of the 10 non-extreme cell lines under all treatments (see data delivered with this package), we find a mean ratio (uncertainty-bound method(a) / uncertainty-bound method(b)) of 0.9969 and a standard deviation of 0.0049. Thus, the deviation of those two methods is very stable in the order of 1%.

```
LDA_table_act <- LDA_table(x = BT20,uncertainty = "act")
#> reference class is 0
cbind(round(LDA_table_act[,1:5],digits = 1),
round(LDA_table_act[,6:9],digits = 4))
#> treatment act act.CI.lb act.CI.ub b b.pvalue SF SF.CI.lb SF.CI.ub
#> 1 0 29.6 24.9 35.3 1.1 0.1762 NA NA NA
#> 2 1 33.4 28.3 39.6 1.2 0.0430 0.8861 0.6955 1.1288
#> 3 2 40.6 34.4 48.2 1.2 0.0620 0.7288 0.5717 0.9294
#> 4 4 73.0 62.6 85.5 1.4 0.0033 0.4057 0.3214 0.5125
#> 5 6 180.2 152.9 213.4 1.2 0.0378 0.1643 0.1291 0.2092
#> 6 8 423.4 375.4 482.0 2.3 0.0000 0.0699 0.0565 0.0863
LDA_table_ep <- LDA_table(x = BT20,uncertainty = "ep")
#> reference class is 0
cbind(round(LDA_table_ep[,1:5],digits = 1),
round(LDA_table_ep[,6:9],digits = 4))
#> treatment act act.CI.lb act.CI.ub b b.pvalue SF SF.CI.lb SF.CI.ub
#> 1 0 29.6 24.9 35.3 1.1 0.1762 NA NA NA
#> 2 1 33.4 28.3 39.6 1.2 0.0430 0.8861 0.6959 1.1282
#> 3 2 40.6 34.4 48.2 1.2 0.0620 0.7288 0.5720 0.9286
#> 4 4 73.0 62.6 85.5 1.4 0.0033 0.4057 0.3213 0.5121
#> 5 6 180.2 152.9 213.4 1.2 0.0378 0.1643 0.1292 0.2091
#> 6 8 423.4 375.4 482.0 2.3 0.0000 0.0699 0.0565 0.0866
```