


beyond the scope of this thesis. We therefore provide only some data and refer to different relevant publications. We use a top-down approach, starting with APCs and T-cells in general and ending with signalling events at the TCR-pMHC complex.

It is safe to assume that every APC is unique with respect to its antigen profile, that is, the collection of pMHC complexes it presents. This is because the presented peptides do not originate from the APC itself but result from the constant internalisation of cell material from the APC's surroundings. For T-cells the situation is slightly different.

We already mentioned the number of unique T-cells in Section 2.2.4. However, through different events in the periphery, a T-cell exists in a low number of copies rather than being truly unique (for numbers see Section 2.2.4). On the other hand, the number of possible antigens in an individual is estimated to be in the range of 10¹³.

The next lower level is that of single APCs and T-cells. For the former, the number of antigens per cell is estimated to be in the range of 300,000, with about 3,000 unique antigen types present in several copies each [205, 52, 128]. The number of TCRs per T-cell is in the range of 30,000 [93, 127, 187, 70, 43, 189]. The surface area of a DC, as the most important APC, is estimated to be 500 µm² [176, 85] to 1800–2400 µm², where typically more than two-thirds of this area is deployed as dendrites [134]. A T-cell, on the other hand, is much smaller, with an estimated surface area between 19–40 µm² [78] and 150 µm² [134].

The size of the contact region between an APC and a T-cell can vary greatly and was experimentally observed to be in the range of 1 to >70 µm², with a mean of 8 µm² [134].

On the level of the TCR-pMHC interaction, we have to deal with the association and dissociation rates and, of course, the cross-reactivity of a single TCR. The association and dissociation rates vary greatly, and there exist many measurements for different TCRs and antigens together with their capability to induce a T-cell reaction; see for example [189, 146, 28, 48, 103]. The cross-reactivity, that is, the number of different antigens a TCR can react to, is also under discussion and is estimated to be in the range of 100 to 10⁶, where the lower bound seems to be more reasonable [127, 145, 67].

TCR triggering depends on intracellular phosphorylation events, which can be captured by the kinetic proof-reading model. Coombs et al., for example, assume 6 proof-reading steps with a rate of 0.25 s⁻¹.
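As a minimal numerical sketch of the kinetic proof-reading idea: if each of the 6 phosphorylation steps at rate 0.25 s⁻¹ competes with dissociation at rate k_off, a bound complex completes all steps with probability (k_p/(k_p + k_off))⁶. The two k_off values used below are purely hypothetical and serve only to illustrate the amplification of small kinetic differences.

```python
# Kinetic proof-reading sketch. Parameters from the text: 6 steps at
# k_p = 0.25 / s. The k_off values below are hypothetical examples.

def proofreading_prob(k_off, k_p=0.25, n_steps=6):
    """Probability that all n_steps sequential phosphorylation steps
    finish before the complex dissociates (rate k_off)."""
    return (k_p / (k_p + k_off)) ** n_steps

# Two hypothetical ligands differing tenfold in off-rate:
p_slow = proofreading_prob(k_off=0.05)   # slowly dissociating
p_fast = proofreading_prob(k_off=0.5)    # quickly dissociating
print(p_slow / p_fast)  # discrimination far exceeds the 10-fold input difference
```

A tenfold difference in k_off is thus amplified to a difference of more than two orders of magnitude in the completion probability, which is the essence of kinetic proof-reading.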

Many of the numbers mentioned here are still under experimental investigation, and others are still missing. Nevertheless, these numbers should be a good starting point for the development of T-cell activation models.

Chapter 4

Mathematical background and computational methodology

This thesis has two main focuses. One is the development of a deeper understanding of the mechanism of T-cell activation and therewith of foreign-self discrimination.

The other is the development of computational methods that allow for an efficient analysis of probabilistic T-cell activation models that already exist, like the BRB model, or that will emerge in the course of this thesis. In this chapter we introduce the relevant mathematical and computational background and establish a first theoretical result that will allow us, in the following chapters, to develop and modify such an efficient simulation method.

In probabilistic models of T-cell activation, the main task is the estimation of the probability of T-cell activation given different parameter values. Hence, the general problem we now consider is to estimate the probability $P(A)$ of a (rare) event $A$ under a probability measure $P$. The straightforward approach, known as simple sampling, uses the estimate

\[
\big(\widehat{P(A)}\big)_N := \frac{1}{N}\sum_{i=1}^{N} \mathbf{1}\{S^{(i)} \in A\} = \frac{1}{N}\,\operatorname{card}\{1 \le i \le N \mid S^{(i)} \in A\}, \tag{4.1}
\]

where the $\{S^{(i)}\}_{1\le i\le N}$ are independent and identically distributed (i.i.d.) random variables with distribution $P$, $\mathbf{1}\{\cdot\}$ denotes the indicator function, and $N$ is the sample size; we will throughout use $\widehat{v}$ for an estimate of a quantity $v$. $\big(\widehat{P(A)}\big)_N$ is obviously an unbiased and consistent estimate, but, for small $P(A)$, the convergence to $P(A)$ is slow, and large samples are required to get reliable estimates.
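The slow convergence of the simple-sampling estimate (4.1) is easy to see numerically. The toy setting below — a uniform random variable and the rare event $A = \{S < 10^{-3}\}$ — is an assumption chosen purely for illustration:

```python
import random

# Simple sampling as in Eq. (4.1): draw N i.i.d. samples and count
# how many fall into the event A. Toy model: S ~ Uniform(0, 1) and
# A = {S < p} with p = 1e-3, so P(A) = 1e-3 exactly.

def simple_sampling(p_true, n_samples, seed=0):
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if rng.random() < p_true)
    return hits / n_samples  # the estimate (P(A))_N

# With N = 1000 the estimate of a 1e-3 event is dominated by noise
# (often exactly 0); with N = 1e6 it stabilises near the true value.
print(simple_sampling(1e-3, 1_000))
print(simple_sampling(1e-3, 1_000_000))
```

The relative standard deviation of the estimator is roughly $\sqrt{(1-P(A))/(N\,P(A))}$, so the sample size needed for a fixed relative accuracy grows like $1/P(A)$ — the problem the following methods address.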

Various simulation methods are available that deal with this problem and yield a better rate of convergence under the right circumstances (see the monograph by Bucklew [25] for an overview). Most of them achieve this improvement by reducing the variance of the estimator. We will concentrate here on the most widespread class of methods, namely importance sampling. As is well known, one introduces a new sampling distribution $Q$ under which $A$ is more likely to happen, produces samples from this distribution, and gets back to the original distribution by reweighting. In general, finding a good importance sampling distribution that reduces the variance as much as possible is an art, and much of the literature revolves around this. Some “general purpose” and many ad hoc strategies exist, but usually, importance sampling distributions are best tailored by exploiting the structure of the specific problem at hand. However, if the problem can be embedded into a sequence of problems for which a so-called large deviation



principle is valid, a unified theory is available that identifies the most efficient simulation distribution. This technique of “large deviation simulation” was introduced by Sadowski and Bucklew [169], laid down in the monograph by Bucklew [25], and further developed by Dieker and Mandjes [58]. It rests on the well-established theory of large deviations, as summarised, for example, in the books by Dembo and Zeitouni [53] or den Hollander [88]. Let us recapitulate the basic background.
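The sample-and-reweight idea can be sketched with a standard toy problem: estimating the Gaussian tail probability $P(X > a)$ for $X \sim N(0,1)$. The sampling distribution $Q = N(a,1)$ used below is the usual exponential tilt for this example; it is a textbook choice and not taken from the text above.

```python
import math
import random

# Importance sampling for the rare event {X > a}, X ~ N(0, 1).
# Sample from the shifted distribution Q = N(a, 1), under which the
# event is no longer rare, and reweight by the likelihood ratio
#   dP/dQ(x) = exp(-a*x + a^2/2).

def is_estimate(a, n_samples, seed=1):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(a, 1.0)                      # draw from Q
        if x > a:                                  # indicator 1{x > a}
            total += math.exp(-a * x + 0.5 * a * a)  # reweight to P
    return total / n_samples

# Exact tail probability for comparison, via the complementary
# error function: P(X > a) = erfc(a / sqrt(2)) / 2.
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))
print(is_estimate(4.0, 100_000), exact)
```

With $10^5$ samples the reweighted estimate agrees with the exact value of about $3.2\cdot 10^{-5}$ to within a few percent, whereas simple sampling with the same budget would typically see only a handful of hits.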

4.1 Large deviation probabilities

Consider a sequence $\{S_n\}$ of random variables on the probability space $(\mathbb{R}^d, \mathcal{B}, P)$, where $\mathcal{B}$ is the Borel $\sigma$-algebra of $\mathbb{R}^d$. Let $\{P_n\}$ be the family of probability measures induced by $\{S_n\}$, i.e., $P_n(B) = P(S_n \in B)$ for $B \in \mathcal{B}$. We assume throughout that $\{S_n\}$ satisfies a large deviation principle (LDP) according to the following definition [53, 58]:

Definition 4.1 (Large deviation principle). A family of probability measures $\{P_n\}$ on $(\mathbb{R}^d, \mathcal{B})$ satisfies the large deviation principle (LDP) with rate function $I$ if $I : \mathbb{R}^d \to [0,\infty]$ is lower semicontinuous and, for all $B \in \mathcal{B}$,

\[
-\inf_{x \in B^{\circ}} I(x) \le \liminf_{n\to\infty} \frac{1}{n} \log P_n(B) \le \limsup_{n\to\infty} \frac{1}{n} \log P_n(B) \le -\inf_{x \in \overline{B}} I(x), \tag{4.2}
\]

where $B^{\circ} := \operatorname{int}(B)$ and $\overline{B} := \operatorname{clos}(B)$ denote the interior and the closure of $B$, respectively. $I$ is said to be a good rate function if it has compact level sets, in the sense that $I^{-1}([0,c]) = \{x \in \mathbb{R}^d : I(x) \le c\}$ is compact for all $c \ge 0$.

A set $B$ is called an $I$-continuity set if

\[
\inf_{x \in B^{\circ}} I(x) = \inf_{x \in B} I(x) = \inf_{x \in \overline{B}} I(x). \tag{4.3}
\]

If $B$ is such a set, the LDP means that $P_n(B)$ decays exponentially for large $n$, with decay coefficient $\inf_{x \in B} I(x)$. A point $b$ is called a minimum rate point of $B$ if $\inf_{x \in B} I(x) = I(b)$.
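This exponential decay can be checked numerically in the classical Cramér setting of empirical means of i.i.d. fair coin flips — a textbook example, not taken from this thesis, where the rate function is the relative entropy $I(x) = x\log(2x) + (1-x)\log(2(1-x))$:

```python
import math

# LDP illustration (Cramer's theorem for fair Bernoulli variables):
# for the empirical mean S_n of n fair coin flips and a > 1/2,
# P(S_n >= a) decays like exp(-n * I(a)) with
#   I(x) = x*log(2x) + (1 - x)*log(2(1 - x)).

def rate_function(x):
    return x * math.log(2 * x) + (1 - x) * math.log(2 * (1 - x))

def exact_tail(n, a):
    """P(mean of n fair coin flips >= a), computed exactly."""
    k_min = math.ceil(n * a)
    return sum(math.comb(n, k) for k in range(k_min, n + 1)) / 2 ** n

# -log P_n(B) / n should approach the decay coefficient I(a):
for n in (50, 200, 800):
    print(n, -math.log(exact_tail(n, 0.75)) / n, rate_function(0.75))
```

The printed decay exponents approach $I(0.75) \approx 0.131$ from above as $n$ grows, with the familiar $O(\log n / n)$ pre-exponential correction.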

Large deviation principles are well known for many families of random variables, like empirical means of i.i.d. random variables or empirical measures of Markov chains. For the application we have in mind, which involves sums of independent, but not identically distributed, random variables, we need the fairly general setting of the Gärtner-Ellis theorem, which we recapitulate here (cf. [53, Thm. 2.3.6] and [88, Ch. V]). Let $\varphi_n(\vartheta) := E_{P_n}(e^{\langle \vartheta, S_n \rangle})$, $\vartheta \in \mathbb{R}^d$, be the moment-generating function of $S_n$, where $\langle \cdot, \cdot \rangle$ denotes the scalar product and $E_{\mu}(\cdot)$ denotes the expectation of a random variable with respect to the probability measure $\mu$.

Theorem 4.2 (Gärtner-Ellis). Assume that

(G1) $\lim_{n\to\infty} \frac{1}{n} \log \varphi_n(n\vartheta) =: \Lambda(\vartheta) \in [-\infty, \infty]$ exists,

(G2) $0 \in \operatorname{int}(D_{\Lambda})$, where $D_{\Lambda} := \{\vartheta \in \mathbb{R}^d : \Lambda(\vartheta) < \infty\}$ is the effective domain of $\Lambda$,
