
Then the procedure is applied to simulated data sets in order to evaluate its statistical properties, including considerations of the impact of between-marker LD. In the next step, a real data set is re-analyzed for illustration. Finally, it is shown and argued that a combination of the proposed procedure with an optimized multi-stage design will offer both gains in cost efficiency and increased flexibility.

for all $J \subset M_1 \cup M_2$ with $i \in J$, for all Borel sets $B$, and for all $k$.

The researcher starts with an initial design consisting of a planned total sample size $n$, a sub-sample size $n_1 < n$ for the first stage, and a marker set $M_1 \subset M$ of $m_1$ markers to be genotyped in the first $n_1$ subjects. At the planning stage, she or he furthermore specifies $m$, the maximum number of markers that can be tested, i.e. the whole set of markers $M_1 \cup M_2$ considered for testing must consist of no more than $m$ markers. For example, for fine mapping in regions of interest the researcher may wish to add some markers in the second stage. Note that the decision on which markers to choose can be postponed until the start of the second stage. For $i \in M_1$ the planned final test of $H_{0i}$, adjusted for multiple testing in $M$, is given by the indicator function $I\{T_{ni} > t_{\alpha/m}\}$, where $t_{\gamma} := F^{-1}(1 - \gamma)$ denotes the quantiles of $F$ and where $\alpha$ is the pre-specified level for the genome-wide FWER. Of note, the specification of $m$ depends on both power/sample size considerations and flexibility. Hence, usually $|M_1| < m \ll |M|$.
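As a small numerical illustration of the planned critical limit $t_{\alpha/m}$, the following sketch assumes that $F$ is the standard normal distribution function, as for a one-sided z-test. This choice of $F$, the function name `critical_limit`, and the values of `alpha` and `m` are assumptions made for illustration only and are not taken from the text.

```python
from statistics import NormalDist


def critical_limit(gamma: float) -> float:
    """Return t_gamma = F^{-1}(1 - gamma), ASSUMING F is standard normal."""
    return NormalDist().inv_cdf(1.0 - gamma)


alpha = 0.05     # genome-wide FWER level (illustrative value)
m = 500_000      # maximum number of markers to be tested (illustrative value)

# Bonferroni-type limit for the planned final test of each first-stage marker
t = critical_limit(alpha / m)
print(f"t_(alpha/m) = {t:.3f}")
```

A larger $m$ buys flexibility for adding markers in the second stage, but it raises the critical limit of every planned test, which is the trade-off between power and flexibility mentioned above.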

Let $X$ denote the random vector of the marker information on markers in $M_1$ collected at the first stage of the study and let $x$ denote the observed value of $X$. After the first stage, the researcher can use the whole information in $X$ to re-define the total sample size $\tilde{n}$ and to define the marker set $M_2 \subset M$, $|M_1 \cup M_2| \le m$, of $m_2$ markers which will be genotyped in the new $\tilde{n}_2 = \tilde{n} - n_1$ subjects and for which a statistical test will be performed at the end of the study. Both $M_2$ and $\tilde{n}$ are random variables, $M_2 = M_2(X)$ and $\tilde{n} = \tilde{n}(X)$, but a formal rule need not be specified for the selection of $M_2$ and $\tilde{n}$. The final statistical test for all markers in $M_2$ will be based on all marker information available from both stages (Skol et al., 2006), so the following marker sets can be identified: markers selected from $M_1$ ($M_2 \cap M_1$), new, previously untyped markers ($M_2 \setminus M_1$), and abandoned markers ($M_1 \setminus M_2$). As an example, imagine that $M_1$ is one of the currently available SNP arrays (Barrett and Cardon, 2006) while $M_2$ is a set of markers for which genotyping is done either with a custom array or with a different technology such as matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry.

5.2.2 The flexible two-stage procedure

The basic idea behind the construction of the proposed flexible two-stage multiple testing procedure is to combine closed testing (Marcus et al., 1976) with the CRP approach (Müller and Schäfer, 2004). According to the closed testing principle, any single marker hypothesis $H_{0i}$, $i \in M_2$, can be rejected with control of the FWER at level $\alpha$ if all intersection hypotheses $H_{0J}$ with $i \in J \subset M$ can be rejected at level $\alpha$. For any $J \subset M$, $J \neq \emptyset$, the intersection hypothesis $H_{0J}$ is tested, according to the initial design of the study, with Bonferroni-type adjusted critical limits, which correspond to the decision function $\tilde{\varphi}_J := \min\{1, \varphi_J\}$ where

$$\varphi_J := \sum_{i \in J \cap M_1} I\{T_{ni} > t_{\alpha_J}\} + \min\{m - m_1, |J \setminus M_1|\} \cdot \alpha_J \quad \text{and} \quad \alpha_J = \frac{\alpha}{|J \cap M_1| + \min\{m - m_1, |J \setminus M_1|\}}.$$

After observation of $X$ the researcher is allowed to change every $\varphi_J$ into a new function $\psi_J \ge 0$, along with the decision function $\tilde{\psi}_J := \min\{1, \psi_J\}$, such that $E_J(\tilde{\psi}_J \mid X) \le E_J(\tilde{\varphi}_J \mid X)$ holds for the conditional expectations under $\Pr_J$ for all $J \subset M$, $J \neq \emptyset$, according to the CRP principle. Then this flexible Bonferroni-Holm-type test procedure for the family $(H_{0J})_{J \subset M, J \neq \emptyset}$ maintains control of the FWER if $E_J(\tilde{\varphi}_J) \le \alpha$ for all $J \subset M$, $J \neq \emptyset$. The conditional expectations of decision functions are called conditional rejection probabilities.
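To make the planned Bonferroni-type decision function for an intersection hypothesis concrete, the sketch below computes the adjusted level for a marker set J and the truncated decision function min{1, ϕ_J}. It assumes that $F$ is standard normal; the function names, marker labels, and numbers are hypothetical and serve only as an illustration.

```python
from statistics import NormalDist


def alpha_J(J, M1, m, alpha):
    """alpha_J = alpha / (|J ∩ M1| + min(m - m1, |J minus M1|))."""
    denom = len(J & M1) + min(m - len(M1), len(J - M1))
    if denom == 0:
        raise ValueError("alpha_J is undefined: J contributes no testable marker")
    return alpha / denom


def phi_tilde_J(J, M1, m, alpha, T):
    """Planned decision function min(1, phi_J) for H_0J.

    T maps each first-stage marker to its planned final statistic T_ni.
    ASSUMPTION: F is the standard normal distribution function.
    """
    a_J = alpha_J(J, M1, m, alpha)
    t = NormalDist().inv_cdf(1.0 - a_J)              # critical limit t_{alpha_J}
    rejections = sum(1 for i in J & M1 if T[i] > t)  # indicator sum over J ∩ M1
    # Level reserved for markers of J that are not typed in the first stage
    reserved = min(m - len(M1), len(J - M1)) * a_J
    return min(1.0, rejections + reserved)


M1 = {"rs1", "rs2", "rs3"}               # hypothetical first-stage marker set
m = 5                                    # at most five markers may be tested
J = {"rs1", "rs9"}                       # one typed marker, one untyped marker
T = {"rs1": 0.5, "rs2": 1.2, "rs3": 3.4}  # hypothetical planned statistics

print(alpha_J(J, M1, m, 0.05))
print(phi_tilde_J(J, M1, m, 0.05, T))
```

Under the intersection null hypothesis each indicator has expectation equal to the adjusted level, so the expectation of ϕ_J is the adjusted level times the denominator, i.e. $\alpha$, which is the level condition required for FWER control.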

Throughout this chapter, let $\mathrm{CRP}_J(c)$ denote the sum of the conditional rejection probabilities of all markers in $J \subset M$, $J \neq \emptyset$, for an arbitrary critical limit $c$ at the planning stage, i.e.,

$$\mathrm{CRP}_J(c) := \sum_{i \in J \cap M_1} E_i\big(I\{T_{ni} > c\} \mid X = x\big) + \min\{m - m_1, |J \setminus M_1|\} \cdot \big(1 - F(c)\big).$$
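The planning-stage sum of conditional rejection probabilities can be evaluated in closed form once a model for the test statistics is fixed. The sketch below assumes, purely for illustration, that the planned statistic of a first-stage marker is the weighted sum $\sqrt{n_1/n}\,Z_1 + \sqrt{(n - n_1)/n}\,Z_2$ of two independent standard normal stage-wise z-scores and that $F$ is standard normal. This model and all function names are assumptions and are not specified in the text.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist()  # ASSUMPTION: F is the standard normal distribution


def cond_rej_prob(z1, c, n1, n):
    """P(T_n > c | first-stage z-score z1) under the null hypothesis.

    ASSUMES T_n = sqrt(n1/n) * Z1 + sqrt((n - n1)/n) * Z2 with Z2 ~ N(0, 1)
    independent of Z1. Requires n1 < n.
    """
    w1 = sqrt(n1 / n)
    w2 = sqrt((n - n1) / n)
    return 1.0 - PHI.cdf((c - w1 * z1) / w2)


def crp_planned(J, z1, c, n1, n, m):
    """Planning-stage CRP_J(c); z1 maps every marker of M1 to its stage-1 z-score."""
    M1 = set(z1)
    typed = sum(cond_rej_prob(z1[i], c, n1, n) for i in J & M1)
    # Markers outside M1 carry no first-stage data and contribute 1 - F(c) each
    untyped = min(m - len(M1), len(J - M1)) * (1.0 - PHI.cdf(c))
    return typed + untyped


z1 = {"rs1": 2.5, "rs2": 0.3, "rs3": -1.1}  # hypothetical first-stage z-scores
print(crp_planned({"rs1", "rs2", "rs9"}, z1, c=4.0, n1=500, n=1000, m=5))
```

Markers with promising first-stage results contribute large conditional rejection probabilities, so keeping them in the second stage leaves more room for the modified design.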

In the new procedure, after the first stage has been completed, $\varphi_J$ is replaced by $\psi_J := \sum_{i \in J \cap M_2} I\{T_{\tilde{n}i} > c_J\}$ if $J \cap M_2 \neq \emptyset$ and by $\psi_J := 0$ if $J \cap M_2 = \emptyset$, where $T_{\tilde{n}i}$ is the test statistic for the single marker $i \in M_2$ based on the modified total sample size $\tilde{n}$, i.e. based on all $\tilde{n}$ subjects if $i \in M_1 \cap M_2$ and based on the $\tilde{n}_2$ new subjects if $i \in M_2 \setminus M_1$, and where $c_J$ is a common critical limit for the markers in $J$.

After these design modifications, let $\widetilde{\mathrm{CRP}}_J(c)$ denote in the following the sum of the conditional rejection probabilities of all markers in $J \subset M$, $J \neq \emptyset$, at an arbitrary critical limit $c$, i.e., $\widetilde{\mathrm{CRP}}_J(c) = \sum_{i \in J \cap M_2} E_i\big(I\{T_{\tilde{n}i} > c\} \mid X = x\big)$.

For every $J \subset M$, $J \neq \emptyset$, define $c_J := \inf\{c \mid \widetilde{\mathrm{CRP}}_J(c) \le \mathrm{CRP}_J(t_{\alpha_J})\}$. In Scherag et al. (2008) a mathematical proof is given that $E_J(\tilde{\psi}_J \mid X) \le E_J(\tilde{\varphi}_J \mid X)$ for all $J \subset M$, $J \neq \emptyset$. In this way a flexible closed testing procedure at level $\alpha$ is defined.

In addition, it is demonstrated in Scherag et al. (2008) that $H_{0i}$ can be rejected with control of the FWER at level $\alpha$ for any marker $i \in M_2$ for which $T_{\tilde{n}i} > \tilde{c}$ holds true. If the total sample size is not modified, i.e., $\tilde{n} = n$, the common critical limit $\tilde{c}$ is defined by

$$\tilde{c} := \inf\Big\{c \;\Big|\; \widetilde{\mathrm{CRP}}_{M_2}(c) \le \min_{J \subset M,\, J \supset M_2} \mathrm{CRP}_J(t_{\alpha_J})\Big\}. \tag{5.1}$$

If the researcher modifies the total sample size, a further adjustment of the critical limit $\tilde{c}$ may be necessary (see Scherag et al., 2008).

The minimum in condition (5.1) can be found by the following algorithm.

Initialize: min $= \mathrm{CRP}_{M_2}(t_{\alpha/m_2})$
DO $h = 1$ TO $|M_1 \setminus M_2| + m - |M_1 \cup M_2|$
    [ Identify the marker set $J \subset M \setminus M_2$, $|J| = h$, with $\mathrm{CRP}_{M_2 \cup J}(t_{\alpha/(m_2+h)}) = \min_{H \subset M \setminus M_2,\, |H| = h} \mathrm{CRP}_{M_2 \cup H}(t_{\alpha/(m_2+h)})$.
      Calculate $\mathrm{CRP}_{M_2 \cup J}(t_{\alpha/(m_2+h)})$ given the initial planning with $n$ individuals.
      IF $\mathrm{CRP}_{M_2 \cup J}(t_{\alpha/(m_2+h)}) <$ min, THEN min $= \mathrm{CRP}_{M_2 \cup J}(t_{\alpha/(m_2+h)})$ ]

Note that this algorithm is of linear complexity if $\arg\min_{H \subset M_1 \setminus M_2,\, |H| = h} \mathrm{CRP}_{M_2 \cup H}(c)$ does not depend on the choice of $c$, which is the case in most practical situations where the same number of individuals is genotyped in the first stage for all markers in $M_1$. In this case, before starting the DO loop the markers in $M_1 \setminus M_2$ are ordered by the values of $e_i = E_i(I\{T_{ni} > c\} \mid X = x)$, for an arbitrary choice of $c$. In every step of the DO loop, the marker set $J$ is then generated by adding the remaining marker with the smallest $e_i$ to the set $J$ identified in the last step. In a certain step of this loop, the $m - |M_1 \cup M_2|$ placeholders for additional markers which finally were not included in $M_2$ can also be added to the set $J$.
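The search for the minimum on the right-hand side of (5.1) can be sketched as follows. Because the planning-stage CRP is a sum over markers, the minimum over all sets of size h equals the sum of the h smallest individual contributions at the current critical limit. The sketch therefore simply re-sorts the candidates in every step, so it does not rely on an ordering that is independent of c and is not the linear-time variant described above. As assumptions for illustration only, $F$ is standard normal and the planned statistic is a weighted sum of independent stage-wise z-scores; all names and numbers are hypothetical, and the placeholder list is only practical for small $m$.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist()  # ASSUMPTION: F is the standard normal distribution


def cond_rej_prob(z1, c, n1, n):
    """P(T_n > c | z1), ASSUMING T_n = sqrt(n1/n)*Z1 + sqrt((n-n1)/n)*Z2."""
    return 1.0 - PHI.cdf((c - sqrt(n1 / n) * z1) / sqrt((n - n1) / n))


def min_planned_crp(z1, M2, m, alpha, n1, n):
    """Minimum of the planning-stage CRP over all J with M2 ⊂ J ⊂ M.

    z1 maps every marker of M1 to its observed first-stage z-score.
    """
    if not M2:
        raise ValueError("M2 must contain at least one marker")
    M1 = set(z1)
    m2 = len(M2)
    kept = M2 & M1                        # markers carried over from stage 1
    n_new = len(M2 - M1)                  # new markers, no first-stage data
    dropped = M1 - M2                     # abandoned first-stage markers
    n_placeholders = m - len(M1 | M2)     # unused slots for further markers

    def crp_at(h):
        c = PHI.inv_cdf(1.0 - alpha / (m2 + h))   # t_{alpha/(m2+h)}
        tail = 1.0 - PHI.cdf(c)
        base = sum(cond_rej_prob(z1[i], c, n1, n) for i in kept) + n_new * tail
        # Candidate contributions: abandoned markers and placeholders
        cands = [cond_rej_prob(z1[i], c, n1, n) for i in dropped]
        cands += [tail] * n_placeholders
        cands.sort()
        return base + sum(cands[:h])              # h smallest contributions

    best = crp_at(0)                              # Initialize with J = M2
    for h in range(1, len(dropped) + n_placeholders + 1):
        best = min(best, crp_at(h))
    return best


z1 = {"a": 3.0, "b": 0.1, "c": -1.0}  # hypothetical first-stage z-scores
print(min_planned_crp(z1, M2={"a", "x"}, m=5, alpha=0.05, n1=100, n=200))
```

The returned value is the bound that the modified sum of conditional rejection probabilities of the markers in $M_2$ must not exceed.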

The common critical limit $\tilde{c}$ for the final test of each marker can be determined by a bisection search over $\tilde{c}$ in equation (5.1). Note that $E_i(I\{T_{\tilde{n}i} > c\} \mid X = x) = 1 - F(c)$ for markers $i \in M_2 \setminus M_1$, as their $T_{\tilde{n}i}$ and $X$ are stochastically independent.
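A bisection search of this kind can be sketched as follows. The sketch assumes that the total sample size is unchanged, that $F$ is standard normal, and that the statistic of a retained first-stage marker is a weighted sum of independent stage-wise z-scores. The target value stands for the right-hand side of (5.1) and is set to an arbitrary illustrative number. All names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

PHI = NormalDist()  # ASSUMPTION: F is the standard normal distribution


def modified_crp_M2(c, z1_kept, n_new, n1, n):
    """Sum of conditional rejection probabilities of the markers in M2 at limit c.

    z1_kept: first-stage z-scores of the markers in M1 ∩ M2.
    n_new:   number of markers in M2 that are not in M1 (each adds 1 - F(c)).
    """
    w1, w2 = sqrt(n1 / n), sqrt((n - n1) / n)
    kept = sum(1.0 - PHI.cdf((c - w1 * z) / w2) for z in z1_kept)
    return kept + n_new * (1.0 - PHI.cdf(c))


def bisect_critical_limit(crp, target, lo=-40.0, hi=40.0, tol=1e-10):
    """Smallest c in [lo, hi] (up to tol) with crp(c) <= target.

    Requires crp to be continuous and non-increasing in c.
    """
    if crp(hi) > target:
        raise ValueError("target is not reached inside the search interval")
    if crp(lo) <= target:
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if crp(mid) <= target:
            hi = mid      # condition holds at mid: move the upper end down
        else:
            lo = mid      # condition fails at mid: move the lower end up
    return hi             # hi always satisfies crp(hi) <= target


target = 0.01  # illustrative stand-in for the right-hand side of (5.1)
c_tilde = bisect_critical_limit(
    lambda c: modified_crp_M2(c, z1_kept=[3.0, 0.1], n_new=1, n1=100, n=200),
    target,
)
print(c_tilde)
```

Returning the upper end of the final interval keeps the result on the conservative side, since the inequality in (5.1) is guaranteed to hold there.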

If the null hypotheses for a subset of $M_2$ can be rejected at a critical limit of $\tilde{c}$, Holm steps may follow in order to reject null hypotheses for further markers in $M_2$ until no additional null hypotheses can be rejected. For such a step, $M$ (and accordingly $M_1$ and $M_2$) is reduced by the markers for which the null hypotheses could be rejected, and the algorithm is then repeated to determine a critical limit lower than $\tilde{c}$.