Limits and prospects of gene characterization using estimates of genetic

In chapter 4, interim effect size estimators were used for sample size reassessment in a candidate gene association study. Furthermore, the usefulness of the interval estimators derived in chapter 3 (for details see Scherag et al., 2002) for indicating the necessity of sample size reassessments was explored. Initial simulations at least warrant further investigation.

With regard to the role of interim estimators, it was argued that the estimation of genetic effect sizes provides more information than the mere result of a statistical test such as the TDT (Spielman et al., 1993). Yet, one should also note that using the estimates proposed in chapter 3 implies that marker and trait locus coincide. Otherwise, incomplete LD between the marker SNP and the disease locus will lead to effect size estimates biased towards the null hypothesis (Franke et al., 2005; Zondervan and Cardon, 2004). If the same data set was used to identify the variant of interest (Garner, 2007; Lohmueller et al., 2003; Göring et al., 2001), an upward bias called the “winner's curse” is more likely to be present, for at least two reasons. First, samples that are used for the initial association mapping are often collected to oversample affected individuals relative to their frequency in the population. As a consequence, genetic risk estimates may not be transferable to the general population. Second, in a genomewide association study, selecting the most extreme test statistics, such as the smallest p-values, is often equivalent to selecting the most extreme genetic effect size estimates. Hence, the estimates must also be adjusted with respect to this multiplicity.

To circumvent this problem, genetic effect sizes may be estimated in independent, appropriately powered confirmatory studies (Garner, 2007). Alternatively, these problems may be addressed by developing bias-corrected estimators, which is an active area of research (Yu et al., 2007; Zöllner and Pritchard, 2007; Huang and Lin, 2007).
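The selection effect described above can be illustrated with a small simulation. The following Python sketch is not taken from the thesis. It assumes that the estimated log odds ratio of a marker is normally distributed around a true value `true_log_or` with standard error `se`, and that only estimates passing a two-sided significance threshold `alpha` are reported. All names and parameter values are hypothetical and chosen for illustration only.

```python
import random
import statistics
from statistics import NormalDist


def simulate_winners_curse(true_log_or=0.2, se=0.1, alpha=1e-4,
                           n_rep=50000, seed=1):
    """Compare the mean of all estimates with the mean of 'significant' ones.

    Assumption (illustrative only): each estimate is drawn from a normal
    distribution with mean true_log_or and standard deviation se.
    """
    rng = random.Random(seed)
    # Two-sided critical value of the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)

    all_estimates = []
    selected = []
    for _ in range(n_rep):
        estimate = rng.gauss(true_log_or, se)
        all_estimates.append(estimate)
        # Keep only estimates whose Wald statistic passes the threshold.
        if abs(estimate / se) >= z_crit:
            selected.append(estimate)

    mean_all = statistics.mean(all_estimates)
    mean_selected = statistics.mean(selected) if selected else float("nan")
    return mean_all, mean_selected


if __name__ == "__main__":
    mean_all, mean_selected = simulate_winners_curse()
    print("mean of all estimates:        ", round(mean_all, 3))
    print("mean of significant estimates:", round(mean_selected, 3))
```

Under these assumptions, the mean over all replicates stays close to the true value, whereas the mean over the significant replicates is clearly larger, because only estimates of at least `z_crit * se` in absolute value can pass the threshold.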

Note that unbiased estimation of genetic effects is also of crucial importance for sample size planning of confirmatory studies. Moreover, reliable gene characterization is a first step towards clinical applications, that is, the implementation of genetic marker information for prognostic, diagnostic or interventional purposes. A vision for such aspects of genomics research can be found in Collins et al. (2003). Zeggini and McCarthy (2007) and Janssens et al. (2006a,b,c), however, discuss the merits of predictive genetic testing for non-insulin dependent diabetes mellitus, both for the real data example of the transcription factor 7-like 2 (TCF7L2) haplotypes (e.g., Grant et al., 2006; Helgason et al., 2007) and within a wider scope.

More than 10 years after the proposal of Risch and Merikangas (1996) to initiate genomewide association studies, the “harvest” seems to begin, as indicated by weekly reports of such studies in high-ranking journals (e.g., Wellcome Trust Case Control Consortium, 2007; Frayling et al., 2007; Frayling, 2007; McPherson et al., 2007; Hampe et al., 2007; Sladek et al., 2007; Herbert et al., 2006; Arking et al., 2006; Klein et al., 2005). Exploiting these findings in order to understand pathological processes (e.g., Bourgain et al., 2007) and to address clinical goals, however, is a far more difficult task that will vary substantially between the complex clinical phenotypes investigated. While the identification of associated markers with smaller, consistent effects will continue, cooperation with experts in clinical, cellular, animal and molecular biology will be needed in order to work out the mechanisms behind these associations and to bridge the gap between a statistical finding and its clinical implication.

7 Summary

Genetic association studies have become the most widely used gene mapping tool for the identification of disease susceptibility loci of complex common traits or diseases. In order to obtain sufficient power at a certain significance level (type I error risk), these analyses require a complete pre-specification of the total number of individuals to be sampled and genotyped. However, in most of these studies little information about the genetic effect size is available beforehand and thus it is difficult to calculate a reasonable sample size.
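How strongly the required sample size depends on the assumed effect can be sketched with the standard normal-approximation formula for comparing two proportions, applied here to allele frequencies in cases and controls. This is an illustrative simplification and not the sample size method used in this thesis. The function names, the allelic odds ratio model, and the default values are assumptions.

```python
import math
from statistics import NormalDist


def case_allele_freq(p0, odds_ratio):
    """Allele frequency in cases implied by the control frequency p0 and an
    allelic odds ratio (hypothetical effect model, for illustration)."""
    return odds_ratio * p0 / (1 - p0 + odds_ratio * p0)


def alleles_per_group(p0, odds_ratio, alpha=0.05, power=0.8):
    """Approximate number of sampled alleles per group (cases, controls)
    for a two-sided test of equal allele frequencies."""
    if odds_ratio == 1:
        raise ValueError("no effect: the required sample size is unbounded")
    p1 = case_allele_freq(p0, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)


if __name__ == "__main__":
    # The same design looks very different under different assumed effects.
    for odds_ratio in (1.2, 1.5, 2.0):
        print(odds_ratio, alleles_per_group(p0=0.3, odds_ratio=odds_ratio))
```

Running the loop shows that the required number of alleles drops sharply as the assumed odds ratio increases, which is why a poorly informed guess about the effect size can make a fixed design either underpowered or wasteful.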

By addressing these problems, this thesis aims at introducing, extending, and evaluating statistical methodology on design adaptations for genetic association studies.

In particular, it is shown how a confirmatory candidate gene association study can be planned and analyzed given the mentioned uncertainties. For this purpose, an adaptive group sequential procedure is developed. If the null hypothesis cannot be rejected at the interim analysis, the design of the subsequent study part can be modified depending on the interim data. As an example, the sample size may be reassessed using interim effect size estimates, which are also developed in this thesis. Finally, a new flexible two-stage design for genomewide association studies is presented. While providing strong control of the genomewide, family-wise type I error rate, the new design might also be more cost-efficient than all previously suggested designs because of its greater flexibility. Examples of this flexibility are the possibility to base marker selection on biological instead of statistical criteria and the option to modify the sample size at any time during the course of the project.
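To make the idea of a two-stage adaptive test concrete, the following Python sketch implements the decision rule of Fisher's product test in the form proposed by Bauer and Köhne (1994). It is shown only as one well-known example of an adaptive design and is not claimed to be the procedure developed in this thesis. The function names and the boundary values `alpha1` (early rejection) and `alpha0` (futility stop) are illustrative assumptions. The type I error rate is preserved under design changes as long as the second p-value is computed from second-stage data only.

```python
import math


def fisher_product_critical_value(alpha, alpha1, alpha0):
    """Critical value c for the product p1 * p2 such that
    alpha1 + c * (ln(alpha0) - ln(alpha1)) = alpha.
    The formula requires 0 < c <= alpha1."""
    c = (alpha - alpha1) / (math.log(alpha0) - math.log(alpha1))
    if not 0 < c <= alpha1:
        raise ValueError("boundaries are not compatible with this formula")
    return c


def two_stage_decision(p1, p2=None, alpha=0.05, alpha1=0.0233, alpha0=0.5):
    """Decision of a two-stage test based on stage-wise p-values p1 and p2."""
    c = fisher_product_critical_value(alpha, alpha1, alpha0)
    if p1 <= alpha1:
        return "reject at interim"
    if p1 > alpha0:
        return "stop for futility"
    if p2 is None:
        # The second stage may now be redesigned, e.g. with a new sample size.
        return "continue to stage 2"
    return "reject at final analysis" if p1 * p2 <= c else "do not reject"


if __name__ == "__main__":
    print(two_stage_decision(0.10))        # interim result only
    print(two_stage_decision(0.10, 0.05))  # after the second stage
```

In this sketch, the interim p-value decides between early rejection, stopping, and continuation. Only in the last case is the second stage carried out, possibly with a reassessed sample size.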

Both the candidate and the genomewide association designs are evaluated using simulated and real data sets. Finally, prospects and limits of design adaptation methods and estimators of genetic effects in genetic association studies of complex traits are discussed.

8 Summary (translated from the German Zusammenfassung)

Genetic association studies are a frequently used method for identifying susceptibility genes of complex diseases (e.g., asthma, obesity, breast cancer). To make a statement about the statistical power of such studies at a given significance level (type I error risk), the number of individuals to be genotyped must be specified. Valid sample size planning depends strongly on the expected genetic effects, about which often little or no information is available. The aim of this dissertation is to introduce, develop, and evaluate methods for data-adaptive modification of the study design in the field of genetic association studies.

After an introduction to the problem outlined above, the planning and analysis of a confirmatory candidate gene study under the given uncertainties is presented first. For this purpose, an adaptive group sequential procedure is developed. If the null hypothesis cannot be rejected at an interim analysis, the proposed procedure allows a data-dependent adaptation of the study design. One example is a sample size modification that depends on estimators of genetic effects, which are also developed in this thesis. This idea is then transferred to genomewide association studies. In addition to genomewide control of the type I error (family-wise type I error rate in a strong sense), the newly developed two-stage procedure can, owing to its flexibility, contribute to greater cost efficiency in comparison to all previously proposed procedures. For example, genetic markers for a second genotyping stage can now also be selected on the basis of biological criteria.

The procedures developed for candidate gene and genomewide association studies are evaluated both theoretically and by means of simulation studies. In addition, real data sets of complex phenotypes are used to demonstrate the applicability of the procedures. The dissertation concludes with a discussion of the prospects and limits of adaptive procedures and genetic estimators in genetic association studies of complex diseases.

References

Ahituv N, Kavaslar N, Schackwitz W, Ustaszewska A, Martin J, Hebert S, Doelle H, et al (2007). Medical sequencing at the extremes of human body mass. Am J Hum Genet 80(4):779–791.

Altmüller J, Palmer LJ, Fischer G, Scherb H, Wjst M (2001). Genomewide scans of complex human diseases: true linkage is hard to find. Am J Hum Genet 69(5):936–950.

Altshuler D, Clark AG (2005). Genetics. Harvesting medical information from the human family tree. Science 307(5712):1052–1053.

Arking DE, Pfeufer A, Post W, Kao WH, Newton-Cheh C, Ikeda M, West K, et al (2006). A common genetic variant in the NOS1 regulator NOS1AP modulates cardiac repolarization. Nat Genet 38(6):644–651.

Armitage P (1955). Tests for linear trends in proportions and frequencies. Biometrics 11:375–386.

Armitage P, McPherson CK, Rowe BC (1969). Repeated significance tests on accumulating data. Journal of the Royal Statistical Society A 132:235–244.

Balding DJ (2006). A tutorial on statistical methods for population association studies. Nat Rev Genet 7(10):781–791.

Barrett JC, Cardon LR (2006). Evaluating coverage of genome-wide association studies. Nat Genet 38(6):659–662.

Bauer P (1989). Multistage testing with adaptive designs (with discussion). Biometrie und Informatik in Medizin und Biologie 20:130–148.

Bauer P, Brannath W (2004). The advantages and disadvantages of adaptive designs for clinical trials. Drug Discov Today 9(8):351–357.

Bauer P, Einfalt J (2006). Application of adaptive designs–a review. Biom J 48(4):493–506.

Bauer P, Köhne K (1994). Evaluation of experiments with adaptive interim analyses. Biometrics 50(4):1029–1041.

Benichou J, Palta M (2005). Handbook of Epidemiology., pp. 89–156. New York: Springer.

Benjamini Y, Drai D, Elmer G, Kafkafi N, Golani I (2001). Controlling the false discovery rate in behavior genetics research. Behav. Brain Res 125(1-2):279–284.

Bentley DR (2006). Whole-genome re-sequencing. Curr Opin Genet Dev 16(6):545–552.

Boehnke M (1994). Limits of resolution of genetic linkage studies: implications for the positional cloning of human disease genes. Am J Hum Genet 55(2):379–390.

Boehringer S, Epplen JT, Krawczak M (2000). Genetic association studies of bronchial asthma–a need for Bonferroni correction? Hum Genet 107(2):197.

Botstein D, Risch N (2003). Discovering genotypes underlying human phenotypes: past successes for mendelian disease, future approaches for complex disease. Nat Genet 33 Suppl:228–237.

Bourgain C, Génin E, Cox N, Clerget-Darpoux F (2007). Are genome-wide association studies all that we need to dissect the genetic component of complex human diseases? Eur J Hum Genet 15(3):260–263.

Brannath W, Bauer P (2004). Optimal conditional error functions for the control of conditional power. Biometrics 60(3):715–723.

Brannath W, König F, Bauer P (2006). Estimation in flexible two stage designs. Stat Med 25(19):3366–3381.

Brannath W, Posch M, Bauer P (2002). Recursive combination tests. JASA 97:236–244.

Bretz F, Schmidli H, König F, Racine A, Maurer W (2006). Confirmatory seamless phase II/III clinical trials with hypotheses selection at interim: general concepts. Biom J 48(4):623–634.

Brunner E, Munzel U (2002). Nichtparametrische Datenanalysen. Berlin: Springer.

Burman CF, Sonesson C (2006). Are Flexible Designs Sound? (with discussion). Biometrics 62:664–683.

Cardon LR, Bell JI (2001). Association study designs for complex diseases. Nat Rev Genet 2(2):91–99.

Cardon LR, Palmer LJ (2003). Population stratification and spurious allelic association. Lancet 361(9357):598–604.

Carlson CS, Eberle MA, Kruglyak L, Nickerson DA (2004). Mapping complex disease loci in whole-genome association studies. Nature 429(6990):446–452.

Chang M (2007). Adaptive design method based on sum of p-values. Stat Med 26(14):2772–2784.

Check E (2005). Human genome: patchwork people. Nature 437(7062):1084–1086.

Chiano MN, Clayton DG (1998). Genotypic relative risks under ordered restriction. Genet Epidemiol 15(2):135–146.

Clayton D (2001). Handbook of Statistical Genetics., pp. 519–540. New York: Wiley-Interscience.

Clayton D, Chapman J, Cooper J (2004). Use of unphased multilocus genotype data in indirect association studies. Genet Epidemiol 27(4):415–428.

Clopper C, Pearson E (1934). The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 26:404–413.

Cochran WG (1954). Some methods for strengthening the common chi-square tests. Biometrics 10:417–451.

Collins FS, Green ED, Guttmacher AE, Guyer MS, Institute USNHGR (2003). A vision for the future of genomics research. Nature 422(6934):835–847.

Conneely KN, Boehnke M (2007). So many correlated tests, so little time! Rapid adjustment of p-values for multiple correlated tests. Am J Hum Genet (in press).

Cooper DN, Nussbaum RL, Krawczak M (2002). Proposed guidelines for papers describing DNA polymorphism-disease associations. Hum Genet 110(3):207–208.

Cordell HJ, Clayton DG (2002). A unified stepwise regression procedure for evaluating the relative effects of polymorphisms within a gene using case/control or family data: application to HLA in type 1 diabetes. Am J Hum Genet 70(1):124–141.

Cordell HJ, Clayton DG (2005). Genetic association studies. Lancet 366(9491):1121–1131.

Cui L, Hung HM, Wang SJ (1999). Modification of sample size in group sequential clinical trials. Biometrics 55(3):853–857.

DeMets DL, Lan KK (1994). Interim analysis: the alpha spending function approach. Stat Med 13(13-14):1341–52; discussion 1353–6.

Dempfle A, Loesgen S (2004). Meta-analysis of linkage studies for complex diseases: an overview of methods and a simulation study. Ann Hum Genet 68(Pt 1):69–83.

Devlin B, Risch N (1995). A comparison of linkage disequilibrium measures for fine-scale mapping. Genomics 29(2):311–322.

Devlin B, Roeder K (1999). Genomic control for association studies. Biometrics 55(4):997–1004.

D’haeseleer P (2006). How does DNA sequence motif discovery work? Nat Biotechnol 24(8):959–961.

Dixon AL, Liang L, Moffatt MF, Chen W, Heath S, Wong KCC, Taylor J, et al (2007). A genome-wide association study of global gene expression. Nat Genet 39(10):1202–1207.

Dupuis J, O’Donnell CJ (2007). Interpreting results of large-scale genetic association studies: separating gold from fool’s gold. JAMA 297(5):529–531.

Dupuy A, Simon RM (2007). Critical review of published microarray studies for cancer outcome and guidelines on statistical analysis and reporting. J Natl Cancer Inst 99(2):147–157.

Eberle MA, Rieder MJ, Kruglyak L, Nickerson DA (2006). Allele frequency matching between SNPs reveals an excess of linkage disequilibrium in genic regions of the human genome. PLoS Genet 2(9):e142.

Elston RC (1998). Methods of linkage analysis–and the assumptions underlying them [see comment]. Am J Hum Genet 63(4):931–934.

English SB, Butte AJ (2007). Evaluation and Integration of 49 Genome-wide Experiments and the Prediction of Previously Unknown Obesity-related Genes. Bioinformatics 23:2910–2917.

Epstein MP, Allen AS, Satten GA (2007). A simple and improved correction for population stratification in case-control studies. Am J Hum Genet 80(5):921–930.

Evans DM, Marchini J, Morris AP, Cardon LR (2006). Two-stage two-locus models in genome-wide association. PLoS Genet 2(9):e157.

Ewens WJ, Spielman RS (2005). What is the significance of a significant TDT? Hum Hered 60(4):206–210.

Falk CT, Rubinstein P (1987). Haplotype relative risks: an easy reliable way to construct a proper control sample for risk calculations. Ann Hum Genet 51(Pt 3):227–233.

Fan JB, Chee MS, Gunderson KL (2006). Highly parallel genomic assays. Nat Rev Genet 7(8):632–644.

Fisher LD (1998). Self-designing clinical trials. Stat Med 17:1551–1562.

Fisher RA (1932). Statistical Methods for Research Workers. (4 ed.). London: Oliver and Boyd.

Franke D, Philippi A, Tores F, Hager J, Ziegler A, König IR (2005). On confidence intervals for genotype relative risks and attributable risks from case parent trio designs for candidate-gene studies. Hum Hered 60(2):81–88.

Frayling TM (2007). Genome-wide association studies provide new insights into type 2 diabetes aetiology. Nat Rev Genet 8(9):657–662.

Frayling TM, Timpson NJ, Weedon MN, Zeggini E, Freathy RM, Lindgren CM, Perry JRB, et al (2007). A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 316(5826):889–894.

Freimer NB, Sabatti C (2005). Guidelines for association studies in Human Molecular Genetics. Hum Mol Genet 14(17):2481–2483.

Freimer NB, Sabatti C (2007). Human genetics: variants in common diseases. Nature 445(7130):828–830.

Friede T, Kieser M (2006). Sample size recalculation in internal pilot study designs: a review. Biom J 48(4):537–555.

Garner C (2007). Upward bias in odds ratio estimates from genome-wide association studies. Genet Epidemiol 31(4):288–295.

Gauderman WJ (2002). Sample size requirements for association studies of gene-gene interaction. Am J Epidemiol 155(5):478–484.

Geller F, Reichwald K, Dempfle A, Illig T, Vollmert C, Herpertz S, Siffert W, et al (2004). Melanocortin-4 receptor gene variant I103 is negatively associated with obesity. Am J Hum Genet 74(3):572–581.

Goll A, Bauer P (2007). Two-stage designs applying methods differing in costs. Bioinformatics 23(12):1519–1526.

Gonzalez CA (2006). The European Prospective Investigation into Cancer and Nutrition (EPIC). Public Health Nutr 9(1A):124–126.

Göring HH, Terwilliger JD, Blangero J (2001). Large upward bias in estimation of locus-specific effects from genomewide scans. Am J Hum Genet 69(6):1357–1369.

Grant SFA, Thorleifsson G, Reynisdottir I, Benediktsson R, Manolescu A, Sainz J, Helgason A, et al (2006). Variant of transcription factor 7-like 2 (TCF7L2) gene confers risk of type 2 diabetes. Nat Genet 38(3):320–323.

Greenland S, Rothman KJ (1998). Modern Epidemiology, 2nd edition., pp. 47–64. Philadelphia: Lippincott Williams & Wilkins.

Guo SW (1997). Linkage disequilibrium measures for fine-scale mapping: a comparison. Hum Hered 47(6):301–314.

Hampe J, Franke A, Rosenstiel P, Till A, Teuber M, Huse K, Albrecht M, et al (2007). A genome-wide association scan of nonsynonymous SNPs identifies a susceptibility variant for Crohn disease in ATG16L1. Nat Genet 39(2):207–211.

Hartl DL, Clark AG (2007). Principles of Population Genetics. (4 ed.). Sunderland: Sinauer Associates.

Hartung J, Knapp G (2003). A new class of completely self-designing clinical trials. Biometrical Journal 45:3–19.

Hattersley AT, McCarthy MI (2005). What makes a good genetic association study? Lancet 366(9493):1315–1323.

Hedges LV, Olkin I (1985). Statistical Methods for Meta-Analysis. Orlando: Academic Press.

Helgason A, Pálsson S, Thorleifsson G, Grant SFA, Emilsson V, Gunnarsdottir S, Adeyemo A, et al (2007). Refining the impact of TCF7L2 gene variants on type 2 diabetes and adaptive evolution. Nat Genet 39(2):218–225.

Herbert A, Gerry NP, McQueen MB, Heid IM, Pfeufer A, Illig T, Wichmann HE, et al (2006). A common genetic variant is associated with adult and childhood obesity. Science 312(5771):279–283.

Hinds DA, Stuve LL, Nilsen GB, Halperin E, Eskin E, Ballinger DG, Frazer KA, et al (2005). Whole-genome patterns of common DNA variation in three human populations. Science 307(5712):1072–1079.

Hinney A, Nguyen TT, Scherag A, Friedel S, Brönner G, Müller TD, Grallert H, et al (2007). Genome wide association (GWA) study for early onset extreme obesity supports the role of fat mass and obesity associated gene (FTO) variants. PLoS ONE 2(12):e1361.

Hirschhorn JN, Altshuler D (2002). Once and again-issues surrounding replication in genetic association studies. J Clin Endocrinol Metab 87(10):4438–4441.

Hirschhorn JN, Daly MJ (2005). Genome-wide association studies for common diseases and complex traits. Nat Rev Genet 6(2):95–108.

Hirschhorn JN, Lohmueller K, Byrne E, Hirschhorn K (2002). A comprehensive review of genetic association studies. Genet Med 4(2):45–61.

Hodge SE (2001). Model-free vs. model-based linkage analysis: a false dichotomy? Am J Med Genet 105(1):62–64.

Hoggart CJ, Parra EJ, Shriver MD, Bonilla C, Kittles RA, Clayton DG, McKeigue PM (2003). Control of confounding of genetic associations in stratified populations. Am J Hum Genet 72(6):1492–1504.

Huang BE, Lin DY (2007). Efficient association mapping of quantitative trait loci with selective genotyping. Am J Hum Genet 80(3):567–576.

Hunter DJ (2005). Gene-environment interactions in human diseases. Nat Rev Genet 6(4):287–298.

International HapMap Consortium (2003). The International HapMap Project. Nature 426(6968):789–796.

International HapMap Consortium (2005). A haplotype map of the human genome. Nature 437(7063):1299–1320.

Ioannidis JP, Ntzani EE, Trikalinos TA, Contopoulos-Ioannidis DG (2001). Replication validity of genetic association studies. Nat Genet 29(3):306–309.

Ioannidis JPA (2005). Why most published research findings are false. PLoS Med 2(8):e124.

Ioannidis JPA (2007). Limitations are not properly acknowledged in the scientific literature. J Clin Epidemiol 60(4):324–329.

Ioannidis JPA, Bernstein J, Boffetta P, Danesh J, Dolan S, Hartge P, Hunter D, et al (2005). A network of investigator networks in human genome epidemiology. Am J Epidemiol 162(4):302–304.

Ioannidis JPA, Trikalinos TA, Khoury MJ (2006). Implications of small effect sizes of individual genetic variants on the design and interpretation of genetic association studies of complex diseases. Am J Epidemiol 164(7):609–614.

Ioannidis JPA, Trikalinos TA, Ntzani EE, Contopoulos-Ioannidis DG (2003). Genetic associations in large versus small studies: an empirical assessment. Lancet 361(9357):567–571.

Jahn-Eimermacher A, Hommel G (2007). Performance of adaptive sample size adjustment with respect to stopping criteria and time of interim analysis. Stat Med 26(7):1450–1461.

Janssens ACJW, Aulchenko YS, Elefante S, Borsboom GJJM, Steyerberg EW, van Duijn CM (2006a). Predictive testing for complex diseases using multiple genes: fact or fiction? Genet Med 8(7):395–400.

Janssens ACJW, Gwinn M, Khoury MJ, Subramonia-Iyer S (2006b). Does genetic testing really improve the prediction of future type 2 diabetes? PLoS Med 3(2):e114; author reply e127.

Janssens ACJW, Gwinn M, Valdez R, Narayan KMV, Khoury MJ (2006c). Predictive genetic testing for type 2 diabetes. BMJ 333(7567):509–510.

Jennison C, Turnbull BW (1999). Group Sequential Methods with Applications to Clinical Trials. Boca Raton: CRC Press Inc.

Jennison C, Turnbull BW (2003). Mid-course sample size modification in clinical trials based on the observed treatment effect. Stat Med 22(6):971–993.

Jennison C, Turnbull BW (2006). Adaptive and nonadaptive group sequential tests. Biometrika 93(1):1–21.

Jorde LB (2000). Linkage disequilibrium and the search for complex disease genes. Genome Res 10(10):1435–1444.

Jorgenson E, Witte JS (2006a). Coverage and power in genomewide association studies. Am J Hum Genet 78(5):884–888.

Jorgenson E, Witte JS (2006b). A gene-centric approach to genome-wide association studies. Nat Rev Genet 7(11):885–891.

Khaja R, Zhang J, MacDonald JR, He Y, Joseph-George AM, Wei J, Rafiq MA, et al (2006). Genome assembly comparison identifies structural variants in the human genome. Nat Genet 38(12):1413–1418.

Klein RJ, Zeiss C, Chew EY, Tsai JY, Sackler RS, Haynes C, Henning AK, et al (2005). Complement factor H polymorphism in age-related macular degeneration. Science 308(5720):385–389.

Knapp M, Strauch K (2004). Affected-sib-pair test for linkage based on constraints for identical-by-descent distributions corresponding to disease models with imprinting. Genet Epidemiol 26(4):273–285.

Knapp M, Wassmer G, Baur MP (1995). The relative efficiency of the Hardy-Weinberg equilibrium-likelihood and the conditional on parental genotype-likelihood methods for candidate-gene association studies. Am J Hum Genet 57(6):1476–1485.

Koch A (2006). Confirmatory clinical trials with an adaptive design. Biom J 48(4):574–585.

Köhler K, Bickeböller H (2006). Case-control association tests correcting for population stratification. Ann Hum Genet 70(Pt 1):98–115.

Kolonel LN, Altshuler D, Henderson BE (2004). The multiethnic cohort study: exploring genes, lifestyle and cancer risk. Nat Rev Cancer 4(7):519–527.

König IR, Schäfer H, Müller HH, Ziegler A (2001). Optimized group sequential study designs for tests of genetic linkage and association in complex diseases. Am J Hum Genet 69(3):590–600.

König IR, Schäfer H, Ziegler A, Müller HH (2003). Reducing sample sizes in genome scans: group sequential study designs with futility stops. Genet Epidemiol 25(4):339–349.

Laird NM, Lange C (2006). Family-based designs in the age of large-scale gene-association studies. Nat Rev Genet 7(5):385–394.

Lan KK, Zucker DM (1993). Sequential monitoring of clinical trials: the role of information and Brownian motion. Stat Med 12(8):753–765.

Lan KKG, DeMets DL (1983). Discrete sequential boundaries for clinical trials. Biometrika 70:659–663.

Lander ES, Schork NJ (1994). Genetic dissection of complex traits. Science 265(5181):2037–2048.

Lehmacher W, Wassmer G (1999). Adaptive sample size calculation in group sequential trials. Biometrics 55:131–135.

Lewontin RC, Kojima K (1960). The Evolutionary Dynamics of Complex Polymorphisms. Evolution 14(4):458–472.

Little J, Bradley L, Bray MS, Clyne M, Dorman J, Ellsworth DL, Hanson J, et al (2002). Reporting, appraising, and integrating data on genotype prevalence and gene-disease associations. Am J Epidemiol 156(4):300–310.

Lohmueller KE, Pearce CL, Pike M, Lander ES, Hirschhorn JN (2003). Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat Genet 33(2):177–182.

Lyon HN, Emilsson V, Hinney A, Heid IM, Lasky-Su J, Zhu X, Thorleifsson G, et al (2007). The Association of a SNP Upstream of INSIG2 with Body Mass Index is Reproduced in Several but Not All Cohorts. PLoS Genet 3(4):e61.

Marchini J, Donnelly P, Cardon LR (2005). Genome-wide strategies for detecting multiple loci that influence complex diseases. Nat Genet 37(4):413–417.

Marcus R, Peritz E, Gabriel KR (1976). On closed testing procedures with special reference to ordered analysis of variance. Biometrika 63(3):655–660.

Mathers CD, Loncar D (2006). Projections of global mortality and burden of disease from 2002 to 2030. PLoS Med 3(11):e442.

McPherson R, Pertsemlidis A, Kavaslar N, Stewart A, Roberts R, Cox DR, Hinds DA, et al (2007). A common allele on chromosome 9 associated with coronary heart disease. Science 316(5830):1488–1491.

Mehta CR, Patel NR (2006). Adaptive, group sequential and decision theoretic approaches to sample size determination. Stat Med 25(19):3250–69; discussion 3297–301, 3302–4, 3313–4, 3326–47.

Moonesinghe R, Khoury MJ, Janssens ACJW (2007). Most published research findings are false-but a little replication goes a long way. PLoS Med 4(2):e28.

Müller HH, Pahl R, Schäfer H (2007). Including sampling and phenotyping costs into the optimization of two stage designs for genome wide association studies. Genet Epidemiol 31:844–852.

Müller HH, Schäfer H (1999). Optimization of testing times and critical values in sequential equivalence testing. Stat Med 18(14):1769–88; discussion 1789.

Müller HH, Schäfer H (2001). Adaptive group sequential designs for clinical trials: combining the advantages of adaptive and of classical group sequential approaches. Biometrics 57(3):886–891.

Müller HH, Schäfer H (2004). A general statistical principle for changing a design any time during the course of a trial. Stat Med 23(16):2497–2508.

Nicodemus KK, Luna A, Shugart YY (2007). An evaluation of power and type I error of single-nucleotide polymorphism transmission/disequilibrium-based statistical methods under different family structures, missing parental data, and population stratification. Am J Hum Genet 80(1):178–185.

Nothnagel M (2004). The Definition of Multilocus Haplotype Blocks and Common Diseases (PhD thesis).

O’Brien PC, Fleming TR (1979). A multiple testing procedure for clinical trials. Biometrics 35(3):549–556.

Ollier W, Sprosen T, Peakman T (2005). UK Biobank: from concept to reality. Pharmacogenomics 6(6):639–646.

Ott J (1989). Statistical properties of the haplotype relative risk. Genet Epidemiol 6(1):127–130.

Ott J (1999). Analysis of Human Genetic Linkage. (3 ed.). Baltimore: Johns Hopkins University Press.

Ott J (2004). Association of genetic loci: Replication or not, that is the question. Neurology 63(6):955–958.

Ott J, Hoh J (2001). Statistical multilocus methods for disequilibrium analysis in complex traits. Hum Mutat 17(4):285–288.

Peng B, Kimmel M (2007). Simulations provide support for the common disease-common variant hypothesis. Genetics 175(2):763–776.

Pocock SJ (1977). Group sequential methods in the design and analysis of clinical trials. Biometrika 64:191–199.

Posch M, Bauer P, Brannath W (2003). Issues in designing flexible trials. Stat Med 22(6):953–969.

Pritchard JK (2001). Are rare variants responsible for susceptibility to complex diseases? Am J Hum Genet 69(1):124–137.

Pritchard JK, Cox NJ (2002). The allelic architecture of human disease genes: common disease-common variant...or not? Hum Mol Genet 11(20):2417–2423.

Pritchard JK, Przeworski M (2001). Linkage disequilibrium in humans: models and data. Am J Hum Genet 69(1):1–14.

Pritchard JK, Stephens M, Rosenberg NA, Donnelly P (2000). Association mapping in structured populations. Am J Hum Genet 67(1):170–181.

Proschan MA, Follmann DA, Waclawiw MA (1992). Effects of assumption violations on type I error rate in group sequential monitoring. Biometrics 48:1131–1143.

Proschan MA, Hunsberger SA (1995). Designed extension of studies based on conditional power. Biometrics 51(4):1315–1324.

Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, et al (2007). PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81(3):559–575.

R Development Core Team. (2007). R: A Language and Environment for Statistical Computing.

Risch N, Merikangas K (1996). The future of genetic studies of complex human diseases. Science 273(5281):1516–1517.

Risch NJ (2000). Searching for genetic determinants in the new millennium. Nature 405(6788):847–856.

Rodrigues L, Kirkwood BR (1990). Case-control designs in the study of common diseases: updates on the demise of the rare disease assumption and the choice of sampling scheme for controls. Int J Epidemiol 19(1):205–213.

Rothman KJ, Greenland S (2005). Handbook of Epidemiology., pp. 43–88. New York: Springer.

Sachidanandam R, Weissman D, Schmidt SC, Kakol JM, Stein LD, Marth G, Sherry S, et al (2001). A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 409(6822):928–933.

SAS Institute Inc. (2001). SAS/IML Software: Changes and Enhancements, Release 8.1.

Sasieni PD (1997). From genotypes to genes: doubling the sample size. Biometrics 53(4):1253–1261.

Satagopan JM, Elston RC (2003). Optimal two-stage genotyping in population-based association studies. Genet Epidemiol 25(2):149–157.

Satagopan JM, Venkatraman ES, Begg CB (2004). Two-stage designs for gene-disease association studies with sample size constraints. Biometrics 60(3):589–597.

Satagopan JM, Verbel DA, Venkatraman ES, Offit KE, Begg CB (2002). Two-stage designs for gene-disease association studies. Biometrics 58(1):163–170.

Satten GA, Flanders WD, Yang Q (2001). Accounting for unmeasured population substructure in case-control studies of genetic association using a novel latent-class model. Am J Hum Genet 68(2):466–477.

Saunders CL, Chiodini BD, Sham P, Lewis CM, Abkevich V, Adeyemo AA, de Andrade M, et al (2007). Meta-Analysis of Genome-wide Linkage Studies in BMI and Obesity. Obesity (Silver Spring) 15(9):2263–2275.

Schadt EE, Lamb J, Yang X, Zhu J, Edwards S, Guhathakurta D, Sieberts SK, et al (2005). An integrative genomics approach to infer causal associations between gene expression and disease. Nat Genet 37(7):710–717.

Schäfer H, Müller HH (2001). Modification of the sample size and the schedule of interim analyses in survival trials based on data inspections. Stat Med 20(24):3741–3751.

Schäfer H, Timmesfeld N, Müller HH (2006). An overview of statistical approaches for adaptive designs and design modifications. Biom J 48(4):507–520.

Schaid DJ (1998). Transmission disequilibrium, family controls, and great expectations. Am J Hum Genet 63(4):935–941.

Schaid DJ (1999). Likelihoods and TDT for the case-parents design. Genet Epidemiol 16(3):250–260.

Schaid DJ (2006). Power and sample size for testing associations of haplotypes with complex traits. Ann Hum Genet 70(Pt 1):116–130.

Schaid DJ, Sommer SS (1993). Genotype relative risks: methods for design and analysis of candidate-gene association studies. Am J Hum Genet 53(5):1114–1126.

Schaid DJ, Sommer SS (1994). Comparison of statistics for candidate-gene association studies using cases and parents. Am J Hum Genet 55(2):402–409.

Scherag A, Dempfle A, Hinney A, Hebebrand J, Schäfer H (2002). Confidence intervals for genotype relative risks and allele frequencies from the case parent trio design for candidate-gene studies. Hum Hered 54(4):210–217.

Scherag A, Hebebrand J, Schäfer H, Müller HH (2008). Flexible Designs for Genomewide Association Studies. Biometrics (re-submitted).

Scherag A, Müller HH, Dempfle A, Hebebrand J, Schäfer H (2003). Data adaptive interim modification of sample sizes for candidate-gene association studies. Hum Hered 56(1-3):56–62.

Scuteri A, Sanna S, Chen WM, Uda M, Albai G, Strait J, Najjar S, et al (2007). Genome-wide association scan shows genetic variants in the FTO gene are associated with obesity-related traits. PLoS Genet 3(7):e115.

Shen Y, Fisher L (1999). Statistical inference for self-designing clinical trials with a one-sided hypothesis. Biometrics 55(1):190–197.

Skol AD, Scott LJ, Abecasis GR, Boehnke M (2006). Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies. Nat Genet 38(2):209–213.

Skol AD, Scott LJ, Abecasis GR, Boehnke M (2007). Optimal designs for two-stage genome-wide association studies. Genet Epidemiol 31:776–788.

Sladek R, Rocheleau G, Rung J, Dina C, Shen L, Serre D, Boutin P, et al (2007). A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature 445(7130):881–885.

Spence MA, Greenberg DA, Hodge SE, Vieland VJ (2003). The emperor's new methods. Am J Hum Genet 72(5):1084–1087.

Spielman RS, Ewens WJ (1996). The TDT and other family-based tests for linkage disequilibrium and association. Am J Hum Genet 59(5):983–989.

Spielman RS, McGinnis RE, Ewens WJ (1993). Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am J Hum Genet 52(3):506–516.

Spielman RS, McGinnis RE, Ewens WJ (1994). The transmission/disequilibrium test detects cosegregation and linkage. Am J Hum Genet 54(3):559–60; author reply 560–3.

Steemers FJ, Gunderson KL (2007). Whole genome genotyping technologies on the BeadArray platform. Biotechnol J 2(1):41–49.

Storey JD, Tibshirani R (2003). Statistical significance for genomewide studies. Proc Natl Acad Sci U S A 100(16):9440–9445.

Stram DO (2004). Tag SNP selection for association studies. Genet Epidemiol 27(4):365–374.

Strauch K, Fimmers R, Baur MP, Wienker TF (2003a). How to model a complex trait. 1. General considerations and suggestions. Hum Hered 55(4):202–210.

Strauch K, Fimmers R, Baur MP, Wienker TF (2003b). How to model a complex trait. 2. Analysis with two disease loci. Hum Hered 56(4):200–211.

Syvänen AC (2005). Toward genome-wide SNP genotyping. Nat Genet 37 Suppl:S5–10.

Terwilliger JD, Hiekkalinna T (2006). An utter refutation of the "Fundamental Theorem of the HapMap". Eur J Hum Genet 14(4):426–437.

Terwilliger JD, Weiss KM (1998). Linkage disequilibrium mapping of complex disease: fantasy or reality? Curr Opin Biotechnol 9(6):578–594.

Thomas DC (2004). Statistical Methods In Genetic Epidemiology. New York: Oxford University Press Inc.

Thomas DC, Haile RW, Duggan D (2005). Recent developments in genomewide association scans: a workshop summary and review. Am J Hum Genet 77(3):337–345.

Timmesfeld N, Schäfer H, Müller HH (2007). Increasing the sample size during clinical trials with t-distributed test statistics without inflating the type I error rate. Stat Med 26(12):2449–2464.

Todd JA (2006). Statistical false positive or true disease pathway? Nat Genet 38(7):731–733.

Tsiatis AA, Mehta C (2003). On the inefficiency of the adaptive design for monitoring clinical trials. Biometrika 90(2):367–378.

van Houwelingen HC, Arends LR, Stijnen T (2002). Advanced methods in meta-analysis: multivariate approach and meta-regression. Stat Med 21(4):589–624.

Victor A, Hommel G (2007). Combining adaptive designs with control of the false discovery rate–a generalized definition for a global p-value. Biom J 49(1):94–106.

Wacholder S, Chanock S, Garcia-Closas M, Ghormli LE, Rothman N (2004). Assessing the probability that a positive report is false: an approach for molecular epidemiology studies. J Natl Cancer Inst 96(6):434–442.

Wacholder S, Rothman N, Caporaso N (2002). Counterpoint: bias from population stratification is not a major threat to the validity of conclusions from epidemiological studies of common polymorphisms and cancer. Cancer Epidemiol Biomarkers Prev 11(6):513–520.

Wakefield J (2007). A Bayesian Measure of the Probability of False Discovery in Genetic Epidemiology Studies. Am J Hum Genet 81(2):208–227.

Wang H, Thomas DC, Pe’er I, Stram DO (2006). Optimal two-stage genotyping designs for genome-wide association scans. Genet Epidemiol 30(4):356–368.

Wang K, Li M, Bucan M (2007). Pathway-based approaches for analysis of genome-wide association studies. Am J Hum Genet (in press).

Wang WYS, Barratt BJ, Clayton DG, Todd JA (2005). Genome-wide association studies: theoretical and practical concerns. Nat Rev Genet 6(2):109–118.

Wassmer G (1998). A comparison of two methods for adaptive interim analyses in clinical trials. Biometrics 54(2):696–705.

Wassmer G (1999). Multistage adaptive test procedures based on Fisher's product criterion. Biometrical Journal 41:279–293.

Wassmer G (2001). Statistische Testverfahren für gruppensequentielle und adaptive Pläne in klinischen Studien (2 ed.). München: Verlag Alexander Mönch.

Weeks DE, Lathrop GM (1995). Polygenic disease: methods for mapping complex disease traits. Trends Genet 11(12):513–519.

Weinberg CR, Umbach DM (2005). A hybrid design for studying genetic influences on risk of diseases with onset early in life. Am J Hum Genet 77(4):627–636.

Weinberg W (1909). Über Vererbungsgesetze beim Menschen. Molecular and General Genetics MGG 2(1):276–330.

Weiss KM, Terwilliger JD (2000). How many diseases does it take to map a gene with SNPs? Nat Genet 26(2):151–157.

Wellcome Trust Case Control Consortium (2007). Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 447(7145):661–678.

Wen SH, Tzeng JY, Kao JT, Hsiao CK (2006). A two-stage design for multiple testing in large-scale association studies. J Hum Genet 51(6):523–532.

Wermter AK, Reichwald K, Büch T, Geller F, Platzer C, Huse K, Hess C, et al (2005). Mutation analysis of the MCHR1 gene in human obesity. Eur J Endocrinol 152(6):851–862.

Whitehead J (1999). A unified theory for sequential clinical trials. Stat Med 18(17-18):2271–2286.

WHO (2006). Global Burden of Disease and Risk Factors. Geneva: WHO Press.

Witte JS, Elston RC, Cardon LR (2000). On the relative sample size required for multiple comparisons. Stat Med 19(3):369–372.

Yang Q, Khoury MJ, Friedman J, Little J, Flanders WD (2005). How many genes underlie the occurrence of common complex diseases in the population? Int J Epidemiol 34(5):1129–1137.

Young EH, Wareham NJ, Farooqi S, Hinney A, Hebebrand J, Scherag A, O'Rahilly S, et al (2007). The V103I polymorphism of the MC4R gene and obesity: population based studies and meta-analysis of 29 563 individuals. Int J Obes (Lond) 31(9):1437–1441.

Yu K, Chatterjee N, Wheeler W, Li Q, Wang S, Rothman N, Wacholder S (2007). Flexible design for following up positive findings. Am J Hum Genet 81(3):540–551.

Zaykin DV, Zhivotovsky LA (2005). Ranks of genuine associations in whole-genome scans. Genetics 171(2):813–823.

Zeggini E, McCarthy MI (2007). TCF7L2: the biggest story in diabetes genetics since HLA? Diabetologia 50(1):1–4.

Zehetmayer S, Bauer P, Posch M (2005). Two-stage designs for experiments with a large number of hypotheses. Bioinformatics 21(19):3771–3777.

Zheng G, Freidlin B, Li Z, Gastwirth JL (2003). Choice of scores in trend tests for case-control studies of candidate-gene associations. Biometrical Journal 45:335–348.

Zheng G, Gastwirth JL (2006). On estimation of the variance in Cochran-Armitage trend tests for genetic association using case-control studies. Stat Med 25(18):3150–3159.

Zöllner S, Pritchard JK (2007). Overcoming the winner's curse: estimating penetrance parameters from case-control data. Am J Hum Genet 80(4):605–615.

Zondervan KT, Cardon LR (2004). The complex interplay among factors that influence allelic association. Nat Rev Genet 5(2):89–100.

Personal details

André Scherag, Rellinghauserstr. 266, 45136 Essen

Tel.: (0201) 266 7405

E-mail: andre scherag@hotmail.de

Born 7 December 1974 in Bad Hersfeld; single; German

Professional experience

since 2007 Research associate at the Institut für Medizinische Informatik, Biometrie und Epidemiologie (Prof. Dr. Jöckel), Universität Duisburg-Essen

2002–2007 Research associate at the Institut für Medizinische Biometrie und Epidemiologie (Prof. Dr. Schäfer), Philipps-Universität Marburg

1998–2001 Student research assistant in the DFG project "Dynamik mentaler Repräsentationen" (Prof. Dr. Rösler), Philipps-Universität Marburg

University studies and further training

since 2006 Master's programme "Medical Biometry/Biostatistics", Ruprecht-Karls-Universität Heidelberg

2003–2007 Doctoral studies, supplementary programme "Humanbiologie", Philipps-Universität Marburg

2002–2007 Postgraduate training "Medizinische Biometrie", Ruprecht-Karls-Universität Heidelberg

(final grade: summa cum laude)

1996–2002 Studies in psychology, Philipps-Universität Marburg; diploma thesis "Gibt es räumlich getrennte semantische Repräsentationen von Nomen und Verben im Mentalen Lexikon?" (Prof. Rösler); Diplom in psychology (grade: very good)

(deregistration at own request)

Stay abroad

2001–2002 Research stay at the "Brain Development Lab" (Prof. Dr. Neville), University of Oregon, Eugene, USA

Schooling and alternative civilian service

1994–1995 Alternative civilian service in elderly care, Diakonisches Werk, Bad Hersfeld

1981–1994 Primary school, Gymnasium track, and upper secondary level (Allgemeine Hochschulreife, the general university entrance qualification), Bad Hersfeld

Brönner G, Hinney A, Reichwald K, Wermter AK, Scherag A, Friedel S, Hebebrand J (2006). Gene variants and obesity, pp. 266–299. Weinheim: Wiley-VCH.

Dempfle A, Wudy SA, Saar K, Hagemann S, Friedel S, Scherag A, Berthold LD, et al (2006). Evidence for involvement of the vitamin D receptor gene in idiopathic short stature via a genome-wide linkage study and subsequent association studies. Hum Mol Genet 15(18):2772–2783.

Eberhart LH, Frank S, Lange H, Morin AM, Scherag A, Wulf H, Kranke P (2006). Systematic review on the recurrence of postoperative nausea and vomiting after a first episode in the recovery room - implications for the treatment of PONV and related clinical trials. BMC Anesthesiol 6:14.

Friedel S, Reichwald K, Scherag A, Brumm H, Wermter AK, Fries HR, Koberwitz K, et al (2007). Mutation screen and association studies in the Diacylglycerol O-acyltransferase homolog 2 gene (DGAT2), a positional candidate gene for early onset obesity on chromosome 11q13. BMC Genet 8:17.

Hinney A, Bettecken T, Tarnow P, Brumm H, Reichwald K, Lichtner P, Scherag A, et al (2006). Prevalence, spectrum, and functional characterization of melanocortin-4 receptor gene mutations in a representative population-based sample and obese adults from Germany. J Clin Endocrinol Metab 91(5):1761–1769.

Hinney A, Nguyen TT, Scherag A, Friedel S, Brönner G, Müller TD, Grallert H, et al (2007). Genome wide association (GWA) study for early onset extreme obesity supports the role of fat mass and obesity associated gene (FTO) variants. PLoS ONE 2(12):e1361.

Hölter K, Wermter AK, Scherag A, Siegfried W, Goldschmidt H, Hebebrand J, Hinney A (2007). Analysis of sequence variations in the suppressor of cytokine signaling (SOCS)-3 gene in extremely obese children and adolescents. BMC Med Genet 8:21.

Khader P, Scherag A, Streb J, Rösler F (2003). Differences between noun and verb processing in a minimal phrase context: a semantic priming study using event-related brain potentials. Brain Res Cogn Brain Res 17(2):293–313.

Lyon HN, Emilsson V, Hinney A, Heid IM, Lasky-Su J, Zhu X, Thorleifsson G, et al (2007). The association of a SNP upstream of INSIG2 with body mass index is reproduced in several but not all cohorts. PLoS Genet 3(4):e61.