

In the document Genetics of Restless Legs Syndrome (pages 119-123)

5 Discussions

5.2 Targeted Sequencing of RLS Candidate Genes Using MIPseq

5.2.6 Association Analysis

The SKAT-O test [93] was applied to the SNVs called from the MIPseq data of the cases and controls in order to identify genes associated with the RLS status. This test was shown to achieve high power over a broad range of scenarios [93]. Because of the small sample size, a resampling-based correction had to be applied [347], which precluded the use of a generalized linear mixed model [74]. Principal components were therefore used to correct for population stratification, but no correction for cryptic relatedness was applied [327]. However, highly related individuals were removed from the dataset, and the impact of cryptic relatedness might have been low [450] because the case-control cohort was likely an outbred population [355, 451].
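The idea of a kernel-based gene test with a resampling p value can be sketched as follows. This is a minimal illustration and not the procedure used in this work: it computes only the SKAT-type variance-component statistic (not the optimal SKAT-O combination with a burden test), it assumes an intercept-only null model without principal components, and it uses plain phenotype permutation, which is not necessarily the resampling scheme of [347]. The function names, the toy genotypes, and the weights are all hypothetical.

```python
import random

def skat_statistic(genotypes, phenotypes, weights):
    """SKAT-type statistic: weighted sum of squared per-variant scores.

    genotypes: one row per individual, one allele count per variant.
    phenotypes: 1 for cases, 0 for controls.
    Assumption: intercept-only null model (no covariates).
    """
    mean_y = sum(phenotypes) / len(phenotypes)
    residuals = [y - mean_y for y in phenotypes]
    q = 0.0
    for j, w in enumerate(weights):
        score_j = sum(row[j] * r for row, r in zip(genotypes, residuals))
        q += (w * score_j) ** 2
    return q

def permutation_p_value(genotypes, phenotypes, weights, n_perm=999, seed=1):
    """Empirical p value obtained by permuting the case-control labels."""
    rng = random.Random(seed)
    observed = skat_statistic(genotypes, phenotypes, weights)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if skat_statistic(genotypes, shuffled, weights) >= observed:
            hits += 1
    # The +1 terms count the observed data as one of the permutations.
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy gene: 4 cases, 4 controls, 2 rare variants seen only in cases.
toy_genotypes = [[1, 0], [1, 0], [0, 1], [0, 0],
                 [0, 0], [0, 0], [0, 0], [0, 0]]
toy_phenotypes = [1, 1, 1, 1, 0, 0, 0, 0]
toy_weights = [1.0, 1.0]

toy_q = skat_statistic(toy_genotypes, toy_phenotypes, toy_weights)
toy_p = permutation_p_value(toy_genotypes, toy_phenotypes, toy_weights)
```

Permuting the labels breaks any link between phenotype and covariates, so a scheme of this kind cannot by itself adjust for population stratification; this is one reason why dedicated resampling procedures are used in practice.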

The SKAT tests are, however, more susceptible to population stratification [452], and principal components or mixed models might fail to correct for the stratification if the disease risk itself is stratified [453]. Such stratification might nevertheless be low in the German population [25].

In the SKAT-O tests [93], differential missingness was corrected for [12, 74]. Note that inflated test statistics are expected even after a successful correction for population substructure in traits with polygenic inheritance [436, 437], which could be the case for RLS [214]. Further studies using LD-score regression analysis [437] are therefore needed to separate confounding from polygenicity.
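Inflation of test statistics is commonly summarized by the genomic inflation factor, i.e. the median observed chi-square statistic divided by its expected median under the null hypothesis. The sketch below is a simpler diagnostic than LD-score regression, which additionally needs per-variant LD scores and is not shown here. It assumes that the p values stem from 1-degree-of-freedom tests; the function name is hypothetical.

```python
from statistics import NormalDist, median

def genomic_inflation_factor(p_values):
    """Median chi-square (1 df) statistic divided by its null expectation.

    Assumption: every p value lies in (0, 1] and comes from a 1-df test.
    """
    norm = NormalDist()
    # A two-sided p value p corresponds to the 1-df statistic z**2 with
    # z = inv_cdf(p / 2); the sign of z is irrelevant after squaring.
    chi2_stats = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values]
    expected_median = norm.inv_cdf(0.75) ** 2  # median of chi-square, 1 df
    return median(chi2_stats) / expected_median

# Evenly spaced p values mimic a well-calibrated null distribution.
uniform_p = [(i + 0.5) / 1001 for i in range(1001)]
lambda_null = genomic_inflation_factor(uniform_p)
```

A value close to 1 indicates no inflation. Values above 1 can reflect confounding, polygenicity, or both, which is why the factor alone cannot separate the two.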

Besides PTPRD, MYT1 (myelin transcription factor 1, also KIAA0835, KIAA1050, MTF1, MYTI or PLPB1 [2]) was significantly associated with RLS. MYT1 is widely expressed in developing vertebrate neuronal tissues [454] and is involved in the proliferation and differentiation of oligodendrocytes [455]. MYT1 interacts with STEAP3 [456], which was a candidate gene for RLS association in the analysis of the ExomeChip data and is involved in iron homeostasis [457]. Of note, COL20A1 (collagen type XX alpha 1 chain, also KIAA1510 [2]) was significant after correcting for testing 61 genes. The gene is transcribed in the brain [458] and lies in close proximity to MYT1. The genes DMPK (dystrophia myotonica protein kinase [2]), RASGRP4 (RAS guanyl releasing protein 4 [2]) and AAGAB (alpha- and gamma-adaptin binding protein [2]) were also significantly associated with RLS after correcting for multiple testing of 61 genes. DMPK has been related to the pathophysiology of myotonic dystrophy [459], RASGRP4 functions in mast cell development [460], and AAGAB is mainly known for its involvement in punctate palmoplantar keratoderma [461, 462]. However, the samples of the MIPseq experiment and the ExomeChip experiment partially overlapped and were not independent. Thus a Bonferroni correction for 61 tests might not be stringent enough to avoid false positive associations. Further functional studies are needed to assess the function of these genes in the pathophysiology of RLS and to validate the variants that were observed in this study.

In the single-variant association test, one SNP reached genome-wide significance and another one came close to it. The variants were located in MYT1 (3’ UTR) and OLFML2B (Asp60/61Gly) and were unlikely to be detrimental. Of note, the rare allele was unique to cases.

Further analysis is needed to validate these SNPs and to exclude the possibility of sequencing artefacts, as the mapping qualities of the respective reads were near 10. Two other SNPs were significant after correcting for multiple testing of 26,192 score tests. For both SNPs, the common allele was the risk allele, and they were located in introns of ATP2C1 and CADM1. They might be regulatory variants of high impact. These SNPs have to be validated as well.
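For orientation, the per-test thresholds implied by a Bonferroni correction for the 61 gene tests and the 26,192 score tests can be computed as below. The family-wise error rate of 0.05 is an assumption made for this illustration; it is not stated in this section, and the function name is hypothetical.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the family-wise error at alpha.

    Assumption: alpha = 0.05 unless stated otherwise.
    """
    return alpha / n_tests

gene_level_threshold = bonferroni_threshold(61)         # about 8.2e-4
variant_level_threshold = bonferroni_threshold(26192)   # about 1.9e-6
```

Bonferroni controls the family-wise error rate for the tests that are counted, but it does not account for the non-independence of overlapping sample sets discussed above.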

The findings from the MIPseq data have to be replicated in a larger sample set. Furthermore, the correction for population substructure might have been less effective than expected, so an additional replication based on family data might be performed.

5.3 Explaining RLS Families with RLS Risk SNPs

This work addressed not only the rare variant-common disease hypothesis for RLS but also the common variant-common disease hypothesis: the question was whether the published RLS-associated risk loci might explain families enriched with RLS cases. Therefore, 79 families were genotyped using the Affymetrix Axiom technology, and the RLS risk SNPs were imputed where genotype calls were missing. After the QC, 78 families remained (see appendix, Table 34, p. 315).

The lead SNPs from the latest published RLS GWAS [245] were used for the analysis, together with rs113851554, which was shown to be associated with RLS [70]. Little is known about the causality of these SNPs (e.g. see [32]), and thus these variants might only tag the real causal variants. A slightly different set of SNPs could therefore have been chosen to address the general question, e.g. rs11693221 instead of rs113851554 [32]. However, the overall message should not change.

The variants rs113851554 and rs2300478 of the MEIS1 locus were associated with the RLS phenotype in the cohort of the combined families. For the rare variant rs113851554, the p values differed between the Wald test and the score test (Table 27, p. 80). This was expected: the Wald test yields deflated test statistics (i.e. overly large p values) when rarer variants are analyzed [342, 463], whereas the score test is inflated [464]. Thus the Wald test is conservative in single rare variant association tests. The typical GWAS significance threshold might not need to be applied here, because the different SNPs might be more correlated throughout the pedigrees than in a GWAS, so that a GWAS significance threshold might be too conservative.
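How the two tests can diverge for a rare variant is illustrated below for the simplest possible case: a logistic model with an intercept and a single binary carrier indicator in unrelated individuals, for which both statistics have closed forms. This is only a sketch under these assumptions. It does not reproduce the family-based models of this work, the counts are invented, and the function names are hypothetical.

```python
import math

def wald_and_score(case_carriers, case_noncarriers,
                   control_carriers, control_noncarriers):
    """Wald and score chi-square statistics (1 df) for a 2x2 carrier table.

    Assumption: all four cells are non-zero, otherwise the Wald
    statistic is undefined because the log odds ratio diverges.
    """
    a, b = case_carriers, case_noncarriers
    c, d = control_carriers, control_noncarriers
    n = a + b + c + d

    # Wald test: maximum-likelihood log odds ratio over its standard error.
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    wald = (log_or / se) ** 2

    # Score test: evaluated under the null model (intercept only).
    y_bar = (a + b) / n            # proportion of cases
    x_bar = (a + c) / n            # proportion of carriers
    u = a * (1 - y_bar) - c * y_bar
    v = y_bar * (1 - y_bar) * ((a + c) * (1 - x_bar) ** 2 + (b + d) * x_bar ** 2)
    score = u * u / v
    return wald, score

def chi2_1df_p_value(statistic):
    """Upper-tail p value of a chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(statistic / 2.0))

# Invented counts: 500 cases with 6 carriers, 500 controls with 1 carrier.
wald_stat, score_stat = wald_and_score(6, 494, 1, 499)
wald_p = chi2_1df_p_value(wald_stat)
score_p = chi2_1df_p_value(score_stat)
```

With these counts the Wald statistic is smaller than the score statistic, so the Wald p value is the larger of the two. The standard error of the log odds ratio is dominated by the sparsest cell, which is why the discrepancy grows as the variant becomes rarer.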

For each individual family, the correlation between the RLS phenotypes and the dosages of RLS-associated risk alleles [70, 245] or their burden (independent SNPs only) was examined. Eight families showed at least one significant alternative model that included an RLS risk SNP as a predictor of the disease status. These families were enriched for Finnish families. The Finnish population is a founder population [465, 466], and a common genetic background might exist that increases the susceptibility to RLS. This background (or a common environmental component in Finland) might trigger RLS even when only one RLS risk allele is present. As this background might be quite common, the inheritance of a single SNP allele might already lead to inherited RLS. It might be interesting to examine the interaction of the linkage loci and the (burden of) disease risk SNPs in families that were subject to a linkage analysis in the past.

The common risk allele burden significantly explained the familial phenotypic variance in 10 families, and two observations were made. First, only three families also showed significant associations of single SNP dosages with the disease status. Thus for some families, some RLS risk SNPs might have been less relevant than others, and adding less relevant SNPs to a burden calculation might have added noise to the model and reduced the significance. The relevance of a SNP might be correlated with its effect size in GWAS, and using these effect sizes as a weighting scheme in the burden analysis might help to improve the burden model. However, the relevance of a SNP might also depend on the environmental or genetic background of a family. Second, the familial pseudo-R² depended on the size of the families: the larger the family, the lower the pseudo-R² value, i.e. the less phenotypic variance could be explained by the burden of risk alleles. The identification of new risk variants might then add to the explainable proportion of phenotypic variance in the larger families.
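The difference between an unweighted risk allele burden and the effect-size-weighted burden suggested above can be sketched as follows. The SNP identifiers, dosages, and weights are placeholders invented for this illustration; in practice the weights could be, for example, log odds ratios taken from a GWAS, which is an assumption and not part of this analysis. The function name is hypothetical.

```python
def risk_allele_burden(dosages, effect_sizes=None):
    """Burden of risk alleles for one individual.

    dosages: mapping SNP id -> risk allele dosage (0 to 2, may be imputed).
    effect_sizes: optional mapping SNP id -> weight (assumed here to be a
    GWAS log odds ratio). Without weights every risk allele counts equally.
    """
    if effect_sizes is None:
        return float(sum(dosages.values()))
    return sum(effect_sizes[snp] * dose for snp, dose in dosages.items())

# Placeholder data for a single individual and three independent SNPs.
example_dosages = {"snp_a": 2.0, "snp_b": 1.0, "snp_c": 0.0}
example_weights = {"snp_a": 0.5, "snp_b": 0.2, "snp_c": 0.1}

unweighted_burden = risk_allele_burden(example_dosages)
weighted_burden = risk_allele_burden(example_dosages, example_weights)
```

In the weighted version, SNPs with small effect sizes contribute little to the score, which is how such a scheme might reduce the noise introduced by less relevant SNPs.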
