5. Chapter III - Regulatory Divergence in the Drosophila melanogaster subgroup

5.3.2. cis-regulatory divergence is due to changes in chromatin accessibility and sequence divergence

We further tested which mechanisms contribute to cis-regulatory divergence in our data. In theory, two mechanisms can underlie cis-regulatory changes that lead to gene expression divergence: mutations directly in the regulatory regions, or divergent accessibility of these regions.

Orthologous regulatory sequences might have experienced changes in their nucleotide sequence, which could, amongst other things, affect TF binding (Wittkopp, 2013). Even if the regulatory regions of a gene are characterized, studying the influence of sequence changes on gene expression is not straightforward. In some reported cases, a single nucleotide change is enough to alter the temporal expression of an important master regulator (Ramaekers et al., 2018), whereas other enhancer sequences retain their function despite extensive reshuffling of TF binding sites (Khoueiry et al., 2017; Ludwig et al., 2000). However, some mechanistic insights have been gained in recent years that may help in interpreting the obtained data. For instance, it was shown in Drosophila that quantitative changes in enhancer strength between species correlate linearly with sequence divergence (Arnold et al., 2014) and that sequence changes in regulatory regions may lead to differential functionality due to loss of transcription factor or co-factor binding (e.g. Paris et al., 2013; Schmidt et al., 2010; Zheng et al., 2010). However, how deleterious the loss of a certain TF binding motif is seems to depend on the combinatorial binding of a TF collective (Khoueiry et al., 2017). In our genome-wide comparison, orthologous sequence divergence is higher in open chromatin regions close to genes with cis-regulatory divergence between species or compensatory changes in hybrids. A higher rate of polymorphisms in promoter regions of cis-effect genes (compared to trans-effect genes) has been shown, for instance, in plants (Zhang and Borevitz, 2009), but also in Drosophila (McManus et al., 2010). These studies and our results suggest stronger purifying selection in regulatory regions of highly connected developmental genes, which we found to be more often either differentially expressed due to upstream trans-effects or conserved between the species.

Studies in Arabidopsis thaliana suggest that not only open chromatin regions with nucleotide changes, but also differentially accessible DNase I hypersensitive sites (DHSs) are often found close to genes that show differential expression between ecotypes (Alexandre et al., 2018). Therefore, differential accessibility of regulatory regions very likely adds to expression variation in cis. Here, we indeed found that genes with highly divergent DNA accessibility are significantly more often differentially expressed due to cis-regulatory changes. Chromatin remodelling and differential enhancer opening are prevalent during development (e.g. Bozek et al., 2019; Hughes et al., 2017; Kvon et al., 2014; McKay and Lieb, 2013; Uyehara et al., 2017), and in recent years an in-depth understanding has emerged of how 3D chromatin organization, epigenetic and histone modifications, and chromatin accessibility interact (e.g. Corrales et al., 2017; Cubeñas-Potts et al., 2017; Rennie et al., 2018b; Sexton et al., 2012). How these interactions affect divergent DNA accessibility among species, however, is still largely unclear.

We further examined sequence divergence and accessibility of promoters and intronic regulatory regions separately. Regulatory sequences annotated to TSS and promoter regions showed higher sequence divergence, whereas intronic regulatory sequences appeared to be more constrained. Intronic peaks were more often differentially accessible in both of our comparisons, suggesting that the accessibility of TSS/promoter peaks is in general more conserved, whereas the accessibility of regulatory regions in introns is more species-specific. We therefore observed a trend in which changes in DNA accessibility more often affect intronic regions, although their sequences appear to remain more conserved. Apart from the possibility that intronic sequences are more conserved because of their location within gene loci, higher sequence conservation has indeed been observed in long introns, which are thought to harbour more functional elements (Haddrill et al., 2005). It will be important to compare these results with the sequence divergence of more distant intergenic regulatory regions.

5.3.3. Compensation and conservation of gene expression

It has been suggested that gene expression is largely under stabilizing selection (e.g. Landry et al., 2005; Lemos et al., 2005), i.e. that a certain level of gene expression has to be kept stable. The rationale is that, even though mutations in regulatory sequences accumulate over time, trans-regulatory factors co-evolve to buffer these changes (Landry et al., 2005). In our analysis, we found a high number of compensatory effects, characterized by allelic misexpression in the F1 hybrid generation. Interestingly, regulatory regions of these genes show a sequence divergence similar to that of genes affected by cis-regulatory changes, suggesting that upstream trans-regulatory factors have indeed co-evolved to maintain the expression levels in the parental species. We found compensatory regulation in all three gene sets, but predominantly in genes that show no divergence in peak accessibility. The main mode of cis-regulatory change in these genes might therefore be attributed to nucleotide changes. Nevertheless, compensatory changes are also found in genes with diverged accessibility of regulatory regions.
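The logic by which compensatory genes are separated from cis- and trans-affected genes can be made explicit with a short sketch. It follows the commonly used scheme for allele-specific expression data, in which expression is compared between the parental species and between the two alleles within the F1 hybrid. The exact decision rule and thresholds used for our data are not restated here, so the three boolean test outcomes, the class name `GeneTests` and the function name `classify` are hypothetical and serve illustration only.

```python
from typing import NamedTuple


class GeneTests(NamedTuple):
    """Outcomes of three significance tests for one gene (hypothetical inputs)."""
    parents_differ: bool  # expression differs between the two parental species
    alleles_differ: bool  # the two alleles differ in expression within the F1 hybrid
    ratios_differ: bool   # parental ratio differs from the hybrid allelic ratio


def classify(t: GeneTests) -> str:
    """Assign a regulatory category from the three test outcomes.

    Assumption: a simplified version of the usual scheme; genes with both
    cis and trans effects are not split further by the direction of the effects.
    """
    if not (t.parents_differ or t.alleles_differ or t.ratios_differ):
        return "conserved"
    if t.parents_differ and t.alleles_differ and not t.ratios_differ:
        return "cis"
    if t.parents_differ and not t.alleles_differ and t.ratios_differ:
        return "trans"
    if not t.parents_differ and t.alleles_differ and t.ratios_differ:
        # Equal parental expression but unequal alleles in the hybrid:
        # a cis change that is buffered by a trans change in the parents.
        return "compensatory"
    if t.parents_differ and t.alleles_differ and t.ratios_differ:
        return "cis and trans"
    return "ambiguous"
```

For example, `classify(GeneTests(parents_differ=False, alleles_differ=True, ratios_differ=True))` yields the compensatory category: the parental species express the gene at the same level, whereas the alleles are misexpressed relative to each other once they share the hybrid trans-environment.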

One characteristic of enhancers is that they usually work in a highly modular manner (reviewed for example in Arnone and Davidson, 1997; Wray, 2003). For Drosophila, it was for instance estimated that each expressed gene is controlled by an average of four distinct enhancers (Kvon et al., 2014). This modularity also allows gene expression to be controlled in a spatially and temporally specific manner (reviewed for instance in Prud’homme et al., 2007). This has been elegantly shown for simpler traits such as pigmentation patterns, in which the deletion of a ‘spot enhancer’ or an ‘abdomen enhancer’ leads to loss of wing pigmentation or loss of dark abdominal coloration in Drosophila (Jeong et al., 2006; Prud’homme et al., 2006). Our dataset provides the opportunity to analyse in more detail how much of the compensatory co-evolution is driven by differential combinatorial usage of such enhancer modules. Since DNA accessibility is highly dependent on developmental stage and tissue, one can assume that this kind of compensatory regulation is in general highly context-dependent, which calls for a more thorough comparison with other developing tissues, such as the wing disc.

We found a high number of genes that show conserved expression in the parental species as well as in their F1 hybrids. These conserved genes were highly enriched in general developmental functions, like growth, proliferation or morphogenesis, which is consistent with our finding that most developmental TFs are conserved in expression between the species.

Regulatory sequences of conserved genes were as constrained in terms of sequence divergence as those of genes that showed trans-regulatory divergence. Nevertheless, in cases that show high sequence divergence, conservation of TF binding could be attributed, for example, to the topology but also to the function of the GRN. It has been suggested that upstream genes in highly connected GRNs show a more conserved TF occupancy (Khoueiry et al., 2017) and therefore have a higher chance of balancing sequence changes in their regulatory regions. For conserved genes that show a high divergence in peak accessibility between the species, the modularity of enhancer elements, as discussed for compensatory changes, might ensure the correct level of gene expression. In contrast to genes that show compensatory changes, however, these mechanisms would not lead to misexpression in the hybrids and might therefore be less dependent on the co-evolution of upstream trans-regulators.

Overall, the high number of compensatory and conserved genes that do show changes in DNA accessibility or in enhancer and promoter sequence reflects the high potential of compensatory mechanisms that ensure the correct level of expression despite substantial cis-regulatory changes (Ludwig et al., 2000). In this study, we mainly concentrated on changes in regulatory regions and upstream transcription factors. Given the highly complex regulation of gene expression (reviewed in Buchberger et al., 2019), it remains to be studied how gene expression control at other levels, for instance by miRNAs, contributes to such compensatory mechanisms.