3.1 Positional effects of G-quadruplexes on E. coli gene expression

3.1.1 G-quadruplexes in untranslated regions

3.1.1.6 Naturally occurring quadruplexes in the SD region in E. coli

In two different artificial systems, pronounced modulation of translation via G-quadruplexes has been shown: (1) repression of gene expression by up to 96% compared to the wildtype vector by masking the SD region (161) and (2) more than 100% activation of gene expression compared to a control by liberation of the SD site. The observation of drastic quadruplex-mediated effects in 5’-UTRs raises the question of whether quadruplexes in natural genetic contexts exert similar control over gene expression. Hence, the occurrence of potential G-quadruplexes surrounding the SD region in genomic sequence data of E. coli MG1655 was investigated. Using the ProQuad Pattern Algorithm (104), we searched for G-quadruplexes with 2-5 tetrads and loops of 1-5 nucleotides that overlap with the anticipated SD sequences located approximately 10-12 nt upstream of the start codon. In total, 46 potential quadruplexes in the vicinity of SD sequences were identified (see Table 13.1 in the appendices). Gene functions were categorized using the KEGG database (287,288). The sequences were widely distributed across all kinds of genes, with most quadruplexes found in the fundamental functional categories metabolic pathways, microbial metabolism in diverse environments and biosynthesis of secondary metabolites. Importantly, all identified quadruplex sequences are anticipated to form structures with two tetrads and do not show a conserved sequence pattern. Most of them also occur in other E. coli strains, e.g. Escherichia coli CFT073 (ECC) and Escherichia coli O157:H7 str. Sakai (ECS).
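
The search criteria described above can be illustrated with a short sketch. It is not the ProQuad Pattern Algorithm itself but a plain regular-expression approximation of the stated pattern (four G-tracts of equal length separated by loops of 1-5 nt). The function names, the toy sequence and the treatment of the SD site as the positions 10-12 nt upstream of the start codon are assumptions made for illustration only.

```python
import re

def find_pqs(seq, tetrads=2, min_loop=1, max_loop=5):
    """Return (start, end, motif) for putative quadruplexes with `tetrads` Gs per tract.

    `end` is exclusive. The shortest candidate is reported per start position.
    """
    g_tract = "G{%d}" % tetrads
    loop = "[ACGT]{%d,%d}?" % (min_loop, max_loop)
    # A lookahead is used so that overlapping candidates are not skipped.
    pattern = re.compile("(?=(%s(?:%s%s){3}))" % (g_tract, loop, g_tract))
    return [(m.start(1), m.end(1), m.group(1)) for m in pattern.finditer(seq.upper())]

def overlaps_sd(start, end, start_codon, upstream=(10, 12)):
    """True if the motif [start, end) touches the assumed SD positions.

    Assumption: the SD site is represented by the positions 10-12 nt
    upstream of the first nucleotide of the start codon.
    """
    sd_lo = start_codon - max(upstream)
    sd_hi = start_codon - min(upstream)  # inclusive
    return start <= sd_hi and end > sd_lo

# Hypothetical toy sequence (not taken from the E. coli genome); ATG starts at index 16.
toy = "TTGGAGGTGGAGGAACATG"
hits = find_pqs(toy, tetrads=2)
sd_hits = [h for h in hits if overlaps_sd(h[0], h[1], start_codon=toy.index("ATG"))]
```

Running `find_pqs` once for each tetrad number from 2 to 5 would cover the full range of motifs considered in the search.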

Although G-quadruplexes comprising only two tetrads are less stable than those with more tetrads, such structures could function as regulatory elements as well. Recently, Chowdhury and co-workers showed that a quadruplex motif formed by two tetrads is involved in regulating the expression of human thymidine kinase (289). Wieland et al. also observed pronounced inhibition of gene expression by artificially designed two-tetrad quadruplexes masking the SD region (161). To investigate some of the naturally occurring G-quadruplex sequences in more detail, we studied their influence on gene expression. For this purpose, we placed the quadruplexes, including the whole natural 5’-UTR, in front of a β-galactosidase reporter gene under control of the araBAD promoter. We randomly chose
five different genes with putative G-quadruplex sequences overlapping the ribosome binding site from our set of sequences identified in the E. coli genome: oxyR, relA, rseA, napH and yadI (see Figure 3.9). The G-quadruplex sequences in front of those genes differ in loop length and distance from the start codon (sequences listed in Figure 3.9 and Table 13.1 in the appendices). One sequence (relA) even included the start codon within the putative G-quadruplex sequence. For each construct we designed two controls that should not be able to form a G-quadruplex (see Figure 3.9). Mutating guanines outside the core SD region should not alter the efficiency of 16S rRNA interactions with the mRNA, but should instead reduce quadruplex-based secondary structure formation. However, in this region, which is crucial for initiating translation in bacteria, it is very likely that even small sequence changes influence gene expression (290). Some of the mutants showed very high gene expression, which might be explained by sequence changes that facilitate ribosomal interactions. In particular, yadIm1 showed unexpectedly high gene expression although it contains only two G-to-U mutations; A/U-rich sequences upstream of the SD site have, however, been reported to serve as mRNA-stabilizing elements (291). Unfortunately, the two controls for the yadI construct behaved very differently in the gene expression studies and thus did not allow conclusions to be drawn with regard to G-quadruplex formation (see Figure 3.9 B). For three other constructs (relA, oxyR, napH), an effect on gene expression was observed that seems to be related to secondary structure formation. For the sequences upstream of the E. coli oxyR (see Figure 3.9 D) and relA (see Figure 3.9 B) genes, gene expression increased significantly in both mutants: by 87.1% for oxyRm1 and 92.0% for oxyRm2, and by 59.2% for relAm1 and 85.9% for relAm2.
Regarding the napH construct, we observed a significant decrease of gene expression for both controls (more than 100%). However, gene expression of napHm1 decreased sevenfold compared to napHm2. As the two mutants differ considerably, it is difficult to attribute this effect to secondary structure formation. In addition, the quadruplex-stabilizing compound NMM did not change gene expression levels significantly (see Chapter 3.1.1.8). For the G-rich sequence of the rseA 5’-UTR, β-galactosidase expression was increased compared to both mutants (see Figure 3.9 B), but this effect was not significant according to the unpaired t-test.
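
As an illustration of the significance test mentioned above, the following sketch computes an unpaired t statistic for two groups of three replicates. The text does not state whether equal variances were assumed; the pooled-variance (Student) form used here is an assumption, and all values are hypothetical rather than measured data.

```python
import math
import statistics

def unpaired_t(a, b):
    """Student's unpaired t statistic and degrees of freedom (equal variances assumed)."""
    na, nb = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    # Pooled sample variance of both groups.
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, na + nb - 2

# Hypothetical reporter activities (arbitrary units), three replicates each.
construct = [1.0, 2.0, 3.0]
mutant = [2.0, 3.0, 4.0]
t, df = unpaired_t(construct, mutant)

# Two-sided critical value of the t distribution for df = 4 at alpha = 0.05.
T_CRIT_DF4 = 2.776
significant = abs(t) > T_CRIT_DF4
```

With only three replicates per group (df = 4), a difference in means has to be large relative to the scatter before it exceeds the critical value, so a visible but non-significant difference is not unusual.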

Next, the G-quadruplex in front of the oxyR gene was investigated in more detail. The stability of the oxyR quadruplex RNA sequence (see Figure 3.9 C, oxyR and oxyRm1, and Table 7.1) was characterized via CD spectroscopy and thermal denaturation (see Figure 3.9 E and F). The oxyR sequence forms a parallel four-stranded structure, as expected for an RNA quadruplex, with a melting temperature of 56.2 °C. The control sequence oxyRm1 (see Figure 3.9 C and Table 7.1), which contains two G-to-U mutations, showed a shifted CD signal
and a much lower melting temperature of 38.6 °C. Therefore, we assumed the formation of a G-quadruplex of moderate stability for the oxyR construct and no stable structure formation for the control oxyRm1. As described above, the oxyR sequence showed a significant reduction of gene expression compared to the controls oxyRm1 and oxyRm2.
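
One common way to read a melting temperature from a thermal denaturation curve is to normalize the signal and take the temperature at the midpoint of the transition. The sketch below illustrates this under that assumption; the text does not specify how the melting temperatures were derived, and the function name and data points are hypothetical.

```python
def melting_temperature(temps, signal):
    """Estimate Tm as the temperature at half-maximal normalized signal.

    The signal is scaled to the range 0..1 and the 0.5 crossing is located
    by linear interpolation between neighbouring data points.
    """
    lo, hi = min(signal), max(signal)
    frac = [(s - lo) / (hi - lo) for s in signal]
    for i in range(1, len(frac)):
        f0, f1 = frac[i - 1], frac[i]
        # The midpoint lies between two points if they sit on opposite sides of 0.5.
        if f0 != f1 and (f0 - 0.5) * (f1 - 0.5) <= 0:
            t0, t1 = temps[i - 1], temps[i]
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("no transition midpoint found")

# Hypothetical absorbance trace at 260 nm (arbitrary units).
temps = [20, 30, 40, 50, 60, 70, 80]
absorbance = [0.0, 0.0, 0.1, 0.3, 0.7, 1.0, 1.0]
tm = melting_temperature(temps, absorbance)
```

The midpoint approach assumes flat baselines before and after the transition; with sloping baselines, a baseline correction or a first-derivative analysis would be needed.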

To analyze the influence of certain nucleotide changes on gene expression in more detail, another set of controls for the oxyR G-quadruplex sequence was designed (see Figure 3.9 C and D). Here, we included sequences that were mutated outside of the G-tracts, i.e. they should still be able to form a G-quadruplex structure (oxyRm3, oxyRm4 and oxyRm5). With these control constructs we wanted to support our assumption that changes in gene expression result from secondary structure formation and are not only the effect of sequence changes in this regulatory region. Accordingly, we expected reduced gene expression for controls able to form G-quadruplexes relative to the non-quadruplex controls. In oxyRm3 and oxyRm4, an A was changed to U 14 nt and 10 nt upstream of the start codon, respectively. Gene expression increased significantly (by 80%) compared to the naturally G-rich oxyR sequence, but still remained repressed in comparison to the mutants that were not able to form a G-quadruplex (oxyRm1 and oxyRm2). Interestingly, when U was changed into A 13 nt upstream of the start codon (oxyRm5), gene expression decreased even more than in the natural oxyR sequence. In oxyRm6 the last G-tract was mutated, so that no G-quadruplex formation should be possible. In this case gene expression was also repressed when compared to the other mutants, but still significantly increased (by 77%) when compared to the naturally occurring oxyR sequence. Presumably, both effects (secondary structure formation as well as single-nucleotide changes in the SD region) contribute to the observed changes in gene expression. To exclude an influence of the sequence mutations on mRNA stability or transcription rates for the oxyR constructs, we determined mRNA levels via RT-PCR. We found similar mRNA abundances for the G-quadruplex construct and the mutants oxyRm1 and oxyRm2 (see Figure 3.9 H), suggesting differential translation initiation as the likely cause of the observed differences in gene expression.
Furthermore, we showed that this modulation is not specific to a particular plasmid or read-out system, as inserting the same SD background in front of the eGFP gene in the pQE vector led to comparable results (see Figure 3.9 G).

Figure 3.9: Naturally occurring quadruplexes in E. coli SD regions.

A Sequences (5’ to 3’) of quadruplexes occurring in the SD region of the E. coli relA, rseA, napH and yadI genes with their respective control mutants. Wt stands for the SD sequence in the wildtype pBAD vector. B β-galactosidase expression of the constructs listed in A. C Sequence of the G-quadruplex in front of the oxyR gene and the respective controls. D β-galactosidase expression of the oxyR constructs. E CD spectra and F thermal denaturation curves at 260 nm of the G-quadruplex in front of the E. coli oxyR gene compared to oxyR mutant 1. G Gene expression of the G-quadruplex naturally occurring in the SD region in front of the E. coli oxyR gene, investigated in the pQE vector in front of an eGFP reporter gene. H Analysis of eGFP mRNA levels by semi-quantitative RT-PCR for the SD region upstream of the E. coli oxyR gene compared to controls. RNA levels were calculated relative to the expression of the genomically encoded ssrA gene. Error bars represent standard deviations of three independent experiments; * indicates P<0.05, ** indicates P<0.001, *** indicates P<0.0001. © Cell Press.