IV. Discussion and Outlook

3. Future explorations

3.4 Elongation velocity at high resolution across the gene

To provide the background needed for the following paragraphs, I briefly revisit the architecture of human genes (Figure 34 a). A gene consists of exons, which code for the mature transcript, and introns, which are removed by splicing (see also Introduction I.1.2). Introns of nascent RNAs are mostly spliced co-transcriptionally 392,492,493. A spliceosome needs to assemble on each intron; thus, multiple spliceosomes are needed per transcript (recently reviewed in 71). The 5’ end of the first exon is defined as the transcription start site (TSS). Downstream of the TSS lies the Pol II pause site (PS), which marks the center of the promoter-proximal pause window (PS +/- 100 nt). The 3’ end of the last exon is referred to as the poly(A) (pA) site. The RNA is cleaved at the pA signal and most RNA 3’ ends receive a poly(A) tail 65. Pol II continues transcription downstream of the pA site until it reaches a transcription termination site (TTS). The region between the pA site and the TTS is defined as the termination window. Slowly elongating polymerases are observed not only in the promoter-proximal window but also at intron-exon junctions (splice sites) and over the termination window 19. Our multi-omics approach quantifies elongation velocities at different genes (Figure 34 b, unpublished), at genes encoding different transcript classes (Figure 34 c, unpublished), or at specific gene segments (Figure 34 d, unpublished). Multiple genomic features and factors modulate the Pol II elongation velocity v (Figure 16 e). Here, I focus on the interplay of v with co-transcriptional splicing and the dynamic histone code (see also Introduction I.1.1).

The complex interplay of elongation velocity and splicing. Slow Pol II elongation can promote splicing and the inclusion of alternative exons 494-496, but it can also lead to exon skipping by extending the window of opportunity to recruit splice repressors 497. It was recently shown in plants that light can accelerate Pol II elongation along photosynthesis genes and that, as a result of the changed velocity, the corresponding transcripts are subject to alternative splicing 498. High resolution (close to single-nucleotide precision) is required to resolve v at exon-intron borders. As discussed, occupancy profiling alone is ambiguous (Introduction I.3.1.2, Supplementary Note 2). It can nevertheless indicate whether Pol II elongates more rapidly (local occupancy minimum) or less rapidly (accumulation) in specific regions of the gene, assuming that Pol II transcription is generally processive and that the drop-off rate within the gene body is low. Published PRO-seq data indeed suggest such differences 347: alignment at 3’ splice sites (3’SS, intron-exon border) showed a drop in Pol II occupancy at the end of the intron, immediately followed by Pol II accumulation at the beginning of the exon. Alignment of PRO-seq density at 5’ splice sites (5’SS, exon-intron border) showed Pol II accumulation at the end of the exon and decreasing Pol II occupancy within the first 50 bp of the intron, after which occupancy returned to baseline levels 347. Another study found that the 3’ ends of ~17 nt long splice-site RNAs (spliRNAs) align precisely with the 3’ ends of exons and are conserved across species (human, mouse, Drosophila, C. elegans and marine sponge) 499. This might indicate a longer residence time of polymerases at the 3’ end of exons, since the nascent RNA associated with the elongation complex is protected for longer (and is thus measurable). These observations agree with our kinetic measurements, which combine TT-seq and mNET-seq data from human K562 cells at steady state (Figure 34 e, unpublished). Our approach allows us to extract kinetic parameters at high resolution and to compare elongation velocities between segments of different genes. Such genes may have very different local initiation frequencies (Ilocal), so their velocities could not be compared on the basis of occupancy data alone.
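The reason occupancy alone is ambiguous can be illustrated with a minimal sketch. It assumes a steady-state flux relation in which the local elongation velocity equals the initiation frequency divided by the polymerase occupancy; this relation, the class and function names, and all numbers below are illustrative assumptions, not the calibrated pipeline behind Figure 34.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical container for one gene segment (names are illustrative)."""
    name: str
    initiation_frequency: float  # polymerases per cell per minute (assumed unit)
    occupancy: float             # polymerases per kbp (assumed unit)


def elongation_velocity(seg: Segment) -> float:
    """Return velocity in kbp per minute under the assumed relation v = I / occupancy."""
    if seg.occupancy <= 0:
        raise ValueError(f"occupancy must be positive for segment {seg.name!r}")
    return seg.initiation_frequency / seg.occupancy


# Two made-up segments with identical occupancy but different initiation frequencies.
segments = [
    Segment("gene A, intron", initiation_frequency=2.0, occupancy=1.0),
    Segment("gene B, intron", initiation_frequency=0.5, occupancy=1.0),
]

velocities = {seg.name: elongation_velocity(seg) for seg in segments}
for name, v in velocities.items():
    print(f"{name}: {v:.2f} kbp/min")
```

In this toy example both segments show the same occupancy, yet the inferred velocities differ fourfold because the initiation frequencies differ. Occupancy profiles of different genes are therefore only comparable once they are normalized to the initiation frequency.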

Furthermore, it would be exciting to investigate whether the spacing between consecutive polymerases (defined by the promoter-proximal pause duration) protects against polymerase traffic jams that could occur further downstream due to pauses at splice sites. So far, it is unknown whether promoter-proximal pause durations correlate with pause durations at exons in vivo. Thus, the analysis of transcription kinetics has the potential to reveal novel aspects of the regulation of co-transcriptional splicing.
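The traffic-jam argument can be made concrete with a deliberately simple deterministic sketch. It assumes that consecutive polymerases are released from the pause site at intervals equal to the promoter-proximal pause duration, that all polymerases travel at the same velocity, and that a collision occurs when the follower reaches the trailing edge of a polymerase paused at a splice site. The function names, the footprint parameter, and all numbers are hypothetical and serve only to illustrate the reasoning.

```python
def time_headway(promoter_pause_min: float,
                 velocity_kbp_per_min: float,
                 footprint_kbp: float) -> float:
    """Time (min) a leading polymerase may pause before the follower reaches it.

    Assumption: polymerases leave the pause site every `promoter_pause_min`
    minutes, so the follower trails by that interval minus the time it
    needs to traverse one polymerase footprint.
    """
    if velocity_kbp_per_min <= 0:
        raise ValueError("velocity must be positive")
    return promoter_pause_min - footprint_kbp / velocity_kbp_per_min


def causes_jam(promoter_pause_min: float,
               splice_pause_min: float,
               velocity_kbp_per_min: float,
               footprint_kbp: float) -> bool:
    """True if a splice-site pause outlasts the headway set at the promoter."""
    headway = time_headway(promoter_pause_min, velocity_kbp_per_min, footprint_kbp)
    return splice_pause_min > headway


# Made-up values: a long promoter-proximal pause leaves enough headway,
# a short one does not.
long_pause_jam = causes_jam(promoter_pause_min=2.0, splice_pause_min=0.5,
                            velocity_kbp_per_min=2.0, footprint_kbp=0.04)
short_pause_jam = causes_jam(promoter_pause_min=0.2, splice_pause_min=0.5,
                             velocity_kbp_per_min=2.0, footprint_kbp=0.04)
print(long_pause_jam, short_pause_jam)
```

Under these assumptions, a jam arises only when the splice-site pause exceeds the headway established at the promoter. The proposed correlation analysis would test whether genes with long exon pauses also show long promoter-proximal pauses.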

Elongation velocity and a dynamic histone code. Thus far, we and others 130,272,335,336,500 have observed links between Pol II elongation velocity and histone modifications, but much of this evidence is correlative in nature (Figure 50). To determine whether histone modifications cause fast or slow elongation, one would need to test mutants that increase or decrease the level of a given modification and measure whether elongation velocities change. Alternatively, the causality could run in the opposite direction: modifiers traveling with Pol II have different time windows to act on the underlying chromatin during fast or slow elongation.

Figure 34. Elongation velocity at high resolution across human genes.

(a) Gene architecture. Colors as in (b) and (d). (b) Metagene analysis of local elongation velocity [kbp min⁻¹] along 9,329 genes in human K562 cells (steady state), depicted as scaled genomic position, aligned at the first TSS and the last pA site. (c) Boxplot of elongation velocity for different transcript classes annotated in human K562 cells: 886 eRNAs, 248 conRNAs, 235 uaRNAs, 877 sincRNAs, 1,281 asRNAs, 157 lincRNAs, and 4,582 mRNAs. Black bars represent the median, boxes mark upper and lower quartiles, and whiskers represent 1.5 times the inter-quartile range. (d) Boxplot of elongation velocity [kbp min⁻¹] at different transcript segments: first exons (n = 8,674) (dark blue), intermediate exons (n = 7,954) (grey), last exons (n = 7,911) (white), or introns (n = 8,788) (emerald green). (e) Metagene analysis of elongation velocity [kbp min⁻¹] aligned at different transcript segments (reference point highlighted in green): aligned at the TSS (n = 8,821) (dark blue, left), exon start (3’SS) and end (5’SS) (n = 69,184) (middle), or pA site (n = 8,072) (right). The list of splice sites was generated from RefSeq annotations.

V. Supplementary Information

This chapter contains information in support of the Introduction, Materials and Methods, Results and Discussion which could not be integrated within the respective chapter due to space limitations.