A. Appendix
A.2. Electronic supplement
A.2.3. CNE_results
This folder contains the results of the CNE analysis: one subfolder with the direct CNEr outputs for all pairwise comparisons, and the outputs produced by the Perl scripts. See chapter 5 and figure 5.1 for further information.
Acronyms
Aub Aubergine
bp base pair
CNE conserved non-coding element
CNS conserved non-coding sequence
DNA deoxyribonucleic acid
dsRNA double-stranded RNA
Hox gene homeobox gene
ITS internal transcribed spacer
lincRNA long intergenic non-coding RNA
lncRNA long non-coding RNA
LSU large ribosomal subunit
MFE minimum free energy
miRISC microRNA-induced silencing complex
miRNA microRNA
mRNA messenger RNA
MRP RNA mitochondrial RNA processing
MSA multiple sequence alignment
mya million years ago
my million years
ncRNA non-coding RNA
OGS official gene set
piRNA PIWI-interacting RNA
Pol II RNA polymerase II
rasiRNA repeat-associated small interfering RNA
rDNA ribosomal DNA
RISC RNA-induced silencing complex
RNA ribonucleic acid
RNAi RNA interference
rRNA ribosomal RNA
scaRNA small Cajal body-specific RNA
siRNA small interfering RNA
SSU small ribosomal subunit
snoRNA small nucleolar RNA
snRNA small nuclear RNA
SRP RNA signal recognition particle RNA
ssRNA single-stranded RNA
TE transposable element
TFBS transcription factor binding sites
tRNA transfer RNA
TSS transcription start site
UCE ultraconserved element
UTR untranslated region
WGA whole genome alignment
Zuc Zucchini
List of Figures
1.1 Schematic overview of the secondary structure of four different ncRNAs: miRNA, tRNA, H/ACA snoRNA, C/D snoRNA. . . 3
1.2 Biogenesis pathway of a miRNA. . . 7
1.3 Visualisation of the piRNA ping-pong loop. . . 15
1.4 Graphical overview of the rDNA cluster. . . 20
2.1 Graphical overview of the steps in the pipeline used for homology prediction of miRNAs. . . 37
2.2 FEELnc classifier description. Subclassification of intergenic and genic lncRNA/transcript interactions by the FEELnc classifier module. Taken from Wucher et al. (2017). . . 42
3.1 Absolute numbers of different ncRNAs found through homology analysis. . . 47
3.2 Graphical overview of the mir-2 cluster. . . 49
3.3 Visualisation of tRNA clusters found in A. rosae and O. abietinus. . . . 55
3.4 Visualisation of overlapping ncRNAs predicted using the DARIO pipeline in A. rosae. . . 57
3.5 Visualisation of the overlapping ncRNAs predicted by the DARIO pipeline in Orussus abietinus. . . 61
5.1 Graphical overview of the steps in the pipeline used for CNE prediction. . . 76
6.1 Visualised number of CNE candidates identified with CNEr. . . 83
6.2 Visualisation of the numbers of CNEs per cluster seen in table 6.2 in the four species. . . 85
6.3 Distribution of CNE predictions in Apis mellifera, differentiated by species. Only the results for the six longest scaffolds are shown. The number of CNEs is the cumulative total found on each scaffold. The x-axis shows the genomic location on the scaffold, the y-axis the number of CNEs. The results are for pairwise comparisons between species. . . 89
6.4 Distribution of CNE predictions in Athalia rosae. . . 90
6.5 Distribution of CNE prediction in Nasonia vitripennis. . . 91
6.6 Distribution of CNE prediction in Orussus abietinus. . . 92
List of Tables
1.1 A selection of different definitions of conserved non-coding elements and ultraconserved elements. . . 28
3.1 List of all miRNAs present in seven insect species. . . 49
3.2 List of all regulatory elements and ncRNAs, excluding miRNAs and tRNAs, present in seven insect species. . . 52
3.3 Read counts of the different datasets that were prepared for DARIO. . . 53
3.4 Results of the tRNA de novo prediction. . . 58
3.5 List of tRNA families containing introns. . . 58
3.6 Results of de novo ncRNA prediction using the DARIO pipeline. . . 59
3.7 Number of lncRNAs predicted in four Hymenoptera species. . . 65
3.8 FEELnc lncRNA-gene interaction results for A. rosae, O. abietinus, A. mellifera, and N. vitripennis. . . 66
6.1 Number of CNEs identified by CNEr sorted by species, number of CNEs left after overlapping ones were combined, size of the assembly (Mb), and N50 (kb) of the assembly. . . 82
6.2 Number of CNE clusters, grouped by CNE numbers. . . 84
6.4 Total number of CNE clusters and number of clusters with lncRNA/protein-coding genes. . . 86
6.3 Ratios of lncRNA/protein-coding genes. . . 86
6.5 CNE clusters with lncRNA in cis direction next to it. . . 87
6.6 Number of clusters consisting of ≥10 CNEs with an lncRNA as the associated gene, together with the shortest distance between the cluster and the lncRNA found. . . 93
A.1 Coordinates of all regulatory elements and ncRNAs that were predicted in Athalia rosae, after manual curation. . . 115
A.2 Coordinates of all regulatory elements and ncRNAs that were predicted in Orussus abietinus, after manual curation. . . 124
Acknowledgements
Although it is difficult to thank everyone who has supported me over the past years, I will try. First, of course, Prof. Dr. Bernhard Misof, who gave me the opportunity to carry out this work in his research group.
Most of all, I would like to thank Dr. Alexander Donath, who supervised this thesis and me, and who always stood by me with his advice and knowledge, even when the scripts were driving me to despair.
Furthermore, I thank Dr. Lars Podsiadlowski for reviewing this thesis, and Prof. Dr. Dietmar Quandt and Prof. Dr. Thomas Döring for agreeing to serve on the examination committee.
Further thanks go to all members of the Graduate School on Genomic Biodiversity Research for interesting conversations and for brainstorming.
Not to be forgotten are the people outside of work who provided me with motivation, encouragement, and their friendship. Special thanks go to Anna-Lisa Hahnen, who took the trouble to proofread this thesis and to help my English along a little.
Last but not least, I thank my family, for simply everything.