1. Comparative genomics of the genus Pseudomonas and phylum Chlorobi 1 Background

1.2 About the manuscripts

Part 1: Comparative genomics

These two manuscripts were grouped together because of their common ground in comparative genomics. The first paper set out to i) locate overrepresented 8-14 bp oligomers in Pseudomonas strains, ii) analyse genomic structure, including variable regions and genomic islands, using previously investigated tetranucleotides, and compare and contrast these with mid-length oligomers, and iii) use Pseudomonas as a test case for locating, visualising and testing the specificity of new oligomeric characters in metagenomics. The major results of this paper included the creation of the OligoCounter program suite, which provided the core tools for later work in this thesis. A dataset of overrepresented oligomers could be generated and maintained with this suite. The discovery by Weinel et al. (2002b) of overrepresented 8-14 bp oligomers in Pseudomonas strains was verified and extended to newly sequenced genomes. Whole-genome visualisation showed that the usage of 8-14 bp oligonucleotides mirrors that of the shorter tetramers, which, surprisingly, contrasts with past predictions (Reva and Tümmler 2005). Regions of atypical genomic oligomer usage, such as genomic islands, tend to lack globally overrepresented 8-14mers. This can be explained by the fact that genomic islands and phages exhibit their own characteristic oligonucleotide usage, or a usage similar to that of a putative previous host genome (Pride et al. 2003). In other words, divergent genomic regions are frequently involved in horizontal gene transfer, and this process is a major confounder of phylogenetic and taxonomic inference. Because globally overrepresented oligomers are largely absent from such regions, they appeared in theory to be promising candidates for taxonomic markers in the field of metagenomics. In practice, overrepresented oligomers were located in all Pseudomonas strains in sufficient numbers, even at a very conservative overrepresentation cutoff, and displayed enough specificity to allow distinction between species. Another, more anecdotal, observation concerned the apparent overrepresentation of a few tripeptides in Pseudomonas. The amino acid leucine was notably present in these tripeptides despite normalisation for its expected high frequency.
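The idea of an overrepresented oligomer can be illustrated with a minimal Python sketch. It is not the OligoCounter implementation, whose model and cutoffs are not described in this section. The function names, the zero-order expectation (independent bases at their observed frequencies) and the default cutoffs `min_ratio` and `min_count` are all assumptions made for illustration only.

```python
from collections import Counter
from math import prod

ACGT = frozenset("ACGT")

def count_kmers(seq, k):
    """Count every overlapping k-mer that consists only of A, C, G and T."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= ACGT:  # skip words containing N or other symbols
            counts[word] += 1
    return counts

def overrepresented_kmers(seq, k, min_ratio=5.0, min_count=3):
    """Return (k-mer, observed, expected, ratio) tuples, highest ratio first.

    Expected counts assume independent bases (a zero-order model).
    This is an illustrative assumption, not the OligoCounter model.
    """
    seq = seq.upper()
    base_counts = Counter(b for b in seq if b in ACGT)
    total = sum(base_counts.values())
    freq = {b: base_counts[b] / total for b in "ACGT"}
    positions = len(seq) - k + 1
    hits = []
    for word, observed in count_kmers(seq, k).items():
        expected = positions * prod(freq[b] for b in word)
        ratio = observed / expected
        if observed >= min_count and ratio >= min_ratio:
            hits.append((word, observed, expected, ratio))
    return sorted(hits, key=lambda h: h[3], reverse=True)
```

Calling `overrepresented_kmers(genome, k)` for each k from 8 to 14 would give one candidate list per word length. A real analysis would probably need a higher-order background model and a stricter cutoff than this sketch uses.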

The second manuscript, "Comparative genomics of the Chlorobi", also makes use of oligonucleotide signatures, together with a range of other bioinformatic techniques, in an assessment of the eleven sequenced genomes of the green sulfur bacteria. Oligonucleotide parameters precisely and elegantly depict the genome positions of a number of giant genes with constrained amino acid usage (see Figure 1 in the manuscript), and also reveal genomic islands, islets and phages. The phylogeny of the Chlorobi has been the subject of considerable debate (Imhoff 2003), and the Chlorobi are likely a model phylum for the positive impact of molecular taxonomy over traditional morphological characteristics. Our findings, based on sequence comparisons of two appropriate genes and of the entire proteome content of all completely sequenced genomes, support those of Imhoff (2003). Using a whole-proteome sequence comparison approach, we also noted the conservation of orthologous genes throughout the Chlorobi, although gene synteny was, as expected, not conserved in this phylum. The origin of photosynthesis and the horizontal transfer of various photosynthetic genes have been of great interest in studies of these taxa. Our analysis of putatively horizontally transferred regions suggests that, while some metabolic genes associated with photosynthesis have apparently been transferred between taxa, several hundred genes are probably necessary for the demanding requirements of a photosynthetic lifestyle. Thus, most of the genes that enable photosynthesis belong to the core genome.
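How oligonucleotide usage can point to atypical regions such as genomic islands may be sketched as follows. The sketch compares the tetranucleotide frequencies of each sliding window with those of the whole sequence. The distance measure (sum of absolute frequency differences), the function names and the window and step sizes are assumptions chosen for illustration; they are not the oligonucleotide parameters used in the manuscripts.

```python
from collections import Counter
from itertools import product

# All 256 tetranucleotides in a fixed order.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]

def tetra_freqs(seq):
    """Relative frequency of each tetranucleotide (overlapping words)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[t] for t in TETRAMERS)  # ignores words with N etc.
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [counts[t] / total for t in TETRAMERS]

def window_distances(seq, window=5000, step=2500):
    """Return (start, distance) for each window along the sequence.

    The distance is the sum of absolute differences between the window's
    tetranucleotide frequencies and those of the whole sequence. This is
    a hypothetical, simplified measure of atypical composition.
    """
    genome = tetra_freqs(seq)
    result = []
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        local = tetra_freqs(seq[start:start + window])
        dist = sum(abs(a - b) for a, b in zip(local, genome))
        result.append((start, dist))
    return result
```

Windows with the largest distances are candidates for atypical regions. Plotting the distance against the window start position gives a simple genome-wide view of compositional variation.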

The first paper, "Visualization of Pseudomonas genomic structure by abundant 8-14mer oligonucleotides", was published in Environmental Microbiology in 2009, while "Comparative genomics of the Chlorobi" had been submitted to Photosynthesis Research at the time of writing. I was lead author on both works and contributed written passages, figures, programming and analyses to both. Oleg Reva contributed analyses and figures to the first paper, while David Ussery contributed whole-genome proteome assessment figures and commented on the second manuscript. Burkhard Tümmler wrote parts of both works.

1.3 References

Becq, J., Gutierrez, M. C., Rosas-Magallanes, V., Rauzier, J., Gicquel, B., Neyrolles, O. & Deschavanne, P. (2007) Contribution of horizontally acquired genomic islands to the evolution of the tubercle bacilli. Mol Biol Evol 24(8) 1861-1871.

Bohlin, J., Skjerve, E. & Ussery, D. W. (2008) Investigations of oligonucleotide usage variance within and between prokaryotes. PLoS Comput Biol 4(4) e1000057.

Bush, E. C. & Lahn, B. T. (2006) The evolution of word composition in metazoan promoter sequence. PLoS Comput Biol 2(11) e150.

Eisen, J. A., Nelson, K. E., Paulsen, I. T., Heidelberg, J. F., Wu, M., Dodson, R. J., Deboy, R., Gwinn, M. L., et al. (2002) The complete genome sequence of Chlorobium tepidum TLS, a photosynthetic, anaerobic, green-sulfur bacterium. Proc Natl Acad Sci U S A 99(14) 9509-9514.

Foerstner, K. U., von Mering, C., Hooper, S. D. & Bork, P. (2005) Environments shape the nucleotide composition of genomes. EMBO Rep 6(12) 1208-1213.

Imhoff, J. F. (2003) Phylogenetic taxonomy of the family Chlorobiaceae on the basis of 16S rRNA and fmo (Fenna-Matthews-Olson protein) gene sequences. Int J Syst Evol Microbiol 53(Pt 4) 941-951.

Karlin, S., Mrázek, J. & Campbell, A. M. (1997) Compositional biases of bacterial genomes and evolutionary implications. J Bacteriol 179(12) 3899-3913.

Lee, D. G., Urbach, J. M., Wu, G., Liberati, N. T., Feinbaum, R. L., Miyata, S., Diggins, L. T., He, J., et al. (2006) Genomic analysis reveals that Pseudomonas aeruginosa virulence is combinatorial. Genome Biol 7(10) R90.

Nelson, K. E., Weinel, C., Paulsen, I. T., Dodson, R. J., Hilbert, H., dos Santos, V. A. P. M., Fouts, D. E., et al. (2002) Complete genome sequence and comparative analysis of the metabolically versatile Pseudomonas putida KT2440. Environ Microbiol 4(12) 799-808.

Ochman, H., Lerat, E. & Daubin, V. (2005) Examining bacterial species under the specter of gene transfer and exchange. Proc Natl Acad Sci U S A 102 Suppl 1, 6595-6599.

Olsen, G. J. & Woese, C. R. (1993) Ribosomal RNA: a key to phylogeny. FASEB J 7(1) 113-123.

Pride, D. T., Meinersmann, R. J., Wassenaar, T. M. & Blaser, M. J. (2003) Evolutionary implications of microbial genome tetranucleotide frequency biases. Genome Res 13(2) 145-158.

Reva, O. N. & Tümmler, B. (2004) Global features of sequences of bacterial chromosomes, plasmids and phages revealed by analysis of oligonucleotide usage patterns. BMC Bioinformatics 5, 90.

Reva, O. N. & Tümmler, B. (2005) Differentiation of regions with atypical oligonucleotide composition in bacterial genomes. BMC Bioinformatics 6, 251.

Santos, V. A. P. M. D., Heim, S., Moore, E. R. B., Strätz, M. & Timmis, K. N. (2004) Insights into the genomic basis of niche specificity of Pseudomonas putida KT2440. Environ Microbiol 6(12) 1264-1286.

Silby, M. W., Cerdeño-Tárraga, A. M., Vernikos, G. S., Giddens, S. R., Jackson, R. W., Preston, G. M., et al. (2009) Genomic and genetic analyses of diversity and plant interactions of Pseudomonas fluorescens. Genome Biol 10(5) R51.

Sueoka, N. (1962) On the genetic basis of variation and heterogeneity of DNA base composition. Proc Natl Acad Sci U S A 48, 582-592.

Teeling, H., Waldmann, J., Lombardot, T., Bauer, M. & Glöckner, F. O. (2004) TETRA: a web-service and a stand-alone program for the analysis and comparison of tetranucleotide usage patterns in DNA sequences. BMC Bioinformatics 5, 163.

Weinel, C., Nelson, K. E. & Tümmler, B. (2002a) Global features of the Pseudomonas putida KT2440 genome sequence. Environ Microbiol 4(12) 809-818.

Weinel, C., Ussery, D. W., Ohlsson, H., Sicheritz-Ponten, T., Kiewitz, C. & Tümmler, B. (2002b) Comparative Genomics of Pseudomonas aeruginosa PAO1 and Pseudomonas putida KT2440: Orthologs, Codon Usage, Repetitive Extragenic Palindromic Elements, and Oligonucleotide Motif Signatures. Genome Letters 4, 175-187.

Willner, D., Thurber, R. V. & Rohwer, F. (2009) Metagenomic signatures of 86 microbial and viral metagenomes. Environ Microbiol 11(7) 1752-1766.

Visualization of Pseudomonas genomic structure by