2 Results and Discussion

2.4 Genomics

2.4.2 Functional genes

For each pairwise alignment, the orthologous regions were summed using a custom Java script, and the percentage of total orthologous regions was calculated for each pair of genomes. A similarity matrix was constructed from these percentages. The neighbor-joining trees for genome homology and 16S rRNA showed the same topology (Figure 12), which again indicates that 16S rRNA-based phylogeny is a good proxy for genome evolution.
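The summation step can be sketched as follows. This is only an illustration in Python, not the script used in this study: the function names, the half-open interval representation, the merging of overlapping regions and the choice of the first genome's length as denominator are all assumptions made for the example.

```python
from itertools import combinations


def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals so shared bases are counted once."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def percent_orthologous(regions, genome_length):
    """Percentage of a genome covered by orthologous regions (half-open intervals)."""
    covered = sum(end - start for start, end in merge_intervals(regions))
    return 100.0 * covered / genome_length


def similarity_matrix(names, lengths, aligned):
    """Build a symmetric matrix of percentages.

    aligned[(a, b)] holds the orthologous regions on genome a from the
    pairwise alignment of a and b (assumed input layout).
    """
    matrix = {a: {b: 100.0 if a == b else 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        pct = percent_orthologous(aligned[(a, b)], lengths[a])
        matrix[a][b] = matrix[b][a] = pct
    return matrix


# Hypothetical toy data, not values from this study.
names = ["genome_A", "genome_B"]
lengths = {"genome_A": 1000, "genome_B": 800}
aligned = {("genome_A", "genome_B"): [(0, 100), (50, 200), (500, 600)]}
matrix = similarity_matrix(names, lengths, aligned)
print(matrix["genome_A"]["genome_B"])
```

A matrix of this kind, converted to distances, is the sort of input a neighbor-joining implementation expects.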

Figure 12 Neighbor joining trees based on 16S rRNA similarity (left) and genome homology (right).

Figure 13 Comparison of PS operons of KT71, RAp1red, Ivo14 and HTCC2080.

Green, bch genes; red, puf genes; orange, crt genes; light grey, unknown conserved genes. The bchHM genes of RAp1red are located ~24 kbp upstream of the PS superoperon on the same scaffold (scaffold 4).

According to a previous study (Yutin et al., 2007), the pufM genes from the NOR5-1B group fall within Group K, while the gene from KT71 is the closest relative of Group K. However, the pufL and pufM sequences of HTCC2148 and HTCC2246 (also a NOR5/OM60 member, which could not be assigned to any subclade), which were obtained by PCR amplification, did not group with other NOR5/OM60 members, but rather with two different groups of Alphaproteobacteria (Cho et al., 2007). This puzzle could not be solved in this study, since neither pufLM nor the rest of the PS superoperon was found in the genome of HTCC2148. Since it is unlikely that the whole PS superoperon was missed during genome sequencing, the most likely explanation is that DNA from other photosynthetic bacteria contaminated the PCR of the puf genes of HTCC2148.

Proteorhodopsin

The proteorhodopsin genes (pop) were found only in the genomes of HTCC2148 and HTCC2143, but not in the four genomes in which the PS superoperon was present.

The pop gene of HTCC2148 is located at the beginning of a very short scaffold (scaffold 18, 4490 bp) and the sequence is not complete (540 bp), while the pop gene in HTCC2143 is complete (690 bp).

The pop gene of HTCC2143 is the closest relative of that of the SAR92 group, one of the groups most closely related to the NOR5/OM60 clade and HTCC2143 (16S rRNA sequence identities between the groups are 88–92%), while that of HTCC2148 also clusters with those of other Alpha- and Gammaproteobacteria, although its exact position cannot be determined because the sequence is incomplete.

Downstream of the HTCC2143 proteorhodopsin gene are the genes for retinal synthesis, in the order pop-crtEIBY-blh-fni (crtE = idsA), all translated in the same direction. This gene arrangement is exactly the same as in HTCC2207 (Stingl et al., 2007). The existence of proteorhodopsin in HTCC2143 is therefore well supported. However, the genes for retinal synthesis were not found in the genome of HTCC2148, and the genes downstream of pop are functionally unrelated. Since retinal is the chromophore of rhodopsin, the functionality of the pop gene in the genome of HTCC2148 is questionable.

Carbon fixation

The key genes of the Calvin cycle, the reverse citric acid cycle and the reductive acetyl-CoA pathway were not found in any of the six genomes. However, several genes of the 3-hydroxypropionate cycle were identified in the four genomes of NOR5/OM60 strains. These include the malonyl-CoA reductase gene (mcr) and the propionyl-CoA synthase gene (pcs), two key genes that have not been found to be involved in any pathway other than the carbon-fixing 3-hydroxypropionate cycle (Hügler et al., 2002). The two genes were found in tandem, arranged as pcs-mcr, in the genomes of RAp1red, Ivo14 and HTCC2080. In KT71 only pcs was found, while mcr is missing, as reported before (Friedmann et al., 2007).

Until now, these large genes (in HTCC2080, mcr is 3651 bp and pcs is 5421 bp) had been found in only a few strains: Chloroflexus spp., Roseiflexus spp. (both Chloroflexi) and Erythrobacter sp. NAP1 (Alphaproteobacteria); a single pcs gene was found in Chloroherpeton thalassium ATCC35110 (Chlorobi). This is the first time that these genes have been found in Gammaproteobacteria and the second time in Proteobacteria.

A comparative sequence analysis of all genomic pcs genes available to date (Figure 14) shows that the NOR5/OM60 sequences cluster together. The pcs sequence of strain Ivo14 is closer to those of the other North Sea strains than to that of HTCC2080, which means that the pcs phylogeny does not parallel the 16S rRNA phylogeny. The similarity of pcs across all sources is high (e.g. 46–54% amino acid identity between the NOR5/OM60 and Chloroflexi sequences). It is therefore highly likely that the pcs genes in NOR5/OM60 have the same function as in Chloroflexi.
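For readers unfamiliar with how an amino acid identity value of this kind is obtained, the following Python sketch computes percent identity from two already aligned protein sequences. It is an illustration only: the function name is hypothetical, and the choice to ignore gapped columns in the denominator is one of several common conventions, not necessarily the one used for the values reported here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # gapped columns are skipped (assumed convention)
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment, not real pcs sequences.
print(percent_identity("MKT-LV", "MRTALV"))  # 4 of 5 compared columns match
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values for gappy alignments.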

Figure 14 Maximum likelihood tree of all known propionyl-CoA synthase (pcs) genes from genomes. Both Chloroflexus and Roseiflexus belong to the phylum Chloroflexi, while Chloroherpeton belongs to Chlorobi and Erythrobacter belongs to Alphaproteobacteria.

The genes for the first step of the 3-hydroxypropionate pathway, accA, accBC and accD, which encode acetyl-CoA carboxylase, were found in all five strains of the NOR5/OM60 clade, at three separate locations on each genome. Genes for propionyl-CoA carboxylase (pccBA), methylmalonyl-CoA epimerase (mce), methylmalonyl-CoA mutase (mcm) and a putative arginine/ornithine transport system ATPase occur in tandem in all six genomes.

The last steps of the 3-hydroxypropionate cycle in Chloroflexus are more complicated than previously thought and are still under investigation (Friedmann et al., 2007). For the proposed succinyl-CoA:L-malate CoA transferase and L-malyl-CoA lyase, homologs with relatively low similarity to those in Chloroflexus can be found in the NOR5/OM60 genomes. It is hard to judge whether the NOR5/OM60 strains use these enzymes to close the cycle. Alternatively, the NOR5/OM60 strains may use a different pathway to recycle succinyl-CoA and regenerate acetyl-CoA.

The absence of the mcr gene in KT71 is consistent with the fact that KT71 was not able to grow autotrophically in physiological tests (Fuchs et al., 2007). Why it still retains the large pcs gene is not yet clear. The only other reported strain from Proteobacteria, the alphaproteobacterial AAnP Erythrobacter sp. NAP1, was shown to assimilate CO2 (Kolber et al., 2001). Its daily cellular CO2 fixation rate was 3% of the cellular carbon content and contributed about 1% of total carbon anabolism.

Because this is the first time that the pcs gene has been found in Gammaproteobacteria, we searched for homologous sequences using BLAST against metagenomic databases.
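One simple way to judge whether a metagenomic hit is closer to one reference group than to another is to compare its best e-values against each group on a log scale. The Python sketch below illustrates the idea. The function names, the floor applied to e-values reported as zero, and the threshold parameter are assumptions for the example, not the procedure used in this study.

```python
import math


def log10_evalue_gap(evalue_target, evalue_other, floor=1e-300):
    """Orders of magnitude by which the target-group e-value is smaller.

    BLAST can report an e-value of 0.0 for very strong hits, so values are
    clamped to a small positive floor before taking logarithms.
    """
    target = max(evalue_target, floor)
    other = max(evalue_other, floor)
    return math.log10(other) - math.log10(target)


def closer_to_target(evalue_target, evalue_other, min_orders):
    """True if the hit beats the other group by more than min_orders of magnitude."""
    return log10_evalue_gap(evalue_target, evalue_other) > min_orders


# Hypothetical e-values for one metagenomic read, not data from this study.
print(log10_evalue_gap(1e-80, 1e-40))
print(closer_to_target(1e-80, 1e-40, min_orders=10))
```

E-values depend on database size, so such a comparison is only meaningful when both sets of hits come from searches against the same database.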

Hundreds of homologous sequences were found in the Global Ocean Survey (GOS) project (http://camera.calit2.net/index.php) (Rusch et al., 2007), and many of them are clearly more similar to the sequences of the NOR5/OM60 strains than to those of the other groups (e-values differ by a factor of more than 10^30). The sampling locations at which pcs genes were sequenced are also widely distributed. Therefore, the 3-hydroxypropionate pathway might be a common route for carbon fixation in the marine surface layer, and more detailed studies are needed to determine whether these sequences belong to the NOR5/OM60 group.

Sulfur compound oxidation genes

The sox operon encoding enzymes for the oxidation of sulfur compounds is present in all genomes containing the PS superoperon, i.e. KT71, RAp1red, Ivo14 and HTCC2080 (Figure 15), but not in HTCC2148 and HTCC2143. Among the sox genes, soxCDXYZAB are the core genes for oxidizing thiosulfate (Friedrich et al., 2005).

Figure 15 Arrangement of the sox operon in the KT71, RAp1red, Ivo14 and HTCC2080 genomes. The soxX genes of KT71 and RAp1red show low similarity to, and differ in length from, those of Ivo14 and HTCC2080.

The operon arrangement soxCDXYZAKB in KT71 and RAp1red is the same as in several Gamma- and Alphaproteobacteria, such as Neptuniibacter caesariensis MED92 and Methylobacterium sp. 4-46. However, the sox operon of Ivo14 and HTCC2080 has the arrangement soxCDYZAXB, the same as in several Betaproteobacteria, such as Dechloromonas aromatica RCB. The closest relatives of several sox genes of Ivo14 and HTCC2080 also fall within the Beta- or Deltaproteobacteria, e.g. as shown for the soxB gene. All these features suggest a lateral gene transfer of the whole sox operon from Beta- or Deltaproteobacteria, while the sox operons of the NOR5-3 strains are closer to those of other Gammaproteobacteria.
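The difference between the two arrangements can be made explicit with a small comparison of gene order. The Python sketch below is purely illustrative: the helper names are hypothetical, and it assumes the compact operon notation used above, in which a three-letter prefix is followed by one letter per gene.

```python
def expand_operon(name, prefix_len=3):
    """Expand compact notation such as 'soxCDYZAXB' into a list of gene names."""
    prefix, letters = name[:prefix_len], name[prefix_len:]
    return [prefix + letter for letter in letters]


def compare_gene_order(operon_a, operon_b):
    """Report genes unique to each operon and whether shared genes keep their order."""
    genes_a = expand_operon(operon_a)
    genes_b = expand_operon(operon_b)
    shared_in_a = [gene for gene in genes_a if gene in genes_b]
    shared_in_b = [gene for gene in genes_b if gene in genes_a]
    return {
        "only_a": [gene for gene in genes_a if gene not in genes_b],
        "only_b": [gene for gene in genes_b if gene not in genes_a],
        "same_order": shared_in_a == shared_in_b,
    }


# The two arrangements named in the text.
result = compare_gene_order("soxCDXYZAKB", "soxCDYZAXB")
print(result)
```

For these two arrangements the comparison flags soxK as present only in the first operon and reports that the shared genes are not in the same order, reflecting the different position of soxX.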

The distribution of several functional gene groups is summarized in Unit 2, Table 3.
