Conservation genetics of Malagasy amphibians

Academic year: 2022

CHAPTER 6

Comparative performance of the 16S rRNA gene in DNA barcoding of amphibians

Published in Frontiers in Zoology (2005) 2: article 5

Picture: Arie van der Meijden

6.1. ABSTRACT

Identifying species of organisms by short sequences of DNA has been in the center of ongoing discussions under the terms DNA barcoding or DNA taxonomy. A C-terminal fragment of the mitochondrial gene for cytochrome oxidase subunit I (COI) has been proposed as a universal marker for this purpose among animals. Herein we present experimental evidence that the mitochondrial 16S rRNA gene fulfills the requirements for a universal DNA barcoding marker in amphibians. In terms of universality of priming sites and identification of major vertebrate clades the studied 16S fragment is superior to COI. Amplification success was 100% for 16S in a subset of fresh and well-preserved samples of Madagascan frogs, while various combinations of COI primers had lower success rates. COI priming sites showed high variability among amphibians both at the level of groups and closely related species, whereas 16S priming sites were highly conserved among vertebrates. Interspecific pairwise 16S divergences in a test group of Madagascan frogs were at a level suitable for assignment of larval stages to species (1-17%), with low degrees of pairwise haplotype divergence within populations (0-1%). We strongly advocate the use of 16S rRNA as a standard DNA barcoding marker for vertebrates to complement COI, especially if samples a priori could belong to various phylogenetically distant taxa and false negatives would constitute a major problem.

6.2. INTRODUCTION

The use of short DNA sequences for the standardized identification of organisms has recently gained attention under the terms DNA barcoding or DNA taxonomy (Floyd et al. 2002; Hebert et al. 2003a; Tautz et al. 2003). Among the promising applications of this method are the assignments of unknown life-history stages to adult organisms (Hebert et al. 2004a; Thomas et al. 2005), the large-scale identification of organisms in ecological or genomic studies (Blaxter 2004; Floyd et al. 2002) and, most controversially, explorative studies to discover potentially undescribed "candidate" species (Hebert et al. 2004a; Hebert et al. 2004b; Venter et al. 2004). Although it is not a fundamentally new technique (Moritz and Cicero 2004), DNA barcoding is promising because technical progress has made its large-scale, automated application feasible (Blaxter 2004; Tautz et al. 2003), which may accelerate taxonomic progress (Wilson 2004).

DNA barcoding and DNA taxonomy are realities in many fields (Blaxter 2004). Consensus by the scientific community is essential with respect to the most suitable genes that allow robust and repeatable amplification and sequencing and provide unequivocal resolution to identify a broad spectrum of organisms. While D. Tautz and co-workers (Tautz et al. 2003) proposed the nuclear ribosomal RNA genes for this purpose, P. D. N. Hebert and colleagues have strongly argued in favor of a 5' fragment of the mitochondrial gene for cytochrome oxidase, subunit I (COI or COXI) (Hebert et al. 2003a; Hebert et al. 2003b). This gene fragment has been shown to provide a sufficient resolution and robustness in some groups of organisms, such as arthropods and, more recently, birds (Hebert et al. 2003a; Hebert et al. 2003b; Hebert et al. 2004a; Hebert et al. 2004b).

A genetic marker suitable for DNA barcoding needs to meet a number of criteria (Hebert et al. 2003a). First, in the study group, it needs to be sufficiently variable to be able to discriminate among most species but sufficiently conserved to be less variable within than between species. Second, priming sites need to be sufficiently conserved to permit a reliable amplification without the risk of false negatives when pooled samples or environmental DNA is analyzed. Third, the gene should convey sufficient phylogenetic information to assign species to major taxa using simple phenetic approaches. Fourth, its amplification and sequencing should be as robust as possible, also under variable lab conditions and protocols. Fifth, sequence alignment should be possible also among distantly related taxa.

Here we explore the performance of a fragment of the 16S ribosomal RNA gene (16S) in DNA barcoding of amphibians. As a contribution to the discussion about suitable standard markers we provide experimental data on comparative amplification success of 16S and COI in amphibians, empirical data on conservedness of priming sites, and an example from the 16S-based identification of amphibian larval stages.

6.3. MATERIAL AND METHODS

To test for universality of primers and cycling conditions, we performed parallel experiments in three different laboratories (Berkeley, Cologne, Konstanz) using the same primers but different biochemical products and thermocyclers, and slightly different protocols.

The selected primers for 16S (Palumbi et al. 1991) amplify a fragment of ca. 550 bp (in amphibians) that has been used in many phylogenetic and phylogeographic studies in this and other vertebrate classes: 16SA-L, 5' - CGC CTG TTT ATC AAA AAC AT - 3'; 16SB-H, 5' - CCG GTC TGA ACT CAG ATC ACG T - 3'.

For COI we tested (1) three primers designed for birds (Hebert et al. 2004b) that amplify a 749 bp region near the 5'-terminus of this gene: BirdF1, 5' - TTC TCC AAC CAC AAA GAC ATT GGC AC - 3', BirdR1, 5' - ACG TGG GAG ATA ATT CCA AAT CCT G - 3', and BirdR2, 5' - ACT ACA TGT GAG ATG ATT CCG AAT CCA G - 3'; and (2) one pair of primers designed for arthropods (Hebert et al. 2003a) that amplify a 658 bp fragment in the same region: LCO1490, 5' - GGT CAA CAA ATC ATA AAG ATA TTG G - 3', and HCO2198, 5'-TAA ACT TCA GGG TGA CCA AAA AAT CA-3'. Sequences of additional primers for COI that had performed well in mammals and fishes were kindly made available by P. D. N. Hebert (personal communication in 2004) and these primers yielded similar results (not shown).

The optimal annealing temperatures for the COI primers were determined using a gradient thermocycler and were found to be 49-50°C; the 16S annealing temperature was 55°C. Successfully amplified fragments were sequenced using various automated sequencers and deposited in GenBank. Accession numbers for the complete data set of adult mantellid sequences used for the assessment of intra- and interspecific divergences (e.g., Figure 6.5) are AY847959-AY848683. Accession numbers of the obtained COI sequences are AY883978-AY883995.

Nucleotide variability was scored using the software DnaSP (Rozas et al. 2003) at COI and 16S priming sites of the following complete mitochondrial genomes of nine amphibians and 59 other vertebrates: Cephalochordata: AF098298, Branchiostoma. Myxiniformes: AJ404477, Myxine. Petromyzontiformes: U11880, Petromyzon. Chondrichthyes: AJ310140, Chimaera; AF106038, Raja; Y16067, Scyliorhinus; Y18134, Squalus. Actinopterygii: AY442347, Amia; AB038556, Anguilla; AB034824, Coregonus; M91245, Crossostoma; AP002944, Gasterosteus; AB047553, Plecoglossus; U62532, Polypterus; U12143, Salmo. Dipnoi: L42813, Protopterus. Coelacanthiformes: U82228, Latimeria. Amphibia, Gymnophiona: AF154051, Typhlonectes. Amphibia, Urodela: AJ584639, Ambystoma; AJ492192, Andrias; AF154053, Mertensiella; AJ419960, Ranodon. Amphibia, Anura: AB127977, Buergeria; NC_005794, Bufo; AY158705, Fejervarya; AB043889, Rana; M10217, Xenopus. Testudines: NC_000886, Chelonia; AF069423, Chrysemys; AF366350, Dogania; AY687385, Pelodiscus; AF039066, Pelomedusa. Squamata: NC_005958, Abronia; AB079613, Cordylus; AB008539, Dinodon; AJ278511, Iguana; AB079597, Leptotyphlops; AB079242, Sceloporus; AB080274, Shinisaurus. Crocodilia: AJ404872, Caiman. Aves: AF363031, Anser; AY074885, Arenaria; AF090337, Aythya; AF380305, Buteo; AB026818, Ciconia; AF362763, Eudyptula; AF090338, Falco; AY235571, Gallus; AY074886, Haematopus; AF090339, Rhea; Y12025, Struthio. Mammalia: X83427, Ornithorhynchus; Y10524, Macropus; AJ304826, Vombatus; AF061340, Artibeus; U96639, Canis; AJ222767, Cavia; AY075116, Dugong; AB099484, Echinops; Y19184, Lama; AJ224821, Loxodonta; AB042432, Mus; AJ001562, Myoxus; AJ001588, Oryctolagus; AF321050, Pteropus; AB061527, Sorex; AF348159, Tarsius; AF217811, Tupaia; AF303111, Ursus (for species names, see GenBank under the respective accession numbers).

16S sequences of a large sample of Madagascan frogs were used to build a database in BioEdit (North Carolina State University). Tadpole sequences were compared with this database using local BLAST searches (Altschul et al. 1990) as implemented in BioEdit.
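The database lookup described above can be pictured with a rough stand-in. The sketch below is not BLAST and is not the BioEdit implementation used in this study; it only ranks reference sequences by the proportion of short words (k-mers) they share with a query, which loosely mirrors the word-matching idea behind BLAST seeding. The function names, the default word size and the toy sequences are all hypothetical.

```python
def kmers(seq, k):
    """Return the set of all overlapping words of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_references(query, references, k=11):
    """Rank reference sequences by shared k-mer content with the query.

    `references` maps a (hypothetical) specimen label to its sequence.
    The score is the Jaccard index of the two k-mer sets, a crude
    similarity measure that needs no prior alignment.
    """
    query_words = kmers(query, k)
    ranked = []
    for name, seq in references.items():
        ref_words = kmers(seq, k)
        union = query_words | ref_words
        score = len(query_words & ref_words) / len(union) if union else 0.0
        ranked.append((name, score))
    # Highest score first, analogous to reading off the top hit.
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy data only; real 16S fragments are about 550 bp long.
    adults = {
        "adult_species_A": "ACGTACGTTTGACC",
        "adult_species_B": "TTTTGGGGCCCCAAAA",
    }
    tadpole = "ACGTACGTTTGACA"
    for name, score in rank_references(tadpole, adults, k=4):
        print(f"{name}\t{score:.2f}")
```

A real analysis would rely on alignment-based scores such as those reported by BLAST; the sketch is only meant to make the "compare each tadpole against every adult reference and take the best hit" logic concrete.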

The performance of COI and 16S in assigning taxa to inclusive major clades was tested based on gene fragments homologous to those amplified by the primers used herein (see above), extracted from the complete mitochondrial sequences of 68 vertebrate taxa. Sequences were aligned in Sequence Navigator (Applied Biosystems) by a Clustal algorithm with a gap penalty of 50, a gap extend penalty of 10 and a setting of the ktup parameter at 2. Neighbor-joining trees were computed in PAUP* (Swofford 2002) from LogDet distances, excluding gapped sites in pairwise comparisons. We chose these simple phenetic methods instead of maximum likelihood or maximum parsimony approaches because the latter are computationally more demanding and because the aim of DNA barcoding is a robust and fast identification of taxa rather than an accurate determination of their phylogenetic relationships.

6.4. RESULTS

Amplification experiments

We performed independent amplification experiments with one set of 16S primers and three published sets of COI primers (Hebert et al. 2003a; Hebert et al. 2004b) focusing on representatives of different frog, salamander and caecilian genera. The experiments were concordant in yielding more general results for 16S than COI. In a set of fresh and well-preserved samples from relatively closely related mantellid frogs from Madagascar (Appendix 6.1), the 16S amplification success was complete, whereas the three sets of COI primers yielded success rates of only 50-70%. Considering all three primer combinations, there were two species of frogs (10%) that did not amplify for COI at all (Boophis septentrionalis and B. tephraeomystax).

Priming sites

The variability of priming sites was surveyed using nine complete amphibian mitochondrial sequences from GenBank (Figure 6.1), and 59 mt genomes of fishes, reptiles, birds and mammals (Figure 6.2).

Figure 6.1. Variability of priming sites for 16S rRNA and COI in amphibians.

Figure 6.2. Variation in priming sites of 16S rRNA (a, F-primer; b, R-primer) and COI (c, Bird-F1, LCO1490; d, HCO2198; e, Bird-R1, Bird-R2) fragments studied herein. Values are nucleotide variability as calculated using the DnaSP program. Grey bars show the values for nine amphibians, black bars the values for a set of 59 other vertebrates (see Materials and Methods, and Figures 6.3 and 6.4).

A high variability was encountered for COI. The sequences of some species were largely consistent with the primers: Xenopus had two mutations only at each of the priming regions. However, other sequences were strongly different, with up to seven mutations, all at third codon positions. No particular pattern was recognizable for any major group that would facilitate designing COI primers specific for frogs, salamanders or caecilians. Interestingly the variability among the amphibian sequences available was as large as or larger than among the complete set of vertebrates at many nucleotide positions of COI priming sites (Figure 6.2), indicating a possible higher than random variability of this gene in amphibians.
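Counting such primer-template differences is straightforward once the homologous priming site has been extracted from a mitochondrial genome. The following sketch, offered only as an illustration, compares a primer with a site of equal length and reports the mismatch positions. The LCO1490 sequence is taken from Material and Methods; the template site shown is invented for the example, and the function name is hypothetical. For a reverse primer, the site would first have to be given in the primer's orientation (reverse complement of the light strand).

```python
def primer_mismatches(primer, site):
    """Return 0-based positions at which primer and priming site differ.

    Spaces and hyphens (as used when primers are printed in triplets)
    are ignored. Both sequences must have the same length and the same
    5'-3' orientation.
    """
    def clean(seq):
        return seq.replace(" ", "").replace("-", "").upper()

    primer, site = clean(primer), clean(site)
    if len(primer) != len(site):
        raise ValueError("primer and priming site must have equal length")
    return [i for i, (p, s) in enumerate(zip(primer, site)) if p != s]


if __name__ == "__main__":
    LCO1490 = "GGT CAA CAA ATC ATA AAG ATA TTG G"
    # Invented template site with two substitutions, for illustration only.
    hypothetical_site = "GGTCAGCAAATCATAAAGATCTTGG"
    positions = primer_mismatches(LCO1490, hypothetical_site)
    print(len(positions), "mismatches at positions", positions)
```

Because mismatches near the 3' end of a primer tend to be the most harmful for amplification, the positions are returned rather than only their count.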

In contrast, the 16S priming sites were remarkably constant both among amphibians and among other vertebrates (Figures 6.1 and 6.2). A wider survey of priming sites, i.e., the alternative reverse priming sites used in arthropod and bird studies (Hebert et al. 2003a; Hebert et al. 2004b), confirmed the high variability of COI in amphibians, and in vertebrates in general (Figure 6.2). A screening of the first 800 bp of the C-terminal part of the gene in nine amphibians of which complete mitochondrial genes were available did not reveal a single fragment of 20 bp where all nine species would agree in 80% or more of their nucleotides.
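The screening for conserved 20-bp stretches can be sketched as a sliding-window scan over an alignment. The code below is one possible reading of the criterion, under the assumption that "agree" means a column is identical in all sequences and that a window qualifies when at least 80% of its columns are identical; the original screening procedure is not described in more detail, so this is an interpretation rather than the method actually used. The function name and toy alignment are hypothetical.

```python
def conserved_windows(alignment, window=20, min_fraction=0.8):
    """Return start positions of windows that could host a shared primer.

    `alignment` is a list of equal-length aligned sequences. A column
    counts as conserved when every sequence carries the same unambiguous
    base (A, C, G or T) at that position. A window is reported when the
    fraction of conserved columns reaches `min_fraction`.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")

    conserved = []
    for i in range(length):
        column = {seq[i].upper() for seq in alignment}
        conserved.append(len(column) == 1 and column <= set("ACGT"))

    hits = []
    for start in range(length - window + 1):
        n_conserved = sum(conserved[start:start + window])
        if n_conserved / window >= min_fraction:
            hits.append(start)
    return hits


if __name__ == "__main__":
    # Toy alignment of three sequences; real input would be nine genomes.
    toy = ["ACGTACGTAC", "ACGTACGTTC", "TCGTACGTAC"]
    print(conserved_windows(toy, window=5, min_fraction=1.0))
```

An empty result for a given gene region corresponds to the situation reported above for COI: no window in which a single primer would match all species well.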

Recovery of major groups

The phenetic neighbor-joining analysis using the 16S fragment produced a tree that contained eight major groupings that conform to or are congruent with the current classification and phylogeny (Figure 6.3): cartilaginous fishes, salamanders, frogs, turtles, eutherian mammals, mammals, squamates, birds. Of these, the COI tree (Figure 6.4) recovered only the lineages of cartilaginous fishes and birds. No additional relevant major lineage was found in the COI analysis that had not been recovered also by the 16S analysis.

Figure 6.3. Neighbor-joining tree of selected vertebrate taxa based on the fragment of the 16S rRNA gene amplified by primers 16SaL and 16SbH. Numbers in black circles indicate major clades that were recovered by this analysis: (1) cartilaginous fishes, (2) salamanders, (3) frogs, (4) turtles, (5) eutherian mammals, (6) mammals, (7) squamates, (8) birds.

Figure 6.4. Neighbor-joining tree of selected vertebrate taxa based on the fragment of the COI gene amplified by primers LCO1490 and HCO2198. Numbers in black circles indicate major clades that were recovered by this analysis. Only two of the clades recovered by the 16S analysis are also monophyletic here: (1) cartilaginous fishes, (8) birds.

16S rRNA barcoding of tadpoles

From an ongoing project involving the large-scale identification of tadpoles of Madagascan frogs (Thomas et al. 2005) we here provide data from larval and adult frog species from two sites of high anuran diversity in eastern Madagascar, Andasibe and Ranomafana. These two localities are separated by a geographical distance of ca. 250 km. The results will be presented in more detail elsewhere.

We selected target species for which morphological and bioacoustic uniformity suggests that populations from Ranomafana and Andasibe are conspecific. All these species belong to the family Mantellidae. We then analysed haplotypes within and between these populations. In addition we assessed divergences among sibling species of mantellid frogs. These were defined as morphologically similar species that are phylogenetically sister to each other, or are in well-defined but phylogenetically poorly resolved clades of 3-5 species. Results revealed a low intrapopulational variation of 0-3% uncorrected pairwise distances in the 16S gene, a surprisingly large differentiation among conspecific populations of 0-5.1%, and a wide range of differentiation among species, ranging from 1-16.5% with a mode at 7-9% (Figure 6.5). The few species separated by low genetic distances were allopatrically distributed. The interspecific divergence was higher in those species pairs in which syntopic occurrence has been recorded or is likely (2.7-16.5% divergence, mean 8.5%) as compared to those that so far only have been found in allopatry (1.0-12.9%, mean 6.9%).
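The three divergence categories reported above can be made concrete with a small sketch. It computes uncorrected p-distances (the proportion of differing sites among sites compared) for all pairs of aligned sequences and sorts them into "within population", "among conspecific populations" and "among species". This is a simplification made for illustration: the study restricted interspecific comparisons to sibling species and used one mean value per species in the first two categories, which the sketch does not do. All names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Uncorrected p-distance, skipping sites with gaps or ambiguity codes."""
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def divergence_ranges(records):
    """Summarise pairwise distances by comparison category.

    `records` is a list of (species, population, aligned_sequence).
    Returns {category: (minimum, maximum)} for every non-empty category.
    """
    buckets = {
        "within_population": [],
        "among_populations": [],
        "among_species": [],
    }
    for (sp1, pop1, s1), (sp2, pop2, s2) in combinations(records, 2):
        if sp1 != sp2:
            key = "among_species"
        elif pop1 != pop2:
            key = "among_populations"
        else:
            key = "within_population"
        buckets[key].append(p_distance(s1, s2))
    return {key: (min(v), max(v)) for key, v in buckets.items() if v}


if __name__ == "__main__":
    toy_records = [
        ("species_1", "Andasibe", "ACGTACGTAC"),
        ("species_1", "Andasibe", "ACGTACGTAC"),
        ("species_1", "Ranomafana", "ACGTACGTAT"),
        ("species_2", "Andasibe", "ACGAACGAAT"),
    ]
    for category, (low, high) in divergence_ranges(toy_records).items():
        print(f"{category}: {100 * low:.0f}-{100 * high:.0f}%")
```

Overlap between the ranges of the second and third categories is what makes a fixed divergence threshold unreliable for delimiting species.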

Phylogenetic and phenetic analyses (Bayesian and Neighbor-joining) of these and many additional sequences (to be published elsewhere) mostly grouped sequences of those specimens from Ranomafana and Andasibe that a priori had been considered to be conspecific (exceptions were Mantidactylus boulengeri, not considered in the intraspecific comparisons here, and M. blommersae). This indicates that cases in which haplotypes of a species are more similar to those of another species than to those of other conspecific individuals or populations are rare in these frogs. Sharing of identical haplotypes among individuals belonging to different species, in our dataset, was limited to three closely related species pairs of low genetic divergences: Boophis doulioti and B. tephraeomystax, B. goudoti and B. cf. periegetes, Mantella aurantiaca and M. crocea.

Depending on the taxonomic scheme employed, our complete data set contains 200-300 species of Madagascan frogs. Hence, haplotype sharing was demonstrated in 2-3% of the total number of species only.
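The 2-3% figure follows from simple arithmetic: three species pairs involve six species, which are set against a total of 200 to 300 species. A minimal sketch of that calculation, with a hypothetical function name, reads:

```python
# Species pairs sharing identical haplotypes, as listed in the Results.
SHARING_PAIRS = [
    ("Boophis doulioti", "Boophis tephraeomystax"),
    ("Boophis goudoti", "Boophis cf. periegetes"),
    ("Mantella aurantiaca", "Mantella crocea"),
]


def percent_sharing(pairs, total_species):
    """Percentage of species involved in at least one sharing pair."""
    involved = {name for pair in pairs for name in pair}
    return 100 * len(involved) / total_species


if __name__ == "__main__":
    for total in (200, 300):
        print(f"{total} species: {percent_sharing(SHARING_PAIRS, total):.0f}%")
```

With six species involved, the result is 3% for 200 species and 2% for 300 species.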

Figure 6.5. Variation in the fragment of the 16S rRNA gene (ca. 550 bp) studied herein, (a) within populations, (b) among conspecific populations and (c) among sibling species of frogs in the family Mantellidae from Madagascar. The values are uncorrected p-distances from pairwise comparisons in the respective category. Only one (mean) value per species was used in (a) and (b), even when multiple individuals were compared.

To explore the reliability of tadpole identification using the 16S gene we used local BLAST searches against a database containing about 1000 sequences of adult frogs from a wide sampling all over Madagascar. 138 tadpoles from the Andasibe region and 84 tadpoles from the Ranomafana region were compared with adult sequences in the database. In 77% of the cases the highest scores were those from comparisons to adults from the same site as the tadpoles. In most of the unsuccessful comparisons, adult sequences of the corresponding species were not available from the tadpole site (21%). In only 5 cases (2%) conspecific adults collected from a different site than the tadpoles yielded higher BLAST scores although adult sequences from the same site were in the database.

6.5. DISCUSSION

DNA barcoding in amphibians

DNA barcoding has great appeal as a universally applicable tool for identification of species and variants of organisms, possibly even in automated handheld devices (Janzen 2004). However, severe restrictions to its universal applicability doubtless exist (Moritz and Cicero 2004). Some taxa, e.g., cichlid fishes of Lake Victoria, have radiated so rapidly that the speciation events have not left any traces in their mitochondrial genomes (Verheyen et al. 2003); identifying these species genetically will only be possible through the examination of multiple nuclear markers, as has been done to assess their phylogeny (Albertson et al. 1999). Some snails are characterized by a high intraspecific haplotype diversity, which could frustrate attempts to identify and distinguish among species using such markers (Thomaz et al. 1996).

Haplotype sharing due to incomplete lineage sorting or introgression is also known in amphibians (Funk and Omland 2003) although it was not common in mantellid frogs in our data set. However, a number of species showed haplotype sharing with other species, or non-monophyletic haplotypes, warranting a more extensive discussion. In Mantidactylus boulengeri, specimens from Andasibe and Ranomafana have similar advertisement calls and (at least superficially) similar morphologies, but their 16S haplotypes were not a monophyletic group (unpublished data). This species belongs to a group of direct-developing frogs that, like the Neotropical Eleutherodactylus (Dubois 2004) may be characterized by a high rate of cryptic speciation. Further data are necessary to decide whether the populations from Ranomafana and Andasibe are indeed conspecific. In contrast, there is little doubt that the populations of Mantidactylus blommersae from these two sites are conspecific, yet the Ranomafana haplotypes are closer to those of the clearly distinct species M. domerguei. The species pairs where haplotype sharing has been observed (see Results) all appear to be allopatrically to parapatrically distributed and show no or only low differences in advertisement calls, indicating that occasional hybridization along contact [...] pairs always formed highly supported clusters or clades, and had divergences below 3%, indicating that haplotype sharing in mantellids may only constitute a problem when individuals are to be assigned to such closely related sister species.

Although our data show that DNA barcoding in mantellids is a largely valid approach when both reference and test sequences come from the same site, the occurrence of non-monophyletic and highly divergent haplotypes within species characterizes these and other amphibians as a challenging group for this technique. Certainly, DNA barcoding is unable to provide a fully reliable species identification in amphibians, especially if reference sequences do not cover the entire genetic variability and geographic distribution of a species. However, the same is true for any other morphological or bioacoustic identification method. Case studies are needed to estimate more precisely the margin of error of molecular identification of amphibian species. For many approaches, such as the molecular survey of the trade in frog legs for human consumption (Veith et al. 2000), the error margins might be acceptable. In contrast, the broad overlap of intraspecific and interspecific divergences (Figure 6.5) cautions against simplistic diagnoses of presumably new amphibian species by DNA divergences alone. A large proportion of biological and evolutionary species will be missed by inventories that characterize candidate species by DNA divergences above a previously defined threshold.

Comparative performance of DNA barcoding markers in amphibians

Phenomena of haplotype sharing or non-monophyletic conspecific haplotypes will affect any DNA barcoding approach that uses mitochondrial genes, and are also to be expected with nuclear genes (e.g., Machado and Hey 2003). Nevertheless, some genes certainly outperform others in terms of discriminatory power and universal applicability, and these characteristics may also vary among organism groups. The mitochondria of plants are characterized by very different evolutionary patterns than those of animals, including frequent translocation of genetic material into and from the nucleus (Palmer et al. 2004), which limits their use for DNA barcoding purposes. Nuclear ribosomal DNA (18S and 28S), proposed as standard marker (Tautz et al. 2003), has a high potential in invertebrate DNA barcoding but its high-throughput amplification encounters difficulties in vertebrates.

As a consequence, despite the need of consensus on markers for universal applicability of DNA barcoding, the use of different genes in different groups of organisms seems reasonable. It has been hypothesized that universal COI primers may enable amplification of a 5' terminal fragment from representatives of most animal phyla due to their robustness (Hebert et al. 2003a). The success in DNA barcoding of lepidopterans and birds suggests that this gene fragment can indeed be used as a standard for many higher animal taxa (Hebert et al. 2003a; Hebert et al. 2004a; Hebert et al. 2004b).

In our experiments we compared 16S primers commonly used in amphibians to COI primers that had been developed for other vertebrates (Hebert et al. 2004b) or invertebrates (Hebert et al. 2003a). It may well be possible, with some effort, to design primers that are more successful and consistent in amplifying COI from amphibians. However, our results from mantellid frogs (Appendix 6.1) indicate a very good amplification success of the primers for some species, but failure for other, related species. This and our results on variability of priming sites predict enormous difficulties in designing one pair of primers that will reliably amplify this gene fragment in all vertebrates, all amphibians, or even all representatives of any amphibian order. A set of one forward and three reverse COI primers has been successfully used to amplify and sequence a large number of bird species (Hebert et al. 2004b), but birds are a much younger clade than amphibians with a probably lower mitochondrial variability.

A further optimization of COI amplification may also be achieved regarding the PCR protocol. Herein we used standard protocols that optimized annealing temperature only, whereas more complex touchdown protocols have been used for birds and butterflies (Hebert et al. 2004a; Hebert et al. 2004b). However, one major requirement for a DNA barcoding marker is its robustness to variable lab conditions. If DNA barcoding is to be applied as a standard in many different labs, primers and genes need to be chosen that amplify reliably under very different conditions and under standard protocols. This clearly applies to 16S, which we have amplified with very different annealing temperatures and PCR conditions in previous exploratory studies (results not shown).

Alignment of 16S sequences is complicated by the prevalence of insertions and deletions, and this gene is less variable than COI (Hebert et al. 2003a). Nevertheless, [...] higher potential than COI to assign vertebrate sequences to the level of classes and orders.

The 16S gene is a highly conserved mitochondrial marker but mutations are common in some variable regions, corresponding to loops in the ribosomal RNA structure. In amphibians, where many species are relatively old entities (Wilson et al. 1974), this ensures a sufficient number of mutations among species. Our results for amphibians, and previous experience with fishes, reptiles and mammals, indicate that 16S is sufficiently variable to unambiguously identify most species.

A further mitochondrial gene that has been widely used in amphibian phylogenetic and phylogeographic studies is cytochrome b. This gene can easily be amplified in salamanders and archaeobatrachian frogs using primers that anneal with adjacent tRNA genes. However, neobatrachian frogs (the vast majority of amphibian species) are characterized by rearrangements of the mitochondrial genome (Macey et al. 1997; Sumida et al. 2001), and cytochrome b in these species directly borders the control region. Although cytochrome b primers are available that work in many neobatrachians (Bossuyt and Milinkovitch 2000; Vences et al. 2003), they sometimes fail in closely related forms, similar to the COI primers used herein, presumably because of mutations at the priming sites (pers. obs. in mantellid frogs).

In contrast, the 16S primer pair used here can be considered as truly universal not only for amphibians but even for vertebrates. This is also reflected by the high number of amphibian 16S sequences in GenBank (2620 hits for 16S vs. 483 hits for COI, as of September 2004). Moreover, the 16S and 12S rRNA genes have been selected as standard markers for phylogeny reconstruction in amphibians (AmphibiaTree consortium), which will lead to a near-complete global dataset of amphibian 16S sequences in the near future. If the development of handheld devices (Janzen 2004) is envisaged as a realistic goal, then the universality and robustness of primers should be among the most relevant characteristics of a gene for DNA barcoding. When pooled samples containing representatives of various higher vertebrate taxa are to be analysed, the risk of false negatives strongly increases with decreasing universality of primers. As a consequence we recommend the use of 16S as additional standard DNA barcoding marker for vertebrates, especially for but not limited to applications that involve pooled samples.

ACKNOWLEDGEMENTS

For comments, technical help and/or discussions we are grateful to Paul D. N. Hebert, Axel Meyer, Dirk Steinke, Diethard Tautz and David B. Wake. We are further indebted to Simone Hoegg, Pablo Orozco and Mario Vargas who provided help in the lab, and to the Madagascan authorities for research permits. The DNA barcoding project of Madagascan tadpoles was supported by a grant of the Volkswagen foundation to M Vences and to F Glaw. DR Vieites was supported by the AmphibiaTree project (NSF grant EF-0334939).

CHAPTER 7

Phylogenetic performance of the Cytochrome b gene: case study of mantellid frogs

Boophis luteus Photo: Franco Andreone

7.1. ABSTRACT

Mitochondrial markers have been widely used for phylogenetic purposes, but not all genes are equally appropriate for the resolution of relationships at different levels of divergence. Here, we investigate the phylogenetic utility of the cytochrome b gene in a densely sampled taxon, the Malagasy frogs of the family Mantellidae. The phylogeny of mantellid frogs has been previously resolved using extensive molecular and morphological data sets in a smaller but representative subset of taxa, enabling us to compare the performance of the cytochrome b gene with that of the 16S rRNA gene in recovering basal relationships of the “true” phylogeny. Mantellids show high divergences in cytochrome b, with uncorrected pairwise distances of up to 30% within genera and up to 34% among genera. As expected, third codon positions of the gene fragment studied show a strong signature of saturation. Nevertheless, exclusion of third codon positions resulted in considerably worse trees in which several otherwise well-established clades were no longer recovered. Partitioned Bayesian analysis with a different model of evolution for each codon position performed best in resolving the trade-off between the noise created by saturation at third positions and the loss of phylogenetic information that their removal would entail. Several novel phylogenetic placements within the mantellid genus Mantidactylus are suggested; despite the lack of statistical support, these agree with previous or unpublished results from other genes, or make sense in the light of recent natural history observations. Three species of the subgenus Blommersia are placed far from this clade: Mantidactylus bertini at a rather basal position in the Mantellinae, Mantidactylus madinika sister to Mantella, and Mantidactylus argenteus sister to Ochthomantis. Mantidactylus liber (subgenus Guibemantis) is confirmed as being closer to the subgenus Pandanusicola.

7.2. INTRODUCTION

The early publication of universal PCR-primers (e.g., Kocher et al. 1989) and the reduced effective population size of one-fourth compared to nuclear-autosomal genes (Moore 1995) contributed to making mitochondrial genes the most widely used markers for phylogenetic, systematic and population genetic studies and for resolving taxonomic problems more generally. However, studies on a variety of organisms have shown that not all mitochondrial genes are suitable for recovering the “true” tree and resolving phylogenetic relationships at the same divergence level (e.g., Graybeal 1994; Russo et al. 1996; Springer et al. 2001; Zardoya and Meyer 1996). Especially in frogs, short fragments of the cytochrome b, 12S and 16S rRNA genes provide little information on deep divergences (Graybeal 1993; Hertwig et al. 2004), particularly if taxon sampling is incomplete. Cummings et al. (1995) found that of the set of 37 mitochondrial genes in ten vertebrates analysed in an effort to compare the performance of each gene relative to the whole genome in phylogenetic analysis, the trees inferred from single genes were largely concordant with the whole-genome tree for two widely used genes, cytochrome b and 16S rRNA. The 16S rRNA gene trees obtained by ML and NJ analyses were identical to the whole-genome tree, while only the cytochrome b gene obtained with the ML analysis reflected the whole-genome tree. The cytochrome b gene in particular has therefore enjoyed a long and successful history of being used for many phylogenetic issues.

Graybeal (1993) observed divergence among frog taxa in the rate of evolution of cytochrome b, and concluded that the phylogenetic utility of this gene cannot be predicted a priori. For protein coding genes the rate of evolution is generally limited by the protein function. The structure of the cytochrome b protein is well known, with the intermembrane domain (composed of four loops), the transmembrane domain (composed of eight α-helices) and the matrix domain (composed of the amino and carboxyl termini and three loops) (Degli Esposti et al. 1993). In one example from birds (Falconidae), the intermembrane and matrix regions of the protein show a higher level of saturation at the nucleotide level than the transmembrane portion (Griffiths 1997).

In Malagasy frogs of the family Mantellidae, the cytochrome b gene has been successfully used at shallow divergence levels in population genetics, and to resolve the phylogeny of closely related species (Chiari et al. 2004; Vences et al. 2004). The family Mantellidae as defined by Vences and Glaw (2001) includes a lineage of frogs endemic to Madagascar and the Comoro Islands that is distinct from its probable sister group, the Asian-African family Rhacophoridae (Bossuyt and Milinkovitch 2000; Richards and Moore 1998; Richards et al. 2000; Vences et al. 2000a). Mantellids, a family that originated approximately 70 mya, contain the genera Aglyptodactylus, Boophis, Laliostoma, Mantella and Mantidactylus, assigned to three subfamilies, Mantellinae, Boophinae and Laliostominae. The genus Mantidactylus is paraphyletic (Richards et al. 2000; Vences et al. 2003) and subdivided into 12 subgenera (Glaw and Vences 1994) based on differences in morphology, ecology and reproductive behaviour.

Here, we investigate the phylogenetic relationships among mantellids using a fragment of cytochrome b and a dense taxon sampling (112 of the 158 mantellid species known), addressing the potential and problems of this gene and their effect on phylogenetic reconstruction. We also compare the phylogenetic performance of short fragments of the cytochrome b gene with that of the most widely used marker in frog phylogenetics, the 16S rRNA gene. Our results suggest that the cytochrome b gene performs better than the more conserved part of the 16S rRNA gene in recovering the “true” tree. The cytochrome b gene should therefore be given further consideration in amphibian studies, not only at the population genetic level but also as an informative marker in frog phylogenetics.

7.3. MATERIALS AND METHODS

Available data sets

As a basis for assessing the phylogenetic performance of cytochrome b and analyzing the molecular evolution of this gene in mantellid frogs, we studied three separate data sets:

Dataset 1: Fragments of 551 bp of the cytochrome b gene as amplified via the polymerase chain reaction (PCR) using the primers Cytb-c and CBJ10933 from Bossuyt and Milinkovitch (2000), from a total of 196 specimens belonging to all five mantellid genera and representing 112 of the 158 mantellid species known (two species of Aglyptodactylus, 34 species of Boophis, 1 species of Laliostoma, 15 species of Mantella, 60 species of Mantidactylus; Glaw and Vences 2003). We included in our analyses sequences of eleven specimens belonging to the genera Paradoxophyla, Platypelis, Plethodontohyla, Rhombophryne, Scaphiophryne and Stumpffia (family

(23)

consists of 183 amino acids (from amino acid 144 to amino acid 327). Based on the model of the cytochrome b protein (Degli Esposti et al. 1993) the used fragment spans all the three above-mentioned regions of the mitochondrial membrane.

Dataset 2: Fragments of cytochrome b of 356 specimens of Mantella, largely homologous to the fragment of dataset 1 but slightly shorter (502 bp), obtained in previous studies (Chiari et al. 2004; Chiari et al. in press; Vieites et al. in press). These were collapsed into 122 unique sequences using Collapse v 1.2 (Posada 2004) and were used in our study to compare the saturation of the gene among mantellids at the shallower level of closely related species.
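The collapsing step can be pictured with a minimal Python sketch. It is not the Collapse program itself: the function name and the toy sequences are hypothetical, and it simply groups specimens whose aligned sequences are identical, without any special treatment of missing or ambiguous data.

```python
def collapse_haplotypes(sequences):
    """Group specimens that share an identical aligned sequence.

    `sequences` maps a specimen label to its sequence string.
    Returns a dict mapping each unique sequence to the list of
    specimen labels that carry it (insertion order is preserved).
    """
    haplotypes = {}
    for label, seq in sequences.items():
        # Compare case-insensitively so "atg" and "ATG" count as identical.
        haplotypes.setdefault(seq.upper(), []).append(label)
    return haplotypes


# Hypothetical toy alignment (not data from this study).
toy = {
    "specimen_1": "ATGCCA",
    "specimen_2": "ATGCCA",
    "specimen_3": "ATGTCA",
    "specimen_4": "atgcca",
}
unique = collapse_haplotypes(toy)
print(len(toy), "specimens collapse into", len(unique), "unique sequences")
```

Keeping the specimen labels attached to each unique sequence makes it possible to trace every haplotype back to the individuals it represents.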

Dataset 3: Sequences of fragments of up to 360 bp of the 16S rRNA gene as amplified using the primers 16SAr-L and 16SBr-H of Palumbi et al. (1991) were available for a subset of 177 mantellid specimens out of the 196 in dataset 1. Further sequences of five species of microhylids to be used as outgroup were downloaded from GenBank (Accession numbers: Stumpffia AY594124, Scaphiophryne AY594127, Rhombophryne AY594125, Plethodontohyla AY594118, Platypelis AY594103). The complete fragment was used for calculation of pairwise distances, whereas hypervariable and gapped sites of the alignment were excluded for phylogenetic analysis (see below).

Molecular protocols

Total genomic DNA was extracted from toe clips fixed in 99% ethanol using a proteinase K digestion (final concentration 1 mg/mL). DNA was isolated by a standard salt extraction protocol (Bruford et al. 1992). Representative voucher specimens are stored in the collections of the Zoological Museum Amsterdam, Netherlands, and the Zoologische Staatssammlung München, Germany.

To amplify the cytochrome b and 16S rRNA fragments, polymerase chain reactions (PCRs) were performed in 25 µL reactions containing 1.0 unit of REDTaq DNA Polymerase (Sigma, Taufkirchen, Germany), 50 ng genomic DNA, 10 pmol of each primer, 15 nmol of each dNTP, 50 nmol additional MgCl2 and the REDTaq PCR reaction buffer (10 mM Tris-HCl, pH 8.3, 50 mM KCl, 1.1 mM MgCl2 and 0.01% gelatine) using the following conditions: an initial denaturation at 94°C for 1:30 min; 35 cycles at 94°C for 30 seconds, primer-specific annealing temperature (53°C for cytochrome b and 55°C for 16S rRNA) for 45 seconds, extension at 72°C for 1:30 min; final extension of 10:00 min at 72°C.


PCR products were checked on 1% agarose gels and purified using QIAquick spin columns (Qiagen) prior to cycle sequencing. Sequence data collection and visualisation were performed on an ABI 3100 automated sequencer. Sequencing reactions were prepared according to the manufacturer's instructions, using the ABI sequence mix (BigDye® Terminator V3.1 Sequencing Standard, Applied Biosystems).

Newly obtained sequences were deposited in GenBank; accession numbers: ####-#### [to be added upon manuscript acceptance].

Data analysis

Unambiguous alignment of the cytochrome b sequences was possible by eye, as they contained no indels. Sequences were verified and aligned with Sequence Navigator (Applied Biosystems) software. The software TaxI (Steinke et al. 2005) was used to pairwise align the complete 16S rRNA fragments and calculate pairwise divergences. For phylogenetic analysis the 16S rRNA sequences were aligned using the "Clustal" option in the program Sequence Navigator (Applied Biosystems). All hypervariable and gapped regions were excluded, leaving fragments of 360 bp for phylogenetic analyses.

We estimated synonymous and nonsynonymous mutation rates in pairwise comparisons using the program PAML version 3.14 (Yang 1997), following the method of Yang and Nielsen (2000).

To evaluate the degree of saturation of the cytochrome b gene in datasets 1 and 2, uncorrected p-distances of each codon position were plotted against the uncorrected p-distance of the entire fragment. Total uncorrected p-distances of the cytochrome b gene were plotted against the absolute numbers of both transitions (Ts) and transversions (Tv). For the 176 specimens of dataset 3, fragments of both the cytochrome b and 16S rRNA genes were available. Pairwise uncorrected p-distances, and corrected K2P distances for the cytochrome b gene among congeneric species, were calculated using the software TaxI (Steinke et al. 2005). Pairwise uncorrected p-distances of each codon position of the cytochrome b gene and of the entire gene were plotted against the pairwise distance of the 16S rRNA fragments, using the software TaxI (Steinke et al. 2005). We largely used uncorrected distances to avoid any bias that could originate from correcting partitions of possibly different molecular evolution (cytochrome b vs. 16S rRNA, and each of the three codon positions) arbitrarily with the same model.
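To make the two distance measures concrete, the following Python sketch computes an uncorrected p-distance and a Kimura two-parameter (K2P) distance for one pair of aligned sequences. It does not reproduce TaxI; the function names and the toy sequences are hypothetical, and gaps or ambiguity codes are simply skipped, which is an assumption of this sketch.

```python
import math

PURINES = {"A", "G"}
BASES = {"A", "C", "G", "T"}


def count_differences(seq1, seq2):
    """Return (compared sites, transitions, transversions)."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue  # skip gaps and ambiguity codes
        n += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine is a transition.
        if (a in PURINES) == (b in PURINES):
            ts += 1
        else:
            tv += 1
    return n, ts, tv


def p_distance(seq1, seq2):
    """Uncorrected proportion of differing sites."""
    n, ts, tv = count_differences(seq1, seq2)
    return (ts + tv) / n


def k2p_distance(seq1, seq2):
    """K2P distance: d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)),
    with P and Q the proportions of transitions and transversions."""
    n, ts, tv = count_differences(seq1, seq2)
    p, q = ts / n, tv / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Hypothetical toy pair: one transition (A/G) and one transversion (C/A).
s1 = "ACGTACGTAC"
s2 = "ACGTGCGTAA"
p = p_distance(s1, s2)
k2p = k2p_distance(s1, s2)
print(f"p-distance = {p:.3f}, K2P = {k2p:.3f}")
```

The corrected value is always at least as large as the uncorrected one, and the gap between them widens as divergence increases, which is why corrected and uncorrected figures for the same comparison can differ substantially.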

(25)

(Swofford 2002), using the heuristic search option with tree-bisection-reconnection (TBR) branch swapping and ten random addition sequence replicates. Maximum likelihood (ML) analyses were carried out using PHYML (Guindon and Gascuel 2003) following substitution model parameter estimation using the AIC criterion implemented in Modeltest version 3.06 (Posada and Crandall 1998). Five hundred bootstrap replicates were computed to evaluate the stability of nodes. Bayesian posterior probabilities were calculated using MrBayes, version 3b4 (Huelsenbeck and Ronquist 2001). We ran 1,000,000 generations, sampled every tenth tree, and set the "burn in" parameter at 0.5% based on empirical evaluation.

Phylogenetic trees were separately computed for cytochrome b and 16S rRNA (datasets 1 and 3). For cytochrome b, we calculated trees based on (1) the complete fragment and (2) after removal of third positions. Unpartitioned Bayesian analyses were carried out for datasets 1 and 3, and a partitioned Bayesian analysis (with separate models of evolution for each codon position) was run for dataset 1.

Due to the many closely related taxa in the dataset, MP analyses were stopped after 10^10 rearrangements for the analysis including only first and second codon positions and after 10^9 rearrangements for the analysis including all three positions, and a strict consensus tree was calculated from the retained trees.

7.4. RESULTS

Saturation Analysis

The plots (Figures 7.1 A-D) show the faster evolution of the cytochrome b fragment compared with 16S rRNA. This is mainly caused by the high substitution rate in the third codon position of cytochrome b, while the first codon position evolves as fast as the 16S rRNA fragment (all positions including hypervariable regions), and the second evolves slower.


Figure 7.1. Plots of (A) pairwise uncorrected p-distances of 16S rRNA vs. cytochrome b; (B) pairwise uncorrected p-distances 16S vs. first position of cytochrome b; (C) pairwise uncorrected p-distances 16S vs. second position of cytochrome b; (D) pairwise uncorrected p-distances 16S vs. third position of cytochrome b. All comparisons refer to datasets 1 and 3.

The uncorrected p-distances (in percent) among mantellids for the cytochrome b gene are quite high, ranging up to 18% within the genus Aglyptodactylus, up to 30% within the genus Boophis, up to 5% within the species Laliostoma labrosum, up to 19% within the genus Mantella and up to 30% within the genus Mantidactylus, and range from 17-34% among different genera. The average uncorrected sequence divergence is 24% among mantellids (dataset 1) and 10% among the sequences of Mantella (dataset 2) considered in this study (Table 7.1). Distances after applying the K2P correction, for congeneric individuals, reach (in percent) 38% in Aglyptodactylus, 45% in Boophis, 37% in Laliostoma labrosum, 46% in Mantidactylus and 39% in Mantella, with an average of 29%, which is higher than the value obtained by Johns and Avise (1998, see Figs. 3, 5 therein) for congeneric species and confamilial genera of amphibians.


Genus             Cytochrome b   16S rRNA
Aglyptodactylus   0-18%          0-10%
Boophis           0-30%          0-18%
Laliostoma        1-5%           0-0.6%
Mantella          0-19%          0-10%
Mantidactylus     0-30%          0-22%
Between genera    17-34%         10-24%
Average           24%            15%

Table 7.1. Uncorrected p-distances within each mantellid genus for the cytochrome b and the 16S rRNA genes. Because of partly unresolved taxonomy at the species level, the values summarize intraspecific and interspecific distances.

The uncorrected p-distances for the 16S rRNA gene (dataset 3) reach 10% within the genera Aglyptodactylus and Mantella, 18% in Boophis, 0.6% in Laliostoma labrosum, 22% within the genus Mantidactylus, and range from 10-24% among different mantellid genera. The average uncorrected sequence divergence is 15% among mantellids (dataset 1) and 5% among the sequences of Mantella (dataset 2) compared (Table 7.1, Figure 7.2).

Figure 7.2. Box plot of the uncorrected p-distances (in percent; y-axis) for each mantellid genus (x-axis) for the cytochrome b gene (black) and the 16S rRNA gene fragment (grey).

Already the results of Graybeal (1993) indicate that plain nucleotide distances are not always a reliable indicator of the level of saturation, but the high divergences in cytochrome b sequences suggest saturation at the third codon position (Meyer 1994).


Indeed, among mantellids (dataset 1), the cytochrome b gene is highly saturated at its third, but not at the first and second codon positions (Figures 7.3 A, B, C).

Figure 7.3. Plots of pairwise sequence divergences (in percent) in the cytochrome b gene of mantellid frogs: (A) total uncorrected p-distances vs. uncorrected p-distances at first codon positions; (B) total uncorrected p-distances vs. uncorrected p-distances at second positions; (C) total uncorrected p-distances vs. uncorrected p-distances at third positions; (D) total uncorrected p-distances vs. uncorrected p-distances at third positions in Mantella (dataset 2); (E) total uncorrected p-distances vs. absolute numbers of transitions (in percent) at third positions only; (F) total uncorrected p-distances vs. absolute numbers of transversions (in percent) at third positions only. All comparisons except for D refer to dataset 1.


In contrast, when only values among the closely related species of the genus Mantella (dataset 2) are analysed, cytochrome b appears not to be saturated at the third codon position (Figure 7.3 D). As expected, saturation at the third codon position is caused by saturation of transitions, whereas transversions have not yet reached saturation (Figures 7.3 E, F, but see also Table 7.3, since the skew in base composition determines the maximum sequence divergence at which saturation is reached).
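The data behind such saturation plots can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and toy alignment are hypothetical, the alignment is assumed to start at a first codon position, and gaps or ambiguity codes are skipped. For every pair of sequences it returns the total uncorrected p-distance together with the proportions of transitions and transversions at third codon positions.

```python
from itertools import combinations

PURINES = {"A", "G"}
BASES = {"A", "C", "G", "T"}


def codon_position(seq, position):
    """Sites at codon position 1, 2 or 3, assuming the alignment
    starts at a first codon position (an assumption of this sketch)."""
    return seq[position - 1::3]


def ts_tv_proportions(seq1, seq2):
    """Proportions of transitions and transversions among compared sites."""
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue  # skip gaps and ambiguity codes
        n += 1
        if a != b:
            if (a in PURINES) == (b in PURINES):
                ts += 1
            else:
                tv += 1
    return ts / n, tv / n


def saturation_points(alignment):
    """One (total p-distance, third-position Ts, third-position Tv)
    tuple per sequence pair, ready to be plotted."""
    points = []
    for (_, s1), (_, s2) in combinations(alignment.items(), 2):
        ts_all, tv_all = ts_tv_proportions(s1, s2)
        ts3, tv3 = ts_tv_proportions(codon_position(s1, 3),
                                     codon_position(s2, 3))
        points.append((ts_all + tv_all, ts3, tv3))
    return points


# Hypothetical toy alignment of three codons (not data from this study).
toy = {"a": "ATGGCATTC", "b": "ATGGCGTTT", "c": "ATGGCCTTA"}
points = saturation_points(toy)
for total, ts3, tv3 in points:
    print(f"total p = {total:.3f}  Ts(3rd) = {ts3:.3f}  Tv(3rd) = {tv3:.3f}")
```

In a plot of such points, a curve that rises linearly indicates an unsaturated class of substitutions, whereas a curve that levels off while total divergence keeps growing indicates that multiple hits are masking further change.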

The ratio between nonsynonymous and synonymous mutations (dN/dS) in pairwise comparisons among mantellids (dataset 1) ranges from 0 to 0.108, indicating that the majority of mutations are synonymous (data not shown). The highest value was observed in only one case, and the observed dN/dS ratio in general supports the assumption of neutral evolution of the cytochrome b gene although it may not be sensitive enough to detect subtle molecular adaptation (McClellan et al. 2004).

As previously observed in other vertebrates (Graybeal 1993; Farias et al. 2001; Meyer 1993; Saccone et al. 1999; Santucci et al. 1998), there is a tendency towards a low content of guanine at the second (G = 15%) and third position (G = 7%). First codon positions are relatively homogeneous in terms of base composition, but thymine is favoured at the second codon position, as already observed in birds (Friesen et al. 1993; Groth 1998; Krajewski and King 1996). The third codon positions favour cytosine (Table 7.2). Variation among genera in the degree of base compositional bias is not significant, except in Mantella and Laliostoma. The Chi-Square Test implemented in PAUP* does not reject the hypothesis of homogeneous base composition within Mantellidae.

             Codon position
Nucleotide   First   Second   Third
A            0.23    0.23     0.29
T            0.27    0.39     0.30
C            0.23    0.23     0.34
G            0.27    0.15     0.07

Table 7.2. Average nucleotide frequencies in cytochrome b sequences of mantellids calculated with the program PAML.
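How frequencies of this kind are tabulated can be illustrated with a short Python sketch. It is not the PAML calculation: the function name and toy sequences are hypothetical, bases are pooled over all sequences rather than averaged per sequence, and the reading frame is assumed to start at the first site. The second part simply averages the three columns of Table 7.2 to obtain the overall composition of the fragment.

```python
from collections import Counter


def base_frequencies_by_codon_position(sequences):
    """Return {codon position: {base: frequency}} pooled over sequences.

    Assumes every sequence starts at a first codon position; characters
    other than A, C, G, T are ignored.
    """
    result = {}
    for position in (1, 2, 3):
        counts = Counter()
        for seq in sequences:
            sites = seq.upper()[position - 1::3]
            counts.update(b for b in sites if b in "ACGT")
        total = sum(counts.values())
        result[position] = {b: counts[b] / total for b in "ATCG"}
    return result


# Hypothetical toy sequences (not data from this study).
freqs = base_frequencies_by_codon_position(["ATGACG", "ATGACC"])
print(freqs[3])

# Values copied from Table 7.2 (first, second, third codon position).
table_7_2 = {
    "A": (0.23, 0.23, 0.29),
    "T": (0.27, 0.39, 0.30),
    "C": (0.23, 0.23, 0.34),
    "G": (0.27, 0.15, 0.07),
}
# Mean over the three positions gives the composition of the whole fragment.
overall = {base: sum(values) / 3 for base, values in table_7_2.items()}
for base, value in overall.items():
    print(f"{base}: {value:.0%}")
```

Because each codon position contributes the same number of sites, the unweighted mean of the three columns equals the composition of the complete fragment.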

The three samples of Laliostoma labrosum have a higher content of guanine at the third position than at the second, and at the third position there is a bias against adenine (Table 7.3). In the genus Mantella we found at the third codon position a particularly low (<2%) content of guanine and a bias favouring thymine instead of cytosine as in the other genera (Table 7.3). The mean base composition for the cytochrome b gene is 32% thymine, 27% cytosine, 25% adenine and 16% guanine (Table 7.2).

                               Codon position
Genus             Nucleotide   First   Second   Third
Aglyptodactylus   A            0.23    0.23     0.26
                  T            0.28    0.39     0.14
                  C            0.22    0.23     0.50
                  G            0.27    0.15     0.10
Boophis           A            0.23    0.23     0.25
                  T            0.28    0.38     0.28
                  C            0.22    0.24     0.38
                  G            0.27    0.15     0.09
Laliostoma        A            0.23    0.23     0.15
                  T            0.23    0.38     0.28
                  C            0.27    0.23     0.37
                  G            0.27    0.15     0.19
Mantella          A            0.23    0.23     0.29
                  T            0.26    0.39     0.44
                  C            0.24    0.23     0.25
                  G            0.27    0.15     0.02
Mantidactylus     A            0.23    0.23     0.31
                  T            0.27    0.39     0.31
                  C            0.23    0.23     0.32
                  G            0.27    0.15     0.06

Table 7.3. Average nucleotide frequencies in each mantellid genus for cytochrome b sequences calculated with the program PAML.

Phylogenetic Analysis

The best fitting model selected by AIC in Modeltest for the cytochrome b fragment analysed (dataset 1) was the TrN+I+G substitution model with a gamma distribution shape parameter of 0.6452 and proportion of invariable sites of 0.3242. The GTR+I+G model with a gamma distribution shape parameter of 0.6755 and proportion of invariable sites of 0.4353 was selected for the data set after exclusion of third codon positions. The best models for the data sets of only first, second, and third positions, respectively, were a TVM+I+G model with a gamma distribution shape parameter of 0.7122 and proportion of invariable sites of 0.3094, the same model with a gamma distribution shape parameter of 0.7732 and proportion of invariable sites of 0.5362, and a GTR+G model with a gamma distribution shape parameter of 2.4384 and proportion of invariable sites of 0.

Of the total of 551 characters of the cytochrome b fragment analysed (dataset 1), 184 (33%) were invariant, 39 (7%) were variable but parsimony-uninformative and 328 (60%) were variable and parsimony-informative. After exclusion of third positions, of the total of 368 characters, 184 (50%) were constant, 39 (11%) variable but parsimony-uninformative and 145 (39%) variable and parsimony-informative.

The best fitting model selected for the 16S rRNA fragment analysed (dataset 3) was GTR+I+G with gamma shape parameter of 0.6510 and proportion of invariable sites of 0.2654. Of the total of 360 characters of the 16S rRNA fragment analysed for phylogenetic purpose, 147 (41%) were invariant, 43 (12%) were variable but parsimony-uninformative and 170 (47%) were variable and parsimony-informative.
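The three character classes reported above can be illustrated with a minimal Python sketch. The function name and toy alignment are hypothetical, and the sketch uses the usual definition of a parsimony-informative site (at least two character states, each present in at least two sequences) while ignoring gaps and ambiguity codes. The last lines recompute the percentages given in the text for the 551 cytochrome b characters.

```python
from collections import Counter


def classify_sites(alignment):
    """Count invariant, variable-but-uninformative and
    parsimony-informative columns of an alignment."""
    counts = {"invariant": 0, "uninformative": 0, "informative": 0}
    for column in zip(*(seq.upper() for seq in alignment)):
        states = Counter(b for b in column if b in "ACGT")
        if len(states) <= 1:
            counts["invariant"] += 1
        elif sum(1 for c in states.values() if c >= 2) >= 2:
            # At least two states occur in two or more sequences each.
            counts["informative"] += 1
        else:
            counts["uninformative"] += 1
    return counts


# Hypothetical toy alignment (not data from this study).
toy = ["ACGTA", "ACGTA", "ATGCA", "ATGCC"]
site_counts = classify_sites(toy)
print(site_counts)

# Percentages for the cytochrome b fragment, from the counts in the text.
reported = {"invariant": 184, "uninformative": 39, "informative": 328}
total = sum(reported.values())
percentages = {k: round(100 * v / total) for k, v in reported.items()}
print(total, percentages)
```

A site that is variable but where only one sequence differs from all others cannot favour one tree over another under parsimony, which is why it is counted separately from the informative sites.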

The tree obtained by partitioned Bayesian analysis of cytochrome b (dataset 1; Figure 7.4) agrees in many aspects with the highly supported topology of Vences et al. (2003) that was based on a much larger data set of mitochondrial and nuclear genes (1875 bp) in a reduced but representative set of taxa (47). In contrast to that analysis, the cytochrome b data herein do not support monophyly of the subfamilies Laliostominae and Boophinae: the genus Laliostoma clusters together with the brook-breeding clade of Boophis, and the genus Aglyptodactylus is placed as sister group of some species of pond-breeding Boophis, although without statistical support.

The subfamily Mantellinae is found to be monophyletic. Two species of the subgenus Guibemantis are not placed in the Guibemantis clade: Mantidactylus liber is placed with Pandanusicola and M. elegans with Spinomantis. Mantidactylus bertini (subgenus Blommersia) is placed basal to the clade containing the subgenus Blommersia and many other mantellines, and M. argenteus is placed with Ochthomantis. The tree from the partitioned Bayesian analysis of dataset 1 agrees in almost all aspects with the tree from the unpartitioned Bayesian analysis, but differs considerably from trees using only first and second positions, and from trees using all codon positions under the ML and the MP optimality criteria (Figure 7.5).


Figure 7.4. Tree of 191 mantellid specimens based on Bayesian partitioned analysis. One asterisk indicates Bayesian posterior probability 95-98%. Two asterisks indicate Bayesian posterior probability 99-100%. The absolute numbers indicate Bayesian posterior probabilities lower than 95%. Subgenera or genera are collapsed. The outgroup is not shown.


Figure 7.5. Collapsed phylogenetic trees of mantellids based on cytochrome b fragment: (A) ML; (B) Bayesian; (C) ML analysis after exclusion of third codon position; (D) Bayesian analysis after exclusion of third codon position; (E) MP; (F) MP analysis after exclusion of third codon position. The outgroup is not shown.


The exclusion of the third codon position from the cytochrome b data set to eliminate saturation negatively affects the robustness of the tree (Figure 7.5). Excluding this partition removes information at the genus and species-group level (shallow divergences) and creates polytomies (Figures 7.5 D, F). For example, the usually very well defined genus Mantella becomes paraphyletic (Figures 7.5 D, F), as do other well-established clades (see Discussion below). The 16S rRNA trees (ML and Bayesian) largely recover the monophyly of the subgenera and genera, although they do not support monophyly of Mantidactylus (Figures 7.6 A, B).

Figure 7.6. Collapsed phylogenetic trees of mantellids based on 16S rRNA fragment: (A) ML; (B) Bayesian. The outgroup is not shown.

The genus Boophis, which is paraphyletic based on cytochrome b, is monophyletic based on the 16S rRNA fragments considered, but the relationships among subgenera and genera in the majority of cases do not reflect the usually well-resolved relationships (Figures 7.6 A, B). The Bayesian analysis resolves a number of relationships better than ML, by (1) placing Blommersia with the Guibemantis/Pandanusicola clade instead of Laliostoma, (2) supporting monophyly of Mantella, (3) placing M. bertini with Spinomantis, and (4) placing the subgenera Gephyromantis, Laurentomantis and Phylacomantis as a monophyletic sister group of the stream-breeding Mantidactylus lineages. Altogether, cytochrome b performs better in resolving a tree than the 16S rRNA fragment analysed (dataset 3), and the latter marker performs slightly better in the Bayesian analysis than under the ML criterion (Figures 7.6 A, B).

7.5. DISCUSSION

Phylogeny of mantellid frogs

In recent years, mantellids have been the subject of a large number of phylogenetic analyses based on morphological (Vences et al. 2002b) and molecular datasets (Chiari et al. 2004; Lehtinen et al. 2004; Lehtinen and Nussbaum 2003; Richards et al. 2000; Vences et al. 2000a, 2002a, 2003). Very large molecular datasets of 16S rRNA sequences from over 1000 individuals have also been obtained in the context of a DNA barcoding effort for these frogs (Vences et al. 2005). The highly resolved tree of Vences et al. (2003) in particular provides a means to evaluate the accuracy of the phylogenetic reconstructions herein. In general, the various clades shown as collapsed in Figure 7.4 and Figure 7.5 have been well supported by other data sets, and the fact that most were recovered in the present analysis, even if sometimes without statistical support, gives some confidence in the utility of cytochrome b. The major disagreement with all other data sets is the non-monophyly of the Laliostominae (Laliostoma + Aglyptodactylus) and of Boophis. The corresponding nodes are not significantly supported (Figure 7.4). They are one example of the tendency of cytochrome b to recover wrong basal relationships at deeper divergences. In general, no high support is found for any of the basal nodes, whereas significant Bayesian support values were obtained for a number of shallower clades, such as the genera Aglyptodactylus, Laliostoma and Mantella and the subgenera Chonomantis, Hylobatrachus, Mantidactylus and Ochthomantis. An important aspect, not further tested here, might have been the dense taxon sampling in our analysis, which is known to greatly improve phylogenetic accuracy (e.g., Zwickl and Hillis 2002; but see Rokas and Carroll 2005 and the review in Cummings and Meyer submitted).

Several phylogenetic placements differ from current classification and merit discussion. The fact that the genus Mantidactylus is paraphyletic, Mantella being more closely related to some Mantidactylus than to others, is well established and needs no further discussion (Richards et al. 2000; Vences et al. 2003). Mantidactylus liber is placed with Pandanusicola, in agreement with previous analyses based on different genes (Lehtinen et al. 2004; Lehtinen and Nussbaum 2003; Vences et al. 2003), indicating that it belongs in this subgenus despite its different morphology and reproductive biology (Glaw and Vences 1994). Mantidactylus bertini is currently classified in Blommersia, but is here placed basal to a clade containing this and other subgenera. This is an enigmatic species on which very little biological information has become available so far. We recently collected this species from several sites in south-eastern Madagascar and confirmed that its terrestrial calling behaviour along streams and its femoral gland morphology differ from those of other Blommersia and argue for a change in classification. The high divergence between the two M. bertini sequences included here is paralleled by divergences in the 16S rRNA gene (not shown) and strongly suggests the existence of several cryptic species subsumed under the name M. bertini. Mantidactylus madinika, originally also considered to belong in Blommersia, is placed sister to Mantella. The same grouping was recovered by Vences et al. (2003) and indicates that this miniaturized species may indeed be similar to the ancestor of the highly derived Malagasy poison frogs included in Mantella. A third apparently misplaced Blommersia is Mantidactylus argenteus, which here stands sister to the subgenus Ochthomantis. This species, too, is morphologically and biologically different from other Blommersia, showing parental care, and recent data on parental care in Mantidactylus majori (subgenus Ochthomantis) make sense in the light of the grouping suggested here (Vences and De la Riva in press). In conclusion, several cases that appear to be misplacements in our tree (Figure 7.4) may indeed reflect correct phylogenetic relationships.

Saturation and phylogenetic utility of cytochrome b

In an effort to evaluate the effect of the phylogenetic signal carried by each codon position and of their noise in tree reconstruction in mantellids, we contrasted a partitioned analysis, weighing each codon position differently, against a non-partitioned analysis. Commonly, the second codon position carries more phylogenetic information than the first and the third positions. Third codon positions and transitions are often downweighted in the analysis due to their saturation (reviewed by Meyer 1994).

Saturation of cytochrome b at the third and sometimes first codon position has been

(37)

Griffiths 1997). However, third codon positions are not always misleading and their exclusion can sometimes negatively affect phylogenetic resolution (Björklund 1999; Krajewski and King 1996; Yoder et al. 1996). Cummings et al. (1995) observed that third codon positions were phylogenetically informative if the number of sites was large and ML was used, but not with MP and NJ. An a priori evaluation of the degree of saturation is therefore necessary to discriminate between the information and the noise carried by third codon positions.

Our data showed that saturation at the third codon position begins at about 13% uncorrected total sequence divergence (Figure 7.3 C). This implies that the effect of saturation at the third codon position is high among subgenera and genera, which might lead to problems of resolution or robustness of support. Moreover, the strong base compositional bias, with a particularly low content of guanine at the third position, will cause saturation to be reached faster and at a lower maximal sequence divergence. The bias will also increase differences among the examined species in terms of amino acid substitutions (see also Meyer 1993, 1994). Since mutations at the third codon position usually do not result in amino acid changes, they will tend to maintain a functional gene product. In mantellids, synonymous mutations in the cytochrome b gene outnumbered nonsynonymous mutations, and the vast majority of nucleotide changes did not cause amino acid changes.

The a priori analysis showing saturation at the third codon position of the marker used would suggest excluding this partition from the phylogenetic analyses to remove the possible noise due to homoplasy. Nevertheless, the Bayesian, ML and MP analyses based on first and second positions only (Figures 7.5 D, E) are in several aspects worse than the partitioned Bayesian tree (Figure 7.4). Leaving aside the novel phylogenetic relationships suggested by our tree and discussed above, this is best documented by changes in the relationships among four clades. These are well-established monophyletic groups according to molecular and morphological data (e.g., Vences et al. 2003), and are recovered by the analysis of all three cytochrome b codon positions but not by the analysis after exclusion of third positions: (1) the genus Mantella; (2) the forest-dwelling frogs belonging to the subgenera Gephyromantis, Laurentomantis and Phylacomantis; (3) the subgenus Guibemantis; (4) the subgenus Brygoomantis. Hence, mantellid relationships are better recovered when third positions are included, suggesting that even this highly saturated partition includes some phylogenetic information. In addition, our data also indicate that the partitioned Bayesian analysis performed better than the unpartitioned ML and MP analyses: ML did not recover monophyly of the two specimens assigned to M. bertini and created a polytomy within the subgenus Guibemantis, while MP did not recover monophyly of the subgenera Guibemantis, Blommersia and Spinomantis. In contrast, there were no conspicuous differences between the partitioned and the unpartitioned Bayesian analysis.

The sequence divergences between genera in 16S rRNA (for the whole amplified fragment) are distinctly lower than in cytochrome b (Table 7.1). In addition, before phylogenetic analysis we have further reduced this variation, and the number of potentially informative sites, by removing gapped and hypervariable regions. This fact could explain why the cytochrome b marker better recovers the correct relationships among different genera as compared with 16S rRNA (Figures 7.4 and 7.6 A, B).

In conclusion, the cytochrome b gene performs better than the 16S rRNA fragment analysed (after exclusion of hypervariable sites) in resolving phylogenetic relationships among mantellids. The inclusion of the 16S rRNA hypervariable sites would probably increase the phylogenetic resolution of this gene, but aligning these regions correctly is a highly time-consuming and error-prone task. Despite a generally good performance, the short fragment of cytochrome b used herein was unable to provide sufficient statistical support for many relationships and therefore requires combination with other markers. Our data suggest that at the level of species, subgenera and genera of frogs, third positions of this gene do bear phylogenetic information and therefore should not be excluded a priori from analysis. Partitioned Bayesian analysis appears to outperform ML and MP for such datasets of short sequences with limited phylogenetic signal.

ACKNOWLEDGMENTS

We are indebted to Dirk Steinke for his comments on this manuscript and for helping with the TaxI program. We are also grateful to Arie van der Meijden and Simone Hoegg for valuable comments on the first draft of the manuscript, to Walter Salzburger for helping with the PAML program, and to Franco Andreone, Frank Glaw and David R. Vieites for field companionship during sample collection. The Malagasy authorities kindly issued research and export permits. Y. Chiari was supported by an Ambassadorial Scholarship of the International Rotary Foundation. Laboratory work received support from grants of the Deutsche Forschungsgemeinschaft to M. Vences and A. Meyer and the University of Konstanz to A. Meyer.


CHAPTER 8

The phylogenetic utility of the nuclear Rag-1 (recombination activating gene-1) in vertebrates

Mantella aurantiaca Photo: Miguel Vences


The phylogenetic utility of the nuclear Rag-1 (recombination activating gene-1) in vertebrates

8.1. ABSTRACT

Among the nuclear markers used for phylogenetic purposes, the nuclear genes coding for the Recombination Activating Proteins, Rag-1 and Rag-2, have been applied over the past six years to different groups of vertebrates. However, with few exceptions, the phylogenetic utility of these genes has never been compared among major lineages of vertebrates. The Rag-1 and Rag-2 proteins are involved in starting a recombination process that produces the high diversity of immunoglobulins and T cell receptors of B and T cells. Due to their importance, these two proteins are highly conserved at the amino acid level across vertebrates. Moreover, the second half of the Rag-1 protein is more conserved than the first. The Rag-1 gene is intronless in the majority of vertebrates and about 3.2 kb long. We here investigate the phylogenetic potential of this gene across vertebrates (Amphibia, Aves, Chondrichthyes, Crocodylia, Mammalia, Lepidosauria, Actinopterygii, Testudines), using published sequences downloaded from GenBank. When possible, the analyses were made on the total length of the gene (3.3 kb after the complete alignment). For comparative analyses among the different studied groups, a fragment of 510 bp, the longest available for a large number of sequences for each group, was selected. Our data show that additional or missing amino acids can probably be used as synapomorphic characters within groups, but that they have to be carefully considered across vertebrates. Based on the selected 510 bp subset, the gene did not appear to be saturated in any of the studied groups, even if sequence divergences at the third codon position in Actinopterygii and Amphibia are high. Across vertebrates, in the majority of cases (with the exclusion of third positions in Aves, Amphibia, Actinopterygii and Mammalia), the hypothesis of homogeneous base composition cannot be rejected. However, we observed a higher content of guanine at first positions and of thymine and adenine at the second position in all taxa. We also found a positive correlation between the divergence times of the different groups and their maximum or mean sequence divergence. These characteristics make Rag-1 a good marker for the recovery of deep phylogenetic relationships. However, the use of this gene seems to also be useful for


variable, despite the fact that the second half is more often used for phylogenetic studies. We suggest exploring the phylogenetic potential of the first part of the Rag-1 gene for future studies.
