[Figure A.4: phylogenetic tree (image not reproduced; tip labels and bootstrap support values are not recoverable as a topology from the extracted text). Legend: outgroup taxa, Leptanillinae, Martialinae, 'poneroids', formicoids. Scale bar: 0.1.]
Figure A.4: Unmasked, unpartitioned data set. Maximum Likelihood (majority rule consensus) topology inferred from the unmasked, unpartitioned data set with 5,000 bootstrap replicates (-f a; GTR+Γ, see the methods section of Chapter 3). The tree was rooted with Pristocera.
[Figure A.5: phylogenetic tree (image not reproduced; tip labels and bootstrap support values are not recoverable as a topology from the extracted text). Legend: outgroup taxa, Leptanillinae, Martialinae, 'poneroids', formicoids. Scale bar: 0.1.]
Figure A.5: Masked, unpartitioned data set. Maximum Likelihood (majority rule consensus) topology inferred from the masked, unpartitioned data set with 5,000 bootstrap replicates (-f a; GTR+Γ, see the methods section of Chapter 3). The tree was rooted with Pristocera.
[Figure A.6: phylogenetic tree (image not reproduced; tip labels and bootstrap support values are not recoverable as a topology from the extracted text). Legend: outgroup taxa, Leptanillinae, Martialinae, 'poneroids', formicoids. Scale bar: 0.1.]
Figure A.6: Masked, partitioned data set. Maximum Likelihood (majority rule consensus) topology inferred from the masked, partitioned data set with 5,000 bootstrap replicates (-f a; GTR+Γ, see the methods section of Chapter 3). The tree was rooted with Pristocera.
LoBraTe
B.1 Flowchart of the LoBraTe Process Pipeline
LoBraTe (Long Branch Test) is a process pipeline designed to examine how different branch lengths affect Maximum Likelihood inference under different evolutionary model assumptions. In addition, LoBraTe calculates branch length relations of correct and incorrect relationships using a dedicated algorithm that includes a likelihood ratio test and a chi-square test. LoBraTe is currently used to test how efficiently this algorithm identifies long branch attraction between strongly derived taxa. LoBraTe was also used for the simulation analyses of Chapters 4 and 5. For Chapter 5, over 800,000 simulations were analyzed automatically with LoBraTe.
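The likelihood ratio test and chi-square test named above can be illustrated generically. The following is a minimal sketch and not LoBraTe's own algorithm, which is not specified here: it assumes two nested models that differ by one free parameter (one degree of freedom), and all function names are hypothetical.

```python
import math


def likelihood_ratio_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Return 2 * (lnL_alt - lnL_null) for two nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square distribution with 1 degree of freedom.

    For df = 1 the tail probability has the closed form erfc(sqrt(x / 2)).
    """
    if x <= 0.0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))


def lrt_p_value(lnl_null: float, lnl_alt: float) -> float:
    """P-value of the likelihood ratio test, assuming df = 1 (an assumption)."""
    return chi2_sf_df1(likelihood_ratio_statistic(lnl_null, lnl_alt))
```

With more than one additional parameter, the chi-square distribution with the matching number of degrees of freedom would be needed instead of the df = 1 shortcut used here.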
[Flowchart image not reproduced; recoverable content:]
Simulation of 4 alignments [INDELible]
- Model: JC + INV [0.30] + GAM [1.0]
- Alignment lengths: L = 2000, 3000, 4000, 10000 bp
Single ML analyses [PhyML]: JC + all combinations of
- INV: [-], [0.3], [var]
- GAM: [100], [1.0], [0.1], [var]
Branch elongation 1 (stepwise): L = 0.01, 0.05, 0.1, 0.3, 0.5
- for each step: Branch elongation 2 (stepwise): L = 0.1, 0.3, 0.5, 0.7 ... 1.5
- for each step, each alignment and each given topology: ML examination, 100 times, followed by output
Example of a given topology (tree image not reproduced): labels Out, T1, T2, T3, T4, T7, T8, T9, T10, L5 and L6.
Figure B.1: Overview of the LoBraTe simulation and analysis processes.
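The parameter grid in Figure B.1 can be enumerated programmatically. The sketch below is illustrative only: it uses the values printed in the flowchart, assumes that every listed setting is combined with every other, and leaves the steps of branch elongation 2 to the caller because the flowchart elides part of that series. The function and key names are hypothetical and not taken from LoBraTe.

```python
from itertools import product

# Values exactly as printed in the flowchart of Figure B.1.
ALIGNMENT_LENGTHS = [2000, 3000, 4000, 10000]      # bp
INV_SETTINGS = ["-", "0.3", "var"]                  # INV settings
GAM_SETTINGS = ["100", "1.0", "0.1", "var"]        # GAM settings
BRANCH_ELONGATION_1 = [0.01, 0.05, 0.1, 0.3, 0.5]


def enumerate_setups(elongation2_steps):
    """Yield one dict per combination of the listed settings.

    `elongation2_steps` is supplied by the caller: the flowchart lists
    0.1, 0.3, 0.5, 0.7 ... 1.5 and the elided values are not filled in here.
    Full combination of all settings is an assumption of this sketch.
    """
    for length, inv, gam, bl1, bl2 in product(
        ALIGNMENT_LENGTHS,
        INV_SETTINGS,
        GAM_SETTINGS,
        BRANCH_ELONGATION_1,
        elongation2_steps,
    ):
        yield {
            "alignment_length": length,
            "inv": inv,
            "gam": gam,
            "elongation1": bl1,
            "elongation2": bl2,
        }
```

Calling `list(enumerate_setups([0.1, 0.3, 0.5, 0.7]))` would give the setups for the four explicitly listed steps of branch elongation 2.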
Output plots (images not reproduced; plot titles, axis labels and legend entries as recovered from the figure). For each step of branch elongation 1:
- Reconstruction success of a single setup under 4 different alignment lengths per repeat step. Plot title: "Occurrence of LBA, non-LBA & LBAII". Axis labels: Branch elongation 2; Alignment. Legend: lbaIII, nonlba, lba_II.
- Split occurrence of a single setup per repeat step. Plot title: "Split Support". Axis labels: Alignments; N symmetric splits. Legend: lbaIII, non-lba l5+st, non-lba l6+st, lbaII l5+st, lbaII l6+st.
- Split occurrence of a single setup summarized over 4 different alignment lengths per repeat step. Plot title: "Occurrence of LBAIII splits". Axis labels: Branch elongation 2; Split support %. Legend: lbaIII, nonlba L5+st, nonlba L6+st.
- Maximum Likelihood scores of a single setup per repeat step. Plot title: "single best likelihood values". Axis labels: Branch elongation 2; likelihood. Legend: lbaIII trees, nonlba trees, lbaII trees.
- Reconstruction success of a single setup summarized over 100 repeats. Plot title: "Parametric Bootstrap Support". Axis labels: Branch elongation 2; Reconstruction success for 100 repeats. Legend: lbaIII, nonlba, lbaII.
- Maximum Likelihood scores of a single setup summarized over 100 repeats. Plot title: "Likelihood Inference". Axis labels: Branch elongation 2; ln-values. Legend: ln_mean, ln_max, ln_min.
- Parameter estimates of a single setup summarized over 100 repeats. Plot title: "Parameter Inference". Axis labels: Branch elongation 2; Values. Legend: gamma_mean, gamma_max, gamma_min, invar_mean, invar_max, invar_min.
- Branch length calculation of a single setup per repeat step. Plot title: "single nce/nc ratios". Axis labels: ML trees; likelihood. Legend: lbaIII trees, nonlba trees, nonlba trees.
Figure B.2: Overview of single LoBraTe output plots.
RAxTAX
C.1 Flowchart of the RAxTAX Process Pipeline
RAxTAX is a process pipeline designed to execute a full phylogenetic analysis, starting from raw sequence data and ending with a full Maximum Likelihood analysis.
Figure C.1 gives a schematic overview of the optional and mandatory (labelled "stringent" in the figure) start commands and of the handling of an optional taxon restriction input file. Figure C.2 shows all subprocesses of a full RAxTAX analysis in which only the concatenated data are completely analysed. In addition, RAxTAX can completely analyse all single masked and unmasked files within the same process run.
[Schematic image not reproduced; recoverable content:]
Input: Restriction File (optional), Commands (stringent), Raw Data (stringent)
Output: Trees, Infofiles
Commands
- Alignment Method (stringent)
- Refinement (optional)
- Likelihood Method (stringent)
- Substitution Model (stringent)
- Model Parameters (stringent)
- Number of Bootstrap Replicates (optional)
Restriction File
- Taxa which should be excluded before a RAxTAX process run have to be named on the same line
- Starting from the first line, the requested RAxTAX runs are executed in series
- Output files are stored with a different prefix name for each process run
Figure C.1: Schematic overview of the input and output of RAxTAX.
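The restriction file rules in Figure C.1 (one line per run, all taxa to be excluded named on that line, runs executed in series with distinct output prefixes) can be sketched as follows. This is a hedged illustration, not RAxTAX code: the whitespace delimiter, the `run1`, `run2`, ... prefix scheme and the function names are assumptions.

```python
def parse_restriction_file(text):
    """Return one list of excluded taxa per non-empty line.

    Assumption: taxon names on a line are separated by whitespace and
    therefore contain no blanks themselves.
    """
    runs = []
    for line in text.splitlines():
        taxa = line.split()
        if taxa:
            runs.append(taxa)
    return runs


def plan_runs(all_taxa, restriction_text):
    """Return (prefix, kept_taxa) pairs, one per restriction line, in file order.

    The prefix scheme "run<N>" is hypothetical; the source only states that
    each process run uses a different prefix name.
    """
    plans = []
    for index, excluded in enumerate(parse_restriction_file(restriction_text), start=1):
        excluded_set = set(excluded)
        kept = [taxon for taxon in all_taxa if taxon not in excluded_set]
        plans.append((f"run{index}", kept))
    return plans
```

For example, a restriction file with the two lines `A B` and `C` would yield two serial runs, the first without taxa A and B and the second without taxon C.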
[Flowchart image not reproduced; steps, tools and working directories as recovered from the figure labels:]
RAxTAX
- Make_directory
- Restriction File (YES|NO), Taxon Exclusion, EXIT
- 01_raw_data: Raw Sequence Data (FASTA), aligned or unaligned; Degap Data (output: unaligned)
- 02_msa: Multiple Sequence Alignment (l-ins-i, e-ins-i, g-ins-i, MUSCLE, T-COFFEE, Dialign-TX; output: aligned); Alignment Refinement (MUSCLE; output: aligned/refined)
- 03_aliscore: Alignment Masking (ALISCORE, ALICUT; output: masked, info, Listfile)
- 04_fasconcat: Sequence Concatenation (FASconCAT; output: supermatrix, info)
- 05_aligroove1: Alignment Evaluation (AliGROOVE; output: similarity-matrix, info)
- 06_ml: FASTA to PHYLIP conversion of the supermatrix; Tree Reconstruction (RAxML, PhyML; output: trees, info)
- 07_aligroove2: Tree Evaluation (AliGROOVE; output: Tree-Tag)
Figure C.2: Overview of all subprocesses during a complete RAxTAX analysis.
Manual FASconCAT
D.1 Introduction
FASconCAT is designed to concatenate sequence alignment files into one supermatrix file in a convenient manner. The supermatrix, for which different output formats are selectable (FASTA, PHYLIP, NEXUS), can be used directly for phylogenetic purposes. FASconCAT accepts standard nucleotide sequence alignments, recoded nucleotide sequences (e.g. with the third codon position RY-coded), and amino acid alignments. Structure strings in dot-bracket format, as often used in ribosomal RNA analyses, are recognized and concatenated as well. FASconCAT can handle input files in PHYLIP, CLUSTAL and FASTA format in one single run; a uniform input format is not required. Within a sequence file, all sequences must have equal length. The software extracts the gene or structure sequences associated with each taxon from the given input files and links them into one string. Taxon sequences missing from single files are replaced by 'N' (nucleotide data), 'X' (amino acid data) or '.' (structure strings in dot-bracket format), depending on the data type of the file concerned. It is possible to concatenate nucleotide and amino acid files into one supermatrix file. FASconCAT can read sequences in interleaved and non-interleaved format. For FASTA files, the program tolerates line breaks in sequences, but not in sequence (taxon) names. Sequence names may only include alphanumeric signs, underscores (_) and blanks. FASconCAT will issue an error prompt and die if any other sign is encountered in sequence names.
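The concatenation behaviour described above (equal sequence lengths within a file, restricted sequence names, missing taxa filled with a placeholder sign) can be sketched in a few lines. This is a minimal illustration under stated assumptions and not FASconCAT's actual implementation: alignments are assumed to be already parsed into name-to-sequence dictionaries, and the function and variable names are hypothetical.

```python
import re

# Allowed sequence names: alphanumeric signs, underscores and blanks.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_ ]+")


def concatenate(alignments, fill_chars):
    """Concatenate parsed alignments into one supermatrix.

    alignments: list of dicts mapping taxon name -> aligned sequence.
    fill_chars: one placeholder per alignment ('N', 'X' or '.'),
                chosen by the caller according to the data type.
    """
    taxa = []
    for alignment in alignments:
        # All sequences within one file must have equal length.
        if len({len(seq) for seq in alignment.values()}) != 1:
            raise ValueError("sequences within a file must have equal length")
        for name in alignment:
            if not NAME_PATTERN.fullmatch(name):
                raise ValueError(f"invalid sign in sequence name: {name!r}")
            if name not in taxa:
                taxa.append(name)

    supermatrix = {name: "" for name in taxa}
    for alignment, fill in zip(alignments, fill_chars):
        length = len(next(iter(alignment.values())))
        for name in taxa:
            # Taxa missing from this file receive a run of the placeholder sign.
            supermatrix[name] += alignment.get(name, fill * length)
    return supermatrix
```

A taxon that is absent from a nucleotide file of length 4 would thus receive `NNNN` for that partition, which keeps all rows of the supermatrix at equal length.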
FASconCAT was written on Linux and runs on Windows PCs, Mac OS and Linux systems. For input files originating from Windows, CRLF line endings should be converted into Unix (LF) line endings in advance, especially if the user changes the operating system. This can be done in several editors, e.g. BioEdit, Notepad++ or SciTE. FASconCAT usually replaces them itself, but might not succeed in every instance.
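As an alternative to an editor, the line ending conversion can be scripted. The following is a small sketch, not part of FASconCAT; the function names are hypothetical, and the additional handling of lone CR characters is an assumption that goes beyond the CRLF case named above.

```python
from pathlib import Path


def to_unix_line_endings(data: bytes) -> bytes:
    """Replace CRLF with LF; any remaining lone CR is also replaced by LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(path: str) -> None:
    """Rewrite the file at `path` in place with Unix (LF) line endings."""
    file_path = Path(path)
    file_path.write_bytes(to_unix_line_endings(file_path.read_bytes()))
```

Working on bytes rather than text avoids any implicit newline translation by the operating system.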
Ambiguities and indels are allowed. Any other sign in sequences, except for those covered by the universal DNA/RNA or amino acid code, will also lead to an error prompt. Structure information (e.g. of ribosomal RNA sequences) is also recognized, analyzed and concatenated. Structure information should be present only once in each file and carry the same name in every file, e.g. "structure". Otherwise, the software will stop with a specific error prompt. FASconCAT additionally provides information about each input file and the new concatenated supermatrix in .xls format. The file includes the range of each gene (gene fragment or partition) and a list of all concatenated sequences. If structure strings have been included, it lists the number and percentage of unpaired and paired alignment positions of each single file and of the supermatrix file. Optionally, extended information is provided. The extended information setting includes reports about, e.g., the base composition of single files and of the supermatrix file for nucleotide data. Further, if structure strings in dot-bracket format have been included, the concatenated structure composition of loop and stem positions is printed in a separate .txt file (-i, see below). For a more detailed report about additional information, see section 'Usage/Options'.
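The two kinds of summary information described above, partition ranges and the number and percentage of paired and unpaired structure positions, can be computed as sketched below. This is an illustration under assumptions, not FASconCAT's reporting code: ranges are assumed to be 1-based and inclusive, structure strings are assumed to contain only '(', ')' and '.', and the function names are hypothetical.

```python
def partition_ranges(partitions):
    """Return (name, start, end) per partition, 1-based and inclusive.

    partitions: list of (name, alignment_length) in concatenation order.
    """
    ranges = []
    start = 1
    for name, length in partitions:
        ranges.append((name, start, start + length - 1))
        start += length
    return ranges


def structure_summary(structure):
    """Count paired ('(' or ')') and unpaired ('.') positions of a dot-bracket string."""
    paired = structure.count("(") + structure.count(")")
    unpaired = structure.count(".")
    total = len(structure)
    if paired + unpaired != total:
        raise ValueError("unexpected sign in structure string")
    return {
        "paired": paired,
        "unpaired": unpaired,
        "paired_percent": 100.0 * paired / total if total else 0.0,
        "unpaired_percent": 100.0 * unpaired / total if total else 0.0,
    }
```

In this sketch, paired positions correspond to stem positions and unpaired positions to loop positions of the structure string.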
As another option, FASconCAT can generate NEXUS files of concatenated sequences, either with commands that can be executed directly in PAUP or MrBayes, or without any specific commands. It is also possible to generate output files in PHYLIP format with relaxed (unlimited signs) or strict (limited to ten signs) sequence names; sequences are always printed non-interleaved. FASconCAT can be started directly via the command line or indirectly, guided by menu options.
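The difference between strict and relaxed PHYLIP sequence names can be made concrete with a small writer. This is a hedged sketch rather than FASconCAT's output routine: the function name is hypothetical, and replacing blanks by underscores in relaxed names is an assumption made here so that name and sequence stay separable.

```python
def to_phylip(supermatrix, strict=False):
    """Return a non-interleaved PHYLIP string for a name -> sequence dict.

    strict=True : names are cut or padded to exactly ten signs and the
                  sequence follows immediately.
    strict=False: names keep their full length and are separated from the
                  sequence by one blank (blanks in names become underscores,
                  an assumption of this sketch).
    """
    n_taxa = len(supermatrix)
    n_sites = len(next(iter(supermatrix.values()))) if supermatrix else 0
    lines = [f"{n_taxa} {n_sites}"]
    for name, sequence in supermatrix.items():
        if strict:
            lines.append(name[:10].ljust(10) + sequence)
        else:
            lines.append(name.replace(" ", "_") + " " + sequence)
    return "\n".join(lines) + "\n"
```

Note that cutting names to ten signs can make two long names identical, so strict output is only safe when the first ten signs of every name are unique.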