Supplemental Information for:
Genome-wide variation in nucleotides and retrotransposons in alpine populations of Arabis alpina (Brassicaceae)
Aude Rogivue, Rimjhim R. Choudhury, Stefan Zoller, Stéphane Joost, François Felber, Michel Kasser, Christian Parisod, Felix Gugerli
Table of Contents:
Appendix S1
TABLE S1
TABLE S2 (separate text file)
TABLE S3
TABLE S4
TABLE S5
TABLE S6
TABLE S7
FIGURE S1
FIGURE S2
FIGURE S3
FIGURE S4
FIGURE S5
FIGURE S6
Appendix S1. Mapping and TE annotation
Mapping
After quality control, the sequences were mapped against the reference genome V5.1, based on Jiao et al. (2017). We did not use all contigs but filtered them according to their size and gene content. Accordingly, the eight chromosomes and 728 of the 918 contigs (685 contigs > 10 kb and 43 contigs < 10 kb containing genes) were used, representing 335,551,605 bp. For the production of the SNP dataset, mapping was performed with BWA v.0.7.12 (Li & Durbin, 2010) with a minimum output score of 10. All other options were left at their defaults, and the mapped reads were sorted with SAMTOOLS v.1.2 (Li et al., 2009).
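The contig selection described above can be sketched as follows. This is a minimal illustration and not the authors' script: the `Contig` record, the `keep_contig` helper and the example contigs are hypothetical, and only the 10 kb threshold and the gene-content criterion come from the text.

```python
from dataclasses import dataclass

MIN_LENGTH_BP = 10_000  # 10 kb size threshold stated in the text


@dataclass
class Contig:
    name: str        # hypothetical identifier
    length_bp: int   # contig length in base pairs
    n_genes: int     # number of annotated genes on the contig


def keep_contig(contig: Contig) -> bool:
    """Keep contigs longer than 10 kb, or shorter ones that contain genes."""
    return contig.length_bp > MIN_LENGTH_BP or contig.n_genes > 0


# Hypothetical example contigs (not taken from the A. alpina assembly).
contigs = [
    Contig("ctg_a", 25_000, 0),  # long, no genes: kept
    Contig("ctg_b", 4_000, 2),   # short, contains genes: kept
    Contig("ctg_c", 4_000, 0),   # short, no genes: dropped
]
kept = [c.name for c in contigs if keep_contig(c)]
print(kept)

# Consistency check on the contig counts reported in the text.
assert 685 + 43 == 728
```

How a contig of exactly 10 kb is treated is not specified in the text; the sketch assumes a strict "greater than" comparison.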
Transposable element annotation
Reliable assessment of transposable element (TE) polymorphisms at the population level depends on high-quality TE annotation such as generated for LTR-RTs in Arabis alpina V3 (Willing et al., 2015). No such annotation has so far been released for the latest, highly contiguous assembly of the A. alpina reference genome V5 (Jiao et al., 2017). To avoid biases resulting from the use of TE annotations from previous versions as well as those related to the transfer of annotation of high-copy TEs, LTR-RTs were re-annotated de novo in the reference sequence of A. alpina (V5.1). Following Choudhury, Neuhaus, and Parisod (2017), full-length LTR-RT copies were identified through their structural features with LTRharvest v.1.5.7 (Ellinghaus, Kurtz, & Willhoeft, 2008) and LTRdigest v.1.5.7 (Steinbiss, Willhoeft, Gremme, & Kurtz, 2009). After removal of nested and overlapping predictions, copies were clustered into families using CD-HIT-EST v.4.6.7 (Li & Godzik, 2006) when their LTR alignment covered at least 80% of the sequence length with an identity of at least 80%, following Wicker et al. (2007).
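The 80%/80% criterion used for family assignment can be written as a simple predicate. The sketch below only illustrates the threshold logic and does not reproduce how CD-HIT-EST computes alignments; the function name and its arguments are hypothetical, and it assumes that coverage is measured relative to the length of the sequence being compared.

```python
def passes_80_80(aligned_bp: int, seq_length_bp: int, identity_pct: float,
                 min_coverage: float = 0.80,
                 min_identity_pct: float = 80.0) -> bool:
    """True if the alignment covers >= 80% of the sequence at >= 80% identity."""
    if seq_length_bp <= 0:
        raise ValueError("sequence length must be positive")
    coverage = aligned_bp / seq_length_bp
    return coverage >= min_coverage and identity_pct >= min_identity_pct


# Hypothetical LTR alignments: (aligned bp, LTR length in bp, % identity).
print(passes_80_80(450, 500, 85.0))  # 90% coverage, 85% identity
print(passes_80_80(350, 500, 95.0))  # 70% coverage fails the length criterion
print(passes_80_80(450, 500, 75.0))  # 75% identity fails the identity criterion
```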
Characterized LTR-RT families were then classified into phylogenetically defined tribes using BLASTN of their reverse transcriptase (RT) coding sequence against the RT sequences (± 500bp) extracted from the V3 LTR-RT library (80% identity over 80% length). Families that remained unclassified were assigned to a tribe by BLASTN using RT sequences (± 500bp) extracted from Repbase Brassicaceae LTR-RTs.
All structurally defined and classified LTR-RTs were mapped on the A. alpina reference assembly V5.1 using RepeatMasker (version Open-4; Smit, Hubley, & Green, 2013) with RM-BLAST as search engine and divergence set to 20%. LTRs from each family were aligned using Muscle v3.8 (Edgar, 2004) and used to identify solo LTRs through Hidden Markov Model (HMM) profiles built with HMMBuild from the HMMER package (hmmer.org). nHMMER was used to search the HMMs from all LTR clusters against the entire genome assembly and to identify potential LTRs, including solo LTRs. The resulting annotation was filtered with a custom script to remove nested LTR-RTs and annotations shorter than 80 bp, and is referred to as reference TEs (Table S1 in Supplementary Information).
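The final filtering step can be sketched as below. This is not the custom script used for the annotation: the `Annotation` record and its coordinates are hypothetical, "nested" is assumed to mean fully contained within another annotation on the same sequence, and coordinates are assumed to be 0-based and half-open.

```python
from typing import List, NamedTuple

MIN_LENGTH_BP = 80  # annotations shorter than 80 bp are removed (from the text)


class Annotation(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (assumed convention)
    end: int     # exclusive (assumed convention)
    family: str


def filter_annotations(annotations: List[Annotation]) -> List[Annotation]:
    """Drop annotations < 80 bp and annotations contained in another one."""
    long_enough = [a for a in annotations if a.end - a.start >= MIN_LENGTH_BP]
    # Sort by sequence, start ascending and end descending, so that a
    # containing interval is always visited before the intervals nested in it.
    ordered = sorted(long_enough, key=lambda a: (a.chrom, a.start, -a.end))
    kept: List[Annotation] = []
    current_chrom, max_end = None, -1
    for a in ordered:
        if a.chrom != current_chrom:
            current_chrom, max_end = a.chrom, -1
        if a.end <= max_end:
            continue  # fully contained in an earlier annotation (or a duplicate)
        kept.append(a)
        max_end = a.end
    return kept


# Hypothetical coordinates; family names are borrowed from Table S1.
example = [
    Annotation("chr1", 100, 5000, "ATGP1"),
    Annotation("chr1", 1200, 1900, "ATCOPIA20"),  # nested: dropped
    Annotation("chr1", 4800, 6000, "ATHILA4"),    # overlapping, not nested: kept
    Annotation("chr1", 7000, 7050, "SOLO-LTRs"),  # 50 bp: dropped
    Annotation("chr2", 1200, 1900, "ATCOPIA20"),  # other sequence: kept
]
for a in filter_annotations(example):
    print(a)
```

Partially overlapping annotations are kept in this sketch; whether the original script treated them differently is not stated.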
References
Choudhury, R. R., Neuhaus, J.-M., & Parisod, C. (2017). Resolving fine-grained dynamics of retrotransposons: comparative analysis of inferential methods and genomic resources. Plant Journal, 90, 979–993.
Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32, 1792–1797.
Ellinghaus, D., Kurtz, S., & Willhoeft, U. (2008). LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics, 9, 18.
Jiao, W.-B., Accinelli, G. G., Hartwig, B., Kiefer, C., Baker, D., Severing, E., … Schneeberger, K. (2017). Improving and correcting the contiguity of long-read genome assemblies of three plant species using optical mapping and chromosome conformation capture data. Genome Research, 27, 778–786.
Li, H., & Durbin, R. (2010). Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics, 26, 589–595.
Li, W., & Godzik, A. (2006). Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics, 22, 1658–1659.
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., … Durbin, R. (2009). The sequence alignment/map format and SAMtools. Bioinformatics, 25, 2078–2079.
Smit, A., Hubley, R., & Green, P. (2013). 2013–2015. RepeatMasker Open-4.0. http://www.repeatmasker.org.
Steinbiss, S., Willhoeft, U., Gremme, G., & Kurtz, S. (2009). Fine-grained annotation and classification of de novo predicted LTR retrotransposons. Nucleic Acids Research, 37, 7002–7013.
Wicker, T., Sabot, F., Hua-Van, A., Bennetzen, J. L., Capy, P., Chalhoub, B., … Schulman, A. H. (2007). A unified classification system for eukaryotic transposable elements. Nature Reviews Genetics, 8, 973–982.
Willing, E.-M., Rawat, V., Mandáková, T., Maumus, F., James, G. V., Nordström, K. J., … Schneeberger, K. (2015). Genome expansion of Arabis alpina linked with retrotransposition and reduced symmetric DNA methylation. Nature Plants, 1.
TABLE S1 Number of annotated TEs per tribe in V5.1 of the Arabis alpina reference genome, with their mean length per tribe, compared with the number annotated in V3 (Choudhury et al., 2017). The number of polymorphic TEs identified per tribe in this study is also given.
Tribe | No. of annotated copies in V5.1 assembly | Mean length | No. of annotated copies in V3 assembly (Choudhury et al., 2017) | No. of polymorphic TE copies
ATGP1 17721 2589.00 7777 1611
ATLANTYS2 9210 2279.73 6190 944
ALYGypsy4 3792 2154.13 3690 333
ATCOPIA20 2693 1112.64 992 877
ATHILA4 2582 2363.23 6024 249
ATCOPIA95 1524 2350.27 1277 362
TA1-2 816 2787.29 518 144
ALYCopia74 614 1928.39 331 93
ATHILA6 533 2064.66 604 25
ALYCopia32 526 2895.32 302 121
ATGP10 522 1937.08 329 67
ATGP4 441 2126.36 450 103
ENDOVIR1 436 3075.13 956 69
ATGP5 290 2700.76 245 47
ATCOPIA35 266 1864.34 172 43
ALYCopia6 200 3377.69 425 59
ATGP8 192 2506.93 109 11
ATGP2 161 2116.32 175 14
BraCopia7 154 1179.53 439 9
ATCOPIA23/ONSEN 153 3508.75 NA 45
ATCOPIA1 106 3323.19 NA 27
ATCOPIA27 104 783.68 NA 7
ATCOPIA40 95 2479.01 160 19
ALYGypsy1 92 4144.45 25 27
ATCOPIA14 88 3521.11 NA 37
ATCOPIA8 84 3768.14 107 23
ATLANTYS1 81 875.91 111 3
ATGP3 77 4207.51 49 14
ALYGypsy10 73 2593.82 112 10
ATCOPIA12 65 3948.54 NA 19
ATCOPIA17 64 3564.63 NA 31
ALYCopia40 51 2231.47 NA 5
ATCOPIA36 43 3436.51 30 5
ATCOPIA34 34 3071.91 40 6
ALYCopia24 25 3827.64 6 11
ATCOPIA43 21 3161.48 37 3
BraCopia119 20 3978.35 5 8
BraGypsy2 17 4376.88 11 6
ALYCopia42 15 4837.67 11 6
ALYCopia16 10 4096.00 20 NA
ATCOPIA2 8 2457.88 NA 5
ALYCopia45 NA NA 744 NA
SOLO-LTRs 159727 343.09 NA 9962
Unknown 40760 2105.37 5089
TABLE S2 List of the single-nucleotide polymorphisms (SNPs) located in transposable elements (TEs). These SNPs were removed from the SNP set to obtain the non-TE SNP set. The list is available as a separate txt file.
TABLE S3 Estimation of the whole-genome linkage disequilibrium (LD) decay as half-decay distance (kb) for the single-nucleotide polymorphisms (SNPs) and for the SNPs excluding transposable element sequences (non-TE SNPs), given for each of the eight chromosomes together with the average over chromosomes. The estimation was done using a window size between 250 kb and 1500 kb to obtain about the same number of pairwise comparisons for each chromosome.
Chromosome | Window size (kb) | Number of pairwise comparisons | Half-decay distance (kb)
SNPs
1 500 26,638,833 19.68
2 250 27,949,112 27.28
3 500 29,944,594 25.44
4 250 25,937,604 35.52
5 500 25,165,496 46.83
6 250 23,035,780 35.12
7 250 29,359,305 28.88
8 250 27,512,155 101.99
Average 40.09
Non-TE SNPs
1 1000 25,512,167 21.62
2 700 27,069,660 23.41
3 1000 26,817,985 19.25
4 700 23,862,096 39.54
5 1500 27,301,163 40.60
6 700 27,258,686 27.73
7 700 28,115,036 31.23
8 700 29,789,366 44.46
Average 30.98
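As a quick arithmetic check, the two averages reported above can be recomputed from the per-chromosome half-decay distances. The values are copied from the table; the variable and function names are only illustrative.

```python
# Half-decay distances (kb) for chromosomes 1 to 8, copied from TABLE S3.
half_decay_kb = {
    "SNPs": [19.68, 27.28, 25.44, 35.52, 46.83, 35.12, 28.88, 101.99],
    "Non-TE SNPs": [21.62, 23.41, 19.25, 39.54, 40.60, 27.73, 31.23, 44.46],
}


def mean_half_decay(values):
    """Unweighted mean over chromosomes, rounded to two decimals."""
    return round(sum(values) / len(values), 2)


averages = {marker: mean_half_decay(v) for marker, v in half_decay_kb.items()}
print(averages)  # {'SNPs': 40.09, 'Non-TE SNPs': 30.98}
```

Both results match the "Average" rows, which indicates that these are unweighted means over the eight chromosomes.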
TABLE S4 Number of reads obtained for each sample of Arabis alpina (N = 306) after whole-genome sequencing with Illumina HiSeq2500 (raw reads), after trimming and de-duplication of the reads, and after mapping to the reference genome (V5.1, Jiao et al., 2017), together with the mean and the median coverage. The four populations sampled are Essets (Es, N = 70), Martinets (Ma, N = 96), Para (Pa, N = 70) and Pierredar (Pi, N = 70).
Sample | Number of reads: raw reads | Number of reads: after trimming and deduplication (paired and unpaired) | Number of reads: properly paired mapping V5.1 | Coverage: mean | Coverage: median
Es100 22534988 22059960 15197201 8 5.891
Es12 57890690 56960580 37663813 20 15.07
Es13 70600628 69405572 45632557 25 18.21
Es14 80909060 79705246 54430489 29 21.57
Es16 33826234 33486696 23233307 12 9.25
Es17 36070626 35753024 23926222 13 9.532
Es19 20901874 20692026 14079674 7 5.606
Es20 23776808 23479168 15673657 9 6.284
Es21 43713930 43288162 29145355 15 11.66
Es22 39291528 38413984 25675355 14 10.31
Es23 41799780 41404262 27719120 14 10.97
Es24 19108898 18920128 12851028 7 5.13
Es25 34241968 33850772 22359541 12 8.892
Es3 94490534 93208312 18035292 34 25.29
Es32 27341042 27026106 16457216 10 7.232
Es33 25405832 25121964 21667659 9 6.586
Es34 31804358 31403162 14207562 12 8.577
Es35 21468058 21226402 31564485 8 5.681
Es36 46815266 46374690 24821979 17 12.66
Es38 37889076 36127188 29393214 13 9.576
Es39 43697402 41591186 64450463 16 11.38
Es40 32559672 32277922 21589277 12 8.603
Es41 21053648 20834548 14012117 8 5.618
Es42 24760866 24451560 16584976 9 6.59
Es43 25732790 25458338 17239248 9 6.842
Es44 23304400 23079998 15922532 9 6.294
Es46 26930288 26614780 18242107 10 7.215
Es48 25395598 25108118 16924600 9 6.743
Es49 26692600 26426340 18087123 10 7.247
Es50 30275760 29980812 21130129 11 8.178
Es52 30216840 29892190 20546494 11 8.126
Es53 25234110 24955846 16223757 9 6.457
Es55 39029668 38447142 26849913 14 10.48
Es56 27567058 27052968 19093263 10 7.44
Es57 37656812 37075644 26034731 14 10.21
Es59 24056868 23699256 16737515 9 6.519
Es6 34907624 34567170 29762532 13 9.154
Es60 43227338 42515486 49502079 15 11.62
Es63 74595064 71404084 14288198 25 19.1
Es64 20599864 20324094 29927341 8 5.531
Es65-1 43381350 42676160 23014728 15 11.62
Es66 36823554 35085150 19796612 12 8.87
Es67 28786574 28345850 31291936 10 7.701
Es68 46075900 45223268 23100987 16 12.18
Es7 47948578 45250950 29963785 16 12.14
Es71 44527882 43642132 13100855 15 11.74
Es72 19767896 19161772 14798135 7 5.153
Es73 21707656 21034374 15510257 8 5.784
Es75 22847750 22151424 15465529 8 6.043
Es76 22574792 21910664 12961493 8 6.028
Es77 19109630 18543542 18481434 7 5.09
Es78 28110318 27193970 16837568 10 7.259
Es79 24555100 23970620 31452642 9 6.533
Es8 19255716 19073790 22388542 7 5.145
Es81 33159150 32492992 31627687 11 8.835
Es82 45850504 44985238 21734102 17 12.38
Es83 32122512 31556320 23302737 11 8.511
Es84 33761930 33185062 28487729 12 9.093
Es85 42102432 41144924 23403965 14 10.89
Es86 34425640 33781082 24279714 12 9.148
Es88 35372504 34663610 12757986 13 9.508
Es9 52656054 51891454 19301049 19 14.3
Es90 28153586 27644574 29561619 10 7.436
Es91 44200560 43198674 14151349 15 11.51
Es92 20809016 20403896 32735336 7 5.465
Es93 47853206 46897412 29290083 16 12.75
Es94 43242198 41346914 18287097 15 11.22
Es95 26648720 26146578 28688368 9 7.043
Es98 41969476 41182854 27263724 15 11.07
Es99 40913434 39840306 35852921 14 10.65
Ma10 35972254 33720408 14096090 12 9.054
Ma100 21174646 20628359 23145324 7 5.534
Ma11 29770832 27552932 18524082 10 7.219
Ma12 46805118 43320232 29134499 16 11.33
Ma13 31013372 29077521 19447888 10 7.577
Ma14 32765802 30007296 20185067 11 7.878
Ma15 43205898 42254988 29308771 15 11.44
Ma16 30737834 30107196 20861569 11 8.155
Ma17 36321380 35374966 24169040 13 9.448
Ma18 32571776 31768582 21359159 12 8.311
Ma19 31183094 30506166 21132118 12 8.243
Ma2 32690682 32006915 22460081 11 8.482
Ma20 33696042 32837345 23716598 12 8.776
Ma21 35054264 34199937 14682852 12 9.241
Ma22 22220316 21614280 28983909 8 5.743
Ma23 44742524 43469081 23537830 15 11.26
Ma24 36291160 35262593 23432029 12 9.151
Ma26 35010684 34169979 28284368 12 9.142
Ma27 41999202 40691185 21743474 15 11.02
Ma28 24569874 24134216 16597851 8 6.489
Ma29 25193562 23897136 16172698 9 6.305
Ma3 32899988 30787595 44837413 11 8.224
Ma30 24338894 22715176 15158153 8 5.907
Ma31 21151696 19937236 13651391 7 5.336
Ma32 19482620 18660549 12928659 7 5.05
Ma33 22868642 21318816 14407611 8 5.628
Ma34 22828234 21257104 14592359 8 5.7
Ma35 71198092 65892450 21125865 24 17.48
Ma36 24651192 23125476 15758170 8 6.151
Ma37 29838638 28111335 19468778 10 7.59
Ma38 26266092 24558030 16563217 9 6.469
Ma39 26345792 24407955 16459825 8 6.433
Ma4 32949882 30820274 20550074 11 8.043
Ma40 25568816 25102658 16438889 9 6.42
Ma41 20618614 20263058 13510875 7 5.276
Ma42 22470104 22001765 15336826 8 6.002
Ma43 18756080 18491450 12603442 7 4.918
Ma44 19599828 19300421 12588142 7 4.919
Ma45 23596312 23086879 15854990 8 6.191
Ma46 31368988 30720223 21597121 12 8.415
Ma47 22257408 21733934 15143030 8 5.917
Ma48 34971074 34040720 23474232 12 9.15
Ma49 25657780 25106765 17136937 9 6.687
Ma5 31679470 29709301 52889409 11 7.963
Ma50 29381020 28797211 19991317 11 7.797
Ma51 24996528 24275979 16458933 8 6.419
Ma52 19020982 18773071 12485116 7 4.878
Ma53 16370098 15652918 10778738 6 4.211
Ma54 19729638 18736764 12786947 7 4.995
Ma55 18011076 17099199 11863245 7 4.635
Ma56 83350182 79029881 20351179 28 20.58
Ma57 16883280 15748744 10427363 6 4.06
Ma58 23369348 21911826 14832591 8 5.78
Ma59 21696102 20659818 14374341 8 5.618
Ma6 30994642 29628330 61282471 11 7.823
Ma60 16555104 15565555 10568655 6 4.121
Ma61 100432596 93311932 50794001 34 23.77
Ma62 23997656 22573770 15023943 8 5.837
Ma63 21044170 19664016 13454619 7 5.246
Ma65 19732834 19414867 13274833 7 5.184
Ma66 16833504 16574188 11227040 6 4.39
Ma67 16386654 16108357 11138426 6 4.362
Ma68 17571914 17305149 11862782 6 4.624
Ma69 76948136 74890086 20032261 27 19.77
Ma70 16037502 15738933 10776848 6 4.209
Ma71 27917772 27392734 18600814 9 7.242
Ma72 86411268 83137589 56858199 30 22.11
Ma73 91231960 88147155 58115082 32 22.55
Ma74 19252758 18819002 13083207 7 5.11
Ma75 23405144 22878360 15910305 9 6.207
Ma76 115116858 109922177 74922772 39 29.04
Ma77 17666548 17342331 11452469 6 4.499
Ma78 16449834 15599252 10407603 5 4.095
Ma79 18573610 17560087 11990189 6 4.714
Ma8 30322186 28089596 9319213 10 7.216
Ma80 14913696 13956728 9332589 5 3.672
Ma81 14575396 13901059 11138354 5 3.673
Ma82 18159544 16790152 16247878 6 4.377
Ma83 25946494 24079240 13355689 9 6.377
Ma84 21024998 19822142 11664599 7 5.249
Ma85 18352642 17096025 11384340 6 4.589
Ma86 17762268 16748151 13754259 6 4.475
Ma87 21590900 20284462 12763055 7 5.376
Ma88 19879982 18612294 17375392 7 5.014
Ma89 25862412 25366282 18492361 9 6.792
Ma9 44369902 41397101 11728212 13 10.3
Ma90 17453778 17142954 11663477 6 4.618
Ma91 18322210 17914755 13738749 6 4.58
Ma92 20343822 19953506 8419177 7 5.384
Ma93 14323860 14083831 9870208 5 3.313
Ma94 14804668 14456817 11138572 5 3.888
Ma95 17216978 16840414 14155907 6 4.363
Ma96 22234480 21643363 14877649 7 5.557
Ma97 22526018 21981120 13413720 8 5.839
Ma98 20354482 19838714 16002396 7 5.262
Ma99 23852782 23305722 26442872 8 6.291
Pa1 60243382 59450108 18333710 20 15.94
Pa100 26602462 26104096 32889054 10 7.03
Pa11 49759506 49284300 31545829 17 12.99
Pa12 49515454 46769990 28089454 15 12.19
Pa13 42285096 41830546 20696926 15 11.13
Pa14 32989142 32701324 30813094 11 8.33
Pa16 45579976 45068866 16107485 17 12.22
Pa19 24420584 24194472 40194492 9 6.455
Pa2 36531706 36154302 13618710 12 8.896
Pa20 20511438 20247390 30162830 7 5.437
Pa23 44930008 44404870 26130738 16 11.82
Pa24 40645252 40228150 22312436 14 10.38
Pa26 34549326 33226914 52696952 12 8.646
Pa27 77384214 76529036 20204600 28 20.65
Pa28 30184750 29875392 34081682 11 8.057
Pa29 53929720 52855936 22572669 17 13.39
Pa3 37888218 37479694 15666187 14 10.11
Pa30 23993840 23682696 30395793 8 6.304
Pa31 46733610 44774524 12546769 16 11.73
Pa32 19356466 19129358 34464015 7 5.075
Pa33 52827460 49722606 19473322 18 13.24
Pa35 29559834 29226538 17218901 10 7.789
Pa36 26141528 25846496 11277504 9 6.868
Pa37 19162170 18960508 25501639 6 4.516
Pa40 19755870 19551468 12521513 7 5.022
Pa42 22362694 22111420 14922591 8 5.947
Pa43 24530240 24258806 15736232 9 6.285
Pa45 35916738 35120304 23502640 12 9.015
Pa46 21542876 21320198 14508348 8 5.764
Pa47 49411230 48918438 32598570 17 13.08
Pa48 21507666 21292398 14650546 8 5.819
Pa49 29225656 28915638 19666850 11 7.815
Pa5 35826966 35254300 17340444 12 9.546
Pa51 26004004 25707202 15754962 9 6.864
Pa52 22913172 22688068 33944577 8 6.245
Pa53 51068120 50521514 28785044 18 13.23
Pa55 44441132 43869116 13702478 15 11.39
Pa56 21541028 20974906 19090847 7 5.344
Pa57 28041314 27615390 22511887 10 7.435
Pa58 33945344 33322822 24031610 12 8.784
Pa6 56838584 55841530 31587749 18 13.6
Pa60 46570474 45760396 13372000 17 12.37
Pa61 19876762 19511110 15411834 7 5.19
Pa63 22698376 22373470 20744810 8 5.964
Pa64 30012588 29501062 16709361 11 8.005
Pa65 29687578 23603154 29507592 9 6.381
Pa66 43647846 42716116 23733317 15 11.54
Pa67 34999462 34435350 15364225 13 9.213
Pa68 25131536 24367434 21456590 8 6.078
Pa69 33301090 32202510 34223195 11 8.395
Pa71 25555534 24635718 16928928 9 6.596
Pa72 29257358 28331528 19719423 10 7.648
Pa74 19633000 19063638 13098753 7 5.129
Pa75 32502596 31390848 20179788 10 7.867
Pa76 20864234 20180554 13814901 7 5.438
Pa77 43924972 41301124 28823237 15 11.13
Pa78 23938378 23435256 16213434 9 6.255
Pa84 39124952 38399020 26575118 14 10.4
Pa85 20198940 19724706 13484744 7 5.267
Pa86 46365622 45274208 27946871 14 10.87
Pa87 20111366 19698146 13527328 7 5.223
Pa88 21349134 20883524 14489084 8 5.605
Pa89 37210256 36541958 24883165 13 9.728
Pa9 3631142 3555034 23637601 1 0.9504
Pa90 34879904 34208390 27248651 12 9.255
Pa91 41341096 40406556 24588504 14 10.74
Pa92 37732736 37072470 27254084 12 9.586
Pa94 40135848 39506334 26371077 14 10.65
Pa96 38622834 37794920 29833523 14 10.13
Pa98 44827144 42354106 2437054 15 11.51
Pi1 126209728 124152694 40347687 43 33.81
Pi11 59257972 58336970 40085320 21 15.91
Pi12 59289084 58371068 35430435 21 15.88
Pi13 53312146 49879504 22531611 19 13.6
Pi14 33769014 33374382 28198957 12 9.013
Pi15 42450704 39883204 29315387 14 10.83
Pi18 42680804 42205360 24251643 15 11.58
Pi19 35691502 34892644 85975061 12 9.406
Pi2 22553550 22314006 20752605 8 6.1
Pi20 31108652 30700780 29921496 11 8.187
Pi21 43402292 42901536 27746600 16 11.74
Pi26 41625746 41168210 21520401 14 10.92
Pi27 32163274 31455910 25526066 12 8.471
Pi28 37549268 36832882 41799471 13 9.932
Pi29 63402286 62405546 15392190 20 16.47
Pi3 52497426 51693384 13097097 18 14.13
Pi31 19991820 19753210 23402372 7 5.225
Pi33 35843476 35158552 30219251 12 9.175
Pi34 44587934 44050992 12265982 16 11.95
Pi36 18367676 18168712 27828257 7 4.898
Pi39 41854474 39284080 35856751 14 10.64
Pi40 24568056 24271430 16190936 9 6.424
Pi41 40990436 40596396 27147789 14 10.77
Pi42 46697272 46205044 31620325 17 12.5
Pi43 44912768 44471794 30472289 16 12.04
Pi44 40533872 40048658 27134968 14 10.74
Pi49 19270598 19053680 13160097 7 5.235
Pi5 50761310 48378310 25046756 18 13.23
Pi51 36936162 36577380 40636671 13 9.969
Pi52 61493942 57545752 23615702 21 15.61
Pi54 36906762 36515002 17672740 13 9.357
Pi55 25752042 25495688 16786489 9 6.987
Pi56 24503532 24237834 13529391 9 6.632
Pi57 19707468 19496376 17779501 7 5.371
Pi58 26749206 26426100 33678505 9 7.037
Pi59 49841470 49216518 34471820 16 13.19
Pi60 46764190 46290036 31930681 17 12.62
Pi62 26825230 26543578 18380945 10 7.242
Pi65 20252266 19854632 13641463 7 5.326
Pi67 39616570 38735992 27194540 14 10.54
Pi68 21918904 21560688 15023947 8 5.843
Pi7 28716170 28256384 21785258 10 7.634
Pi71 32131268 31334874 28418579 12 8.53
Pi72 41339422 40378142 12912992 14 11.08
Pi73 19254360 18803366 19363095 7 5.031
Pi74 28053664 27521002 17416467 10 7.513
Pi75 25990434 25297986 38390807 9 6.67
Pi76 55052636 53861462 27505039 20 14.8
Pi77 40877364 39919814 31619713 15 10.67
Pi78 47447234 45872214 13531617 15 12.3
Pi79 19851996 19134168 19232296 7 5.218
Pi80 48471422 46463368 32474939 16 12.63
Pi81 50144184 48004894 33107210 17 12.86
Pi82 19404730 18769194 13223906 7 5.15
Pi83 25792960 24958280 17432648 9 6.776
Pi84 21054116 20382084 14294121 7 5.565
Pi85 45206894 44137246 30041041 16 11.62
Pi86 33176064 32416426 22105921 12 8.635
Pi87 55116976 52091732 35865276 17 13.68
Pi88 43033538 42053842 29147513 15 11.31
Pi89 23918210 23153988 16049089 8 6.158
Pi9 49478734 41192286 38631083 15 11.09
Pi90 57551796 55511194 22897846 19 14.85
Pi92 35065238 33749528 13142225 12 8.795
Pi93 19084122 18580272 25116582 7 5.072
Pi94 37765898 36551884 44193 13 9.709
Pi95 66804 63748 23242221 0 0.01732
Pi96 34191228 33383886 35582820 12 8.894
Pi97 52568714 51474984 31160913 18 13.81
Pi98 47124642 45809504 29029917 15 11.88
TABLE S5 Comparison of the population genomic estimates among the four marker types (nuclear microsatellites, single-nucleotide polymorphisms (SNPs), SNPs excluding transposable element sequences (non-TE SNPs), and transposable elements (TEs)), tested with a one-way ANOVA for (a) Ho, (b) He, and (c) FIS, and (d) the results of the Mantel tests among the pairwise FST. Each ANOVA was followed by a Tukey HSD test comparing the markers, which gives the difference between markers, its lower and upper limits (95% confidence interval), and the p value adjusted for multiple comparisons.
a) Observed heterozygosity (Ho): One-way ANOVA (p = 0.22)
Comparison | Difference | Lower | Upper | Adjusted p value
Microsatellites vs non-TE SNPs | 0.005867105 | -0.1234798 | 0.13521398 | 0.9979363
SNPs vs non-TE SNPs | 0.005000000 | -0.1243469 | 0.13434687 | 0.9987177
TEs vs non-TE SNPs | -0.057500000 | -0.1868469 | 0.07184687 | 0.3515078
SNPs vs Microsatellites | -0.000867105 | -0.1302140 | 0.12847977 | 0.9999932
TEs vs Microsatellites | -0.063367105 | -0.1927140 | 0.06597977 | 0.2761637
TEs vs SNPs | -0.062500000 | -0.1918469 | 0.06684687 | 0.2865002
b) Expected heterozygosity (He): One-way ANOVA (p < 0.05)
Comparison | Difference | Lower | Upper | Adjusted p value
Microsatellites vs non-TE SNPs | 0.1875 | 0.02505854 | 0.40005854 | 0.0223162
SNPs vs non-TE SNPs | 0.0050 | -0.20755854 | 0.21755854 | 0.9997091
TEs vs non-TE SNPs | 0.1025 | -0.11005854 | 0.31505854 | 0.2880295
SNPs vs Microsatellites | -0.1825 | -0.39505854 | 0.03005854 | 0.0261973
TEs vs Microsatellites | -0.0850 | -0.29755854 | 0.12755854 | 0.4374602
TEs vs SNPs | 0.0975 | -0.11505854 | 0.31005854 | 0.3266417
c) Pairwise genetic differentiation (FST): One-way ANOVA (p < 0.0001)
Comparison | Difference | Lower | Upper | Adjusted p value
Microsatellites vs non-TE SNPs | 0.3025 | 0.1546838 | 0.4503162 | 0.0000205
SNPs vs non-TE SNPs | -0.0425 | -0.1903162 | 0.1053162 | 0.6856081
TEs vs non-TE SNPs | 0.5750 | 0.4271838 | 0.7228162 | 0.0000000
SNPs vs Microsatellites | -0.3450 | -0.4928162 | -0.1971838 | 0.0000053
TEs vs Microsatellites | 0.2725 | 0.1246838 | 0.4203162 | 0.0000580
TEs vs SNPs | 0.6175 | 0.4696838 | 0.7653162 | 0.0000000
d) Mantel tests among pairwise FST
Comparison | Mantel r | p value
SNPs vs non-TE SNPs | 0.99 | 0.035
SNPs vs TEs | 0.29 | 0.416
SNPs vs microsatellites | 0.54 | 0.263
Non-TE SNPs vs TEs | 0.32 | 0.346
Non-TE SNPs vs microsatellites | 0.51 | 0.268
TEs vs microsatellites | 0.74 | 0.080
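For readers unfamiliar with the Mantel test, the sketch below shows the principle: the Pearson correlation between the off-diagonal entries of two distance matrices is compared with the correlations obtained after permuting the population labels of one matrix. It is only an illustration under stated assumptions. The two FST matrices are hypothetical (they are not the data behind the table), the test is one-sided, and all permutations are enumerated exhaustively, which may differ from the permutation procedure used to obtain the p values reported above.

```python
from itertools import combinations, permutations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries after relabelling rows and columns by `order`."""
    n = len(matrix)
    return [matrix[order[i]][order[j]] for i, j in combinations(range(n), 2)]


def mantel_exact(a, b):
    """Return (Mantel r, one-sided p) using every permutation of the labels."""
    n = len(a)
    identity = tuple(range(n))
    xa = upper_triangle(a, identity)
    r_obs = pearson(xa, upper_triangle(b, identity))
    perms = list(permutations(range(n)))
    # Count permutations with a correlation at least as large as observed;
    # the small tolerance guards against floating-point ties.
    hits = sum(1 for p in perms
               if pearson(xa, upper_triangle(b, p)) >= r_obs - 1e-12)
    return r_obs, hits / len(perms)


# Hypothetical pairwise FST matrices for four populations.
fst_a = [[0.00, 0.05, 0.10, 0.30],
         [0.05, 0.00, 0.12, 0.28],
         [0.10, 0.12, 0.00, 0.33],
         [0.30, 0.28, 0.33, 0.00]]
fst_b = [[0.00, 0.07, 0.09, 0.25],
         [0.07, 0.00, 0.15, 0.31],
         [0.09, 0.15, 0.00, 0.29],
         [0.25, 0.31, 0.29, 0.00]]

r, p = mantel_exact(fst_a, fst_b)
print(f"Mantel r = {r:.2f}, p = {p:.3f}")
```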
TABLE S6 Gene Ontology analysis (TopGO 2.28.9) of the non-synonymous single-nucleotide polymorphisms excluding transposable element sequences (non-TE SNPs) in alpine populations of Arabis alpina. Significance is based on the TopGO Fisher test with the weight01 algorithm; significant terms (p < 0.01) are highlighted. Only terms with p < 0.05 in the TopGO Fisher test are shown in the table. The significance of the difference in mutational target sites per GO category from the total is based on a Wilcoxon rank-sum test.
GO ID | Term | Annotated | Significant | Expected | TopGO Fisher | GOslim description | Wilcoxon rank-sum test (Bonferroni correction) | Wilcoxon p < 0.01 (significantly longer) | Length difference from total
GO:0006468 | protein phosphorylation | 1171 | 504 | 393.16 | 6.20E-10 | cellular protein modification process, protein metabolic process | 3.72E-125 | * | 965.63
GO:0007169 | transmembrane receptor protein tyrosine | 183 | 96 | 61.44 | 9.70E-08 | cell communication, signal transduction | 1.63E-34 | * | 1300.19
GO:0009626 | plant-type hypersensitive response | 145 | 69 | 48.68 | 1.80E-05 | response to stress, cell death | 2.73E-22 | * | 1378.94
GO:0007165 | signal transduction | 1684 | 623 | 565.4 | 7.30E-05 | cell communication, signal transduction | 6.37E-103 | * | 908.13
GO:0046777 | protein autophosphorylation | 207 | 96 | 69.5 | 8.20E-05 | cellular protein modification process, protein metabolic process | 9.04E-24 | * | 841.98
GO:0006915 | apoptotic process | 145 | 69 | 48.68 | 9.40E-05 | biological process, cell death | 3.05E-14 | * | 1087.51
GO:0002764 | immune response-regulating signaling pat... | 9 | 9 | 3.02 | 1.60E-04 | cell communication, signal transduction | 2.89E-03 | * | 1247.62
GO:0006952 | defense response | 1440 | 547 | 483.48 | 1.70E-04 | response to stress | 3.39E-80 | * | 812.40
GO:0009987 | cellular process | 11415 | 3815 | 3832.57 | 2.00E-04 | biological process | 0.00E+00 | |
GO:0009816 | defense response to bacterium | 122 | 60 | 40.96 | 2.50E-04 | response to stress, response to external stimulus, response to biotic stimulus | 1.65E-14 | * | 971.54
GO:0045087 | innate immune response | 594 | 245 | 199.43 | 2.70E-04 | response to stress | 1.49E-47 | * | 970.58
GO:0006508 | proteolysis | 825 | 295 | 276.99 | 3.40E-04 | biological process, protein metabolic process | 6.76E-29 | * | 630.82
GO:0009435 | NAD biosynthetic process | 12 | 10 | 4.03 | 5.80E-04 | nucleobase-containing compound metabolism, biosynthetic process | 2.96E-01 | |
GO:0048544 | recognition of pollen | 38 | 23 | 12.76 | 5.90E-04 | reproduction, cell communication, pollination, pollen-pistil interaction | 4.64E-08 | * | 1175.25
GO:0080156 | mitochondrial mRNA modification | 18 | 13 | 6.04 | 9.20E-04 | nucleobase-containing compound metabolism | 4.99E-02 | |
GO:0010359 | regulation of anion channel activity | 16 | 11 | 5.37 | 1.75E-03 | transport | 9.38E-05 | * | 1311.38
GO:0016045 | detection of bacterium | 19 | 13 | 6.38 | 2.01E-03 | biological process, response to external stimulus, response to biotic stimulus | 6.76E-03 | * | 894.83
GO:0006874 | cellular calcium ion homeostasis | 36 | 20 | 12.09 | 2.42E-03 | biological process, cellular homeostasis | 9.89E-05 | * | 1048.59
GO:0051026 | chiasma assembly | 24 | 15 | 8.06 | 3.47E-03 | reproduction, nucleobase-containing compound metabolism, DNA metabolic process, cell cycle, cellular component organization | 7.37E-02 | |
GO:0010204 | defense response signaling pathway | 35 | 20 | 11.75 | 3.51E-03 | response to stress, cell communication, signal transduction | 1.16E-07 | * | 1265.34
GO:0007166 | cell surface receptor signaling pathway | 192 | 103 | 64.46 | 8.31E-03 | cell communication, signal transduction | 4.19E-36 | * | 1263.50
GO:0000914 | phragmoplast assembly | 9 | 7 | 3.02 | 8.64E-03 | cell cycle, cellular component organization | 1.00E+00 | |
GO:0006955 | immune response | 605 | 253 | 203.13 | 8.72E-03 | biological process | 1.36E-49 | * | 967.48
GO:0002239 | response to oomycetes | 33 | 17 | 11.08 | 8.96E-03 | biological process, response to external stimulus, response to biotic stimulus | 5.22E-05 | * | 1195.41
GO:0050821 | protein stabilization | 15 | 10 | 5.04 | 8.97E-03 | biological process | 1.00E+00 | |
GO:0050896 | response to stimulus | 6644 | 2200 | 2230.72 | 1.12E-02 | biological process | 6.93E-221 | * | 656.86
GO:0006855 | drug transmembrane transport | 83 | 38 | 27.87 | 1.37E-02 | transport | 5.86E-08 | * | 922.00
GO:0046854 | phosphatidylinositol phosphorylation | 18 | 11 | 6.04 | 1.52E-02 | lipid metabolic process | 2.28E-06 | * | 2761.20
GO:0009624 | response to nematode | 165 | 69 | 55.4 | 1.61E-02 | biological process, response to external stimulus, response to biotic stimulus | 1.16E-06 | * | 417.73
GO:0010431 | seed maturation | 65 | 31 | 21.82 | 1.73E-02 | reproduction, multicellular organism development, post-embryonic development | 3.13E-03 | * | 412.39
GO:0016556 | mRNA modification | 27 | 20 | 9.07 | 2.02E-02 | nucleobase-containing compound metabolism | 1.66E-04 | * | 619.74
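To make the "Expected" and "TopGO Fisher" columns easier to read, the sketch below computes the expected number of significant genes in a GO category and a classic one-sided Fisher (hypergeometric) enrichment p value. It is a simplified illustration: it does not implement the weight01 algorithm, which additionally accounts for the GO graph structure, and the gene counts in the example are hypothetical because the total numbers of genes are not given in the table.

```python
from math import comb


def expected_count(annotated, n_significant_total, n_total):
    """Expected number of significant genes in a category under no enrichment."""
    return annotated * n_significant_total / n_total


def fisher_upper_tail(observed, annotated, n_significant_total, n_total):
    """P(X >= observed) for a hypergeometric draw (one-sided Fisher test).

    n_total genes in the universe, `annotated` of them in the category,
    n_significant_total significant genes overall, `observed` of which
    fall in the category.
    """
    max_k = min(annotated, n_significant_total)
    denominator = comb(n_total, n_significant_total)
    numerator = sum(
        comb(annotated, k) * comb(n_total - annotated, n_significant_total - k)
        for k in range(observed, max_k + 1)
    )
    return numerator / denominator


# Hypothetical example: 10 genes in total, 5 annotated to the category,
# 4 significant genes overall, 3 of them in the category.
print(expected_count(5, 4, 10))        # 2.0
print(fisher_upper_tail(3, 5, 4, 10))  # 55/210, about 0.262
```

Under these definitions, a term is enriched when the observed count in the "Significant" column exceeds the "Expected" column by more than chance would allow.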
TABLE S7 Gene Ontology analysis (TopGO 2.28.9) showing enrichment for loci of the polymorphic transposable elements (TEs) in alpine populations of Arabis alpina. Significance is based on the TopGO Fisher test with the weight01 algorithm; significant terms (p < 0.01) are highlighted. Only terms with p < 0.05 in the TopGO Fisher test are shown in the table. The significance of the difference in mutational target sites per GO category from the total is based on a Wilcoxon rank-sum test.
GO ID Term Anno
tated Signifi
cant Expected TopGO
Fisher GOslim description
Wilcoxon- rank sum test (Bonferroni correction)
Wilcox on p<0.01 (Signifi
cantly longer)
Length difference from total GO:0009611 response to
wounding 326 105 68.13 0.000003
5 response to stress 1.09E-01
GO:0009744 response to
sucrose 76 31 15.88 0.000056 biological process 1.00E+00
GO:0080129 proteasome core complex assembly
15 10 3.13 0.00017 cellular component
organization 1.00E+00 GO:0080167 response to
karrikin 276 82 57.68 0.00031 response to abiotic
stimulus 3.40E-02 GO:0006979 response to
oxidative stress 597 162 124.77 0.00035 response to stress 1.00E+00 GO:0051788 response to
misfolded protein
16 10 3.34 0.00036 response to stress 1.00E+00
GO:0050832 defense response to
fungus
361 109 75.45 0.00161 response to external stimulus, response to
biotic stimulus
5.66E-08 * 639.38
GO:0009817 defense response to
fungus, incompatible
interaction
123 40 25.71 0.00164 response to external stimulus, response to
biotic stimulus
1.77E-02
GO:0019752 carboxylic acid metabolic
process
1126 263 235.33 0.00185 cellular process 3.44E-15 * 641.75
GO:0042742 defense response to
bacterium
576 157 120.38 0.00192 response to external stimulus, response to
biotic stimulus
1.20E-05 * 658.55
GO:0015864 pyrimidine nucleoside transport
6 5 1.25 0.00197 transport 1.00E+00
GO:0009753 response to
jasmonic acid 322 93 67.3 0.00248 response to endogenous
stimulus 1.00E+00
GO:0031408 oxylipin biosynthetic
process
11 7 2.3 0.00257 lipid biosynthetic
process 1.00E+00
GO:0007568 aging 193 53 40.34 0.00483 biological process 1.53E-02
GO:0001666 response to
hypoxia 74 22 15.47 0.00501 response to abiotic
stimulus 1.00E+00
GO:0009821 alkaloid biosynthetic
process
15 8 3.13 0.00562 biosynthetic process 1.00E+00
GO:0006094 gluconeogenesi
s 7 5 1.46 0.00571 carbohydrate
biosynthetic process 1.44E-01
21
GO:0006419 alanyl-tRNA
aminoacylation 7 5 1.46 0.00571 nucleobase-containing compound metabolism,
translation, protein metabolic process
9.85E-01
GO:0006537 glutamate biosynthetic
process
7 5 1.46 0.00571 biosynthetic process 8.56E-02
GO:0006857 oligopeptide
transport 81 27 16.93 0.00739 transport 3.58E-05 * 1003.34
GO:0051567 histone H3-K9
methylation 13 7 2.72 0.00898 cellular protein modification process,
cellular component organization, protein
1.00E+00
GO:0048455 stamen
formation 16 8 3.34 0.0092 reproduction,
multicellular organism development, anatomical structure morphogenesis, post-
embryonic development, flower
development
1.00E+00
GO:0006508 proteolysis 825 199 172.42 0.00996 protein metabolic
process 8.72E-13 * 732.98
GO:0009626 plant-type hypersensitive response 145 39 30.3 0.01173 response to stress, cell death 1.32E-03 * 1238.27
GO:0009749 response to glucose 40 14 8.36 0.01183 biological process 9.68E-02
GO:0006109 regulation of carbohydrate metabolic process 39 11 8.15 0.01261 carbohydrate metabolic process 1.00E+00
GO:0009116 nucleoside metabolic process 68 21 14.21 0.01324 nucleobase-containing compound metabolism 1.00E+00
GO:0006629 lipid metabolic process 936 210 195.62 0.01428 lipid metabolic process 5.02E-12 * 648.52
GO:0043170 macromolecule metabolic process 6600 1297 1379.38 0.01507 metabolic process 7.97E-52 * 611.01
GO:0051258 protein polymerization 80 20 16.72 0.01564 cellular component organization 8.05E-02
GO:0048767 root hair elongation 101 30 21.11 0.01607 anatomical structure morphogenesis, cell growth, cell differentiation, growth 2.83E-02
GO:0009408 response to heat 305 73 63.74 0.01896 response to abiotic stimulus 1.00E+00
GO:0019853 L-ascorbic acid biosynthetic process 21 9 4.39 0.01903 carbohydrate biosynthetic process 1.00E+00
GO:0009911 positive regulation of flower development 57 19 11.91 0.0196 reproduction, multicellular organism development, post-embryonic development, flower development 1.04E-01
GO:0010380 regulation of chlorophyll biosynthetic process 18 8 3.76 0.021 biosynthetic process 1.00E+00
GO:0080119 ER body organization 15 7 3.13 0.02276 cellular component organization 5.42E-03 * 2244.78
GO:0010555 response to mannitol 9 5 1.88 0.02352 biological process 4.63E-01
GO:0032957 inositol trisphosphate metabolic process 9 5 1.88 0.02352 cellular process 1.00E+00
GO:0080149 sucrose induced translational repression 9 5 1.88 0.02352 translation, response to stress, biosynthetic process, protein metabolic process 1.00E+00
GO:0009696 salicylic acid metabolic process 50 15 10.45 0.02385 cellular process 1.00E+00
GO:0006605 protein targeting 167 33 34.9 0.02406 transport 3.36E-02
GO:0009723 response to ethylene 248 67 51.83 0.02656 response to endogenous stimulus 9.76E-01
GO:0043085 positive regulation of catalytic activity 42 16 8.78 0.02962 biological process 1.00E+00
GO:0015992 proton transport 72 16 15.05 0.02992 transport 1.00E+00
GO:0009409 response to cold 515 124 107.63 0.033 response to abiotic stimulus 1.25E-01
GO:0048830 adventitious root development 13 6 2.72 0.03666 multicellular organism development 7.11E-01
GO:0007018 microtubule-based movement 100 29 20.9 0.03875 cellular process 2.96E-07 * 1824.3
GO:0034220 ion transmembrane transport 248 48 51.83 0.03937 transport 8.94E-03 * 636.76
GO:0030048 actin filament-based movement 20 8 4.18 0.04076 cellular process 6.01E-05 * 6643.41
GO:0071395 cellular response to jasmonic acid stimu... 87 26 18.18 0.04351 response to endogenous stimulus 1.00E+00
GO:0009856 pollination 241 55 50.37 0.04359 reproduction 1.92E-04 * 903.11
GO:0031349 positive regulation of defense response 105 23 21.94 0.04368 response to stress 6.16E-01
GO:0043254 regulation of protein complex assembly 36 8 7.52 0.0437 cellular component organization 1.60E-01
GO:0009862 systemic acquired resistance, salicylic acid mediated signaling pathway 54 17 11.29 0.04496 response to stress, cell communication, signal transduction, response to external stimulus, response to biotic stimulus 2.90E-01
GO:0009407 toxin catabolic process 58 18 12.12 0.04552 catabolic process, secondary metabolic process 1.00E+00
GO:0006694 steroid biosynthetic process 66 19 13.79 0.04655 lipid biosynthetic process 1.88E-02
GO:0042127 regulation of cell proliferation 54 15 11.29 0.04655 cellular process 1.00E+00
GO:0010043 response to zinc ion 111 31 23.2 0.04781 biological process 1.00E+00
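
As a reading aid for the count columns of the table, the minimal sketch below assumes that the three numbers following each term name are the annotated, significant and expected gene counts, as in standard topGO output. The column headers lie outside this part of the table, so this interpretation is an assumption, and all function and variable names are illustrative only. Under that assumption, the expected count of a term is its annotated count multiplied by the overall fraction of significant genes, which is back-calculated here from the GO:0043170 row rather than taken from the study.

```python
# Assumption: each tuple is (annotated, significant, expected), copied from the table rows.
rows = {
    "GO:0043170": (6600, 1297, 1379.38),
    "GO:0007568": (193, 53, 40.34),
    "GO:0006508": (825, 199, 172.42),
    "GO:0001666": (74, 22, 15.47),
}


def significant_fraction(annotated, expected):
    """Overall fraction of significant genes implied by one table row."""
    return expected / annotated


def expected_count(annotated, fraction):
    """Expected number of significant genes for a term with `annotated` genes."""
    return annotated * fraction


# Back-calculate the fraction from the largest term, then compare with the other rows.
fraction = significant_fraction(6600, 1379.38)
for go_id, (annotated, significant, expected) in rows.items():
    estimate = expected_count(annotated, fraction)
    fold = significant / estimate  # observed / expected ratio
    print(go_id, round(estimate, 2), expected, round(fold, 2))
```

A ratio of significant to expected counts above 1 indicates over-representation of the term among the significant genes; a ratio below 1, as for GO:0043170, indicates under-representation.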
FIGURE S1 Geographic locations of the four sampled populations of Arabis alpina in the western Swiss Alps. 1) La Para (N = 70), 2) Pierredar (N = 70), 3) Les Essets (N = 70), 4) Les Martinets (N = 96). The aerial pictures are from SWISSIMAGE, 25 cm, 2016 (https://shop.swisstopo.admin.ch/en/products/images/ortho_images/SWISSIMAGE).
FIGURE S2 Flowchart of the steps involved in TEPID (Stuart et al., 2016) and in the post-processing of the raw polymorphic TEs for accurate genotyping of TEs in natural populations.
FIGURE S3 Site frequency spectra for the four regions, based on (a) the single-nucleotide polymorphisms excluding transposable element sequences (non-TE SNPs) and (b) the polymorphic transposable elements (TEs).
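
For readers unfamiliar with the term, a site frequency spectrum counts how many polymorphic sites carry the non-reference allele in exactly k copies within a sample. The sketch below is a minimal illustration under stated assumptions, not the procedure used for this figure: genotypes are assumed to be diploid and coded as the number of non-reference alleles per individual (0, 1 or 2), there are no missing data, and the function name and toy genotypes are invented for the example.

```python
from collections import Counter


def site_frequency_spectrum(genotypes):
    """Unfolded site frequency spectrum from diploid genotype counts.

    genotypes: list of sites; each site is a list with one entry per
    individual giving the number of non-reference alleles (0, 1 or 2).
    Returns {allele_count: number_of_sites} for polymorphic sites only.
    """
    spectrum = Counter()
    for site in genotypes:
        alt_copies = sum(site)
        total_copies = 2 * len(site)
        if 0 < alt_copies < total_copies:  # skip monomorphic and fixed sites
            spectrum[alt_copies] += 1
    return dict(sorted(spectrum.items()))


# Toy data (invented): six sites scored in four individuals.
toy_sites = [
    [0, 1, 0, 0],  # one copy
    [1, 1, 0, 0],  # two copies
    [0, 0, 0, 2],  # two copies
    [2, 2, 2, 2],  # fixed, excluded
    [0, 0, 0, 0],  # monomorphic, excluded
    [0, 1, 0, 0],  # one copy
]
sfs = site_frequency_spectrum(toy_sites)
print(sfs)
```

For presence/absence markers such as polymorphic TEs, the same counting applies once each individual is coded by its number of copies carrying the insertion.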
FIGURE S4 Shared and private variants of the single-nucleotide polymorphisms, including those identified within transposable elements, among individuals of Arabis alpina sampled in the four study regions.
FIGURE S5 Density of single-nucleotide polymorphisms (SNPs) and SNPs excluding transposable element sequences (non-TE SNPs) along the eight chromosomes of Arabis alpina. The blue arrow highlights differences between the two SNP sets; SNPs were lost mainly around the centromeres.
FIGURE S6 Efficiency of the ‘tepid-refine’ algorithm. This procedure examines in detail the genomic region of a polymorphic transposable element (TE) identified in another sample and calls the same variant for samples with lower read count thresholds in order to reduce false negative variant calls within a group of related samples. a) Number of TE-absence variants identified versus sequencing depth of each individual. b) Number of additional TE-absence calls due to the TEPID refinement step versus sequencing depth of all individuals. c) Proportion of refined TE-absence calls due to the TEPID refinement step for each individual in the population with respect to TE-absence calls before refinement. d) Number of TE-presence variants identified versus the sequencing depth of each individual. e) Number of additional TE-presence calls due to the TEPID refinement step versus sequencing depth of all individuals. f) Proportion of refined TE-presence calls due to the TEPID refinement step for each individual in the population with respect to TE-presence calls before refinement. TE-absence and TE-presence calls are absence and presence of TEs with respect to the reference genome.
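
The refinement idea described in this caption, and the quantities plotted in panels b, c, e and f, can be illustrated with a toy sketch. It is not TEPID code: the two read-count thresholds, the sample names, the variant names and the read counts are all invented for illustration, and the real tool works on alignments rather than on precomputed counts.

```python
# Toy illustration only: thresholds and read counts are invented, not TEPID defaults.
STRICT_MIN_READS = 5   # hypothetical threshold for an initial call
RELAXED_MIN_READS = 2  # hypothetical lower threshold used during refinement


def initial_calls(support):
    """support: {sample: {variant: supporting_reads}} -> {sample: set of variants}."""
    return {sample: {v for v, n in variants.items() if n >= STRICT_MIN_READS}
            for sample, variants in support.items()}


def refine_calls(support, calls):
    """Add variants called in another sample when the relaxed threshold is met."""
    refined = {}
    for sample, variants in support.items():
        called_elsewhere = set().union(
            *(c for other, c in calls.items() if other != sample))
        rescued = {v for v in called_elsewhere
                   if variants.get(v, 0) >= RELAXED_MIN_READS}
        refined[sample] = calls[sample] | rescued
    return refined


support = {
    "ind1": {"TE_a": 8, "TE_b": 6},
    "ind2": {"TE_a": 3, "TE_b": 1, "TE_d": 9},
    "ind3": {"TE_a": 7, "TE_c": 2},
}
before = initial_calls(support)
after = refine_calls(support, before)

for sample in support:
    additional = len(after[sample]) - len(before[sample])  # as in panels b and e
    proportion = additional / len(before[sample])          # as in panels c and f
    print(sample, len(before[sample]), additional, proportion)
```

In this toy example, TE_a is rescued in ind2 because it was called in other individuals and has relaxed support there, whereas TE_c in ind3 is not called because no individual supports it at the strict threshold.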