Genome-wide variation in nucleotides and retrotransposons in alpine populations of Arabis alpina (Brassicaceae)

Academic year: 2022

Supplemental Information for:

Genome-wide variation in nucleotides and retrotransposons in alpine populations of Arabis alpina (Brassicaceae)

Aude Rogivue, Rimjhim R. Choudhury, Stefan Zoller, Stéphane Joost, François Felber, Michel Kasser, Christian Parisod, Felix Gugerli

Table of Contents:

Appendix S1

TABLE S1

TABLE S2 (separate text file)

TABLE S3

TABLE S4

TABLE S5

TABLE S6

TABLE S7

FIGURE S1

FIGURE S2

FIGURE S3

FIGURE S4

FIGURE S5

FIGURE S6

Appendix S1. Mapping and TE annotation

Mapping

Sequences that passed quality control were mapped against the reference genome V5.1 of Jiao et al. (2017). We did not use all contigs but filtered them by size and gene content: the eight chromosomes and 728 of the 918 contigs (685 contigs > 10 kb and 43 contigs < 10 kb containing genes) were retained, representing 335,551,605 bp. For the SNP dataset, mapping was performed with BWA v.0.7.12 (Li & Durbin, 2010) with a minimum output score of 10 and all other options left at their defaults. Mapped reads were sorted with SAMTOOLS v.1.2 (Li et al., 2009).
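The contig selection rule described above can be sketched as follows. This is an illustration only: the `Contig` record, the function names and the example contigs are hypothetical, and the treatment of a contig of exactly 10 kb (kept only if it contains genes) is an assumption, since the text does not cover that case.

```python
from dataclasses import dataclass

MIN_CONTIG_LENGTH = 10_000  # 10 kb size threshold stated in the text


@dataclass(frozen=True)
class Contig:
    """Hypothetical record for one contig of the assembly."""
    name: str
    length: int       # contig length in bp
    gene_count: int   # number of annotated genes on the contig


def keep_contig(contig: Contig) -> bool:
    """Keep contigs longer than 10 kb, or shorter ones that carry genes."""
    return contig.length > MIN_CONTIG_LENGTH or contig.gene_count > 0


def select_contigs(contigs: list[Contig]) -> list[Contig]:
    """Return the contigs that pass the size / gene-content filter."""
    return [c for c in contigs if keep_contig(c)]


# Made-up example contigs, not taken from the assembly.
example = [
    Contig("contig_a", 25_000, 0),  # long, no genes: kept
    Contig("contig_b", 4_000, 2),   # short, with genes: kept
    Contig("contig_c", 4_000, 0),   # short, no genes: dropped
]
selected = select_contigs(example)
print([c.name for c in selected])

# The two retained classes reported in the text add up to the stated total.
assert 685 + 43 == 728
```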

Transposable element annotation

Reliable assessment of transposable element (TE) polymorphisms at the population level depends on high-quality TE annotation, such as that generated for LTR retrotransposons (LTR-RTs) in Arabis alpina V3 (Willing et al., 2015). No such annotation has so far been released for the latest, highly contiguous assembly of the A. alpina reference genome V5 (Jiao et al., 2017). To avoid biases resulting from the use of TE annotations from previous versions, as well as those related to the transfer of annotation of high-copy TEs, LTR-RTs were re-annotated de novo in the reference sequence of A. alpina (V5.1). Following Choudhury, Neuhaus, and Parisod (2017), full-length LTR-RT copies were identified through their structural features with LTRharvest v.1.5.7 (Ellinghaus, Kurtz, & Willhoeft, 2008) and LTRdigest v.1.5.7 (Steinbiss, Willhoeft, Gremme, & Kurtz, 2009). After removal of nested and overlapping predictions, copies were clustered into families using CD-HIT-EST v.4.6.7 (Li & Godzik, 2006) when their LTR alignment covered at least 80% of the sequence length with an identity of at least 80%, following Wicker et al. (2007).

Characterized LTR-RT families were then classified into phylogenetically defined tribes using BLASTN of their reverse transcriptase (RT) coding sequence against the RT sequences (± 500bp) extracted from the V3 LTR-RT library (80% identity over 80% length). Families that remained unclassified were assigned to a tribe by BLASTN using RT sequences (± 500bp) extracted from Repbase Brassicaceae LTR-RTs.

All structurally defined and classified LTR-RTs were mapped on the A. alpina reference assembly V5.1 using RepeatMasker (version Open-4; Smit, Hubley, & Green, 2013) with RM-BLAST as search engine and divergence set to 20%. LTRs from each family were aligned using MUSCLE v3.8 (Edgar, 2004) and used to identify solo LTRs through Hidden Markov Model (HMM) profiles built with HMMBuild from the HMMER package (hmmer.org). nHMMER was used to search the entire genome assembly with the HMMs from all LTR clusters and identify potential LTRs, including solo LTRs. The resulting annotation was filtered with a custom script to remove nested LTR-RTs and annotations shorter than 80 bp, and is referred to as reference TEs (Table S1 in Supplementary Information).
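The final filtering step is described only as a custom script. The sketch below shows one way such a filter could work; it is not the authors' code. It assumes that annotations are 1-based inclusive intervals and that "nested" means fully contained within another annotation on the same sequence. The `Annotation` type, the function name and the example intervals are hypothetical.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """Hypothetical TE annotation record (1-based, inclusive coordinates)."""
    seqid: str
    start: int
    end: int
    family: str


def filter_annotations(annotations, min_length=80):
    """Drop annotations shorter than min_length bp, then drop nested ones.

    An annotation counts as nested when it lies entirely inside another
    annotation on the same sequence; exact duplicates are treated as nested.
    """
    long_enough = [a for a in annotations if a.end - a.start + 1 >= min_length]
    # Sort so that a containing interval always precedes what it contains.
    long_enough.sort(key=lambda a: (a.seqid, a.start, -a.end))

    kept = []
    current_seq = None
    max_end = -1  # largest end coordinate seen so far on current_seq
    for a in long_enough:
        if a.seqid != current_seq:
            current_seq, max_end = a.seqid, -1
        if a.end <= max_end:
            continue  # fully contained in an earlier annotation
        kept.append(a)
        max_end = a.end
    return kept


# Made-up example intervals for illustration.
example_annotations = [
    Annotation("chr1", 100, 5000, "A"),
    Annotation("chr1", 200, 400, "B"),    # nested in A: removed
    Annotation("chr1", 4900, 6000, "C"),  # overlaps A but is not nested: kept
    Annotation("chr1", 7000, 7050, "D"),  # 51 bp, shorter than 80 bp: removed
    Annotation("chr2", 150, 300, "E"),
]
print([a.family for a in filter_annotations(example_annotations)])
```

Partially overlapping annotations are deliberately kept here; whether the original script resolved them differently is not stated in the text.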

References

Choudhury, R. R., Neuhaus, J.-M., & Parisod, C. (2017). Resolving fine-grained dynamics of retrotransposons: comparative analysis of inferential methods and genomic resources. Plant Journal, 90, 979–993.

Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 32, 1792–1797.

Ellinghaus, D., Kurtz, S., & Willhoeft, U. (2008). LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics, 9, 18.

Jiao, W.-B., Accinelli, G. G., Hartwig, B., Kiefer, C., Baker, D., Severing, E., … Schneeberger, K. (2017). Improving and correcting the contiguity of long-read genome assemblies of three plant species using optical mapping and chromosome conformation capture data. Genome Research, 27, 778–786.

Li, H., & Durbin, R. (2010). Fast and accurate long-read alignment with Burrows–Wheeler transform. Bioinformatics, 26, 589–595.

Li, W., & Godzik, A. (2006). Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics, 22, 1658–1659.

Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., … Durbin, R. (2009). The sequence alignment/map format and SAMtools. Bioinformatics, 25, 2078–2079.

Smit, A., Hubley, R., & Green, P. (2013). RepeatMasker Open-4.0. 2013–2015. http://www.repeatmasker.org.

Steinbiss, S., Willhoeft, U., Gremme, G., & Kurtz, S. (2009). Fine-grained annotation and classification of de novo predicted LTR retrotransposons. Nucleic Acids Research, 37, 7002–7013.

Wicker, T., Sabot, F., Hua-Van, A., Bennetzen, J. L., Capy, P., Chalhoub, B., … Schulman, A. H. (2007). A unified classification system for eukaryotic transposable elements. Nature Reviews Genetics, 8, 973–982.

Willing, E.-M., Rawat, V., Mandáková, T., Maumus, F., James, G. V., Nordström, K. J., … Schneeberger, K. (2015). Genome expansion of Arabis alpina linked with retrotransposition and reduced symmetric DNA methylation. Nature Plants, 1, .

TABLE S1 Number of annotated TEs per tribe in V5.1 of the Arabis alpina reference genome, with their mean length per tribe, compared with their annotation in V3 (Choudhury et al., 2017). The number of polymorphic TEs identified in this study per tribe is also given.

Tribe | No. of annotated copies in V5.1 assembly | Mean length | No. of annotated copies in V3 assembly (Choudhury et al., 2017) | No. of polymorphic TE copies

ATGP1 17721 2589.00 7777 1611

ATLANTYS2 9210 2279.73 6190 944

ALYGypsy4 3792 2154.13 3690 333

ATCOPIA20 2693 1112.64 992 877

ATHILA4 2582 2363.23 6024 249

ATCOPIA95 1524 2350.27 1277 362

TA1-2 816 2787.29 518 144

ALYCopia74 614 1928.39 331 93

ATHILA6 533 2064.66 604 25

ALYCopia32 526 2895.32 302 121

ATGP10 522 1937.08 329 67

ATGP4 441 2126.36 450 103

ENDOVIR1 436 3075.13 956 69

ATGP5 290 2700.76 245 47

ATCOPIA35 266 1864.34 172 43

ALYCopia6 200 3377.69 425 59

ATGP8 192 2506.93 109 11

ATGP2 161 2116.32 175 14

BraCopia7 154 1179.53 439 9

ATCOPIA23/ONSEN 153 3508.75 NA 45

ATCOPIA1 106 3323.19 NA 27

ATCOPIA27 104 783.68 NA 7

ATCOPIA40 95 2479.01 160 19

ALYGypsy1 92 4144.45 25 27

ATCOPIA14 88 3521.11 NA 37

ATCOPIA8 84 3768.14 107 23

ATLANTYS1 81 875.91 111 3

ATGP3 77 4207.51 49 14

ALYGypsy10 73 2593.82 112 10

ATCOPIA12 65 3948.54 NA 19

ATCOPIA17 64 3564.63 NA 31

ALYCopia40 51 2231.47 NA 5

ATCOPIA36 43 3436.51 30 5

ATCOPIA34 34 3071.91 40 6

ALYCopia24 25 3827.64 6 11

ATCOPIA43 21 3161.48 37 3

BraCopia119 20 3978.35 5 8

BraGypsy2 17 4376.88 11 6

ALYCopia42 15 4837.67 11 6

ALYCopia16 10 4096.00 20 NA

ATCOPIA2 8 2457.88 NA 5

ALYCopia45 NA NA 744 NA

SOLO-LTRs 159727 343.09 NA 9962

Unknown 40760 2105.37 5089

TABLE S2 List of the single-nucleotide polymorphisms (SNPs) located in transposable elements (TEs). These SNPs were removed from the SNP set to obtain the non-TE SNP set. The list is available as a separate text file.

TABLE S3 Estimates of whole-genome linkage disequilibrium (LD) decay, expressed as half-decay distance (kb), for the single-nucleotide polymorphisms (SNPs) and the SNPs excluding transposable element sequences (non-TE SNPs), for each of the eight chromosomes and averaged over chromosomes. Window sizes between 250 kb and 1500 kb were used to obtain approximately the same number of pairwise comparisons for each chromosome.

Chromosome | Window size (kb) | Number of pairwise comparisons | Half-decay distance (kb)

SNPs

1 500 26,638,833 19.68

2 250 27,949,112 27.28

3 500 29,944,594 25.44

4 250 25,937,604 35.52

5 500 25,165,496 46.83

6 250 23,035,780 35.12

7 250 29,359,305 28.88

8 250 27,512,155 101.99

Average 40.09

Non-TE SNPs

1 1000 25,512,167 21.62

2 700 27,069,660 23.41

3 1000 26,817,985 19.25

4 700 23,862,096 39.54

5 1500 27,301,163 40.60

6 700 27,258,686 27.73

7 700 28,115,036 31.23

8 700 29,789,366 44.46

Average 30.98
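The two "Average" rows can be reproduced as unweighted means of the per-chromosome half-decay distances. The short check below uses the values listed in the table; the variable names are illustrative only.

```python
from statistics import mean

# Half-decay distances (kb) for chromosomes 1 to 8, copied from Table S3.
half_decay_kb = {
    "SNPs": [19.68, 27.28, 25.44, 35.52, 46.83, 35.12, 28.88, 101.99],
    "non-TE SNPs": [21.62, 23.41, 19.25, 39.54, 40.60, 27.73, 31.23, 44.46],
}

# Unweighted mean over the eight chromosomes, rounded as in the table.
averages = {marker: round(mean(values), 2) for marker, values in half_decay_kb.items()}
print(averages)
```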

TABLE S4 Number of reads for each sample of Arabis alpina (N = 306) obtained by whole-genome sequencing on an Illumina HiSeq2500 (raw reads), remaining after trimming and de-duplication, and properly paired after mapping to the reference genome (V5.1; Jiao et al., 2017), together with the mean and median coverage. The four populations sampled are Essets (Es, N = 70), Martinets (Ma, N = 96), Para (Pa, N = 70) and Pierredar (Pi, N = 70).

Sample | Raw reads | Reads after trimming and de-duplication (paired and unpaired) | Properly paired reads mapped to V5.1 | Mean coverage | Median coverage

Es100 22534988 22059960 15197201 8 5.891

Es12 57890690 56960580 37663813 20 15.07

Es13 70600628 69405572 45632557 25 18.21

Es14 80909060 79705246 54430489 29 21.57

Es16 33826234 33486696 23233307 12 9.25

Es17 36070626 35753024 23926222 13 9.532

Es19 20901874 20692026 14079674 7 5.606

Es20 23776808 23479168 15673657 9 6.284

Es21 43713930 43288162 29145355 15 11.66

Es22 39291528 38413984 25675355 14 10.31

Es23 41799780 41404262 27719120 14 10.97

Es24 19108898 18920128 12851028 7 5.13

Es25 34241968 33850772 22359541 12 8.892

Es3 94490534 93208312 18035292 34 25.29

Es32 27341042 27026106 16457216 10 7.232

Es33 25405832 25121964 21667659 9 6.586

Es34 31804358 31403162 14207562 12 8.577

Es35 21468058 21226402 31564485 8 5.681

Es36 46815266 46374690 24821979 17 12.66

Es38 37889076 36127188 29393214 13 9.576

Es39 43697402 41591186 64450463 16 11.38

Es40 32559672 32277922 21589277 12 8.603

Es41 21053648 20834548 14012117 8 5.618

Es42 24760866 24451560 16584976 9 6.59

Es43 25732790 25458338 17239248 9 6.842

Es44 23304400 23079998 15922532 9 6.294

Es46 26930288 26614780 18242107 10 7.215

Es48 25395598 25108118 16924600 9 6.743

Es49 26692600 26426340 18087123 10 7.247

Es50 30275760 29980812 21130129 11 8.178

Es52 30216840 29892190 20546494 11 8.126

Es53 25234110 24955846 16223757 9 6.457

Es55 39029668 38447142 26849913 14 10.48

Es56 27567058 27052968 19093263 10 7.44

Es57 37656812 37075644 26034731 14 10.21

Es59 24056868 23699256 16737515 9 6.519

Es6 34907624 34567170 29762532 13 9.154

Es60 43227338 42515486 49502079 15 11.62

Es63 74595064 71404084 14288198 25 19.1

Es64 20599864 20324094 29927341 8 5.531

Es65-1 43381350 42676160 23014728 15 11.62

Es66 36823554 35085150 19796612 12 8.87

Es67 28786574 28345850 31291936 10 7.701

Es68 46075900 45223268 23100987 16 12.18

Es7 47948578 45250950 29963785 16 12.14

Es71 44527882 43642132 13100855 15 11.74

Es72 19767896 19161772 14798135 7 5.153

Es73 21707656 21034374 15510257 8 5.784

Es75 22847750 22151424 15465529 8 6.043

Es76 22574792 21910664 12961493 8 6.028

Es77 19109630 18543542 18481434 7 5.09

Es78 28110318 27193970 16837568 10 7.259

Es79 24555100 23970620 31452642 9 6.533

Es8 19255716 19073790 22388542 7 5.145

Es81 33159150 32492992 31627687 11 8.835

Es82 45850504 44985238 21734102 17 12.38

Es83 32122512 31556320 23302737 11 8.511

Es84 33761930 33185062 28487729 12 9.093

Es85 42102432 41144924 23403965 14 10.89

Es86 34425640 33781082 24279714 12 9.148

Es88 35372504 34663610 12757986 13 9.508

Es9 52656054 51891454 19301049 19 14.3

Es90 28153586 27644574 29561619 10 7.436

Es91 44200560 43198674 14151349 15 11.51

Es92 20809016 20403896 32735336 7 5.465

Es93 47853206 46897412 29290083 16 12.75

Es94 43242198 41346914 18287097 15 11.22

Es95 26648720 26146578 28688368 9 7.043

Es98 41969476 41182854 27263724 15 11.07

Es99 40913434 39840306 35852921 14 10.65

Ma10 35972254 33720408 14096090 12 9.054

Ma100 21174646 20628359 23145324 7 5.534

Ma11 29770832 27552932 18524082 10 7.219

Ma12 46805118 43320232 29134499 16 11.33

Ma13 31013372 29077521 19447888 10 7.577

Ma14 32765802 30007296 20185067 11 7.878

Ma15 43205898 42254988 29308771 15 11.44

Ma16 30737834 30107196 20861569 11 8.155

Ma17 36321380 35374966 24169040 13 9.448

Ma18 32571776 31768582 21359159 12 8.311

Ma19 31183094 30506166 21132118 12 8.243

Ma2 32690682 32006915 22460081 11 8.482

Ma20 33696042 32837345 23716598 12 8.776

Ma21 35054264 34199937 14682852 12 9.241

Ma22 22220316 21614280 28983909 8 5.743

Ma23 44742524 43469081 23537830 15 11.26

Ma24 36291160 35262593 23432029 12 9.151

Ma26 35010684 34169979 28284368 12 9.142

Ma27 41999202 40691185 21743474 15 11.02

Ma28 24569874 24134216 16597851 8 6.489

Ma29 25193562 23897136 16172698 9 6.305

Ma3 32899988 30787595 44837413 11 8.224

Ma30 24338894 22715176 15158153 8 5.907

Ma31 21151696 19937236 13651391 7 5.336

Ma32 19482620 18660549 12928659 7 5.05

Ma33 22868642 21318816 14407611 8 5.628

Ma34 22828234 21257104 14592359 8 5.7

Ma35 71198092 65892450 21125865 24 17.48

Ma36 24651192 23125476 15758170 8 6.151

Ma37 29838638 28111335 19468778 10 7.59

Ma38 26266092 24558030 16563217 9 6.469

Ma39 26345792 24407955 16459825 8 6.433

Ma4 32949882 30820274 20550074 11 8.043

Ma40 25568816 25102658 16438889 9 6.42

Ma41 20618614 20263058 13510875 7 5.276

Ma42 22470104 22001765 15336826 8 6.002

Ma43 18756080 18491450 12603442 7 4.918

Ma44 19599828 19300421 12588142 7 4.919

Ma45 23596312 23086879 15854990 8 6.191

Ma46 31368988 30720223 21597121 12 8.415

Ma47 22257408 21733934 15143030 8 5.917

Ma48 34971074 34040720 23474232 12 9.15

Ma49 25657780 25106765 17136937 9 6.687

Ma5 31679470 29709301 52889409 11 7.963

Ma50 29381020 28797211 19991317 11 7.797

Ma51 24996528 24275979 16458933 8 6.419

Ma52 19020982 18773071 12485116 7 4.878

Ma53 16370098 15652918 10778738 6 4.211

Ma54 19729638 18736764 12786947 7 4.995

Ma55 18011076 17099199 11863245 7 4.635

Ma56 83350182 79029881 20351179 28 20.58

Ma57 16883280 15748744 10427363 6 4.06

Ma58 23369348 21911826 14832591 8 5.78

Ma59 21696102 20659818 14374341 8 5.618

Ma6 30994642 29628330 61282471 11 7.823

Ma60 16555104 15565555 10568655 6 4.121

Ma61 100432596 93311932 50794001 34 23.77

Ma62 23997656 22573770 15023943 8 5.837

Ma63 21044170 19664016 13454619 7 5.246

Ma65 19732834 19414867 13274833 7 5.184

Ma66 16833504 16574188 11227040 6 4.39

Ma67 16386654 16108357 11138426 6 4.362

Ma68 17571914 17305149 11862782 6 4.624

Ma69 76948136 74890086 20032261 27 19.77

Ma70 16037502 15738933 10776848 6 4.209

Ma71 27917772 27392734 18600814 9 7.242

Ma72 86411268 83137589 56858199 30 22.11

Ma73 91231960 88147155 58115082 32 22.55

Ma74 19252758 18819002 13083207 7 5.11

Ma75 23405144 22878360 15910305 9 6.207

Ma76 115116858 109922177 74922772 39 29.04

Ma77 17666548 17342331 11452469 6 4.499

Ma78 16449834 15599252 10407603 5 4.095

Ma79 18573610 17560087 11990189 6 4.714

Ma8 30322186 28089596 9319213 10 7.216

Ma80 14913696 13956728 9332589 5 3.672

Ma81 14575396 13901059 11138354 5 3.673

Ma82 18159544 16790152 16247878 6 4.377

Ma83 25946494 24079240 13355689 9 6.377

Ma84 21024998 19822142 11664599 7 5.249

Ma85 18352642 17096025 11384340 6 4.589

Ma86 17762268 16748151 13754259 6 4.475

Ma87 21590900 20284462 12763055 7 5.376

Ma88 19879982 18612294 17375392 7 5.014

Ma89 25862412 25366282 18492361 9 6.792

Ma9 44369902 41397101 11728212 13 10.3

Ma90 17453778 17142954 11663477 6 4.618

Ma91 18322210 17914755 13738749 6 4.58

Ma92 20343822 19953506 8419177 7 5.384

Ma93 14323860 14083831 9870208 5 3.313

Ma94 14804668 14456817 11138572 5 3.888

Ma95 17216978 16840414 14155907 6 4.363

Ma96 22234480 21643363 14877649 7 5.557

Ma97 22526018 21981120 13413720 8 5.839

Ma98 20354482 19838714 16002396 7 5.262

Ma99 23852782 23305722 26442872 8 6.291

Pa1 60243382 59450108 18333710 20 15.94

Pa100 26602462 26104096 32889054 10 7.03

Pa11 49759506 49284300 31545829 17 12.99

Pa12 49515454 46769990 28089454 15 12.19

Pa13 42285096 41830546 20696926 15 11.13

Pa14 32989142 32701324 30813094 11 8.33

Pa16 45579976 45068866 16107485 17 12.22

Pa19 24420584 24194472 40194492 9 6.455

Pa2 36531706 36154302 13618710 12 8.896

Pa20 20511438 20247390 30162830 7 5.437

Pa23 44930008 44404870 26130738 16 11.82

Pa24 40645252 40228150 22312436 14 10.38

Pa26 34549326 33226914 52696952 12 8.646

Pa27 77384214 76529036 20204600 28 20.65

Pa28 30184750 29875392 34081682 11 8.057

Pa29 53929720 52855936 22572669 17 13.39

Pa3 37888218 37479694 15666187 14 10.11

Pa30 23993840 23682696 30395793 8 6.304

Pa31 46733610 44774524 12546769 16 11.73

Pa32 19356466 19129358 34464015 7 5.075

Pa33 52827460 49722606 19473322 18 13.24

Pa35 29559834 29226538 17218901 10 7.789

Pa36 26141528 25846496 11277504 9 6.868

Pa37 19162170 18960508 25501639 6 4.516

Pa40 19755870 19551468 12521513 7 5.022

Pa42 22362694 22111420 14922591 8 5.947

Pa43 24530240 24258806 15736232 9 6.285

Pa45 35916738 35120304 23502640 12 9.015

Pa46 21542876 21320198 14508348 8 5.764

Pa47 49411230 48918438 32598570 17 13.08

Pa48 21507666 21292398 14650546 8 5.819

Pa49 29225656 28915638 19666850 11 7.815

Pa5 35826966 35254300 17340444 12 9.546

Pa51 26004004 25707202 15754962 9 6.864

Pa52 22913172 22688068 33944577 8 6.245

Pa53 51068120 50521514 28785044 18 13.23

Pa55 44441132 43869116 13702478 15 11.39

Pa56 21541028 20974906 19090847 7 5.344

Pa57 28041314 27615390 22511887 10 7.435

Pa58 33945344 33322822 24031610 12 8.784

Pa6 56838584 55841530 31587749 18 13.6

Pa60 46570474 45760396 13372000 17 12.37

Pa61 19876762 19511110 15411834 7 5.19

Pa63 22698376 22373470 20744810 8 5.964

Pa64 30012588 29501062 16709361 11 8.005

Pa65 29687578 23603154 29507592 9 6.381

Pa66 43647846 42716116 23733317 15 11.54

Pa67 34999462 34435350 15364225 13 9.213

Pa68 25131536 24367434 21456590 8 6.078

Pa69 33301090 32202510 34223195 11 8.395

Pa71 25555534 24635718 16928928 9 6.596

Pa72 29257358 28331528 19719423 10 7.648

Pa74 19633000 19063638 13098753 7 5.129

Pa75 32502596 31390848 20179788 10 7.867

Pa76 20864234 20180554 13814901 7 5.438

Pa77 43924972 41301124 28823237 15 11.13

Pa78 23938378 23435256 16213434 9 6.255

Pa84 39124952 38399020 26575118 14 10.4

Pa85 20198940 19724706 13484744 7 5.267

Pa86 46365622 45274208 27946871 14 10.87

Pa87 20111366 19698146 13527328 7 5.223

Pa88 21349134 20883524 14489084 8 5.605

Pa89 37210256 36541958 24883165 13 9.728

Pa9 3631142 3555034 23637601 1 0.9504

Pa90 34879904 34208390 27248651 12 9.255

Pa91 41341096 40406556 24588504 14 10.74

Pa92 37732736 37072470 27254084 12 9.586

Pa94 40135848 39506334 26371077 14 10.65

Pa96 38622834 37794920 29833523 14 10.13

Pa98 44827144 42354106 2437054 15 11.51

Pi1 126209728 124152694 40347687 43 33.81

Pi11 59257972 58336970 40085320 21 15.91

Pi12 59289084 58371068 35430435 21 15.88

Pi13 53312146 49879504 22531611 19 13.6

Pi14 33769014 33374382 28198957 12 9.013

Pi15 42450704 39883204 29315387 14 10.83

Pi18 42680804 42205360 24251643 15 11.58

Pi19 35691502 34892644 85975061 12 9.406

Pi2 22553550 22314006 20752605 8 6.1

Pi20 31108652 30700780 29921496 11 8.187

Pi21 43402292 42901536 27746600 16 11.74

Pi26 41625746 41168210 21520401 14 10.92

Pi27 32163274 31455910 25526066 12 8.471

Pi28 37549268 36832882 41799471 13 9.932

Pi29 63402286 62405546 15392190 20 16.47

Pi3 52497426 51693384 13097097 18 14.13

Pi31 19991820 19753210 23402372 7 5.225

Pi33 35843476 35158552 30219251 12 9.175

Pi34 44587934 44050992 12265982 16 11.95

Pi36 18367676 18168712 27828257 7 4.898

Pi39 41854474 39284080 35856751 14 10.64

Pi40 24568056 24271430 16190936 9 6.424

Pi41 40990436 40596396 27147789 14 10.77

Pi42 46697272 46205044 31620325 17 12.5

Pi43 44912768 44471794 30472289 16 12.04

Pi44 40533872 40048658 27134968 14 10.74

Pi49 19270598 19053680 13160097 7 5.235

Pi5 50761310 48378310 25046756 18 13.23

Pi51 36936162 36577380 40636671 13 9.969

Pi52 61493942 57545752 23615702 21 15.61

Pi54 36906762 36515002 17672740 13 9.357

Pi55 25752042 25495688 16786489 9 6.987

Pi56 24503532 24237834 13529391 9 6.632

Pi57 19707468 19496376 17779501 7 5.371

Pi58 26749206 26426100 33678505 9 7.037

Pi59 49841470 49216518 34471820 16 13.19

Pi60 46764190 46290036 31930681 17 12.62

Pi62 26825230 26543578 18380945 10 7.242

Pi65 20252266 19854632 13641463 7 5.326

Pi67 39616570 38735992 27194540 14 10.54

Pi68 21918904 21560688 15023947 8 5.843

Pi7 28716170 28256384 21785258 10 7.634

Pi71 32131268 31334874 28418579 12 8.53

Pi72 41339422 40378142 12912992 14 11.08

Pi73 19254360 18803366 19363095 7 5.031

Pi74 28053664 27521002 17416467 10 7.513

Pi75 25990434 25297986 38390807 9 6.67

Pi76 55052636 53861462 27505039 20 14.8

Pi77 40877364 39919814 31619713 15 10.67

Pi78 47447234 45872214 13531617 15 12.3

Pi79 19851996 19134168 19232296 7 5.218

Pi80 48471422 46463368 32474939 16 12.63

Pi81 50144184 48004894 33107210 17 12.86

Pi82 19404730 18769194 13223906 7 5.15

Pi83 25792960 24958280 17432648 9 6.776

Pi84 21054116 20382084 14294121 7 5.565

Pi85 45206894 44137246 30041041 16 11.62

Pi86 33176064 32416426 22105921 12 8.635

Pi87 55116976 52091732 35865276 17 13.68

Pi88 43033538 42053842 29147513 15 11.31

Pi89 23918210 23153988 16049089 8 6.158

Pi9 49478734 41192286 38631083 15 11.09

Pi90 57551796 55511194 22897846 19 14.85

Pi92 35065238 33749528 13142225 12 8.795

Pi93 19084122 18580272 25116582 7 5.072

Pi94 37765898 36551884 44193 13 9.709

Pi95 66804 63748 23242221 0 0.01732

Pi96 34191228 33383886 35582820 12 8.894

Pi97 52568714 51474984 31160913 18 13.81

Pi98 47124642 45809504 29029917 15 11.88

TABLE S5 Comparison of the population genomic estimates among the four marker types (nuclear microsatellites, single-nucleotide polymorphisms (SNPs), SNPs excluding transposable element sequences (non-TE SNPs), and transposable elements (TEs)), tested with a one-way ANOVA for (a) Ho, (b) He, and (c) FIS, and (d) the results of the Mantel tests among the pairwise FST. Each ANOVA was followed by a Tukey HSD test comparing the markers, which gives the difference, its lower and upper limits (95% confidence interval), and the p value adjusted for multiple comparisons.

a) Observed heterozygosity (Ho): one-way ANOVA (p = 0.22)

Comparison | Difference | Lower | Upper | Adjusted p value
Microsatellites vs non-TE SNPs | 0.005867105 | -0.1234798 | 0.13521398 | 0.9979363
SNPs vs non-TE SNPs | 0.005000000 | -0.1243469 | 0.13434687 | 0.9987177
TEs vs non-TE SNPs | -0.057500000 | -0.1868469 | 0.07184687 | 0.3515078
SNPs vs Microsatellites | -0.000867105 | -0.1302140 | 0.12847977 | 0.9999932
TEs vs Microsatellites | -0.063367105 | -0.1927140 | 0.06597977 | 0.2761637
TEs vs SNPs | -0.062500000 | -0.1918469 | 0.06684687 | 0.2865002

b) Expected heterozygosity (He): one-way ANOVA (p < 0.05)

Comparison | Difference | Lower | Upper | Adjusted p value
Microsatellites vs non-TE SNPs | 0.1875 | 0.02505854 | 0.40005854 | 0.0223162
SNPs vs non-TE SNPs | 0.0050 | -0.20755854 | 0.21755854 | 0.9997091
TEs vs non-TE SNPs | 0.1025 | -0.11005854 | 0.31505854 | 0.2880295
SNPs vs Microsatellites | -0.1825 | -0.39505854 | 0.03005854 | 0.0261973
TEs vs Microsatellites | -0.0850 | -0.29755854 | 0.12755854 | 0.4374602
TEs vs SNPs | 0.0975 | -0.11505854 | 0.31005854 | 0.3266417

c) Pairwise genetic differentiation (FST): one-way ANOVA (p < 0.0001)

Comparison | Difference | Lower | Upper | Adjusted p value
Microsatellites vs non-TE SNPs | 0.3025 | 0.1546838 | 0.4503162 | 0.0000205
SNPs vs non-TE SNPs | -0.0425 | -0.1903162 | 0.1053162 | 0.6856081
TEs vs non-TE SNPs | 0.5750 | 0.4271838 | 0.7228162 | 0.0000000
SNPs vs Microsatellites | -0.3450 | -0.4928162 | -0.1971838 | 0.0000053
TEs vs Microsatellites | 0.2725 | 0.1246838 | 0.4203162 | 0.0000580
TEs vs SNPs | 0.6175 | 0.4696838 | 0.7653162 | 0.0000000

d) Mantel tests among pairwise FST

Comparison | Mantel r | p value
SNPs vs non-TE SNPs | 0.99 | 0.035
SNPs vs TEs | 0.29 | 0.416
SNPs vs microsatellites | 0.54 | 0.263
Non-TE SNPs vs TEs | 0.32 | 0.346
Non-TE SNPs vs microsatellites | 0.51 | 0.268
TEs vs microsatellites | 0.74 | 0.080
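For readers unfamiliar with the Mantel test in part (d), the sketch below correlates the off-diagonal entries of two pairwise FST matrices and assesses significance by permuting population labels. With four populations there are only 24 label permutations, so the sketch enumerates all of them and uses a one-tailed test. Both choices are assumptions: the software, number of permutations and sidedness used for the table are not stated here. The two matrices are made-up values, not the study's estimates.

```python
from itertools import permutations
from math import sqrt


def upper_triangle(matrix, order):
    """Off-diagonal entries (i < j) of a symmetric matrix under a label order."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant distances")
    return sxy / sqrt(sxx * syy)


def mantel_exact(a, b):
    """Mantel r and one-tailed p value from all label permutations of b."""
    n = len(a)
    identity = tuple(range(n))
    x = upper_triangle(a, identity)
    r_obs = pearson(x, upper_triangle(b, identity))
    perms = list(permutations(range(n)))
    # Count permutations whose correlation is at least as large as observed.
    hits = sum(1 for p in perms if pearson(x, upper_triangle(b, p)) >= r_obs - 1e-12)
    return r_obs, hits / len(perms)


# Hypothetical pairwise FST matrices for four populations.
fst_a = [
    [0.00, 0.10, 0.20, 0.30],
    [0.10, 0.00, 0.40, 0.50],
    [0.20, 0.40, 0.00, 0.60],
    [0.30, 0.50, 0.60, 0.00],
]
fst_b = [
    [0.00, 0.12, 0.25, 0.28],
    [0.12, 0.00, 0.35, 0.55],
    [0.25, 0.35, 0.00, 0.58],
    [0.28, 0.55, 0.58, 0.00],
]
r, p = mantel_exact(fst_a, fst_b)
print(f"Mantel r = {r:.2f}, one-tailed p = {p:.3f}")
```

Under exact enumeration with four populations, the smallest attainable one-tailed p value is 1/24, which limits the power of any such test regardless of how strongly two matrices are correlated.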

TABLE S6 Gene Ontology analysis (TopGo 2.28.9) of the non-synonymous single-nucleotide polymorphisms excluding transposable element sequences (non-TE SNPs) in alpine populations of Arabis alpina. Significance is based on the TopGO Fisher test with the weight01 algorithm; significant terms (p < 0.01) are highlighted. Only terms with p < 0.05 in the TopGO Fisher test are shown. The significance of the difference in mutational target sites per GO category from the total is based on a Wilcoxon rank-sum test.

GO ID | Term | Annotated | Significant | Expected | TopGO Fisher | GOslim description | Wilcoxon rank-sum test (Bonferroni correction) | Wilcoxon p < 0.01 (significantly longer) | Length difference from total
GO:0006468 | protein phosphorylation | 1171 | 504 | 393.16 | 6.20E-10 | cellular protein modification process, protein metabolic process | 3.72E-125 | * | 965.63
GO:0007169 | transmembrane receptor protein tyrosine | 183 | 96 | 61.44 | 9.70E-08 | cell communication, signal transduction | 1.63E-34 | * | 1300.19
GO:0009626 | plant-type hypersensitive response | 145 | 69 | 48.68 | 1.80E-05 | response to stress, cell death | 2.73E-22 | * | 1378.94
GO:0007165 | signal transduction | 1684 | 623 | 565.4 | 7.30E-05 | cell communication, signal transduction | 6.37E-103 | * | 908.13
GO:0046777 | protein autophosphorylation | 207 | 96 | 69.5 | 8.20E-05 | cellular protein modification process, protein metabolic process | 9.04E-24 | * | 841.98
GO:0006915 | apoptotic process | 145 | 69 | 48.68 | 9.40E-05 | biological process, cell death | 3.05E-14 | * | 1087.51
GO:0002764 | immune response-regulating signaling pat... | 9 | 9 | 3.02 | 1.60E-04 | cell communication, signal transduction | 2.89E-03 | * | 1247.62
GO:0006952 | defense response | 1440 | 547 | 483.48 | 1.70E-04 | response to stress | 3.39E-80 | * | 812.40
GO:0009987 | cellular process | 11415 | 3815 | 3832.57 | 2.00E-04 | biological process | 0.00E+00 | |
GO:0009816 | defense response to bacterium | 122 | 60 | 40.96 | 2.50E-04 | response to stress, response to external stimulus, response to biotic stimulus | 1.65E-14 | * | 971.54
GO:0045087 | innate immune response | 594 | 245 | 199.43 | 2.70E-04 | response to stress | 1.49E-47 | * | 970.58
GO:0006508 | proteolysis | 825 | 295 | 276.99 | 3.40E-04 | biological process, protein metabolic process | 6.76E-29 | * | 630.82
GO:0009435 | NAD biosynthetic process | 12 | 10 | 4.03 | 5.80E-04 | nucleobase-containing compound metabolism, biosynthetic process | 2.96E-01 | |
GO:0048544 | recognition of pollen | 38 | 23 | 12.76 | 5.90E-04 | reproduction, cell communication, pollination, pollen-pistil interaction | 4.64E-08 | * | 1175.25
GO:0080156 | mitochondrial mRNA modification | 18 | 13 | 6.04 | 9.20E-04 | nucleobase-containing compound metabolism | 4.99E-02 | |
GO:0010359 | regulation of anion channel activity | 16 | 11 | 5.37 | 1.75E-03 | transport | 9.38E-05 | * | 1311.38
GO:0016045 | detection of bacterium | 19 | 13 | 6.38 | 2.01E-03 | biological process, response to external stimulus, response to biotic stimulus | 6.76E-03 | * | 894.83
GO:0006874 | cellular calcium ion homeostasis | 36 | 20 | 12.09 | 2.42E-03 | biological process, cellular homeostasis | 9.89E-05 | * | 1048.59
GO:0051026 | chiasma assembly | 24 | 15 | 8.06 | 3.47E-03 | reproduction, nucleobase-containing compound metabolism, DNA metabolic process, cell cycle, cellular component organization | 7.37E-02 | |
GO:0010204 | defense response signaling pathway | 35 | 20 | 11.75 | 3.51E-03 | response to stress, cell communication, signal transduction | 1.16E-07 | * | 1265.34
GO:0007166 | cell surface receptor signaling pathway | 192 | 103 | 64.46 | 8.31E-03 | cell communication, signal transduction | 4.19E-36 | * | 1263.50
GO:0000914 | phragmoplast assembly | 9 | 7 | 3.02 | 8.64E-03 | cell cycle, cellular component organization | 1.00E+00 | |
GO:0006955 | immune response | 605 | 253 | 203.13 | 8.72E-03 | biological process | 1.36E-49 | * | 967.48
GO:0002239 | response to oomycetes | 33 | 17 | 11.08 | 8.96E-03 | biological process, response to external stimulus, response to biotic stimulus | 5.22E-05 | * | 1195.41
GO:0050821 | protein stabilization | 15 | 10 | 5.04 | 8.97E-03 | biological process | 1.00E+00 | |
GO:0050896 | response to stimulus | 6644 | 2200 | 2230.72 | 1.12E-02 | biological process | 6.93E-221 | * | 656.86
GO:0006855 | drug transmembrane transport | 83 | 38 | 27.87 | 1.37E-02 | transport | 5.86E-08 | * | 922.00
GO:0046854 | phosphatidylinositol phosphorylation | 18 | 11 | 6.04 | 1.52E-02 | lipid metabolic process | 2.28E-06 | * | 2761.20
GO:0009624 | response to nematode | 165 | 69 | 55.4 | 1.61E-02 | biological process, response to external stimulus, response to biotic stimulus | 1.16E-06 | * | 417.73
GO:0010431 | seed maturation | 65 | 31 | 21.82 | 1.73E-02 | reproduction, multicellular organism development, post-embryonic development | 3.13E-03 | * | 412.39
GO:0016556 | mRNA modification | 27 | 20 | 9.07 | 2.02E-02 | nucleobase-containing compound metabolism | 1.66E-04 | * | 619.74
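In topGO output, the Expected column is the number of genes annotated to a term multiplied by the overall proportion of significant genes in the gene universe. The totals are not given in the table above, so the sketch below only recovers the implied proportion from a few of its rows and applies it to another term. The function and variable names are illustrative.

```python
from statistics import mean

# (annotated genes, expected significant genes) copied from rows of Table S6.
rows = {
    "GO:0006468": (1171, 393.16),
    "GO:0007165": (1684, 565.4),
    "GO:0006952": (1440, 483.48),
    "GO:0050896": (6644, 2230.72),
    "GO:0007169": (183, 61.44),
}


def implied_fraction(table_rows):
    """Average ratio of expected to annotated genes across the given rows."""
    return mean(expected / annotated for annotated, expected in table_rows.values())


def expected_count(annotated, fraction):
    """Expected number of significant genes for a term under no enrichment."""
    return annotated * fraction


fraction = implied_fraction(rows)
print(f"implied fraction of significant genes: {fraction:.4f}")
print(f"expected count for 145 annotated genes: {expected_count(145, fraction):.2f}")
```

A term is enriched when its Significant count clearly exceeds this expectation, which is what the Fisher test in the table evaluates.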

TABLE S7 Gene Ontology analysis (TopGo 2.28.9) showing enrichemnt loci of the polymorphic transposable elements (TEs) in alpine populations of Arabis alpina, the significance of the test is based in TopGO Fisher with weight01 algorithm (p < 0.01) and are highlighted. Only those terms with p < 0.05 with TopGO Fisher are shown in the table. Significance of difference in mutational target sites per GO category from total is provided based on Wilcox-rank sum test.

GO ID Term Anno

tated Signifi

cant Expected TopGO

Fisher GOslim description

Wilcoxon- rank sum test (Bonferroni correction)

Wilcox on p<0.01 (Signifi

cantly longer)

Length difference from total GO:0009611 response to

wounding 326 105 68.13 0.000003

5 response to stress 1.09E-01

GO:0009744 response to

sucrose 76 31 15.88 0.000056 biological process 1.00E+00

GO:0080129 proteasome core complex assembly

15 10 3.13 0.00017 cellular component

organization 1.00E+00 GO:0080167 response to

karrikin 276 82 57.68 0.00031 response to abiotic

stimulus 3.40E-02 GO:0006979 response to

oxidative stress 597 162 124.77 0.00035 response to stress 1.00E+00 GO:0051788 response to

misfolded protein

16 10 3.34 0.00036 response to stress 1.00E+00

GO:0050832 defense response to

fungus

361 109 75.45 0.00161 response to external stimulus, response to

biotic stimulus

5.66E-08 * 639.38

GO:0009817 defense response to

fungus, incompatible

interaction

123 40 25.71 0.00164 response to external stimulus, response to

biotic stimulus

1.77E-02

GO:0019752 carboxylic acid metabolic

process

1126 263 235.33 0.00185 cellular process 3.44E-15 * 641.75

GO:0042742 defense response to

bacterium

576 157 120.38 0.00192 response to external stimulus, response to

biotic stimulus

1.20E-05 * 658.55

GO:0015864 pyrimidine nucleoside transport

6 5 1.25 0.00197 transport 1.00E+00

GO:0009753 response to

jasmonic acid 322 93 67.3 0.00248 response to endogenous

stimulus 1.00E+00

GO:0031408 oxylipin biosynthetic

process

11 7 2.3 0.00257 lipid biosynthetic

process 1.00E+00

GO:0007568 aging 193 53 40.34 0.00483 biological process 1.53E-02

GO:0001666 response to

hypoxia 74 22 15.47 0.00501 response to abiotic

stimulus 1.00E+00

GO:0009821 alkaloid biosynthetic

process

15 8 3.13 0.00562 biosynthetic process 1.00E+00

GO:0006094 gluconeogenesi

s 7 5 1.46 0.00571 carbohydrate

biosynthetic process 1.44E-01

(21)

21

GO:0006419 | alanyl-tRNA aminoacylation | 7 | 5 | 1.46 | 0.00571 | nucleobase-containing compound metabolism, translation, protein metabolic process | 9.85E-01
GO:0006537 | glutamate biosynthetic process | 7 | 5 | 1.46 | 0.00571 | biosynthetic process | 8.56E-02
GO:0006857 | oligopeptide transport | 81 | 27 | 16.93 | 0.00739 | transport | 3.58E-05 | * | 1003.34
GO:0051567 | histone H3-K9 methylation | 13 | 7 | 2.72 | 0.00898 | cellular protein modification process, cellular component organization, protein | 1.00E+00
GO:0048455 | stamen formation | 16 | 8 | 3.34 | 0.0092 | reproduction, multicellular organism development, anatomical structure morphogenesis, post-embryonic development, flower development | 1.00E+00
GO:0006508 | proteolysis | 825 | 199 | 172.42 | 0.00996 | protein metabolic process | 8.72E-13 | * | 732.98
GO:0009626 | plant-type hypersensitive response | 145 | 39 | 30.3 | 0.01173 | response to stress, cell death | 1.32E-03 | * | 1238.27
GO:0009749 | response to glucose | 40 | 14 | 8.36 | 0.01183 | biological process | 9.68E-02
GO:0006109 | regulation of carbohydrate metabolic process | 39 | 11 | 8.15 | 0.01261 | carbohydrate metabolic process | 1.00E+00
GO:0009116 | nucleoside metabolic process | 68 | 21 | 14.21 | 0.01324 | nucleobase-containing compound metabolism | 1.00E+00
GO:0006629 | lipid metabolic process | 936 | 210 | 195.62 | 0.01428 | lipid metabolic process | 5.02E-12 | * | 648.52
GO:0043170 | macromolecule metabolic process | 6600 | 1297 | 1379.38 | 0.01507 | metabolic process | 7.97E-52 | * | 611.01
GO:0051258 | protein polymerization | 80 | 20 | 16.72 | 0.01564 | cellular component organization | 8.05E-02
GO:0048767 | root hair elongation | 101 | 30 | 21.11 | 0.01607 | anatomical structure morphogenesis, cell growth, cell differentiation, growth | 2.83E-02
GO:0009408 | response to heat | 305 | 73 | 63.74 | 0.01896 | response to abiotic stimulus | 1.00E+00
GO:0019853 | L-ascorbic acid biosynthetic process | 21 | 9 | 4.39 | 0.01903 | carbohydrate biosynthetic process | 1.00E+00
GO:0009911 | positive regulation of flower development | 57 | 19 | 11.91 | 0.0196 | reproduction, multicellular organism development, post-embryonic development, flower development | 1.04E-01
GO:0010380 | regulation of chlorophyll biosynthetic process | 18 | 8 | 3.76 | 0.021 | biosynthetic process | 1.00E+00


GO:0080119 | ER body organization | 15 | 7 | 3.13 | 0.02276 | cellular component organization | 5.42E-03 | * | 2244.78
GO:0010555 | response to mannitol | 9 | 5 | 1.88 | 0.02352 | biological process | 4.63E-01
GO:0032957 | inositol trisphosphate metabolic process | 9 | 5 | 1.88 | 0.02352 | cellular process | 1.00E+00
GO:0080149 | sucrose induced translational repression | 9 | 5 | 1.88 | 0.02352 | translation, response to stress, biosynthetic process, protein metabolic process | 1.00E+00
GO:0009696 | salicylic acid metabolic process | 50 | 15 | 10.45 | 0.02385 | cellular process | 1.00E+00
GO:0006605 | protein targeting | 167 | 33 | 34.9 | 0.02406 | transport | 3.36E-02
GO:0009723 | response to ethylene | 248 | 67 | 51.83 | 0.02656 | response to endogenous stimulus | 9.76E-01
GO:0043085 | positive regulation of catalytic activity | 42 | 16 | 8.78 | 0.02962 | biological process | 1.00E+00
GO:0015992 | proton transport | 72 | 16 | 15.05 | 0.02992 | transport | 1.00E+00
GO:0009409 | response to cold | 515 | 124 | 107.63 | 0.033 | response to abiotic stimulus | 1.25E-01
GO:0048830 | adventitious root development | 13 | 6 | 2.72 | 0.03666 | multicellular organism development | 7.11E-01
GO:0007018 | microtubule-based movement | 100 | 29 | 20.9 | 0.03875 | cellular process | 2.96E-07 | * | 1824.3
GO:0034220 | ion transmembrane transport | 248 | 48 | 51.83 | 0.03937 | transport | 8.94E-03 | * | 636.76
GO:0030048 | actin filament-based movement | 20 | 8 | 4.18 | 0.04076 | cellular process | 6.01E-05 | * | 6643.41
GO:0071395 | cellular response to jasmonic acid stimu... | 87 | 26 | 18.18 | 0.04351 | response to endogenous stimulus | 1.00E+00
GO:0009856 | pollination | 241 | 55 | 50.37 | 0.04359 | reproduction | 1.92E-04 | * | 903.11
GO:0031349 | positive regulation of defense response | 105 | 23 | 21.94 | 0.04368 | response to stress | 6.16E-01
GO:0043254 | regulation of protein complex assembly | 36 | 8 | 7.52 | 0.0437 | cellular component organization | 1.60E-01
GO:0009862 | systemic acquired resistance, salicylic acid mediated signaling pathway | 54 | 17 | 11.29 | 0.04496 | response to stress, cell communication, signal transduction, response to external stimulus, response to biotic stimulus | 2.90E-01
GO:0009407 | toxin catabolic process | 58 | 18 | 12.12 | 0.04552 | catabolic process, secondary metabolic process | 1.00E+00
GO:0006694 | steroid biosynthetic process | 66 | 19 | 13.79 | 0.04655 | lipid biosynthetic process | 1.88E-02
GO:0042127 | regulation of cell proliferation | 54 | 15 | 11.29 | 0.04655 | cellular process | 1.00E+00
GO:0010043 | response to zinc ion | 111 | 31 | 23.2 | 0.04781 | biological process | 1.00E+00
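The rows above can be checked for internal consistency with a little arithmetic. The sketch below assumes that the three numeric fields following each GO term are the annotated, significant and expected gene counts (the column headers are not repeated on these pages), in which case the expected count should equal the annotated count multiplied by one common background fraction of significant genes. The function names and the selection of rows are illustrative only and are not part of the original analysis.

```python
# Hypothetical consistency check; rows are copied from the table above as
# (GO ID, annotated, significant, expected). The column meaning is an assumption.
rows = [
    ("GO:0006979", 597, 162, 124.77),
    ("GO:0019752", 1126, 263, 235.33),
    ("GO:0006508", 825, 199, 172.42),
    ("GO:0043170", 6600, 1297, 1379.38),
]


def background_fraction(annotated, expected):
    """Fraction of significant genes implied by a single row."""
    return expected / annotated


def observed_over_expected(significant, expected):
    """Ratio above 1 means more significant genes than expected."""
    return significant / expected


fractions = [background_fraction(a, e) for _, a, _, e in rows]
ratios = {go: observed_over_expected(s, e) for go, _, s, e in rows}

for (go_id, _, _, _), frac in zip(rows, fractions):
    print(f"{go_id}: implied fraction {frac:.4f}, "
          f"observed/expected {ratios[go_id]:.2f}")
```

Under this reading, all four rows imply nearly the same background fraction, and GO:0043170 (macromolecule metabolic process) has fewer significant genes than expected (1297 versus 1379.38).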


FIGURE S1 Geographic locations of the four sampled populations of Arabis alpina in the western Swiss Alps. 1) La Para (N = 70), 2) Pierredar (N = 70), 3) Les Essets (N = 70), 4) Les Martinets (N = 96). The aerial pictures are from SWISSIMAGE, 25 cm, 2016 (https://shop.swisstopo.admin.ch/en/products/images/ortho_images/SWISSIMAGE).


FIGURE S2 Flowchart of the steps involved in TEPID (Stuart et al., 2016) and in the post-processing of the raw polymorphic TEs for accurate genotyping of TEs in natural populations.


FIGURE S3 Plots of the site frequency spectra for the four regions, (a) of the single-nucleotide polymorphisms excluding transposable element sequences (non-TE SNPs) and (b) of the polymorphic transposable elements (TEs).
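A site frequency spectrum of the kind plotted in Figure S3 can be tabulated by counting, for each variant, the number of copies of the non-reference allele in the sample, and then counting how many variants fall into each allele-count class. The following is a minimal sketch, assuming diploid genotypes coded as 0, 1 or 2 alternate alleles with no missing data; the toy matrix and the function name `site_frequency_spectrum` are hypothetical and do not reproduce the original analysis.

```python
from collections import Counter


def site_frequency_spectrum(genotypes):
    """Return {alternate allele count: number of variants}.

    `genotypes` is a list of variants, each a list of per-individual
    alternate-allele counts (0, 1 or 2). Sites that are monomorphic in
    the sample (count 0 or 2N) are skipped.
    """
    spectrum = Counter()
    for variant in genotypes:
        n_chromosomes = 2 * len(variant)
        alt_count = sum(variant)
        if 0 < alt_count < n_chromosomes:
            spectrum[alt_count] += 1
    return dict(sorted(spectrum.items()))


# Toy data: five variants genotyped in four individuals.
toy_genotypes = [
    [0, 1, 0, 0],  # one alternate copy (singleton)
    [1, 1, 0, 0],  # two alternate copies
    [0, 0, 0, 1],  # singleton
    [2, 2, 2, 2],  # fixed in the sample, skipped
    [0, 2, 1, 0],  # three alternate copies
]
sfs = site_frequency_spectrum(toy_genotypes)
print(sfs)
```

Presence/absence calls of polymorphic TEs could be tabulated in the same way once they are coded as allele counts per individual.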


FIGURE S4 Shared and private variants of the single-nucleotide polymorphisms, including those identified within transposable elements, among individuals of Arabis alpina sampled in the four study regions.
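Shared and private variants as summarized in Figure S4 correspond to simple set operations: a variant is shared by all regions if it lies in the intersection of the regional variant sets, and private to a region if it occurs in no other region. The sketch below assumes each region is represented by a set of variant identifiers; the region labels, identifiers and the function name `shared_and_private` are hypothetical.

```python
def shared_and_private(variants_by_region):
    """Return (variants shared by all regions, {region: private variants})."""
    shared = set.intersection(*variants_by_region.values())
    private = {}
    for region, variants in variants_by_region.items():
        # Union of the variants found in every other region.
        others = set().union(
            *(v for r, v in variants_by_region.items() if r != region)
        )
        private[region] = variants - others
    return shared, private


# Toy data: variant identifiers of the form chromosome:position.
toy_variants = {
    "region_1": {"chr1:100", "chr1:200", "chr2:50"},
    "region_2": {"chr1:100", "chr2:50", "chr3:10"},
    "region_3": {"chr1:100", "chr2:50"},
    "region_4": {"chr1:100", "chr4:77"},
}
shared, private = shared_and_private(toy_variants)
print(sorted(shared))
print({region: sorted(v) for region, v in private.items()})
```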


FIGURE S5 Density of single-nucleotide polymorphisms (SNPs) and of SNPs excluding transposable element sequences (non-TE SNPs) along the eight chromosomes of Arabis alpina. The blue arrow highlights differences between the two SNP sets; the loss of SNPs occurred mainly around the centromeres.
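The comparison in Figure S5 amounts to counting SNPs in windows along a chromosome for both SNP sets and inspecting where the counts diverge. The following is a minimal sketch, assuming non-overlapping windows and 1-based positions; the window size, the toy positions and the function name `snp_density` are hypothetical and are not taken from the original analysis.

```python
from collections import Counter


def snp_density(positions, window_size):
    """Count SNPs per non-overlapping window (1-based bp positions)."""
    counts = Counter((pos - 1) // window_size for pos in positions)
    n_windows = max(counts) + 1 if counts else 0
    return [counts.get(i, 0) for i in range(n_windows)]


# Toy positions on one chromosome; the non-TE set is a subset of all SNPs.
all_snps = [5, 120, 130, 250, 260, 270, 410]
non_te_snps = [5, 120, 410]
window = 100

density_all = snp_density(all_snps, window)
density_non_te = snp_density(non_te_snps, window)

# SNPs lost per window when TE sequences are excluded.
lost = [a - b for a, b in zip(density_all, density_non_te)]
print(density_all, density_non_te, lost)
```

In this toy example the loss is concentrated in the third window, analogous to the concentration of lost SNPs around the centromeres reported in the caption.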


FIGURE S6 Efficiency of the ‘tepid-refine’ algorithm. This procedure examines in detail the genomic region of a polymorphic transposable element (TE) identified in another sample and calls the same variant for samples with lower read count thresholds in order to reduce false negative variant calls within a group of related samples. a) Number of TE-absence variants identified versus sequencing depth of each individual. b) Number of additional TE-absence calls due to the TEPID refinement step versus sequencing depth of all individuals. c) Proportion of refined TE-absence calls due to the TEPID refinement step for each individual in the population with respect to TE-absence calls before refinement. d) Number of TE-presence variants identified versus sequencing depth of each individual. e) Number of additional TE-presence calls due to the TEPID refinement step versus sequencing depth of all individuals. f) Proportion of refined TE-presence calls due to the TEPID refinement step for each individual in the population with respect to TE-presence calls before refinement. TE-absence and TE-presence calls denote the absence and presence of TEs with respect to the reference genome.
