
Figure S1: Histograms of the prevalence of species in the different datasets. The coloured bars represent the species used in modelling and the white bars species that were observed but too infrequently (< 10 occurrences) for modelling.

Figure S2: Histograms of site species richness (SR) observed for the different datasets. The top row is based on all species observed in the field surveys and the bottom row is based on the subset of species used in the modelling.

Figure S3: Evaluation metrics for the individual SDMs for all four taxa. The solid line inside the box indicates the median, boxes range from the 25th to the 75th percentile, and the whiskers indicate ± 2 standard deviations.

Figure S4: Boxplot of the community AUC (cAUC) for the four different taxa. The solid line inside the box indicates the median, boxes range from the 25th to the 75th percentile, and the whiskers indicate ± 2 standard deviations.

Figure S5: Standardised species richness deviation between observations and predictions based on either “species-specific” thresholding (Fixed, Max.TSS, Max.Kappa), “site-specific” thresholding, or simply the sum of the predicted probabilities of all species in a site. Groups with significant differences in their means (as measured by the pairwise Wilcoxon test) are indicated by different letters. The solid line inside the box indicates the median, the boxes range from the 25th to the 75th percentile, and the whiskers indicate ± 2 standard deviations.

Figure S6: Correlation of the Jaccard similarity metrics based on the binary data with threshold-independent Jaccard similarity metrics. The top row shows the correlation of the binary metrics with maxJaccard, and the bottom row shows the correlation of the binary metrics with probJaccard. The binary Jaccard similarity of panels a and d is based on a “species-specific” maxTSS threshold, of panels b and e on a “species-specific” maxKappa threshold, and of panels c and f on a “site-specific” probability ranking rule. The numbers in the legends indicate the Spearman correlation coefficient between the binary and threshold-independent metrics.

Appendix 1: Script to test the H0 that the observed and expected SR are not different

The script provided here creates a virtual community of species (sp.pool). Each of these species is then randomly assigned a probability of occurrence/prevalence (sp.preval). Based on these two parameters, a realized community/SR is randomly created by drawing from the Poisson-Binomial distribution of the probabilities of the species at a given site (using independent Bernoulli trials). The “realized communities” (based on 100000 simulations, rep) can then be compared with the expectations derived directly from the probabilities (i.e., from the probability mass function, equation 2, under the assumption of a Poisson-Binomial distribution).

The simulated frequencies of the species richness (Figure A1.1) are almost identical to the expectations based on the probability mass function (Figure A1.2). This results in a strong correlation between the simulated species richness and the theoretical expectations (Figure A1.3), given that the probabilities are “correct” and follow a Poisson-Binomial distribution.

For all calculations based on the Poisson-Binomial distribution we used the R package poibin (Hong 2013), which provides efficient functions for the cumulative distribution function (cdf), probability mass function (pmf), quantile function, and random number generation.
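As a minimal illustration (not part of the original script; sr.obs and p.occ are hypothetical stand-ins), a two-sided p-value for the H0 that an observed SR follows the Poisson-Binomial distribution of the predicted probabilities can be obtained directly from the cdf:

library(poibin) # ppoibin() is the Poisson-Binomial cdf (Hong 2013)

p.occ <- runif(80)  # hypothetical predicted occurrence probabilities at one site
sr.obs <- 35        # hypothetical observed species richness at that site
p.low  <- ppoibin(sr.obs, p.occ)          # P(SR <= observed)
p.high <- 1 - ppoibin(sr.obs - 1, p.occ)  # P(SR >= observed)
p.value <- min(1, 2 * min(p.low, p.high)) # two-sided p-value for the H0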

References

Hong, Y. (2013). On computing the distribution function for the Poisson binomial distribution. Computational Statistics & Data Analysis, 59, 41-51.

Script 1. This script creates random communities to show the basic principles of the p-value calculations mentioned in the main manuscript.

################################################################
### p-value Null models ########################################
################################################################

library(poibin) # provides dpoibin() and qpoibin() used below

# Basic parameters
sp.pool <- round(runif(1, min=50, max=100)) # Random number of potential species occurring at a site (regional species pool)
sp.preval <- round(runif(sp.pool, min=0, max=1), 3) # Random probability for each species to occur at the site(s)/prevalence
rep <- 100000 # Number of times the binomial distribution is drawn to create the "observed" species richness

# Site parameters calculated from the input data
expected.SR <- sum(sp.preval) # The expected species richness
p.dist <- dpoibin(1:sp.pool, sp.preval) # Expected probability of each possible SR from 1 to sp.pool based on sp.preval

# Simple model to create n=rep realisations of the probability distribution
obs.SR <- NULL
for(i in 1:rep){
  obs.SR <- c(obs.SR, sum(rbinom(sp.pool, 1, sp.preval)))
}

# Histogram of the observed SR based on independent Bernoulli trials
SR.hist <- hist(obs.SR, breaks=0:sp.pool)
SR.hist$counts <- SR.hist$counts/rep # Standardisation by the number of repetitions
CI <- qpoibin(c(0.025, 0.975), sp.preval) # 95% confidence interval based on the cumulative distribution function

# Figure A1.1
plot(SR.hist, xlab="Species richness", col="grey", main="")
abline(v=CI, lty=3, col="red")
text(x=CI[1], y=max(SR.hist$counts), labels=paste(round(sum(obs.SR <= CI[1])/rep*100, 1), "%", sep=""), adj=c(1,1), col="red")
text(x=CI[2], y=max(SR.hist$counts), labels=paste(round(sum(obs.SR >= CI[2])/rep*100, 1), "%", sep=""), adj=c(0,1), col="red")

# Figure A1.2
plot(p.dist, type="b", pch=16, xlab="Species richness", ylab="Expected proportion based on mass function")

# Figure A1.3
plot(p.dist, SR.hist$counts, ylab="SR simulations based on Bernoulli trials", xlab="Expectation based on probability mass function", pch=16)

Figure A1.1: Histogram of the simulated species richness based on independent Bernoulli trials. The red lines represent the 95% confidence interval based on the cumulative distribution function and the red numbers the percentage of simulations outside of the confidence interval. The presented example is based on a species pool of 90 species and 100'000 simulations.

Figure A1.2: Expected proportions for all possible species richness values based on the probability mass function of a Poisson-Binomial distribution. The presented example is based on a species pool of 90 species.

Figure A1.3: Correlation of the expected proportion based on the mass function and the simulated proportion based on independent Bernoulli trials.

Appendix 2: Simulations with virtual species

Reasoning behind these simulations

In this Appendix we describe how the presented evaluation approaches respond to errors in the species data used for model calibration. These errors usually arise either from detection issues (i.e., creating false absences) or from misidentification of species (i.e., creating false absences and false presences). While these biases might to some degree be present in almost all “real world” data sources, the bias usually remains poorly known and the data are simply assumed to be “correct”.

Here, using virtual species (known truth) and adding errors in a controlled environment, we explored how the different community evaluation approaches behave both in the case of detection issues (i.e., false absences) and misidentification (i.e., false absences and presences). We focus our analysis solely on these (newly) suggested evaluation approaches and refer to another publication that studies the effects of detection issues and misidentification on SDMs in much more depth (Fernandes, Scherrer & Guisan 2019).

In most published studies “the same data” are used for model calibration and evaluation (i.e., cross-validation, split-sample) and the unbiased truth is unknown. In these cases (i.e., identical bias in calibration and evaluation data), all the presented approaches perform equally well and give accurate information about the model performance under the given bias (i.e., how well the data are predicted by the model, rather than how well the (unknown) truth is predicted). However, if one has an idea about the bias affecting the data, the simulation study on virtual species presented here might help to select an evaluation approach that is less affected by bias in the initial data.

Creation of virtual species/communities

We created a set of 100 virtual species that were loosely based on existing plant species to maintain ecological realism. However, in contrast to the distribution of “real world” species, which is determined by a multitude of abiotic and biotic factors, the distribution of our 100 virtual species is solely determined by six environmental factors (annual mean temperature, annual temperature range, annual sum of precipitation, potential annual solar radiation, slope and topographic position).

We then projected the niches of the virtual species onto a set of 720 sites covering a large range of environmental conditions. Based on this dataset of 720 sites, our virtual plant species had a prevalence (i.e., percentage of sites with presence) ranging from 0.2 to 0.8 and a site SR of 40.7 ± 9.8 (mean ± sd). This set of virtual communities based on 100 species and 720 sites was then considered our “known truth” and used to evaluate the performance of our models and evaluation approaches.
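A minimal sketch of this construction (assuming logistic responses to the six predictors; env, beta and all other names are illustrative stand-ins, not the data or code actually used):

# Hedged sketch: virtual species as logistic responses to the environment.
set.seed(1)
n.sites <- 720; n.sp <- 100
env   <- matrix(rnorm(n.sites * 6), ncol = 6)  # stand-in for the six scaled predictors
beta  <- matrix(rnorm(n.sp * 6), ncol = 6)     # random niche coefficients per species
alpha <- runif(n.sp, -1, 1)                    # intercepts controlling prevalence
p.occ <- plogis(sweep(env %*% t(beta), 2, alpha, "+")) # site x species occurrence probabilities
truth <- matrix(rbinom(length(p.occ), 1, p.occ), nrow = n.sites) # "known truth" presence/absence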

Simulation of errors

We tested five different levels of added error (0, 5%, 10%, 30%, 50%) and two different types of error, emulating detection issues (i.e., adding false absences) and misidentification (i.e., adding false absences and false presences). To simulate detection issues, we simply changed a random X% of the presences into absences. To simulate misidentifications, we changed X% of presences or absences into the opposite (0 -> 1 or 1 -> 0). The selection of which absences/presences to change was random.
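A minimal sketch of the two error types on a binary site x species matrix (reusing truth from the sketch above; the function names are illustrative):

# Detection issues: randomly turn X% of the presences into false absences.
add.detection.error <- function(truth, error.rate) {
  out <- truth
  pres <- which(out == 1)
  out[sample(pres, round(length(pres) * error.rate))] <- 0
  out
}
# Misidentification: randomly flip X% of all cells (0 -> 1 or 1 -> 0).
add.misid.error <- function(truth, error.rate) {
  out <- truth
  flip <- sample(length(out), round(length(out) * error.rate))
  out[flip] <- 1 - out[flip]
  out
}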

For each simulation we calibrated our models on the virtual communities with errors added (i.e., containing false absences/presences) and then evaluated the models against the “known truth” (i.e., the data with no error). All models were run in R 3.6.1 using biomod2 (Thuiller et al. 2009; Thuiller et al. 2016), with GLM (regression-based) and RF (decision-tree-based) as modelling techniques and 5-fold cross-validation. For each level of error, we ran 100 simulations (i.e., 100 different randomized errors).
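The following sketch illustrates the calibrate-on-error, evaluate-on-truth logic with a plain GLM as a stand-in (not the actual biomod2 workflow, which we do not reproduce here); auc() is a simple rank-based implementation, and truth, env and add.misid.error come from the sketches above:

# Hedged sketch: calibrate a single-species SDM on noisy data, evaluate on the truth.
auc <- function(obs, pred) {                  # Mann-Whitney formulation of AUC
  r <- rank(pred); n1 <- sum(obs == 1); n0 <- sum(obs == 0)
  (sum(r[obs == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}
noisy <- add.misid.error(truth, 0.10)         # 10% misidentification error
sp <- 1                                       # first virtual species
fit  <- glm(noisy[, sp] ~ env, family = binomial) # calibration on biased data
pred <- predict(fit, type = "response")
auc(truth[, sp], pred)                        # evaluation against the error-free truth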

The quality of our (single-species) SDMs was evaluated by AUC, and the community predictions based on the S-SDMs were evaluated with all the approaches presented in the main manuscript.

Performance of individual SDMs

As expected, the performance of the individual SDMs (individual species) was negatively affected by the addition of errors to the calibration data (Fig. A2.1). However, the effect of omissions (i.e., detection issues; false absences) was very small (Fig. A2.1) compared to the effect of misidentifications (i.e., having also false presences). This pattern is well known (see e.g., Fernandes, Scherrer & Guisan 2019) and is mostly explained by the fact that, for most modelling techniques, the presences provide the information signal and the absences mostly the background information (which is also why presence-only models work well with most techniques). There was no difference in model performance between the two modelling techniques (GLM, RF) chosen (Fig. A2.1).

Fig. A2.1: Average model performance of individual SDMs measured by AUC as a function of the level of error (detection issues or misidentification) added to the calibration data and evaluated on the “known truth”.

Effect of detection issues (i.e., false absences)

The effect of detection issues (i.e., false absences) varied strongly among the different evaluation approaches. cAUC, the maximization approaches (maxSørensen, maxJaccard) and the probability sum ratios (probSørensen, probJaccard) are only slightly affected (Fig. A2.2), while the deviation of SR (Fig. A2.3) and the improvement over null-models show strong signals (Fig. A2.4).

To understand why cAUC, the maximization approaches and the probability sum ratios are largely unaffected, we have to analyze the effects of adding false absences. If we keep in mind that the average probability of a species to occur at a site is equal to its prevalence (i.e., the proportion of sites occupied), we can directly see that omitting the species at sites (i.e., creating false absences) reduces the number of occupied sites and therefore the prevalence and the average probability to occur at a site. Detection issues therefore lead to a reduction/underestimation of the average probability of species to occur at a site. However, as for the “normal” AUC, the cAUC is not directly affected by the probabilities per se but only by their ranking (i.e., species ordered from lowest to highest probability of occurrence). As a result, the cAUC is not affected by detection issues as long as the detectability is similar across the species of a community. In the case of highly variable detectability within the community, the cAUC will be more affected (as demonstrated/explained below in the misidentification section).

The same explanation is valid for the maximization approaches (maxSørensen, maxJaccard) and the probability sum ratios (probSørensen, probJaccard), as all of these depend on the ranking of probabilities rather than on the probabilities per se.
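A minimal sketch of the maximization idea for one site (our reading of maxSørensen; the exact definition is given in the main manuscript): the Sørensen similarity between observed and predicted composition is computed over a grid of thresholds and only the maximum is kept, so the score depends on the ranking of the probabilities rather than on their absolute values.

# Hedged sketch: threshold-independent maxSorensen for a single site.
max.sorensen <- function(obs, pred, thresholds = seq(0, 1, by = 0.01)) {
  scores <- sapply(thresholds, function(th) {
    bin <- as.integer(pred >= th)
    a <- sum(bin == 1 & obs == 1)       # species both predicted and observed
    b <- sum(bin == 1 & obs == 0)       # predicted but not observed
    c <- sum(bin == 0 & obs == 1)       # observed but not predicted
    if (2 * a + b + c == 0) NA else 2 * a / (2 * a + b + c)
  })
  max(scores, na.rm = TRUE)
}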

Fig. A2.2: Community evaluation based on cAUC, maxSørensen and probSørensen as a function of the level of detection issues (false absences) added to the calibration data and evaluated on the “known truth”. A detectability of 1 reflects perfect detection, while a detectability of 0.5 reflects a 50% omission error (i.e., 50% of presences changed into absences).

The strong effect of detection issues on the deviation in SR is not surprising. As mentioned before, the omission of species leads to a reduction in the average probabilities. As the expected SR (E(SR)) is defined as the sum of the probabilities at a site, the omission of species directly leads to an underestimation of the SR. Therefore, the more species are omitted, the larger the difference between observed and expected SR. Again, it is important to note that if no “known truth” is available and the same dataset is used for calibration and evaluation (e.g., using cross-validation), there will be no difference between the average expected and average observed SR, as the observed SR is identically affected by the omission of species.
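This shrinking of the expected SR can be made concrete with the sketches above (reusing truth and add.detection.error; the shrinkage factor is approximate because presences are removed globally at random):

# Hedged sketch: omission errors shrink prevalence and hence the expected SR.
d <- 0.7                                    # detectability of 0.7 = 30% omission error
prev.true <- colMeans(truth)                # true prevalence per species
prev.err  <- colMeans(add.detection.error(truth, 1 - d)) # ~ d * prev.true on average
c(E.SR.true = sum(prev.true), E.SR.err = sum(prev.err))  # expected SR drops by ~30%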

Fig. A2.3: The probability of obtaining the calculated deviation in SR or higher based on the predicted probabilities of occurrence, depending on the level of detection issues (false absences) added to the calibration data and evaluated on the “known truth”. A detectability of 1 reflects perfect detection, while a detectability of 0.5 reflects a 50% omission error (i.e., 50% of presences changed into absences).

The improvement over null-models was also strongly affected by detection issues. This is not surprising, as this approach is based directly on the probabilities of occurrence. Changing (i.e., in this case reducing) these average probabilities leads to a lower likelihood of getting the true composition and SR correct, and consequently to a lower improvement compared to the null-model.
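A minimal sketch of our reading of this comparison (the exact null-model follows the main manuscript; this version uses composition log-likelihoods and reuses truth and p.occ from the sketches above):

# Hedged sketch: log-fold improvement over a prevalence-based null-model, one site.
loglik.comp <- function(obs, p) sum(log(ifelse(obs == 1, p, 1 - p)))
null.p <- colMeans(truth)                  # null-model: average prevalence per species
site <- 1
ll.model <- loglik.comp(truth[site, ], p.occ[site, ]) # predicted probabilities as stand-in
ll.null  <- loglik.comp(truth[site, ], null.p)
ll.model - ll.null                         # > 0: model improves over the null-model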

Fig. A2.4: Log-fold improvement of species richness and species composition compared to a null-model based on the average prevalence of the observed species, depending on the level of detection issues (false absences) added to the calibration data and evaluated on the “known truth”. A detectability of 1 reflects perfect detection, while a detectability of 0.5 reflects a 50% omission error (i.e., 50% of presences changed into absences).

Effect of misidentification (i.e., false absences and presences)

The effects of misidentification were much stronger than the effects of detection issues, especially for cAUC, the maximization approaches (maxSørensen, maxJaccard) and the probability sum ratios (probSørensen, probJaccard; Fig. A2.5). As mentioned before, these approaches are based on the ranking of probabilities and are therefore very sensitive to changes in those rankings. In contrast to only adding false absences (detection issues), misidentification also adds false presences. Due to the random process of removing and adding presences, the prevalence (and therefore the average predicted probability) changes. However, in contrast to the detection-issue simulations, the misidentification simulations did not reduce the prevalence uniformly but randomly increased or decreased it. This automatically leads to changes in the ranking of species, and therefore to a strong effect on cAUC, the maximization approaches and the probability sum ratios. This extreme case of misidentification is similar to the above-mentioned scenario of detection issues affecting species differently: if some species can be detected perfectly (0 omissions) and others only poorly (high omission error), the prevalence of the species changes non-uniformly, affecting the ranking of probabilities similarly to (but usually more weakly than) misidentification.

Fig. A2.5: Community evaluation based on cAUC, maxSørensen and probSørensen as a function of the level of misidentification (false presences and false absences) added to the calibration data and evaluated on the “known truth”.

The average deviation in SR between observation and expectation is less affected by misidentification than by detection issues (Fig. A2.6). This is expected: theoretically, the deviation in SR should not be affected by the addition of presences and absences as long as the same number of presences is added and removed (i.e., the overall SR stays the same and only the composition changes). However, as most of our species have a prevalence below 0.5, it is slightly more likely that an absence is changed into a presence than vice versa, leading to a slight change in average SR.

Fig. A2.6: The probability of obtaining the calculated deviation in SR or higher based on the predicted probabilities of occurrence, depending on the level of misidentification (false presences and false absences) added to the calibration data and evaluated on the “known truth”.

The improvement over null-models is similarly affected by misidentification and detection issues. As these approaches are based directly on the probabilities and take into account both the probability of being present and of being absent, a random increase or decrease of the probabilities leads to similar patterns (Fig. A2.7).

Fig. A2.7: Log-fold improvement of species richness and species composition compared to a null-model based on the average prevalence of the observed species, depending on the level of misidentification (false presences and false absences) added to the calibration data and evaluated on the “known truth”.

References

Fernandes, R.F., Scherrer, D. & Guisan, A. (2019) Effects of simulated observation errors on the performance of species distribution models. Diversity and Distributions, 25, 400-413.

Thuiller, W., Georges, D., Engler, R. & Breiner, F. (2016) biomod2: Ensemble Platform for Species Distribution Modeling.

Thuiller, W., Lafourcade, B., Engler, R. & Araujo, M.B. (2009) BIOMOD - a platform for ensemble forecasting of species distributions. Ecography, 32, 369-373.
