Supplementary figures of Calibration and validation of predicted genomic breeding values in an advanced cycle maize population

Hans-Jürgen Auinger, Christina Lehermeier, Daniel Gianola, Manfred Mayer, Albrecht E. Melchinger, Sofia da Silva, Carsten Knaak, Milena Ouzunova, Chris-Carolin Schön

Figure S1: Heatmap of pairwise kinship coefficients in data set S_all (N = 5,968, M = 9,742). DH lines were hierarchically clustered within individual data sets using the unweighted pair group method with arithmetic mean.

[Figure S2 plot. Panel (a): x-axis N; CS including S3, R² = 0.70; CS without S3, R² = 0.60. Panel (b): x-axis N_eff; CS including S3, R² = 0.56; CS without S3, R² = 0.39.]

Figure S2: Relationship of (a) size of the calibration set (N) and (b) effective sample size of the calibration set (N_eff) with average maximum kinship (u_max) in combination with prediction set S6 for 16 calibration sets including data set S3 and 15 calibration sets not including data set S3.
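The R² values annotated in the panels of Figure S2 summarise how closely one parameter tracks another across calibration sets. A minimal sketch of such a computation follows, assuming R² is the coefficient of determination of a simple linear regression; the source does not state how the values were obtained. The function name r_squared and the numbers in the example are hypothetical.

```python
def r_squared(x, y):
    """Coefficient of determination of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Least-squares slope and intercept.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only (not data from the study).
calibration_set_size = [1000, 2000, 3000, 4000, 5000]
average_max_kinship = [0.16, 0.18, 0.19, 0.22, 0.23]
print(round(r_squared(calibration_set_size, average_max_kinship), 2))
```

Under this assumption, R² equals the squared Pearson correlation between the two variables, so it measures the strength of the linear association but not its direction.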

[Figure S3 plot. Six panels, each with y-axis "Prediction accuracy". X-axis and the two R² values printed in each panel: N (R² = 0.5, R² = 0.49); N_eff (R² = 0.63, R² = 0.11); u_max (R² = 0.73, R² = 0.55); reliability (R² = 0.73, R² = 0.74); nPoly (R² = 0.07, R² = 0.43); LPS (R² = 0.85, R² = 0.65). Legend: S5, S6.]

Figure S3: Relationship of prediction accuracy for grain dry matter content and the parameters sample size (N), effective sample size (N_eff), number of polymorphic SNPs shared by the calibration and the prediction set (nPoly), average maximum kinship (u_max), linkage phase similarity (LPS), and the expected reliability for 15 calibration sets predicting genomic breeding values (GBVs) in S5 and 31 calibration sets predicting GBVs in S6.

[Figure S4 plot. Recovered axis labels: "Prediction accuracy including S3" and "Prediction accuracy without S1"; axis range 0.1 to 0.4. Legend (calibration set combinations): S1_2, S1_3, S1_4, S1_5, S2_3, S2_4, S2_5, S3_4, S3_5, S4_5, S1_2_3, S1_2_4, S1_2_5, S1_3_4, S1_3_5, S1_4_5, S2_3_4, S2_3_5, S2_4_5, S3_4_5, S1_2_3_4, S1_2_3_5, S1_2_4_5, S1_3_4_5, S2_3_4_5, S1_2_3_4_5.]

Figure S4: Relationship of prediction accuracies for grain dry matter yield in S6 obtained with calibration sets including a specific data set (e.g. all possible calibration sets including S1) and corresponding accuracies obtained with calibration sets not including the specific set. Colour coding refers to the combinations including the specific set.

[Figure S5 plot. Both panels: x-axis "Expected reliability", y-axis "Empirical reliability"; legend S5, S6. Panel (a): R² = 0.45 and R² = 0.62. Panel (b): R² = 0.73 and R² = 0.72.]

Figure S5: Relationship of empirical reliability and expected reliability for (a) grain dry matter yield and (b) grain dry matter content for 15 calibration sets predicting genomic breeding values in S5 and 31 calibration sets predicting genomic breeding values in S6.

[Figure S6 plot. Panel (a): x-axis N_eff, y-axis "Prediction accuracy GDY". Panel (b): x-axis N_eff, y-axis "Prediction accuracy GDC". Legend in both panels: CS including S5, CS without S5.]

Figure S6: Relationship between effective sample size of the calibration set (N_eff) and prediction accuracy in S6 for (a) grain dry matter yield (GDY) and (b) grain dry matter content (GDC), shown for 16 calibration sets including data set S5 and 15 calibration sets not including data set S5.