maize population
Hans-Jürgen Auinger, Christina Lehermeier, Daniel Gianola, Manfred Mayer, Albrecht E. Melchinger, Sofia da Silva,
Carsten Knaak, Milena Ouzunova, Chris-Carolin Schön
Figure S1: Heatmap of pairwise kinship coefficients in data set Sall (N = 5,968, M = 9,742). DH lines were hierarchically clustered within individual data sets using the unweighted pair group method with arithmetic mean.
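The row and column order of such a heatmap follows from the clustering step. The sketch below is a minimal pure-Python illustration of the unweighted pair group method with arithmetic mean (average linkage), not the authors' code. The function name `upgma_order`, the toy 4 x 4 kinship matrix, and the conversion of kinship to distance as 1 minus kinship are all assumptions made for illustration; the caption specifies only the clustering method.

```python
def upgma_order(dist):
    """Return the leaf order from UPGMA (average-linkage) clustering.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    """
    n = len(dist)
    # Active clusters: id -> leaves in the order they appear in the dendrogram.
    clusters = {i: [i] for i in range(n)}
    # Distances between active clusters, keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        a, b = min(d, key=d.get)  # closest pair of active clusters
        size_a, size_b = len(clusters[a]), len(clusters[b])
        merged = clusters.pop(a) + clusters.pop(b)
        # UPGMA update: size-weighted mean of the two old distances.
        new_d = {}
        for c in clusters:
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        # Drop every distance that involves one of the merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = merged
        next_id += 1
    return next(iter(clusters.values()))


# Hypothetical kinship matrix for four lines (illustrative values only).
kinship = [
    [1.00, 0.10, 0.80, 0.20],
    [0.10, 1.00, 0.15, 0.70],
    [0.80, 0.15, 1.00, 0.10],
    [0.20, 0.70, 0.10, 1.00],
]
# Assumed transform: distance = 1 - kinship.
distance = [[1.0 - k for k in row] for row in kinship]
order = upgma_order(distance)
print(order)
```

Reordering the rows and columns of the kinship matrix by `order` places closely related lines next to each other, which is what produces visible blocks in a heatmap.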
[Figure S2 plot. Panel (a), x-axis N: CS including S3, R² = 0.70; CS without S3, R² = 0.60. Panel (b), x-axis Neff: CS including S3, R² = 0.56; CS without S3, R² = 0.39.]
Figure S2: Relationship of (a) size of the calibration set (N) and (b) effective sample size of the calibration set (Neff) with average maximum kinship (umax) in combination with prediction set S6 for 16 calibration sets including data set S3 and 15 calibration sets not including data set S3.
[Figure S3 plot. Six panels of prediction accuracy against N, Neff, umax, reliability, nPoly, and LPS. R² values per panel, in the order printed (panel legend: S1, S2; point symbols: S5, S6): N: 0.5, 0.49; Neff: 0.63, 0.11; umax: 0.73, 0.55; reliability: 0.73, 0.74; nPoly: 0.07, 0.43; LPS: 0.85, 0.65.]
Figure S3: Relationship of prediction accuracy for grain dry matter content and the parameters sample size (N), effective sample size (Neff), number of polymorphic SNPs shared by the calibration and the prediction set (nPoly), average maximum kinship (umax), linkage phase similarity (LPS), and the expected reliability for 15 calibration sets predicting genomic breeding values (GBV) in S5 and 31 calibration sets predicting GBVs in S6.
[Figure S4 plot. Five panels, one per data set S1 to S5: prediction accuracy without the data set against prediction accuracy including the data set. Legend (calibration set combinations): S1_2, S1_3, S1_4, S1_5, S2_3, S2_4, S2_5, S3_4, S3_5, S4_5, S1_2_3, S1_2_4, S1_2_5, S1_3_4, S1_3_5, S1_4_5, S2_3_4, S2_3_5, S2_4_5, S3_4_5, S1_2_3_4, S1_2_3_5, S1_2_4_5, S1_3_4_5, S2_3_4_5, S1_2_3_4_5.]
Figure S4: Relationship of prediction accuracies for grain dry matter yield in S6 obtained with calibration sets including a specific data set (e.g. all possible calibration sets including S1) and corresponding accuracies obtained with calibration sets not including the specific set. Colour coding refers to the combinations including the specific set.
[Figure S5 plot. Panels (a) and (b): expected reliability against empirical reliability. R² values in the order printed (legend: S5, S6): panel (a): 0.45, 0.62; panel (b): 0.73, 0.72.]
Figure S5: Relationship of empirical reliability and expected reliability for (a) grain dry matter yield and (b) grain dry matter content for 15 calibration sets predicting genomic breeding values in S5 and 31 calibration sets predicting genomic breeding values in S6.
[Figure S6 plot. Panels (a) and (b): Neff against prediction accuracy, for GDY in (a) and GDC in (b). Legend: CS including S5, CS without S5.]
Figure S6: Relationship of effective sample size of the calibration set (Neff) for 16 calibration sets including data set S5 and 15 calibration sets not including data set S5 with prediction accuracy in S6 for (a) grain dry matter yield (GDY) and (b) grain dry matter content (GDC).