
5.3 Prediction of injury and fatality probability

5.3.2 Multivariate analysis: MAIS or ISS as injury scale


Table 5.11: Frequencies of target variables (GIDAS).

Scale        Injury level   Cases
MAIS         0-1            435
             2+             450
             0-2            731
             3+             154
             0-3            822
             4+              63
             0-4            856
             5+              29
ISS          0-8            682
             9+             195
             0-15           792
             16+             85
             0-24           826
             25+             51
Fatalities   not fatal      866
             fatal           46

to 21.6 % in the original GIDAS data set. The root-mean-square (RMS) deviation of the regression coefficient for $m_{\mathrm{ped,GIDAS}}$ was 6.94 %. For body height, five instances with a total of 43.4 % imputed values each have been computed, compared to 21.7 % in the original data. The RMS deviation of the regression coefficient for $r_{1,\mathrm{GIDAS}}$ was 1.74 %. These deviations are much smaller than the confidence intervals resulting from logistic regression.

Using Eq. 5.1, the resulting models can be written as follows:

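\[
p_{\mathrm{ISS9+,GIDAS}} = \frac{1}{1 + \exp\left(1.650 - 1.296 \cdot v_{c,\mathrm{GIDAS}} - 0.781 \cdot y_{\mathrm{ped}}\right)} \tag{5.17}
\]

\[
\begin{aligned}
p_{\mathrm{ISS16+,GIDAS}} = \bigl(1 + \exp\bigl(&3.631 - 0.885 \cdot y_{\mathrm{ped}} - 0.792 \cdot m_{\mathrm{ped,GIDAS}} - 1.629 \cdot v_{c,\mathrm{GIDAS}} \\
&+ 0.435 \cdot \mathrm{LBRL}_{v,\mathrm{GIDAS}} - 0.35 \cdot \mathrm{BLE}_{l,\mathrm{GIDAS}} + 0.699 \cdot r_{1,\mathrm{GIDAS}}\bigr)\bigr)^{-1}
\end{aligned} \tag{5.18}
\]

\[
p_{\mathrm{ISS25+,GIDAS}} = \frac{1}{1 + \exp\left(3.822 - 1.157 \cdot v_{c,\mathrm{GIDAS}} - 1.009 \cdot y_{\mathrm{ped}}\right)} \tag{5.19}
\]

\[
p_{\mathrm{Fatalities,GIDAS}} = \frac{1}{1 + \exp\left(4.391 - 1.373 \cdot v_{c,\mathrm{GIDAS}} - 1.237 \cdot y_{\mathrm{ped}}\right)} \tag{5.20}
\]

An example of the practical use of the models is given using Eq. 5.17. Note that the parameters have been transformed (see Subsection 5.2.3) using mean and standard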

Table 5.12: Multivariate logistic regression models for ISS9+, ISS16+, and ISS25+, age group 4+ (GIDAS).

Variable                                    Symbol          Scaling factor   p-value   Adjusted odds ratio   95 % CI
ISS9+ (AIC: 595; BIC: 609)
Age                                         y_ped           25.8             <0.001    2.185                 1.771-2.694
Impact speed                                v_c,GIDAS       17.0             <0.001    3.656                 2.833-4.716
Constant                                    exp(β0)                                    0.192
ISS16+ (AIC: 283; BIC: 317)
Body weight                                 m_ped,GIDAS     21.3             0.002     2.209                 1.351-3.612
Age                                         y_ped           25.8             <0.001    2.423                 1.679-3.497
Impact speed                                v_c,GIDAS       17.0             <0.001    5.097                 3.506-7.410
Lower bumper reference line - vert.         LBRL_v,GIDAS    9.2              0.022     0.647                 0.446-0.939
Bonnet lead. e. - l.                        BLE_l,GIDAS     3.0              0.033     1.419                 1.030-1.955
Height to up. bum. reference line - vert.   r_1,GIDAS       0.4              0.010     0.497                 0.293-0.844
Constant                                    exp(β0)                                    0.026
ISS25+ (AIC: 234; BIC: 248)
Age                                         y_ped           25.8             <0.001    2.744                 1.837-4.099
Impact speed                                v_c,GIDAS       17.0             <0.001    3.180                 2.332-4.337
Constant                                    exp(β0)                                    0.022


Table 5.13: Multivariate logistic regression models for MAIS2+, MAIS3+, MAIS4+, and MAIS5+, age group 4+ (GIDAS).

Variable                                    Symbol          Scaling factor   p-value   Adjusted odds ratio   95 % CI
MAIS2+ (AIC: 864; BIC: 888)
Age                                         y_ped           25.8             <0.001    2.125                 1.750-2.581
Impact speed                                v_c,GIDAS       17.0             <0.001    3.144                 2.494-3.964
Lower bumper reference line - vert.         LBRL_v,GIDAS    9.2              0.019     0.819                 0.694-0.967
Height to winds. up. edge (wrap)            r_7,GIDAS       0.1              0.013     0.782                 0.644-0.950
Constant                                    exp(β0)                                    1.197
MAIS3+ (AIC: 534; BIC: 553)
Age                                         y_ped           25.8             <0.001    2.118                 1.691-2.653
Impact speed                                v_c,GIDAS       17.0             <0.001    2.975                 2.333-3.793
Constant                                    exp(β0)                                    0.128
MAIS4+ (AIC: 259; BIC: 283)
Age                                         y_ped           25.8             <0.001    2.643                 1.787-3.909
Impact speed (squared)                      v_c,GIDAS^2     1473.5           <0.001    3.258                 2.334-4.547
Bonnet lead. e. - l.                        BLE_l,GIDAS     3.0              0.006     1.645                 1.155-2.344
Height to hood rear end (wrap)              r_5,GIDAS       0.2              0.001     1.850                 1.307-2.619
Constant                                    exp(β0)                                    0.025
MAIS5+ (AIC: 172; BIC: 192)
Age                                         y_ped           25.8             0.005     2.033                 1.246-3.319
Impact speed                                v_c,GIDAS       17.0             <0.001    2.789                 1.982-3.925
Height to hood rear end (wrap)              r_5,GIDAS       0.2              0.013     1.534                 1.093-2.153
Constant                                    exp(β0)                                    0.013

Table 5.14: Multivariate logistic regression models for fatalities, age group 4+ (GIDAS).

Variable                                    Symbol          Scaling factor   p-value   Adjusted odds ratio   95 % CI
Fatalities (AIC: 203; BIC: 218)
Age                                         y_ped           25.8             <0.001    3.445                 2.163-5.488
Impact speed                                v_c,GIDAS       17.0             <0.001    3.946                 2.829-5.503
Constant                                    exp(β0)                                    0.012

Table 5.15: Predictive accuracy of the models given by equations 5.17 to 5.20 (GIDAS). The number of included parameters is given by k.

             Full-data model                 Cross-validation
Model        ROC AUC   95 % CI       k       ROC AUC   SD      SE      Optimism
ISS9+        0.831     0.798-0.864   2       0.827     0.046   0.015   0.004
ISS16+       0.921     0.891-0.950   6       0.879     0.070   0.023   0.041
ISS25+       0.883     0.841-0.925   2       0.861     0.065   0.022   0.022
MAIS2+       0.764     0.732-0.797   4       0.749     0.041   0.014   0.015
MAIS3+       0.818     0.777-0.860   3       0.811     0.080   0.027   0.008
MAIS4+       0.904     0.869-0.938   4       0.873     0.030   0.010   0.031
MAIS5+       0.888     0.841-0.935   3       0.759     0.171   0.057   0.129
Fatalities   0.925     0.891-0.959   2       0.915     0.054   0.018   0.010

deviation. For example, a vehicle impact speed of 35 kph and a pedestrian age of 20 years are used. Together with mean and standard deviation from Table A.1, p. 188, Eq. 5.17 delivers the following probability:

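\[
\begin{aligned}
p_{\mathrm{ISS9+,GIDAS,expl.}} &= \frac{1}{1 + \exp\left(-\ln(0.192) - \ln(3.656) \cdot v_{c,\mathrm{GIDAS}} - \ln(2.185) \cdot y_{\mathrm{ped}}\right)} \\
&= \frac{1}{1 + \exp\left(1.650 - 1.296 \cdot v_{c,\mathrm{GIDAS}} - 0.781 \cdot y_{\mathrm{ped}}\right)} \\
&= \frac{1}{1 + \exp\left(1.650 - 1.296 \cdot \frac{35 - 29.35}{17.04} - 0.781 \cdot \frac{20 - 35.91}{25.83}\right)} \\
&= 0.155
\end{aligned}
\]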
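The worked example can also be reproduced numerically. The following is a minimal sketch, not the implementation used in the thesis: the function name and argument names are hypothetical, and the default means and standard deviations are the values that appear in the worked example (taken there from Table A.1). With the rounded coefficients of Eq. 5.17 it yields about 0.154, which agrees with the 0.155 stated above up to rounding.

```python
import math


def p_iss9_gidas(speed_kph, age_years,
                 speed_mean=29.35, speed_sd=17.04,
                 age_mean=35.91, age_sd=25.83):
    """Evaluate Eq. 5.17 for raw impact speed [kph] and pedestrian age [years].

    The predictors are standardized (value minus mean, divided by standard
    deviation) before they enter the logistic model, as in the worked example.
    """
    v = (speed_kph - speed_mean) / speed_sd
    y = (age_years - age_mean) / age_sd
    return 1.0 / (1.0 + math.exp(1.650 - 1.296 * v - 0.781 * y))


# Worked example: impact speed 35 kph, pedestrian age 20 years.
p_example = p_iss9_gidas(35.0, 20.0)
print(f"p(ISS9+) = {p_example:.3f}")
```

Standardizing inside the function keeps the call site in physical units, which avoids applying the coefficients to unscaled values by mistake.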

The predictive performance of the models is quantified using receiver operating characteristic (ROC) analysis. The in-sample predictive accuracy is given by the area under the curve (AUC) of the ROC. The expected out-of-sample accuracy is estimated from the results of the 10-fold cross-validation. Table 5.15 gives the corresponding results. The predictive quality is remarkably high, both in-sample (ROC AUC 0.764 to 0.925) and out-of-sample (ROC AUC 0.749 to 0.915). The optimism, quantifying the difference between in-sample and out-of-sample predictive accuracy, is relatively small for all models except MAIS5+ (0.129).

The model for fatalities has the highest expected out-of-sample accuracy with 0.915.

The question whether MAIS or ISS should be used as the target variable for injury or mortality probability models can be addressed for the GIDAS data set using the expected out-of-sample performance of the models. The ISS-based models have a mean ROC AUC between 0.827 and 0.879, whereas the MAIS-based ones have 0.749 to 0.873. The models are not directly comparable, as they predict probabilities of different injury scales. Nevertheless, the ISS-based models tend to have a higher accuracy, as was suspected from the advantages of ISS over MAIS as stated in the medical literature (see Section 5.1).

The relatively high standard deviation (SD) as well as standard error of the mean (SE) obtained in cross-validation seem to decrease with higher case numbers and become smaller than the optimism.
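The optimism column can be recomputed from the two AUC columns. The sketch below assumes that optimism is simply the in-sample ROC AUC minus the cross-validated mean ROC AUC, as described in the text; the dictionary layout and names are illustrative only, and the numbers are transcribed from Table 5.15. Deviations of 0.001 from the tabulated optimism are expected because the table entries are rounded.

```python
# Values transcribed from Table 5.15 (GIDAS):
# model -> (in-sample ROC AUC, cross-validated mean ROC AUC, tabulated optimism)
TABLE_5_15 = {
    "ISS9+": (0.831, 0.827, 0.004),
    "ISS16+": (0.921, 0.879, 0.041),
    "ISS25+": (0.883, 0.861, 0.022),
    "MAIS2+": (0.764, 0.749, 0.015),
    "MAIS3+": (0.818, 0.811, 0.008),
    "MAIS4+": (0.904, 0.873, 0.031),
    "MAIS5+": (0.888, 0.759, 0.129),
    "Fatalities": (0.925, 0.915, 0.010),
}


def optimism(auc_in_sample, auc_cross_val):
    """Difference between in-sample and cross-validated ROC AUC."""
    return auc_in_sample - auc_cross_val


recomputed = {model: optimism(full, cv)
              for model, (full, cv, _) in TABLE_5_15.items()}

# Largest gap between the recomputed and the tabulated optimism.
max_deviation = max(abs(recomputed[m] - TABLE_5_15[m][2]) for m in TABLE_5_15)

# Model whose in-sample accuracy overstates the out-of-sample accuracy most.
least_stable = max(recomputed, key=recomputed.get)

for model, value in recomputed.items():
    print(f"{model:<10} optimism = {value:.3f}")
print("largest optimism:", least_stable)
```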

Comparison to existing models

There are a few models in the literature which are based on GIDAS and predict the probability for a particular injury level or for fatalities. The coefficients of the models are not directly comparable, as each model uses a different data set and probably a different scaling for the explanatory factors. The first two models, for MAIS2+ and MAIS5+, are based on impact speed. The publication does not state them explicitly but presents them only as a diagram [94].

Another model derived from the GIDAS data for MAIS2+ is given in [160]:

\[
p_{\mathrm{MAIS2+}} = \frac{1}{1 + \exp\left(2.54 - 0.06 \cdot v_{c,\mathrm{GIDAS}} - 0.02 \cdot y_{\mathrm{ped}}\right)} \tag{5.21}
\]

This model includes impact speed and pedestrian age. Geometric quantities, such as those given in Table 5.13, are not included. The ROC AUC of model 5.21 with respect to the data used for this thesis is 0.758. This ROC AUC is comparable to the corresponding value given in Table 5.15.

Regarding fatalities, the following model also includes impact speed and age as independent predictors (like the one presented in Table 5.14). It is again based on GIDAS and valid for pedestrians aged 15 and older [162]:

\[
p_{\mathrm{fatal}} = \frac{1}{1 + \exp\left(9.1 - 0.095 \cdot v_{c,\mathrm{GIDAS}} - 0.040 \cdot y_{\mathrm{ped}}\right)} \tag{5.22}
\]

The ROC AUC of 0.898 for pedestrians aged 15 and older is identical to the ROC AUC of the model presented in Table 5.14 within the same age group.

Fitzharris and Fildes constructed another model for mortality prediction [75], which is unpublished but included in [45]. No further information, e.g., on data used, sampling procedures or boundary conditions, is available. The model uses vehicle impact speed and pedestrian age ([75] following [45]):

\[
p_{\mathrm{fatal}} = \frac{1}{1 + \exp\left(6.302424 - 0.080358 \cdot v_{c} - 0.03166 \cdot y_{\mathrm{ped}}\right)} \tag{5.23}
\]

The ROC AUC is 0.923 with respect to the data used for this thesis, which is comparable to the ROC AUC of 0.925 given in Table 5.15 for the corresponding model.

Figure 5.3 gives the mortality estimates of the three models (Eq. 5.20, 5.22, and 5.23) as a function of vehicle impact speed. The model constructed in this thesis and the model by Rosen give rather similar estimates. As both models are constructed using the GIDAS data set and include the same parameters, this is a plausible and expected finding. The model by Fitzharris tends towards higher probabilities for given values of impact speed and pedestrian age. As discussed above, hardly any information besides the formula itself [45] is published for the Fitzharris model; consequently, no further interpretation of the different model estimates is possible.

Figure 5.3: Mortality estimates for different pedestrian ages depending on vehicle impact speed, as given by Eq. 5.20, 5.22, and 5.23. (Plot of probability, 0.0 to 1.0, against vehicle impact speed, 0 to 100 kph; one curve each for the GIDAS, Fitzharris, and Rosen models at ages 20, 50, and 80.)
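The comparison of the three mortality models can be sketched numerically. The function names below are hypothetical. Eq. 5.22 and 5.23 are applied to raw impact speed in kph and age in years, as the formulas suggest. For Eq. 5.20 the sketch assumes the same standardization (means 29.35 kph and 35.91 years, standard deviations 17.04 kph and 25.83 years) that the worked example for Eq. 5.17 uses; this is an assumption, since the values for the fatality model are not restated in this section.

```python
import math


def p_fatal_thesis(speed_kph, age_years,
                   speed_mean=29.35, speed_sd=17.04,
                   age_mean=35.91, age_sd=25.83):
    """Eq. 5.20; standardization constants are an assumption (see lead-in)."""
    v = (speed_kph - speed_mean) / speed_sd
    y = (age_years - age_mean) / age_sd
    return 1.0 / (1.0 + math.exp(4.391 - 1.373 * v - 1.237 * y))


def p_fatal_rosen(speed_kph, age_years):
    """Eq. 5.22, stated as valid for pedestrians aged 15 and older."""
    return 1.0 / (1.0 + math.exp(9.1 - 0.095 * speed_kph - 0.040 * age_years))


def p_fatal_fitzharris(speed_kph, age_years):
    """Eq. 5.23."""
    return 1.0 / (1.0 + math.exp(6.302424 - 0.080358 * speed_kph
                                 - 0.03166 * age_years))


# Compare the three estimates at 50 kph for the ages shown in Figure 5.3.
for age in (20, 50, 80):
    print(f"age {age}: "
          f"thesis {p_fatal_thesis(50, age):.3f}, "
          f"Rosen {p_fatal_rosen(50, age):.3f}, "
          f"Fitzharris {p_fatal_fitzharris(50, age):.3f}")
```

At 50 kph the Fitzharris formula gives the highest probability for all three ages under these assumptions, in line with the tendency described above.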

The review of models available in the literature shows that ISS as outcome variable has been considered [75], but no model parameters have been published so far [45]. Explicit models for MAIS other than MAIS2+ have not been published. Regarding fatalities, a comparable model was derived in a previous study [162] and is confirmed by the results of this study.

Multivariate models based on PCDS

For the PCDS data set, Table 5.16 gives the numbers of cases available. The corresponding multivariate logistic regression models are given in Table 5.17 for ISS, Table 5.18 for MAIS, and Table 5.19 for fatalities. The resulting models contain between two and four predictors. Impact speed is the most important predictor and is included in every model. Note that impact speed is scaled differently than in the GIDAS data set, as it is scaled by the mean only.

Table 5.16: Frequencies of target variables (PCDS).

Scale        Injury level   Cases
MAIS         0-1            157
             2+             211
             0-2            226
             3+             142
             0-3            288
             4+              80
             0-4            316
             5+              52
ISS          0-8            213
             9+             156
             0-15           248
             16+            121
             0-24           277
             25+             92
Fatalities   not fatal      318
             fatal           51

Pedestrian age is also included in every model except MAIS5+ and is associated with an increased risk (odds ratio 1.669 to 2.163). BMI is included in the model for fatalities and is associated with increased risk (odds ratio 1.742).

Different variables describing the geometry of the vehicle as well as the pedestrian stature are significant in multivariate analysis and included in every model (except MAIS2+).

Increasing hip height ($h_{h,\mathrm{ped,PCDS}}$) is associated with decreased risk (see Table 5.19). Ratios of body height to vehicle geometry are included in all ISS models as well as in the MAIS4+ model and are also associated with decreased risk. The vehicle characteristics significant in the MAIS3+ and MAIS5+ models support these findings: increased height of the vehicle front is associated with higher risk. The well-known risk imposed by light trucks or vans, which typically have such high vehicle fronts, is in line with these findings (see above).

It is important to interpret the results with respect to multicollinearity (correlations between factors within the PCDS data set): In the ISS25+ model, the ratio of hip height to the height of the front-top transition point ($r_{3,\mathrm{PCDS}}$) is associated with a small adjusted odds ratio (0.298 < 1), whereas the ratio of shoulder height to the distance from the ground to the front hood opening ($r_{6,\mathrm{PCDS}}$) is associated with a large adjusted odds ratio (1.639 > 1). These two factors are correlated (Spearman coefficient 0.817, p < 0.001). Considering the log-odds ratios, which are -1.209 and 0.494, respectively, the effects partly cancel each other out, leaving a net odds ratio of about 0.489 (below 1) associated with body height. There are also correlations between hip and shoulder height, between each of the vehicle characteristics and body height, and between the two vehicle profile characteristics. That means that an increase in the person's height relative to the height of the vehicle front is still beneficial.
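The net effect quoted for the ISS25+ model follows from adding the two log-odds ratios. The sketch below is a simplified illustration: it assumes that both scaled ratios increase by one scaled unit at the same time, which is only a rough stand-in for two correlated predictors, and the variable names are hypothetical. Using the rounded odds ratios from Table 5.17 gives about 0.488, matching the 0.489 in the text up to rounding.

```python
import math

# Adjusted odds ratios of the ISS25+ model (Table 5.17, PCDS).
OR_R3 = 0.298  # hip height to transition point height
OR_R6 = 1.639  # shoulder height to front hood height (wrap)

# Log-odds ratios add up when both predictors change together.
log_or_r3 = math.log(OR_R3)
log_or_r6 = math.log(OR_R6)
net_log_or = log_or_r3 + log_or_r6

# Back-transform to an odds ratio; equivalent to OR_R3 * OR_R6.
net_odds_ratio = math.exp(net_log_or)

print(f"log-odds ratios: {log_or_r3:.3f} and {log_or_r6:.3f}")
print(f"net odds ratio:  {net_odds_ratio:.3f}")
```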

Table 5.17: Multivariate logistic regression models for ISS9+, ISS16+, and ISS25+, age group 4+ (PCDS).

Variable                                    Symbol          Scaling factor   p-value   Adjusted odds ratio   95 % CI
ISS9+ (AIC: 318; BIC: 334)
Impact speed                                v_c,PCDS        28.9             <0.001    26.197                12.942-53.028
Age                                         y_ped           22.2             <0.001    1.958                 1.455-2.636
Hip height to forward hood height           r_4,PCDS        0.3              0.016     0.703                 0.529-0.936
Constant                                    exp(β0)                                    0.030
ISS16+ (AIC: 286; BIC: 302)
Impact speed                                v_c,PCDS        28.9             <0.001    21.256                10.975-41.169
Age                                         y_ped           22.2             <0.001    1.866                 1.376-2.530
Hip height to transition point h.           r_3,PCDS        0.2              0.004     0.618                 0.446-0.855
Constant                                    exp(β0)                                    0.017
ISS25+ (AIC: 221; BIC: 241)
Impact speed                                v_c,PCDS        28.9             <0.001    29.250                13.644-62.705
Age                                         y_ped           22.2             <0.001    2.114                 1.490-2.998
Hip height to transition point h.           r_3,PCDS        0.2              <0.001    0.298                 0.161-0.551
Shoulder height to front hood h. (wrap)     r_6,PCDS        0.4              0.040     1.639                 1.023-2.627
Constant                                    exp(β0)                                    0.005


Table 5.18: Multivariate logistic regression models for MAIS2+, MAIS3+, MAIS4+, and MAIS5+, age group 4+ (PCDS).

Variable                                    Symbol          Scaling factor   p-value   Adjusted odds ratio   95 % CI
MAIS2+ (AIC: 377; BIC: 389)
Impact speed                                v_c,PCDS        28.9             <0.001    12.605                6.840-23.229
Age                                         y_ped           22.2             <0.001    1.715                 1.312-2.243
Constant                                    exp(β0)                                    0.149
MAIS3+ (AIC: 327; BIC: 343)
Impact speed                                v_c,PCDS        28.9             <0.001    17.135                9.216-31.858
Forward hood height at centerline           h_3,PCDS        17.0             0.031     1.329                 1.026-1.721
Age                                         y_ped           22.2             <0.001    1.669                 1.256-2.220
Constant                                    exp(β0)                                    0.035
MAIS4+ (AIC: 229; BIC: 245)
Impact speed                                v_c,PCDS        28.9             <0.001    17.180                8.993-32.820
Age                                         y_ped           22.2             <0.001    1.809                 1.297-2.521
Hip height to transition point h.           r_3,PCDS        0.2              0.001     0.483                 0.319-0.729
Constant                                    exp(β0)                                    0.008
MAIS5+ (AIC: 196; BIC: 207)
Impact speed                                v_c,PCDS        28.9             <0.001    10.531                5.809-19.093
Transition point height at contact          h_4,PCDS        15.6             0.001     1.897                 1.301-2.767
Constant                                    exp(β0)                                    0.007

Table 5.19: Multivariate logistic regression models for fatalities, age group 4+ (PCDS).

Variable                                    Symbol          Scaling factor   p-value   Adjusted odds ratio   95 % CI
Fatalities (AIC: 182; BIC: 202)
Body mass index                             BMI             5.7              0.002     1.742                 1.225-2.479
Impact speed                                v_c,PCDS        28.9             <0.001    11.558                6.125-21.810
Age                                         y_ped           22.2             <0.001    2.163                 1.439-3.250
Pedestrian hip height                       h_h,ped,PCDS    11.1             0.002     0.461                 0.281-0.758
Constant                                    exp(β0)                                    0.004

The variance underestimate due to imputation was quantified using BMI, a variable that includes body weight ($m_{\mathrm{ped,PCDS}}$), in the fatality model for the PCDS data set. Ten instances with a total of 20.0 % imputed values each for BMI have been computed, compared to 10.0 % in the original data. The RMS deviation of the regression coefficient was 4.80 %. As for the GIDAS data before, this deviation is much smaller than the confidence intervals resulting from logistic regression.

Using Eq. 5.1 the resulting models can be written as follows:

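\[
p_{\mathrm{ISS9+,PCDS}} = \left(1 + \exp\left(3.505 + 0.352 \cdot r_{4,\mathrm{PCDS}} - 0.672 \cdot y_{\mathrm{ped}} - 3.266 \cdot v_{c,\mathrm{PCDS}}\right)\right)^{-1} \tag{5.24}
\]

\[
p_{\mathrm{ISS16+,PCDS}} = \left(1 + \exp\left(4.048 + 0.482 \cdot r_{3,\mathrm{PCDS}} - 0.624 \cdot y_{\mathrm{ped}} - 3.057 \cdot v_{c,\mathrm{PCDS}}\right)\right)^{-1} \tag{5.25}
\]

\[
\begin{aligned}
p_{\mathrm{ISS25+,PCDS}} = \bigl(1 + \exp\bigl(&5.273 + 1.209 \cdot r_{3,\mathrm{PCDS}} - 0.748 \cdot y_{\mathrm{ped}} \\
&- 3.376 \cdot v_{c,\mathrm{PCDS}} - 0.494 \cdot r_{6,\mathrm{PCDS}}\bigr)\bigr)^{-1}
\end{aligned} \tag{5.26}
\]

\[
\begin{aligned}
p_{\mathrm{Fatalities,PCDS}} = \bigl(1 + \exp\bigl(&5.47 - 0.771 \cdot y_{\mathrm{ped}} + 0.774 \cdot h_{h,\mathrm{ped,PCDS}} \\
&- 2.447 \cdot v_{c,\mathrm{PCDS}} - 0.555 \cdot \mathrm{BMI}\bigr)\bigr)^{-1}
\end{aligned} \tag{5.27}
\]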
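The coefficients in these equations correspond to the negative natural logarithms of the adjusted odds ratios in the tables, because the models are written as 1 / (1 + exp(...)). The following sketch illustrates this for the fatality model, using the odds ratios from Table 5.19; the function and dictionary names are hypothetical. The slopes reproduce Eq. 5.27 to three decimals, while the intercept recovered from the rounded constant 0.004 is about 5.52 rather than 5.47, which is attributable to rounding of the tabulated constant.

```python
import math

# Adjusted odds ratios of the PCDS fatality model (Table 5.19).
ODDS_RATIOS_TABLE_5_19 = {
    "BMI": 1.742,
    "v_c": 11.558,
    "y_ped": 2.163,
    "h_h_ped": 0.461,
}
CONSTANT_ODDS = 0.004  # exp(beta_0), rounded as printed in the table


def exponent_coefficients(odds_ratios, constant_odds):
    """Return (a_0, {a_i}) for the form p = 1 / (1 + exp(a_0 + sum(a_i * x_i))).

    In this form each coefficient is the negative log of the odds ratio.
    """
    intercept = -math.log(constant_odds)
    slopes = {name: -math.log(value) for name, value in odds_ratios.items()}
    return intercept, slopes


intercept, slopes = exponent_coefficients(ODDS_RATIOS_TABLE_5_19, CONSTANT_ODDS)

print(f"intercept: {intercept:.3f}")
for name, slope in slopes.items():
    print(f"{name:<8} {slope:+.3f}")
```

An odds ratio above 1 therefore appears with a negative sign inside the exponential (higher risk), and an odds ratio below 1 with a positive sign (lower risk).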

The findings from the PCDS data are comparable to the GIDAS data set. Impact speed and pedestrian age are very important predictors in the multivariate models. Pedestrian and vehicle characteristics are significant as additional explanatory variables. As discussed in Subsection 5.3.1, an increased ratio of body height to vehicle front is beneficial. This effect is clearly present in the PCDS data and is masked for GIDAS due to a correlation between body height and impact speed.

The predictive performance of the models is quantified using ROC analysis. Table 5.20 summarizes the corresponding results. The predictive quality, in-sample and out-of-sample, is remarkably high (ROC AUC 0.802 to 0.900). The optimism is relatively small for all models.

Using the expected out-of-sample performance as a measure, the ISS-based models tend to have a higher mean ROC AUC (0.871 to 0.900) than the MAIS-based models (0.802 to 0.877). As in GIDAS, this is a clear indication that ISS should be favored when constructing injury probability models based on empirical in-depth accident data.


Table 5.20: Predictive accuracy of the models given by equations 5.24 to 5.27 (PCDS).

             Full-data model                 Cross-validation
Model        ROC AUC   95 % CI       k       ROC AUC   SD      SE      Optimism
ISS9+        0.880     0.846-0.914   3       0.871     0.060   0.020   0.009
ISS16+       0.890     0.858-0.923   3       0.880     0.078   0.026   0.011
ISS25+       0.901     0.868-0.935   4       0.900     0.052   0.017   0.002
MAIS2+       0.817     0.775-0.859   2       0.802     0.063   0.021   0.015
MAIS3+       0.870     0.834-0.905   3       0.850     0.048   0.016   0.020
MAIS4+       0.903     0.868-0.939   2       0.875     0.113   0.038   0.028
MAIS5+       0.898     0.856-0.940   2       0.877     0.069   0.023   0.021
Fatalities   0.913     0.876-0.950   4       0.898     0.058   0.019   0.014