
4.4 Simulation Results

4.4.1 Effect of the Absence of Random Intercepts on the Model Fitting

First, it is investigated whether the model estimates are affected by the absence of random intercepts in the model specification and, if so, which ones. This section serves as the starting point of the analysis and evaluates the general necessity of random intercepts in the models.

To investigate the effects of omitting random intercepts, the non-linear effects of x on the parameters are compared across the three model types. The estimation results for four data sets are displayed in Figure 4.2. Within each panel, each row shows the effect for one distributional parameter and each column corresponds to one model, with the full model in the left, the partial model in the central and the fixed effects model in the right column. The four simulations shown in Figure 4.2 differ by sample size type (columns of the figure) and by correlation structure (rows of the figure).

Figure 4.2: Posterior means and the 95% pointwise credibility intervals of the non-linear effects of x on the parameters for different simulation set-ups. Panels: (a) sample size type 1 and P1, (b) sample size type 3 and P1, (c) sample size type 1 and P4, (d) sample size type 3 and P4.

Figure 4.2 shows that the effects of x are estimated very well by the full models for all four data sets. This is not unexpected, as the full model includes a random intercept in each parameter, albeit with a misspecified correlation structure in scenarios (c) and (d). When the partial model is used for the analysis, the non-linear effect of x on µ is estimated well, but the effects of x on σ are biased upwards, which is also not unexpected. The estimated effects of x on ν are shifted upwards only negligibly in the partial models. When the fixed effects model is used for the analysis, µ is slightly overestimated. The results for σ and ν in the fixed effects models are similar to the estimates of the partial model. With respect to the different sample sizes and correlation matrices, the results of all three models are similar across the four data sets. The estimated non-linear effects of x for the remaining simulations can be found in Appendices A.6, A.7 and A.8.

The bias of the estimated effects of x on µ and σ under a misspecified model without random intercepts is due to the non-linearity of the log link function. In a setting with a linear link function and normally distributed random intercepts, the response functions in Figure 4.1 would be symmetric around the reference line and no bias would occur in the estimation. Due to the non-linear transformation of the linear predictor (compare Equation (4.2)), a bias is present and grows with the random intercept variance. The bias resulting from omitting the random intercepts is less pronounced for the parameter ν, where a log link function is applied.
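The mechanism behind this bias can be illustrated with a small Monte Carlo sketch. It assumes a single log-linked parameter with a normally distributed random intercept b ~ N(0, tau^2); the values of tau, the number of draws and the function name are illustrative choices and are not taken from the simulation study. Under a linear link the average of eta + b equals eta, whereas under the log link the average of exp(eta + b) exceeds exp(eta) by the factor exp(tau^2 / 2), which grows with the random intercept variance.

```python
import math
import random


def mean_response_ratio(tau, n_draws=200_000, seed=1):
    """Monte Carlo estimate of E[exp(eta + b)] / exp(eta) for b ~ N(0, tau^2).

    The ratio does not depend on eta, so eta is fixed at 0 here.
    """
    rng = random.Random(seed)
    total = sum(math.exp(rng.gauss(0.0, tau)) for _ in range(n_draws))
    return total / n_draws


# Illustrative standard deviations of the random intercept (assumed values).
for tau in (0.1, 0.5, 1.0):
    simulated = mean_response_ratio(tau)
    exact = math.exp(tau ** 2 / 2)  # mean of a log-normal variable
    print(f"tau = {tau}: simulated ratio {simulated:.3f}, "
          f"exp(tau^2 / 2) = {exact:.3f}")
```

A ratio above one means that a model ignoring the random intercept has to absorb the inflated average response elsewhere, which is consistent with the upward shift described above.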

Table 4.1 compares the deviances of the fitted models for the simulations shown in Figure 4.2.

       Full model    Partial model    Fixed effects model
(a)    11478.0       13147.1          14215.2
(b)    11706.0       13434.7          14378.9
(c)    11235.4       13044.2          14265.5
(d)    11394.3       13511.3          14435.2

Table 4.1: Deviance values of the three model types for (a) sample size type 1 and P1, (b) sample size type 3 and P1, (c) sample size type 1 and P4 and (d) sample size type 3 and P4

A smaller value for the deviance indicates a better model fit. As expected, for each data set the smallest value for the deviance (in bold) is obtained by the full model and the highest by the fixed-effects model.
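The ordering can be checked directly from the values reported in Table 4.1. The following sketch only re-uses those numbers; the dictionary keys and the function name are labels chosen for illustration.

```python
# Deviance values copied from Table 4.1 (scenario -> model type -> deviance).
deviances = {
    "a": {"full": 11478.0, "partial": 13147.1, "fixed": 14215.2},
    "b": {"full": 11706.0, "partial": 13434.7, "fixed": 14378.9},
    "c": {"full": 11235.4, "partial": 13044.2, "fixed": 14265.5},
    "d": {"full": 11394.3, "partial": 13511.3, "fixed": 14435.2},
}


def rank_models(row):
    """Return the model names sorted from smallest to largest deviance."""
    return sorted(row, key=row.get)


for scenario, row in deviances.items():
    order = rank_models(row)
    gap = row["partial"] - row["full"]
    print(f"({scenario}) ranking: {' < '.join(order)}; "
          f"partial minus full = {gap:.1f}")
```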

4.4.2 Effects of the Misspecification of the Random Intercepts' Correlation Structure

In this section, the effects of misspecifying the correlation structure of the random intercepts on the estimation of the non-linear fixed effect as well as of the random intercepts are investigated.

4.4.2.1 Effects on the Estimation of the Covariate in the Different Scenarios

In this section, the focus lies on the estimated non-linear effects of x on the parameters for the full models. For each data set type, the posterior mean estimates grouped by the underlying covariance matrix are plotted in Figure 4.3.

In each scenario, the estimates deviate only slightly from the true effect. Hence, at least in these scenarios, the correlation structure of the random intercepts has only a minor impact on the estimation of the fixed effects.

Figure 4.3: Comparison of the true (black) and the estimated non-linear effects of x on the distributional parameters for the underlying correlation matrix P1 (blue), P2 (green), P3 (red) and P4 (purple) for the three sample size types. Panels: (a) sample size type 1, (b) sample size type 2, (c) sample size type 3.

4.4.2.2 Effects on the Estimation of the Random Intercepts in the Different Scenarios

In this section, the impact of the violated independence assumption on the estimation of the random intercepts is investigated. Firstly, the mean-squared error (MSE) of the estimates

\[
\mathrm{MSE}_k = \frac{1}{n} \sum_{i=1}^{n} \left( \gamma_{k,i} - \hat{\gamma}_{k,i} \right)^2 \tag{4.5}
\]

is computed.
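Equation (4.5) averages the squared differences between the true and the estimated random intercepts over the n subjects for one parameter. A minimal sketch of this computation is given below; the function name and the example values are hypothetical and serve only to show the arithmetic.

```python
def mse(true_values, estimates):
    """Mean-squared error between true and estimated random intercepts."""
    if len(true_values) != len(estimates) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = (
        (gamma - gamma_hat) ** 2
        for gamma, gamma_hat in zip(true_values, estimates)
    )
    return sum(squared_errors) / len(true_values)


# Made-up random intercepts of four subjects for one parameter.
gamma_true = [0.20, -0.10, 0.05, -0.30]
gamma_hat = [0.15, -0.05, 0.10, -0.20]
print(mse(gamma_true, gamma_hat))
```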

The MSEs for the three parameters in all models can be found in Tables 4.2, 4.3 and 4.4. Each table shows the estimated MSEs of one parameter for all twelve models. Each cell represents one specific model, with sample size types in rows and correlation matrices in columns.

As expected, the MSEs decrease with the number of observations per subject for each parameter. No clear trend of the MSE in the different correlation scenarios can be identified for the model of µ in Table 4.2. The same is true for the models of σ and ν in Tables 4.3 and 4.4.

                      P1        P2        P3        P4
n = 50,  ni = 40      0.0040    0.0037    0.0024    0.0047
n = 100, ni = 20      0.0081    0.0094    0.0056    0.0080
n = 250, ni = 8       0.0121    0.0114    0.0153    0.0234

Table 4.2: MSEs of the random intercepts estimation in the model for µ

                      P1        P2        P3        P4
n = 50,  ni = 40      0.0122    0.0177    0.0132    0.0113
n = 100, ni = 20      0.0300    0.0390    0.0328    0.0361
n = 250, ni = 8       0.0962    0.0844    0.0839    0.1060

Table 4.3: MSEs of the random intercepts estimation in the model for σ

                      P1        P2        P3        P4
n = 50,  ni = 40      0.0814    0.1083    0.1115    0.1019
n = 100, ni = 20      0.1273    0.1274    0.1438    0.1401
n = 250, ni = 8       0.2187    0.2227    0.2115    0.1799

Table 4.4: MSEs of the random intercepts estimation in the model for ν

Comparison of the random intercept estimation across parameters is not useful due to the different variances of the random intercept distributions. Nevertheless, the MSEs are much higher for the random intercepts in the model for ν than for µ and σ. This is intuitive, as the estimation of the model for ν relies on the information whether a value of zero is observed or not. In contrast, the estimation in the models for µ and σ is based on the value of the continuous component, which provides substantially more information and hence results in more accurate estimates. To illustrate this issue, the estimated random intercepts are plotted against the true values for two models in Figure 4.4. The closer a point is to the diagonal line, the higher the estimation accuracy. In the plots of the random intercepts in the model for ν, the points are color-coded by the number of times a “zero” is observed per individual.

As previously described, Figure 4.4 illustrates that the random intercepts are estimated more accurately in the models for µ and σ than in the model for ν in both scenarios. Moreover, a comparison of the two models shows again that the accuracy of the random intercept estimation for all three parameters is higher for the model that relies on more observations per individual.

Figure 4.4: Estimated against true random intercepts for all distributional parameters for the full models of sample size type 1 with correlation matrix P4 (upper row) and of sample size type 3 with correlation matrix P4 (bottom row)

After evaluating the effect of the violation on the point estimation of the random intercepts, it is examined how the correlation in the data affects the estimated random intercepts with regard to the independence assumption. To empirically evaluate whether the estimated random intercepts are uncorrelated, as assumed by the model, or correlated, as they are in the data, the correlation coefficients between the estimated random intercepts are calculated for all full models.
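The pairwise correlation coefficients can be computed as sketched below. The sketch assumes that the Pearson correlation coefficient is meant and that the estimated random intercepts are available as one list per distributional parameter; the function names and the numbers are made up for illustration and are not results from the study.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(intercepts):
    """Correlation for every pair of parameters, keyed by the pair of names."""
    return {
        (first, second): pearson(intercepts[first], intercepts[second])
        for first, second in combinations(intercepts, 2)
    }


# Hypothetical estimated random intercepts for five subjects.
estimated = {
    "mu": [0.12, -0.30, 0.05, 0.41, -0.22],
    "sigma": [0.10, -0.18, 0.02, 0.25, -0.15],
    "nu": [-0.20, 0.35, 0.01, -0.40, 0.18],
}

for pair, rho in pairwise_correlations(estimated).items():
    print(pair, round(rho, 3))
```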

Figure 4.5 displays one plot matrix consisting of nine panels for each of the twelve full models. In the lower off-diagonal panels of each plot matrix, scatterplots of the estimated random intercepts in the models for the different parameters are shown, with a fitted linear regression line added in red. The upper off-diagonal panels display the corresponding correlation coefficients between the random intercepts. On the diagonal, the kernel density estimates of the estimated random intercepts and their true marginal densities are plotted. The twelve plot matrices are arranged with sample size types in the columns and correlation matrices in the rows.

The first row of Figure 4.5 shows the plot matrices for the models of the data generated with correlation matrix P1. For these data, the independence assumption is not violated and the correlations of the estimated random intercepts are close to zero. The models based on P2 are shown in the second row. The correlations between the random intercepts in those models are low and hence the independence assumption is violated only marginally. The correlation coefficients show only small deviations from the true ones. In the third and fourth rows, the models for the data with medium to high correlations are displayed. In these cases, the estimated correlation coefficients are close to the true values but shrunk towards zero due to the independence assumption; e.g., for model (i) the estimates are 0.427, 0.336 and 0.314, whereas the true correlation between any pair of random intercepts is equal to 0.5. Additionally, for the models based on P3 and P4, the shrinkage towards zero is less pronounced for a higher number of observations per subject. For example, for correlation structure P4, where the random intercepts have correlations of 0.9, -0.9 and -0.9, the estimates for sample size type 1 are 0.882, -0.709 and -0.625 and hence closer to the true values than those for sample size type 3 with values of 0.655, -0.468 and -0.415.
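The shrinkage for correlation structure P4 can be quantified from the values quoted above. The sketch assumes that the estimates are listed in the same pair order as the true correlations; the two summary measures (mean absolute deviation and mean ratio of estimate to truth) are illustrative choices and are not measures used in the study.

```python
# True pairwise correlations under P4 and the estimates quoted in the text.
# Assumption: estimates and true values are listed in the same pair order.
TRUE_P4 = (0.9, -0.9, -0.9)
ESTIMATES = {
    "sample size type 1": (0.882, -0.709, -0.625),
    "sample size type 3": (0.655, -0.468, -0.415),
}


def mean_abs_deviation(estimates, truth):
    """Average absolute distance between estimated and true correlations."""
    return sum(abs(e - t) for e, t in zip(estimates, truth)) / len(truth)


def mean_shrinkage_ratio(estimates, truth):
    """Average of estimate / truth; values below 1 indicate shrinkage towards zero."""
    return sum(e / t for e, t in zip(estimates, truth)) / len(truth)


for label, est in ESTIMATES.items():
    print(f"{label}: mean absolute deviation "
          f"{mean_abs_deviation(est, TRUE_P4):.3f}, "
          f"mean ratio {mean_shrinkage_ratio(est, TRUE_P4):.3f}")
```

Both measures point in the same direction as the text: the estimates for sample size type 1, which has more observations per subject, lie closer to the true correlations.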

Finally, the kernel density estimates for the random intercepts (diagonals of Figure 4.5) agree well with the marginal normal distribution of the data generating process for µ and σ. For ν, this only applies to the models with a high number of observations per individual.

Figure 4.5: Plot matrices for the 12 full models with scatterplots between the estimated random intercepts in the models for the three different parameters including a line for the fit of a linear regression (lower off-diagonal), the kernel densities (diagonal) and the pairwise correlation coefficients (upper off-diagonal). Panels (a) to (c): sample size types 1 to 3 with P1; (d) to (f): sample size types 1 to 3 with P2; (g) to (i): sample size types 1 to 3 with P3; (j) to (l): sample size types 1 to 3 with P4.