6.2 Single Predictor Results

6.2.2 Skew-Normal and Chi-squared distribution

0.813. In Table 6.5, CCA is even better. The bias is smaller, and the coverage is the best after COM, GAMLSS, and GAMLSS-JSU, reaching validity even if n = 50 or n = 1000 with R2 = 0.75. Both the skew-normal and the chi-squared distributions used here are skewed to the right, and the MDM selectively deletes values precisely on the right side of each distribution. As a consequence, the missing values, at least for these two distributions, do not seem to make CCA as bad as when X is normally distributed.

NORM fails to produce valid results in most cases and is even worse than CCA, especially if R2 ≥ 0.5. For example, if X is skew-normally distributed, R2 = 0.5, and n = 1000, the coverage of NORM is 0.535 while CCA has a coverage of 0.833. Given the values of the standard errors and the ratio, the problem of NORM seems to be caused by the bias of the method when the MDM is more selective. When X is chi-squared distributed, the coverage of NORM can be as low as 0.107. Since AMELIA relies on the same normality requirements as NORM, its simulation results are a close match. This behavior is maintained throughout the remaining simulation experiments.

The hot deck methods PMM, AREG, and MIDAS have negligible biases as the sample size increases, but the results are generally invalid. Only two acceptable estimations are obtained when n ≥ 200. The first is provided by PMM-20 if R2 = 0.75 and n = 200, for both simulation settings. The second is given by MIDAS if R2 = 0.5 and n = 1000.

The coverage rates oscillate between 0.864 and 0.928. Concerning the number of donors, the pattern observed for PMM in the Normal case appears again here. In the horizontal direction of the tables (increasing n), the coverage decreases together with a quick reduction of the ratio of errors. In the vertical direction (increasing number of donors), the bias and the coverage vary in a parabolic fashion: the bias decreases and the coverage increases up to a certain point, and then both move in the opposite direction.
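The role of the donor count can be made concrete with a minimal sketch of predictive mean matching. It is a simplified illustration, not the implementation used in the simulations: it assumes a single fully observed predictor, fits one least-squares line, and omits the random draw of the regression parameters that multiple imputation implementations usually add. The names `fit_line`, `pmm_impute`, `target`, and `predictor` are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def pmm_impute(target, predictor, k, rng):
    """Fill the None entries of `target` by simplified predictive mean matching.

    Assumption: `predictor` is fully observed and the model is a straight line.
    """
    observed = [i for i, t in enumerate(target) if t is not None]
    a, b = fit_line([predictor[i] for i in observed],
                    [target[i] for i in observed])
    predicted = [a + b * p for p in predictor]
    completed = list(target)
    for i, t in enumerate(target):
        if t is None:
            # The k observed cases whose predicted means are closest to case i.
            donors = sorted(observed,
                            key=lambda j: abs(predicted[j] - predicted[i]))[:k]
            # Impute the observed value of one randomly chosen donor.
            completed[i] = target[rng.choice(donors)]
    return completed


# Toy data (made up for illustration): the third value of `target` is missing.
predictor = [1.0, 2.0, 3.4, 4.0, 5.0]
target = [1.1, 2.0, None, 4.2, 4.9]
rng = random.Random(1)
one_donor = pmm_impute(target, predictor, k=1, rng=rng)
three_donors = pmm_impute(target, predictor, k=3, rng=rng)
print(one_donor[2], three_donors[2])
```

With a single donor the imputed value is always that of the nearest neighbour, while a larger donor pool draws from cases that are further away in predicted mean. This is one plausible reading of why the bias and the estimated variability change with the number of donors.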

IRMI again shows the same extreme behavior as in the previous experiment, and this happened in all experimental settings. The method will be excluded from further discussion unless there is a special reason to include it. The “robust” part emphasized in the name of this method seems to be its weakness.

CART and RF are practically unbiased, but in the current scenario the coverage ranges from 0.854 to 0.935, below the nominal interval. RF provides its only valid estimation if R2 = 0.75 and n = 1000 when X is chi-squared distributed, while CART is never valid. They have similar values in all criteria; the only difference is the slightly smaller estimated standard error of CART.

While the true distribution of the data is not Normal, using this assumption for the response model in the GAMLSS-based imputation methods is not an unreasonable choice. The main argument in its favor is the flexible individual modeling of the mean and variance for each data point, which should alleviate the problems caused by the departure from a linear model, such as heteroscedasticity. This expectation was not fulfilled, at least with BAMLSS. Its performance is worse than in the first experiment, with coverage values as low as 0.604 and biased estimation when R2 = 0.25. The most telling indicator of the flaws of this algorithm is the low ratio of the variances: it ranges from 0.426 to 0.877, showing the underestimation of the variance.
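How a low ratio goes together with under-coverage can be illustrated with a short sketch of the four criteria reported in the tables. The definitions are assumptions made for illustration: `bias` is the mean estimate minus the true value, `cov` is the share of normal-theory 95% intervals containing the true value, `sd` is the empirical standard deviation of the estimates, and `ratio` is the mean estimated standard error divided by `sd`. The simulations may define the ratio differently, for example on the variance scale. The function name `summarize` and the toy numbers are hypothetical.

```python
import statistics


def summarize(estimates, std_errors, truth, z=1.96):
    """Summarize simulation replicates (assumed definitions, see the text above)."""
    bias = statistics.fmean(estimates) - truth
    covered = sum(1 for est, se in zip(estimates, std_errors)
                  if abs(est - truth) <= z * se)
    sd = statistics.stdev(estimates)
    # A ratio below 1 means the estimated standard errors understate
    # the actual variability of the estimates across replicates.
    ratio = statistics.fmean(std_errors) / sd
    return {"bias": bias, "cov": covered / len(estimates),
            "sd": sd, "ratio": ratio}


# Toy replicates (made up): same point estimates, two sets of standard errors.
estimates = [0.9, 1.1, 1.0, 1.3]
honest = summarize(estimates, [0.17] * 4, truth=1.0)
too_small = summarize(estimates, [0.08] * 4, truth=1.0)
print(honest)
print(too_small)
```

In this toy case the point estimates are identical in both runs; only the standard errors differ. The run with the smaller standard errors has the lower ratio and loses coverage, which mirrors the pattern described for BAMLSS.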

For GAMLSS and GAMLSS-JSU the results are good. The only cases of under-coverage occur when n = 50, which may be due to the effect of the small sample size on a semi-parametric model. In the case of the chi-squared distributed covariate, the method faced the extra obstacle of a different domain for the imputed values. In both experiments the results are valid if n ≥ 200, although the estimated variances are large. Because Johnson's SU distribution allows skewness and kurtosis to be included in the model, GAMLSS-JSU is expected to be the better method of its class, and indeed it is.
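The extra flexibility of Johnson's SU distribution can be seen in a small sampling sketch. It assumes the classical parameterization, in which X = xi + lam * sinh((Z - gamma) / delta) with Z standard normal; the parameterization used inside the GAMLSS software may differ. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import math
import random
import statistics


def johnson_su_sample(gamma, delta, xi, lam, size, rng):
    """Draw from Johnson's SU via X = xi + lam * sinh((Z - gamma) / delta)."""
    return [xi + lam * math.sinh((rng.gauss(0.0, 1.0) - gamma) / delta)
            for _ in range(size)]


def skewness_and_excess_kurtosis(values):
    """Moment-based sample skewness and excess kurtosis."""
    mean = statistics.fmean(values)
    m2 = statistics.fmean((v - mean) ** 2 for v in values)
    m3 = statistics.fmean((v - mean) ** 3 for v in values)
    m4 = statistics.fmean((v - mean) ** 4 for v in values)
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


rng = random.Random(2024)
# gamma = 0 gives a symmetric but heavier-tailed shape than the Normal.
symmetric = johnson_su_sample(0.0, 2.0, 0.0, 1.0, 50000, rng)
# A negative gamma shifts mass to the right tail (right skew).
right_skewed = johnson_su_sample(-1.0, 2.0, 0.0, 1.0, 50000, rng)

skew_sym, kurt_sym = skewness_and_excess_kurtosis(symmetric)
skew_right, kurt_right = skewness_and_excess_kurtosis(right_skewed)
print(skew_sym, kurt_sym)
print(skew_right, kurt_right)
```

The two shape parameters control skewness and tail weight, in addition to the location and scale that a Normal response model already provides.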

Table 6.4: Skew normal distribution

              n=50                           n=200                          n=1000
method        bias     cov      sd   ratio    bias     cov      sd   ratio    bias     cov      sd   ratio

R2=0.25
COM          0.016   0.956   0.255   1.020  -0.007   0.947   0.124   0.987   0.001   0.953   0.055   1.016
CCA         -0.083   0.933   0.349   0.964  -0.090   0.884   0.165   0.934  -0.073   0.813   0.073   0.922
NORM         0.022   0.946   0.391   1.055   0.040   0.931   0.173   0.973   0.069   0.829   0.074   0.959
AMELIA       0.074   0.932   0.399   1.055   0.052   0.913   0.174   0.951   0.072   0.816   0.075   0.969
PMM-1       -0.032   0.906   0.417   0.921  -0.024   0.894   0.166   0.823  -0.001   0.881   0.066   0.804
PMM-3       -0.071   0.922   0.385   0.940  -0.035   0.912   0.162   0.860  -0.003   0.897   0.066   0.836
PMM-5       -0.113   0.939   0.379   0.993  -0.045   0.907   0.160   0.868  -0.006   0.894   0.066   0.848
PMM-10      -0.198   0.954   0.384   1.100  -0.068   0.894   0.159   0.874  -0.011   0.898   0.066   0.858
PMM-20      -0.318   0.940   0.393   1.323  -0.116   0.871   0.163   0.942  -0.021   0.898   0.067   0.864
PMM-D       -0.151   0.951   0.380   1.040  -0.089   0.889   0.161   0.913  -0.033   0.887   0.067   0.857
AREG        -0.188   0.913   0.424   0.935  -0.070   0.901   0.175   0.894  -0.012   0.909   0.067   0.873
MIDAS       -0.014   0.955   0.397   1.083  -0.039   0.920   0.174   0.932  -0.012   0.928   0.075   0.938
IRMI        -0.375   0.938   0.413   1.562  -0.399   0.438   0.192   1.557  -0.389   0.000   0.084   1.556
CART        -0.090   0.926   0.334   0.902  -0.023   0.884   0.144   0.804  -0.003   0.888   0.062   0.802
RF          -0.033   0.923   0.338   0.883  -0.015   0.878   0.145   0.785   0.011   0.869   0.062   0.791
BAMLSS      -0.193   0.802   0.333   0.698  -0.107   0.867   0.164   0.881  -0.083   0.777   0.072   0.942
GAMLSS       0.007   0.890   0.436   0.952  -0.029   0.946   0.202   1.059  -0.017   0.952   0.086   1.033
GAMLSS-JSU   0.025   0.929   0.455   1.008  -0.028   0.952   0.202   1.033  -0.033   0.936   0.083   1.003

R2=0.50
COM          0.009   0.956   0.147   1.020  -0.004   0.947   0.071   0.987   0.000   0.953   0.032   1.016
CCA         -0.056   0.934   0.213   0.938  -0.053   0.900   0.100   0.936  -0.041   0.833   0.044   0.909
NORM         0.059   0.939   0.220   1.045   0.063   0.876   0.093   0.960   0.076   0.535   0.040   0.942
AMELIA       0.097   0.908   0.221   1.015   0.072   0.864   0.093   0.943   0.078   0.520   0.040   0.944
PMM-1        0.037   0.888   0.218   0.878   0.002   0.865   0.086   0.762   0.004   0.862   0.037   0.754
PMM-3       -0.004   0.911   0.224   0.917  -0.003   0.889   0.087   0.813   0.003   0.889   0.037   0.795
PMM-5       -0.040   0.937   0.232   0.958  -0.007   0.902   0.089   0.834   0.003   0.895   0.037   0.807
PMM-10      -0.130   0.945   0.255   1.112  -0.023   0.903   0.094   0.872   0.003   0.898   0.037   0.827
PMM-20      -0.266   0.903   0.278   1.366  -0.063   0.893   0.103   0.946  -0.000   0.897   0.038   0.831
PMM-D       -0.079   0.937   0.243   1.039  -0.038   0.905   0.097   0.894  -0.005   0.900   0.039   0.843
AREG        -0.126   0.903   0.269   0.909  -0.032   0.892   0.098   0.856  -0.002   0.899   0.038   0.823
MIDAS        0.003   0.944   0.245   1.080  -0.009   0.930   0.101   0.920   0.002   0.918   0.044   0.910
IRMI        -0.355   0.895   0.302   1.780  -0.371   0.143   0.139   1.809  -0.369   0.000   0.062   1.806
CART        -0.085   0.918   0.217   0.980  -0.019   0.885   0.086   0.829  -0.001   0.880   0.036   0.790
RF          -0.016   0.935   0.212   0.948  -0.004   0.885   0.085   0.802   0.010   0.864   0.036   0.778
BAMLSS      -0.147   0.785   0.213   0.538  -0.049   0.879   0.101   0.744  -0.029   0.847   0.043   0.803
GAMLSS       0.028   0.902   0.277   0.934   0.018   0.938   0.122   1.056   0.017   0.940   0.045   1.001
GAMLSS-JSU   0.038   0.937   0.299   1.078   0.019   0.951   0.123   1.040   0.008   0.957   0.049   1.045

R2=0.75
COM          0.005   0.956   0.085   1.020  -0.002   0.947   0.041   0.987   0.000   0.953   0.018   1.016
CCA         -0.033   0.930   0.130   0.946  -0.027   0.909   0.060   0.963  -0.022   0.859   0.026   0.932
NORM         0.060   0.918   0.127   0.970   0.049   0.847   0.056   0.904   0.052   0.457   0.024   0.916
AMELIA       0.084   0.921   0.127   0.944   0.056   0.845   0.058   0.918   0.053   0.452   0.025   0.952
PMM-1        0.057   0.856   0.122   0.776   0.014   0.862   0.052   0.748   0.005   0.868   0.023   0.785
PMM-3        0.044   0.926   0.140   0.914   0.018   0.879   0.054   0.806   0.007   0.885   0.024   0.834
PMM-5        0.018   0.951   0.159   1.018   0.021   0.889   0.057   0.834   0.008   0.889   0.024   0.842
PMM-10      -0.057   0.970   0.192   1.259   0.019   0.915   0.062   0.903   0.012   0.878   0.024   0.845
PMM-20      -0.201   0.933   0.223   1.531  -0.005   0.948   0.071   1.007   0.017   0.864   0.025   0.863
PMM-D       -0.013   0.971   0.173   1.131   0.012   0.932   0.065   0.941   0.020   0.864   0.026   0.886
AREG        -0.098   0.895   0.191   0.892  -0.012   0.907   0.064   0.886   0.002   0.912   0.025   0.892
MIDAS        0.028   0.944   0.166   1.147   0.012   0.910   0.062   0.912   0.006   0.913   0.027   0.918
IRMI        -0.338   0.878   0.251   2.309  -0.355   0.026   0.115   2.322  -0.364   0.000   0.051   2.201
CART        -0.086   0.884   0.162   0.977  -0.023   0.827   0.058   0.726  -0.002   0.837   0.022   0.721
RF          -0.010   0.931   0.148   0.960   0.001   0.874   0.054   0.782   0.007   0.854   0.022   0.761
BAMLSS      -0.078   0.788   0.138   0.426   0.022   0.872   0.059   0.752   0.033   0.718   0.025   0.877
GAMLSS       0.034   0.913   0.183   0.940   0.029   0.938   0.072   1.105   0.015   0.936   0.027   1.010
GAMLSS-JSU   0.060   0.934   0.206   1.238   0.008   0.964   0.084   1.025  -0.001   0.961   0.035   1.142

Table 6.5: Chi-squared distribution

              n=50                           n=200                          n=1000
method        bias     cov      sd   ratio    bias     cov      sd   ratio    bias     cov      sd   ratio

R2=0.25
COM         -0.005   0.952   0.261   1.001  -0.001   0.945   0.124   0.988  -0.000   0.953   0.055   1.012
CCA         -0.082   0.920   0.378   0.909  -0.050   0.912   0.175   0.903  -0.041   0.891   0.076   0.918
NORM         0.016   0.938   0.437   0.997   0.102   0.886   0.186   0.936   0.119   0.644   0.079   0.945
AMELIA       0.053   0.922   0.456   1.000   0.113   0.868   0.188   0.926   0.121   0.657   0.079   0.969
PMM-1       -0.073   0.915   0.461   0.940  -0.016   0.890   0.172   0.832  -0.002   0.878   0.066   0.799
PMM-3       -0.122   0.929   0.420   0.960  -0.029   0.901   0.168   0.827  -0.004   0.881   0.066   0.826
PMM-5       -0.163   0.938   0.413   0.979  -0.044   0.906   0.169   0.848  -0.007   0.902   0.067   0.837
PMM-10      -0.243   0.940   0.418   1.094  -0.071   0.907   0.171   0.881  -0.014   0.895   0.067   0.845
PMM-20      -0.331   0.935   0.423   1.274  -0.122   0.888   0.175   0.946  -0.028   0.897   0.069   0.871
PMM-D       -0.200   0.942   0.412   1.016  -0.093   0.903   0.173   0.909  -0.041   0.877   0.070   0.877
AREG        -0.213   0.913   0.454   0.946  -0.071   0.917   0.184   0.903  -0.019   0.910   0.069   0.887
MIDAS       -0.061   0.956   0.437   1.051  -0.038   0.926   0.185   0.940  -0.018   0.927   0.078   0.945
IRMI        -0.393   0.946   0.447   1.530  -0.378   0.587   0.206   1.533  -0.379   0.000   0.090   1.570
CART        -0.116   0.920   0.370   0.886  -0.019   0.908   0.152   0.831  -0.006   0.867   0.063   0.768
RF          -0.091   0.921   0.373   0.883  -0.012   0.884   0.152   0.781   0.008   0.876   0.064   0.805
BAMLSS      -0.343   0.708   0.372   0.678  -0.245   0.699   0.189   0.543  -0.228   0.478   0.083   0.261
GAMLSS      -0.057   0.927   0.479   0.935  -0.002   0.941   0.200   0.970  -0.011   0.952   0.081   0.993
GAMLSS-JSU  -0.053   0.922   0.498   0.999  -0.046   0.954   0.234   1.077  -0.020   0.961   0.099   1.059

R2=0.50
COM         -0.003   0.952   0.153   1.001  -0.001   0.945   0.073   0.988  -0.000   0.953   0.032   1.012
CCA         -0.045   0.918   0.236   0.909  -0.018   0.920   0.108   0.906  -0.014   0.926   0.047   0.942
NORM         0.086   0.929   0.263   1.046   0.129   0.738   0.102   0.865   0.129   0.178   0.043   0.875
AMELIA       0.116   0.913   0.284   1.083   0.138   0.727   0.105   0.878   0.130   0.176   0.045   0.907
PMM-1        0.020   0.901   0.260   0.887   0.024   0.846   0.093   0.731   0.008   0.866   0.039   0.758
PMM-3       -0.033   0.919   0.260   0.917   0.017   0.864   0.096   0.767   0.009   0.879   0.040   0.808
PMM-5       -0.078   0.932   0.271   0.988   0.011   0.889   0.100   0.800   0.010   0.894   0.040   0.821
PMM-10      -0.158   0.948   0.290   1.167  -0.012   0.909   0.107   0.872   0.011   0.880   0.041   0.837
PMM-20      -0.274   0.927   0.309   1.373  -0.063   0.910   0.118   0.997   0.007   0.899   0.042   0.857
PMM-D       -0.116   0.941   0.280   1.052  -0.031   0.914   0.112   0.915  -0.000   0.915   0.044   0.879
AREG        -0.145   0.914   0.299   0.947  -0.025   0.906   0.110   0.858  -0.003   0.911   0.043   0.881
MIDAS       -0.018   0.945   0.286   1.121   0.003   0.934   0.115   0.931   0.001   0.937   0.048   0.957
IRMI        -0.365   0.913   0.333   1.759  -0.359   0.282   0.154   1.819  -0.364   0.000   0.066   1.876
CART        -0.113   0.885   0.250   0.968  -0.028   0.857   0.097   0.794  -0.006   0.838   0.039   0.721
RF          -0.048   0.925   0.247   0.977   0.004   0.877   0.096   0.784   0.009   0.865   0.039   0.781
BAMLSS      -0.286   0.713   0.259   0.572  -0.164   0.780   0.146   0.665  -0.107   0.604   0.061   0.596
GAMLSS       0.024   0.940   0.321   1.030   0.036   0.947   0.122   1.040   0.011   0.958   0.051   1.101
GAMLSS-JSU   0.037   0.957   0.369   1.140  -0.026   0.963   0.180   1.081  -0.047   0.944   0.076   1.102

R2=0.75
COM         -0.002   0.952   0.086   1.001  -0.000   0.945   0.041   0.988  -0.000   0.953   0.018   1.012
CCA         -0.024   0.938   0.142   0.930  -0.008   0.935   0.063   0.950  -0.005   0.940   0.027   0.980
NORM         0.101   0.911   0.148   0.925   0.094   0.643   0.059   0.746   0.087   0.107   0.025   0.768
AMELIA       0.129   0.877   0.157   0.936   0.103   0.682   0.065   0.825   0.089   0.172   0.029   0.876
PMM-1        0.078   0.831   0.144   0.758   0.033   0.796   0.056   0.669   0.009   0.858   0.025   0.747
PMM-3        0.052   0.921   0.171   0.937   0.041   0.836   0.062   0.754   0.014   0.871   0.026   0.811
PMM-5        0.010   0.948   0.194   1.099   0.045   0.854   0.065   0.802   0.019   0.855   0.026   0.826
PMM-10      -0.082   0.973   0.227   1.388   0.040   0.900   0.073   0.901   0.026   0.825   0.027   0.847
PMM-20      -0.216   0.945   0.254   1.647   0.004   0.954   0.085   1.053   0.035   0.760   0.028   0.858
PMM-D       -0.028   0.970   0.210   1.230   0.029   0.930   0.078   0.973   0.039   0.740   0.029   0.894
AREG        -0.103   0.891   0.210   0.870  -0.012   0.904   0.074   0.886   0.003   0.929   0.028   0.935
MIDAS        0.024   0.955   0.207   1.263   0.026   0.912   0.072   0.918   0.010   0.933   0.029   0.945
IRMI        -0.349   0.872   0.279   2.285  -0.353   0.074   0.127   2.327  -0.359   0.000   0.055   2.360
CART        -0.109   0.856   0.198   0.987  -0.041   0.804   0.073   0.714  -0.011   0.787   0.027   0.643
RF          -0.020   0.936   0.180   1.002   0.005   0.846   0.067   0.748   0.006   0.864   0.025   0.757
BAMLSS      -0.243   0.708   0.194   0.447  -0.067   0.820   0.114   0.549   0.007   0.800   0.046   0.657
GAMLSS       0.044   0.935   0.214   1.074   0.036   0.939   0.078   1.082   0.002   0.960   0.034   1.147
GAMLSS-JSU   0.071   0.943   0.258   1.307   0.009   0.968   0.116   1.124  -0.013   0.960   0.045   0.959