
4. A hidden Markov model for panel data 64

4.4. Selection of covariables

[Plot: shares of income groups (vertical axis, 0.20 to 0.45) against time (horizontal axis, 1970 to 2010).]

Figure 4.9.: Split model: Shares of income groups. Solid line: income group 1, dashed line: income group 2, dotted line: income group 3.

groups. Due to the assumed serial dependence, hidden Markov models yield much more stable estimates and classification results compared with the mixture model from Section 4.2.

The first extension of the model, which allows each country either to run in a hidden Markov model or to stay in a fixed income group, seems unable to capture all the aspects we observed in the mixture model and in the hidden Markov model. Splitting the hidden Markov part of the extended model into an advancing and a declining part yields the desired results. A drawback of this model is that countries can now either ascend or decline; thus multiple switches by a single country, such as those Angola or Iraq perform in the general hidden Markov model, are impossible. In addition, one should be aware that the parameter estimates of the hidden Markov parts of the split model are based on only a few observations, since most of the countries are assigned to the fixed-state part of the model.

[Plot, two panels: p-values against time (1970 to 2010); vertical axis 0.000 to 0.010 in the left panel and 0.0 to 1.0 in the right panel.]

Figure 4.10.: Mean regression GDP: p-values. Left: Intercept (solid line), years of schooling (dashed line), life expectancy (dotted line). Right: Investment share of GDP (solid line), latitude (dashed line), fertility rate (dotted line). Dash-dotted: 5% and 10% level.

availability issues. In Table 4.2 those countries which are not considered in the following models are indicated by ’-’.

To improve comparability of the results we standardize all covariables to mean 0 and standard deviation 1.
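The standardization can be sketched in a few lines. This is a minimal illustration rather than the code used for the analysis: the function name `standardize` and the sample values are hypothetical, and the sample standard deviation (divisor n - 1) is an assumption, since the text does not state which divisor is used.

```python
from statistics import mean, stdev


def standardize(values):
    """Shift values to mean 0 and scale them to standard deviation 1."""
    m = mean(values)
    s = stdev(values)  # assumption: sample standard deviation (divisor n - 1)
    if s == 0:
        raise ValueError("cannot standardize a constant covariable")
    return [(v - m) / s for v in values]


# Hypothetical life-expectancy values for five countries in one year.
life_expectancy = [52.0, 61.5, 70.2, 74.8, 79.1]
z = standardize(life_expectancy)
```

After this step a regression coefficient measures the effect of a change of one standard deviation in the covariable, which is what makes coefficients of differently scaled variables comparable.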

Mean regression in the income groups

The first step when including covariables in the models is the choice of explanatory variables. To get an idea of a reasonable model, we study the influence of the variables investment share of GDP, average years of schooling, life expectancy, latitude and fertility rate on the response variable GDP of all countries. In addition, we add an intercept to our model.
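Written out, such a model might take the following form. This is a sketch under assumptions: a separate ordinary least squares fit for each year t (the figures plot p-values against time) and the usual two-sided t-test. The symbols y, z, β, ε, n and F are introduced here for illustration and do not appear in the text.

```latex
% Assumed linear model for year t, with five standardized covariables
% z_{1,i}, ..., z_{5,i} and response y_i (GDP of country i):
\[
  y_i^{(t)} = \beta_0^{(t)} + \sum_{j=1}^{5} \beta_j^{(t)} z_{j,i}^{(t)}
            + \varepsilon_i^{(t)}, \qquad i = 1, \dots, n .
\]
% Assumed two-sided p-value for H_0: \beta_j^{(t)} = 0, where F_{n-6} is the
% distribution function of the t-distribution with n - 6 degrees of freedom
% (n countries, six estimated coefficients):
\[
  p_j^{(t)} = 2 \Bigl( 1 - F_{n-6}\bigl( \lvert \hat\beta_j^{(t)} \rvert
            \big/ \widehat{\mathrm{se}}(\hat\beta_j^{(t)}) \bigr) \Bigr).
\]
```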

[Plot, two panels: p-values (vertical axis, 0.0 to 1.0) against time (horizontal axis, 1970 to 2010).]

Figure 4.11.: Mean regression component 1: p-values. Left: Intercept (solid line), investment share of GDP (dashed line), years of schooling (dotted line). Right: Life expectancy (solid line), latitude (dashed line), fertility (dotted line). Dashed-dotted lines: 5% and 10% level.

The p-values of the estimated linear model are shown in Figure 4.10. They indicate that the variables years of schooling, life expectancy and latitude might affect the GDP of a country. To gain more insight, we use the classification of the mixture model to divide the countries into three income groups and perform linear regression in each group, using the covariables from above.

Once the countries are divided into income groups, none of the variables seems to be significant for explaining the mean GDP. As an example, Figure 4.11 shows the p-values of the model in income group 1; the other income groups yield similar results.

Thus, it is probably more reasonable to perform regression on the mixing probabilities.

Regression for the mixing probabilities

Now we use the a-posteriori mixing probabilities for each country $i = 1, \dots, I$ from the estimated mixture model in Section 4.2,

$$
\hat\pi_{k,i}^{(t)} = \frac{\hat\pi_k^{(t)}\, g\bigl(x_{t,i}^{(t)}; \hat\vartheta_k^{(t)}\bigr)}{\sum_{l=1}^{K} \hat\pi_l^{(t)}\, g\bigl(x_{t,i}^{(t)}; \hat\vartheta_l^{(t)}\bigr)},
$$

and perform a linear regression for each component $k = 1, \dots, K$. The response variable is the (probit-)transformed a-posteriori probability $\Phi^{-1}(\hat\pi_{k,i}^{(t)})$, where $\Phi$ denotes the distribution function of the standard Gaussian distribution, and the covariables are chosen as in the model above. The corresponding p-values for component 2 are shown in Figure 4.12.
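The construction of the regression response can be illustrated for a single observation. The sketch below assumes Gaussian component densities for g, which this section does not confirm, and all parameter values and function names are hypothetical.

```python
from statistics import NormalDist


def posterior_probabilities(x, weights, means, sds):
    """A-posteriori probabilities of the K components for one observation x."""
    # Numerator of the formula: mixing weight times component density
    # (assumption: each component density g is Gaussian).
    joint = [w * NormalDist(m, s).pdf(x) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)  # denominator: sum over all K components
    return [j / total for j in joint]


def probit(p):
    """Inverse of the standard Gaussian distribution function."""
    return NormalDist().inv_cdf(p)


# Hypothetical estimates for K = 3 income groups in one year.
weights = [0.40, 0.35, 0.25]
means = [6.5, 8.0, 9.8]
sds = [0.6, 0.5, 0.4]

post = posterior_probabilities(7.4, weights, means, sds)
responses = [probit(p) for p in post]  # one regression response per component
```

Note that `inv_cdf` raises an error for probabilities that are exactly 0 or 1, so a-posteriori probabilities that are numerically degenerate would need separate treatment. The text does not say how such cases are handled.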

Income groups 1 and 3 yield similar results. We observe that, next to the intercept, the variables years of schooling, latitude and life expectancy might influence the probability that a country belongs to a certain income group. Thus, we reduce the model and again perform linear regressions of the transformed a-posteriori probabilities on the variables intercept, years of schooling, latitude and life expectancy.

[Plot, two panels: p-values (vertical axis, 0.0 to 1.0) against time (horizontal axis, 1970 to 2010).]

Figure 4.12.: Regression for mixing probabilities: p-values component 2. Left: Intercept (solid line), investment share of GDP (dashed line), years of schooling (dotted line). Right: Life expectancy (solid line), latitude (dashed line), fertility rate (dotted line). Dashed-dotted lines: 5% and 10% level.

[Plot, three panels: p-values against time (1970 to 2010); vertical axis 0.0 to 1.0 (left), 0.0 to 0.8 (mid) and 0.00 to 0.20 (right).]

Figure 4.13.: Reduced model: p-values of regression for mixing probabilities. Left: income group 1, mid: income group 2, right: income group 3. Covariables intercept (solid line), years of schooling (dashed line), life expectancy (dotted line), latitude (dash-dotted line). Longdash: 5% and 10% level.

The p-values of the reduced models are shown in Figure 4.13. We observe that the variable life expectancy is highly significant in all three income groups, with p-values close to zero. In the first income group, the p-value of the variable latitude rises above the 10% significance level from 1990 onwards, while the remaining covariables are significant at the 5% level throughout (except for the intercept in a period between 1990 and 1992). In the second income group, the p-value of the intercept is quite high from 1970 to 1992. The p-value of the variable years of schooling is close to zero at the beginning of the observed time horizon, rises from 1990 onwards and stays above the 10% level after 1999. In the third income group, all variables have p-values close to zero over the 41 years, except for the variable latitude, whose p-value is close to zero from 1970 until 1993 and rises above the 10% significance level between 1997 and 2000 and after 2007.

The estimated coefficients of the reduced models show that the chosen covariables years of schooling, life expectancy and latitude seem to have positive effects on the GDP of a country. We observe that in income group 1 the signs of the estimated coefficients are almost always negative. Thus, increasing years of schooling, life expectancy and latitude lowers the probability that a country is in income group 1. In income group 2 the variables years of schooling and latitude still have negative signs, while the variable life expectancy has a positive influence on the probability that a country is part of income group 2. In income group 3 the effects of all three covariables have positive signs.
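The sign interpretation follows because the response is on the probit scale and the Gaussian distribution function is increasing: a negative coefficient lowers the linear predictor, and hence the implied probability, as the covariable grows. A minimal sketch with a single standardized covariable; the coefficient values and the function name are hypothetical and not estimates from the reduced models.

```python
from statistics import NormalDist


def group_probability(intercept, coefficient, z):
    """Probability implied by a linear predictor on the probit scale."""
    return NormalDist().cdf(intercept + coefficient * z)


# Hypothetical negative coefficient, as observed for income group 1.
low = group_probability(0.2, -0.8, z=-1.0)   # covariable one sd below its mean
high = group_probability(0.2, -0.8, z=1.0)   # covariable one sd above its mean
```

With the negative coefficient, `low` exceeds `high`: a country with a larger value of the covariable is less likely to belong to the group.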

4.5. Switching Regression: Cross sectional analysis with