8. FACTORS AFFECTING OCCUPATIONAL CHOICES

8.3 Diagnostics Tests for Multinomial Logit Model

Before discussing the results of the multinomial logit model, it is appropriate to mention some of the diagnostic measures required for this model. The following sections begin with the Independence of Irrelevant Alternatives (IIA), followed by the WALD and LR tests for combining dependent categories, and finally the collinearity diagnostics.

8.3.1 Independence of Irrelevant Alternatives

The multinomial logit model makes the assumption known as the independence of irrelevant alternatives (IIA). According to IIA, in a multinomial logit model, the (log-) odds of one level of response versus another do not depend on any of the other levels; that is, other possible outcomes are not relevant.

Following LONG & FREESE, 2001, IIA in the multinomial logit model is,
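Written out in a common notation, with the symbols $x$, $\beta_m$ and $\beta_n$ assumed here for illustration, the odds of outcome $m$ versus outcome $n$ in the multinomial logit model take the form:

```latex
\frac{\Pr(y = m \mid x)}{\Pr(y = n \mid x)} = \exp\{ x(\beta_m - \beta_n) \}
```

The right-hand side involves only the coefficients of the two outcomes being compared, which is why the remaining alternatives are described as irrelevant.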

IIA, that is, the property that the odds do not depend on the other available alternatives, is commonly investigated in the multinomial logit model using two tests [105]: the HAUSMAN & MCFADDEN test [106] and the SMALL & HSIAO test (LONG & FREESE, 2001). However, as a rule of thumb, multinomial logit should only be used in situations where IIA is reasonable, such as when the different response categories are distinct and dissimilar.
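As a minimal numerical sketch of what IIA implies, multinomial logit probabilities can be computed with and without a third alternative; the odds between the first two outcomes stay the same. The linear index values and outcome labels below are hypothetical and are not estimates from this study.

```python
import math


def mnl_probabilities(utilities):
    """Return multinomial logit probabilities for a dict of linear indices."""
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(utilities.values())
    weights = {k: math.exp(v - top) for k, v in utilities.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Hypothetical linear indices x*beta for three outcomes (illustrative only).
full = {"formal": 0.8, "business": 0.3, "farmer": -0.5}
# Same model with the third alternative removed from the choice set.
restricted = {k: v for k, v in full.items() if k != "farmer"}

p_full = mnl_probabilities(full)
p_restricted = mnl_probabilities(restricted)

odds_full = p_full["formal"] / p_full["business"]
odds_restricted = p_restricted["formal"] / p_restricted["business"]

# Under the model both odds equal exp(0.8 - 0.3), whatever the choice set.
print(round(odds_full, 6), round(odds_restricted, 6))
```

The individual probabilities change when an alternative is dropped, but their ratio does not; the tests for IIA examine whether the data are compatible with this property.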

The study carried out the HAUSMAN & MCFADDEN test as well as the SMALL & HSIAO test to investigate violations of IIA. Both tests, reported in Table 34 and Table 35, indicate that the IIA assumption has not been violated. Furthermore, four of the test statistics in the HAUSMAN test are negative, which is again evidence that IIA has not been violated (SCHMIDT & STRAUSS, 1975; AMELIE & ZIMMERMANN, 2004). Hence, we are confident in stating that the occupational categories are distinct and cannot be substituted for one another. For a detailed description of tests concerning IIA in the multinomial logit model, see CHENG & LONG, 2007 and HAUSMAN & MCFADDEN, 1984.

105 As a word of caution, according to LONG & FREESE, 2001 these tests often give inconsistent results and provide little guidance on violations of the IIA assumption. Hence MCFADDEN, 1973, as cited in LONG & FREESE, 2001, suggested that IIA implies that the multinomial logit model should only be used in cases where the outcome categories “can plausibly be assumed to be distinct and weighed independently in the eyes of each decision maker.” Similarly, AMEMIYA, 1981, cited by LONG & FREESE, 2001, suggests that the MNLM works well when the alternatives are dissimilar.

106 The Hausman test of IIA was calculated following LONG & FREESE, 2001 in the following three steps:

a) Estimation of the full model with all J outcomes included, with estimates in $\hat{\beta}_F$.

b) Estimation of a restricted model by eliminating one or more outcome categories, with estimates in $\hat{\beta}_R$.

c) Taking $\hat{\beta}^*_F$ as the subset of $\hat{\beta}_F$ that remains after eliminating the coefficients not estimated in the restricted model.

The test statistic becomes,

$$H = (\hat{\beta}_R - \hat{\beta}^*_F)' \left[ \widehat{Var}(\hat{\beta}_R) - \widehat{Var}(\hat{\beta}^*_F) \right]^{-1} (\hat{\beta}_R - \hat{\beta}^*_F)$$

where H is asymptotically distributed as chi-squared with degrees of freedom equal to the number of rows in $\hat{\beta}_R$ if IIA is true.

Significant values of H indicate that the IIA assumption has been violated.
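The three steps of the Hausman test can be sketched numerically. The function below computes the usual Hausman-McFadden quadratic form: the difference between the restricted and full-model estimates, weighted by the inverse of the difference of their covariance matrices. All coefficient and variance values are hypothetical, not estimates from this study, and the small linear solver is included only to keep the sketch free of external dependencies.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("difference of covariance matrices is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def hausman_statistic(beta_r, beta_f_star, var_r, var_f_star):
    """Quadratic form d' [Var_R - Var_F*]^(-1) d with d = beta_R - beta_F*."""
    diff = [r - f for r, f in zip(beta_r, beta_f_star)]
    n = len(diff)
    var_diff = [[var_r[i][j] - var_f_star[i][j] for j in range(n)]
                for i in range(n)]
    weighted = solve(var_diff, diff)
    return sum(d * w for d, w in zip(diff, weighted))


# Hypothetical estimates for two coefficients (illustrative only).
beta_r = [0.50, -0.20]
beta_f_star = [0.45, -0.25]

# Case 1: restricted-model variances exceed the full-model variances.
h_positive = hausman_statistic(
    beta_r, beta_f_star,
    [[0.04, 0.0], [0.0, 0.05]], [[0.03, 0.0], [0.0, 0.03]])

# Case 2: the variance difference is not positive definite.
h_negative = hausman_statistic(
    beta_r, beta_f_star,
    [[0.03, 0.0], [0.0, 0.03]], [[0.04, 0.0], [0.0, 0.05]])

print(round(h_positive, 3), round(h_negative, 3))
```

The second call yields a negative value because the difference of the covariance matrices is not positive definite there, which is how negative entries such as those in Table 34 can arise in finite samples.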

Table 34. Hausman Test for Assumption of Independence of Irrelevant Alternatives

Category       chi2       df    P>chi2    Evidence
Formal         -114.66    60    1.0000    for Ho
Businessmen      -4.02    60    1.0000    for Ho
Purefarmer       17.29    60    1.0000    for Ho
Mixedfarmer      49.93    60    0.8198    for Ho
Puretenant       -6.01    60    1.0000    for Ho
Mixedtenant     -28.42    60    1.0000    for Ho

Note: Ho: Odds (Outcome-J vs. Outcome-K) are independent of other alternatives

Insignificant values of Hausman test indicate that the IIA assumption has not been violated.

Source: Basic survey carried out in six villages of Northwest Pakistan, 2004-05.

Table 35. Small-Hsiao Tests of IIA Assumption for Multinomial Logit Model

Category       lnL(full)    lnL(omit)    chi2      df    P>chi2    Evidence
Formal         -1313.563    -1288.456    50.214    65    0.912     for Ho
Businessmen    -1270.279    -1239.639    61.280    65    0.608     for Ho
Purefarmer     -1742.317    -1712.164    60.306    65    0.642     for Ho
Mixedfarmer    -1363.124    -1325.315    75.618    65    0.173     for Ho
Puretenant     -1742.346    -1708.834    67.025    65    0.407     for Ho
Mixedtenant    -1580.125    -1549.245    61.761    65    0.591     for Ho

Note: Ho: Odds (Outcome-J vs. Outcome-K) are independent of other alternatives.

The Small-Hsiao test provides evidence that IIA has not been violated. For a detailed explanation of these tests see LONG & FREESE, 2001.

Source: Basic survey carried out in six villages of Northwest Pakistan, 2004-05.
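The chi2 column of Table 35 can be reproduced from the two log-likelihood columns, since the Small-Hsiao statistic is -2[lnL(full) - lnL(omit)]. The following short check uses the values printed in the table; the variable and function names are illustrative.

```python
# Values copied from Table 35: (lnL(full), lnL(omit), reported chi2).
table_35 = {
    "Formal": (-1313.563, -1288.456, 50.214),
    "Businessmen": (-1270.279, -1239.639, 61.280),
    "Purefarmer": (-1742.317, -1712.164, 60.306),
    "Mixedfarmer": (-1363.124, -1325.315, 75.618),
    "Puretenant": (-1742.346, -1708.834, 67.025),
    "Mixedtenant": (-1580.125, -1549.245, 61.761),
}


def small_hsiao_statistic(lnl_full, lnl_omit):
    """Likelihood-ratio type statistic: -2 * (lnL(full) - lnL(omit))."""
    return -2.0 * (lnl_full - lnl_omit)


computed = {
    name: small_hsiao_statistic(lnl_full, lnl_omit)
    for name, (lnl_full, lnl_omit, _) in table_35.items()
}

for name, (_, _, reported) in table_35.items():
    print(f"{name:12s} computed={computed[name]:.3f} reported={reported:.3f}")
```

The computed values agree with the reported column up to a difference of 0.001 for Puretenant and Mixedtenant, which reflects rounding of the printed log-likelihoods.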

8.3.2 Tests for Combining Dependent Categories

In order to investigate whether the outcomes are distinguishable, we carried out WALD and LR tests for combining the dependent occupational categories. As can be seen in Table 36, these two tests provide very similar results. We can reject the hypothesis that our variables do not differentiate between categories. Therefore, we cannot combine any of the occupational categories. If two outcomes were indistinguishable with respect to the variables in the model, one could obtain more efficient estimates by combining them (LONG & FREESE, 2001).

Table 36. Wald and LR Tests for Combining Alternatives

Ho: All coefficients except intercepts associated with a given pair of alternatives are 0 (i.e., alternatives can be combined).

Source: Basic survey carried out in six villages of Northwest Pakistan, 2004-05.

8.3.3 Collinearity Diagnostics

Multicollinearity occurs when variables are so highly correlated with each other that they effectively convey the same information, making it difficult to obtain reliable estimates of their individual regression coefficients (for details see GREENE, 2003; GUJARATI, 2003; WOOLDRIDGE, 2003; HAMILTON, 2006). The post-estimation Stata command ‘collin’ was used to compute the collinearity diagnostics [107] given in Table 37. According to HAMILTON, 2006, multicollinearity becomes an issue of concern if the largest Variance Inflation Factor (VIF) is greater than 10 and the mean VIF [108] is substantially larger than 1. The current study encounters no collinearity problem: as Table 37 shows, the largest VIF is 2.88 and the mean VIF is 1.86.

107 One can also use the ‘VIF’ command in Stata to perform collinearity diagnostics (LONG & FREESE, 2001).

108 The VIF reflects the degree to which the variances of other coefficients are increased due to the inclusion of that predictor (HAMILTON, 2006).

Table 37. Collinearity Diagnostics for Variables used in Multinomial Logit Model

Variable       VIF     SQRT-VIF    Tolerance 1)    R-Squared
AGE            1.34    1.16        0.75            0.25
EDU            1.11    1.06        0.90            0.10
HHS            1.75    1.32        0.57            0.43
HHW            1.81    1.35        0.55            0.45
logowlndTot    1.19    1.09        0.84            0.16
logLvStkUnt    1.06    1.03        0.94            0.06
Dalazak        2.14    1.46        0.47            0.53
Gulbela        2.26    1.50        0.44            0.56
Kochian        2.32    1.52        0.43            0.57
Kukar          2.88    1.70        0.35            0.65
Mushtarzai     2.61    1.62        0.38            0.62

Mean VIF = 1.86

Condition Number = 14.24

Note: 1) Tolerance, or (1 - R-Squared), tells what proportion of an x variable’s variance is independent of all other x variables (HAMILTON, 2006).

Source: Basic survey carried out in six villages of Northwest Pakistan, 2004-05.
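The summary figures of Table 37 and the rule of thumb attributed to HAMILTON, 2006 can be checked with a few lines of code. The sketch below uses the VIF values printed in the table and relies on the standard relations tolerance = 1/VIF and R-squared = 1 - 1/VIF; the helper names are illustrative.

```python
# VIF values copied from Table 37.
vif = {
    "AGE": 1.34, "EDU": 1.11, "HHS": 1.75, "HHW": 1.81,
    "logowlndTot": 1.19, "logLvStkUnt": 1.06,
    "Dalazak": 2.14, "Gulbela": 2.26, "Kochian": 2.32,
    "Kukar": 2.88, "Mushtarzai": 2.61,
}


def tolerance_from_vif(value):
    """Tolerance is the reciprocal of the VIF, i.e. 1 - R-squared."""
    return 1.0 / value


def r_squared_from_vif(value):
    """R-squared of a predictor regressed on all other predictors."""
    return 1.0 - 1.0 / value


mean_vif = sum(vif.values()) / len(vif)
largest = max(vif, key=vif.get)

print(f"mean VIF    = {mean_vif:.2f}")
print(f"largest VIF = {vif[largest]:.2f} ({largest})")
# Rule of thumb: concern if the largest VIF exceeds 10.
print("largest VIF above 10:", vif[largest] > 10)
print(f"tolerance of {largest} = {tolerance_from_vif(vif[largest]):.2f}")
```

Because the printed VIF values are rounded to two decimals, quantities derived from them (such as the square root of the VIF) may differ from the table in the last digit.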

8.4 Factors Affecting Labor Participation: Results and Discussion of Multinomial Logit