6 Data analysis methods

E_i := Y_i - P̂(Y_i = 1) .

This defined error term serves to compute the goodness-of-fit statistic 'G'. If the variance of the estimated probabilities becomes smaller, 'G' becomes larger. The underlying assumption is that a small variance facilitates the estimation of the dependent variable. Therefore, if an estimation error nevertheless occurs despite a small variance, this indicates that the observed distribution of the dependent variable is not congruent with the estimated distribution of the dependent variable (Urban 1993).

6.2 Latent class analysis

With the above-described analysis methods, we obtain one result for all the data. In some cases, however, the data suggest that there are different classes of respondents or parameters, since '[…] The standard aggregate model fails to take into account the fact that preferences (utilities) differ from one respondent to another (or at least from one segment to another)' (Magidson et al. 2003 p. 1). One appropriate analysis method for determining whether respondents can be grouped is latent class analysis (LCA) (Goodman 1974; McDonald 1962), which seeks to model latent classes or categories underlying observed relationships (Loehlin 1998). Latent class analysis is closely related to two other methods that investigate latent elements in a model. The first is factor analysis, a latent variable method in which the factors are unobserved hypothetical variables that underlie and explain the observed correlations. The second is item response theory, or latent trait theory, in which a latent variable (the underlying trait being measured) is fitted to responses in a series of test items (Loehlin 1998). All three methods are primarily used in psychology and the social sciences.

LCA has its origins in the latent structure analysis of Lazarsfeld (1968), which is concerned with the probability relation between the set of observed indicators and the inferred position of the units involved in an empirical study. The principal goal of this method is the division of heterogeneous groups into homogeneous and statistically unrelated subgroups (Reunanen & Suikkanen 1999 p. 6). Central to this goal is the principle of local independence. Lazarsfeld & Henry (1968) indicate that the relation between the latent classes and the observable items is defined by the axiom of local independence, which states that, within a class, the items are all independent of one another. In other words, this definition states mathematically that the latent variable explains why the observed items are related to one another: the association of two items is expressed by a third, latent variable (Lazarsfeld & Henry 1968).

If we transfer the principle of local independence to individual people, it signifies that they are similar with regard to a certain latent property or latent continuum if they produce a statistically unrelated distribution in tests measuring this continuum. The latent continuum (the third variable) expresses a general attitude of coding units (persons) towards several questions on a particular subject, for example, renewable energies. The general attitude towards renewable energies is the continuum along which a respondent is positioned at a certain point (Reunanen & Suikkanen 1999).
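The axiom of local independence can be illustrated with a short numeric sketch (the probabilities below are hypothetical, not taken from the study):

```python
# Axiom of local independence: within a latent class, the joint
# probability of two items factorises into the product of their
# single-item probabilities.
p_item1_yes = 0.8  # hypothetical P(item 1 = 'yes' | class)
p_item2_yes = 0.6  # hypothetical P(item 2 = 'yes' | class)

# Under local independence, P(both items 'yes' | class) is simply:
p_both_yes = p_item1_yes * p_item2_yes
print(round(p_both_yes, 2))  # 0.48
```

Any deviation of the observed joint frequency from this product would indicate that the items are still related within the class.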

A statistically unrelated distribution implies that the ratio of 'yes' and 'no' answers stays the same across different questions. For example, the first group consists of 108 respondents, of whom 90 answer 'yes' to the first question; of these 90 'yes'-respondents, 75 answer 'yes' and 15 answer 'no' to a second question, a ratio of 75:15. The second group comprises the 18 respondents who answer 'no' to the first question; of these, 15 answer 'yes' and 3 answer 'no' to the second question. The ratio here is 15:3, which is equal to 75:15 (Reunanen & Suikkanen 1999 p. 4).
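The two ratios in this example can be checked with a few lines of code (a sketch using only the counts quoted above):

```python
# Answer counts to the second question, split by the answer given to
# the first question (counts from the worked example above).
first_yes_group = (75, 15)  # the 90 'yes'-respondents: 75 'yes', 15 'no'
first_no_group = (15, 3)    # the 18 'no'-respondents: 15 vs. 3

ratio_1 = first_yes_group[0] / first_yes_group[1]  # 75 : 15
ratio_2 = first_no_group[0] / first_no_group[1]    # 15 : 3

# Both ratios equal 5.0, so the answer to the first question carries
# no information about the answer to the second question.
print(ratio_1, ratio_2)  # 5.0 5.0
```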

Statistically unrelated means that if we add more questions, the ratio of 'yes' and 'no' answers will stay the same: the answer to the first question does not affect the answer to the second question. This kind of unrelated distribution is homogeneous with regard to the latent property measured by the variables (the questions), in this example the 'attitude towards renewable energies'. Additionally, a statistically unrelated distribution means that the mathematical probability of the joint occurrence of certain answers is the same as their real percentage in the observed data (Reunanen & Suikkanen 1999 p. 4).

The first step in LCA is to compute a one-class solution for the data. This means that the total ratio (probability) of answers to, say, three different questions (variables) is calculated. In a hypothetical example, the first variable has a ratio of 0.861 for the statement 'I agree' (0), 0.056 for 'I cannot say' (1), and 0.083 for 'I disagree' (2) (Reunanen & Suikkanen 1999 p. 7). For each coding unit, there is a coding pattern that displays the structure of answers one respondent gives to the different questions. For example, 000 ('I agree with all three statements') is a coding pattern. A second respondent may have 020 as coding pattern ('I agree with the first and third statement, but I disagree with the second one'). From the coding patterns, the log-likelihood index is computed, describing the probability of the whole data set under the one-class solution.

To do this, the logarithms of each coding unit's probabilities are added up. The coding pattern of coding unit 1 is 000:

ln(0.861) + ln(0.750) + ln(0.556) .

The figures in brackets are the probabilities of 'I agree' answers (= 0) for all respondents with regard to the three questions. The greater the probabilities of the coding units, the better the log-likelihood index and the more homogeneous the group. If the sum of the coding pattern probabilities (p) is smaller than one, the variables are statistically related, and the log-likelihood index expresses the degree of the variables' relatedness (Reunanen & Suikkanen 1999 p. 7).
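The index computation for coding pattern 000 can be reproduced directly (probabilities taken from the hypothetical example above; this is the contribution of one coding unit, not the full index over all units):

```python
import math

# Probabilities of an 'I agree' (= 0) answer to the three questions
# under the one-class solution (values from the example above).
p_agree = [0.861, 0.750, 0.556]

# Contribution of coding pattern 000: the logarithms of the pattern's
# answer probabilities are added up.
ll_000 = sum(math.log(p) for p in p_agree)
print(round(ll_000, 3))  # -1.024
```

The closer the probabilities are to one, the closer this sum is to zero, i.e. the better the log-likelihood index.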

In the second step, LCA calculates a solution for several classes from the data. As mentioned above, the aim of LCA is to divide the data into subgroups in such a way that the variables in each group are as unrelated as possible. To reach this goal, the data are first divided into two randomly formed groups. Next, an iterative regrouping follows until the log-likelihood index for the two-class solution is as good as possible. Then, more and more classes are computed, which improves the log-likelihood index.

The greater the number of classes, the more homogeneous they are. It is important to note that LCA seeks to determine the structure of the data, not which coding unit belongs to which class, because one coding unit may belong to different classes. For instance, one class may be in favour of the use of renewable energies (in our example, the variable/question concerns investment in the research and development of renewable energies), while a second class is against the use of renewable energies and a third is undecided (Reunanen & Suikkanen 1999 pp. 7-8).
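In modern implementations, the iterative grouping described above is an expectation-maximisation (EM) loop. The following is a minimal sketch, not the authors' algorithm: `fit_lca` is a hypothetical helper for binary (0/1) items that returns the log-likelihood index of an `n_classes` solution.

```python
import math
import random

def fit_lca(data, n_classes, n_iter=200, seed=0):
    """Minimal EM sketch for a latent class model with binary (0/1) items."""
    rng = random.Random(seed)
    n_items = len(data[0])
    # Random start: equal class sizes, random per-class item probabilities.
    weights = [1.0 / n_classes] * n_classes
    probs = [[rng.uniform(0.25, 0.75) for _ in range(n_items)]
             for _ in range(n_classes)]

    def pattern_prob(row, c):
        # P(pattern | class c): product over items (local independence).
        return math.prod(probs[c][j] if x else 1.0 - probs[c][j]
                         for j, x in enumerate(row))

    for _ in range(n_iter):
        # E-step: posterior class membership for every coding unit.
        post = []
        for row in data:
            joint = [weights[c] * pattern_prob(row, c)
                     for c in range(n_classes)]
            total = sum(joint)
            post.append([p / total for p in joint])
        # M-step: re-estimate class sizes and item probabilities
        # (clamped away from 0 and 1 for numerical safety).
        for c in range(n_classes):
            size = max(sum(p[c] for p in post), 1e-12)
            weights[c] = size / len(data)
            for j in range(n_items):
                pj = sum(p[c] * row[j] for p, row in zip(post, data)) / size
                probs[c][j] = min(1.0 - 1e-6, max(1e-6, pj))

    # Log-likelihood index of the final solution.
    return sum(math.log(sum(weights[c] * pattern_prob(row, c)
                            for c in range(n_classes)))
               for row in data)

# Two clearly separated answer groups: the two-class solution should
# yield a better (less negative) log-likelihood index than one class.
data = [[1, 1, 1]] * 40 + [[0, 0, 0]] * 40 + [[1, 0, 1]] * 10 + [[0, 1, 0]] * 10
ll_1, ll_2 = fit_lca(data, 1), fit_lca(data, 2)
```

With this invented data set, the one-class solution reduces every item to a 50:50 split, while the two-class solution can separate the two answer groups, improving the log-likelihood index.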

The reverse of the one-class solution is the saturated model, in which the homogeneity of the classes is perfect because the variables in each class are completely unrelated. Homogeneity can be improved by increasing the number of classes; if each coding pattern has its own class, perfect homogeneity within the classes is reached (the saturated model). For example, if there are eleven coding patterns (000, 010, 220, etc.) and eleven classes, the classes' homogeneity would be perfect (Reunanen & Suikkanen 1999 p. 11). This is not advisable, however, because a high number of classes is too complex to interpret.

Ultimately, the goal is to find the right number of classes between the one-class solution and the saturated model. Several indices can help determine the optimal number of classes:

- BIC: Bayesian information criterion

- AIC: Akaike’s information criterion

- CIC: flattest multiplier

If all three indices suggest the same number of classes, this would be the best solution.

BIC is the strictest index because it suggests the smallest number of classes. For all three indices, the smallest value in the output of AIC, BIC, and CIC indicates the best suggested number of classes. After the best number of classes has been computed, a chi-square distributed test statistic is used to compare the log-likelihood index of the respective class solution (H0) with that of the saturated model (H1). However, Reunanen and Suikkanen (1999 p. 13) conclude 'that all these indexes are just supporting devices, and they must not be obeyed blindly.' This statement refers to a data set in which the three indexes suggest different numbers of classes: for example, BIC suggests the one-class solution, while AIC and CIC support a two-class solution.

Normally, the smallest number of classes (in our example, one class) is the best one. But the researcher may prefer the two-class solution suggested by AIC and CIC, based on the specification of his or her data set.
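The information criteria themselves are simple functions of the log-likelihood. The following sketch uses the standard AIC and BIC formulas; the log-likelihood values and parameter counts are invented for illustration:

```python
import math

def aic(log_lik, n_params):
    # Akaike's information criterion: smaller values are better.
    return -2.0 * log_lik + 2.0 * n_params

def bic(log_lik, n_params, n_obs):
    # Bayesian information criterion: smaller values are better; it
    # penalises extra parameters more heavily than AIC once n_obs >= 8.
    return -2.0 * log_lik + n_params * math.log(n_obs)

# Hypothetical class solutions for 100 respondents:
# number of classes -> (log-likelihood index, number of parameters)
fits = {1: (-210.0, 3), 2: (-180.0, 7), 3: (-178.0, 11)}

best_aic = min(fits, key=lambda k: aic(*fits[k]))
best_bic = min(fits, key=lambda k: bic(*fits[k], 100))
print(best_aic, best_bic)  # both pick the two-class solution here
```

If, as in the example in the text, the indices disagree, their values can only support, not replace, the researcher's judgment.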