5.4 Data and Estimation Issues

counterfactual shares can then be calculated, using shares either in the initial year, the final year, or the average of the two. The variance can be calculated using all three methods as a sensitivity test. Changes in this “counterfactual” variance provide an estimate of changes in the returns to unobserved skill.
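As a minimal sketch of this sensitivity test, assume the residual variance can be written as a share-weighted sum of within-cell variances. The function name, the three skill cells, and all numbers below are hypothetical and serve only to illustrate the three choices of fixed shares.

```python
import numpy as np

def counterfactual_variance(cell_variances, shares):
    """Share-weighted sum of within-cell residual variances."""
    cell_variances = np.asarray(cell_variances, dtype=float)
    shares = np.asarray(shares, dtype=float)
    return float(np.sum(shares / shares.sum() * cell_variances))

# Hypothetical shares and within-cell variances for three skill cells
shares_t0 = np.array([0.5, 0.3, 0.2])   # initial year
shares_t1 = np.array([0.3, 0.3, 0.4])   # final year
var_t0 = np.array([0.10, 0.20, 0.30])
var_t1 = np.array([0.12, 0.22, 0.33])

# The three ways of holding the composition fixed
fixed_shares = {
    "initial": shares_t0,
    "final": shares_t1,
    "average": (shares_t0 + shares_t1) / 2,
}

# Change in the counterfactual variance under each choice of shares
changes = {
    name: counterfactual_variance(var_t1, s) - counterfactual_variance(var_t0, s)
    for name, s in fixed_shares.items()
}
print(changes)
```

If the three figures are close to each other, the estimated change does not hinge on which year's composition is held fixed.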

A more convenient way to correct for composition changes is to re-weight the data when calculating the residual variance, so that the distribution and prices of observable skills at time t+1 are identical to those at time t. The re-weighting procedure is in the spirit of DiNardo, Fortin, and Lemieux (1996) and is described in Lemieux (2002) and Lemieux (2004). Its advantages are twofold. First, it can be applied even when the data are divided into fine experience-education cells. Second, it provides a whole counterfactual wage distribution and thus makes it possible to compute measures of residual wage dispersion other than the variance, e.g. the ratios between different percentiles of the residual distribution.
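The re-weighting idea can be sketched for the simple case of discrete skill cells. This is an illustration under our own assumptions, not the exact procedure of the cited papers: each observation of the later year receives the factor (cell share in the base year) / (cell share in the later year). All function names are hypothetical.

```python
import numpy as np

def cell_reweighting_factors(cells_base, cells_later, w_base=None, w_later=None):
    """Factors that give the later-year sample the base-year cell shares."""
    cells_base = np.asarray(cells_base)
    cells_later = np.asarray(cells_later)
    w_base = np.ones(len(cells_base)) if w_base is None else np.asarray(w_base, dtype=float)
    w_later = np.ones(len(cells_later)) if w_later is None else np.asarray(w_later, dtype=float)
    factors = np.zeros(len(cells_later))
    for c in np.unique(cells_later):
        share_base = w_base[cells_base == c].sum() / w_base.sum()
        share_later = w_later[cells_later == c].sum() / w_later.sum()
        # Cells that do not exist in the base year receive a factor of zero
        factors[cells_later == c] = share_base / share_later
    return factors

def weighted_variance(x, w):
    """Variance of x under weights w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    mean = np.average(x, weights=w)
    return float(np.average((x - mean) ** 2, weights=w))

def weighted_quantile(x, w, q):
    """Smallest value whose weighted cumulative share reaches q."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    order = np.argsort(x)
    x, w = x[order], w[order]
    cdf = np.cumsum(w) / w.sum()
    idx = min(int(np.searchsorted(cdf, q)), len(x) - 1)
    return float(x[idx])

# Hypothetical example: cell "a" shrinks from 75 to 25 percent of the sample
factors = cell_reweighting_factors(["a", "a", "a", "b"], ["a", "b", "b", "b"])
print(factors)
```

Because the factors define a full set of weights, any dispersion measure can be computed on the re-weighted residuals, for example `weighted_variance(residuals, factors)` or a ratio of two `weighted_quantile` values.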

Measurement error is an additional factor: if its extent changes over time, it may change the residual variance in a way that is unrelated to unobserved skills or their returns. We already mentioned the case of hyperinflation, where measurement error most likely renders any analysis useless. Our solution to this problem is to choose comparison years that are less affected by inflation. This is most relevant for the 1980s, for which we consider 1980 and 1986 the most appropriate base years. Apart from that, we have no means of analyzing whether and how measurement error in the EPH has changed over time, and thus assume it to be constant.

Aires. According to the Argentine Census, 46 percent of the Argentinean population lived in this area. As Argentina is mostly urban, trends observed in Buenos Aires are often considered representative of Argentina as a whole.

Further urban centers of Argentina were added to the sample over time, totaling 28 major provincial cities in the most recent versions of the survey. Data with comparable coverage are available since 1992 for 16 main urban conglomerates in Argentina (henceforth ARG16). Until 2003, the survey was conducted on a semi-annual basis (May and October); after that, the questionnaire and methodology changed substantially.

We investigate the time series for GBA from 1980 to 2002, always using the October round of the survey.⁵⁰ For the wage analysis we focus on real hourly wages of workers holding a single job, as reported in the EPH questionnaire.⁵¹ To convert nominal wages into real wages, we use INDEC’s historic general consumer price index (IPC) for Gran Buenos Aires and deflate all values to constant October 2000 pesos.
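The deflation step amounts to rescaling each nominal wage by the ratio of the price index in the base period to the price index in the survey period. The following sketch uses a hypothetical function name and made-up index values, not actual IPC figures.

```python
def deflate_to_base(nominal_wage, cpi_period, cpi_base):
    """Express a nominal wage in prices of the base period."""
    if cpi_period <= 0 or cpi_base <= 0:
        raise ValueError("price index values must be positive")
    return nominal_wage * cpi_base / cpi_period

# Hypothetical index values: prices doubled between the survey month
# and the base month, so a nominal wage of 10 corresponds to 20 in base prices.
real_wage = deflate_to_base(10.0, cpi_period=50.0, cpi_base=100.0)
print(real_wage)
```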

To underline the explanatory power of the results from the smaller GBA sample, the decomposition analysis is also carried out using the ARG16 sample from 1992 to 2002 as a robustness check. For this analysis, the sample of urban centers is not continuously expanded to 28 cities as survey coverage increases over time. The reason is that changes in the survey’s coverage can have substantial effects on the residual variance through geographical differences that we cannot observe, even if there are no important changes in the observable means. Regional variation in the ARG16 sample is accounted for by adjusting all incomes to the level of GBA, using a one-time comparison of price levels in 2001. This method effectively assumes that relative regional price differences have not changed over time. Given the convertibility regime from 1991 to 2001 and the resulting price stability, this assumption may be justified for most years of the ARG16 sample, but it is debatable for the later years.

50 The May round of 2003 could be used to extend the data by another half year. In an analysis of variance, however, this might be misleading because of seasonal effects on employment and wages: data from INDEC clearly show that economic activity is considerably higher in May than in October.

51 To avoid effects stemming from changes in the incidence of multiple-job holding, this paper focuses on wages from the principal occupation only. Workers with more than one occupation therefore have to be discarded to keep the data series consistent over time, because before 1995 hourly wage data are available only for workers with a single job. Even though this may be a minor point, to our knowledge this consistency adjustment has not yet been made in any empirical research using EPH data.

Data inspection reveals a strong spike in all wage dispersion figures centered around 1989, the worst year of hyperinflation in Argentina (see Figure 5-6 and Figure 5-7, appendix). Prices rose by up to nearly 4,000 percent annually, which led to the introduction of the Argentine currency board in April 1991. Hyperinflation affects the data in two ways. First, measurement error is likely to be higher when people have to recall their earnings in an environment of constantly changing prices and wages. Second, during hyperinflation prices and wages change monthly, weekly or even daily. Since a survey cannot interview the whole sample at the same point in time, sequenced interviewing introduces an upward bias in the wage variance in times of high price volatility and wage contract turnover.

Thus, the figures for the 1980s must be analyzed with caution. Using a base year with an inflated wage variance might lead to wrong conclusions about variance changes over time. What matters for the quality of data from periods of high inflation is not only yearly inflation but also inflation in the month of interviewing. We use 1980 as a base year, as inflation was moderate both over the whole year and in the survey month of October.

We apply the re-weighting methodology to analyze changes in the residual variance relative to a base year, re-weighting the observations of the more recent year.

The educational and demographic distribution of the Argentine labor force has changed noticeably since the 1980s. In particular, the overall improvement in educational attainment may have increased wage dispersion over time.

The decomposition is carried out stepwise, following Lemieux (2002). First, a counterfactual wage distribution is generated, using the later year’s observable skill distribution and the base year’s estimated coefficients on observed skills. The difference between the inequality indicators of the final year and those of the counterfactual distribution can be attributed to changes in the returns to observed skills. In a second step, the counterfactual distribution is re-weighted as detailed above. The difference between the inequality indicators of the two distributions is ascribed to changes in the skill distribution in the population. Finally, the difference between the distributional indicators of the base year and those of the counterfactual distribution that uses both base-year weights and base-year returns is the effect of changing returns to unobserved skills.⁵²
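The three steps can be illustrated with a stylized sketch. It rests on several assumptions of ours that are not taken from the text: observed skills are captured by discrete cells, the "coefficients" are cell means of log wages (equivalent to a regression on a full set of cell dummies), every cell appears in both years, and the variance is the inequality indicator. The signs are chosen so that the three components add up to the total change. All names and the synthetic data are hypothetical.

```python
import numpy as np

def cell_means(y, cells):
    """Mean log wage per skill cell (returns to observed skills)."""
    return {c: y[cells == c].mean() for c in np.unique(cells)}

def wvar(x, w=None):
    """Variance of x, optionally weighted."""
    w = np.ones(len(x)) if w is None else w
    m = np.average(x, weights=w)
    return float(np.average((x - m) ** 2, weights=w))

def decompose_variance(y0, cells0, y1, cells1):
    """Split the change in variance between base year 0 and later year 1."""
    y0, y1 = np.asarray(y0, dtype=float), np.asarray(y1, dtype=float)
    cells0, cells1 = np.asarray(cells0), np.asarray(cells1)
    b0, b1 = cell_means(y0, cells0), cell_means(y1, cells1)

    # Step 1: later-year skills and residuals, base-year returns
    resid1 = y1 - np.array([b1[c] for c in cells1])
    y_cf = np.array([b0[c] for c in cells1]) + resid1

    # Step 2: re-weight the counterfactual to the base-year cell shares
    share0 = {c: np.mean(cells0 == c) for c in b0}
    share1 = {c: np.mean(cells1 == c) for c in b1}
    w = np.array([share0[c] / share1[c] for c in cells1])

    v0, v1 = wvar(y0), wvar(y1)
    v_cf, v_cf_rw = wvar(y_cf), wvar(y_cf, w)
    return {
        "total": v1 - v0,
        "observed_returns": v1 - v_cf,
        "skill_distribution": v_cf - v_cf_rw,
        # Step 3: what remains relative to the base year
        "unobserved_returns": v_cf_rw - v0,
    }

# Synthetic data for illustration only
rng = np.random.default_rng(0)
cells0 = rng.integers(0, 3, 500)
cells1 = rng.integers(0, 3, 500)
y0 = 1.0 + 0.2 * cells0 + rng.normal(0.0, 0.3, 500)
y1 = 1.0 + 0.3 * cells1 + rng.normal(0.0, 0.4, 500)
result = decompose_variance(y0, cells0, y1, cells1)
print(result)
```

Replacing `wvar` with another weighted indicator, such as a percentile ratio, leaves the structure of the steps unchanged.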