
Case II - Effect Does Exist

3.4.3 Analytical Analysis

Figure 3.9: Histograms of the true and transformed locally standardized (normalized) t-values

[Figure 3.9 consists of two histogram panels, each showing fractions with a normal density overlaid. Both are based on own meta-database.
First panel: "Distribution of locally standardized true t-values". x-axis: (normalized) locally standardized true t-values, cut at |4|, from −4 to 4; y-axis: fraction.
Second panel: "Distribution of locally standardized converted t-values". x-axis: (normalized) locally standardized converted t-values, cut at |4|, from −4 to 4; y-axis: fraction.]

The visual analysis suggests that too few values lie close to the mean of their respective study. This would explain both the figure for all values and the figure for the true and transformed values (figure 3.9). The conclusion of the visual analysis is therefore that many values of low or medium significance are either missing or have been partially replaced by more significant, or very insignificant, values.
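The kind of check behind this visual impression can be sketched in a few lines. The sketch assumes that "locally standardized" means standardizing each t-value with the mean and standard deviation of its own study; the function names, the band width of 0.5 and the toy data are hypothetical and not taken from the analysis above.

```python
from collections import defaultdict
from statistics import NormalDist, mean, pstdev

def locally_standardize(t_values, study_ids):
    """Standardize each t-value with the mean and SD of its own study (assumed definition)."""
    by_study = defaultdict(list)
    for t, s in zip(t_values, study_ids):
        by_study[s].append(t)
    stats = {s: (mean(v), pstdev(v)) for s, v in by_study.items()}
    z = []
    for t, s in zip(t_values, study_ids):
        m, sd = stats[s]
        if sd > 0:  # studies without any spread cannot be standardized
            z.append((t - m) / sd)
    return z

def central_share(z, half_width=0.5):
    """Observed share of |z| < half_width and the share implied by a standard normal."""
    observed = sum(abs(v) < half_width for v in z) / len(z)
    nd = NormalDist()
    expected = nd.cdf(half_width) - nd.cdf(-half_width)
    return observed, expected

# Hypothetical toy data: two studies whose estimates avoid their own centre.
t_values = [-3.0, -2.9, -1.0, -0.9, -2.5, -2.4, 0.1, 0.2]
study_ids = ["A", "A", "A", "A", "B", "B", "B", "B"]
z = locally_standardize(t_values, study_ids)
observed, expected = central_share(z)
print(f"observed central share: {observed:.3f}, normal benchmark: {expected:.3f}")
```

An observed central share well below the normal benchmark would correspond to the "hole" around the study means described above.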

Table 3.23: Regressing log(|t|) on log(n)

Variable   coef.       robust sd   t      p
log(n)     0.0448951   0.0203802   2.20   0.028***
Constant   0.0896568   0.1103612   0.81   0.417*

The regression is based on 5049 observations from 548 studies, using clustered and robust standard errors (with the studies as clusters). R² = 0.0045, F(1, 547) = 4.85, P(F) = 0.028; the root mean square error (RMSE) is 1.1899.

The symbols *, ** and *** denote significance at the 10, 5 and 1% level in a regression without clustered standard errors.
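A regression of this type can be sketched as follows. This is a minimal illustration, not the estimation code behind table 3.23: it ignores any weighting, the function name and toy data are hypothetical, and the finite-sample correction G/(G−1) · (N−1)/(N−K) is an assumed, commonly used convention.

```python
import math
from collections import defaultdict

def clustered_ols(x, y, clusters):
    """OLS of y on a constant and x, with cluster-robust standard errors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    const = my - slope * mx
    resid = [yi - const - slope * xi for xi, yi in zip(x, y)]

    # Per-cluster score sums: s_g = sum over i in g of (1, x_i) * residual_i.
    scores = defaultdict(lambda: [0.0, 0.0])
    for xi, ui, cl in zip(x, resid, clusters):
        scores[cl][0] += ui
        scores[cl][1] += xi * ui
    b00 = sum(s[0] * s[0] for s in scores.values())
    b01 = sum(s[0] * s[1] for s in scores.values())
    b11 = sum(s[1] * s[1] for s in scores.values())

    # Inverse of X'X for the design matrix X = [1, x].
    sx, sx2 = sum(x), sum(xi * xi for xi in x)
    det = n * sx2 - sx * sx
    a00, a01, a11 = sx2 / det, -sx / det, n / det

    # Sandwich (X'X)^-1 B (X'X)^-1, diagonal elements only.
    v_const = a00 * a00 * b00 + 2 * a00 * a01 * b01 + a01 * a01 * b11
    v_slope = a01 * a01 * b00 + 2 * a01 * a11 * b01 + a11 * a11 * b11

    # Assumed finite-sample correction: G/(G-1) * (N-1)/(N-K), with K = 2.
    n_clusters = len(scores)
    c = n_clusters / (n_clusters - 1) * (n - 1) / (n - 2)
    se_slope, se_const = math.sqrt(c * v_slope), math.sqrt(c * v_const)
    return {"slope": slope, "const": const,
            "se_slope": se_slope, "se_const": se_const,
            "t_slope": slope / se_slope}

# Hypothetical toy data: sample sizes, t-values and study labels.
n_obs = [20, 50, 100, 400, 1000, 3000]
t_val = [-1.1, 2.3, -1.8, -2.9, 2.2, -4.0]
study = ["A", "A", "B", "B", "C", "C"]
fit = clustered_ols([math.log(n) for n in n_obs],
                    [math.log(abs(t)) for t in t_val], study)
print(fit)
```

As a consistency check on table 3.23, 0.0448951 / 0.0203802 ≈ 2.20, and its square, ≈ 4.85, matches the reported F(1, 547).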

Figure 3.10: Relationship between log(t) and log(n)

[Scatterplot and linear fit. x-axis: log(n), from 2 to 12; y-axis: log(|t|), from −6 to 4; legend: log(t), linear fit. Based on own meta-database.]

Table 3.24: Coefficients from regressing log(|t|) on log(n) for specific subsets

Subset              coef.      robust sd   t       p          n      %w       cluster
All                 0.04490    0.02038     2.20    0.028***   5049   100.00   548
True t              0.02421    0.02503     0.97    0.334*     2371   57.77    329
Transformed t       0.09522    0.03196     2.98    0.003***   2678   42.23    248
t<0                 0.04328    0.02141     2.02    0.044***   3645   74.67    503
t>0                 0.06598    0.04147     1.59    0.113***   1404   25.33    299
n≤60                0.09354    0.14737     0.63    0.527      1305   26.52    159
60≤n<250            −0.17363   0.14013     −1.24   0.217**    1219   26.40    162
250≤n<500           0.85946    0.42491     2.02    0.046***   1025   16.21    108
n≥500               0.09770    0.05323     1.84    0.068***   1500   30.87    189
#est. ≤2            0.07476    0.04565     1.64    0.105      153    19.25    101
3≤#est. ≤5          0.03804    0.05973     0.64    0.526      311    17.34    93
6≤#est. ≤11         0.03503    0.04053     0.86    0.389      675    20.85    115
12≤#est. ≤25        −0.00909   0.03643     −0.25   0.803      1248   23.53    129
#est. ≥26           0.09179    0.04224     2.17    0.032***   2662   19.03    110
S: Europe           −0.01965   0.07106     −0.28   0.783      736    12.87    69
S: USA              0.04787    0.02234     2.14    0.033***   3989   76.61    420
S: Other            0.11029    0.09963     1.11    0.273**    324    10.52    59
S: Economists       0.01007    0.02669     0.38    0.706      2147   48.11    265
S: Soc. & crim.     0.06384    0.03994     1.60    0.112***   2497   35.92    197
S: Others           0.17227    0.05384     3.20    0.002***   380    15.04    80
S: Journal          0.05602    0.02208     2.54    0.011***   4043   85.30    466
S: Book             −0.03650   0.07795     −0.47   0.643      362    5.48     30
S: Paper            −0.03076   0.05668     −0.54   0.590      644    9.22     52
S: Cross sections   0.00301    0.06189     0.05    0.961      1198   23.58    131
S: Time series      0.03773    0.03687     1.02    0.308**    1444   32.00    171
S: Panel data       0.09055    0.03958     2.29    0.024***   1006   17.26    96
S: Crime data       0.03956    0.02516     1.57    0.117***   3234   65.52    358
S: Surveys          0.14302    0.04778     2.99    0.003***   1735   27.58    155
S: Experiments      0.05778    0.08665     0.67    0.508      188    10.11    55
Crime data          0.04079    0.02781     1.47    0.143***   3048   61.32    340
Surveys             0.17936    0.05945     3.02    0.003***   1732   25.98    148
Experiments         0.06537    0.04650     1.41    0.164**    269    12.70    68
Bivariate           0.14258    0.03593     3.97    0.000***   1491   22.07    153
Multivariate        0.01938    0.02249     0.86    0.389*     3558   77.93    467

The first regression is based on 5049 observations from 548 studies, using clustered and robust standard errors (with the studies as clusters). Some categories do not sum to one because of missing or multiple entries.

Variables preceded by “S:” are measured at the study level, while all others are based on the estimates. n is the number of observations in the corresponding category. %w is the fraction of the summed weights of the corresponding category. The symbols *, ** and *** denote significance at the 10, 5 and 1% level in a regression without clustered standard errors.

We have to be cautious when interpreting the results from table 3.24, because most coefficients are close to zero and the categories are based on different numbers of observations. Nevertheless, the data seem to offer some clues. If a publication bias is present, there is no evidence that positive or negative results are more affected. However, the results for studies with moderate sample sizes are suspicious. Judging from figure 3.10, there seem to be more low absolute (normalized) t-values for sample sizes between 250 and 500 than there should be. On the other hand, there seem to be too many large absolute values for results based on sample sizes between 60 and 250.

The high coefficient for surveys may reflect the fact that their results scale more heavily with the sample size than those of other research methods. For example, panel data are expected to yield much lower coefficients than Pearson correlations. On the other hand, panel data are usually based on many observations and, therefore, the significance of their results scales much better than in the case of cross sections or time series. The fact that journals have the “best” coefficient contradicts the common hypothesis that publication bias is more readily found in (refereed) journals. The contrary is the case: the coefficients for results published in working papers and books, which are usually not subject to a referee process, even have the wrong sign, although they are not significant. So, if anything, these are biased towards zero.

The relationship between sample size and significance is more pronounced for bivariate methods. To some extent, this may be attributed to the dominant usage of Pearson correlations, to multivariate methods which do not have t-distributed standard errors under the null hypothesis, or to their complexity. The difference between economists on the one side and sociologists, criminologists and jurists on the other may be partly explained by the fact that economists use bivariate methods and surveys less often. There seems to be no relationship in the case of studies about Europe (mainly the United Kingdom and Germany). Whether this is due to artifacts of the data, publication bias or a weaker deterrent effect cannot be judged unambiguously here. It is somewhat disappointing that no clear statement can be made for the true t-values. This may be partly explained by the fact that t-values are much more often reported by economists who use multivariate methods and employ crime data.

As pointed out by Stanley (2005a), the absence of any relationship can lead to two different conclusions: either a publication bias is present or there is simply no effect. The latter option is often neglected in the literature, which leads to false conclusions, and thus has to be considered separately. It does not seem to apply to the deterrence literature: because there is an overall relationship, insignificant results for specific subsets are more likely to stem from a bias than from the absence of an effect.
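The reasoning behind regressing log(|t|) on log(n) can be made explicit with a stylized derivation. It is a sketch under assumptions that are not taken from the text: a single true effect β, and a standard error that shrinks at the usual rate σ/√n, where σ is a hypothetical scale constant.

```latex
t = \frac{\hat{\beta}}{\operatorname{SE}(\hat{\beta})}, \qquad
\operatorname{SE}(\hat{\beta}) \approx \frac{\sigma}{\sqrt{n}}
\;\Longrightarrow\;
|t| \approx \frac{|\beta|}{\sigma}\,\sqrt{n}
\;\Longrightarrow\;
\log|t| \approx \log\frac{|\beta|}{\sigma} + \tfrac{1}{2}\log n
\qquad (\beta \neq 0).
```

Under these assumptions a genuine effect implies a positive slope (1/2 in this idealized case), whereas for β = 0 the t-values do not grow with n and the slope is zero. Selective reporting of significant results tends to weaken the link between |t| and n, which is why a flat relationship alone cannot distinguish a bias from a missing effect.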

Even if there is an effect, the regression itself cannot tell us the characteristics of the bias. Usually, there will be clusters around the typical regions of significance, which reduce the influence of the sample size. As shown in subsection 3.4.2, there seems to be a shift away from “medium” insignificance in the distribution, but no obvious clustering.

Other Evidence

We also have some other statistics at our disposal. We know whether the deterrence variable was the main focus of an estimation or just a covariate (e.g., in studies analyzing the effect of crime on unemployment). In principle, the mean effect should be independent of the focus of the estimation. However, the mean (normalized) t-value for studies which focus on deterrence is −1.47 (the median is −1.44), while the values for estimates which do not focus on deterrence differ significantly (mean −1.02, median −1.01). This tentative evidence should not be overrated, because estimates which do not focus on deterrence may lack important features that minimize potential errors, biases and other peculiarities found in the deterrence literature.
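A comparison of this kind could be sketched as follows. The text does not state which test was used; Welch's t-statistic is swapped in here purely for illustration, and the function name and toy data are hypothetical.

```python
import math
from statistics import mean, median, variance

def welch_t(a, b):
    """Welch's t-statistic and approximate degrees of freedom for two samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical normalized t-values, split by the focus of the estimation.
focus = [-2.0, -1.5, -1.0, -1.5, -2.0, -1.0]   # deterrence is the main focus
other = [-1.0, -0.5, -1.5, -1.0, -0.5, -1.5]   # deterrence is only a covariate

t_stat, df = welch_t(focus, other)
print(f"means: {mean(focus):.2f} vs {mean(other):.2f}")
print(f"medians: {median(focus):.2f} vs {median(other):.2f}")
print(f"Welch t = {t_stat:.3f}, df = {df:.1f}")
```

Because several estimates usually come from the same study, a test that treats them as independent is likely to overstate the significance of the difference; clustering by study would be the more cautious choice.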