Case II - Effect Does Exist
3.6 Multivariate Statistics
3.6.3 Ordinary Least Squares
In the literature, too much emphasis is put on statistical
significance, implicitly assuming a statistically significant effect is economically meaningful in terms of size.
Florax and de Groot (2002)
Although (weighted) ordinary least squares underlies all of the following regression methods, we reserve the term (simple) OLS for those methods which do not objectively select “important” variables. When possible, we use robust clustered standard errors (each study is treated as a cluster). Although the residuals of all estimates differ significantly from the corresponding normal distribution, a visual inspection of each plot suggests that the deviations are not severe. Since we will compare all implemented methods in section 4.2, we show the results of the OLS regressions of all variables (table 3.49) and of the set of variables which are significant at a 10% level in the first regression (table 3.50). Variables which cause singularity problems are dropped by the algorithm (22 out of 515 variables). In all regressions each dummy is to be interpreted in comparison to its opposite property; e.g., the coefficient of the author Isaac Ehrlich is to be compared to a study without the participation of Ehrlich. If both values of a dummy are included in a regression, they are compared to the missing values.
In the following tables we include the variation of each variable. For non-metric variables it indicates the percentage of entries which differ from the most frequent entry. For dummy variables this means that the largest possible variation is fifty percent. This information is important when interpreting variables which are almost constant (i.e., have a very low variation) and influence only very few estimates. For reasons of parsimony, table 3.49 displays only those variables which are significant at a 25% level.
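The variation measure described above can be illustrated with a minimal sketch. The function name `variation` and the toy columns are hypothetical, and ignoring missing entries (coded as `None`) is an assumption, since the text does not state how missing values enter the percentage.

```python
from collections import Counter


def variation(values):
    """Percentage of non-missing entries that differ from the most frequent entry.

    Assumption: missing entries (None) are ignored before counting.
    """
    valid = [v for v in values if v is not None]
    if not valid:
        return 0.0
    most_frequent_count = Counter(valid).most_common(1)[0][1]
    return 100.0 * (len(valid) - most_frequent_count) / len(valid)


# Hypothetical dummy columns, not taken from the data set.
rare_dummy = [1] * 3 + [0] * 97       # almost constant
balanced_dummy = [1] * 50 + [0] * 50  # largest possible variation for a dummy

print(variation(rare_dummy), variation(balanced_dummy))
```

A dummy cannot exceed fifty percent under this definition because its most frequent value always covers at least half of the valid entries.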
Table 3.49: Multivariate analysis - full OLS
Variable var. coef. t p
Study: not explorative 4.0 −0.546 −1.17 0.241
Study: measuring points 97.4 0.001 1.65 0.100
Study: year of first measure 84.4 −0.000 −1.70 0.089
Study: time span in months 63.2 −0.001 −1.44 0.150
Study: size of second population 0.9 0.001 1.18 0.238
Study: size of first realized sample 58.8 −0.000 −3.13 0.002
Study: size of second realized sample 2.4 −0.001 −3.12 0.002
Study: maximum age in first sample 5.7 −0.019 −1.82 0.069
Study: check for validity 2.1 4.587 1.24 0.214
Study: tests of significance 9.4 −0.604 −1.86 0.063
Study: number of bivariate estimates 32.4 0.013 1.98 0.048
Study: user, tr 48.6 −1.898 −1.95 0.052
Study: publication, journal article 13.7 0.545 1.50 0.135
Study: publication, working paper, report 4.9 −1.002 −1.22 0.224
Study: publication, not a dissertation or master thesis 2.3 −1.929 −2.32 0.021
Study: author, David W. Rasmussen 1.3 −2.134 −1.21 0.226
Study: author, Simon Hakim 1.0 −2.486 −2.11 0.035
Study: author, Raymond Paternoster 2.4 −1.388 −1.58 0.115
Study: author, Isaac Ehrlich 1.0 −1.294 −1.16 0.246
Study: author, Maynard I. Erickson 0.6 1.750 1.69 0.092
Study: author, Jack P. Gibbs 0.9 −1.982 −1.47 0.141
Study: author, Alex R. Piquero 1.3 −2.081 −1.95 0.052
Study: journal, Accident Analysis and Prevention 2.2 −1.634 −1.16 0.248
Study: journal, Studies on Alcohol 1.1 −1.665 −1.25 0.213
Study: author, Germany 3.2 2.641 1.51 0.132
Study: author, Switzerland 1.0 3.645 2.41 0.016
Study: author, Finland 0.8 −4.270 −2.33 0.020
Study: author, Australia 2.0 1.120 1.54 0.125
Study: author, Sweden 0.9 2.474 2.36 0.018
Study: author, other country 3.7 0.942 1.35 0.177
Study: author, criminology 11.3 0.907 1.77 0.077
Study: author, law 3.6 0.710 1.24 0.217
Study: publication, type not applicable 0.4 −3.226 −2.43 0.015
Study: institute, sociology 21.3 −0.723 −1.34 0.179
Study: experiment (laboratory) 4.4 −1.193 −1.21 0.228
Study: experiment (field, institutional initiative) 1.6 −1.746 −2.03 0.043
Study: first population, Canada 4.3 2.516 3.38 0.001
Study: first population, Netherlands 1.3 −1.594 −1.45 0.146
Study: sample base, first population, complete country 36.9 −1.624 −1.68 0.093
Study: sample base, first population, partial country 37.4 −2.006 −2.00 0.046
Study: sample base, second population, complete country 2.4 −2.811 −1.86 0.064
Study: sample base, second population, partial country 2.2 −4.687 −2.29 0.022
Study: sample unit, first population, miscellaneous 7.4 1.250 1.88 0.060
Study: sample unit, second population, individuals 1.7 2.792 1.61 0.108
Study: sample individuals, second population, population 2.6 5.075 2.47 0.014
Study: sample individuals, second population, miscellaneous 1.2 4.603 1.75 0.080
Study: complete sample 9.8 −0.773 −1.58 0.114
Study: PKS is public data base 1.2 2.496 2.30 0.022
Study: miscellaneous public data base 40.1 0.778 2.26 0.024
Study: UCR is public data base 22.9 0.506 1.32 0.187
Study: no public data base 26.3 0.702 2.17 0.030
Study: no class over-represented 0.9 1.840 1.41 0.158
Study: no disadvantaged group 0.6 −2.242 −2.12 0.035
Study: percentage of convicted>75% 0.6 2.653 2.13 0.034
Study: main location>500000 inhabitants 3.5 −1.869 −1.89 0.059
Study: main location<5000 inhabitants 0.2 −7.691 −4.92 0.000
Study: does not claim to be representative 32.5 −0.301 −1.15 0.250
Study: claims to be representative 19.4 −0.547 −1.77 0.078
Study: does not check representativeness 26.9 0.395 1.71 0.087
Study: closed questions for pretest 21.5 1.374 2.45 0.014
Study: mixed questions for pretest 2.1 2.283 2.46 0.014
Study: Guttman reliability method 0.2 9.035 2.54 0.011
Study: miscellaneous reliability method 0.3 −3.552 −2.00 0.046
Study: correlational reliability method 0.2 −4.002 −2.03 0.043
Study: variables reliable 3.9 2.123 1.18 0.238
Study: validity test of some variables 1.5 −3.717 −1.25 0.213
Study: unknown if variables valid 0.3 −7.100 −1.58 0.114
Estimate: deterrence is focus-variable 14.8 −0.327 −1.18 0.240
Estimate: sub-sample 14.9 0.260 1.26 0.207
Estimate: sub-sample of males 1.4 −1.48 0.139
Estimate: sub-sample of non-urban area 0.8 −1.753 −1.52 0.128
Estimate: exogenous, index mean 0.0 −2.071 −1.53 0.127
Estimate: exogenous, index items miscellaneous 0.2 1.671 1.17 0.242
Estimate: exogenous, index items standardized 0.2 2.550 1.58 0.115
Estimate: study type, death penalty 8.2 3.145 1.19 0.234
Estimate: exogenous, crime data, incarceration per crime 0.9 −1.316 −1.47 0.143
Estimate: exogenous, crime data, convicted per crime 0.6 −1.422 −1.43 0.152
Estimate: exogenous, survey, is no experiment 0.9 2.789 1.88 0.061
Estimate: exogenous, survey, probability of detection by police 7.1 −1.048 −2.98 0.003
Estimate: exogenous, survey, probability of punishment by justice 4.5 −1.235 −3.35 0.001
Estimate: exogenous, survey, severity of punishment by justice 3.2 −0.436 −1.24 0.217
Estimate: exogenous, survey, probability of other kind of punishment 0.4 −1.193 −1.39 0.166
Estimate: exogenous, survey, probability of detection by friends or family 0.2 −1.658 −2.23 0.026
Estimate: exogenous, survey, probability of punishment by friends or family 1.6 −1.634 −3.97 0.000
Estimate: exogenous, survey, severity of punishment by friends or family 1.0 −0.678 −1.49 0.136
Estimate: exogenous, survey, time between offense and clearance 0.1 −1.293 −2.03 0.043
Estimate: exogenous, survey, relates to the present 21.3 2.299 1.39 0.164
Estimate: exogenous, survey, relates to the past 2.7 2.723 1.61 0.107
Estimate: exogenous, experiment, yes 7.2 3.426 1.95 0.052
Estimate: exogenous, experiment, no 5.4 3.397 1.89 0.060
Estimate: exogenous, experiment, relates to the present 13.9 −0.841 −1.47 0.142
Estimate: exogenous, experiment, relates to the past 0.5 −1.758 −1.32 0.188
Estimate: exogenous, relates to one year 42.7 0.663 1.62 0.107
Estimate: exogenous, relates to more than one year 12.1 −0.492 −1.20 0.232
Estimate: exogenous, metric category 36.8 1.959 1.80 0.073
Estimate: exogenous, interval category 9.1 1.746 1.46 0.145
Estimate: exogenous, binary category 18.8 1.923 1.75 0.080
Estimate: exogenous, nominal category 0.3 −3.318 −1.44 0.149
Estimate: exogenous, ordinal category 7.4 2.475 2.16 0.031
Estimate: exogenous, in differences 3.1 1.533 1.89 0.059
Estimate: endogenous, index miscellaneous 0.1 1.792 1.18 0.240
Estimate: endogenous, index additive, weighted 0.1 −1.001 −1.53 0.127
Estimate: endogenous, number of registered suspects 0.6 1.601 1.33 0.183
Estimate: endogenous, number of convicted to prison sentence 0.2 2.476 1.98 0.048
Estimate: endogenous, probability of delinquency of fictitious offense (surveyed is delinquent) 1.2 1.895 1.77 0.077
Estimate: endogenous, recidivism 0.7 2.768 2.28 0.023
Estimate: endogenous, accidents 4.2 1.232 1.49 0.136
Estimate: endogenous, self reported delinquency since age of fourteen 0.6 2.619 2.73 0.007
Estimate: crime category, misdemeanors 9.5 −1.137 −3.04 0.002
Estimate: crime category, formal deviant behavior 2.5 0.735 1.37 0.170
Estimate: crime category, other 1.5 1.042 1.44 0.150
Estimate: offense, assault 10.1 −0.297 −1.24 0.214
Estimate: offense, negligent assault 1.8 0.695 1.74 0.083
Estimate: offense, burglary 12.2 0.251 1.40 0.163
Estimate: offense, larceny (severe) 3.2 −0.591 −1.89 0.059
Estimate: offense, drug possession (hard) 0.5 −2.585 −1.87 0.061
Estimate: offense, driving without a licence 0.0 −0.858 −1.38 0.168
Estimate: offense, drunk driving 12.1 0.638 1.76 0.079
Estimate: offense, fare dodging 0.4 0.592 1.51 0.132
Estimate: offense, fraud 3.9 0.531 1.55 0.121
Estimate: offense, tax evasion 7.3 0.600 1.54 0.124
Estimate: offense, other 7.1 −0.721 −1.86 0.064
Estimate: offense, vehicle theft 8.5 0.267 1.36 0.175
Estimate: offense, environmental crimes, violations of prescriptive limits 2.3 1.606 1.61 0.107
Estimate: property and violent characteristics 48.8 0.627 1.41 0.159
Estimate: endogenous, metric category 19.5 −1.749 −1.60 0.111
Estimate: endogenous, interval category 4.1 −1.532 −1.26 0.208
Estimate: endogenous, ordinal category 4.6 −3.203 −2.85 0.004
Estimate: endogenous, binary category 9.6 −3.269 −2.92 0.004
Estimate: endogenous, not in logs 32.0 1.232 1.73 0.085
Estimate: endogenous, in logs 26.8 1.285 1.74 0.083
Estimate: endogenous, other transformation 7.8 −2.009 −2.18 0.030
Estimate: covariate, age 20.2 −0.491 −2.02 0.044
Estimate: covariate, marital status 5.6 0.701 1.88 0.060
Estimate: covariate, profession 0.4 1.677 1.89 0.060
Estimate: covariate, social class 1.5 −1.489 −1.72 0.086
Estimate: covariate, drug usage 1.3 −1.124 −1.26 0.209
Estimate: covariate, morality 1.6 1.020 2.02 0.044
Estimate: covariate, personal characteristics 3.0 −0.720 −1.54 0.124
Estimate: covariate, random effects 1.0 1.054 1.45 0.147
Estimate: covariate, poverty, welfare 6.4 0.673 2.34 0.020
Estimate: covariate, urbanity 8.2 0.446 1.48 0.138
Estimate: covariate, GDP 1.4 −1.005 −1.53 0.127
Estimate: covariate, population (-growth) 11.5 0.434 1.64 0.102
Estimate: covariate, alcohol (consumption) 1.7 0.623 1.32 0.188
Estimate: covariate, consumption 2.0 −0.717 −1.41 0.159
Estimate: covariate, risk propensity 0.8 1.905 2.63 0.009
Estimate: no correction for simultaneity 19.3 −1.357 −1.98 0.048
Estimate: unweighted model 6.8 −0.477 −1.25 0.211
Estimate: bivariate method,ρ 0.5 −2.774 −1.57 0.117
Estimate: bivariate method, binomial 0.2 2.063 1.26 0.207
Estimate: multivariate method, COX regression 0.3 2.331 1.17 0.242
Estimate: square root of sample size for negative values 79.2 −0.014 −4.44 0.000
Estimate: square root of sample size for positive values 82.5 0.052 6.90 0.000
N = 6530, R² = 0.478, number of clusters is 663; 22 out of 515 variables are dropped due to singularity problems.
The column var. refers to the variation of a variable (i.e., the percentage of valid observations); the maximum variation for dummy variables is fifty percent. coef. and t are the coefficients and the corresponding (normalized) t-values of the included variables. The reference category for dummies is usually the opposite property or, in the case of multiple categories, the missing values.
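The clustered standard errors behind the reported t-values, with each study treated as a cluster, can be sketched for the simplest case of one regressor plus an intercept. This is an illustration only, not the estimation code used for the tables: the function names and the toy data are hypothetical, the regression is unweighted, and no small-sample correction is applied (an assumption, since the text does not specify one).

```python
from collections import defaultdict
from math import sqrt


def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def ols_cluster(x, y, cluster):
    """OLS of y on [1, x] with cluster-robust (sandwich) standard errors."""
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]  # (X'X)^-1
    intercept = bread[0][0] * sy + bread[0][1] * sxy
    slope = bread[1][0] * sy + bread[1][1] * sxy

    # Sum the scores x_i * u_i within each cluster (here: each study).
    scores = defaultdict(lambda: [0.0, 0.0])
    for xi, yi, g in zip(x, y, cluster):
        u = yi - intercept - slope * xi  # residual
        scores[g][0] += u
        scores[g][1] += xi * u

    # "Meat" of the sandwich: sum over clusters of the outer product of scores.
    meat = [[0.0, 0.0], [0.0, 0.0]]
    for s in scores.values():
        for i in range(2):
            for j in range(2):
                meat[i][j] += s[i] * s[j]

    cov = matmul2(matmul2(bread, meat), bread)
    # Guard against tiny negative values caused by rounding.
    se = [sqrt(max(cov[k][k], 0.0)) for k in range(2)]
    return (intercept, slope), se


# Hypothetical toy data: six estimates from three studies.
x = [0, 1, 2, 3, 4, 5]
y = [1.0, 2.9, 5.2, 6.8, 9.1, 11.0]
study = ["A", "A", "B", "B", "C", "C"]

(b0, b1), se = ols_cluster(x, y, study)
print(b0, b1, se)
```

Summing the scores within a study before forming the outer product is what lets estimates from the same study be arbitrarily correlated with each other.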
Since section 4.2 uses the set of variables which are significant at a 10% level as the simplest type of variable selection, we present those results in table 3.50.
Table 3.50: Multivariate analysis - OLS of 10%-significant variables
Variable var. coef. t p
Study: measuring points 97.4 −0.000 −1.54 0.123
Study: year of first measure 84.4 0.000 1.33 0.184
Study: size of first realized sample 58.8 −0.000 −4.99 0.000
Study: size of second realized sample 2.4 −0.001 −7.21 0.000
Study: maximum age in first sample 5.7 −0.006 −1.14 0.256
Study: tests of significance 9.4 −0.661 −2.77 0.006
Study: number of bivariate estimates 32.4 0.009 1.82 0.070
Study: user, tr 48.6 −0.783 −3.78 0.000
Study: publication, not dissertation or master thesis 2.3 0.163 0.42 0.673
Study: author, Simon Hakim 1.0 −0.534 −0.63 0.527
Study: author, Maynard I. Erickson 0.6 0.451 1.23 0.218
Study: author, Alex R. Piquero 1.3 0.181 0.57 0.566
Study: author, Switzerland 1.0 0.566 1.62 0.106
Study: author, Finland 0.8 −1.732 −4.95 0.000
Study: author, Sweden 0.9 0.950 1.82 0.069
Study: author, criminology 11.3 0.267 1.03 0.304
Study: publication, type not applicable 0.4 −1.971 −2.43 0.015
Study: experiment (field, institutional initiative) 1.6 −0.877 −1.24 0.215
Study: first population, Canada 4.3 0.769 3.62 0.000
Study: sample base, first population, complete country 36.9 −1.694 −1.14 0.255
Study: sample base, first population, partial country 37.4 −1.896 −1.28 0.200
Study: sample base, second population, complete country 2.4 −1.637 −2.71 0.007
Study: sample base, second population, partial country 2.2 −1.306 −1.53 0.127
Study: sample unit, first population, miscellaneous 7.4 0.270 0.80 0.423
Study: sample individuals, second population, population 2.6 2.257 2.89 0.004
Study: sample individuals, second population, miscellaneous 1.2 1.920 2.42 0.016
Study: PKS is public data base 1.2 0.323 0.95 0.341
Study: miscellaneous public data base 40.1 0.433 2.20 0.028
Study: no public data base 26.3 0.316 1.48 0.139
Study: no disadvantaged group 0.6 −0.734 −1.88 0.061
Study: percentage of convicted>75% 0.6 0.862 2.42 0.016
Study: main location>500000 inhabitants 3.5 −1.583 −2.01 0.045
Study: main location<5000 inhabitants 0.2 −2.453 −4.21 0.000
Study: claims to be representative 19.4 −0.140 −0.79 0.430
Study: does not check representativeness 26.9 0.478 3.01 0.003
Study: closed questions for pretest 21.5 1.100 4.15 0.000
Study: mixed questions for pretest 2.1 1.666 4.81 0.000
Study: Guttman reliability method 0.2 1.097 2.66 0.008
Study: miscellaneous reliability method 0.3 −2.030 −1.35 0.176
Study: correlational reliability method 0.2 −1.476 −4.96 0.000
Estimate: exogenous, survey, is no experiment 0.9 1.763 2.94 0.003
Estimate: exogenous, survey, probability of detection by police 7.1 −0.675 −3.26 0.001
Estimate: exogenous, survey, probability of punishment by justice 4.5 −0.654 −3.10 0.002
Estimate: exogenous, survey, probability of detection by friends or family 0.2 −1.330 −2.52 0.012
Estimate: exogenous, survey, probability of punishment by friends or family 1.6 −1.291 −4.52 0.000
Estimate: exogenous, survey, time between offense and clearance 0.1 −1.467 −2.88 0.004
Estimate: exogenous, experiment, yes 7.2 −0.688 −1.95 0.051
Estimate: exogenous, experiment, no 5.4 0.282 0.72 0.469
Estimate: exogenous, metric category 36.8 0.592 2.50 0.013
Estimate: exogenous, binary category 18.8 0.367 1.41 0.160
Estimate: exogenous, ordinal category 7.4 0.119 0.45 0.652
Estimate: exogenous, in differences 3.1 1.408 2.63 0.009
Estimate: endogenous, number of convicted to prison sentence 0.2 1.348 2.20 0.028
Estimate: endogenous, probability of delinquency of fictitious offense (surveyed is delinquent) 1.2 0.021 0.04 0.965
Estimate: endogenous, recidivism 0.7 1.730 3.63 0.000
Estimate: endogenous, self reported delinquency since age of fourteen 0.6 0.692 1.50 0.135
Estimate: crime category, misdemeanors 9.5 −0.111 −0.46 0.642
Estimate: offense, negligent assault 1.8 0.702 1.75 0.080
Estimate: offense, larceny (severe) 3.2 0.002 0.01 0.995
Estimate: offense, drug possession (hard) 0.5 −1.956 −1.75 0.080
Estimate: offense, drunk driving 12.1 −0.144 −0.66 0.507
Estimate: offense, other 7.1 −0.191 −0.74 0.461
Estimate: endogenous, ordinal category 4.6 −0.565 −1.99 0.047
Estimate: endogenous, binary category 9.6 −0.681 −2.60 0.009
Estimate: endogenous, not in logs 32.0 0.164 0.51 0.611
Estimate: endogenous, in logs 26.8 −0.516 −1.40 0.163
Estimate: endogenous, other transformation 7.8 −0.904 −1.57 0.118
Estimate: covariate, age 20.2 −0.067 −0.34 0.735
Estimate: covariate, marital status 5.6 −0.028 −0.10 0.920
Estimate: covariate, profession 0.4 0.090 0.15 0.882
Estimate: covariate, social class 1.5 −0.620 −0.67 0.505
Estimate: covariate, morality 1.6 0.019 0.07 0.940
Estimate: covariate, poverty, welfare 6.4 1.124 3.50 0.000
Estimate: covariate, risk propensity 0.8 0.664 1.79 0.073
Estimate: no correction for simultaneity 19.3 −0.539 −2.36 0.019
Estimate: square root of sample size for negative values 79.2 −0.013 −3.58 0.000
Estimate: square root of sample size for positive values 82.5 0.054 6.85 0.000
N = 6530, R² = 0.304, number of clusters is 663; all variables from table 3.49 which were significant at a 10% level are selected.
The column var. refers to the variation of a variable (i.e., the percentage of valid observations); the maximum variation for dummy variables is fifty percent. coef. and t are the coefficients and the corresponding (normalized) t-values of the included variables. The reference category for dummies is usually the opposite property or, in the case of multiple categories, the missing values.
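The selection step that leads from table 3.49 to table 3.50 amounts to a p-value filter followed by a second regression on the retained variables. The filter can be sketched as follows; the variable names and p-values are copied from a few rows of table 3.49, while the function name `select_significant` is hypothetical and the comparison uses the rounded p-values as printed.

```python
# (variable, p-value) pairs copied from selected rows of table 3.49.
full_ols = [
    ("Study: not explorative", 0.241),
    ("Study: year of first measure", 0.089),
    ("Study: time span in months", 0.150),
    ("Study: size of first realized sample", 0.002),
    ("Study: author, Isaac Ehrlich", 0.246),
    ("Study: first population, Canada", 0.001),
    ("Estimate: offense, burglary", 0.163),
]


def select_significant(rows, alpha=0.10):
    """Keep the variables whose p-value in the full regression is below alpha."""
    return [name for name, p in rows if p < alpha]


selected = select_significant(full_ols)
for name in selected:
    print(name)
```

Rows whose printed p-value sits exactly at the threshold cannot be classified from the rounded figures alone, so they are left out of this illustration.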
While most significant variables remain unchanged in terms of size and significance when proceeding from the large set to the smaller set, some variables change. The dummy for not being a dissertation or master thesis (which applies to almost all studies) changes from significantly negative to insignificantly positive. The indicator for Alex R. Piquero reverses its sign, while the signs of all other author indicators remain unchanged. This could be explained if studies by that author have some special properties which are not taken into account in the second regression. Severe larceny switches from negative significance to positive insignificance, while drunk driving and the dummy indicating the logarithm of the endogenous variable do exactly the opposite. Finally, the significance of most covariates is considerably reduced.
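The comparison between the two regressions can be made mechanical with a small sketch. The coefficient and p-value pairs are copied from tables 3.49 and 3.50; the function name `compare` and the use of a 10% threshold for the significance flags are choices made for this illustration.

```python
# name: ((coef, p) in table 3.49, (coef, p) in table 3.50)
rows = {
    "Study: publication, not a dissertation or master thesis":
        ((-1.929, 0.021), (0.163, 0.673)),
    "Study: author, Simon Hakim": ((-2.486, 0.035), (-0.534, 0.527)),
    "Study: author, Alex R. Piquero": ((-2.081, 0.052), (0.181, 0.566)),
    "Estimate: offense, larceny (severe)": ((-0.591, 0.059), (0.002, 0.995)),
    "Estimate: offense, drunk driving": ((0.638, 0.079), (-0.144, 0.507)),
    "Estimate: endogenous, in logs": ((1.285, 0.083), (-0.516, 0.163)),
}


def compare(rows, alpha=0.10):
    """Flag sign reversals and significance in the full and reduced regressions."""
    result = {}
    for name, ((c_full, p_full), (c_red, p_red)) in rows.items():
        result[name] = {
            "sign_flip": c_full * c_red < 0,
            "sig_full": p_full < alpha,
            "sig_reduced": p_red < alpha,
        }
    return result


for name, flags in compare(rows).items():
    print(name, flags)
```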
All in all, important factors correlated with support of the deterrence hypothesis are the economic background in general (represented by the user tr, who was responsible for all economic studies), Finnish studies, very large or small locations, studies which check the reliability of variables with correlations, studies which use the probability and severity of punishment (and the celerity) by officials or friends and family in surveys, as well as estimates which are not corrected for simultaneity.
The opposite effect can be found when Canadian data is studied, when “other” public data bases are used, when almost all of the studied individuals have been convicted before, when authors do not check representativeness, when closed or mixed questions are used in a pretest, when the exogenous variable is metric or measured in differences, when the deterrence variable relates to prison sentences or recidivism, and, finally, when covariates relating to poverty and welfare are included.
Last but not least, the technical influence of the sample size and of the size of the studied population (which strongly correlates with the sample size) has to be mentioned.
When the results are compared with the bivariate analysis in section 3.5, noteworthy changes are: German authors, when controlling for other effects, are now correlated with less support of the deterrence theory, while studies from Alex R. Piquero are now associated with more support in the larger set (table 3.49). The impact of many variables which measure deterrence in surveys and appeared to be associated with less support in section 3.5 is now reversed. The coefficients of the covariates age, marital status and social class switch their signs. Curiously, the correlations between the studied offenses and the resulting (normalized) t-values are rather incompatible with those from table 3.43.