
The purpose of our empirical approach is to model how the occurrence and timing of “publishing a creator-owned comic book for the first time in a career” are related to social influence and social status. As the statistical framework within which to test our hypotheses, we used discrete-time hazard models with a complementary log-log link function, random effects, and linear time dependence.

Because our data set records the career histories of comic creators aggregated yearly, time is measured in discrete steps. The unit of observation is a creator-year. In many cases, our career history data are censored, as a proportion of creators quit their careers without ever publishing a creator-owned book. Moreover, some creators might have published a creator-owned book after the data collection ended in 2014. Discrete-time hazard models are suitable for data with both characteristics, discretely measured time and censoring (Allison, 1982; Singer and Willett, 2003).
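The creator-year structure described above can be sketched as follows. This is a minimal illustration, not the procedure actually used to build the data set: the `Career` record, its field names, and the convention that tenure starts at 1 in the year of the first published comic book are assumptions made for the example. Only the 2014 end of data collection is taken from the text.

```python
from dataclasses import dataclass
from typing import Optional

LAST_OBSERVED_YEAR = 2014  # data collection ended in 2014 (stated in the text)


@dataclass
class Career:
    """Hypothetical career record; field names are illustrative only."""
    creator_id: str
    first_year: int                   # year of the first published comic book
    last_active_year: int             # last year with any observed activity
    first_owned_year: Optional[int]   # year of the first creator-owned book, if any


def to_creator_years(career):
    """Expand one career into creator-year rows up to the event or censoring."""
    end = min(career.last_active_year, LAST_OBSERVED_YEAR)
    if career.first_owned_year is not None:
        # A creator leaves the risk set once the event has occurred.
        end = min(end, career.first_owned_year)
    rows = []
    for year in range(career.first_year, end + 1):
        rows.append({
            "creator_id": career.creator_id,
            "year": year,
            "tenure": year - career.first_year + 1,  # assumed to start at 1
            "event": int(year == career.first_owned_year),
        })
    return rows


# Creator "a" experiences the event in the third career year.
rows_a = to_creator_years(Career("a", 2000, 2003, 2002))
# Creator "b" is still active when data collection ends and is censored.
rows_b = to_creator_years(Career("b", 2010, 2020, None))
```

In this sketch a censored creator simply contributes rows with `event` equal to 0 in every observed year, which is how the discrete-time setup uses the information that the event had not yet occurred.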

The key idea of discrete-time hazard models is to define an unobservable variable called hazard as the dependent variable that controls both the occurrence of the event and the length of time until the event occurs. Let $T_i$ be a discrete random variable denoting the unobserved time at which individual $i$ experiences the event. The discrete-time hazard is the conditional probability that individual $i$ experiences the event at time $t$, given that he or she did not experience the event in an earlier period:

$$\lambda_i(t) = \Pr(T_i = t \mid T_i \geq t).$$

In our analysis, this translates into the following definition: the hazard is the conditional probability that a creator publishes a creator-owned comic book in the $t$-th year of his or her career, given that the creator has never published a creator-owned comic book before and given the particular values of the explanatory variables in that period.
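Ignoring covariates, the hazard defined above can be estimated per period as the number of events divided by the size of the risk set. The following sketch shows this life-table calculation. The function name is hypothetical, and the single input row uses the tenure-2 figures reported in Table 5.4 (6,999 creators at risk, 468 transitions).

```python
def life_table(risk_set_sizes, event_counts):
    """Return per-period hazard estimates and the running survival probability."""
    hazards, survival = [], []
    surviving = 1.0
    for at_risk, events in zip(risk_set_sizes, event_counts):
        hazard = events / at_risk        # share of the risk set with an event
        surviving *= 1.0 - hazard        # probability of no event so far
        hazards.append(hazard)
        survival.append(surviving)
    return hazards, survival


# Tenure 2 from Table 5.4: 6,999 creators at risk, 468 transitions.
hazards, survival = life_table([6999], [468])
print(round(hazards[0], 2))  # 0.07
```

The rounded value agrees with the transition rate listed for tenure 2 in Table 5.4.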

The link function mathematically specifies how the hazard depends on time and the explanatory variables. In his landmark paper, Cox (1972) uses the logistic regression function, which remains a popular choice today because most researchers are familiar with its interpretation. However, the logit model is most suitable when the event itself happens in discrete time steps. It is no longer the best choice when events unfold in continuous time and we only observe their occurrence grouped or aggregated into discrete intervals. An analysis of grouped data based on the logit specification can lead to inferences that are sensitive to the choice of interval length (e.g., monthly vs. yearly) (Singer and Spilerman, 1976). Prentice (1978) derives the discrete-time hazard function

$$\lambda_i(t) = 1 - \exp[-\exp(x_{it}'\beta)]$$

that does not suffer from this problem. Solving this equation for the linear predictor yields the complementary log-log link function

$$\operatorname{cloglog}\lambda_i(t) = \log[-\log(1 - \lambda_i(t))] = x_{it}'\beta.$$

As we observe the event grouped in yearly time intervals, the complementary log-log link function is thus the appropriate choice for our models.
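The sensitivity to interval length can be illustrated numerically. The sketch below assumes a constant continuous-time event rate that a covariate multiplies by exp(beta); the rate, the value of beta, and all function names are hypothetical choices for the example and are not estimates from our data. Under this assumption, the difference between the two groups on the complementary log-log scale equals beta for monthly and for yearly grouping, whereas the difference on the logit scale changes with the interval width.

```python
import math


def cloglog(p):
    """Complementary log-log transform of a probability."""
    return math.log(-math.log(1.0 - p))


def logit(p):
    """Log-odds transform of a probability."""
    return math.log(p / (1.0 - p))


def grouped_hazard(rate, width):
    """Event probability within an interval under a constant continuous-time rate."""
    return 1.0 - math.exp(-rate * width)


base_rate = 0.05  # hypothetical events per year in the reference group
beta = 0.7        # hypothetical covariate effect on the log rate

results = {}
for label, width in (("monthly", 1.0 / 12.0), ("yearly", 1.0)):
    p0 = grouped_hazard(base_rate, width)
    p1 = grouped_hazard(base_rate * math.exp(beta), width)
    results[label] = {
        "cloglog_diff": cloglog(p1) - cloglog(p0),  # equals beta for any width
        "logit_diff": logit(p1) - logit(p0),        # depends on the width
    }

for label, diffs in results.items():
    print(label, diffs)
```

Changing the interval width only shifts the intercept on the complementary log-log scale, which is why the covariate effect is unaffected in this stylized setting.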

Another relevant specification decision is how to model the effect of time. We specified two types of time effects: a linear tenure effect and year fixed effects using a dummy variable specification. The usual starting point to model how tenure relates to the transition hazard is the so-called general specification, which would have implied using 39 dummy variables, one for each possible level of tenure in our sample. Since our data set is large (more than 30,000 creator-years) but sparse (fewer than 3,000 transitions to entrepreneurship), this general specification performs poorly for higher tenure values, where the risk set becomes small because only a few creators remain in the sample for so many years. Table 5.4 shows the decrease in the size of the risk set.

Figure 5.2 shows that the estimated rate of transitions to entrepreneurship varies regularly around an almost linear negative trend. We hence used the variable tenure, which counts the number of years since the respective creator published his or her first comic book, in a linear specification to control for the main effect of time. The transition rate per calendar year shows strong fluctuations that indicate macroeconomic or other industry-wide effects on the entrepreneurial entry rate. Figure 5.3 depicts these fluctuations. To control for these types of effects, we chose a dummy variable specification for the calendar years. In contrast to the general specification of the main effect of time (tenure), the dummy specification is feasible here, as we observe a sufficient number of transitions per calendar year.

Table 5.4.: Sample size and transitions to entrepreneurship by tenure

Tenure  Sample size  Transitions  Transition rate  Survival rate
2       6,999        468          0.07             0.93
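The two time specifications described above, a linear tenure term and calendar-year dummies, can be sketched as columns of a design matrix. The function name, the column labels, the reference year, and the 1990 to 2014 window are assumptions made for this illustration and are not taken from our estimation code.

```python
def time_columns(tenure, year, years, reference_year):
    """Linear tenure term plus one 0/1 indicator per non-reference calendar year."""
    columns = {"tenure": float(tenure)}
    for y in years:
        if y != reference_year:
            # The reference year is absorbed by the intercept.
            columns[f"year_{y}"] = 1.0 if y == year else 0.0
    return columns


years = list(range(1990, 2015))  # hypothetical observation window
row = time_columns(tenure=3, year=1992, years=years, reference_year=1990)
print(len(row))  # 25: one tenure column and 24 year indicators
```

Tenure enters with a single coefficient here, whereas the general specification would require one indicator per tenure level.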

A potential source of bias in our models is the unobserved heterogeneity among individuals. For example, some artists could come from a family with entrepreneurial role models and have a higher propensity toward entrepreneurship than others, which is not captured by the covariates. To address this potential source of bias, we included individual-level random effects in all our models. The random effects model can deal with the correlation between repeated observations (creator-years) obtained for the same individual. In that sense, the random effects account for the unobserved covariates or other forms of heterogeneity between individuals (cf. Scheike and Jensen, 1997). The implicit critical assumption of the random effects model is that the unobserved covariates are not correlated with the explanatory variables.
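One well-known way in which ignored heterogeneity can distort a hazard analysis is selection over time: individuals with a high propensity leave the risk set early, so the pooled hazard declines even if every individual hazard is constant. The sketch below illustrates this with two hypothetical groups. The hazard values and group shares are invented for the example, and the code does not reproduce the random effects estimator itself.

```python
def aggregate_hazard(hazards, shares, periods):
    """Pooled per-period hazard for a mixture of groups with constant hazards."""
    at_risk = list(shares)
    pooled = []
    for _ in range(periods):
        events = sum(n * h for n, h in zip(at_risk, hazards))
        pooled.append(events / sum(at_risk))
        # Members who experienced the event leave the risk set.
        at_risk = [n * (1.0 - h) for n, h in zip(at_risk, hazards)]
    return pooled


# Two equally sized hypothetical groups with constant hazards 0.20 and 0.05.
pooled = aggregate_hazard(hazards=[0.20, 0.05], shares=[0.5, 0.5], periods=5)
print([round(h, 3) for h in pooled])
```

The pooled hazard starts at 0.125 and falls in every subsequent period because the composition of the risk set shifts toward the low-hazard group.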

Figure 5.2.: Graph of the transition rates to entrepreneurship per tenure (x-axis: tenure; y-axis: estimated transition rate; legend: estimate, spline)

Figure 5.3.: Graph of the transition rates to entrepreneurship per year (x-axis: year; y-axis: estimated transition rate)

To test the proposed hypotheses using our data, we estimated a series of discrete-time event history models. We used the statistical programming language R version 3.3.1 (R Core Team, 2016) and the function glmer from the lme4 package version 1.1-15 (Bates et al., 2015) for all the estimations. We started with a baseline model comprising all the covariates and then consecutively added the explanatory variables measuring cohesion and equivalence to test for social influence, or award winning and centrality to test for status effects. We further varied the model specification and used alternative operationalizations to check the robustness of our results. In this chapter, we summarize the estimation results and provide tables as well as graphics to illustrate the empirical findings.