
CHAPTER 3: TECHNICAL EFFICIENCY AND ORGANIZATIONAL

3.3 METHODS AND DATA

3.3.1 Theoretical Model

3.3.1.1 Exponential stochastic frontier model

A stochastic production frontier is used to estimate firms’ technical efficiency. In particular, we assume that $N$ firms can produce output $y$ by using a vector of inputs $\mathbf{x} \in \mathbb{R}^{K}_{+}$. The production frontier model (in logarithmic terms) can be written in the following way:

$y_i = \mathbf{x}_i'\boldsymbol{\beta} + v_i - u_i$, (1)

where $y_i$ is the logarithm of the output of firm $i$, $\mathbf{x}_i$ is a vector of the logarithms of inputs, $\boldsymbol{\beta}$ is a vector of parameters to be estimated, $v_i$ is a two-sided symmetric error term that accounts for white noise, and $u_i$ is a non-negative one-sided error component that measures inefficiency. The output is specified as an output index; the vector of inputs $\mathbf{x}_i$ consists of transport distance, labor (number of employees) and total production capacity. While the two-sided error term $v_i$ is assumed to follow a Normal distribution with zero mean and variance $\sigma_v^2$, we assume an exponential distribution for the inefficiency component $u_i$ with rate parameter $\lambda_i$:

$u_i \sim \mathrm{Exp}(\lambda_i)$ (2)

Technical efficiency (TE) estimates, which are bounded on the unit interval, can be obtained by taking the expectation of $e^{-u_i}$. However, since the objective of this study is to examine not only the efficiency levels of dairy processing firms but also the determinants of their inefficiency, the rate parameter $\lambda_i$ is expressed as a function of firm-management characteristics as follows:

$\lambda_i = e^{\mathbf{z}_i'\boldsymbol{\delta}}$ (3)

where 𝐳 is a vector of potential determinants of technical inefficiency, and 𝛅 is the L × 1 vector of parameters to be estimated.
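To make the role of the determinants concrete, the following worked equations spell out what equations (2) and (3) imply when the exponential distribution is parameterized by its rate. This is an illustrative derivation rather than part of the original exposition. The last line gives the expectation of $e^{-u_i}$ before conditioning on the data, whereas firm-specific TE estimates in a Bayesian setting would normally be posterior expectations. The index $l$ for a single continuous element of $\mathbf{z}_i$ is notation introduced here.

```latex
\begin{align*}
\mathrm{E}[u_i \mid \mathbf{z}_i] &= \frac{1}{\lambda_i} = e^{-\mathbf{z}_i'\boldsymbol{\delta}}, \\
\frac{\partial \ln \mathrm{E}[u_i \mid \mathbf{z}_i]}{\partial z_{il}} &= -\delta_l, \\
\mathrm{E}\!\left[e^{-u_i} \mid \mathbf{z}_i\right] &= \int_0^\infty e^{-u}\,\lambda_i e^{-\lambda_i u}\,du = \frac{\lambda_i}{1+\lambda_i}.
\end{align*}
```

Under this parameterization, a positive element of $\boldsymbol{\delta}$ raises the rate $\lambda_i$ and therefore lowers expected inefficiency.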

3.3.1.2 Bayesian inference

We use Bayesian techniques to estimate the model in equations (1)-(3) (van den Broeck et al., 1994). Bayesian methods are particularly useful in stochastic frontier analysis because latent variables, such as the inefficiency component, can be integrated out of the likelihood using the simulation-based method of data augmentation (rather than the numerical integration that frequentist methods rely on). All parameters to be estimated are collected in the vector $\boldsymbol{\theta} = [\boldsymbol{\beta}', \sigma_v^2, \boldsymbol{\delta}']'$. The posterior distribution of the model is then written as:

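$\pi(\boldsymbol{\theta}, \{u_i\} \mid \mathbf{y}, \mathbf{X}, \mathbf{Z}) \propto p(\mathbf{y}, \{u_i\} \mid \boldsymbol{\theta}, \mathbf{X}, \mathbf{Z}) \times p(\boldsymbol{\theta})$ (4)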

where $p(\mathbf{y}, \{u_i\} \mid \boldsymbol{\theta}, \mathbf{X}, \mathbf{Z})$ is the complete data likelihood of the model, $\mathbf{Z}$ is the matrix of covariates in equation (3), and $p(\boldsymbol{\theta})$ is the prior density of the parameters to be estimated. The complete data likelihood consists of two terms: (i) the probability density function (pdf) of the Normal distribution, which follows from the normality assumption on the error term $v_i$, and (ii) the pdf of the exponential distribution that is assumed for the inefficiency component $u_i$. The prior density includes three terms: two multivariate Normal densities for the parameter vectors $\boldsymbol{\beta}$ and $\boldsymbol{\delta}$, with prior means set equal to zero and diagonal covariance matrices with a value of 1000 on the diagonal entries, and an inverse-Gamma density for the variance parameter $\sigma_v^2$, with the shape and scale hyper-parameters set equal to 0.001.

The model's parameters are estimated using Markov Chain Monte Carlo (MCMC) simulation, with the latent variable $u_i$ being integrated out of the posterior using data augmentation. Gibbs sampling is used to sample from the full conditionals of $\boldsymbol{\beta}$ and $\sigma_v^2$, since their priors are conjugate, while Metropolis-Hastings updates are used for $\boldsymbol{\delta}$, since its full conditional does not belong to any known distributional family.
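The sampling scheme described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study. The function names (`draw_u`, `log_kernel_delta`, `sample_frontier`), the starting values, the random-walk proposal with step size `step`, the burn-in length and the inverse-CDF draw from the truncated Normal are all choices made here for illustration. The text specifies only that Metropolis-Hastings is used, not which proposal. The full conditional of $u_i$ used below, a Normal with mean $-(y_i - \mathbf{x}_i'\boldsymbol{\beta}) - \sigma_v^2\lambda_i$ and variance $\sigma_v^2$ truncated at zero, follows from completing the square in the complete data likelihood.

```python
import math
from statistics import NormalDist

import numpy as np

STD_NORMAL = NormalDist()


def draw_u(mu, sigma, rng):
    """Draw u_i from N(mu_i, sigma^2) truncated to u_i >= 0 (inverse-CDF method)."""
    out = np.empty(len(mu))
    for i, m in enumerate(mu):
        a = -m / sigma                              # standardized lower bound
        tail = 0.5 * math.erfc(a / math.sqrt(2.0))  # P(Z >= a) for standard Normal Z
        # Clamp so that inv_cdf always receives a probability strictly inside (0, 1).
        p = min(max(rng.random() * tail, 1e-300), 1.0 - 1e-12)
        out[i] = m - sigma * STD_NORMAL.inv_cdf(p)  # Z = -inv_cdf(p) lies above a
    return np.maximum(out, 1e-12)                   # numerical floor in extreme tails


def log_kernel_delta(delta, Z, u, prior_var):
    """Log full conditional of delta up to a constant (exponential pdf plus Normal prior)."""
    eta = Z @ delta                                 # log of the rate lambda_i
    return np.sum(eta - np.exp(eta) * u) - delta @ delta / (2.0 * prior_var)


def sample_frontier(y, X, Z, n_iter=2000, burn=500, step=0.1,
                    prior_var=1000.0, a0=0.001, b0=0.001, seed=0):
    """Gibbs sampler with data augmentation and a Metropolis-Hastings step for delta."""
    rng = np.random.default_rng(seed)
    N, K = X.shape
    L = Z.shape[1]
    keep = n_iter - burn
    beta, delta, sigma2 = np.zeros(K), np.zeros(L), 1.0   # assumed starting values
    u = np.full(N, 0.1)
    out = {"beta": np.empty((keep, K)), "delta": np.empty((keep, L)),
           "sigma2": np.empty(keep), "te": np.zeros(N)}
    XtX = X.T @ X
    for it in range(n_iter):
        # Gibbs step: beta | rest is multivariate Normal (conjugate prior).
        V = np.linalg.inv(XtX / sigma2 + np.eye(K) / prior_var)
        mean = V @ (X.T @ (y + u)) / sigma2
        beta = mean + np.linalg.cholesky(V) @ rng.standard_normal(K)
        # Gibbs step: sigma2 | rest is inverse-Gamma (shape, scale).
        resid = y - X @ beta + u
        sigma2 = 1.0 / rng.gamma(a0 + N / 2.0, 1.0 / (b0 + 0.5 * resid @ resid))
        # Data augmentation: u_i | rest is Normal truncated at zero.
        lam = np.exp(Z @ delta)
        u = draw_u(-(y - X @ beta) - sigma2 * lam, math.sqrt(sigma2), rng)
        # Random-walk Metropolis-Hastings step for delta.
        prop = delta + step * rng.standard_normal(L)
        log_ratio = (log_kernel_delta(prop, Z, u, prior_var)
                     - log_kernel_delta(delta, Z, u, prior_var))
        if math.log(1.0 - rng.random()) < log_ratio:
            delta = prop
        if it >= burn:                              # store post-burn-in draws only
            j = it - burn
            out["beta"][j], out["delta"][j], out["sigma2"][j] = beta, delta, sigma2
            out["te"] += np.exp(-u) / keep          # running mean of exp(-u_i)
    return out
```

Here `out["te"]` approximates the posterior mean of $e^{-u_i}$ for each firm. In practice the proposal step size would need tuning and convergence would need to be checked before interpreting any draws.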

3.3.2 Data and Variable Construction

Southern Brazil is today the largest dairy-producing region in the country, and Paraná is the third-largest dairy state in Brazil, producing 4.7 bn liters, or 14% of the national production of 33.6 bn liters, in 2016. In 2009, the Paraná Economic and Social Development Institute (IPARDES) conducted a census to gather information from the state's dairy processing companies. The census covered 301 units, corresponding at the time to 96% of the companies (total population) and 83% of the processed volume in the state. The questionnaires contained information on the characteristics of companies, the origin and quality control of raw milk, the technological structure, management practices, institutional choice, policy support, etc. For the purpose of this analysis we retained only 243 companies, comprising 35 cooperatives and 208 investor-owned firms (IOFs). The remaining 58 companies were excluded from the sample because of excessive missing values on the variables of interest, or because of unreasonable or abnormal values that were likely caused by data-entry errors. In the production frontier model we specify one output (the output index) and three inputs (total processing capacity, labor and transport).

The output index specified in equation (1) was not observed directly and was therefore constructed from other known variables. We used the monthly mean volume of milk collected over the last 12 months, the share of each specific product produced by the company as available in the dataset, the volume of milk necessary to produce each specific product, and the respective product prices in that year. The output index is expressed in Brazilian currency (R$ 1,000). Since the raw milk volumes were used to calculate the output index, we did not include this variable as an input, in order to avoid endogeneity.
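One way to read this construction is sketched below. It is an interpretation rather than the formula actually used in the study: it assumes the collected milk is split across products according to their shares, converted into product units using the milk requirement per unit, and valued at the product price. The function name `output_index`, the product names and all numbers are hypothetical.

```python
def output_index(monthly_milk_liters, product_shares, liters_per_unit, unit_prices):
    """Monthly output value in R$ 1,000 under the assumptions stated above."""
    total_value = 0.0
    for product, share in product_shares.items():
        # Milk allocated to this product, converted into units of the product.
        units = monthly_milk_liters * share / liters_per_unit[product]
        total_value += units * unit_prices[product]   # value in R$
    return total_value / 1000.0                        # express in R$ 1,000


# Hypothetical firm: 100,000 liters per month, 60% to UHT milk and 40% to cheese.
example = output_index(
    monthly_milk_liters=100_000,
    product_shares={"uht_milk": 0.6, "cheese": 0.4},
    liters_per_unit={"uht_milk": 1.0, "cheese": 10.0},  # liters of milk per liter or kg
    unit_prices={"uht_milk": 1.5, "cheese": 12.0},      # R$ per liter or kg
)
```

With these made-up figures the index is about 138 (R$ 1,000): 90 from UHT milk and 48 from cheese.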

The inputs that are specified in the X vector in equation (1) are the following: (i) “Total capacity of processing” represents the full capacity of the processing plants in liters per month. It can also be viewed as a proxy for capital. (ii) “Labor” represents the total number of employees in the company. Labor use can be very intensive in some sectors of developing countries, which have a large number of small enterprises (Tybout, 2000). (iii) “Transport” represents the maximum distance (in km) that each company has to travel in order to collect milk from the farthest farmer. This variable is included because the dairy sector is very demanding in terms of transport: milk, as a perishable product, has to be collected frequently (Frenken, 2014), which means every two days in most cases. Companies processing milk exclusively from their own herd are assigned a value of 0.01. Among the variables available in the dataset, these inputs best represented the production function. Table 3-1 provides descriptive statistics for the output and input variables. It shows that cooperatives have, on average, higher values for the output and the three input variables.

Table 3-1: Descriptive Statistics of the Output and Inputs

Variable                     Unit          Full sample (n=243)         Cooperatives (n=35)      IOFs (n=208)
                                           Mean (Std. Dev.)            Mean (Std. Dev.)         Mean (Std. Dev.)

Frontier
Output                       1000 R$       684,628.50 (2,268,836.00)   1,685,719 (5,255,695)    516,175.70 (1,132,705.00)
Transport                    Km            67.78 (99.58)               82.09 (184.71)           65.37 (77.13)
Labor                        Persons       24.96 (51.76)               38.77 (99.1)             22.63 (38.48)
Total capacity               1000 liters   1,018,322 (2,454,208)       1,754,338 (4,623,248)    894,473 (1,850,341)

Technical efficiency model
Used capacity                %             0.5 (0.23)                  0.48 (0.25)              0.5 (0.23)
Cooperatives                 Dummy         0.14 (0.35)
Different payment criteria   Dummy         0.83 (0.38)                 0.6 (0.5)                0.87 (0.34)
Type of inspection service   Categorical   2.06 (0.76)                 1.8 (0.8)                2.11 (0.74)

Source: Own Calculations.

Finally, the z vector in equation (3) includes four variables: (i) “used capacity of the plant”, defined as used capacity as a share of total capacity; (ii) a categorical variable for the “type of inspection service” adopted (SIM, SIE or SIF), where SIF is the most restrictive and rigorous (77 companies in the sample), SIM is the least restrictive (62 companies) and SIE lies in between (104 companies); (iii) a dummy indicating whether the company uses any “different criteria of payment”, that is, criteria other than the volume of milk; and (iv) a dummy for “cooperatives”. Other potentially relevant drivers of efficiency, such as firm age and origin of capital, were tested but did not yield meaningful results and are therefore not reported.

Regarding size, the questionnaire asked companies to declare their annual turnover category. The frequency of companies in the six size categories is shown in Table 3-2.

Table 3-2: Size categories of companies in the sample

Category size                            Frequency   Percent
Below R$ 360,000                         104         42.8
From R$ 360,001 to R$ 1,200,000          60          24.7
From R$ 1,200,001 to R$ 2,400,000        25          10.3
From R$ 2,400,001 to R$ 10,500,000       33          13.6
From R$ 10,500,001 to R$ 60,000,000      11          4.53
Above R$ 60,000,001                      6           2.47

Source: Own Calculations.

There has been an effect on the producer cooperatives. As noted above, the central cooperatives used to dominate the pasteurized milk segment, and they have been the most affected by these changes. All cooperatives currently produce UHT, even very small ones with scale disadvantages. However, the pasteurized milk was mainly sold by co-ops that were protected from competition because, with pasteurized milk being more perishable and requiring cooling storage and transportation, they were able to dominate their local catchment area. Nestlé and Danone have never sold pasteurized milk.

Of the 243 companies in the sample, four declined to declare their size. As expected, a large share of the companies (more than 65%) fall into the first two categories and are classified as small.

In this study, the production function is specified in a Cobb-Douglas functional form.
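For reference, with the three inputs described in this section, a Cobb-Douglas specification of equation (1) would take a form like the one below. The presence of an intercept $\beta_0$ and the ordering and labeling of the coefficients are assumptions made here for illustration.

```latex
\begin{equation*}
\ln y_i = \beta_0 + \beta_1 \ln(\mathrm{transport}_i) + \beta_2 \ln(\mathrm{labor}_i)
        + \beta_3 \ln(\mathrm{capacity}_i) + v_i - u_i
\end{equation*}
```

In this form each slope coefficient can be read as the output elasticity of the corresponding input.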