Generalized linear models - Childhood mortality in Nigeria


4.1.1 Generalized linear models

A common way to build regression models that extend the classical linear model for Gaussian responses to more general situations, such as binary responses or count data, is provided by generalized linear models, originally introduced by Nelder & Wedderburn (1972) (for more comprehensive overviews see Fahrmeir & Tutz (2001) or McCullagh & Nelder (1989)). In these models the influence of covariates u on a response variable y is assumed to satisfy the following two assumptions:

Distributional assumption

Conditional on covariates u_i, the responses y_i are independent and the distribution of y_i belongs to a simple exponential family, i.e. its density can be written as

$$f(y_i \mid \theta_i, \phi, \omega_i) = \exp\left( \frac{y_i\theta_i - b(\theta_i)}{\phi}\,\omega_i + c(y_i, \phi, \omega_i) \right), \quad i = 1, \dots, n, \tag{4.1}$$

where

θ_i is the natural parameter of the exponential family (see below), φ is a scale or dispersion parameter common to all observations, ω_i is a weight, and b(·) and c(·) are functions depending on the specific exponential family.
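As an illustration of the exponential family form (4.1), the following Python sketch checks numerically that the Poisson probability function can be written in this form. The components used here, θ = log(λ), b(θ) = exp(θ), φ = 1, ω = 1 and c(y, φ, ω) = −log(y!), are the standard Poisson choices; the function names are hypothetical and serve only this example.

```python
import math

def exp_family_density(y, theta, phi, omega, b, c):
    """Density of form (4.1): exp((y*theta - b(theta)) / phi * omega + c(y, phi, omega))."""
    return math.exp((y * theta - b(theta)) / phi * omega + c(y, phi, omega))

def poisson_b(theta):
    # b(theta) = exp(theta) for the Poisson distribution
    return math.exp(theta)

def poisson_c(y, phi, omega):
    # c(y, phi, omega) = -log(y!), using log(y!) = lgamma(y + 1)
    return -math.lgamma(y + 1)

def poisson_pmf(y, lam):
    """Textbook Poisson probability function, used for comparison."""
    return lam ** y * math.exp(-lam) / math.factorial(y)

lam = 2.5  # arbitrary illustrative rate
for y in range(6):
    via_family = exp_family_density(y, math.log(lam), 1.0, 1.0, poisson_b, poisson_c)
    print(y, via_family, poisson_pmf(y, lam))
```

Both columns of the printed output should agree up to floating point error, since exp(y log(λ) − λ − log(y!)) is exactly the Poisson probability function.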

Structural assumption

The (conditional) expectation E(y_i|u_i) = µ_i is linked to the linear predictor

$$\eta_i = u_i'\gamma \tag{4.2}$$

via

$$\mu_i = h(\eta_i) \quad \text{or} \quad \eta_i = g(\mu_i),$$

where h is a smooth, bijective response function, g is the inverse of h, called the link function, and γ is a vector of unknown regression coefficients.

Both assumptions are connected by the fact that the mean of y_i is also determined by the distributional assumption and can be shown to be given by

$$\mu_i = b'(\theta_i) = \frac{\partial b(\theta_i)}{\partial \theta}.$$

Therefore, the natural parameter can be expressed as a function of the mean, i. e. θi = θ(µ_i). In contrast to the classical linear model, the variance of y_i in general also depends on the linear predictor since

$$\mathrm{var}(y_i \mid u_i) = \sigma^2(\mu_i) = \frac{\phi\, v(\mu_i)}{\omega_i}$$

with v(µ_i) = b''(θ_i) being the variance function of the underlying exponential family. In Table 4.1 the natural parameter, the expectation, the variance function and the scale parameter are listed for the most commonly used exponential families.

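| Distribution     | θ(µ)         | b'(θ)                   | v(µ)   | φ   |
|------------------|--------------|-------------------------|--------|-----|
| Normal N(µ, σ²)  | µ            | µ = θ                   | 1      | σ²  |
| Gamma Ga(µ, ν)   | −1/µ         | µ = −1/θ                | µ²     | ν⁻¹ |
| Poisson Po(λ)    | log(λ)       | λ = exp(θ)              | λ      | 1   |
| Binomial B(n, π) | log(π/(1−π)) | π = exp(θ)/(1 + exp(θ)) | π(1−π) | 1   |

Table 4.1: Components of the most commonly used exponential families.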
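The relations µ = b'(θ) and v(µ) = b''(θ) can be checked numerically. The Python sketch below approximates both derivatives by central finite differences for the Poisson case, b(θ) = exp(θ), and for the binary case, b(θ) = log(1 + exp(θ)), and compares them with the corresponding entries of Table 4.1. The function b for the binary case is not stated in the text and is supplied here as the standard choice; the step sizes and helper names are assumptions made for this illustration.

```python
import math

def first_derivative(f, x, h=1e-5):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def b_poisson(theta):
    return math.exp(theta)

def b_binary(theta):
    return math.log1p(math.exp(theta))

theta = 0.7  # arbitrary value of the natural parameter

# Poisson: expectation and variance function both equal lambda = exp(theta)
lam = math.exp(theta)
poisson_mean = first_derivative(b_poisson, theta)
poisson_var = second_derivative(b_poisson, theta)

# Binary: expectation pi = exp(theta) / (1 + exp(theta)), variance function pi * (1 - pi)
pi = math.exp(theta) / (1.0 + math.exp(theta))
binary_mean = first_derivative(b_binary, theta)
binary_var = second_derivative(b_binary, theta)

# Absolute deviations from the table entries; all should be close to zero
print(abs(poisson_mean - lam), abs(poisson_var - lam))
print(abs(binary_mean - pi), abs(binary_var - pi * (1.0 - pi)))
```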

For a given response distribution, different response functions are used in practice depending on the specific application. Some examples will be discussed in the following subsections. One particularly important special case is the so-called natural (or canonical) link function obtained from

θ_i =θ(µ_i) =η_i,

where the natural parameter is linked directly to the linear predictor. This choice is frequently used as the default link function since it results in simpler estimation equations (although other choices may be more appropriate in some situations as we will see later on).

4.1.1.1 Models for continuous responses

Normal distribution

The classical linear model can be subsumed into the context of generalized linear models by defining h(η) = η, i. e. the response function is simply the identity. For Gaussian distributed responses this also represents the natural link function. The variance function v(µ) is constant, while the scale parameter equals the variance of the error terms of the linear regression model (see also Table 4.1).

Gamma distribution

If the response values are all nonnegative, the normal distribution in combination with the identity link is often not adequate for an appropriate analysis. Although lognormal models, where the identity link is replaced by the log link, are frequently used in practice, a more natural choice would be a distribution whose support is ℝ₊ by definition. In addition, choosing an appropriate response distribution also makes it possible to account for the fact that nonnegative responses usually follow a skewed, asymmetric distribution. One member of the class of exponential families allowing for both properties is the gamma distribution. Here, the natural response function is given by the negative reciprocal

h(η) =−η⁻¹ =µ.

This response function is, however, only rarely used in practice since it does not ensure the nonnegativity of the expectation. Instead, the log link

g(µ) = log(µ) = η

or, equivalently, the exponential response function

h(η) = exp(η) = µ

is the most common choice when analyzing gamma distributed responses.
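The following Python sketch illustrates why the natural response function is problematic for gamma responses: for positive values of the linear predictor it yields a negative expectation, whereas the exponential response function is positive for every η. The values of η are arbitrary illustrations, and the function names are hypothetical.

```python
import math

def reciprocal_response(eta):
    """Natural response function of the gamma model: mu = -1 / eta."""
    return -1.0 / eta

def exp_response(eta):
    """Exponential response function belonging to the log link: mu = exp(eta)."""
    return math.exp(eta)

etas = [-2.0, -0.5, 0.5, 2.0]  # arbitrary values of the linear predictor

mu_natural = [reciprocal_response(eta) for eta in etas]
mu_log_link = [exp_response(eta) for eta in etas]

print(mu_natural)   # negative for eta > 0, so not a valid gamma expectation
print(mu_log_link)  # strictly positive for all eta
```

With the natural response function the linear predictor would have to be restricted to negative values, which in turn restricts the regression coefficients; the log link avoids this restriction.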

4.1.1.2 Models for count data

A regression model for the analysis of count data can be derived under the assumption of Poisson distributed responses. In this case the natural response function is given by the exponential

h(η) = exp(η) = µ,

and the natural link function is the natural logarithm

g(µ) = log(µ) = η.

Therefore the present model is also referred to as a loglinear model. Note that in contrast to normal and gamma models the scale parameter is fixed at φ= 1 for Poisson data.
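Estimation is not discussed at this point of the text. Purely as an illustration, the following Python sketch fits a loglinear Poisson model with an intercept and one covariate by Fisher scoring, which for the natural link coincides with Newton's method. The data, the starting values and all names are hypothetical; the score vector Σ u_i (y_i − µ_i) and the Fisher information Σ µ_i u_i u_i' are the standard expressions for the Poisson model with log link and unit weights.

```python
import math

def fit_poisson_loglinear(x, y, n_iter=50, tol=1e-12):
    """Fisher scoring for eta_i = gamma0 + gamma1 * x_i with mu_i = exp(eta_i)."""
    # Start at the intercept-only fit
    gamma0, gamma1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        mu = [math.exp(gamma0 + gamma1 * xi) for xi in x]
        # Score vector: sum of u_i * (y_i - mu_i) with u_i = (1, x_i)'
        s0 = sum(yi - mi for yi, mi in zip(y, mu))
        s1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Fisher information: sum of mu_i * u_i u_i'
        a = sum(mu)
        b = sum(xi * mi for xi, mi in zip(x, mu))
        d = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = a * d - b * b
        # Solve the 2x2 system (information * step = score) in closed form
        step0 = (d * s0 - b * s1) / det
        step1 = (a * s1 - b * s0) / det
        gamma0, gamma1 = gamma0 + step0, gamma1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    return gamma0, gamma1

# Hypothetical count data for illustration only
x_obs = [0.0, 1.0, 2.0, 3.0, 4.0]
y_obs = [1, 2, 4, 6, 12]

gamma0, gamma1 = fit_poisson_loglinear(x_obs, y_obs)
print(gamma0, gamma1)
# Under the log link, a unit increase in x multiplies the expected count by exp(gamma1)
print(math.exp(gamma1))
```

At convergence the score equations are satisfied, so the fitted means reproduce both the sum of the observed counts and the sum of the covariate-weighted counts.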

4.1.1.3 Models for binary and binomial responses

For binary responses yi ∈ {0,1} the expectation is given by the probability π = P(y = 1), which requires appropriate response functions to ensure π ∈ [0,1]. Obviously, any cumulative distribution function satisfies this condition and different model formulations are obtained for different choices of the distribution function. In any case, the scale parameter is again fixed at φ= 1.

Logit model

When choosing the natural link function

$$g(\pi) = \log\left(\frac{\pi}{1-\pi}\right) = \eta,$$

the logit model is obtained, which corresponds to the logistic distribution function as response function:

$$h(\eta) = \frac{\exp(\eta)}{1+\exp(\eta)} = \pi.$$

The logistic distribution function is symmetric and has somewhat heavier tails than the standard normal distribution function used in probit models. Due to the intuitive interpretation of the regression coefficients based on odds and odds ratios, the logit model is most commonly used when analyzing binary data, especially in medical applications.
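The interpretation in terms of odds ratios can be made concrete with a short Python sketch. In the logit model the odds π/(1−π) equal exp(η), so increasing a covariate by one unit multiplies the odds by the exponential of its coefficient, regardless of the starting value. The coefficients and the covariate value below are hypothetical.

```python
import math

def logit_response(eta):
    """Logistic response function h(eta) = exp(eta) / (1 + exp(eta))."""
    return math.exp(eta) / (1.0 + math.exp(eta))

def odds(pi):
    """Odds pi / (1 - pi) of the event y = 1."""
    return pi / (1.0 - pi)

gamma0, gamma1 = -1.0, 0.8  # hypothetical intercept and slope
u = 2.0                     # hypothetical covariate value

odds_before = odds(logit_response(gamma0 + gamma1 * u))
odds_after = odds(logit_response(gamma0 + gamma1 * (u + 1.0)))
odds_ratio = odds_after / odds_before

# The odds ratio for a unit increase in u equals exp(gamma1)
print(odds_ratio, math.exp(gamma1))
```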

Probit model

In the probit model the logistic distribution function is replaced by the standard normal distribution function. This results in somewhat lighter tails while retaining symmetry.

Since for the probit model the evaluation of the likelihood is computationally more demanding and parameter estimates are not interpretable in terms of odds or odds ratios, the logit model is often preferred in practice. Exceptions are fully Bayesian generalized linear models estimated by Markov chain Monte Carlo simulation, where probit models are represented via Gaussian distributed latent variables, allowing for simple Gibbs sampling updates (see also the discussion of latent variable approaches in Section 11.5 and Albert & Chib (1993) for a description of fully Bayesian inference in probit models).

Complementary log-log model

An asymmetric binary regression model is obtained with the extreme minimal value distribution function

h(η) = 1−exp(−exp(η)) =π.

This model is also called the complementary log-log model since it results in the link function

g(π) = log(−log(1−π)) = η.

Though being less frequently used in the analysis of originally binary data, it is more commonly applied in discrete time survival models since it can be interpreted as a grouped proportional hazards model (see Fahrmeir & Tutz (2001), Ch. 9 and Section 14.4.2).
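The symmetry properties of the three response functions can be compared numerically. The Python sketch below evaluates h(η) + h(−η) − 1, which is zero for a symmetric distribution function, for the logit, probit and complementary log-log models, and checks that the complementary log-log link inverts its response function. The standard normal distribution function is computed from the error function; the evaluation point and all names are chosen for illustration only.

```python
import math

def logit_response(eta):
    return math.exp(eta) / (1.0 + math.exp(eta))

def probit_response(eta):
    # Standard normal distribution function expressed via the error function
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))

def cloglog_response(eta):
    return 1.0 - math.exp(-math.exp(eta))

def cloglog_link(pi):
    return math.log(-math.log(1.0 - pi))

eta = 1.3  # arbitrary evaluation point
asymmetry = {
    "logit": logit_response(eta) + logit_response(-eta) - 1.0,
    "probit": probit_response(eta) + probit_response(-eta) - 1.0,
    "cloglog": cloglog_response(eta) + cloglog_response(-eta) - 1.0,
}
for name, value in asymmetry.items():
    print(name, value)

# The link function recovers the linear predictor from the probability
print(cloglog_link(cloglog_response(0.4)))
```

Only the complementary log-log model shows a clearly nonzero value; it approaches one faster than it approaches zero.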

Binomial responses

To model binomial responses y_i ∼ B(n_i, π_i), exactly the same models as discussed for binary responses can be used by replacing y_i with ȳ_i = y_i/n_i and introducing weights ω_i = n_i, i = 1, . . . , n. In this formulation the expectation is also given by π_i = E(ȳ_i) and the binomial distribution can easily be subsumed in the exponential family framework.
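The weighted formulation can be verified with a small Python sketch. With φ = 1, weight ω = n, natural parameter θ = log(π/(1−π)) and b(θ) = log(1 + exp(θ)), the exponential family form applied to the relative frequency ȳ = y/n reproduces the binomial log-probability, where the term c collects the logarithm of the binomial coefficient. The expressions for b and c are the standard ones and are not given explicitly in the text; the numerical values and function names are hypothetical.

```python
import math

def binomial_logpmf(y, n, pi):
    """Log-probability of y successes in n trials with success probability pi."""
    return math.log(math.comb(n, y)) + y * math.log(pi) + (n - y) * math.log(1.0 - pi)

def scaled_binomial_logdensity(ybar, n, theta):
    """Exponential family form with phi = 1, omega = n and b(theta) = log(1 + exp(theta))."""
    y = round(ybar * n)             # recover the count from the relative frequency
    c = math.log(math.comb(n, y))   # c(ybar, phi, omega): log binomial coefficient
    return (ybar * theta - math.log1p(math.exp(theta))) * n + c

n, y, pi = 10, 3, 0.35              # hypothetical values
theta = math.log(pi / (1.0 - pi))   # natural parameter (logit of pi)

direct = binomial_logpmf(y, n, pi)
via_family = scaled_binomial_logdensity(y / n, n, theta)
print(direct, via_family)
```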

In the document: Mixed model based inference in structured additive regression (pages 31-34)