
3.2 Basic Concepts in Statistics


The following gives a short introduction to the basic concepts of statistics; see [81, 100, 145] for more information.

3.2.1 Error Types

This entire chapter would not have been necessary if it were not for the fact that any observation will always contain errors. These errors are traditionally grouped into three categories: random errors, systematic errors, and blunders.

Blunders, or outliers, as they are customarily called in computer vision, are gross errors generally due not to the observed process or variable, but to the observer. If at all possible, they should be removed from the set of observations. How to reliably classify outliers is unfortunately still an open question; the reader is referred to [46, 67, 140, 151] for examples from nearly 20 years of outlier removal in computer vision. Particularly en vogue currently is once more a method called RANSAC (Random Sample Consensus), which was introduced in 1981 by Fischler and Bolles [46]; a sketch of its core idea follows below.
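Since RANSAC is only named here and not detailed, the following is a minimal illustrative sketch of its core idea for 2D line fitting, assuming NumPy; the function name, iteration count, and inlier tolerance are hypothetical choices, not part of the original text.

```python
import numpy as np

def ransac_line(pts, n_iter=200, tol=0.1, rng=None):
    """Return the boolean inlier mask of the best line found by RANSAC."""
    pts = np.asarray(pts, dtype=float)
    rng = rng if rng is not None else np.random.default_rng(0)
    best = np.zeros(len(pts), dtype=bool)
    for _ in range(n_iter):
        # 1. Draw a minimal sample: two distinct points define a line.
        p, q = pts[rng.choice(len(pts), size=2, replace=False)]
        if np.allclose(p, q):
            continue
        # 2. Distance of every point to the candidate line through p and q.
        n = np.array([-(q - p)[1], (q - p)[0]])
        n /= np.linalg.norm(n)          # unit normal of the candidate line
        dist = np.abs((pts - p) @ n)
        # 3. Keep the candidate with the largest consensus set of inliers.
        inliers = dist < tol
        if inliers.sum() > best.sum():
            best = inliers
    return best                          # refit on this consensus set afterwards
```

The minimal sample size (here two points) depends on the model being fit; the final model is usually re-estimated from all inliers of the best candidate.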

Outliers are ignored in the following unless otherwise stated.

Systematic errors, or systematic effects, as they are commonly named in the recent literature, are not really errors in the observations, but rather in the underlying model. It is therefore usually possible to remove or avoid systematic effects if an appropriate model is chosen, and part of Section 4 is dedicated to the process of model selection. An example of systematic effects often encountered in computer vision are radial distortions of the image due to an imperfect lens, see Section 4.2.1. It is well known how to model this effect (usually by an odd polynomial of the distance to the principal point, compare e.g. [139]), and therefore easy to account for it. This is usually not done by incorporating the model of the radial distortion into that of (perspective) projection, which would lead to rather intractable equations, but by correcting the observations for this particular effect. In computer vision, such corrections are often part of (partial) camera calibration.

Random errors are the only kind of effects with which traditional statistics is concerned, although the use of the term "error" is deprecated in modern literature, and the term statistical properties is used instead. This captures the fact that, from a statistical standpoint, observations can be considered samples of an unknown probability distribution of a random variable. Discrepancies between several observations are therefore not due to errors, but simply serve to describe the particular probability distribution. It is statistics' task to gain as much information as possible about this distribution from the observations.

So how can we describe the properties of our unknown probability distribution?

A very concise description can often be given by the use of moments, as described in the next section.

3.2.2 Mean and Central Moments

Probability distributions can often be described in terms of their mean and central moments. The population mean of a random variable x, also called first moment or expectation, is denoted by E(x) and is defined (if it exists) as the average value µ_x of the variable over all possible values, weighted by their respective probabilities P_x ∈ IR or probability density function p_x ∈ IR (in the following denoted by pdf); it is

E(x) = \mu_x = \sum_{i=1}^{n} x_i P_x(x_i) \qquad (3.1)

or

E(x) = \mu_x = \int_{-\infty}^{\infty} x \, p_x(x) \, dx \qquad (3.2)

in the continuous case. Given two random variables x and y and three constants a, b, c, the following rules hold [100]:

E(E(x)) = E(x) \qquad (3.3)
E(x + y) = E(x) + E(y) \qquad (3.4)
E(c) = c \qquad (3.5)
E(c \cdot x) = c \cdot E(x) \qquad (3.6)
⟹ E(a \cdot x + b) = a \cdot E(x) + b. \qquad (3.7)

If x and y are independent random variables, it is also true that

E(x \cdot y) = E(x) \cdot E(y). \qquad (3.8)

Note, however, that, in general, E(x²) ≠ (E(x))².
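These rules are easy to check empirically by Monte Carlo simulation, replacing expectations with sample averages. A minimal sketch assuming NumPy; the particular distributions, constants, and sample size are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.exponential(scale=2.0, size=1_000_000)   # E(x) = 2, E(x^2) = 8
y = rng.uniform(0.0, 1.0, size=1_000_000)        # independent of x, E(y) = 0.5
a, b = 3.0, 5.0

print(np.mean(a * x + b), a * np.mean(x) + b)    # Eq. (3.7): both approx. 11
print(np.mean(x * y), np.mean(x) * np.mean(y))   # Eq. (3.8): both approx. 1
print(np.mean(x**2), np.mean(x)**2)              # E(x^2) != (E(x))^2: ~8 vs. ~4
```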



Central moments, which can be used to describe most pdfs, are expectations with respect to the mean, where the k-th central moment is defined as

m_k = E\big((x - E(x))^k\big). \qquad (3.9)
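As a quick plausibility check, central moments can likewise be estimated from samples by replacing the expectation in Equation (3.9) with an average (anticipating the sample estimates of Equations (3.19)–(3.20) below). A minimal sketch assuming NumPy; the choice of the exponential distribution is an arbitrary example for which the exact moments are known:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=1_000_000)

mu = x.mean()
m2 = np.mean((x - mu)**2)   # Eq. (3.9) with k = 2: the variance
m3 = np.mean((x - mu)**3)   # k = 3: measures the asymmetry of the pdf

print(m2, m3)               # for Exp(1): m2 ~ 1, m3 ~ 2
```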

One particularly important central moment is the second moment or variance σ_x². It is:

\sigma_x^2 = m_2 = E\big((x - E(x))^2\big) = E\big((x - \mu_x)^2\big) \qquad (3.10)
= E\big(x^2 - 2x\mu_x + \mu_x^2\big) \qquad (3.11)
= E(x^2) - 2\mu_x E(x) + \mu_x^2 \qquad (3.12)
= E(x^2) - \mu_x^2. \qquad (3.13)

The variance's positive square root σ_x > 0 is called the standard deviation. Note that Equation (3.13) can easily lead to numerical problems for big values of µ_x², since two large, nearly equal quantities are subtracted; a short numerical sketch follows below.
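The following sketch, assuming NumPy and an arbitrary data set with a huge mean and unit variance, illustrates the cancellation problem of Equation (3.13) compared with the two-pass form of Equation (3.10):

```python
import numpy as np

rng = np.random.default_rng(0)
x = 1.0e8 + rng.standard_normal(100_000)   # true variance is 1, mean is huge

# Eq. (3.13): subtracts two numbers of magnitude ~1e16 -- cancellation
naive = np.mean(x**2) - np.mean(x)**2

# Eq. (3.10): two-pass formula, numerically well behaved
stable = np.mean((x - np.mean(x))**2)

print(naive, stable)   # naive can be far off (even negative); stable ~ 1.0
```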

The equations corresponding to Equations (3.4)–(3.8) are:

\sigma_{x+y}^2 = \sigma_x^2 + \sigma_y^2 \qquad (3.14)
\sigma_c^2 = 0 \qquad (3.15)
\sigma_{c \cdot x}^2 = c^2 \cdot \sigma_x^2 \qquad (3.16)
⟹ \sigma_{a \cdot x + b}^2 = a^2 \cdot \sigma_x^2. \qquad (3.17)

Note that Equation (3.14) presupposes independent (more precisely, uncorrelated) x and y, matching the assumption of Equation (3.8). Related to the concept of the variance is that of the cofactor q, which can be viewed as a relative variance. It is

q_x^2 = \frac{\sigma_x^2}{\sigma_0^2} \qquad (3.18)

for a possibly unknown value of σ_0², the reference variance.

Equations (3.2), (3.10), and (3.13) can only be used if the pdf p_x is already known. In order to estimate the pdf from observations alone, we have to approximate the population mean and population variance (and possibly higher-order moments) by the sample mean and sample variance. Given N measurements x_i (i = 1, …, N), the sample or empirical mean x̄ is defined as the arithmetic mean:

\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i. \qquad (3.19)


It is E(x̄) = µ_x, i.e. the sample mean is an unbiased estimator of the population mean. The sample variance is defined as

s_x^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 \qquad (3.20)

and it is E(s_x²) = σ_x²; the division by N−1 rather than N makes this estimator unbiased as well. Higher-order moments can be approximated similarly.
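A minimal sketch of Equations (3.19) and (3.20) assuming NumPy; the measurement values are hypothetical:

```python
import numpy as np

x = np.array([4.9, 5.1, 5.0, 4.8, 5.2])   # hypothetical repeated measurements

x_bar = x.mean()            # Eq. (3.19)
s2 = x.var(ddof=1)          # Eq. (3.20); ddof=1 selects the N-1 denominator
# equivalently: s2 = np.sum((x - x_bar)**2) / (x.size - 1)

print(x_bar, s2)            # 5.0, 0.025
```

Note that NumPy's default (`ddof=0`) divides by N and thus computes the biased estimate; `ddof=1` must be requested explicitly to obtain Equation (3.20).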

3.2.3 Normal Distribution

The most important probability distribution, and one uniquely defined by mean and variance, is the normal or Gaussian distribution

N(\mu_x, \sigma_x^2) = \frac{1}{\sqrt{2\pi\sigma_x^2}} \, e^{-\frac{1}{2}\frac{(x-\mu_x)^2}{\sigma_x^2}}. \qquad (3.21)
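As a sanity check, Equation (3.21) can be evaluated on a grid and numerically integrated to confirm that it is a proper density and that its first two moments match µ_x and σ_x². A minimal sketch assuming NumPy; the parameter values and grid are arbitrary:

```python
import numpy as np

mu, sigma2 = 1.0, 0.25
x = np.linspace(mu - 5.0, mu + 5.0, 100_001)
dx = x[1] - x[0]

# Eq. (3.21), evaluated on a grid
pdf = np.exp(-0.5 * (x - mu)**2 / sigma2) / np.sqrt(2.0 * np.pi * sigma2)

print(np.sum(pdf * dx))                  # ~1: the density integrates to one
print(np.sum(x * pdf * dx))              # ~mu = 1.0, cf. Eq. (3.2)
print(np.sum((x - mu)**2 * pdf * dx))    # ~sigma2 = 0.25, cf. Eq. (3.10)
```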

3.2.4 Multidimensional Extension

The above can easily be extended to multi-dimensional random variables. If x ∈ IR^n is a vector of n (not necessarily independent) random variables, the expectation is simply

E(x) = \mu_x = \sum_{i=1}^{N} x_i P_x(x_i) \qquad (3.22)

or

E(x) = \mu_x = \int_{-\infty}^{\infty} x \, p_x(x) \, dx \qquad (3.23)

in the continuous case, where the summation (or integration) can be performed separately for each vector element. Note that P_x, p_x ∈ IR.

The central moments of order (k_1 + ··· + k_n) can be calculated as

E\big((x_1 - \mu_{x_1})^{k_1} \cdots (x_n - \mu_{x_n})^{k_n}\big). \qquad (3.24)

Of particular importance are again the second-order central moments of x ∈ IR^n, which can be set up as all combinations between each two elements of the vector. The result can be arranged as a matrix M_xx ∈ IR^{n×n}. This matrix is customarily called the matrix of second central moments, the variance–covariance matrix, or simply the covariance matrix. It is

M_{xx} = E\big((x - \mu_x)(x - \mu_x)^T\big)
= \begin{pmatrix}
m_{x_1 x_1} & m_{x_1 x_2} & \cdots & m_{x_1 x_n} \\
m_{x_2 x_1} & m_{x_2 x_2} & \cdots & m_{x_2 x_n} \\
\vdots & \vdots & \ddots & \vdots \\
m_{x_n x_1} & m_{x_n x_2} & \cdots & m_{x_n x_n}
\end{pmatrix}
= \begin{pmatrix}
\sigma_{x_1}^2 & \sigma_{x_1 x_2} & \cdots & \sigma_{x_1 x_n} \\
\sigma_{x_2 x_1} & \sigma_{x_2}^2 & \cdots & \sigma_{x_2 x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\sigma_{x_n x_1} & \sigma_{x_n x_2} & \cdots & \sigma_{x_n}^2
\end{pmatrix}
= \Sigma_x. \qquad (3.25)

Note that this is a square symmetric matrix, since

\sigma_{x_i x_j} = E\big((x_i - \mu_{x_i}) \cdot (x_j - \mu_{x_j})\big) = E\big((x_j - \mu_{x_j}) \cdot (x_i - \mu_{x_i})\big) = \sigma_{x_j x_i}. \qquad (3.26)

Using Equation (3.24), it is always possible to construct higher-order central moments. The equations corresponding to Equations (3.3)–(3.7) and (3.14)–(3.17) are (with random variables x, y ∈ IR^n, constant vectors b, c ∈ IR^m, and a constant matrix A ∈ IR^{m×n}):

E(E(x)) = E(x) \qquad (3.27)
E(x + y) = E(x) + E(y) \qquad (3.28)
E(c) = c \qquad (3.29)
E(Ax) = A\,E(x) \qquad (3.30)
⟹ E(Ax + b) = A\,E(x) + b \qquad (3.31)

\Sigma_{x+y} = \Sigma_x + \Sigma_y \qquad (3.32)
\Sigma_c = 0 \qquad (3.33)
\Sigma_{Ax} = A \Sigma_x A^T \qquad (3.34)
⟹ \Sigma_{Ax+b} = A \Sigma_x A^T. \qquad (3.35)
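Equation (3.35) is the core rule of linear error propagation, and it can be verified by Monte Carlo simulation: transform samples of x linearly and compare the empirical covariance of the result with AΣ_xA^T. A minimal sketch assuming NumPy; the matrices, mean, and sample size are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(7)

Sigma_x = np.array([[2.0, 0.5],
                    [0.5, 1.0]])          # covariance of x, in IR^{2x2}
A = np.array([[1.0, 2.0],
              [0.0, 3.0],
              [1.0, -1.0]])               # constant matrix A in IR^{3x2}
b = np.array([10.0, -5.0, 0.0])           # constant vector b in IR^3

x = rng.multivariate_normal(mean=[1.0, 2.0], cov=Sigma_x, size=200_000)
y = x @ A.T + b                            # y = A x + b, applied row-wise

print(A @ Sigma_x @ A.T)                   # Eq. (3.35): predicted covariance
print(np.cov(y, rowvar=False))             # empirical covariance, cf. Eq. (3.38)
```

As predicted by Equation (3.35), the constant offset b shifts the mean but leaves the covariance untouched.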

A cofactor matrix can be defined analogously to Equation (3.18); it is

Q_x = \frac{1}{\sigma_0^2} \Sigma_x. \qquad (3.36)

The sample mean x̄ and sample covariance matrix S_x are defined in analogy to Equations (3.19) and (3.20) as

\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (3.37)

S_x = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})^T. \qquad (3.38)
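A minimal sketch of Equations (3.37) and (3.38) assuming NumPy; the data matrix (one sample per row) is hypothetical:

```python
import numpy as np

X = np.array([[2.1, 0.9],
              [1.9, 1.1],
              [2.0, 1.0],
              [2.2, 0.8]])                # N = 4 samples of a 2-vector

x_bar = X.mean(axis=0)                    # Eq. (3.37), componentwise
D = X - x_bar                             # deviations from the sample mean
S = D.T @ D / (len(X) - 1)                # Eq. (3.38) as a matrix product

print(np.allclose(S, np.cov(X, rowvar=False)))  # True: matches NumPy's estimator
```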

The n-dimensional normal distribution is given by

N(\mu_x, \Sigma_x) = \frac{1}{\sqrt{(2\pi)^n |\Sigma_x|}} \exp\!\Big(-\frac{1}{2}(x - \mu_x)^T \Sigma_x^{-1} (x - \mu_x)\Big), \qquad (3.39)

where |Σ_x| is the determinant of Σ_x ∈ IR^{n×n}. Additional distributions are given in Sections 3.4 (the χ²-distribution, used for testing) and 3.5 (an adaptation of the normal distribution to cyclic data), but the normal distribution is by far the most important distribution used in this thesis. Its theoretical and practical importance is due to the central limit theorem, which states that the (suitably normalized) sum \sum_{i=1}^{n} x_i of n independent random variables x_1, …, x_n is asymptotically normally distributed as n → ∞, under mild conditions on the individual distributions. Normal distributions are encountered very often in practical applications; in particular, random variables that represent independent measurements in photogrammetry, geodesy, or surveying are often nearly normally distributed [100].
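A minimal sketch of evaluating the density of Equation (3.39) assuming NumPy; the function name and the parameter values are hypothetical:

```python
import numpy as np

def mvn_pdf(x, mu, Sigma):
    """Evaluate the n-dimensional normal density, Eq. (3.39)."""
    d = x - mu
    n = mu.size
    norm = np.sqrt((2.0 * np.pi) ** n * np.linalg.det(Sigma))
    # Solving Sigma z = d avoids forming the explicit inverse of Sigma.
    return float(np.exp(-0.5 * d @ np.linalg.solve(Sigma, d)) / norm)

mu = np.array([0.0, 1.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
print(mvn_pdf(np.array([0.5, 0.5]), mu, Sigma))
```

Using a linear solve instead of Σ_x^{-1} is the standard numerically preferable choice; for many evaluations, a Cholesky factorization of Σ_x would be reused instead.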

Another reason for the normal distribution's prominence is its simple form, which is completely described by mean and variance. This makes it particularly well suited for the propagation of statistical properties, as described in the next section.
