
Basic Concepts in Statistics


The following gives a short introduction to the basic concepts of statistics; see [81, 100, 145] for more information.

¹ Note, however, that at least the 4th edition of the Manual of Photogrammetry contains several gross errors.


3.2.1 Error Types

This entire chapter would not have been necessary if it were not for the fact that any observation will always contain errors. These errors are traditionally grouped into three categories: random errors, systematic errors, and blunders.

Blunders — or outliers, as they are customarily called in computer vision — are gross errors generally due not to the observed process or variable, but to the observer. If at all possible, they should be removed from the set of observations. How to reliably classify outliers is unfortunately still an open question; the reader is referred to [46, 67, 140, 151] for examples from nearly 20 years of outlier removal in computer vision. Currently once more particularly en vogue is RANSAC — Random Sample Consensus — a method introduced in 1981 by Fischler and Bolles [46].

Outliers are ignored in the following unless otherwise stated.

Systematic errors — or systematic effects, as they are commonly named in the recent literature — are not really errors in the observations, but rather in the underlying model. It is therefore usually possible to remove or avoid systematic effects if an appropriate model is chosen, and part of Section 4 is dedicated to the process of model selection. An example of systematic effects often encountered in computer vision is the radial distortion of the image due to an imperfect lens, see Section 4.2.1. It is well known how to model this effect (usually by an odd polynomial of the distance to the principal point, compare e. g. [139]), and therefore easy to account for it; a sketch of the polynomial correction is given below. This is usually not done by incorporating the model of the radial distortion into that of (perspective) projection — which would lead to rather intractable equations — but by correcting the observations for this particular effect. In computer vision, such corrections are often part of (partial) camera calibration.
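To make the odd-polynomial correction concrete, the following is a minimal sketch, not the method of any particular calibration from the literature; the helper name undistort, the coefficients k1 and k2, and the principal point are made-up illustrative values assumed to come from a prior camera calibration:

```python
# Sketch of an odd-polynomial radial correction: each image point is moved
# along its ray from the principal point by r * (1 + k1*r^2 + k2*r^4),
# i.e. an odd polynomial r + k1*r^3 + k2*r^5 of the radial distance r.
# All numeric values are hypothetical.
import numpy as np

def undistort(points, principal_point, k1, k2):
    """Correct observed image points for radial distortion (illustrative sketch)."""
    d = points - principal_point                 # offsets from the principal point
    r2 = np.sum(d**2, axis=1, keepdims=True)     # squared radial distance r^2
    return principal_point + d * (1.0 + k1 * r2 + k2 * r2**2)

pts = np.array([[320.0, 240.0], [600.0, 100.0]])
print(undistort(pts, np.array([322.0, 238.0]), k1=-2.5e-7, k2=1.0e-13))
```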

Random errors are the only kind of effects with which traditional statistics is concerned, although the use of the term “error” is deprecated in modern literature, and the term statistical properties used instead. This captures the fact that from a statistical standpoint, observations can be considered samples of an unknown probability distribution of a random variable. Discrepancies between several observations are therefore not due to errors, but simply serve to describe the particular probability distribution. It is statistics’ task to gain as much information as possible about this distribution from the observations.

So how can we describe the properties of our unknown probability distribution? A very concise description can often be given by the use of moments, as described in the next section.


3.2.2 Mean and Central Moments

Probability distributions can often be described in terms of their mean and central moments. The population mean of a random variable x, also called first moment or expectation, is denoted by E(x) and is defined (if it exists) as the average value $\mu_x$ of the variable over all possible values, weighted by their respective probabilities. With the probability function $P_x$ or the probability density function² $p_x$, it is

$$E(x) = \mu_x = \sum_{i=1}^{n} x_i\, P_x(x_i) \qquad (3.1)$$

in the discrete case, or

$$E(x) = \mu_x = \int_{-\infty}^{\infty} x\, p_x(x)\, \mathrm{d}x \qquad (3.2)$$

in the continuous case. Given two random variables x and y and three constants a, b, c, the following rules hold [100]:

$$E(E(x)) = E(x) \qquad (3.3)$$

$$E(x + y) = E(x) + E(y) \qquad (3.4)$$

$$E(c) = c \qquad (3.5)$$

$$E(c\,x) = c\, E(x) \qquad (3.6)$$

$$\Longrightarrow\quad E(a\,x + b) = a\, E(x) + b. \qquad (3.7)$$

If x and y are independent random variables, it is also true that

$$E(x\,y) = E(x)\, E(y). \qquad (3.8)$$

Note, however, that, in general, $E(x^2) \neq (E(x))^2$.
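As a quick numerical illustration of Equations (3.4)–(3.8), the following sketch checks the rules by Monte Carlo sampling; the chosen distributions, constants, and sample size are arbitrary assumptions for demonstration only:

```python
# Monte Carlo check of the expectation rules; distributions and sample
# size are arbitrary illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)    # E(x) = 2, Var(x) = 4
y = rng.normal(loc=1.0, size=100_000)           # E(y) = 1, independent of x
a, b = 3.0, 5.0

print(np.mean(x + y), np.mean(x) + np.mean(y))  # (3.4): both approx. 3
print(np.mean(a * x + b), a * np.mean(x) + b)   # (3.7): both approx. 11
print(np.mean(x * y), np.mean(x) * np.mean(y))  # (3.8): both approx. 2
print(np.mean(x**2), np.mean(x)**2)             # approx. 8 vs. 4: E(x^2) != (E(x))^2
```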

Central moments, which can be used to describe most pdfs, are expectations with respect to the mean, where the kth central moment is defined as

$$m_k = E\big((x - E(x))^k\big). \qquad (3.9)$$

One particularly important central moment is the second moment or variance $\sigma^2_x$. It is:

$$\sigma^2_x = m_2 = E\big((x - E(x))^2\big) = E\big((x - \mu_x)^2\big) \qquad (3.10)$$

$$= E\big(x^2 - 2x\mu_x + \mu^2_x\big) \qquad (3.11)$$

$$= E(x^2) - 2\mu_x E(x) + \mu^2_x \qquad (3.12)$$

$$= E(x^2) - \mu^2_x. \qquad (3.13)$$

² In the following denoted by pdf.

The variance’s positive square root $\sigma_x > 0$ is called the standard deviation. Note that Equation (3.13) can easily lead to numerical problems for big values of $\mu^2_x$. The equations corresponding to Equations (3.4)–(3.8) are:

$$\sigma^2_{x+y} = \sigma^2_x + \sigma^2_y \qquad (3.14)$$

$$\sigma^2_c = 0 \qquad (3.15)$$

$$\sigma^2_{c\,x} = c^2\, \sigma^2_x \qquad (3.16)$$

$$\Longrightarrow\quad \sigma^2_{a\,x+b} = a^2\, \sigma^2_x. \qquad (3.17)$$

Related to the concept of the variance is that of the cofactor q, which could be viewed as a relative variance. It is

$$q^2_x = \frac{\sigma^2_x}{\sigma^2_0} \qquad (3.18)$$

for a possibly unknown value of $\sigma^2_0$, the reference variance.

Equations (3.2), (3.10), and (3.13) can only be used if the pdf $p_x$ is already known. In order to estimate the pdf from observations alone, we have to approximate the population mean and population variance (and possibly higher order moments) by the sample mean and sample variance. Given N measurements $x_i$ $(i = 1 \ldots N)$, the sample or empirical mean $\bar{x}$ is defined as the arithmetic mean:

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i. \qquad (3.19)$$

It is $E(\bar{x}) = \mu_x$. The sample variance is defined as

$$s^2_x = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 \qquad (3.20)$$

and it is $E(s^2_x) = \sigma^2_x$. Higher order moments can be approximated similarly.
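As an illustration, the following minimal sketch evaluates Equations (3.19) and (3.20) directly; the measurement values are made up, and numpy's ddof=1 argument selects the same N−1 divisor as Equation (3.20):

```python
# Sample mean (3.19) and sample variance (3.20) for made-up measurements.
import numpy as np

x = np.array([10.1, 9.8, 10.3, 10.0, 9.9])     # N = 5 hypothetical observations
N = len(x)

x_bar = x.sum() / N                            # Equation (3.19)
s2 = ((x - x_bar) ** 2).sum() / (N - 1)        # Equation (3.20), note the N-1

# numpy agrees when ddof=1 selects the unbiased N-1 divisor:
assert np.isclose(x_bar, np.mean(x))
assert np.isclose(s2, np.var(x, ddof=1))
print(x_bar, s2)
```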

3.2.3 Normal Distribution

The most important probability distribution, and one uniquely defined by mean and variance, is the normal or Gaussian distribution

$$N(\mu_x, \sigma^2_x) = \frac{1}{\sqrt{2\pi\sigma^2_x}}\, e^{-\frac{1}{2} \frac{(x - \mu_x)^2}{\sigma^2_x}}. \qquad (3.21)$$
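The density (3.21) is straightforward to evaluate directly; as a plausibility check, the following sketch (with arbitrary example values for µ and σ²) compares it against scipy.stats.norm:

```python
# Direct evaluation of the normal density (3.21), checked against scipy.
import numpy as np
from scipy.stats import norm

mu, sigma2 = 1.5, 0.25                         # arbitrary example values
x = np.linspace(mu - 3.0, mu + 3.0, 7)

pdf = np.exp(-0.5 * (x - mu) ** 2 / sigma2) / np.sqrt(2.0 * np.pi * sigma2)
assert np.allclose(pdf, norm.pdf(x, loc=mu, scale=np.sqrt(sigma2)))
print(pdf)
```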

3.2.4 Multidimensional Extension

The above can easily be extended for multi-dimensional random variables. If $x \in \mathbb{R}^n$ is a vector of n (not necessarily independent) random variables, so is the expectation

$$E(x) = \mu_x = \sum_{i} x_i\, P_x(x_i) \qquad (3.22)$$

in the discrete case, or

$$E(x) = \mu_x = \int_{-\infty}^{\infty} x\, p_x(x)\, \mathrm{d}x \qquad (3.23)$$

in the continuous case, where the summation (or integration) can be performed separately for each vector-element. Note that $P_x, p_x \in \mathbb{R}$, i. e. the probability (density) remains scalar-valued.

The central moments of order $(k_1 + \cdots + k_n)$ can be calculated as

$$E\big((x_1 - \mu_{x_1})^{k_1} \cdots (x_n - \mu_{x_n})^{k_n}\big). \qquad (3.24)$$

Of particular importance are again the second order central moments of $x \in \mathbb{R}^n$, which can be set up as all combinations between each two elements of the vector. The result can be arranged as a matrix $M_{xx} \in \mathbb{R}^{n \times n}$. This matrix is customarily called the matrix of second central moments, the variance-covariance matrix, or simply the covariance matrix. It is

$$\Sigma_x = E\big((x - \mu_x)(x - \mu_x)^T\big) = \begin{pmatrix} \sigma^2_{x_1} & \sigma_{x_1 x_2} & \cdots & \sigma_{x_1 x_n} \\ \sigma_{x_2 x_1} & \sigma^2_{x_2} & \cdots & \sigma_{x_2 x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{x_n x_1} & \sigma_{x_n x_2} & \cdots & \sigma^2_{x_n} \end{pmatrix}. \qquad (3.25)$$

Note that this is a square symmetric matrix, since

$$\sigma_{x_i x_j} = E\big((x_i - \mu_{x_i})(x_j - \mu_{x_j})\big) = E\big((x_j - \mu_{x_j})(x_i - \mu_{x_i})\big) = \sigma_{x_j x_i}. \qquad (3.26)$$

Higher order central moments can always be constructed using Equation (3.24). The equations corresponding to Equations (3.3)–(3.7) and Equations (3.14)–(3.17) are (with random variables $x, y \in \mathbb{R}^n$, constant vectors $b, c \in \mathbb{R}^m$ and a constant matrix $A \in \mathbb{R}^{m \times n}$):

$$E(E(x)) = E(x) \qquad (3.27)$$

$$E(x + y) = E(x) + E(y) \qquad (3.28)$$

$$E(c) = c \qquad (3.29)$$

$$E(A\,x) = A\, E(x) \qquad (3.30)$$

$$\Longrightarrow\quad E(A\,x + b) = A\, E(x) + b \qquad (3.31)$$

$$\Sigma_{x+y} = \Sigma_x + \Sigma_y \qquad (3.32)$$

$$\Sigma_c = 0 \qquad (3.33)$$

$$\Sigma_{A\,x} = A\, \Sigma_x A^T \qquad (3.34)$$

$$\Longrightarrow\quad \Sigma_{A\,x+b} = A\, \Sigma_x A^T. \qquad (3.35)$$
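Equation (3.35) underlies the propagation of statistical properties described in the next section; the following sketch verifies it by Monte Carlo for arbitrary example values of A, b, and Σ_x:

```python
# Covariance propagation (3.34)/(3.35): Sigma_{Ax+b} = A Sigma_x A^T,
# verified by Monte Carlo; A, b, and Sigma_x are arbitrary example values.
import numpy as np

rng = np.random.default_rng(1)
Sigma_x = np.array([[2.0, 0.5],
                    [0.5, 1.0]])
A = np.array([[1.0, -1.0],
              [0.5,  2.0]])
b = np.array([3.0, -4.0])

Sigma_prop = A @ Sigma_x @ A.T                 # analytic propagation (3.35)

X = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma_x, size=200_000)
Y = X @ A.T + b                                # y = A x + b for each sample
Sigma_mc = np.cov(Y, rowvar=False)             # empirical covariance of y

print(np.round(Sigma_prop, 3))
print(np.round(Sigma_mc, 3))                   # agrees to about two decimals
```

Note that the additive constant b drops out of the covariance, exactly as Equation (3.35) states.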

A cofactor matrix can be defined analogous to Equation (3.18); it is

$$Q_x = \frac{1}{\sigma^2_0}\, \Sigma_x. \qquad (3.36)$$

The sample mean $\bar{x}$ and sample covariance matrix $S_x$ are defined in analogy to Equations (3.19) and (3.20) as

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (3.37)$$

$$S_x = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})^T. \qquad (3.38)$$
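In the same spirit as the one-dimensional sketch above, the following evaluates Equations (3.37) and (3.38) for a made-up data matrix (one observation per row); np.cov with rowvar=False uses the same N−1 divisor as Equation (3.38):

```python
# Sample mean (3.37) and sample covariance matrix (3.38).
import numpy as np

X = np.array([[1.0, 2.1],                      # N = 4 hypothetical 2-vectors,
              [0.8, 1.9],                      # one observation per row
              [1.2, 2.3],
              [0.9, 2.0]])
N = X.shape[0]

x_bar = X.sum(axis=0) / N                      # Equation (3.37)
D = X - x_bar                                  # centered observations
S = (D.T @ D) / (N - 1)                        # Equation (3.38)

assert np.allclose(S, np.cov(X, rowvar=False))
print(x_bar, S, sep="\n")
```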

The n-dimensional normal distribution is given by

$$N(\mu_x, \Sigma_x) = \frac{1}{\sqrt{(2\pi)^n\, |\Sigma_x|}} \exp\!\Big(-\tfrac{1}{2} (x - \mu_x)^T \Sigma_x^{-1} (x - \mu_x)\Big), \qquad (3.39)$$

where $|\Sigma_x|$ is the determinant of $\Sigma_x \in \mathbb{R}^{n \times n}$. Additional distributions are given in Section 3.4 (the χ²-distribution, used for testing) and Section 3.5 (an adaptation of the normal distribution to cyclic data), but the normal distribution is by far the most important distribution used in this thesis. Its theoretical and practical importance is due to the central limit theorem, which states that the sum $\sum_{i=1}^{n} x_i$ of n independent random variables $x_1, \ldots, x_n$ will be asymptotically normally distributed as $n \to \infty$. Normal distributions are encountered very often in practical applications; in particular, random variables that represent independent measurements in photogrammetry, geodesy, or surveying are often nearly normally distributed [100].
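The central limit theorem is easy to observe numerically; the following sketch (with an arbitrarily chosen uniform distribution, n, and trial count) standardizes sums of independent uniform variables and checks that their empirical moments are close to those of N(0, 1):

```python
# Central limit theorem illustration: standardized sums of n independent
# uniform(0,1) variables approach N(0,1); n and trials are arbitrary choices.
import numpy as np

rng = np.random.default_rng(2)
n, trials = 50, 100_000

u = rng.uniform(size=(trials, n))              # n iid uniform(0,1) per trial
mu, var = 0.5, 1.0 / 12.0                      # mean and variance of uniform(0,1)
z = (u.sum(axis=1) - n * mu) / np.sqrt(n * var)

print(z.mean(), z.std(ddof=1))                 # approx. 0 and 1
print(np.mean(np.abs(z) < 1.96))               # approx. 0.95, as for N(0,1)
```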

Another reason for the normal distribution’s prominence is its simple form, which is completely described by mean and variance. This makes it particularly well suited for the propagation of statistical properties, as described in the next section.
