deposit_hagen
Publikationsserver der Universitätsbibliothek
Mathematik und
Informatik
Lehrgebiet Stochastik Forschungsbericht
Frank Recker
A simple algorithm for the
cumulative distribution function
of the Gaussian distribution
A simple algorithm for the cumulative distribution function of the Gaussian distribution
Frank Recker
∗Abstract
We give a simple algorithm for the value of the cumulative distribution function of the Gaussian distribution.
Key Words:Gaussian distribution, cumulative distribution function, CDF MSC2010 classification:60-04, 68-04
Contents
1 Introduction 1
2 The Algorithm 1
3 The Proof 2
1 Introduction
The Gaussian Integral
Φ(x) = 1
√2π
x
Z
−∞
e−t
2
2 dt (1)
is ubiquitous in statistical applications. There are libraries such as the GNU scientific library [1] which offer this calculation as a function. However, sometimes one cannot use these libraries due to technical or legal constraints. In this paper, we describe a simple algorithm for this calculation. Example implementations can be found at the github site [2]
of the author.
2 The Algorithm
The algorithm for computingφ(x, n)forx ∈ Randn ∈ N = {1,2, . . .}is defined as follows:
Algorithm 1. Letdn= 0and
dj =−x2 2j
dj+1+ 1 2j+ 1
for allj=n−1, . . . ,1. Finally let
φ(x, n) =1 2 + x
√2π(d1+ 1).
∗Frank Recker, Technische Hochschule Mittelhessen, Mathematik, Fachbereich MNI, Wiesenstraße 14, 35390 Gießen, E-Mail:frank.recker@mni.thm.de
The algorithm computes an approximation of the Gaussian cumulative distribution function as defined in Equation (1). Theorem 2 formalizes this.
Theorem 2. Letx∈R. DefineΦ(x)as in Equation (1) andφ(x, n)for alln ∈Nas in Algorithm 1. Then we have
n→∞lim φ(x, n) = Φ(x).
Furthermore ifn≥ x22 then
|φ(x, n)−Φ(x)|< 1
√2π
|x|2n+1 (2n+ 1)2nn!.
Table 1 shows the values of φ(x, n)and the maximum error due to Theorem 2 for somexandn. The values were calculated with the code taken from [2]. Note that all calculations are done with floating point numbers and hence rounding errors will occur.
x n φ(x, n) error bound
1.96 1 1.2819268695868082 no bound
1.96 2 0.7812851592193613 0.28848977918213764 1.96 10 0.9749960638553972 7.014638266104427e-6 1.96 200 0.9750021048517796 1.23e-321
5 1 2.4947114020071637 no bound 5 10 -1169.2649270406318 no bound
5 30 0.9285538915764981 9.958422559186228e-2 5 50 0.9999997133453642 4.5497179496632544e-12 5 200 0.9999997133486902 1.5200212487901728e-158
Table 1: Some values and error bounds forφ(x, n)
3 The Proof
First, we derive a power series forΦ(x).
Lemma 3. LetΦ(x)as in Equation (1). Then
Φ(x) =1 2 + 1
√ 2π
∞
X
k=0
(−1)kx2k+1
(2k+ 1)2kk!. (2)
Proof. We haveΦ(0) = 12 and hence we can rewrite Equation (1) as
Φ(x) = 1 2+ 1
√2π
x
Z
0
e−t
2
2 dt. (3)
Next we use the power series for theexp-function:
ex=
∞
X
k=0
xk k!.
This series converges absolutely for allx∈R. We insert this into Equation (3) and get
Φ(x) = 1 2+ 1
√2π
x
Z
0
∞
X
k=0
−t22k k! dt= 1
2+ 1
√2π
x
Z
0
∞
X
k=0
(−1)kt2k 2kk! dt.
For convergent power series we can swap integration and summation. Hence we get
Φ(x) =1 2 + 1
√2π
∞
X
k=0 x
Z
0
(−1)kt2k 2kk! dt= 1
2+ 1
√2π
∞
X
k=0
(−1)kt2k+1 (2k+ 1)2kk!
t=x
t=0
and therefore
Φ(x) =1 2 + 1
√2π
∞
X
k=0
(−1)kx2k+1 (2k+ 1)2kk!
as stated in the lemma.
Next, we will show that φ(x, n) is in fact just a partial sum of the series given in Equation (2).
Lemma 4. Letx∈R,n∈Nandφ(x, n)as defined in Algorithm 1. Then we have
φ(x, n) = 1 2+ 1
√ 2π
n−1
X
k=0
(−1)kx2k+1
(2k+ 1)2kk!. (4)
Proof. For brevity define
φ0(x, n) = 1 2+ 1
√2π
n−1
X
k=0
(−1)kx2k+1 (2k+ 1)2kk!.
For allk∈Nletak = 2k+11 andbk =−x2k2. Furthermore leta0 = 1andb0 = √x2π. Its easy to show by complete induction that
φ0(x, n) =1 2 +
n−1
X
k=0
ak k
Y
j=0
bj
.
Writing this in a Horner-scheme style gives
φ0(x, n) =1
2 +b0(a0+b1(a1+b2(· · ·+bn−2(an−2+bn−1an−1). . .))). Algorithm 1 just evaluates this expression from the inner brackets to the outer one (dj is the value of the termbj(aj+bj+1(. . .))for allj= 1, . . . , n−2).
Now we can prove Theorem 2.
Proof of Theorem 2. From Equations (2) and (4) we get
n→∞lim φ(x, n) = Φ(x).
The seriesφ(x, n)is alternating. Letn≥ x22. We compare the terms of the series fork=n andk=n−1:
(−1)nx2n+1 (2n+1)2nn!
(−1)n−1x2n−1 (2n−1)2n−1(n−1)!
=
(2n−1)x2 (2n+ 1)2n
≤
2n−1 2n+ 1
<1.
Hence, the series forφ(x, n)is alternating and the absolute values of the terms are mono- tone decreasing. Calculus shows in this case that|φ(x, n)−Φ(x)|is bound by the absolute value of then-th term of the series which is
√1 2π
|x|2n+1 (2n+ 1)2nn!.
References
[1] The Gaussian Distribution in the GNU Scientific Library,http://www.gnu.org/
software/gsl/manual/html_node/The-Gaussian-Distribution.
html
[2] Implementations of Algorithm 1 on github, https://github.com/frecker/
gaussian-distribution