


The function restore_heap_from_top restores the heap property for the top element a1 by pushing it downwards in the heap.

⟨heap maintenance 16.4⟩+≡
inline void
restore_heap_from_top(double *const a, const int n){
    int i = 1;
    int j = 2;
    const double top = a[1];
    ⟨walk down and establish heap property 17.1⟩
}

The function restore_heap_from_bottom restores the heap property for a leaf ai in the heap by pushing it upwards in the heap.

⟨heap maintenance 16.4⟩+≡
inline void
restore_heap_from_bottom(double *const a, int i){
    int j = i >> 1;
    const double last = a[i];
    while (j > 0 && a[j] < last){
        a[i] = a[j];
        i = j;
        j >>= 1;
    }
    a[i] = last;
}

Heap maintenance is put into a file.

⟨heap_maintenance_for_essa.hpp 17.4⟩≡
#ifndef HEAP_MAINTENANCE_FOR_ESSA_HPP
#define HEAP_MAINTENANCE_FOR_ESSA_HPP
⟨heap maintenance 16.4⟩
#endif//HEAP_MAINTENANCE_FOR_ESSA_HPP
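The chunk ⟨walk down and establish heap property 17.1⟩ is defined elsewhere in the paper. To make the two routines testable in isolation, the sketch below fills that hole with a standard max-heap sift-down loop; the loop body is our assumption, not the paper's actual chunk, though it ends by writing top back into a[i] as the chunk does.

```cpp
#include <cassert>

// Sift a[1] down until the max-heap property holds again (1-indexed array).
// The loop is a standard sift-down, assumed as a stand-in for the paper's
// elided chunk "walk down and establish heap property".
inline void
restore_heap_from_top(double *const a, const int n){
    int i = 1;
    int j = 2;
    const double top = a[1];
    while (j <= n){
        if (j < n && a[j] < a[j+1]) ++j;   // pick the larger child
        if (a[j] <= top) break;            // heap property restored
        a[i] = a[j];                       // move child up one level
        i = j;
        j <<= 1;
    }
    a[i] = top;
}

// Sift the leaf a[i] up towards the root, exactly as in the chunk above.
inline void
restore_heap_from_bottom(double *const a, int i){
    int j = i >> 1;
    const double last = a[i];
    while (j > 0 && a[j] < last){
        a[i] = a[j];
        i = j;
        j >>= 1;
    }
    a[i] = last;
}
```

For example, replacing the maximum of a 7-element heap by a small value and sifting down restores a valid heap, and a new maximum inserted at a leaf bubbles up to position 1.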

4 Examples and experiments

Original essa as described by Ratschek and Rokne has the property that the sequence with the smaller leading element has one element fewer in the next iteration. For revised essa we can achieve this by running it in round-towards-zero, for modified essa by running it in round-up. Then either y = 0 or y has the same sign as x. If a1 > b1, an element is removed from b but none is inserted.

Similarly, the size of a is reduced if b1 > a1. This property might promote termination, as the lengths of the sequences have a large influence on the termination criteria. On the other hand, the sequence whose length is reduced might actually be the dominant one. Thus it is unclear whether this property is beneficial.

4.1 Examples

To illustrate how the variants of essa work and to show that they perform a different number of iterations, we give two examples. In Table 1 original essa needs fewer iterations than our revised essa in the default rounding mode. To construct such an example we exploit the fact that in the original essa both results of an operation always have the same sign, while the roundoff error in our revised version may have any sign. This is not the case when running revised essa in round-towards-zero, and consequently it needs as many steps as original essa in this example. In Table 2 both variants of revised essa need one iteration, while original essa needs four.

4.2 Test data generation

To test the performance of our essa variants, we need some input data. Instead of using data from a certain application or predicate, we decided to create artificial data. Using artificial data covering a broad range (hopefully) allows us to draw conclusions for a majority of applications. We will briefly mention two generators for random sums, taken from the literature, and then describe our own solution. Our goal is to generate sums with varying exponent ranges, including zero sums, as they correspond to degenerate constellations in geometric algorithms, which occur occasionally.

To test the original essa, Ratschek and Rokne [14] generate sums with 10000 summands, using floating-point numbers with a mantissa of 24 bits and an exponent in the range [−127,127]. To generate easy examples, simply 10000 numbers are taken from that range. To generate harder sums, a number c is selected; then −c is inserted eight times into the sum, together with 8c, keeping the sum zero. This is repeated 1111 times, resulting in 9999 summands. Finally ±2^−60 is inserted and the summands are shuffled, so the sum of the now 10000 summands is ±2^−60. The latter scheme can be generalized by varying the final summand and therefore the value of the sum. We think, however, that the property of 9 or any other fixed number of summands canceling out exactly rarely occurs in practice.

Furthermore this could penalize one algorithm and help another one in an unapparent way, so we refrained from using this scheme.
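For concreteness, the scheme of [14] described above can be sketched as follows. The function name is our own, and we use small integer-valued c's (so the check below can total the summands exactly in 64-bit integer arithmetic) instead of the 24-bit floating-point numbers and the ±2^−60 final summand used in [14].

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <vector>

// Sketch of the hard-sum scheme of Ratschek and Rokne [14]: 1111 groups of
// nine summands {-c (eight times), 8c} that cancel exactly, plus one final
// summand that carries the value of the whole sum. The c's here are small
// integers so every value and the exact total stay representable.
std::vector<double>
generate_hard_sum(const double final_summand, const unsigned seed){
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> small(1, 1000);
    std::vector<double> s;
    s.reserve(10000);
    for (int group = 0; group < 1111; ++group){
        const double c = small(gen);                  // exactly representable
        for (int k = 0; k < 8; ++k) s.push_back(-c);  // eight copies of -c
        s.push_back(8.0 * c);                         // ...cancelled by 8c
    }
    s.push_back(final_summand);                       // value of the whole sum
    std::shuffle(s.begin(), s.end(), gen);
    return s;
}
```

Since every group sums to zero exactly, the exact value of the generated sum equals the final summand.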

Ogita et al. [12] propose a generator for random dot products with a prespecified condition number.

The generator can also be used to generate sums, by transforming the dot product into a sum. Such a transformation, however, also imposes a structure on the sum, and a generator for sums is easy to derive and just as easy to implement. The following function returns in a a sum with n summands and a condition number of approximately 2^c. Naturally, a must provide space for n summands.

⟨random sum generation 18.1⟩≡
void
generate_random_sum(double *const a, const int n, const int c){
    CGAL::MP_Float sum(0);
    const int m = n/2;
    ⟨generate first part of sum 18.2⟩
    ⟨generate second part of sum 21.1⟩
    random_integer_generator rig;
    std::random_shuffle(a, a+n, rig);
}

The first half of the summands are generated as ±α·2^e with α uniformly drawn from [0,1] and e uniformly drawn from 0, . . . , c. It is assured that 0 and c actually occur as exponents.

⟨generate first part of sum 18.2⟩≡
for(int i=1; i<m-1; i++){
    const int e = static_cast<int>(drand48()*c);
    a[i] = ldexp((2*drand48()-1), e);
    sum += CGAL::MP_Float(a[i]);
}
a[0] = ldexp((2*drand48()-1), 0);
a[m-1] = ldexp((2*drand48()-1), c);

[Table 1 traces revised essa, original essa, and modified essa in round-up side by side on a sum with positive summands a1, a2 and negative summands b1, . . . , b4; the 53-bit binary summands of each step are omitted in this placeholder. The step annotations read:]

In the first iteration x gets rounded up, so y is negative. Original essa computes two new positive summands. Modified essa in round-up also computes two positive summands, yet its x is a better approximation to a1 − b1 than the x in original essa. For revised essa the size of both sequences stays the same and the largest elements have almost the same magnitude as before, so revised essa does not terminate yet. In original essa, although the largest elements in both sequences now have almost the same magnitude as in the first iteration, there are fewer negative summands, a1 > 3b1, and original essa terminates. Modified essa terminates just like original essa. For revised essa, although Sterbenz' Lemma is not applicable, the next computation is exact (y = 0) and revised essa terminates.

Table 1: We create a sum with positive sign. The two positive summands a1 and a2 are chosen to be equal. The two larger negative summands b1 and b2 are chosen such that 4bi ≥ ai. Furthermore b1 is chosen such that the transformation of a1 − b1 into x + y will result in a negative y for revised essa. b3 and b4 are just there to make the set of negative summands appear large. Note that in round-towards-zero revised essa would also terminate after the first iteration.

[Table 2 traces the same three variants on a sum with one positive summand a1 and negative summands b1, b2, b3; the binary summands of each step are omitted in this placeholder. The step annotations read:]

Conforming to Sterbenz' Lemma, the first computation a1 − b1 in revised essa is exact (y = 0), and since then already a1 > 2b1, revised essa terminates. In original essa the exponents of a1 and b1 differ, therefore we get two new summands; additionally this is a case where |x| < |y|. In contrast to revised essa the positive part of the sum is now split into two summands, and the termination criterion does not apply. In round-up modified essa computes two new summands of the same sign, if any; it differs from revised essa only in one bit, stored in a2 instead of a1, but this suffices to prevent termination. In the next iteration of original essa the exponents of a1 and b1 are equal and we get only one new summand; still the termination criterion is not fulfilled. In modified essa, although the exponents differ, we get rid of one summand. Original essa then computes two new negative summands and thus reduces the size of the actually dominant sequence! In modified essa fortunately a1 = b1, so we lose two summands and modified essa terminates in the next step.

Table 2: The input data is chosen such that the first iteration in revised essa reduces the number of summands, while this is not the case for original essa. Furthermore the sum is only marginally positive, to prevent termination of original essa in the first step.

⟨generate first part of sum 18.2⟩+≡
sum += CGAL::MP_Float(a[0]);
sum += CGAL::MP_Float(a[m-1]);

To generate the second half, for each summand first an accurate floating-point approximation s of the already generated sum is computed. Then α·2^e − s is inserted as a summand, with α chosen as before and e decreasing linearly from c to 0.

⟨generate second part of sum 21.1⟩≡
for(int i=m; i<n; i++){
    const double factor = 1.0 - static_cast<double>(i-m)/static_cast<double>(n-m-1);
    const int e = static_cast<int>(factor * c);
    a[i] = ldexp((2*drand48()-1), e) - CGAL::to_double(sum);
    sum += CGAL::MP_Float(a[i]);
}
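Putting the chunks together, the generator can be approximated in a self-contained way as follows. We substitute std::mt19937 for drand48 and a long double accumulator for CGAL::MP_Float (both our substitutions, which make the running sum only approximately exact, but suffice to illustrate the construction).

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Sketch of the paper's generator: the first half of the summands is
// +/- alpha * 2^e with e in [0, c], the second half is alpha * 2^e minus
// the current approximation of the sum, with e decreasing linearly from
// c to 0. long double stands in for CGAL::MP_Float here.
std::vector<double>
sketch_random_sum(const int n, const int c, const unsigned seed){
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> unit(-1.0, 1.0);
    std::uniform_real_distribution<double> frac(0.0, 1.0);
    std::vector<double> a(n);
    long double sum = 0.0L;
    const int m = n / 2;
    a[0]   = std::ldexp(unit(gen), 0);                // force exponent 0
    a[m-1] = std::ldexp(unit(gen), c);                // force exponent c
    for (int i = 1; i < m-1; ++i)
        a[i] = std::ldexp(unit(gen), static_cast<int>(frac(gen) * c));
    for (int i = 0; i < m; ++i) sum += a[i];          // accumulate first half
    for (int i = m; i < n; ++i){                      // second half
        const double factor = 1.0 - static_cast<double>(i-m)/(n-m-1);
        a[i] = std::ldexp(unit(gen), static_cast<int>(factor * c))
               - static_cast<double>(sum);
        sum += a[i];
    }
    return a;
}
```

Because the last second-half summand uses exponent 0 and subtracts the running sum, the final sum has magnitude below 1 even though individual summands reach magnitude about 2^c, i.e. the sum is ill-conditioned.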

Here is how we create zero sums. We select a random summand and compute a best floating-point approximation s0 to the sum of the remaining summands. We replace the selected summand by −s0 and iterate until the overall sum is zero.

⟨random sum generation 18.1⟩+≡
void
modify_sum_to_zero(double *const a, const int n){
    CGAL::MP_Float sum(a[0]);
    for(int i=1; i<n; i++) sum += CGAL::MP_Float(a[i]);
    random_integer_generator rig;
    while(CGAL::to_double(sum) != 0.0){
        const int i = rig(n);
        sum -= CGAL::MP_Float(a[i]);
        a[i] = -CGAL::to_double(sum);
        sum += CGAL::MP_Float(a[i]);
    }
}
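The same idea can be sketched without CGAL. We restrict the summands to integer values, so plain double arithmetic is exact and a single replacement already zeroes the sum; the paper's CGAL::MP_Float accumulator handles the general case where several iterations may be needed. The function name and the mt19937-based index choice are our own.

```cpp
#include <cassert>
#include <random>

// Sketch of modify_sum_to_zero: repeatedly replace a random summand by the
// negated sum of the remaining summands until the total is zero. With
// integer-valued summands every operation below is exact, so the loop
// terminates after at most one replacement.
void sketch_modify_sum_to_zero(double *const a, const int n,
                               const unsigned seed){
    std::mt19937 gen(seed);
    std::uniform_int_distribution<int> pick(0, n-1);
    double sum = 0.0;
    for (int i = 0; i < n; ++i) sum += a[i];
    while (sum != 0.0){
        const int i = pick(gen);
        sum -= a[i];     // sum of the remaining summands
        a[i] = -sum;     // cancel the rest exactly
        sum += a[i];     // total is now zero
    }
}
```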

To ensure reproducibility we use our own random integer generator for shuffling the summands.

⟨random integer generator 21.3⟩≡
struct random_integer_generator{
    inline int
    operator()(const int n){
        double d = drand48();
        while(d == 1) d = drand48();
        return static_cast<int>(floor(n*d));
    }
};
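The generator maps a uniform draw from [0,1) to an integer in [0,n), which is exactly the interface std::random_shuffle expects. A portable stand-in (std::mt19937 replaces drand48, and the struct name is our own) can be checked directly:

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Portable stand-in for the paper's random_integer_generator: maps a
// uniform draw from [0,1) to an integer in [0,n). The endpoint guard
// mirrors the paper's rejection of d == 1.
struct portable_integer_generator{
    std::mt19937 gen;
    explicit portable_integer_generator(const unsigned seed) : gen(seed) {}
    int operator()(const int n){
        std::uniform_real_distribution<double> unit(0.0, 1.0);
        double d = unit(gen);
        while (d >= 1.0) d = unit(gen);   // reject the (excluded) endpoint
        return static_cast<int>(std::floor(n * d));
    }
};
```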

⟨sum_generators.hpp 21.4⟩≡
#ifndef SUM_GENERATORS_H
#define SUM_GENERATORS_H
void
generate_random_sum(double *const a, const int n, const int c);
void
modify_sum_to_zero(double *const a, const int n);
#endif//SUM_GENERATORS_H

⟨sum_generators.cpp 21.5⟩≡
#include <cstdlib>
#include <cmath>
#include <algorithm>
#include <CGAL/basic.h>
#include <CGAL/MP_Float.h>
⟨random integer generator 21.3⟩
⟨random sum generation 18.1⟩

4.3 Experimental results

We performed experiments on a notebook with an Intel Core 2 Duo T5500 processor with 1.66 GHz, using g++ 4.3.2 and cgal 3.3.1. Using the generator from above, we generate 10000 sums with l summands and a condition number of approximately 2^c, for l ∈ {16, 32, 64, 128, 256, 512} and c ∈ {4, 8, 16, 32, 64, 128}. We measure the running time, as well as the number of iterations of the essa main loop for each sum. Figure 1 and Figure 2 show the average running time in milliseconds for computing 10 times the sign of each sum. Figure 3 shows the minimum, average and maximum number of steps needed to compute the sign. We repeat the experiment after modifying each sum to have a value of zero. Note that this changes the condition number to ∞ and essa is required to eliminate all summands. The results are shown in Figure 4, Figure 5 and Figure 6. As competitors we use original essa, modified essa in round-up, revised essa (in round-to-nearest) and revised essa in round-towards-zero.

Revised essa distributes new summands to both sequences, while the other three competitors move both new summands to the sequence which previously had the larger leading summand. However, revised essa and revised essa in round-towards-zero are nearly indistinguishable, both with respect to running time and number of iterations. Controlling where new summands are moved does not seem to have a large impact.

With respect to running time, modified essa is an improvement over original essa, but revised essa is a clear winner. This seems partly to be caused by using standard floating-point operations only, since modified essa and revised essa are similar with respect to the number of iterations used. The number of iterations of original essa also shows why it performs badly. It has a larger variance and in general is larger than for any other essa variant. For many summands and large condition numbers its minimum is larger than the maximum number of iterations of any other essa variant.

The performance of revised essa seems to be invariant with respect to the condition number, both in the number of iterations and the running time. It rarely uses more than l iterations, leaving a large gap to the bound from Theorem 8. While the same holds for the number of iterations of modified essa, its running time increases with the condition number. The reason is probably that modified essa is not as good at reducing the number of summands, which makes heap maintenance more expensive.

5 Conclusion

Our revised essa is clearly an improvement over previously existing essa variants. Revised essa reduces the number of summands if and only if the subtraction of the two leading elements can be performed without rounding error. It also maximizes the cancellation in each iteration. Both properties lead to a more compact representation of the sum in each step, which helps to meet the termination criteria early. It is, however, not competitive with other algorithms that compute the sign of a sum of floating-point numbers exactly [11]. These algorithms avoid branches in their inner loops, which are essential for the heap maintenance in essa. Unlike these algorithms, however, revised essa is completely immune to overflow and underflow and runs correctly in any rounding mode. It might therefore have applications where these properties are required.

[Figure 1: running-time plots for condition numbers ≈ 2^4, 2^8, 2^16, 2^32 and 2^64, with curves for revised essa in round-towards-zero, revised essa, modified essa in round-up, and original essa.]

Figure 1: Running times for nonzero sums. The x-axis is labeled with the number of summands.

[Figure 2: running-time plots with curves for revised essa in round-towards-zero, revised essa, modified essa in round-up, and original essa.]

Figure 2: Running times for nonzero sums. The x-axis is labeled with c, where 2^c is the approximate condition number.

[Figure 3: six panels (16, 32, 64, 128, 256 and 512 summands), each plotting iteration counts against c ∈ {4, 8, 16, 32, 64, 128}, with curves for revised essa in round-towards-zero, revised essa, modified essa in round-up, and original essa.]

Figure 3: Number of iterations of the main loop for nonzero sums. The x-axis is labeled with c, where 2^c is the approximate condition number.

[Figure 4: running-time plots for original condition numbers ≈ 2^4, 2^8, 2^16, 2^32 and 2^64, with curves for revised essa in round-towards-zero, revised essa, modified essa in round-up, and original essa.]

Figure 4: Running times for zero sums. The x-axis is labeled with the number of summands.

[Figure 5: running-time plots with curves for revised essa in round-towards-zero, revised essa, modified essa in round-up, and original essa.]

Figure 5: Running times for zero sums. The x-axis is labeled with c, where 2^c is the approximate condition number of the sum before it was modified to be zero.

[Figure 6: iteration-count plots with curves for revised essa in round-towards-zero, revised essa, modified essa in round-up, and original essa.]

Figure 6: Number of iterations of the main loop for zero sums. The x-axis is labeled with c, where 2^c is the approximate condition number of the sum before it was modified to be zero.

References

[1] IEEE Standards Committee 754. IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Standard 754-1985. Institute of Electrical and Electronics Engineers, New York, 1985. Reprinted in SIGPLAN Notices, 22(2):9–25, 1987.

[2] David Cordes and Marcus Brown. The literate-programming paradigm. Computer, 24(6):52–61, 1991.

[3] Theodorus J. Dekker. A floating-point technique for extending the available precision. Numerische Mathematik, 18(2):224–242, June 1971.

[4] Marina Gavrilova, Dmitri Gavrilova, and Jon G. Rokne. New algorithms for the exact computation of the sign of algebraic expressions. In Canadian Conference on Electrical and Computer Engineering, pages 314–317, May 1996.

[5] Marina Gavrilova, Helmut Ratschek, and Jon G. Rokne. Exact computation of Voronoi diagram and Delaunay triangulation. Journal of Reliable Computing, 6(1):39–60, 2000.

[6] Marina Gavrilova and Jon G. Rokne. Reliable line segment intersection testing. Computer-Aided Design, 32(12):737–745, October 2000.

[7] Marina Gavrilova and Jon G. Rokne. Computing line intersections. International Journal of Image and Graphics, 1(2):1–14, May 2001.

[8] Donald E. Knuth. Literate programming. Comput. J., 27(2):97–111, 1984.

[9] Georg Mackenbrock, Helmut Ratschek, and Jon G. Rokne. Experimental reliable code for 2D convex hull construction, 1998. http://pages.cpsc.ucalgary.ca/~rokne/convex/.

[10] Daniel Mall. Web pages on literate programming. http://www.literateprogramming.com/.

[11] Marc Mörig and Stefan Schirra. On the design and performance of reliable geometric predicates using error-free transformations and exact sign of sum algorithms. In 19th Canadian Conference on Computational Geometry (CCCG'07), pages 45–48, August 2007.

[12] Takeshi Ogita, Siegfried M. Rump, and Shin’ichi Oishi. Accurate sum and dot product. SIAM Journal on Scientific Computing, 26(6):1955–1988, 2005.

[13] Michael L. Overton. Numerical Computing with IEEE Floating-Point Arithmetic. SIAM, 2001.

[14] Helmut Ratschek and Jon G. Rokne. Exact computation of the sign of a finite sum. Appl. Math. Comput., 99(2-3):99–127, 1999.

[15] Helmut Ratschek and Jon G. Rokne. Exact and optimal convex hulls in 2D. Int. J. Comput. Geometry Appl., 10(2):109–129, 2000.

[16] Helmut Ratschek and Jon G. Rokne. Geometric computations with interval and new robust methods: applications in computer graphics, GIS and computational geometry. Horwood Publishing, Ltd., Chichester, UK, 2003.

[17] Siegfried M. Rump, Takeshi Ogita, and Shin'ichi Oishi. Accurate floating-point summation. Technical Report 05.1, Faculty of Information and Communication Science, Hamburg University of Technology, 2005.

[18] Pat H. Sterbenz. Floating-Point Computation. Prentice-Hall, 1974.