
IWR, University of Heidelberg Winter term 2015/16

Exercise Sheet 2 29. October 2015

Exercise for Course

Parallel High-Performance Computing Dr. S. Lang

Hand in: 5 November 2015, at the beginning of the exercise session or earlier

Task 3 C++ Introduction: Debugging (5 points)

The following program is supposed to add all natural numbers from a given $a \in \mathbb{N}$ to $b \in \mathbb{N}$:

    #include <iostream>

    // sums all natural numbers in [a, b]
    int sum(int a, int b)
    {
        int result;
        for (int i = a; i <= b; i++)
        {
            int result = result + i;
        }

        return 0;
    }

    int main()
    {
        std::cout << sum(1, 10) << std::endl;
        return 0;
    }

Although the program is syntactically correct, it computes the wrong result. Find the errors and correct them without changing the purpose of the program. What do you have to change if natural numbers are still to be added, but the bounds are real numbers, $a, b \in \mathbb{R}$?

Tip: If you want to try compiling the program on a computer, stored in a file debug.cc, the C++ compiler invoked with the option -Wall gives you hints about erroneous code. On the command line, compile with: g++ -Wall debug.cc
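For comparison after working on the task, one possible correction could look like the following sketch. It assumes that the errors to be found are the uninitialised accumulator, the redeclaration of result inside the loop, and the constant return value. The helper sum_real and its treatment of real bounds (only the integers enclosed by [a, b] are added, negative values are skipped) are assumptions made for illustration, not the official solution.

```cpp
#include <cassert>
#include <cmath>

// Sums all natural numbers in [a, b]; returns 0 if the range is empty.
int sum(int a, int b)
{
    int result = 0;                  // accumulator starts at zero
    for (int i = a; i <= b; i++)
    {
        result = result + i;         // update the outer variable, no redeclaration
    }
    return result;                   // return the sum instead of the constant 0
}

// Hypothetical variant for real bounds: adds the natural numbers inside [a, b].
int sum_real(double a, double b)
{
    assert(std::isfinite(a) && std::isfinite(b));
    double lo = std::ceil(a);        // smallest integer >= a
    if (lo < 0.0) lo = 0.0;          // natural numbers are non-negative
    const double hi = std::floor(b); // largest integer <= b
    if (hi < lo) return 0;           // no natural number inside the interval
    return sum(static_cast<int>(lo), static_cast<int>(hi));
}
```

With these definitions, sum(1, 10) and sum_real(0.5, 10.7) both add the numbers 1 to 10.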

Task 4 Measurement of MFLOPS (15 points)

In this task we want to measure, for two numerical applications, how many arithmetic operations per second are achievable on our pool machines. To this end we implement the following mathematical operations:

1. Matrix Multiplication.

Given two matrices $A, B \in \mathbb{R}^{n \times n}$, the matrix product $C = AB$ is again a matrix $C \in \mathbb{R}^{n \times n}$ with the entries

$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}.$$

2. Gauss-Seidel 2D.

Given a domain in $d$ dimensions defined by

$$\Omega_n^d = \left\{ (i_0, \ldots, i_{d-1}) \in \mathbb{Z}^d \;\middle|\; \forall\, 0 \le k < d:\ 0 \le i_k < n \right\}.$$

In 2D this is, for example, a mesh with $n^2$ points. We choose a mesh with equidistant points, therefore $\Omega = [0, n-1]^2$. On this mesh a mesh function $u^m : \Omega_n^2 \to \mathbb{R}$ is defined. For this function the iteration procedure

$$u^{m+1}(i, j) = \frac{1}{4} \left\{ u^{m+1}(i-1, j) + u^{m+1}(i, j-1) + u^{m}(i, j+1) + u^{m}(i+1, j) \right\}, \qquad (i, j) \in [1, n-1]^2,$$

defines the so-called Gauss-Seidel iteration.
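The iteration can be carried out in place, because the neighbours (i−1, j) and (i, j−1) have already been overwritten with the new values when point (i, j) is reached. The following minimal sketch shows one such sweep. The row-major storage in a std::vector, the function name gauss_seidel_sweep, and the choice to update only points whose four neighbours lie inside the mesh (boundary values stay fixed) are assumptions, not prescribed by the sheet.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One in-place Gauss-Seidel sweep over the interior of an n x n mesh stored
// row-major in u (entry (i, j) at index i * n + j). Boundary values are kept.
// Returns the number of floating point operations performed.
std::size_t gauss_seidel_sweep(std::vector<double>& u, std::size_t n)
{
    assert(u.size() == n * n);
    if (n < 3) return 0; // no interior points
    for (std::size_t i = 1; i + 1 < n; ++i)
    {
        for (std::size_t j = 1; j + 1 < n; ++j)
        {
            // u(i-1, j) and u(i, j-1) already hold values of iteration m+1,
            // u(i, j+1) and u(i+1, j) still hold values of iteration m.
            u[i * n + j] = 0.25 * (u[(i - 1) * n + j] + u[i * n + (j - 1)]
                                 + u[i * n + (j + 1)] + u[(i + 1) * n + j]);
        }
    }
    return 4 * (n - 2) * (n - 2); // 3 additions + 1 multiplication per point
}
```

The returned operation count, divided by the measured runtime, gives the rate needed for the MFLOPS figure.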


Subtask (a) (5 points) Implement the matrix multiplication in the programming language C/C++ and store the matrices in a data structure of your choice (e.g. one-dimensional or two-dimensional arrays or std::vector). Determine the number of floating point operations and, from this number and the measured runtime, calculate the speed of the program in "Million FLoating point OPerations per Second" (MFLOPS).

For time measurement you can use the functions provided in timer.h. The header file is available on the lecture homepage; hints on its usage are given at the end of this exercise sheet. Choose the problem size $n$ large enough that the measurement error does not distort the timing; on the pool machines this means about $n \ge 1000$. Initialise the arrays with meaningful data (not 0.0), e.g. $u(i, j) = i + j$.

Compile the program with the maximum optimization level. For the GNU C/C++ compiler, e.g. -O3 -funroll-loops is recommended.

Visualize all results graphically as MFLOPS over the problem size $n$. Discuss the shape of the curve, in particular why and when the MFLOPS rate decreases. For generating the plots, the program gnuplot, which is also installed in the pool, is recommended.
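As a starting point, the measurement for the matrix multiplication could be organised as in the following sketch. It assumes row-major storage in a std::vector and uses std::clock from the standard library for the processor time, because timer.h is not reproduced here; all function names are hypothetical. The triple loop performs $n^3$ multiplications and $n^3$ additions, which gives $2n^3$ floating point operations.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <vector>

using Matrix = std::vector<double>; // n x n, row-major: entry (i, j) at i * n + j

// C = A * B with the straightforward triple loop.
void matmul(const Matrix& A, const Matrix& B, Matrix& C, std::size_t n)
{
    assert(A.size() == n * n && B.size() == n * n && C.size() == n * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
        {
            double sum = 0.0;
            for (std::size_t k = 0; k < n; ++k)
                sum += A[i * n + k] * B[k * n + j]; // 1 multiplication + 1 addition
            C[i * n + j] = sum;
        }
}

// Floating point operations of matmul: n^3 multiplications and n^3 additions.
double matmul_flops(std::size_t n)
{
    const double nd = static_cast<double>(n);
    return 2.0 * nd * nd * nd;
}

// Converts an operation count and a runtime in seconds to MFLOPS.
double mflops(double flops, double seconds)
{
    return flops / (seconds * 1.0e6);
}

// Runs one multiplication of size n and returns the measured MFLOPS rate
// (returns 0 if the run was too short for the clock resolution).
double measure_matmul_mflops(std::size_t n)
{
    Matrix A(n * n), B(n * n), C(n * n, 0.0);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
        {
            A[i * n + j] = static_cast<double>(i + j); // meaningful data, not 0.0
            B[i * n + j] = static_cast<double>(i + j);
        }
    const std::clock_t start = std::clock();
    matmul(A, B, C, n);
    const std::clock_t stop = std::clock();
    volatile double sink = C[0]; // use the result so the work is not optimised away
    (void)sink;
    const double seconds = static_cast<double>(stop - start) / CLOCKS_PER_SEC;
    return seconds > 0.0 ? mflops(matmul_flops(n), seconds) : 0.0;
}
```

Calling measure_matmul_mflops for a range of sizes and printing one line "n rate" per size yields a data file that gnuplot can plot directly.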

Subtask (b) (5 points)

Repeat the investigations of subtask (a) for the Gauss-Seidel scheme.

Subtask (c) (5 points)

Improve the cache usage of the matrix multiplication by tiling, as proposed in the lecture, and determine the speedup for different block sizes.
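The scheme from the lecture is not reproduced on this sheet, so the following is only a sketch of one common tiling variant: the three loops run over square tiles of edge length bs, and each pair of tiles is multiplied completely before the next one is loaded. The name matmul_tiled, the parameter bs, and the row-major storage are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// C = A * B for row-major n x n matrices, computed tile by tile.
// bs is the edge length of a tile (block size); it need not divide n.
void matmul_tiled(const std::vector<double>& A, const std::vector<double>& B,
                  std::vector<double>& C, std::size_t n, std::size_t bs)
{
    assert(bs > 0);
    assert(A.size() == n * n && B.size() == n * n && C.size() == n * n);
    std::fill(C.begin(), C.end(), 0.0);
    for (std::size_t ii = 0; ii < n; ii += bs)
        for (std::size_t kk = 0; kk < n; kk += bs)
            for (std::size_t jj = 0; jj < n; jj += bs)
            {
                const std::size_t i_end = std::min(ii + bs, n);
                const std::size_t k_end = std::min(kk + bs, n);
                const std::size_t j_end = std::min(jj + bs, n);
                // Multiply tile (ii, kk) of A with tile (kk, jj) of B and
                // accumulate the product into tile (ii, jj) of C.
                for (std::size_t i = ii; i < i_end; ++i)
                    for (std::size_t k = kk; k < k_end; ++k)
                    {
                        const double a = A[i * n + k];
                        for (std::size_t j = jj; j < j_end; ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
            }
}
```

The operation count is unchanged at $2n^3$, so the speedup for a given block size is simply the ratio of the untiled to the tiled runtime.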

Remarks for Time Measurement

The different timings

When measuring time on a computer, the problem arises that the time a program needs depends on the load of the whole system. If many processes are active, a single process receives only a small share of the processor and therefore takes correspondingly long in wall-clock time. The processor time, in contrast, measures how many seconds the processor has actually spent executing the program. This clock ticks only while the program is running; while the process is idle, the clock stands still.
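The difference between the two clocks can be made visible with a process that is deliberately idle. The sketch below uses only the C++ standard library (std::chrono::steady_clock for wall-clock time, std::clock for processor time) rather than timer.h; the helper time_sleep and the struct Timings are hypothetical names. On POSIX systems std::clock reports processor time, so the sleeping process should accumulate almost none.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <thread>

struct Timings
{
    double wall_seconds; // elapsed real time
    double cpu_seconds;  // processor time consumed by this process
};

// Hypothetical helper: sleeps for the given number of milliseconds and
// reports how much wall-clock and processor time passed in the meantime.
Timings time_sleep(int milliseconds)
{
    assert(milliseconds >= 0);
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    std::this_thread::sleep_for(std::chrono::milliseconds(milliseconds));

    const std::clock_t cpu_stop = std::clock();
    const auto wall_stop = std::chrono::steady_clock::now();

    Timings t;
    t.wall_seconds = std::chrono::duration<double>(wall_stop - wall_start).count();
    t.cpu_seconds = static_cast<double>(cpu_stop - cpu_start) / CLOCKS_PER_SEC;
    return t;
}
```

For a compute-bound loop on an otherwise unloaded machine the two values would be close to each other instead.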

timer.h

The header file timer.h implements several auxiliary functions that read the processor time used. Three functions are available:

• void reset_timer(struct timeval* timer): reset/initialise counter.

• double get_timer(struct timeval timer): read used seconds.

• void print_timer(struct timeval timer): print used seconds.

Example

    #include "timer.h" // header file for time measurement

    int main()
    {
        struct timeval timer; // variable for time measurement
        reset_timer(&timer);  // reset and initialize counter
        ...                   // do something that needs time
        print_timer(timer);   // print counter
    }

More about how the time is measured internally can be found in the man page getrusage(2).
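The contents of timer.h are not shown here. Purely as an illustration of how such functions could be built on getrusage, the following sketch stores the user processor time as a reference point and later reports the difference. The names sketch_reset_timer and sketch_get_timer are hypothetical stand-ins, and the real header may work differently (for example by also counting system time).

```cpp
#include <cassert>
#include <sys/resource.h> // getrusage, struct rusage (POSIX)
#include <sys/time.h>     // struct timeval

// Hypothetical stand-ins for the timer.h functions; the real header may differ.

// Stores the user processor time consumed so far as the reference point.
void sketch_reset_timer(struct timeval* timer)
{
    struct rusage usage;
    const int rc = getrusage(RUSAGE_SELF, &usage);
    assert(rc == 0);
    (void)rc;
    *timer = usage.ru_utime;
}

// Returns the user processor time in seconds consumed since the reference point.
double sketch_get_timer(struct timeval timer)
{
    struct rusage usage;
    const int rc = getrusage(RUSAGE_SELF, &usage);
    assert(rc == 0);
    (void)rc;
    return static_cast<double>(usage.ru_utime.tv_sec - timer.tv_sec)
         + static_cast<double>(usage.ru_utime.tv_usec - timer.tv_usec) * 1.0e-6;
}
```

Because only the difference between two readings is returned, work done before the reset does not enter the measurement.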
