
Parallel Solution of Large Sparse Linear Systems, SS 2015
Exercise sheet 3
Prof. Dr. Peter Bastian, Marian Piatkowski
Deadline 1st June 2015
IWR, Universität Heidelberg

EXERCISE 6 THE PARALLEL RICHARDSON ITERATION

We want to solve the linear system $Ax = b$ with the Richardson iteration

$$x^{(k+1)} = x^{(k)} + \omega \left( b - A x^{(k)} \right).$$

Let $A$ be the stiffness matrix coming from the discretization of the Poisson equation on the unit square with $P_1$ finite elements. We use a structured simplicial mesh with $N = n^2$ degrees of freedom.
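As a point of reference, the serial iteration can be sketched in a few lines. The sketch assumes that the $P_1$ stiffness matrix on the structured mesh reduces to the 5-point stencil (4 on the diagonal, $-1$ for each grid neighbour) with homogeneous Dirichlet data; the function names, the right-hand side and the value of $\omega$ are illustrative choices and not part of the exercise.

```python
import numpy as np

def apply_A(x):
    """Apply the assumed 5-point stencil to an n x n array of unknowns.

    Values outside the grid are treated as zero (homogeneous Dirichlet data).
    """
    xp = np.pad(x, 1)  # add a layer of zeros around the grid
    return (4.0 * x
            - xp[:-2, 1:-1] - xp[2:, 1:-1]    # neighbours in the first index
            - xp[1:-1, :-2] - xp[1:-1, 2:])   # neighbours in the second index

def richardson(b, omega, iterations):
    """Richardson iteration x <- x + omega * (b - A x), starting from zero."""
    x = np.zeros_like(b)
    for _ in range(iterations):
        x = x + omega * (b - apply_A(x))
    return x

n = 8                      # small grid, for illustration only
b = np.ones((n, n))        # arbitrary right-hand side
# omega = 0.2 is an assumed damping value; it is below 2 / lambda_max
# because the eigenvalues of this stencil are smaller than 8.
x = richardson(b, omega=0.2, iterations=2000)
residual = np.linalg.norm(b - apply_A(x))
print(f"residual norm after 2000 iterations: {residual:.2e}")
```

The matrix is never assembled here; each iteration only needs the action of $A$ on a vector, which is also what makes the row-wise distribution over processors natural.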

To reduce the computation time we want to perform the iteration in parallel with $p$ processors.

Therefore we subdivide the unit square into $p$ smaller squares and the degrees of freedom are distributed accordingly. With this partitioning we assume that $p$ is a square number such that every processor has $(n/\sqrt{p})^2$ degrees of freedom. The index set of the degrees of freedom is denoted by $I$, the index set of the $i$-th processor by $I_i$. Every processor stores the entries of $x^{(k)}$ corresponding to its degrees of freedom and the relevant rows of $A$.

One iteration of the parallel Richardson iteration consists of the following steps:

• Communication of required entries of $x^{(k)}$ from the neighbouring processors.

• Calculation of $x^{(k+1)}$.
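The two steps can be emulated on a single machine to check that a block-wise update reproduces the serial one. The following is only a sketch: it assumes a 5-point stencil for $A$, so that only edge neighbours exchange data, and it replaces real message passing by array copies into a ghost layer. The helpers `stencil`, `split` and `parallel_step` are hypothetical names introduced for illustration.

```python
import numpy as np

def stencil(xp):
    """Assumed 5-point stencil applied to the interior of a padded array."""
    return (4.0 * xp[1:-1, 1:-1]
            - xp[:-2, 1:-1] - xp[2:, 1:-1]
            - xp[1:-1, :-2] - xp[1:-1, 2:])

def split(x, q):
    """Cut an n x n array into a q x q list of equally sized square blocks."""
    m = x.shape[0] // q
    return [[x[r*m:(r+1)*m, c*m:(c+1)*m].copy() for c in range(q)]
            for r in range(q)]

def parallel_step(blocks, b_blocks, omega):
    """One Richardson step on q x q blocks (q*q emulated processors)."""
    q = len(blocks)
    # Step 1: "communication". Each block gets a ghost layer that is filled
    # with the adjacent row or column of its neighbours (zero at the boundary).
    padded = [[np.pad(blocks[r][c], 1) for c in range(q)] for r in range(q)]
    for r in range(q):
        for c in range(q):
            g = padded[r][c]
            if r > 0:
                g[0, 1:-1] = blocks[r-1][c][-1, :]
            if r < q - 1:
                g[-1, 1:-1] = blocks[r+1][c][0, :]
            if c > 0:
                g[1:-1, 0] = blocks[r][c-1][:, -1]
            if c < q - 1:
                g[1:-1, -1] = blocks[r][c+1][:, 0]
    # Step 2: purely local computation of the new iterate.
    return [[blocks[r][c] + omega * (b_blocks[r][c] - stencil(padded[r][c]))
             for c in range(q)] for r in range(q)]

n, q, omega = 8, 2, 0.2          # illustrative sizes: p = q*q = 4
rng = np.random.default_rng(0)
x = rng.random((n, n))
b = rng.random((n, n))

serial = x + omega * (b - stencil(np.pad(x, 1)))
parallel = np.block(parallel_step(split(x, q), split(b, q), omega))
print("block-wise step matches serial step:", np.allclose(serial, parallel))
```

In an actual distributed implementation the ghost-layer copies would become messages between processors; the emulation only illustrates the data dependencies.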

1. Describe the index sets $I_i$ and specify which entries of $x^{(k)}$ processor $i$ has to communicate with which processor.

2. The computation time for an arbitrary arithmetic operation (addition, subtraction or multiplication) is denoted by $t_{op}$, the time needed for sending one byte to another processor by $t_{byte}$, and the time needed to set up a message to another processor by $t_{msg}$.

Derive a formula for the total computation time of one iteration with $p$ processors. The entries of $x^{(k)}$ are stored in double precision such that every entry requires 8 bytes of memory.

The formula only has to be asymptotically correct. The matrix rows of the nodes next to the boundary, which have fewer entries, can be treated like rows of interior nodes.

3. Present the speedup of the parallel iteration in a table using the following parameters:

$t_{op} = 2\,\text{ns}, \quad t_{byte} = 20\,\text{ns}, \quad t_{msg} = 5000\,\text{ns}$

$n \in \{1024, 4096\}, \quad p \in \{1, 4, 16, 256, 4096\}$
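Once a cost formula has been derived, the table can be generated mechanically. The script below is a sketch of that bookkeeping only: the cost model inside `time_per_iteration` is a placeholder assumption (12 operations per degree of freedom, 4 neighbour messages of $n/\sqrt{p}$ doubles each) and is not the reference solution; replace it with the formula from part 2.

```python
import math

# Machine parameters from the sheet, converted to seconds.
T_OP, T_BYTE, T_MSG = 2e-9, 20e-9, 5000e-9

def time_per_iteration(n, p, ops_per_dof=12, neighbours=4):
    """Hypothetical asymptotic cost of one iteration (an assumption).

    ops_per_dof and neighbours are illustrative defaults, not given values.
    """
    compute = ops_per_dof * (n * n / p) * T_OP
    if p == 1:
        return compute                     # no communication on one processor
    edge = n / math.sqrt(p)                # entries along one subdomain edge
    communicate = neighbours * (T_MSG + 8 * edge * T_BYTE)
    return compute + communicate

def speedup(n, p):
    """Speedup S(p) = T(1) / T(p) for a fixed problem size n."""
    return time_per_iteration(n, 1) / time_per_iteration(n, p)

for n in (1024, 4096):
    for p in (1, 4, 16, 256, 4096):
        print(f"n = {n:5d}   p = {p:5d}   speedup = {speedup(n, p):9.2f}")
```

Keeping the cost model in one function makes it easy to see how the ratio of $t_{msg}$ to $t_{op}$ limits the speedup for large $p$.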

12 Points


EXERCISE 7 DOMAIN DECOMPOSITION

In the lecture you have learned the following theorem:

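Let $\Omega \subset \mathbb{R}^d$ be a Lipschitz domain (open, bounded and connected) and let $f \in L^2(\Omega)$. Then the Poisson problem

$$
\begin{aligned}
-\Delta u(x) &= f(x) && \forall x \in \Omega, \\
u(x) &= 0 && \forall x \in \partial\Omega
\end{aligned}
\tag{1}
$$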

is equivalent to the solution of the two subproblems

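$$
\begin{aligned}
-\Delta u_1(x) &= f(x) && \forall x \in \Omega_1, \\
u_1(x) &= 0 && \forall x \in \partial\Omega_1 \setminus \Gamma, \\
u_1(x) &= u_2(x) && \forall x \in \Gamma, \\
\frac{\partial u_1(x)}{\partial n_1} &= -\frac{\partial u_2(x)}{\partial n_2} && \forall x \in \Gamma, \\
-\Delta u_2(x) &= f(x) && \forall x \in \Omega_2, \\
u_2(x) &= 0 && \forall x \in \partial\Omega_2 \setminus \Gamma
\end{aligned}
\tag{2}
$$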

with the non-overlapping decomposition

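$\Omega = \Omega_1 \cup \Omega_2$, $\Omega_1 \cap \Omega_2 = \emptyset$, $\Gamma = \partial\Omega_1 \cap \partial\Omega_2$, $\mu(\partial\Omega_i) > 0$, such that the $\partial\Omega_i$ are Lipschitz continuous.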

The theory for the Poisson equation can be formulated for right-hand sides $f \in H^{-1}(\Omega)$ as well.

But for the general assumption $f \in H^{-1}(\Omega)$, the equivalence (1) $\Leftrightarrow$ (2) does not hold. To see this we are going to consider the following counterexample in one dimension for $\Omega = (-1, 1)$:

$$
\begin{aligned}
-\Delta u(x) &= -2\delta(x) && \text{in } \Omega, \\
u(-1) &= u(1) = 0
\end{aligned}
$$

where $\delta(x)$ denotes the Dirac delta function.
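The counterexample can also be explored numerically before it is treated analytically. The sketch below assumes a $P_1$ finite element discretization on a uniform mesh that has a node at $x = 0$, with the Dirac functional evaluated as $\langle -2\delta, \varphi_j \rangle = -2\varphi_j(0)$; the function name `solve_dirac_problem` and the mesh size are illustrative choices.

```python
import numpy as np

def solve_dirac_problem(m):
    """P1 finite elements for -u'' = -2*delta on (-1, 1), u(-1) = u(1) = 0.

    m is the number of elements; it must be even so that x = 0 is a node.
    """
    h = 2.0 / m
    nodes = np.linspace(-1.0, 1.0, m + 1)
    ni = m - 1                                   # number of interior nodes
    # 1D stiffness matrix tridiag(-1, 2, -1) / h on the interior nodes.
    A = (2.0 * np.eye(ni) - np.eye(ni, k=1) - np.eye(ni, k=-1)) / h
    rhs = np.zeros(ni)
    rhs[m // 2 - 1] = -2.0                       # -2 * phi_j(0) at the node x = 0
    u = np.zeros(m + 1)                          # boundary values stay zero
    u[1:-1] = np.linalg.solve(A, rhs)
    return nodes, u

nodes, u = solve_dirac_problem(m=64)
mid = len(nodes) // 2                            # index of the node x = 0
h = nodes[1] - nodes[0]
left_slope = (u[mid] - u[mid - 1]) / h           # derivative just left of 0
right_slope = (u[mid + 1] - u[mid]) / h          # derivative just right of 0
jump = right_slope - left_slope
print(f"u(0) = {u[mid]:.4f}, left slope = {left_slope:.4f}, "
      f"right slope = {right_slope:.4f}, jump = {jump:.4f}")
```

Comparing the two one-sided slopes at $x = 0$ with the flux condition in (2), taking $\Gamma = \{0\}$, gives a concrete picture of what the tasks below ask for.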

1. Find the unique weak solution $u \in H^1(\Omega)$.

2. What are the transmission conditions of $u(x)$ on $\Gamma$? Compare them to the transmission conditions given in (2).

3. What changes in the derivation of the transmission conditions if we again take a right-hand side $f \in L^2(\Omega)$? Hint: Cauchy-Schwarz inequality.

8 Points