Prof. G. Zachmann

Christoph Schröder (schroeder.c@cs.uni-bremen.de)

University of Bremen

School of Computer Science

CGVR Group

Summer Semester 2020

September 16, 2020

Assignment on Massively Parallel Algorithms - Sheet 3

Due Date

Exercise 1 (Histogram, Credits)

In class, you have learned the histogram algorithm (which uses atomic operations).

• What is the worst-case input? (in the sense that the GPU algorithm will take the longest time)

• What is the best-case input?

• In the best case, what is the probability that any two threads access the same memory location?

Consider 1024 bins and 64 threads and only one warp. It could help to think about the probability of no collision.
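The hint about the probability of no collision can be explored numerically. The following is a minimal sketch, not a reference solution. It assumes that each thread picks its bin independently and uniformly at random, which is one possible reading of the best case and is not stated on the sheet. The function name `prob_no_collision` is hypothetical.

```python
def prob_no_collision(num_bins, num_threads):
    """Probability that all threads hit pairwise distinct bins.

    Assumption (not from the sheet): every thread chooses its bin
    independently and uniformly at random among num_bins bins.
    """
    if num_threads > num_bins:
        return 0.0  # pigeonhole: at least two threads must share a bin
    p = 1.0
    for i in range(num_threads):
        # Thread i must avoid the i bins already taken by earlier threads.
        p *= (num_bins - i) / num_bins
    return p


if __name__ == "__main__":
    p_none = prob_no_collision(1024, 64)
    print(f"P(no collision)  = {p_none:.4f}")
    print(f"P(any collision) = {1.0 - p_none:.4f}")
```

Under this assumption, the probability that at least two threads collide is one minus the returned value.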

Exercise 2 (Matrix Vector Multiplication, Credits)

In the given FrameworkMatrixVectorMul, matrix A is stored in row-major order.

Your tasks are the following:

a) Implement a matrix-vector multiplication kernel for the matrix above, stored in row-major order.

b) Implement a method that stores the matrix in column-major order, then modify the kernel from a) so that it handles a matrix stored in column-major order.

c) Compare the run times of the two implementations (row-major vs. column-major order) for different matrix sizes, and explain the differences or similarities you observe.
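To check kernel results on the host, a small CPU reference can help. The sketch below is an illustration only, not part of the framework: all function names are hypothetical, and it assumes the matrix is held as a flat list of `rows * cols` values. It shows how the two storage orders differ only in the index computation.

```python
def matvec_row_major(a, x, rows, cols):
    """y = A x, with A flattened row by row: A[i][j] is a[i * cols + j]."""
    y = [0.0] * rows
    for i in range(rows):  # on a GPU, one thread could handle one row i
        acc = 0.0
        for j in range(cols):
            acc += a[i * cols + j] * x[j]
        y[i] = acc
    return y


def to_column_major(a, rows, cols):
    """Copy a row-major flat matrix into column-major layout."""
    out = [0.0] * (rows * cols)
    for i in range(rows):
        for j in range(cols):
            out[j * rows + i] = a[i * cols + j]
    return out


def matvec_col_major(a_cm, x, rows, cols):
    """y = A x, with A flattened column by column: A[i][j] is a_cm[j * rows + i]."""
    y = [0.0] * rows
    for i in range(rows):
        acc = 0.0
        for j in range(cols):
            acc += a_cm[j * rows + i] * x[j]
        y[i] = acc
    return y


if __name__ == "__main__":
    a = [1.0, 2.0, 3.0,
         4.0, 5.0, 6.0]  # 2 x 3 matrix in row-major order
    x = [1.0, 0.0, -1.0]
    print(matvec_row_major(a, x, 2, 3))
    print(matvec_col_major(to_column_major(a, 2, 3), x, 2, 3))
```

Both calls should print the same vector. With one thread per row, neighbouring threads read addresses `cols` elements apart in the row-major layout but adjacent addresses in the column-major layout, which is one aspect worth examining when comparing run times in c).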
