
Prof. G. Zachmann

Christoph Schröder (schroeder.c@cs.uni-bremen.de)

University of Bremen, School of Computer Science, CGVR Group

September 17, 2020

Summer Semester 2020

Assignment on Massively Parallel Algorithms - Sheet 4

Due Date

Exercise 1 (Segmented Scan, 2 Credits)

Assume you are given an input vector together with a flags array (head-tail flags). Assume further that all segments have a power-of-2 length (not necessarily the same length).

Describe how you can modify the standard scan algorithm so that it computes the segmented scan (segmented prefix-sum). You can start with either Hillis-Steele or Blelloch, but we find it easier to describe using Blelloch’s algorithm.

Note that you can describe your modifications as a sequential algorithm. Also note that you do not have to write pseudo-code; a clear description suffices. However, your algorithm must contain at least parts of the standard scan algorithm.
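To make the expected result concrete, the following is a minimal sequential reference for what a segmented prefix-sum produces. It is not the modified scan algorithm the exercise asks for, only a way to check results. It assumes an inclusive scan and that a flag value of 1 marks the first element (head) of a segment; the function name `segmentedScanReference` is hypothetical, and the lecture may use a different flag convention or the exclusive variant.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential reference for an inclusive segmented prefix-sum.
// Assumption: flags[i] == 1 marks the head (first element) of a segment.
std::vector<int> segmentedScanReference(const std::vector<int>& values,
                                        const std::vector<int>& flags) {
    assert(values.size() == flags.size());
    std::vector<int> result(values.size());
    int running = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (flags[i] == 1) {
            running = 0;  // a head flag restarts the running sum
        }
        running += values[i];
        result[i] = running;
    }
    return result;
}
```

For example, the values 1 2 3 4 5 6 with flags 1 0 1 0 0 0 form two segments of lengths 2 and 4, and the reference yields 1 3 3 7 12 18.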

Exercise 2 (Line of sight using max scan operation, 5 Credits)

The given framework LineOfSight generates two image files "height_field_{cpu,gpu}.bmp". The images plot an arbitrary height map generated by a sine function. A line from the top left corner to the bottom right corner color-codes the visibility along that ray. The visibility is computed using the Line of Sight concept presented in the lecture. Blue marks points that are visible and red marks points that are not visible in the view direction (line of sight ray), see Figure 1.
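As a rough sequential illustration of how a max scan yields visibility, the sketch below computes a vertical angle per sample, takes the inclusive running maximum, and marks a sample visible when its own angle equals that maximum. It does not use the framework's utility functions: the name `lineOfSightReference`, the observer placed at sample 0, and the uniform sample `spacing` are all assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Assumptions: heights[0] is the observer, and sample i lies at
// horizontal distance i * spacing from the observer along the ray.
std::vector<bool> lineOfSightReference(const std::vector<float>& heights,
                                       float spacing) {
    assert(spacing > 0.0f);
    const std::size_t n = heights.size();
    std::vector<float> angles(n, -std::numeric_limits<float>::infinity());
    for (std::size_t i = 1; i < n; ++i) {
        // Vertical angle from the observer to sample i.
        angles[i] = std::atan2(heights[i] - heights[0],
                               static_cast<float>(i) * spacing);
    }
    // Inclusive max scan, written as a plain sequential loop.
    std::vector<float> scan(angles);
    for (std::size_t i = 1; i < n; ++i) {
        scan[i] = std::max(scan[i - 1], angles[i]);
    }
    // A sample is visible if no sample before it has a larger angle.
    std::vector<bool> visible(n);
    for (std::size_t i = 0; i < n; ++i) {
        visible[i] = angles[i] >= scan[i];
    }
    return visible;
}
```

With heights 0 1 1 4 2 and a spacing of 1, samples 2 and 4 are hidden behind samples 1 and 3 respectively, while the others are visible.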

Hint: Please note that in the above framework, for simplicity, only a single block is launched, whose dimension is the maximum number of threads supported by the respective device (usually a power of 2).

Your tasks are as follows:

a) Implement a kernel for the inclusive max scan operation using the Hillis-Steele algorithm (single block version) as presented in the lecture.

b) Implement two kernels (one for the up-sweep and one for the down-sweep) for the inclusive max scan operation using the Blelloch algorithm (single block version).

Hints:

i) Note that the Blelloch algorithm performs an exclusive scan operation. Please make appropriate modifications to generate the inclusive max scan result.

ii) Use the utility functions provided in the framework to compute angles from heights and to calculate the location of a point on a ray in both tasks a and b.

iii) The expected output image is shown in Figure 1.

c) Compare the runtimes of the two implementations above (a and b) and provide arguments for the differences or similarities between them.
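For orientation, here is a sequential sketch of the two Blelloch phases with max as the operator, followed by one possible way to turn the exclusive result into an inclusive one (combining each output with its own input element). It is only an illustration, not a single-block kernel: the name `blellochInclusiveMaxScan` is hypothetical, the input length is assumed to be a power of 2, and negative infinity is assumed as the identity for max.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Assumption: input.size() is a power of 2.
std::vector<float> blellochInclusiveMaxScan(const std::vector<float>& input) {
    const std::size_t n = input.size();
    assert(n > 0 && (n & (n - 1)) == 0);
    const float identity = -std::numeric_limits<float>::infinity();
    std::vector<float> data(input);

    // Up-sweep: build partial maxima in a balanced tree.
    for (std::size_t stride = 1; stride < n; stride *= 2) {
        for (std::size_t i = 2 * stride - 1; i < n; i += 2 * stride) {
            data[i] = std::max(data[i], data[i - stride]);
        }
    }

    // Down-sweep: push prefixes back down, yielding an exclusive scan.
    data[n - 1] = identity;
    for (std::size_t stride = n / 2; stride >= 1; stride /= 2) {
        for (std::size_t i = 2 * stride - 1; i < n; i += 2 * stride) {
            const float left = data[i - stride];
            data[i - stride] = data[i];
            data[i] = std::max(data[i], left);
        }
    }

    // Exclusive to inclusive: combine each prefix with its own element.
    for (std::size_t i = 0; i < n; ++i) {
        data[i] = std::max(data[i], input[i]);
    }
    return data;
}
```

For the input 3 1 7 0 4 1 6 3, the exclusive result after the down-sweep is -inf 3 3 7 7 7 7 7, and the final inclusive result is 3 3 7 7 7 7 7 7. In this sequential form each sweep applies max n - 1 times, which may be a useful starting point when reasoning about task c.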


Figure 1: Height Field bitmap with a colored line of sight. Blue colored points are visible and red points are non-visible points.
