More Parallel Algorithms

Academic year: 2022

Martin Ziegler

Komplexitätstheorie (Complexity Theory)

NL and Parallel Computation

Every problem in NL can be solved in parallel time O(log² n) by polynomially many processors!

• dirPath ≼_L Boolean matrix powering:

– G=(V,E) ↦ adjacency matrix A ∈ {0,1}^{V×V} of G

– A_{u,v} > 0 :⇔ v reachable from u in ≤ 1 step

– (A^k)_{u,v} > 0 ⇔ v reachable from u ∈ V in ≤ k steps

• goal: (A^k)_{s,t} for some k ≥ |V| =: n.

– repeated squaring: A → A² → A⁴ → A⁸ → …: O(log n) phases

– each phase = matrix multiplication; n² dot products

– each dot product in parallel time O(log n)

Theorem: A ≼_L B ⊆ Σ*, B solvable in parallel time O(log^k n) (k ≥ 2) on polynomial size circuits ⇒ same for A.
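The reduction from dirPath to Boolean matrix powering can be sketched as follows. This is a minimal sequential illustration, not a parallel implementation: the function names and the edge-list input format are assumptions made for the example, and the diagonal is set to 1 so that an entry means "reachable in ≤ 1 step", as on the slide.

```python
def bool_mat_mult(A, B):
    """Boolean matrix product: C[u][v] = OR over w of (A[u][w] AND B[w][v])."""
    n = len(A)
    # The n*n dot products are independent of each other; each is an OR over
    # n terms, which a circuit can compute in depth O(log n).
    return [[any(A[u][w] and B[w][v] for w in range(n)) for v in range(n)]
            for u in range(n)]


def reachable(edges, n, s, t):
    """Decide whether t is reachable from s in a directed graph on vertices 0..n-1."""
    # Adjacency matrix with self-loops: A[u][v] means "reachable in <= 1 step".
    A = [[u == v for v in range(n)] for u in range(n)]
    for u, v in edges:
        A[u][v] = True
    # Repeated squaring A -> A^2 -> A^4 -> ... until the exponent k is >= n.
    k = 1
    while k < n:
        A = bool_mat_mult(A, A)
        k *= 2
    return A[s][t]
```

The loop runs O(log n) times, one squaring per phase, which is where the O(log² n) parallel time bound comes from once each phase is done in depth O(log n).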

More Parallel Algorithms

Prefix Sum: Given (x1,…,xn), calculate all sums

x1, x1+x2, x1+x2+x3, …, x1+x2+…+xn-1, x1+x2+…+xn-1+xn

in logarithmic parallel time?

[Figure: prefix-sum circuit of depth log n]

Works for any associative operation ⊕, using O(n·log n) gates.
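One scheme that meets these bounds is the doubling scan (often attributed to Hillis and Steele). Whether the slide's figure shows exactly this scheme is an assumption. The sketch below simulates the parallel rounds sequentially; the name `prefix_scan` is made up for the example.

```python
from operator import add


def prefix_scan(xs, op=add):
    """Inclusive prefix scan in ceil(log2 n) rounds, O(n log n) applications of op."""
    ys = list(xs)
    n = len(ys)
    d = 1
    while d < n:
        # Within one round all updates are independent of each other,
        # so a circuit performs them in a single parallel step.
        ys = [ys[i] if i < d else op(ys[i - d], ys[i]) for i in range(n)]
        d *= 2
    return ys
```

Only associativity of `op` is used; the operand order is preserved, so the function also works for non-commutative operations such as string concatenation.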

Carry Look-Ahead Adder

Prefix Sum: Given (x1,…,xn), calculate all sums

x1, x1+x2, x1+x2+x3, …, x1+x2+…+xn-1, x1+x2+…+xn-1+xn

in parallel time O(log n) for any associative operation.

Long Addition: Given (a0,…,an-1) and (b0,…,bn-1), calculate (c0,…,cn-1,cn) := (a0,…,an-1) + (b0,…,bn-1) in logarithmic parallel time?

Ripple-carry adder: the i-th carry is

z_i = g_i ∨ (p_i ∧ z_{i-1})

where g_i := a_i ∧ b_i ('generate') and p_i := a_i ∨ b_i ('propagate').

(g,p) ∘ (g',p') := ( g' ∨ (p' ∧ g), p' ∧ p ) is associative!

(z_i, 0) = (z_{i-1}, 0) ∘ (g_i, p_i) = ( (z_{i-2}, 0) ∘ (g_{i-1}, p_{i-1}) ) ∘ (g_i, p_i)
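The carry recurrence above can be turned into an adder by running a prefix scan over the (generate, propagate) pairs. The following is a minimal sketch under some assumptions: bits are stored least significant first, the inputs have equal length n ≥ 1, the initial carry z_{-1} is 0, and the scan is the sequentially simulated doubling scheme. All function names are chosen for illustration.

```python
def combine(left, right):
    """(g, p) o (g', p') := (g' or (p' and g), p' and p)."""
    g, p = left
    g2, p2 = right
    return (g2 or (p2 and g), p2 and p)


def scan(pairs):
    """Inclusive prefix scan under `combine`, in log-many simulated rounds."""
    ys = list(pairs)
    d = 1
    while d < len(ys):
        ys = [ys[i] if i < d else combine(ys[i - d], ys[i])
              for i in range(len(ys))]
        d *= 2
    return ys


def add_bits(a, b):
    """Add two equal-length bit lists (least significant bit first); returns n+1 bits."""
    # g_i = a_i AND b_i ('generate'), p_i = a_i OR b_i ('propagate').
    gp = [(bool(x and y), bool(x or y)) for x, y in zip(a, b)]
    carries = [g for g, _ in scan(gp)]        # carries[i] = z_i
    carry_in = [False] + carries[:-1]         # z_{i-1}, with z_{-1} = 0
    sums = [int(x ^ y ^ c) for x, y, c in zip(a, b, carry_in)]
    return sums + [int(carries[-1])]
```

For example, 6 + 7 with three-bit inputs is `add_bits([0, 1, 1], [1, 1, 1])`, which yields the bits of 13, least significant first.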

Circuits: Depth and Size

[Figure: circuit with inputs x0, x1, x2, …, xn-1 at the bottom, layers of ∨, ∧, ¬ gates, and outputs y0, y1, y2, …, ym-1 at the top]

• Gates ∨, ∧, ¬ are universal

• unbounded fan-out

• fan-in: binary/unary; N-ary: simulate in depth O(log N)

• n inputs, m outputs

• depth d ⇒ size ≤ m·2^d

• If sorted topologically, evaluation on a TM in time O(size)
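Evaluating a topologically sorted circuit in a single pass might look like the sketch below. The gate encoding (tuples such as `("AND", j, k)` whose arguments index earlier gates) is a hypothetical format invented for this example, and Python list indexing plays the role of random access, so the "one step per gate" count is in the RAM sense rather than a literal Turing machine bound.

```python
def evaluate(circuit, inputs):
    """Evaluate a topologically sorted circuit in one pass over its gates.

    circuit: list of gates ("IN", i), ("NOT", j), ("AND", j, k), ("OR", j, k),
             where j and k are indices of earlier gates (hypothetical encoding).
    Returns the list of all gate values.
    """
    values = []
    for gate in circuit:
        kind = gate[0]
        if kind == "IN":
            values.append(bool(inputs[gate[1]]))
        elif kind == "NOT":
            values.append(not values[gate[1]])
        elif kind == "AND":
            values.append(values[gate[1]] and values[gate[2]])
        elif kind == "OR":
            values.append(values[gate[1]] or values[gate[2]])
        else:
            raise ValueError(f"unknown gate kind: {kind}")
    return values


# XOR of two inputs, built from the universal gates: (x0 OR x1) AND NOT (x0 AND x1).
XOR = [("IN", 0), ("IN", 1), ("OR", 0, 1), ("AND", 0, 1), ("NOT", 3), ("AND", 2, 4)]
```

Because every gate only refers to earlier gates, each value is available when it is needed, and each gate is visited exactly once.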

evaluation on a TM in time poly(size)

Uniformity

• Each circuit C has a fixed number of inputs; for deciding L ⊆ {0,1}*, consider a family (C_n)

• { 1^n : n = 〈M〉 for terminating TM M } is undecidable by TMs, but decidable by some family of circuits

• F. Meyer auf der Heide (1984): knapsack can be decided by a circuit family C_n of polynomial size

• New circuit for each n: nonuniform algorithm

Def: Call a family C_n of circuits uniform if some logspace-DTM can, on input 1^n, output 〈C_n〉 (sorted topologically).
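To illustrate the idea of uniformity, here is a sketch of a single procedure that outputs a description of C_n for every n, using the OR of n inputs as the example family. A Python function is of course not a logspace DTM, so this only shows the "one generator for all n" aspect. The gate encoding (tuples indexing earlier gates) and the name `or_circuit` are assumptions made for the example.

```python
def or_circuit(n):
    """Return a topologically sorted description of C_n: the OR of n >= 1 inputs.

    Gates are ("IN", i) or ("OR", j, k) with j, k indexing earlier gates;
    the last gate in the list is the output.
    """
    gates = [("IN", i) for i in range(n)]
    layer = list(range(n))               # gate indices forming the current level
    while len(layer) > 1:
        nxt = []
        for i in range(0, len(layer) - 1, 2):
            gates.append(("OR", layer[i], layer[i + 1]))
            nxt.append(len(gates) - 1)
        if len(layer) % 2 == 1:          # odd gate out is passed up unchanged
            nxt.append(layer[-1])
        layer = nxt
    return gates
```

The generated circuit is a balanced binary tree, so an n-ary OR is simulated with n-1 binary gates in logarithmic depth.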

Circuit vs. Turing Complexity

[Figure: circuit with n inputs x0, x1, x2, …, xn-1 and m outputs y0, y1, y2, …, ym-1]

Can evaluate a given circuit C on a TM

• in time O(size) once sorted topologically

• and in space O(m+depth):

– for each gate on level d

– recursively evaluate its 2 predecessors on levels < d

Can simulate a given TM M with input x on a circuit

• of depth O(S_M(|x|)²) (Reachability + Matrix Powering)

• of size O(T_M(|x|)²)

size ≈ seq. time, depth ≈ space: next slide
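The space-efficient evaluation can be sketched as a recursion that recomputes predecessor values instead of storing them, so that only one pending call per level is alive at any time. The gate encoding is a hypothetical one chosen for this example, and the sketch ignores the bookkeeping a real TM would need to remember its position in the circuit.

```python
def eval_gate(circuit, inputs, g):
    """Recursively evaluate gate g, recomputing predecessors rather than storing them.

    circuit: list of gates ("IN", i), ("NOT", j), ("AND", j, k), ("OR", j, k)
             (hypothetical encoding; j, k are gate indices).
    """
    gate = circuit[g]
    kind = gate[0]
    if kind == "IN":
        return bool(inputs[gate[1]])
    if kind == "NOT":
        return not eval_gate(circuit, inputs, gate[1])
    if kind in ("AND", "OR"):
        # The recursion depth is bounded by the depth of the circuit.
        left = eval_gate(circuit, inputs, gate[1])
        right = eval_gate(circuit, inputs, gate[2])
        return (left and right) if kind == "AND" else (left or right)
    raise ValueError(f"unknown gate kind: {kind}")


# XOR of two inputs: (x0 OR x1) AND NOT (x0 AND x1); gate 5 is the output.
XOR = [("IN", 0), ("IN", 1), ("OR", 0, 1), ("AND", 0, 1), ("NOT", 3), ("AND", 2, 4)]
```

The trade-off is time: shared subcircuits are re-evaluated on every use, so the running time can grow well beyond O(size) while the memory stays proportional to the depth.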
