• Keine Ergebnisse gefunden

Fast summation algorithms of radial functions

N/A
N/A
Protected

Academic year: 2022

Aktie "Fast summation algorithms of radial functions"

Copied!
32
0
0

Wird geladen.... (Jetzt Volltext ansehen)

Volltext

(1)

Fast Fourier transforms at nonequispaced nodes and applications

Franziska Nestler, Daniel Potts Department of Mathematics Technische Universität Chemnitz

CECAM Summer School, Jülich, 11th September 2013

(2)

1 Introduction: FFT

2 NFFT

3 NFFT based fast summation

4 Application to particle simulation

(3)

”The FFT is, without doubt, one of the most important algorithm in applied mathematics and engineering.” (V. Olshevsky)

”The Fast Fourier transform (FFT) is one of the truly great computational developments of this century. It has changed the face of science and engineering so that it is not an exaggeration to say that life as we know it would be very different without FFT.”(Charles Van Loan)

1805Carl Friedrich Gaußused an algorithm similar to FFT.

1903Runge

1942Danielson and Lanczos 1965Cooley and Tukey

Gauß Runge Lanczos Tukey

(4)

Problem:fast computation of

f(wj) =

M/2−1

X

k=−M/2

ke−2πikwj (j=−N/2, . . . , N/2−1)

h(k) =

N/2−1

X

j=−N/2

fje2πikwj (k=−M/2, . . . , M/2−1) wj∈T:= [−1/2,1/2)

forequispacednodeswjandM =N wj:= j

M (j=−M/2, . . . , M/2−1) FFTinO(MlogM)instead ofO(M2)flops

(5)

Problem:(NFFT) evaluation of the 1–periodic function

f(w) =

M/2−1

X

k=−M/2

ke−2πikw

atarbitraryknotswj∈T(j=−N/2, . . . , N/2−1) Idea:

1.approximatef bys1: m:=σM (σ >1),ϕ(x) :=˜ P

k∈Zϕ(x+k)

s1(w) :=

m/2−1

X

l=−m/2

glϕ˜

w− l m

2.approximates1 bys: pm,ψ(x) :=ϕ(x)·χ[−p

m,mp](x)

s(w) :=

m/2−1

X

l=−m/2

glψ˜

w− l m

=

[wm]+p

X

l=[wm]−p

glψ˜

w− l m

3.f(wj)≈s1(wj)≈s(wj)

(6)

Approximate

f(w) =

M/2−1

X

k=−M/2

ke−2πikw

by

s1(w) =

m/2−1

X

l=−m/2

glϕ˜

w− l m

=

X

k=−∞

ˆ

gkck( ˜ϕ) e−2πikw

m/2−1

X

k=−m/2

ˆ

gkck( ˜ϕ)e−2πikw

1 set ˆ gk:=

k/ck( ˜ϕ) k=−M/2, . . . , M/2−1,

0 k=−m/2, . . . ,−M/2−1, M/2, . . . , m/2−1

2 by FFT(m):

g = 1 M/2−1X ˆ

g e−2πikl/m

(7)

Algorithm-1D(NFFT)

1. Fork=−M/2, . . . , M/2−1compute ˆ

gk:= ˆfk/ck( ˜ϕ).

2. Forl=−m/2, . . . , m/2−1compute by FFT(m)

gl:= 1 m

M/2−1

X

k=−M/2

ˆ

gke−2πikl/m.

3. Forj=−N/2, . . . , N/2−1compute

f(wj)≈s(wj) :=

[wjm]+p

X

l=[wjm]−p

glψ˜

wj− l m

.

arithmetic operations:

O(M+mlogm+ (2p+ 1)N) =O(MlogM+pN)

(8)

Matrix-vector notation:

f=Afˆ,

whereAmay be factorised approximately as follows:

A≈CF D.

Each of the three matrices corresponds to a step in the NFFT algorithm:

1. D∈RM×M is a diagonal matrix:

D:=diag 1

m ck( ˜ϕ) M/2−1

k=−M/2

2. F ∈Rm×M is a truncated Fourier matrix:

F :=

e−2πikl/mm/2−1 l=−m/2,

M/2−1 k=−M/2

(9)

3. C∈RN×mis a sparse band matrix with2p+ 1non-zero entries per row:

C:=

cj,l

N/2−1 j=−N/2,

m/2−1

l=−m/2

where cj,l=

ψ w˜ jml

ifl∈ {bwjmc −p, . . . ,dwjme+p}

0 otherwise.

Structure of the matrixC. Non-zero entries are indicated by dots. The row indexjruns from−N/2toN/2−1, the column indexlruns from−m/2to m/2−1. Parameters used wereN=M = 64,m= 128andp= 5; Legendre nodes were used for thewj.

(10)

Error estimates:

|f(wj)−s(wj)| ≤ Ea(wj) +Et(wj)

aliasing error Ea(wj) := |f(wj)−s1(wj)|

truncation error Et(wj) := |s1(wj)−s(wj)|

Ea(wj)≤ kfˆk1 max

−M/2≤k<M/2

X

r=−∞

r6=0

ck+mr( ˜ϕ) ck( ˜ϕ)

Et(wj)kfˆk1

m max

−M/2≤k<M/2

1

|ck( ˜ϕ)|

m/2−1

X

l=−m/2

ϕ˜

wjml

ψ˜

wjml

(11)

Window functionsϕ(w) =˜ P

k∈Zϕ(w+k):

•Gaussian(Dutt, Rokhlin 1993; Steidl 1998) ϕ(w) = (πb)−1/2e−(mw)2/b

b:= 2σ 2σ−1

p π

•B-splines(Beylkin 1995; Potts, Steidl, Tasche 1998) ϕ(w) = B2p(mw)

•Sinc–function(Potts 2001)

ϕ(w) = (2σ−1)M2p

sincπ(2σ−1)M w

2p

2p

•Kaiser–Bessel function(Fourmont 2001, Jackson 1991)

|w| ≤ mp : ϕ(w) = 1 π

sinh(bp

p2−m2w2) pp2−m2w2

b:=π

2−1

σ

(12)

Error estimates for special window functionsϕ:

|f(wj)−s(wj)| ≤C(σ, p)kfkˆ 1

with

C(σ, p) :=

















4 e−pπ(1−1/(2σ−1))

for Gaussian 4

1 2σ−1

2p

for B-Splines

1 p−1

2 σ2p+

σ 2σ−1

2p

for sinc 4π(√

p+p)q4

1−σ1e−p2π

1−1/σ

for Kaiser-Bessel

For fixedσ >1, theerror decaysexponentiallywithp.

(13)

2 4 6 8 10 12 14 16 18 20 10−16

10−14 10−12 10−10 10−8 10−6 10−4 10−2 100

m E2

Gaussian Kaiser−Bessel B−Spline Sinc2m

p

The error with options double precision,d= 1, parameters M = 1024, N= 2000, σ= 2forE2

E2=kf−sk2

kfk2

=

N/2−1

X

j=−N/2

|fj−s(wj)|2

1 2

·

N/2−1

X

j=−N/2

|fj|2

12

(14)

NFFT– fast computation of

f(wj) =

M/2−1

X

k=−M/2

ke−2πikwj (j=−N/2, . . . , N/2−1)

matrix–vector form

fˆ:= ( ˆfk)M/2k=−M/2,f:= (f(wj))N/2j=−N/2,A:= e−2πikwjN/2−1,M/2−1 j=−N/2,k=−M/2

f=Afˆ≈CF Dfˆ NFFTH (adjoint,not inverse!) – fast computation of

h(k) =

N/2−1

X

j=−N/2

fje2πikwj (j=−M/2, . . . , M/2−1)

The factorisation that was derived forAallows us to derive an NFFTH algorithm simply by transposingA:

(15)

NFFT

(multivariate case)

fast computation of the sums

f(wj) =

M/2−1

X

k1=−M/2

. . .

M/2−1

X

kd=−M/2

fke−2πikwj (j=−N/2, . . . , N/2−1)

h(k) =

N/2−1

X

j=−N/2

fje2πikwj

k∈ {−M/2, . . . , M/2−1}d=:IMd

forequispacednodeswj:=Mj (N=Md)

FFT(fast Fourier transform) inO(MdlogM) forarbitrarynodeswj∈[−1/2,1/2)d

NFFT(nonequispaced FFT) inO(MdlogM+pdN)

(16)

Software available:

NFFT – C subroutine library (Keiner, Kunis, Potts 2002–2013) http://www.tu-chemnitz.de/∼potts/nfft

Generalization:

Nonequispaced in time and frequency (NNFFT), nonequispaced DCT/DST, hyperbolic cross, NFFT on the sphere, iterative solution of the inverse transforms

Applications:

fast summation, fast Gauss transform, summation on the sphere, MRI, polar FFT, Radon transform, CT, ridgelet transform

Documentation:

NFFT3 Tutorial (Keiner, Kunis, Potts)

(17)

Fast summation algorithms of radial functions

P Problem: fast computation of

f(xj) :=

N

X

k=1

αkK(xj−xk) (j= 1, . . . , N)

nodesxj∈Rd,K(x) =K(kxk)radial functions

f = Kα

Kare special kernels, e.g.

singular kernels: 1

|x|, 1

x2,log|x|, x2log|x|

nonsingular kernels: (x2+c2)±1/2,e−δx2

Applications: integral equations, scattered data approximation, image processing, discrete Gauss transform,. . .

(18)

Introduction NFFT Fast summation P2NFFT

Known methods for products of vectors with specially structured dense matrices

f=Kα

panel clustering, fast multipole method, wavelet methods

Standard algorithmforequispacednodes: K –Toeplitz matrix

f = FFT( diag(b) FFT

H

(α))

f = NFFT( diag( ˜ b) NFFT

H

(α)) + near field

(19)

Known methods for products of vectors with specially structured dense matrices

f=Kα

panel clustering, fast multipole method, wavelet methods

Standard algorithmforequispacednodes: K –Toeplitz matrix

f = FFT( diag(b) FFT

H

(α))

Ideafornonequispacednodes: replaceFFTbyNFFT

f = NFFT( diag( ˜ b) NFFT

H

(α)) + near field

(20)

Introduction NFFT Fast summation P2NFFT

Problem: fast evaluation of f(x) :=

N

X

k=1

αkK(x−xk) =

N

X

k=1

αkK(kx−xkk),

at theN given nodesx=xj∈Rd Singular kernels: 1

|x|, 1

x2,log|x|, x2log|x|

at boundary, 12−εB≤ kxk ≤ 12 (assumekxj−xkk ≤ 12 −εB)

smooth and periodic functionKR

Approximation:

KR(x)≈ KRF(x) := X

l∈Idm

ble2πilx

(21)

Introduction NFFT Fast summation P2NFFT

Problem: fast evaluation of f(x) :=

N

X

k=1

αkK(x−xk) =

N

X

k=1

αkK(kx−xkk),

at theN given nodesx=xj∈Rd Singular kernels: 1

|x|, 1

x2,log|x|, x2log|x|

RegularizeK:

near0,kxk ≤εI

at boundary, 12−εB≤ kxk ≤ 12 (assumekxj−xkk ≤ 12 −εB)

smooth and periodic functionKR

Approximation:

KR(x)≈ KRF(x) := X

l∈Idm

ble2πilx

12+εB εI I 1 2εB

(22)

Problem: fast evaluation of f(x) :=

N

X

k=1

αkK(x−xk) =

N

X

k=1

αkK(kx−xkk),

at theN given nodesx=xj∈Rd Singular kernels: 1

|x|, 1

x2,log|x|, x2log|x|

RegularizeK:

near0,kxk ≤εI

at boundary, 12−εB≤ kxk ≤ 12 (assumekxj−xkk ≤ 12 −εB)

smooth and periodic functionKR

Approximation:

KR(x)≈ KRF(x) := X

l∈Id

ble2πilx

(23)

Regularization by algebraic polynomials Given: aj, bj,j= 0, . . . , q−1

Compute: polynomialP with

P(j)(c−r) =aj (j= 0, . . . , q−1) (1) P(j)(c+r) =bj (j= 0, . . . , q−1) (2) Theorem (Two point Taylor interpolation):

For givenaj, bj(j= 0, . . . , q−1) there exists a unique polynomialP of degree2q−1which satisfies the conditions (1) and (2):

P(x) = 1 2q

q1

X

j=0 q1j

X

k=0

rj j!2k

q1 +k k

h(1 +y)j+k(1y)qaj+ (−1)j(1y)j+k(1 +y)qbj

i,

wherey:=x−cr . For symmetric functions: (−1)jbj=aj.

P Around0:aj=K(j)(−εI), bj=K(j)I)

At the boundary: aj=K(j)(1/2−εB), bj=K(j)(−1/2+εB)

(24)

Splitting: K(x) = [K(x)− KR(x)] +KR(x) =:KNE(x) +KR(x)

ApproximationKR(x)≈ KRF(x): f(x)≈f(x) :=˜ fNE(x) +fRF(x) Near field (kx−xkk ≤εI, direct):

fNE(x) :=

N

X

k=1

αkKNE(x−xk) Fourier method

fRF(x) :=

N

X

k=1

αkKRF(x−xk)

fRF(xj) =

N

X

k=1

αk

X

l∈Imd

ble2πil(xj−xk) = X

l∈Imd

bl N

X

k=1

αke−2πilxk

!

| {z }

NFFTH

e2πilxj

| {z }

NFFT

(25)

Particle-particle NFFT (P2NFFT)

Coulomb potential in charged particle systems:

φ(xj) :=

N

X

i=1,i6=j

qi

kxi−xjk

Approach:

setK(kxk) :=kxk−1

letkxi−xjk ≤h(1/2−εB)

constructh-periodic regularization

fast computation of the far field by NFFT based fast summation

φRF(xj) = X

l∈I3m

bl N

X

i=1

qie2πilxi/h

!

| {z }

NFFTH

e−2πilxj/h

| {z }

NFFT

(26)

Coulomb potential in charged particle systems:

φ(xj) :=X

n∈S N

X

i=1

0 qi

kxi−xj+nk s.t.periodic boundary conditions

xj∈B1T×B2T×B3T

1

fully periodic:S =B1Z×B2Z×B3Z

Ewald summation

Ewald splitting

1

r =erf(αr)

r +erfc(αr) r

0 0.2 0.4 0.6 0.8 1

0 2 4 6 8 10

erf(x) := 2πRx r

0 e−t2dt(error function)

erfc(x) := 1erf(x)(complementary error function)

(27)

Coulomb potential in charged particle systems:

φ(xj) :=X

n∈S N

X

i=1

0 qi

kxi−xj+nk s.t.periodic boundary conditions

xj∈B1T×B2T×B3T

1

fully periodic:S =B1Z×B2Z×B3Z

Ewald summation X

n∈S N

X

i=1

0 qi

kxi−xj+nk =X

n∈S N

X

i=1 0qi

erfc(αkxi−xj+nk) kxi−xj+nk +

X

n∈S N

X

i=1

qi

erf(αkxi−xj+nk) kxi−xj+nk − 2α

√πqj

short range part: direct evaluation after truncation

limr→0erf(αr)

r =π substractself potential

transformlong range partinto a sum in Fourier space

(28)

Coulomb potential in charged particle systems:

φ(xj) :=X

n∈S N

X

i=1

0 qi

kxi−xj+nk s.t.periodic boundary conditions

xj∈B1T×B2T×B3T

1

fully periodic:S =B1Z×B2Z×B3Z

Ewald summation

compute long range part using NFFTs (Hedman, Laaksonen 2006)

φL(xj) = 4π B1B2B3

X

k6=0

e−kkk2/(4α2) kkk2

N

X

i=1

qieikxi

!

| {z }

NFFTH

e−ikxj

| {z }

NFFT

k∈B1B2B3Z

(29)

Coulomb potential in charged particle systems:

φ(xj) :=X

n∈S N

X

i=1

0 qi

kxi−xj+nk s.t.periodic boundary conditions

xj∈B1T×B2T×B3T

1

2d-periodic: S=B1Z×B2Z× {0}

Ewald summation, long range part: k∈ B

1B

2Z φL(xj) = π

B1B2

X

k6=0

Θ(kkk, xij,3)

kkk eik(xij,1,xij,2)

Idea: regularize the functionsΘ(k,·)(N., Potts 2013)

0

−(B3 +ε) B3 +ε

−B3 B3

Θ(k,·)

M/2−1

X

l=−M/2

bk,leπilx/(B3+ε)

Fast Fourier transforms at nonequispaced nodes and applications

(30)

Coulomb potential in charged particle systems:

φ(xj) :=X

n∈S N

X

i=1

0 qi

kxi−xj+nk s.t.periodic boundary conditions

xj∈B1T×B2T×B3T summary:

S={0}3: NFFT based fast summation in 3d

fully periodic: Ewald + NFFT

2d-periodic: Ewald + NFFT based fast summation in 1d

1d-periodic: Ewald + NFFT based fast summation in 2d

Structure Near field +CF D

| {z }

NFFT

|{z}

diag

DHFHCH

| {z }

NFFTH

(31)

Calculation of the fields

Ej:=−∇φ(y) y=xj

Long range part for fully p.b.c.:

two possibilies:

1 ikdifferentation (apply∇to Fourier series) ELj = 4iπ

B1B2B3

X

k6=0

ekkk2/(4α2)

kkk2 kS(k)e−ikxj with S(k) :=

N

X

i=1

qieikxi

2 analytic differentation (apply∇to NFFT window function)

∇φL(xj) = 4π B1B2B3

∇X

k6=0

e−kkk2/(4α2)

kkk2 S(k)e−ikxj

≈ 4π B1B2B3

X

l∈I3n

gl∇˜ϕ xjm1l

Analog for other types of boundary conditions

(32)

Conclusions

: fast evaluation of trigonometric sums for nonequispaced data

software available

important: NFFT based fast summation

application to particle simulation: methods for all types of boundary conditions

http://www.tu-chemnitz.de/∼potts/nfft http://www.tu-chemnitz.de/∼nesfr

Referenzen

ÄHNLICHE DOKUMENTE

Perracchione Lectures on radial basis functions In practice, using different colours plot the Halton points for different values of M and N.. Again, by using the

In the second part of the paper we have analyzed the computational com- plexity of the inverse problem, i.e., we have considered fractal cornpression as a discrete

In order to reach a required accuracy we could apply a tuning algorithm of the form 2 or 3 in order to determine sufficiently large oversampling factors σ and appropriate

• 2d-periodic: Ewald summation (2D) and fast summation (1D) Long range parts of the forces: Use gradient NFFT. Extended method for computation of interactions

• proposed a new approach for NFFT based fast Ewald summation under mixed boundary conditions [Nestler, Potts 2013]. • based on constructing regularizations of the

Potts: Parallel three-dimensional nonequispaced fast Fourier transforms and their application to particle simulation. Porto: Ewald summation of electrostatic interactions of

AB far as the parameter ko limits the value of objective function, we will name this method the Method of Constraints. The solution procedure component is a

Namely, these are implicit grid refinement to accelerate the early phase of the algorithm when the object is only sparsely filled with spheres, and the explicit grid refine- ment