A new method to generate and reduce one-loop amplitudes in OpenLoops 2

(1)

M. F. Zoller

in collaboration with F. Buccioni and S. Pozzorini

PSI, Villigen - Theoretical Particle Physics Seminar - 04/10/2017

(2)

Outline

I. Numerical amplitude generation in OpenLoops

II. New colour and helicity treatment

III. On-the-fly Reduction

IV. Numerical stability

V. Summary and Outlook

(3)

I. Numerical amplitude generation in OpenLoops

• Fully automated numerical algorithm for tree and one-loop amplitudes (h = helicity configuration):

$$W_0 = \sum_h \sum_{\mathrm{col}} |M_0(h)|^2, \qquad W_1 = \sum_h \sum_{\mathrm{col}} 2\,\mathrm{Re}\left[M_0^*(h)\,M_1(h)\right], \qquad W_1^{\text{loop-ind}} = \sum_h \sum_{\mathrm{col}} |M_1(h)|^2$$

Tree-level and one-loop amplitudes are sums of Feynman diagrams:

$$M_0 = \sum_d M_0^{(d)}, \qquad M_1 = \sum_d M_1^{(d)}$$

• hybrid tree-loop recursion ⇒ high CPU efficiency and numerical stability

• NLO QCD and NLO EW corrections fully implemented

• OpenLoops is interfaced to Sherpa, Powheg, Herwig, Whizard, Geneva, Munich, Matrix

(4)

• OpenLoops 1 publicly available at openloops.hepforge.org [Cascioli, Lindert, Maierhöfer, Pozzorini]

– Third party tools for the tensor integral reduction to scalar MIs:

Cuttools 1.9.5 [Ossola, Papadopoulos, Pittau ’08], OneLoop 3.6.1 [van Hameren ’10], Collier 1.2 [Denner, Dittmaier, Hofer ’16]

– High tensor rank in loop momentum q ⇒ high complexity

– Stability in the IR region is challenging for 2 → 4 processes

Long-term goal: NNLO automation for 2 → 2 and 2 → 3 processes

– Two-loop amplitude construction and reduction needed ⇒ avoid high tensor rank complexity

– Numerical stability at NLO for 2 → 4 is crucial

• OpenLoops 2 to be published soon [Buccioni, Lindert, Maierhöfer, Pozzorini, M.Z.]

– Amplitude construction and integrand reduction merged ⇒ On-the-fly Reduction

⇒ tensor rank ≤ 2 at all times

– Stability issues addressed in a targeted way

(5)

Tree level amplitudes

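$$M_0 = \sum_d M_0^{(d)}$$

Each diagram factorizes into a colour factor and a colour-stripped amplitude: $M_l^{(d)} = C_l^{(d)} A_l^{(d)}$.

The colour-stripped $A_0^{(d)}$ are split into subtrees by cutting an internal line.

[Diagram: example of a tree diagram cut at an internal line into two subtrees $w_a$ and $w_b$.]

⇒ Numerical merging of subtrees performed recursively:

[Diagram: the subtree $w_a$ with off-shell line $\sigma_a$ is obtained by attaching the subtrees $w_b$ and $w_c$ to a vertex.]

$$w_a^\alpha(k_a,h_a) = \frac{X^\alpha_{\beta\gamma}(k_b,k_c)}{k_a^2 - m_a^2}\; w_b^\beta(k_b,h_b)\, w_c^\gamma(k_c,h_c)$$

with momentum $k_a = k_b + k_c$ and for all possible helicity configurations $h_a = h_b + h_c$.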

⇒ Once computed, subtrees are reused in multiple Feynman diagrams at tree and loop level

(6)

One-loop amplitude

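$$A_1^{(d)} = \int d^Dq\; \frac{\mathrm{Tr}\left[N(q,h)\right]}{\bar D_0 \bar D_1 \cdots \bar D_{N-1}}$$

[Diagram: one-loop diagram with subtrees $w_1, \dots, w_N$ attached to a loop with loop momentum $q$ and propagators $D_0, \dots, D_{N-1}$. Cutting the loop open at $\bar D_0$ yields the open loop $\left[N(q,h)\right]_{\beta_0}^{\beta_N}$.]

propagators $D_i = (q+p_i)^2 - m_i^2$, spinor/Lorentz indices $\beta_i$ ⇒ trace: contraction with $\delta^{\beta_0}_{\beta_N}$

helicity configurations of subtree $w_i$: $h_i$; helicity configurations of $A_1^{(d)}$: $h = h_1 + \dots + h_N$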

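Numerator factorizes into segments:

$$\left[N(q,h)\right]_{\beta_0}^{\beta_N} = \left[\prod_{i=1}^{N} S_i(q,h_i)\right]_{\beta_0}^{\beta_N} = \left[S_1(q,h_1)\right]_{\beta_0}^{\beta_1} \left[S_2(q,h_2)\right]_{\beta_1}^{\beta_2} \cdots \left[S_N(q,h_N)\right]_{\beta_{N-1}}^{\beta_N}$$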

In the SM a segment (external subtree(s) + one loop vertex + propagator) is a q-polynomial of rank r ≤ 1:

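3-point segment (subtree $w_i$ with momentum $k_i$, followed by propagator $D_i$):

$$\left[S_i(q,h_i)\right]_{\beta_{i-1}}^{\beta_i} = \left\{ \left[Y^i_{\sigma_i}\right]_{\beta_{i-1}}^{\beta_i} + \left[Z^i_{\nu;\sigma_i}\right]_{\beta_{i-1}}^{\beta_i} q^\nu \right\} w_i^{\sigma_i}(k_i,h_i)$$

4-point segment (subtrees $w_{i_1}$, $w_{i_2}$ with momenta $k_{i_1}$, $k_{i_2}$, followed by propagator $D_i$):

$$\left[S_i(q,h_i)\right]_{\beta_{i-1}}^{\beta_i} = \left[Y^i_{\sigma_1\sigma_2}\right]_{\beta_{i-1}}^{\beta_i} w_{i_1}^{\sigma_1}(k_{i_1},h_{i_1})\, w_{i_2}^{\sigma_2}(k_{i_2},h_{i_2}) \qquad (h_i = h_{i_1} + h_{i_2})$$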

(7)

The OpenLoops dressing step

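define partially dressed numerator $N_n(q,\hat h_n) = S_1(q,h_1)\cdots S_n(q,h_n)$ with $\hat h_n = \sum_{i=1}^{n} h_i$

[Diagram: open loop from index $\beta_0$ to index $\beta_N$. The segments with subtrees $w_1, \dots, w_k$ and propagators $D_1, \dots, D_k$ (ending at index $\beta_k$) are the dressed segments; the segments with $w_{k+1}, \dots, w_N$ and $D_{k+1}, \dots, D_{N-1}, D_0$ are the undressed segments.]

dressing step $N_n(q,\hat h_n) = N_{n-1}(q,\hat h_{n-1})\,S_n(q,h_n)$ with initial condition $N_0 = \mathbb{1}$ (rank $R \le n$), performed numerically for the tensor coefficients in

$$N_n(q,\hat h_n) = \sum_{r=0}^{R} N_{\mu_1\dots\mu_r}(\hat h_n)\, q^{\mu_1}\cdots q^{\mu_r},$$

$$\left[N_{\mu_1\dots\mu_r}(\hat h_n)\right]_{\beta_0}^{\beta_n} = \left\{ \left[N_{\mu_1\dots\mu_r}(\hat h_{n-1})\right]_{\beta_0}^{\beta_{n-1}} \left[Y^n_{\sigma_n}\right]_{\beta_{n-1}}^{\beta_n} + \left[N_{\mu_2\dots\mu_r}(\hat h_{n-1})\right]_{\beta_0}^{\beta_{n-1}} \left[Z^n_{\mu_1;\sigma_n}\right]_{\beta_{n-1}}^{\beta_n} \right\} w_n^{\sigma_n}(k_n,h_n)$$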
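The dressing recursion can be illustrated with a small toy model. The following is only a sketch under simplifying assumptions that are not part of the talk: the $\beta$ index space is 2-dimensional, `Y` and `Z` stand for segment tensors that are already contracted with the subtree $w_n$, tensor coefficients are stored unsymmetrized in a dictionary keyed by index tuples, and all numbers are random integers so that the comparison is exact. The names `dress`, `evaluate` and `build_example` are hypothetical.

```python
import random

DIM = 4  # number of components of the loop momentum q
NB = 2   # toy size of the beta index space (assumption for brevity)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(NB)) for j in range(NB)]
            for i in range(NB)]

def matadd(a, b):
    return [[a[i][j] + b[i][j] for j in range(NB)] for i in range(NB)]

def scale(a, c):
    return [[c * x for x in row] for row in a]

IDENTITY = [[int(i == j) for j in range(NB)] for i in range(NB)]
ZERO = [[0] * NB for _ in range(NB)]

def dress(coeffs, Y, Z):
    """One dressing step N_n = N_{n-1} (Y + Z_nu q^nu) on tensor coefficients.

    coeffs maps index tuples (mu_1, ..., mu_r) to NB x NB matrices.
    """
    new = {}
    for idx, mat in coeffs.items():
        # rank-preserving term: N_{mu_1...mu_r} Y
        new[idx] = matadd(new.get(idx, ZERO), matmul(mat, Y))
        # rank-raising term: N_{mu_2...mu_r} Z_{mu_1}
        for mu in range(DIM):
            key = (mu,) + idx
            new[key] = matadd(new.get(key, ZERO), matmul(mat, Z[mu]))
    return new

def evaluate(coeffs, q):
    """Sum_r N_{mu_1...mu_r} q^{mu_1} ... q^{mu_r} at a given numerical q."""
    total = ZERO
    for idx, mat in coeffs.items():
        weight = 1
        for mu in idx:
            weight *= q[mu]
        total = matadd(total, scale(mat, weight))
    return total

def random_matrix(rng):
    return [[rng.randint(-3, 3) for _ in range(NB)] for _ in range(NB)]

def build_example(n_segments=3, seed=1):
    """Dress n_segments toy segments and also multiply them directly at fixed q."""
    rng = random.Random(seed)
    segments = [(random_matrix(rng), [random_matrix(rng) for _ in range(DIM)])
                for _ in range(n_segments)]
    q = [rng.randint(-5, 5) for _ in range(DIM)]
    coeffs = {(): IDENTITY}  # initial condition N_0 = 1
    direct = IDENTITY
    for Y, Z in segments:
        coeffs = dress(coeffs, Y, Z)
        seg = Y
        for mu in range(DIM):
            seg = matadd(seg, scale(Z[mu], q[mu]))
        direct = matmul(direct, seg)
    return coeffs, q, direct
```

Evaluating the dressed coefficients at the chosen `q` reproduces the direct matrix product of the segments, and the maximal rank after `n` steps is `n`. A production code would store only the symmetrized coefficients, which is what the coefficient counts quoted later in the talk refer to.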

(8)

Colour, helicity and diagram sums in OpenLoops 1

• for each diagram d and global helicity configuration h construct $\mathrm{Tr}\left[N_N^{(d)}(q,h)\right]$

• colour sum with Born: $V_N^{(d)}(q,h) = 2\left(\sum_{\mathrm{col}} M_0(h)^*\, C^{(d)}\right) \mathrm{Tr}\left[N_N^{(d)}(q,h)\right]$

• helicity sum: $V_N^{(d)}(q) = \sum_h V_N^{(d)}(q,h)$

• sum same-topology diagrams, reduce and evaluate integrals: $\displaystyle\int d^Dq\; \frac{\sum_d \mathrm{Tr}\left[V_N^{(d)}(q,0)\right]}{\bar D_0 \cdots \bar D_{N-1}}$

⇒ parent-child trick (recycling of colour-stripped partially dressed numerators)

[Diagram: the partially dressed numerator $N_{N-2}$ of a parent diagram with subtrees $w_1, \dots, w_{N-2}$ is reused for child diagrams, which are obtained as $N_{N-2}\,S_{N-1}\,S_N$ and $N_{N-2}\,\tilde S_{N-1}$.]

New idea: formulate the OpenLoops recursion directly for the colour-helicity summed interference with the Born amplitude $V_N^{(d)}(q,0)$.

(9)

II. New colour and helicity treatment

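consider colour-helicity summed numerator

$$V_N(q,0) = \sum_h 2\left(\sum_{\mathrm{col}} M_0(h)^*\, C\right) N_N(q,h) = \sum_{h_1\dots h_N} \underbrace{2\left(\sum_{\mathrm{col}} M_0(h)^*\, C\right)}_{=V_0(h)} S_1(q,h_1)\cdots S_N(q,h_N)$$

and formulate recursion for partially dressed numerator with nested helicity sums

$$V_n(q,\check h_n) = \sum_{h_n}\left[\dots \sum_{h_2}\left[\sum_{h_1} V_0(h)\, S_1(q,h_1)\right] S_2(q,h_2) \cdots\right] S_n(q,h_n) \qquad \forall\; \check h_n = h_{n+1} + \dots + h_N$$

[Diagram: $V_n$ as $\sum_{h_1\dots h_n}\sum_{\mathrm{col}}$ of the LO diagram times the NLO open loop, both drawn with the dressed subtrees $w_1, \dots, w_k$ and the undressed subtrees $w_{k+1}, \dots, w_N$.]

and a dressing step as

$$V_n(q,\check h_n) = \sum_{h_n} V_{n-1}(q,\check h_{n-1})\, S_n(q,h_n)$$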

⇒ Remaining helicity dof are those of the undressed segments!

Parent-child trick not possible (different colour factors) ⇒ OpenLoops Merging instead
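To make the nested helicity sums concrete, here is a minimal toy sketch. It assumes, purely for illustration, that every segment carries a two-valued helicity, that the loop momentum is fixed, and that segments are plain numbers instead of matrices; `nested_helicity_sum`, `brute_force` and `toy_input` are hypothetical names. The point is only that each dressing step sums over one helicity, so the number of stored objects halves at every step.

```python
import itertools
import random

HELICITIES = (-1, +1)  # toy assumption: two helicity states per segment

def nested_helicity_sum(V0, S):
    """Apply V_n = sum_{h_n} V_{n-1} S_n(h_n) for n = 1..N.

    V0 maps full helicity tuples (h_1, ..., h_N) to numbers.
    S[i] maps a helicity to the (toy, scalar) value of segment i+1.
    Returns the fully summed value and the number of stored entries per step.
    """
    V = dict(V0)
    sizes = [len(V)]
    for seg in S:
        nxt = {}
        for hel, val in V.items():
            h_n, rest = hel[0], hel[1:]  # rest = helicities of undressed segments
            nxt[rest] = nxt.get(rest, 0) + val * seg[h_n]
        V = nxt
        sizes.append(len(V))
    return V[()], sizes

def brute_force(V0, S):
    """Direct sum over all global helicity configurations, for comparison."""
    total = 0
    for hel, val in V0.items():
        term = val
        for seg, h in zip(S, hel):
            term *= seg[h]
        total += term
    return total

def toy_input(n_segments=4, seed=3):
    rng = random.Random(seed)
    V0 = {hel: rng.randint(-4, 4)
          for hel in itertools.product(HELICITIES, repeat=n_segments)}
    S = [{h: rng.randint(-4, 4) for h in HELICITIES} for _ in range(n_segments)]
    return V0, S
```

For four toy segments the stored helicity configurations shrink as 16, 8, 4, 2, 1, while the final result agrees with the brute-force sum over all 16 global configurations.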

(10)

The OpenLoops Merging

Sum partially dressed open loops $V_n(q,\check h_n) = \sum_\alpha V_n^{(\alpha)}(q,\check h_n)$ with

• the same topology $\bar D_0, \dots, \bar D_{N-1}$

• the same undressed segments $S_{n+1}, \dots, S_N$

since

$$\sum_\alpha \frac{V_n^{(\alpha)}\, S_{n+1}\cdots S_N}{\bar D_0 \bar D_1 \cdots \bar D_{N-1}} = \frac{V_n\, S_{n+1}\cdots S_N}{\bar D_0 \bar D_1 \cdots \bar D_{N-1}}$$

Example:

[Diagram: four partially dressed open loops $N^{(1)}$, $N^{(2)}$, $N^{(3)}$, $N^{(4)}$, which differ in how the external legs $e_1$, $e_2$, $e_3$ are attached inside the dressed part but share the propagators $D_n, D_{n+1}, \dots, D_0$ and the undressed subtrees $w_{n+1}, \dots, w_N$, are summed into a single merged open loop $N$.]

• dressing steps for $S_{n+1}, \dots, S_N$ performed only once for the merged object

• crucial for combination with on-the-fly integrand reduction (see later)
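The bookkeeping behind merging can be sketched as follows. This is a hypothetical data layout, not the OpenLoops implementation: each open loop is a triple of a topology label, a label for its undressed segments, and a flat list of tensor coefficients, and it is assumed that open loops sharing both labels have coefficient lists of equal length.

```python
def merge_open_loops(open_loops):
    """Sum the coefficients of open loops that share topology and undressed segments.

    open_loops: iterable of (topology, undressed, coeffs), where topology and
    undressed are hashable labels (tuples) and coeffs is a list of numbers.
    Returns a dict {(topology, undressed): summed coefficient list}.
    """
    merged = {}
    for topology, undressed, coeffs in open_loops:
        key = (topology, undressed)
        if key in merged:
            merged[key] = [a + b for a, b in zip(merged[key], coeffs)]
        else:
            merged[key] = list(coeffs)
    return merged

# Toy example: four open loops with identical topology and undressed segments,
# plus one open loop with different undressed segments.
TOPOLOGY = ("D0", "D1", "D2", "D3")
EXAMPLE = [
    (TOPOLOGY, ("S3", "S4"), [1, 2, 3, 4, 5]),
    (TOPOLOGY, ("S3", "S4"), [10, 20, 30, 40, 50]),
    (TOPOLOGY, ("S3", "S4"), [100, 200, 300, 400, 500]),
    (TOPOLOGY, ("S3", "S4"), [1000, 2000, 3000, 4000, 5000]),
    (TOPOLOGY, ("S4",), [7, 7, 7, 7, 7]),
]
MERGED = merge_open_loops(EXAMPLE)
```

In this toy input the five open loops collapse into two merged objects, so the remaining dressing steps would run twice instead of five times.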

(11)

Amplitude generation and tensor reduction in OpenLoops 1

Example:

[Figure: tensor rank versus n, the number of attached external legs (n = 1, ..., 7), for one example diagram. During the OpenLoops construction the rank grows by one with each attached leg, and each step is annotated with the number of tensor coefficients. The construction is followed by the numerical tensor integral reduction to scalar MI with Collier or CutTools.]

n = rank                    1    2    3    4    5    6    7
# of tensor coefficients    5   15   35   70  126  210  330

complexity grows rapidly with tensor rank

Collier, CutTools: numerical tensor integral reduction to scalar MI
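The coefficient counts quoted in the figure are reproduced by counting the independent components of totally symmetric tensors of rank 0 up to R in four dimensions, which gives the binomial coefficient $\binom{R+4}{4}$. The short check below assumes that this is the counting used in the figure; the function name is hypothetical.

```python
from math import comb

def n_tensor_coefficients(max_rank, dim=4):
    """Independent components of symmetric tensors of rank 0..max_rank.

    A symmetric rank-r tensor in dim dimensions has comb(r + dim - 1, dim - 1)
    components; summing r = 0..max_rank gives comb(max_rank + dim, dim).
    """
    return comb(max_rank + dim, dim)

COUNTS = [n_tensor_coefficients(rank) for rank in range(1, 8)]
```

For large rank R the count grows like $R^4/24$, so a rank-7 open loop already carries 22 times as many coefficients as a rank-2 one.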

(20)

III. On-the-fly Reduction

Use reduction identities valid at integrand level [del Aguila, Pittau ’05]

$$q^\mu q^\nu = A^{\mu\nu} + B^{\mu\nu}_\lambda\, q^\lambda = \left[A^{\mu\nu}_{-1} + A^{\mu\nu}_0 D_0\right] + \left[B^{\mu\nu}_{-1,\lambda} + \sum_{i=0}^{3} B^{\mu\nu}_{i,\lambda} D_i\right] q^\lambda, \qquad D_i = (q+p_i)^2 - m_i^2$$

in order to reduce the factorized open-loop integrand:

$$\frac{V_N(q)}{D_0 \cdots D_{N-1}} = \frac{S_1(q)\,S_2(q)\cdots S_n(q)\cdots S_N(q)}{D_0 D_1 D_2 D_3 \cdots D_{N-1}}$$

integrand reduction applicable after n steps ∀ n ≥ 2 (independently of future steps!)

$$\Rightarrow\quad \frac{V^{\mu\nu} q_\mu q_\nu}{\bar D_0 \cdots \bar D_{N-1}} = \frac{V_{-1}^\mu q_\mu + V_{-1}}{\bar D_0 \cdots \bar D_{N-1}} + \sum_{i=0}^{3} \frac{V_i^\mu q_\mu + V_i}{\bar D_0 \cdots \bar D_{i-1} \bar D_{i+1} \cdots \bar D_{N-1}}$$

• q-dependence reconstructed in terms of 4 propagators ⇒ new topologies with pinched propagators

• $A^{\mu\nu}$, $B^{\mu\nu}_\lambda$ depend on external momenta $p_1, p_2, p_3$

⇒ Compute with momentum space basis $l_1^\mu = p_1^\mu - \alpha_1 p_2^\mu$, $l_2^\mu = p_2^\mu - \alpha_2 p_1^\mu$, $l_3, l_4 \perp l_1, l_2$, $l_i^2 = 0$
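The reason why loop-momentum dependence can be traded for propagators is easiest to see in the elementary identities $q^2 = D_0 + m_0^2$ and $2\,q\cdot p_i = D_i - D_0 - p_i^2 + m_i^2 - m_0^2$, which follow directly from $D_i = (q+p_i)^2 - m_i^2$ with $p_0 = 0$. The sketch below checks them numerically in four dimensions with exact rational arithmetic. It does not reproduce the coefficients $A^{\mu\nu}$ and $B^{\mu\nu}_\lambda$ of the talk; the metric signature (+,−,−,−) and all function names are assumptions made for the illustration.

```python
from fractions import Fraction

def mdot(a, b):
    """Minkowski product with signature (+,-,-,-) (assumed convention)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def propagator(q, p, m2):
    """D = (q + p)^2 - m^2 for four-vectors q, p and squared mass m2."""
    shifted = [qi + pi for qi, pi in zip(q, p)]
    return mdot(shifted, shifted) - m2

def q_squared_from_propagator(D0, m2_0):
    """q^2 = D_0 + m_0^2 (valid because p_0 = 0)."""
    return D0 + m2_0

def q_dot_p_from_propagators(D0, Di, p_i, m2_0, m2_i):
    """q.p_i = (D_i - D_0 - p_i^2 + m_i^2 - m_0^2) / 2."""
    return Fraction(Di - D0 - mdot(p_i, p_i) + m2_i - m2_0, 2)

# Arbitrary integer test point (hypothetical values).
Q = (3, 1, -2, 5)
P1 = (2, 0, 1, -1)
M2_0, M2_1 = 1, 4
D0 = propagator(Q, (0, 0, 0, 0), M2_0)
D1 = propagator(Q, P1, M2_1)
```

With three independent external momenta, scalar products of this kind determine three components of q, which is how the q-dependence ends up expressed through four propagators in the integrand identity above.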

(22)

Example:

[Figure: the same plot of tensor rank versus n (number of attached external legs, n = 1, ..., 7) as for OpenLoops 1, now with on-the-fly reduction (OFR). The OpenLoops 1 curve rises with 5, 15, 35, 70, 126, 210, 330 tensor coefficients for ranks 1 to 7. In OpenLoops + OFR the reduction steps generate 4 pinched subtopologies, then 4 double pinched subtopologies, and so on.]

OpenLoops + OFR: complexity associated with tensor rank remains small!

(28)

Problem: huge proliferation of topologies due to pinching of propagators

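$$\Rightarrow\quad \frac{V_{\mu\nu}\, q^\mu q^\nu}{\bar D_0 \cdots \bar D_{N-1}} = \Bigg\{ \underbrace{\Big[V_{-1}^\mu + \sum_{i=0}^{3} V_i^\mu \bar D_i\Big] q_\mu}_{\text{rank 1}} + \underbrace{V_{-1} + V_0 \bar D_0}_{\text{rank 0}} + \underbrace{\tilde V_{-1}\, \tilde q^2}_{\text{rational term}} \Bigg\} \frac{1}{\bar D_0 \cdots \bar D_{N-1}}$$

[Diagram: the open loop with numerator $V^{\mu\nu} q_\mu q_\nu$ and subtrees $w_1, w_2, w_3, \dots, w_N$ equals the sum of five open loops: the unpinched topology with $V_{-1}^\mu q_\mu + \tilde V_{-1} \tilde q^2$ and four pinched topologies with $V_1^\mu q_\mu$, $V_2^\mu q_\mu$, $V_3^\mu q_\mu$ and $V_0^\mu q_\mu$.]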

⇒ factor ∼ 5 higher complexity after each reduction step!

(29)

Solution: OpenLoops Merging

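• Contract pinched propagator between dressed segments

[Diagram: two neighbouring segments $(w_i, D_i)$ and $(w_{i+1}, D_{i+1})$ become a single segment with subtrees $w_i$, $w_{i+1}$ and propagator $D_{i+1}$ once $D_i$ is pinched.]

• Merge with all (pinched and unpinched) diagrams with same topology and undressed segments

[Diagram: the open loops $N^{(1)}$ and $N^{(2)}$, both with subtrees $w_n$, $w_{n+1}$, propagator $D_{n+1}$ and undressed subtrees $w_{n+2}, \dots, w_N$, are merged into a single open loop $N$.]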

• No extra cost for pinched topologies after merging

• Algorithm:

– Start with highest point diagrams → merging with lower point diagrams

– OpenLoops 2 recursion step: dress one segment → reduce if necessary → merge
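The effect of the recursion step on the tensor rank can be followed with a simple bookkeeping sketch. It tracks only the rank, under the worst-case assumption that every dressed segment raises the rank by one and that a reduction is triggered whenever rank 2 is reached; merging and the pinched subtopologies are not modelled, and the function names are hypothetical.

```python
from math import comb

def n_coefficients(rank, dim=4):
    """Symmetric tensor coefficients of rank 0..rank in dim dimensions."""
    return comb(rank + dim, dim)

def peak_ranks(n_segments, on_the_fly):
    """Peak tensor rank reached at each dressing step.

    Worst case: each segment is linear in q and raises the rank by one.
    With on-the-fly reduction, rank 2 is reduced to rank 1 right after dressing.
    """
    rank = 0
    peaks = []
    for _ in range(n_segments):
        rank += 1                 # dress one segment
        peaks.append(rank)
        if on_the_fly and rank == 2:
            rank = 1              # reduce q^mu q^nu to rank <= 1
    return peaks

WITHOUT_OFR = peak_ranks(7, on_the_fly=False)
WITH_OFR = peak_ranks(7, on_the_fly=True)
MAX_COEFFS_WITHOUT = max(n_coefficients(r) for r in WITHOUT_OFR)
MAX_COEFFS_WITH = max(n_coefficients(r) for r in WITH_OFR)
```

In this toy count a seven-segment open loop never holds more than 15 coefficients per object with on-the-fly reduction, compared with 330 without it; the price is the additional pinched topologies, which is what the merging step is meant to absorb.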

(30)

Technicalities

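• Important: cutting rule, i.e. choice of $\bar D_0$.

[Diagram: one-loop diagram with loop momentum $q$, subtrees $w_1, \dots, w_N$ and propagators $D_0, \dots, D_{N-1}$, cut open at $D_0$ into an open loop from index $\beta_0$ to index $\beta_N$ with segments $(w_1, D_1), (w_2, D_2), \dots, (w_k, D_k), (w_{k+1}, D_{k+1}), \dots, (w_{N-1}, D_{N-1}), (w_N, D_0)$.]

⇒ One specific external particle always in $w_1$.

⇒ Unique rule for dressing direction based on external particles in $w_2$ and $w_N$.

• Treatment of pinches of $\bar D_0 = q^2 - m_0^2$ ($p_0 = p_N = 0$)

[Diagram: sequence of operations on the open loop with subtrees $w_1$ (momentum $k_1$), $w_{n+1}, \dots, w_{N-1}$ (momentum $k_{N-1}$) and $w_N$ (momentum $k_N$), starting from $p_N = 0$: shift cut (so that $p_{N-1} = 0$) → dress $S_N$ → contract, which joins $w_N$ and $w_1$ into one segment.]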

(31)

Final integral reduction

• reduce bubbles, rank-1 triangles and boxes with integral-level identities [del Aguila, Pittau ’05]

• reduce rank-1 and rank-0 integrals with N ≥ 5 propagators to scalar boxes via simple OPP relations [Ossola, Papadopoulos, Pittau ’07]

$$\frac{V + V_\mu q^\mu}{\bar D_0 \bar D_1 \cdots \bar D_{N-1}} = \sum_{i_0 < i_1 < i_2 < i_3}^{N-1} \frac{d(i_0 i_1 i_2 i_3)}{\bar D_{i_0} \bar D_{i_1} \bar D_{i_2} \bar D_{i_3}}$$

• use Collier 1.2 [Denner, Dittmaier, Hofer ’16] for scalar boxes, triangles, bubbles, tadpoles

(32)

IV. Numerical Stability

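$$q^\mu q^\nu = \left[A^{\mu\nu}_{-1} + A^{\mu\nu}_0 D_0\right] + \left[B^{\mu\nu}_{-1,\lambda} + \sum_{i=0}^{3} B^{\mu\nu}_{i,\lambda} D_i\right] q^\lambda$$

$A^{\mu\nu}_i$, $B^{\mu\nu}_{i,\lambda}$ computed from reduction basis $l_i(p_1, p_2)$ with $i = 1, 2, 3, 4$ and third momentum $p_3$

$$A^{\mu\nu}_i = \frac{1}{\gamma}\, a^{\mu\nu}_i, \qquad B^{\mu\nu}_{i,\lambda} = \frac{1}{\gamma^2} \left[b^{(1)}_{i,\lambda}\right]^{\mu\nu} + \frac{1}{\gamma} \left[b^{(2)}_{i,\lambda}\right]^{\mu\nu}$$

$$\gamma = \gamma(p_1, p_2) = \frac{4\,\Delta(p_1,p_2)}{p_1 p_2 \pm \sqrt{\Delta(p_1,p_2)}} \qquad \text{with} \qquad \Delta = (p_1 p_2)^2 - p_1^2\, p_2^2$$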

Severe numerical instabilities for γ ∝ ∆(p₁, p₂) → 0

• Freedom to choose two momenta from p₁, p₂, p₃

⇒ maximize γ in on-the-fly reduction with N ≥ 4 propagators.

⇒ avoid small Gram determinants until triangle reduction

• For N = 3: identify problematic kinematic configurations and use targeted expansions.

(33)

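Problematic kinematic configuration: t-channel diagrams with

[Diagram: triangle with loop momenta $q$, $q + p_1$, $q + p_2$ and external momenta $p_1$, $p_2 - p_1$, $-p_2$.]

$$p_1^2 = -p^2 < 0, \qquad p_2^2 = -p^2 (1+\delta), \quad 0 \le \delta \ll 1, \qquad (p_2 - p_1)^2 = 0$$

$$\Rightarrow\quad \sqrt{\Delta} = \frac{p^2}{2}\,\delta \quad\Rightarrow\quad \gamma = -p^2 \delta^2$$

⇒ expand basis momenta $l_i$, reduction formula and scalar integrals in δ, e.g. massless rank 1:

$$C^\mu = \frac{2}{\delta^2 p^2} \Big\{ B_0(-p^2,0,0) \left[-p_1^\mu (1+\delta) + p_2^\mu\right] + B_0\big(-p^2(1+\delta),0,0\big) \left[(p_1^\mu - p_2^\mu)(1+\delta)\right] \Big\} + \frac{1}{\delta}\, C_0\big(-p^2, -p^2(1+\delta), 0,0,0\big) \left[-p_1^\mu (1+\delta) + p_2^\mu\right]$$

$$= \frac{p_1^\mu + p_2^\mu}{2p^2} \left[-B_0(-p^2,0,0) + 1\right] + \delta\, \frac{p_1^\mu + 2 p_2^\mu}{6p^2}\, B_0(-p^2,0,0) + O(\delta^2)$$

with $C_0(p_1, p_2, m_0, m_1, m_2) \sim \int d^Dq\, \frac{1}{\bar D_0 \bar D_1 \bar D_2}$ and $B_0(p_1, m_0, m_1) \sim \int d^Dq\, \frac{1}{\bar D_0 \bar D_1}$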

Implemented: direct expansions for the full reduction of rank ≤ 3 triangles to scalars for all relevant mass configurations up to and including O(δ²) [soon O(δ⁴)].
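The scaling of the Gram determinant in this configuration can be checked with exact rational arithmetic. The sketch below uses only the three kinematic conditions and the definitions of Δ and γ given earlier; choosing the "+" sign in the denominator of γ is an observation made here because it reproduces $\gamma = -p^2\delta^2$ exactly, and the function name and numerical values are hypothetical.

```python
from fractions import Fraction

def gram_quantities(p_sq, delta):
    """Return (p1.p2, Delta, gamma) for the t-channel configuration.

    Inputs: p_sq = p^2 > 0 and delta >= 0, with
    p1^2 = -p^2, p2^2 = -p^2 (1 + delta), (p2 - p1)^2 = 0.
    """
    p1_sq = -p_sq
    p2_sq = -p_sq * (1 + delta)
    p1p2 = (p1_sq + p2_sq) / 2             # follows from (p2 - p1)^2 = 0
    Delta = p1p2 ** 2 - p1_sq * p2_sq      # Delta = (p1.p2)^2 - p1^2 p2^2
    sqrt_Delta = p_sq * delta / 2          # candidate root, verified next
    assert sqrt_Delta ** 2 == Delta
    gamma = 4 * Delta / (p1p2 + sqrt_Delta)  # "+" sign choice (see lead-in)
    return p1p2, Delta, gamma

P_SQ = Fraction(100)
DELTA = Fraction(1, 1000)
P1P2, GRAM, GAMMA = gram_quantities(P_SQ, DELTA)
```

Since the coefficients $B^{\mu\nu}_{i,\lambda}$ contain $1/\gamma^2$, a value of δ around $10^{-3}$ already amplifies rounding errors by many orders of magnitude, which motivates the expansions in δ.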

(34)

CPU performance: OpenLoops 1 + Collier/Cuttools vs OpenLoops 2

Runtimes (10⁻³ s) per phase-space point

Last column: timing ratio between the fastest OL1 + reduction library and OL2

Process            OL1 (Collier)    OL1 (Cuttools)    OL2               OL1/OL2
uū → tt̄            0.2355           0.4034            0.2385            0.99
uū → tt̄ g          4.259            7.066             3.490             1.2
uū → tt̄ g g        1.154 · 10²      1.612 · 10²       0.7505 · 10²      1.5
gg → tt̄            1.408            2.486             1.019             1.4
gg → tt̄ g          35.03            50.23             22.93             1.5
gg → tt̄ g g        1.330 · 10³      1.519 · 10³       0.6010 · 10³      2.2
ud̄ → W⁺ g          0.2972           0.6274            0.3255            0.91
ud̄ → W⁺ g g        5.690            11.30             5.222             1.1
ud̄ → W⁺ g g g      1.787 · 10²      2.380 · 10²       1.078 · 10²       1.7
uū → W⁺ W⁻         0.2622           0.4140            0.1756            1.5
uū → W⁺ W⁻ g       8.528            12.04             7.011             1.2
uū → W⁺ W⁻ g g     2.441 · 10²      2.817 · 10²       1.278 · 10²       1.9

Factor ∼ 2 speedup wrt OpenLoops 1 for nontrivial processes!
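The last column can be recomputed from the three timing columns as the faster of the two OL1 runtimes divided by the OL2 runtime. The snippet below copies the numbers from the table and rounds to two significant digits; the process labels are plain-text stand-ins and the helper names are hypothetical.

```python
# (process, OL1 + Collier, OL1 + Cuttools, OL2), runtimes in 10^-3 s
TIMINGS = [
    ("u ubar -> t tbar",     0.2355,  0.4034,  0.2385),
    ("u ubar -> t tbar g",   4.259,   7.066,   3.490),
    ("u ubar -> t tbar g g", 1.154e2, 1.612e2, 0.7505e2),
    ("g g -> t tbar",        1.408,   2.486,   1.019),
    ("g g -> t tbar g",      35.03,   50.23,   22.93),
    ("g g -> t tbar g g",    1.330e3, 1.519e3, 0.6010e3),
    ("u dbar -> W+ g",       0.2972,  0.6274,  0.3255),
    ("u dbar -> W+ g g",     5.690,   11.30,   5.222),
    ("u dbar -> W+ g g g",   1.787e2, 2.380e2, 1.078e2),
    ("u ubar -> W+ W-",      0.2622,  0.4140,  0.1756),
    ("u ubar -> W+ W- g",    8.528,   12.04,   7.011),
    ("u ubar -> W+ W- g g",  2.441e2, 2.817e2, 1.278e2),
]

def speedup(ol1_collier, ol1_cuttools, ol2):
    """Fastest OL1 + reduction library runtime divided by the OL2 runtime."""
    return min(ol1_collier, ol1_cuttools) / ol2

def two_significant_digits(x):
    return float(f"{x:.2g}")

RATIOS = [two_significant_digits(speedup(*row[1:])) for row in TIMINGS]
```

The recomputed ratios agree with the last column of the table; in every row the Collier-based OL1 timing is the faster of the two.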

(35)

Stability of OpenLoops 1 and 2 in double precision: 2 → 3 processes (at $\sqrt{\hat s} = 1$ TeV)

Probability of relative accuracy A or less (wrt OL1 + Cuttools in quad precision, 10⁶ uniform random points)

[Figure: two panels, gg → tt̄ + g and ud̄ → W⁺ + 2g. Horizontal axis: accuracy (ticks from −15 to 5). Vertical axis: fraction of points (cumulative), logarithmic from 10⁻⁶ to 1. Curves: Quadruple Precision, OpenLoops1 + Collier, OpenLoops1 + Cuttools, OpenLoops2.]

• Hard cuts: $p_T > 50$ GeV and $\Delta R_{ij} > 0.5$ for final state QCD partons

($\Delta R_{ij} = \sqrt{(\eta_i - \eta_j)^2 + (\phi_i - \phi_j)^2}$, $\phi_i$ azimuthal angle, $\eta_i$ rapidity)

• Behaviour in the tails crucial for real-life applications

• 1 to 3 orders of magnitude improvement wrt OL1 + Cuttools and Collier in DP

Excellent stability thanks to on-the-fly reduction and minimal ∆-expansions

Soft region under investigation ⇒ important for real-virtual part of NNLO
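For readers who want to reproduce the phase-space selection, the hard cuts can be written as a short filter. This is a sketch under assumptions that go beyond the slide: partons are given as tuples of transverse momentum in GeV, rapidity and azimuthal angle, and the azimuthal difference is wrapped into the interval [−π, π), which is common practice but not stated in the talk. The function names are hypothetical.

```python
import math

def delta_r(eta_i, phi_i, eta_j, phi_j):
    """Delta R_ij = sqrt((eta_i - eta_j)^2 + (phi_i - phi_j)^2).

    The azimuthal difference is wrapped to [-pi, pi) (assumption).
    """
    dphi = (phi_i - phi_j + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta_i - eta_j, dphi)

def passes_hard_cuts(partons, pt_min=50.0, dr_min=0.5):
    """partons: list of (pT in GeV, eta, phi) for the final state QCD partons."""
    if any(pt <= pt_min for pt, _, _ in partons):
        return False
    for a in range(len(partons)):
        for b in range(a + 1, len(partons)):
            _, eta_a, phi_a = partons[a]
            _, eta_b, phi_b = partons[b]
            if delta_r(eta_a, phi_a, eta_b, phi_b) <= dr_min:
                return False
    return True
```

A phase-space point is kept only if every QCD parton is above the transverse momentum threshold and every pair is separated by more than the minimal ΔR.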

(36)

Stability of OpenLoops 1 and 2 in double precision: 2 → 4 processes (at $\sqrt{\hat s} = 1$ TeV)

Probability of relative accuracy A or less (wrt OL1 + Cuttools in quad precision, 10⁶ uniform random points)

[Figure (preliminary): two panels, gg → tt̄ + 2g and ud̄ → W⁺ + 3g. Horizontal axis: accuracy (ticks from −15 to 5). Vertical axis: fraction of points (cumulative), logarithmic from 10⁻⁶ to 1.]

• Same hard cuts as for 2 → 3

• Orders of magnitude improvement wrt Cuttools and similar or better stability wrt Collier

• Further improvements in the tail under investigation

Very good stability thanks to on-the-fly reduction and minimal ∆-expansions

(37)

V. Summary and Outlook

• New algorithm for construction and reduction of one-loop amplitudes in a single recursion

• Drastic reduction of complexity at all stages of the calculation (rank ≤ 2)

• New colour and helicity treatment + OpenLoops merging ⇒ significant gain in CPU efficiency

• Same level of automation and same interface as OpenLoops 1

• Dedicated stability analysis possible in a single dressing and reduction tool

⇒ Simple targeted expansions provide excellent numerical stability in the hard regions

• Future projects:

– improvement of stability in real-virtual NNLO contributions (soft region)

– extension to 2 loops