Online Track and Vertex Reconstruction on GPUs for the Mu3e Experiment

(1)

Online Track and Vertex Reconstruction on GPUs for the Mu3e Experiment

Dorothea vom Bruch

March 28th, 2017

DPG Frühjahrstagung 2017, T46: Elektronik

(2)

The Mu3e Experiment

Search for the charged lepton flavour-violating decay $\mu^+ \to e^+ e^- e^+$ with a sensitivity in branching ratio better than $10^{-16}$

Branching ratio suppressed in the Standard Model to below $10^{-54}$

Any hint of a signal → new physics:

Supersymmetry

Grand unified models

Extended Higgs sector

...

(3)

Mu3e Signal

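Signal

Coincident in time

Single vertex

$\sum \vec{p}_i = 0$

$E = m_\mu$

Random Combinations

Not coincident in time

No single vertex

$\sum \vec{p}_i \neq 0$

$E \neq m_\mu$

[Figure: e+ e+ e- from a signal decay at one vertex, compared with a random combination of e+ e+ e- tracks]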
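The kinematic part of the signal definition (vanishing total momentum, total energy equal to the muon mass) can be illustrated with a minimal Python sketch. The function names and the tolerances `p_tol` and `e_tol` are hypothetical and chosen only for illustration; they are not the cut values used by the experiment.

```python
import math

M_MU = 105.658  # muon mass [MeV/c^2]
M_E = 0.511     # electron mass [MeV/c^2]

def momentum_sum_and_energy(momenta):
    """Return (|sum of momenta| [MeV/c], total energy [MeV]) for electron-mass tracks."""
    px = sum(p[0] for p in momenta)
    py = sum(p[1] for p in momenta)
    pz = sum(p[2] for p in momenta)
    energy = sum(math.sqrt(p[0]**2 + p[1]**2 + p[2]**2 + M_E**2) for p in momenta)
    return math.sqrt(px**2 + py**2 + pz**2), energy

def is_signal_like(momenta, p_tol=1.0, e_tol=1.0):
    """Hypothetical tolerances: total momentum near zero and energy near the muon mass."""
    p_sum, energy = momentum_sum_and_energy(momenta)
    return p_sum < p_tol and abs(energy - M_MU) < e_tol
```

A symmetric three-body decay at rest passes this check, while three tracks that all point in the same direction do not.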

(4)

The Mu3e Detector

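[Figure: schematic of the Mu3e detector showing the target, inner pixel layers, scintillating fibres, outer pixel layers, recurl pixel layers and scintillator tiles, with the μ beam and the magnetic field B; scale markers 10 cm and 4.5 cm]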

(6)

Readout Scheme

FPGA: Field-Programmable Gate Array

GPU: Graphics Processing Unit

Front-end (inside magnet):

2844 pixel sensors, up to 45 links at 1.25 Gbit/s, 86 FPGAs with 1 link at 6 Gbit/s each

3072 fibre readout channels, 12 FPGAs

6272 tiles, 14 FPGAs

Switching boards: 12 links at 10 Gbit/s per board

Filter farm: 12 PCs with GPUs, 8 inputs each

Output via Gbit Ethernet

(7)

Readout Scheme

[Figure: readout scheme as on the previous slide, extended by output blocks after the Gbit Ethernet stage whose labels survive only as "Data" and "Mass"]

From Switching board: get 50 ns time slices of data containing full detector information

(8)

Readout Rate

Detector          Data rate [Gbit/s]
Pixel detector    40
Fiber detector    20
Tile detector     negligible
Total             ~60

At a rate of $10^8$ muons/s

Triggerless, zero-suppressed readout
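As a rough cross-check of these numbers, the average data volume per 50 ns time slice follows from the total rate. The sketch below is plain arithmetic on the quoted values (about 60 Gbit/s, 50 ns slices); the function name is hypothetical.

```python
TOTAL_RATE_GBIT_S = 60.0  # total data rate quoted above [Gbit/s]
SLICE_LENGTH_NS = 50.0    # length of one time slice [ns]

def bits_per_time_slice(rate_gbit_s, slice_ns):
    """Average number of bits per slice: (Gbit/s) * ns = bits."""
    return rate_gbit_s * slice_ns

bits = bits_per_time_slice(TOTAL_RATE_GBIT_S, SLICE_LENGTH_NS)
print(f"{bits:.0f} bits = {bits / 8:.0f} bytes per time slice on average")
```

Under these assumptions an average time slice carries about 3000 bits, or 375 bytes, for the full detector.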

(9)

Selection Process

How do we find the three signal tracks?

1) Selection cuts

2) Track fitting

3) Vertex search

[Figure: three signal tracks e+ e+ e-]

(10)

Geometrical Selection

After all cuts:

In subsequent layers, cut on:

$\phi_1 - \phi_0$ (xy-plane)

$z_1 - z_0$ (zr-plane)

[Figure: hits 0, 1, 2 in subsequent layers, shown in the xy- and zr-projections]
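A cut on the azimuthal and longitudinal differences between hits in subsequent layers could look like the following sketch. Hits are assumed to be `(x, y, z)` tuples, and the thresholds `max_dphi` and `max_dz` are hypothetical parameters, not values from the talk.

```python
import math

def delta_phi(hit0, hit1):
    """Azimuthal difference phi1 - phi0, wrapped into (-pi, pi]."""
    phi0 = math.atan2(hit0[1], hit0[0])
    phi1 = math.atan2(hit1[1], hit1[0])
    d = phi1 - phi0
    return math.atan2(math.sin(d), math.cos(d))

def passes_geometric_cuts(hit0, hit1, max_dphi, max_dz):
    """Keep a hit pair only if both |phi1 - phi0| and |z1 - z0| are small."""
    return (abs(delta_phi(hit0, hit1)) < max_dphi
            and abs(hit1[2] - hit0[2]) < max_dz)
```

Wrapping the angle avoids rejecting hit pairs that straddle the ±π boundary.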

(11)

Multiple Scattering Fit

Electrons: 12 – 53 MeV/c

Resolution dominated by multiple Coulomb scattering

Ignore hit uncertainty

Three consecutive hits: “triplet”

Multiple scattering at middle hit of triplet

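Minimize multiple scattering:

$\chi^2 = \frac{\Phi_{MS}^2}{\sigma_{MS,\Phi}^2} + \frac{\theta_{MS}^2}{\sigma_{MS,\theta}^2}$

[Figure: triplet of hits with scattering at the middle hit]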

Talk by A. Kozlinskiy (T 116.1, Thursday, 16:45)
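The χ² of the multiple scattering fit is a sum of two normalised scattering angles. The sketch below only evaluates this expression for given angles and uncertainties; it does not perform the minimisation over the track parameters, and the function and argument names are hypothetical.

```python
def triplet_chi2(phi_ms, theta_ms, sigma_phi, sigma_theta):
    """chi^2 of one triplet from the two scattering angles at the middle hit.

    phi_ms, theta_ms: scattering angles [rad]
    sigma_phi, sigma_theta: expected multiple scattering widths [rad]
    """
    return (phi_ms / sigma_phi) ** 2 + (theta_ms / sigma_theta) ** 2
```

For example, scattering angles of one and two standard deviations give χ² = 1 + 4 = 5.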

(12)

Fitting

Fit hits in first three layers

Propagate to 4th layer

Select hit in 4th layer closest to propagated position

Redo fit with a second triplet, cut on $\chi^2$

After all selections:

98.5 % of true 4-hit MC tracks selected

74 % of 4-hit tracks are true MC tracks
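The step "select hit in 4th layer closest to propagated position" can be sketched as a nearest-neighbour search. Positions are assumed to be `(x, y, z)` tuples in mm, and the search window `max_distance` is a hypothetical parameter.

```python
import math

def closest_hit(predicted, hits, max_distance):
    """Return the hit nearest to the propagated position, or None if none is within max_distance."""
    best_hit = None
    best_distance = max_distance
    for hit in hits:
        distance = math.dist(predicted, hit)  # Euclidean distance (Python 3.8+)
        if distance < best_distance:
            best_hit = hit
            best_distance = distance
    return best_hit
```

Returning `None` when no hit lies inside the window lets the caller drop the three-hit candidate instead of attaching an unrelated hit.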

(13)

Vertex Estimate: XY-Plane

Study each combination of two e+ and one e-

In the xy-plane: find intersections of track circles

Calculate weights of the intersections based on uncertainties due to multiple scattering and pixel size

[Figure: track circles of two e+ and one e- in the xy-plane]
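Finding the intersections of two track circles in the xy-plane is standard plane geometry. The following sketch assumes each track is described by a circle centre `(x, y)` and radius; the function name is hypothetical.

```python
import math

def circle_intersections(c0, r0, c1, r1):
    """Return the intersection points of two circles, or [] if they do not intersect."""
    dx = c1[0] - c0[0]
    dy = c1[1] - c0[1]
    d = math.hypot(dx, dy)
    if d == 0 or d > r0 + r1 or d < abs(r0 - r1):
        return []  # concentric, too far apart, or one circle inside the other
    a = (r0**2 - r1**2 + d**2) / (2 * d)      # distance from c0 to the chord
    h = math.sqrt(max(r0**2 - a**2, 0.0))     # half length of the chord
    mx = c0[0] + a * dx / d
    my = c0[1] + a * dy / d
    return [(mx + h * dy / d, my - h * dx / d),
            (mx - h * dy / d, my + h * dx / d)]
```

Two circles of radius 5 centred at (0, 0) and (8, 0) intersect at (4, -3) and (4, 3).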

(17)

Vertex Estimate

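[Figure: weighted mean of the intersections in the xy-plane with the points of closest approach PCAxy 1, PCAxy 2 and PCAxy 3 on the three tracks]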

Calculate weighted mean of intersections from three different tracks

Find point of closest approach (PCAxy) to weighted mean in xy-plane on each track

Calculate z-position PCAz and weight at PCAxy

Find weighted mean in z-coordinate

Achieves a vertex resolution of σ ≈ 400 μm
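A weighted mean of intersection points can be sketched as follows. The talk only states that the weights are based on the uncertainties; the inverse-variance weighting $w_i = 1/\sigma_i^2$ used here is an assumption, and the function names are hypothetical.

```python
def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean (assumed weighting: w_i = 1 / sigma_i^2)."""
    weights = [1.0 / s**2 for s in sigmas]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def weighted_mean_xy(points, sigmas):
    """Weighted mean of (x, y) points, applied per coordinate."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return weighted_mean(xs, sigmas), weighted_mean(ys, sigmas)
```

The same `weighted_mean` function could be reused for the z-coordinate with the PCAz values and their weights.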

(18)

Cut Effects

Signal reference: full offline track reconstruction and offline vertex fit

[Figure: fraction of signal frames accepted (axis 0.986 to 1.000) and of background frames accepted (axis 0 to 0.35) after successive cuts: no cuts; chi2; & target distance; & momentum magnitude; & total energy; & recurler cut. Legend: background, tightsignalcut]

Less than 2%

(19)

Fast Reconstruction on GPU

Use time slices of 50 ns for track & vertex search

→ Process $20 \cdot 10^6$ time slices per second

Plan for 12 filter farm PCs with one GPU each

→ Process at least $1.7 \cdot 10^6$ time slices per second

Thousands of cores

Optimal parallel performance

Best suited for many floating-point operations / second
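The required throughput follows directly from the slice length and the number of GPUs. The sketch below reproduces this arithmetic; the function names are hypothetical.

```python
def slices_per_second(slice_length_ns):
    """Number of consecutive time slices per second for a given slice length."""
    return 1e9 / slice_length_ns

def required_rate_per_gpu(slice_length_ns, n_gpus):
    """Time slices each GPU must process per second if the load is shared equally."""
    return slices_per_second(slice_length_ns) / n_gpus

print(f"total:   {slices_per_second(50.0):.2e} slices/s")
print(f"per GPU: {required_rate_per_gpu(50.0, 12):.2e} slices/s")
```

With 50 ns slices and 12 GPUs this gives 2·10^7 slices per second in total and about 1.67·10^6 per GPU, which is the "at least 1.7·10^6" figure above after rounding up.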

(20)

Selection on GPU

FPGA:

Input: hits in layers 1, 2, 3 and 4; recurl station hits; timing information

Coordinate transformation

Geometrical three-hit selection

Transfer via PCIe to GPU memory

GPU:

Three-hit fit

Propagation, four-hit fit

Positive tracks, negative tracks

Vertex selection

Selection decision written to GPU memory

(22)

Performance

Optimizations performed:

Memory layout and access pattern

Register usage

Grid dimensions

Currently processing $2 \cdot 10^6$ time slices/s on one NVIDIA GTX 1080 at a muon stopping rate of $7 \cdot 10^7$ Hz

(23)

Backup

(24)

Other Mu3e Talks:

L. Huth: Test beam results for neutron and proton irradiated MuPix7 prototypes, T26, Monday, 17:45

H. Augustin: The MuPix8, HK 18, Tuesday, 11:00

T. Kar: Large Area Monolithic Pixel Detectors for HL-LHC & Future High Rate Experiments, HK 18, Tuesday, 11:15

J. Kroeger: Flexprint Design Studies for the Mu3e Experiment, T46, Tuesday, 12:15

U. Hartenstein: Track Based Alignment for the Mu3e Detector, T89, Wednesday, 17:00

A.-K. Perrevoort: Searches for New Physics with the Mu3e Experiment, T78, Wednesday, 17:35

A. Herkert: A Thin Silicon Pixel Tracker for the Mu3e Experiment, T94, Wednesday, 18:30

S. Dittmeier: Readout of the Mu3e pixel detector, T94, Wednesday, 18:50

(25)

Institutions

University of Geneva

Heidelberg University

Karlsruhe Institute of Technology

Mainz University

Paul Scherrer Institut

ETH Zurich

University of Zurich

(26)

Parallelization Track Fit

[Diagram: time slices 1, 2, ..., N are processed in parallel, each by threads 1, 2, ...]

Each thread:

Fit for one combination of three hits

Propagation to 4th layer

Loop over hits in 4th layer: check if a hit exists in proximity of the propagated track, re-fit

Wait for all cores in one time slice to be done with previous steps

16 x 8192 time slices of 50 ns

96 threads / time slice
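The mapping of threads to time slices and hit combinations can be illustrated in plain Python. This is only a sketch of the indexing: it assumes that consecutive groups of 96 threads belong to one time slice and that a thread handles further combinations in a strided loop when a slice has more than 96 of them. The strided loop and all names are assumptions, not taken from the talk.

```python
THREADS_PER_SLICE = 96  # from the slide: 96 threads / time slice

def thread_assignment(global_thread_id):
    """Split a global thread index into (time slice index, thread index within the slice)."""
    return divmod(global_thread_id, THREADS_PER_SLICE)

def combinations_for_thread(local_thread_id, n_combinations):
    """Hit combinations handled by one thread (assumed strided loop over combinations)."""
    return range(local_thread_id, n_combinations, THREADS_PER_SLICE)
```

In CUDA terms this would correspond to one thread block per time slice, with the block-wide wait expressed as a barrier such as `__syncthreads()`.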

(28)

Parallelization Vertex Selection

[Diagram: time slices 1, 2, ..., N are processed in parallel, each by threads 1, 2, ...]

For one electron & one positron from this 50 ns time slice:

Loop over all other positrons

Find vertex estimate

Decide whether to keep this time slice
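The combinations examined in the vertex selection (one electron with every pair of positrons from the same time slice) can be enumerated as in the sketch below. The track objects are placeholders and the function name is hypothetical.

```python
def candidate_combinations(positrons, electrons):
    """Yield every (e+, e+, e-) combination from one time slice exactly once."""
    for electron in electrons:
        for i, first in enumerate(positrons):
            # Loop over all other positrons; starting at i + 1 avoids double counting
            for second in positrons[i + 1:]:
                yield first, second, electron
```

On the GPU, the two outer loops would be distributed over threads, so that each thread only runs the inner loop over the remaining positrons.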

(29)

Muon Stopping Rate Study I

[Figure: fraction of signal frames accepted (axis 0.86 to 1) and of background frames accepted (axis 0 to 0.06) versus muon stopping rate on target [Hz], from $4 \cdot 10^7$ to $1.2 \cdot 10^8$. Legend: background, tightsignalcut, truthsignal, losesignalcut]

(30)

Muon Stopping Rate Study II

[Figure: frames/s processed (axis 0 to $4.0 \cdot 10^6$) versus muon stopping rate on target, from $4.0 \cdot 10^7$ to $1.4 \cdot 10^8$]

[Figure: fraction of frames with hit overflow and of frames with triplet overflow (axis up to about 0.01) versus muon stopping rate on target, from $4.0 \cdot 10^7$ to $1.4 \cdot 10^8$]

(31)

Multiple Scattering

Muons decay at rest

→ momentum < 53 MeV/c

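Momentum resolution to first order:

$\sigma_p / p \sim \theta_{MS} / \Omega$

Use recurling tracks for momentum measurement

[Figure: recurling track in the magnetic field B with multiple scattering angle $\theta_{MS}$ and bending angle $\Omega \sim \pi$]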

(32)

Multiple Scattering Fit

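[Figure: triplet in the xy-plane with scattering angle $\Phi_{MS}$ and in the zr-plane with scattering angle $\theta_{MS}$; arc lengths $S_{01}$ and $S_{12}$]

$\chi^2 = \frac{\Phi_{MS}^2}{\sigma_{MS,\Phi}^2} + \frac{\theta_{MS}^2}{\sigma_{MS,\theta}^2}$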

(33)

Data Transfer

Transfer data from FPGA to RAM via direct memory access (DMA)

Tested at 1.5 GB/s: BER ≤ $4 \cdot 10^{-16}$ (at 95% confidence level)

Tested in beam test campaigns

Will be used for readout of next MuPix prototype

[Figure: board with LVDS connector for data]

(34)

DMA: Implementation

CUDA API: memory allocation

[Figure: mapping of virtual memory to physical memory blocks, each described by an address and a length (Length 1, Length 2, Length 3)]

Write addresses, lengths to FPGA

(35)

Radius Distribution

[Figure: radius distribution for positrons and electrons; y-axis: number of events / 4 mm, 0 to 3500]

(36)

Z distance

(37)

Uncertainty at Intersection

$\sigma_{MS,PCA} \approx \sigma_{MS,\text{first layer}} \cdot s \approx 0.8$ mm

Take both into account when

[Figure: multiple scattering sigma at first layer [rad], 0 to 0.1; y-axis: number of events / mrad ($\times 10^3$)]

[Figure: path length in xy-plane from first layer to PCA [mm], 0 to 50; y-axis: number of events / 0.5 mm ($\times 10^3$)]

(38)

Vertex Position Distribution

[Figures: true - estimated vertex position [mm], -10 to 10; y-axis: number of events / 0.1 mm; 7603 entries per histogram, each with a Gaussian fit]

Histogram         Mean [mm]   RMS [mm]   χ2 / ndf     Fit constant    Fit mean [mm]          Fit sigma [mm]
(axis not named)  0.02629     0.8089     40.6 / 6     1065 ± 18.8     0.01068 ± 0.00355      0.2314 ± 0.0037
x                 0.01541     1.332      84.29 / 12   600.6 ± 11.1    0.002901 ± 0.006102    0.3914 ± 0.0068
y                 0.04704     1.331      84.32 / 14   613 ± 10.9      0.004342 ± 0.005676    0.3941 ± 0.0058

(39)

$\chi^2$ Distribution

[Figure: $\chi^2$ distribution from 0 to 100 for signal and random combinations; y-axis: number of entries, logarithmic scale with ticks at $10^3$ and $10^4$]

(40)

Combined Momentum and Energy

[Figure: combined momentum magnitude [MeV/c], 0 to 100, for signal and random combinations; y-axis: number of events / MeV/c]

[Figure: combined energy [MeV], 0 to 200, for signal and random combinations; y-axis: number of events / MeV]

(41)

Distance to Target

[Figure: distance to target, 0 to 20, for signal and random combinations; y-axis: number of events / 0.1 mm]

(42)

Pixel Detector

High Voltage Monolithic Active Pixel Sensors (HV-MAPS)

Fast charge collection via drift

Thinned down to 50 μm

Pixel size: 80 μm x 80 μm

Chip size: 2 cm x 2 cm

Thickness of chip & readout: O(0.1 %) radiation length

I. Peric, P. Fischer et al., NIM A 582 (2007) 876

[Figure: sensor cross-section with P-substrate, N-well, E field and a traversing particle]

(43)

Mupix: Mechanics

50 μm silicon

∼ 50 μm flexprint: Kapton, aluminum, copper

25 μm Kapton foil

→ O(0.1 %) radiation length

(44)

Sensitivity Study

[Figure: sensitivity study for Mu3e Phase I with $10^{15}$ muon stops at $10^8$ muons/s; y-axis: events per 0.2 MeV/c, logarithmic scale; curves for μ → eee at branching ratios $10^{-12}$, $10^{-13}$, $10^{-14}$ and $10^{-15}$, and the Bhabha + Michel background]
