
4.5 PanTau – Tau ID with energy flow for ATLAS

4.5.4 Data – Monte Carlo comparison of energy flow quantities

sample. Hence, the following comparison of energy flow quantities between Monte Carlo samples and data only considers QCD jet backgrounds.

Early studies [129] already showed that the MC09 tune [130] of Pythia does not describe the jet shapes in data very well. The DW tune [131] of Pythia was shown to be more appropriate and is therefore used here. Unfortunately, the number of officially available QCD di-jet samples with the DW tune of Pythia is rather limited at the time of writing. In particular, the Monte Carlo samples in the energy ranges ET < 17 GeV and ET < 35 GeV have large scale factors between the Monte Carlo statistics and the expected number of events, of the order of O(2000) and O(150), respectively, which causes significant fluctuations in the following distributions. From the J0 sample (ET < 17 GeV), only one event out of about 400 000 Monte Carlo events passes the event selection cuts, mainly because of the requirement of a leading TauRec candidate with ET > 30 GeV. The J0 sample was therefore skipped completely in the following analysis. Until the larger Monte Carlo datasets that are being produced at the time of writing become available, it is not useful to further increase the amount of data considered. In order to reflect the relative contributions of the different di-jet Monte

cut                                     | Z→ττ | W→τν  | QCD jets         | yield  | data      | yield
Good Runs List                          | 81.3 | 866.9 | 5.960(9) × 10^7  |        | 6 396 104 |
leading tau                             | 22.5 | 114.5 | 3.124(14) × 10^6 | 5.2%   | 3 089 984 | 48.3%
Trigger L1 TAU5                         | 22.4 | 112.8 | 2.876(13) × 10^6 | 92.0%  | 2 875 777 | 93.1%
jet cleaning                            | 22.1 | 111.5 | 2.841(13) × 10^6 | 98.8%  | 2 804 790 | 97.5%
primary vertex                          | 22.1 | 111.5 | 2.840(13) × 10^6 | 100.0% | 2 798 661 | 99.8%
tau candidates                          | 14.7 | 48.6  | 1.961(10) × 10^6 | 69.1%  | 1 978 937 | 70.7%
tau back-to-back pair                   | 5.6  | 12.0  | 7.84(6) × 10^5   | 40.0%  | 807 235   | 40.8%
number of sub-leading PanTau candidates | 7.9  | 16.6  | 9.20(13) × 10^5  |        | 1 063 523 |

Table 4.5: Cut-flow of the data to Monte Carlo comparison. The uncertainties include statistical uncertainties of the Monte Carlo samples only, where the statistical uncertainties of the W→τν and Z→ττ samples are negligible. In addition to the number of events the table also shows the event yield with respect to the previous cut in percent. Monte Carlo samples are scaled with respect to each other according to their predicted cross sections and scaled to data after applying the Good Runs List, leading tau and trigger selection. The last row shows the total number of sub-leading PanTau candidates in the kinematic range pT > 15 GeV, |η| < 2.0.
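The yields relative to the previous cut quoted in Table 4.5 can be reproduced from the event counts. The following minimal sketch does this for the data column; the event counts are copied from the table, while the function name `relative_yields` is purely illustrative and not part of any analysis framework.

```python
# Event counts in data after each cut, copied from Table 4.5.
data_cutflow = [
    ("Good Runs List", 6_396_104),
    ("leading tau", 3_089_984),
    ("Trigger L1 TAU5", 2_875_777),
    ("jet cleaning", 2_804_790),
    ("primary vertex", 2_798_661),
    ("tau candidates", 1_978_937),
    ("tau back-to-back pair", 807_235),
]


def relative_yields(cutflow):
    """Return (cut name, yield in percent with respect to the previous cut)."""
    yields = []
    for (_, n_before), (name, n_after) in zip(cutflow, cutflow[1:]):
        yields.append((name, 100.0 * n_after / n_before))
    return yields


for name, percent in relative_yields(data_cutflow):
    print(f"{name:<22s} {percent:5.1f}%")
```

The same function applies to the QCD jets column if the scaled Monte Carlo event numbers are inserted instead.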

Carlo samples (labeled J1 to J4 with increasing transverse energy) their histograms have been stacked on top of each other in the following figures.
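As an illustration of how the di-jet samples can be weighted, normalised to data by area and stacked, consider the following sketch. It assumes that the scale factor of a sample is the expected number of events (cross section times integrated luminosity) divided by the number of generated events; all numbers and function names are hypothetical toy values, not those of the actual J1 to J4 samples.

```python
def sample_weight(cross_section_pb, luminosity_inv_pb, n_generated):
    """Scale factor: expected number of events per generated MC event."""
    return cross_section_pb * luminosity_inv_pb / n_generated


def stack_and_normalise(mc_hists, weights, data_hist):
    """Weight each MC histogram, scale the MC sum to the data area,
    and return the cumulative (stacked) bin contents per sample."""
    weighted = [[w * c for c in hist] for hist, w in zip(mc_hists, weights)]
    mc_area = sum(sum(hist) for hist in weighted)
    area_scale = sum(data_hist) / mc_area
    stacked = []
    running = [0.0] * len(data_hist)
    for hist in weighted:
        running = [r + area_scale * c for r, c in zip(running, hist)]
        stacked.append(running)
    return stacked


# Hypothetical toy inputs: two samples with three bins each.
mc_hists = [[10, 20, 10], [5, 10, 5]]
weights = [sample_weight(1.0e3, 2.0, 1000), sample_weight(1.0e2, 2.0, 400)]
data_hist = [30, 50, 20]

stacked = stack_and_normalise(mc_hists, weights, data_hist)
print(stacked[-1])  # top of the stack; its area equals the data area
```

A large weight means that a single generated event represents many expected events, which is why samples with large scale factors cause visible fluctuations in the stacked distributions.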

The pseudo-rapidity range in which PanTau candidates are reconstructed differs between the n-prong classes and the “other” class because of the η coverage of the Inner Detector: the n-prong classes require a certain number of reconstructed tracks, whereas candidates without qualified tracks can also end up in the “other” class. For |η| > 2.5 the track requirement cannot be fulfilled, which is visible when comparing the η distribution of all PanTau candidates (Figure 4.28a) with the distribution that ignores the “other” class (Figure 4.28c). In the higher momentum range 25 GeV < pT < 50 GeV (Figure 4.28b and Figure 4.28d) this difference is not as pronounced as in the low-momentum range 10 GeV < pT < 25 GeV, because for higher momenta the “other” class is dominated by candidates with more than seven qualified tracks.

[Figure 4.28, four panels: number of candidates versus η of the PanTau candidate for the stacked QCD J1 to J4 samples and data. (a) 10 GeV < pT < 25 GeV; (b) 25 GeV < pT < 50 GeV; (c) 10 GeV < pT < 25 GeV, without “other”; (d) 25 GeV < pT < 50 GeV, without “other”.]

Figure 4.28: Pseudorapidity η distribution of sub-leading PanTau candidates in Monte Carlo and data at √s = 7 TeV. The Monte Carlo distribution is normalised to data by area and the different contributions of the QCD di-jet MC (J1 - J4, DW tune) are stacked on top of each other.

Both for all candidates and when ignoring the “other” class, one observes differences between the QCD di-jet Monte Carlo prediction and the measured η distribution. Even though the Monte Carlo sample size is limited, one can see that the measurements are more central than the Monte Carlo prediction. This is a known feature of the Pythia DW tune and is also observed in the η distribution of the TauRec candidates [125]. The ATLAS tau performance group decided to use the Pythia DW tune nevertheless, because it describes the individual ID variables and jet shapes better. In order to reduce the effect of differences in the pseudo-rapidity distribution, we restrict the pseudo-rapidity range to |η| < 2.0 in the following.

Considering all PanTau candidates in the pseudo-rapidity range |η| < 2.0, one observes differences in the distribution of the reconstructed transverse momentum pT, especially at low momenta (Figure 4.29a). The “other” class is expected to be most sensitive to noise clusters in the calorimeter and other mismeasurements, because it includes candidates

[Figure 4.29, four panels: number of candidates versus PanTau candidate pT [GeV] for the stacked QCD J1 to J4 samples and data. (a) all classes; (b) without “other” class; (c) all classes, logarithmic scale; (d) without “other” class, logarithmic scale.]

Figure 4.29: Transverse momentum distribution of sub-leading PanTau candidates in Monte Carlo and data at √s = 7 TeV in the pseudo-rapidity range |η| < 2.0. The Monte Carlo distribution is normalised to data by area and the different contributions of the QCD di-jet MC (J1 - J4, DW tune) are stacked on top of each other. The left figures include all PanTau candidates, the right ones all candidates except for those belonging to the “other” class. Note the different range of the pT axis for the logarithmic plots.

[Figure 4.30, eight panels: number of candidates versus each identification variable for the stacked QCD J1 to J4 samples, Tau MC and data. (a) N(EFO±); (b) $|\sum_{\mathrm{EFO}} q_i|$; (c) $\langle \Delta R(\mathrm{jet},\mathrm{EFO}) \rangle^{\mathrm{all}}_{E_T}$; (d) $\langle \Delta R(\mathrm{jet},\mathrm{EFO}) \rangle^{\mathrm{charged}}_{E_T}$; (e) $\langle \Delta R(\mathrm{jet},\mathrm{EFO}) \rangle^{\mathrm{had.\ neutral}}_{E_T}$; (f) logit(I), indices 0.2 and 0.1; (g) $S_{\mathrm{jet}}$; (h) $d_0/\sigma(d_0)$.]

Figure 4.30: Distribution of the identification variables for the “1-prong” class in the transverse momentum range 10 GeV < pT < 25 GeV. The stacked histograms correspond to the different QCD di-jet samples and the black dots are measured data. Monte Carlo samples have been scaled to the integral of the data. The black line shows the distribution for W→τν and Z→ττ events for comparison.

[Figure 4.31, seven panels: number of candidates versus each identification variable for the stacked QCD J1 to J4 samples, Tau MC and data. (a) $|\sum_{\mathrm{EFO}} q_i|$; (b) $\langle \Delta R(\mathrm{jet},\mathrm{EFO}) \rangle^{\mathrm{all}}_{E_T}$; (c) $\langle \Delta R(\mathrm{jet},\mathrm{EFO}) \rangle^{\mathrm{EM\ neutral}}_{E_T}$; (d) $\mathrm{logit}(I^{\mathrm{tot}}_{0.2;0.4})$; (e) $d_0/\sigma(d_0)$; (f) $\sphericalangle(\mathrm{jet}, \mathrm{EFO}^{\pm}_{\mathrm{leading}})$; (g) $\sphericalangle(\vec{e}_{\mathrm{charged}}, \vec{e}_{\mathrm{neutral}})$.]

Figure 4.31: Distribution of the identification variables for the “1-prong+neutral” class in the transverse momentum range 10 GeV < pT < 25 GeV. The stacked histograms correspond to the different QCD di-jet samples and the black dots are measured data. Monte Carlo samples have been scaled to the integral of the data. The black line shows the distribution for W→τν and Z→ττ events for comparison.

without any qualified track as well as candidates with more than seven tracks. Indeed one observes a very good agreement between Monte Carlo prediction and data, when the “other” class is neglected in the transverse momentum distribution (Figure 4.29b, Figure 4.29d).

Figures 4.30 and 4.31 and Figures C.8 to C.19 in Appendix C.3 compare the ID variables for each individual category in data and in Pythia Monte Carlo (DW tune). To illustrate the difference between QCD jets and tau jets, the comparisons also include the Monte Carlo prediction for truth-matched tau candidates in W→τν and Z→ττ events.

Figures 4.30, 4.31 and C.8 to C.12 show the variables in the transverse momentum range 10 GeV < pT < 25 GeV, and Figures C.13 to C.19 show them in the range 25 GeV < pT < 50 GeV.

In general a good agreement between the Monte Carlo prediction for QCD jets and the measured data is achieved within the statistical uncertainties of the Monte Carlo samples. Some variables show larger discrepancies, though, and they are especially pronounced in the “1-prong” category (Figure 4.30). For example the number of charged energy flow objects N(EFO±) (Figure 4.30a) and the sum of charges $|\sum_{\mathrm{EFO}} q_i|$ (Figure 4.30b) in the “1-prong” category are significantly larger in the data than in the Monte Carlo prediction. Also the ET-weighted mean distance of charged energy flow objects (Figure 4.30d) has significantly larger tails to higher values in contrast to the same distribution for neutral, hadronic energy flow objects (Figure 4.30e).
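To make the three variables discussed above concrete, the following sketch computes N(EFO±), the absolute sum of charges and an ET-weighted mean distance between the jet axis and the charged energy flow objects for a toy candidate. It assumes that the weighted mean is defined as the sum of ET times ΔR divided by the sum of ET, with ΔR = sqrt(Δη² + Δφ²); the exact definition used in PanTau may differ, and the data layout and function names are hypothetical.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the eta-phi plane, with the phi difference wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def charged_efo_variables(jet_eta, jet_phi, efos):
    """Return (N(EFO+-), |sum of charges|, ET-weighted mean Delta R) for charged EFOs.

    efos is a list of dicts with the keys 'et', 'eta', 'phi' and 'charge'.
    """
    charged = [e for e in efos if e["charge"] != 0]
    n_charged = len(charged)
    abs_charge_sum = abs(sum(e["charge"] for e in charged))
    et_sum = sum(e["et"] for e in charged)
    if et_sum > 0:
        mean_dr = sum(
            e["et"] * delta_r(jet_eta, jet_phi, e["eta"], e["phi"]) for e in charged
        ) / et_sum
    else:
        mean_dr = float("nan")  # undefined without charged EFOs
    return n_charged, abs_charge_sum, mean_dr


# Hypothetical toy candidate: three charged EFOs and one neutral EFO.
toy_efos = [
    {"et": 10.0, "eta": 0.0, "phi": 0.1, "charge": +1},
    {"et": 30.0, "eta": 0.0, "phi": -0.3, "charge": -1},
    {"et": 10.0, "eta": 0.2, "phi": 0.0, "charge": +1},
    {"et": 5.0, "eta": 0.1, "phi": 0.1, "charge": 0},
]
print(charged_efo_variables(0.0, 0.0, toy_efos))
```

Additional charged objects that do not belong to the candidate raise N(EFO±) and, because they tend to lie far from the jet axis, pull the weighted mean distance towards larger values.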

[Figure 4.32: number of events (logarithmic scale) versus N(pile-up vertices), from 0 to 9. Statistics box: Entries 807235, Mean 0.6001, RMS 0.7939.]

Figure 4.32: Number of reconstructed pile-up vertices in the data events used for the Monte Carlo to data comparisons in this Section. Only events passing the selection cuts given in Table 4.5 are included here.

We will see that the discrepancies can mostly be explained by pile-up effects in data.

Due to the high number of protons in each of the proton bunches in the LHC, more than one pair of protons may interact in a single bunch crossing (cf. Section 2.1). Even though the number of protons per bunch in the 2010 LHC runs was small compared to the design parameters of the LHC, one still observed a sizable fraction of events with additional pile-up vertices. Figure 4.32 shows the number of reconstructed pile-up vertices in the events used for the comparisons. Most events do not yet include additional pile-up interactions, but there are a few events with five or more reconstructed pile-up vertices. In total a mean number of 0.6 pile-up vertices is found.

The Monte Carlo samples on the other hand do not include simulated pile-up vertices.

[Figure 4.33, six panels: distribution of each identification variable (horizontal axis) for N(pile-up vertices) = 0 to 5 and for MC (vertical axis). (a) N(EFO±), 1-prong; (b) $|\sum_{\mathrm{EFO}} q_i|$, 1-prong; (c) $\langle \Delta R(\mathrm{jet},\mathrm{EFO}) \rangle^{\mathrm{all}}_{E_T}$, 1-prong; (d) $\langle \Delta R(\mathrm{jet},\mathrm{EFO}) \rangle^{\mathrm{charged}}_{E_T}$, 1-prong; (e) $\langle \Delta R(\mathrm{jet},\mathrm{EFO}) \rangle^{\mathrm{had.\ neutral}}_{E_T}$, 1-prong; (f) $\langle \Delta R(\mathrm{jet},\mathrm{EFO}) \rangle^{\mathrm{EM\ neutral}}_{E_T}$, 1-prong+neutrals.]

Figure 4.33: Pile-up dependency of PanTau identification variables in the transverse momentum range 10 GeV < pT < 25 GeV. The variable distribution is given for different numbers of reconstructed pile-up vertices. The top row shows the Pythia Monte Carlo prediction (DW tune) for comparison, which does not include simulated pile-up. Entries are normalised in each row.
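The row-wise normalisation mentioned in the caption of Figure 4.33 can be sketched as follows. Each row holds the variable distribution for one number of pile-up vertices and is scaled to unit sum, so that the shapes can be compared even though the rows contain very different numbers of events. The counts below are hypothetical toy values.

```python
def normalise_rows(counts):
    """Scale each row of a 2D table of counts to unit sum; empty rows stay zero."""
    normalised = []
    for row in counts:
        total = sum(row)
        normalised.append([c / total if total > 0 else 0.0 for c in row])
    return normalised


# Hypothetical counts: rows = N(pile-up vertices) 0, 1, 2; columns = variable bins.
toy_counts = [
    [50, 30, 20],
    [10, 20, 20],
    [0, 0, 0],
]
for row in normalise_rows(toy_counts):
    print(row)
```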

Figure 4.33 visualises the pile-up dependency for selected variables. In particular, the total number of charged energy flow objects N(EFO±) (Figure 4.33a) shows a strong pile-up dependency. The distributions for events without pile-up vertices match the MC prediction much better. Figure 4.34 allows for a better comparison with the MC prediction by showing events with no additional pile-up vertices. Note that N(EFO±)

[Figure 4.34, four panels: fraction of tau candidates versus each identification variable for the stacked QCD J1 to J4 samples, Tau MC, data, and data with N(pile-up) = 0. (a) N(EFO±), 1-prong; (b) $|\sum_{\mathrm{EFO}} q_i|$, 1-prong+neutrals; (c) $\langle \Delta R(\mathrm{jet},\mathrm{EFO}) \rangle^{\mathrm{all}}_{E_T}$, 1-prong+neutrals; (d) $\langle \Delta R(\mathrm{jet},\mathrm{EFO}) \rangle^{\mathrm{charged}}_{E_T}$, 1-prong.]

Figure 4.34: Distribution of selected identification variables in the transverse momentum range 10 GeV < pT < 25 GeV. The stacked histograms correspond to the different QCD di-jet samples and the black dots denote measured data. Open circles represent events in data without additional measured vertices (N(pile-up) = 0). All histograms are normalised and show the fraction of tau candidates in each bin.

includes all charged energy flow objects without further track selection cuts. The same holds for the sum of charges $|\sum_{\mathrm{EFO}} q_i|$ (Figure 4.33b and Figure 4.34b). In general one observes a stronger dependency on the number of pile-up vertices for those variables including charged EFOs. This is fortunate as track-based observables can be made pile-up proof more easily than purely calorimeter-based observables for the following reason.

The pile-up vertices are shifted by up to a few cm along the z-axis with respect to the primary vertex due to the length of the colliding proton bunches. These shifts can be resolved by the Inner Detector tracking, so the tracks of charged particles can be selected to keep only those compatible with the primary vertex. Such a selection is not possible for calorimeter clusters without a track match, because the angular pointing resolution of the calorimeter clusters is far from good enough to resolve different pile-up vertices along the beam pipe. It should therefore be sufficient to re-tune the track selection cuts for non-qualified energy flow objects in PanTau to minimise the pile-up sensitivity of the identification variables, or to use only charged EFOs with qualified tracks in the calculation of variables. Note that the non-qualified selection is independent of the selection of qualified charged objects used in the categorisation, cf. Section 4.5.1.
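A track selection of this kind could look like the following sketch, which keeps only tracks whose longitudinal position at the point of closest approach to the beam line lies close to the primary vertex. The track representation, the function name and the cut value of 2 mm are hypothetical and chosen for illustration only; they are not the cuts used in PanTau.

```python
def select_primary_vertex_tracks(tracks, z_pv_mm, max_dz_mm=2.0):
    """Keep tracks with |z0 - z_PV| below max_dz_mm.

    tracks is a list of dicts with the key 'z0_mm'; max_dz_mm is an
    illustrative cut value, not a tuned one.
    """
    return [trk for trk in tracks if abs(trk["z0_mm"] - z_pv_mm) < max_dz_mm]


# Hypothetical toy tracks: two from the primary vertex, one from a pile-up
# vertex displaced by a few cm along the z-axis.
toy_tracks = [
    {"name": "trk_a", "z0_mm": 0.4},
    {"name": "trk_b", "z0_mm": -0.9},
    {"name": "trk_pileup", "z0_mm": 31.0},
]
kept = select_primary_vertex_tracks(toy_tracks, z_pv_mm=0.0)
print([trk["name"] for trk in kept])
```

Because pile-up vertices are typically displaced by much more than the tracking resolution in z, even a loose cut of this type removes most tracks from additional interactions.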

[Figure 4.35, two panels versus reconstructed $p_T^{\mathrm{reco}}$ [GeV] from 10 to 50 GeV, for MC and data. (a) Identification efficiency (τhad, ID), linear scale from 0.0 to 0.8; (b) Jet rejection $R^{\mathrm{ID}}_{\mathrm{jet}}$, logarithmic scale from 1 to 10^4.]

Figure 4.35: Tau identification efficiency and jet rejection versus the transverse momentum $p_T^{\mathrm{reco}}$, for Monte Carlo and data with standard variable selection (logit(p) with τ = 4). The momentum dependencies are shown for mean efficiencies of 10 %, 30 % and 50 %, where the mean efficiencies have been calculated in the energy ranges 15 GeV < ET < 25 GeV and 25 GeV < ET < 40 GeV, which leads to a “jump” at ET = 25 GeV, but keeps the efficiencies more comparable over the whole range. In contrast to Figure 4.25 only the efficiency and rejection of the identification step is given here, i.e. the efficiencies and rejections are normalised to the number of reconstructed tau candidates, not true taus or true jets. The uncertainties of the jet rejection need to be taken with care and are most likely underestimated, because the different jet samples have very different scale factors.

To conclude the previous results, Figure 4.35 shows the efficiency and rejection of the tau identification in PanTau as a function of the reconstructed transverse momentum of the tau candidates. In contrast to Figure 4.25 only the efficiency and rejection of the identification step is given here, i.e. the efficiencies and rejections are normalised to the number of reconstructed tau candidates, not true taus or true jets. This makes it easy to apply the same calculation of the jet rejection to measured data. Figure 4.35b compares the jet rejection as estimated from the Monte Carlo samples with the estimates from data. Again one assumes that the data sample contains no or only negligible contributions from true tau decays. As before, one observes a reasonably good agreement between the Monte Carlo estimate and the measured data. Only in the transverse momentum range below 15 GeV does the rejection in data seem to be slightly worse. More sophisticated methods to estimate the jet rejection, such as tag-and-probe methods, can be applied to QCD jets, but they will reduce the available statistics significantly and will not improve the results as long as true tau decays do not contribute. The extended methods have the advantage, though, that further systematic effects may be studied as well, like the difference between quark and gluon induced QCD jets, see e.g. reference [132].
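The normalisation to reconstructed candidates can be written down in a few lines. The sketch below assumes that the jet rejection is defined as the inverse of the fraction of reconstructed jet candidates passing the identification, which is a common convention but is not stated explicitly here; the function names and candidate counts are hypothetical.

```python
def id_efficiency(n_identified, n_reconstructed):
    """Fraction of reconstructed tau candidates passing the identification."""
    return n_identified / n_reconstructed


def jet_rejection(n_identified, n_reconstructed):
    """Inverse of the identification pass fraction for jet candidates
    (assumed definition); infinite if no candidate passes."""
    if n_identified == 0:
        return float("inf")
    return n_reconstructed / n_identified


# Hypothetical counts in one pT bin.
print(id_efficiency(300, 1000))   # tau candidates in signal Monte Carlo
print(jet_rejection(50, 10000))   # candidates in QCD Monte Carlo or in data
```

Since both quantities only need the number of reconstructed and identified candidates, the rejection can be evaluated on data in exactly the same way, provided that the contribution of true tau decays is negligible.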

Apart from the pile-up dependency of some identification variables the comparison of energy flow quantities and the PanTau jet rejection in Monte Carlo predictions and data show that the eflowRec and PanTau approaches are robust and work reliably not only on simulated events, but also in “real life”. The sensitivity to pile-up can easily be improved by further track selection criteria as argued before.