ATLAS NOTE ATLAS-CONF-2010-087


ATLAS-CONF-2010-087, 11 October 2010


August 19, 2010

Background studies for top-pair production in lepton plus jets final states in √s = 7 TeV ATLAS data

The ATLAS Collaboration

Abstract

The search for top quark pairs at the LHC requires the selection of events with leptons, high jet multiplicity, b-tagged jets and missing transverse energy, and the study of control samples with lower jet multiplicity to evaluate the backgrounds. A first study of the backgrounds in the lepton plus jets channel is presented using 295 nb⁻¹ of ATLAS pp collision data taken at √s = 7 TeV, including a data-driven determination of the contribution of QCD multi-jet events.


1 Introduction

The observation of top quark pair production is one of the key milestones for the early LHC physics program. The ATLAS collaboration has already presented first top pair candidate events in both lepton plus jets and dilepton plus jets channels, using an event selection designed for an early measurement of the tt̄ production cross-section [1]. However, the lepton plus jets channel in particular suffers from significant background contributions from other Standard Model processes, and the understanding of these backgrounds using data is an essential step in measuring the rate of top pair production.

The main backgrounds in the lepton plus jets channel are the production of W bosons in association with multiple additional jets, and QCD multi-jet events where one jet is misidentified as a lepton (electron or muon). As b-jet tagging is used in the selection, the fraction of these processes where bb̄ pairs are produced is particularly important. None of these backgrounds can be estimated reliably from Monte Carlo simulation; all require studies using data. The W + jets background at high jet multiplicity can be constrained by looking at lower jet multiplicity bins, with and without b-tagging. The QCD multi-jet background can be studied using background-enhanced control samples and loosened lepton selections, and predictions for the contribution in the signal region can then be derived.

This note presents distributions for the lepton plus jets top pair event selection, modified to relax the requirements on the number of jets and number of b-tagged jets. The distributions are compared to expectations from Monte Carlo simulation for tt̄, single top and W/Z + jets, together with QCD multi-jet contributions predicted from data by the so-called ‘matrix method’. After introducing the data selection and corresponding samples of Monte Carlo simulated data, the object and event selections are described, and the selected event distributions are presented.

2 Data sample

The ATLAS detector [2] covers nearly the entire solid angle around the collision point with layers of tracking detectors, calorimeters and muon chambers.¹ All these detectors play important roles in the reconstruction of tt̄ events, and only data where all were fully operational is used. Applying these requirements to √s = 7 TeV pp collision data taken in stable beam conditions and recorded up until 19th July 2010 results in a data sample of about 295 nb⁻¹. This luminosity estimate has an uncertainty of 11% [3].

3 Simulated event samples

For the generation of tt̄ signal and single top events, MC@NLO [4] v3.41 was used, with PDF set CTEQ66 [5], assuming a top mass of 172.5 GeV and normalizing the tt̄ cross-section to the prediction of [6], which is consistent with calculations performed at next-to-leading order. For single top the s, t and Wt channels are included, normalizing to the MC@NLO cross-section and using the ‘diagram removal scheme’ [7] for Wt to remove overlaps with the tt̄ final state.

For the generation of W+(bb̄)+jets and γ*/Z+(bb̄)+jets (Drell-Yan), ALPGEN v2.13 was used, invoking the MLM matching scheme [8] with matching parameters RCLUS = 0.7 and ETCLUS = 20, and using parton density function set CTEQ6L1 [9]. For the γ*/Z+jets the phase space has been restricted to m(l⁺l⁻) > 40 GeV. The W/Z+jets samples were normalized with a K-factor (ratio of next-to-leading to leading-order cross-sections) of 1.22 [10]. All events were hadronized with HERWIG [11], using Jimmy [12] for the underlying event model.
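The normalization described above can be illustrated with a minimal sketch, assuming the conventional per-event weight w = σ_LO · K · L / N_generated. Only the K-factor of 1.22 and the integrated luminosity of 295 nb⁻¹ come from this note; the function name, the leading-order cross-section and the generated sample size are hypothetical placeholders, not values used in the analysis.

```python
def mc_event_weight(sigma_lo_nb, k_factor, lumi_inv_nb, n_generated):
    """Weight that scales one simulated event to the data luminosity.

    sigma_lo_nb  : leading-order cross-section in nb (hypothetical input)
    k_factor     : NLO/LO cross-section ratio
    lumi_inv_nb  : integrated luminosity in nb^-1
    n_generated  : number of generated events in the sample
    """
    if n_generated <= 0:
        raise ValueError("n_generated must be positive")
    return sigma_lo_nb * k_factor * lumi_inv_nb / n_generated


K_FACTOR_WZ_JETS = 1.22  # quoted in the text
LUMI_INV_NB = 295.0      # quoted in the text

# Placeholder sample: 1 nb leading-order cross-section, 100000 generated events
N_GENERATED = 100_000
weight = mc_event_weight(1.0, K_FACTOR_WZ_JETS, LUMI_INV_NB, N_GENERATED)

# Expected yield before any selection is the weight summed over all events
expected_events = weight * N_GENERATED
```

With these placeholder inputs the expected yield is simply σ_LO · K · L; the 11% luminosity uncertainty would propagate directly to it.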

¹ In the right-handed ATLAS coordinate system, the pseudorapidity η is defined as η = −ln(tan(θ/2)), where the polar angle θ is measured with respect to the LHC beamline. The azimuthal angle φ is measured with respect to the x-axis, which points towards the centre of the LHC ring. The z-axis is parallel to the anti-clockwise beam.


Subsequent detector and trigger simulation, followed by offline reconstruction, has been performed with standard ATLAS software making use of GEANT4 [13]. The effect of pileup, i.e. additional proton-proton interactions in the same beam crossing, was not simulated, but is expected to be small for the studies presented here.

4 Object selection

The reconstruction of tt̄ events makes use of reconstructed electrons, muons and jets, and the overall momentum balance in the transverse plane. The following criteria, identical to those employed in [1], are used to define the selected objects in the events:

• Electrons: Electron candidates are required to pass the medium electron selection as defined in [14], with p_T > 20 GeV and |η_cluster| < 2.47, excluding the calorimeter transition region at 1.37 < |η_cluster| < 1.52, where η_cluster is the pseudorapidity of the calorimeter energy cluster associated with the candidate. The track must have an associated hit in the innermost pixel layer in order to remove photon conversions, except where the track passes through one of the 2% of modules known to be dead. Furthermore, the additional energy deposition in the calorimeter within a cone of radius ΔR = 0.2 must be less than 4 GeV + 0.023 · p_T^ℓ, where ΔR = √(Δη² + Δφ²) and p_T^ℓ is the transverse momentum of the electron. The last requirement ensures electrons are isolated, reducing the rate of jets misidentified as electrons, and suppressing the selection of electrons from heavy flavour decays inside jets.

• Muons: Muons are reconstructed by combining tracks from the inner detector with tracks in the muon spectrometer using the algorithm defined as ‘chain 2’ in [15], requiring p_T > 20 GeV and |η| < 2.5. To ensure isolation, the energy deposition in the calorimeter and the sum of track transverse momenta measured in cones of radius ΔR = 0.3 around the muon track are each required to be less than 4 GeV. Additionally, muons are required to have a distance ΔR greater than 0.4 from any jet with p_T > 20 GeV, further suppressing muons from heavy flavour decays inside jets.

• Jets: Jets are reconstructed with the anti-k_t algorithm [16] with parameter R = 0.4, by combining topological clusters in the calorimeters [17]. The clusters are reconstructed at the electromagnetic (EM) energy scale, assuming the energy deposits are due to electrons or photons. The jets are then calibrated to the hadronic energy scale, using p_T and η dependent correction factors obtained from simulation [18]. If a jet is the closest jet to an electron candidate and the corresponding distance ΔR is less than 0.2, the jet is removed from consideration in order to avoid double-counting of electrons as jets. Jets are considered b-tagged if the secondary vertex-based tagger SV0 returns a value above a threshold that is defined by a 50% tagging efficiency, obtained from studies of simulated tt̄ events [19].

• Missing transverse energy: The missing transverse energy E_T^miss is constructed from the vector sum of all calorimeter cells, resolved into the transverse plane. Cells not associated to a jet or electron are included at the EM scale. Cells associated with jets are taken at the corrected energy scale that was used for jets, while the contribution from cells associated with electrons is substituted by the calibrated transverse energy of the electron. Finally, the contribution from muons passing the ‘chain 2’ requirements is included, also removing the contribution of any calorimeter cell associated to the muon.

For the implementation of the matrix method to estimate QCD background, looser lepton selections are required. For electrons, a loose selection is defined by removing the requirement that the electron have a hit in the innermost pixel layer. For muons, the loose selection removes the < 4 GeV requirements on isolation energy in cones of ΔR = 0.3 in both tracking and calorimetry.
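The geometric and isolation cuts listed above can be sketched in a few lines of Python. This is an illustration only, not the ATLAS reconstruction software: the function names and the dictionary layout used for jets, electrons and muons are hypothetical, momenta and energies are assumed to be in GeV, and only the numerical thresholds are taken from the text.

```python
import math


def pseudorapidity(theta):
    """eta = -ln(tan(theta/2)) for a polar angle theta in radians."""
    return -math.log(math.tan(theta / 2.0))


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Distance dR = sqrt(d_eta^2 + d_phi^2) in the eta-phi plane."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def electron_passes_kinematics(pt, eta_cluster):
    """p_T > 20 GeV, |eta_cluster| < 2.47, outside 1.37 < |eta_cluster| < 1.52."""
    abs_eta = abs(eta_cluster)
    in_transition_region = 1.37 < abs_eta < 1.52
    return pt > 20.0 and abs_eta < 2.47 and not in_transition_region


def electron_is_isolated(pt, cone20_energy):
    """Extra calorimeter energy in dR = 0.2 below 4 GeV + 0.023 * p_T."""
    return cone20_energy < 4.0 + 0.023 * pt


def muon_is_isolated(calo_cone30, track_cone30):
    """Calorimeter energy and track p_T sum in dR = 0.3 each below 4 GeV."""
    return calo_cone30 < 4.0 and track_cone30 < 4.0


def muon_far_from_jets(muon, jets):
    """Muon must be more than dR = 0.4 from every jet with p_T > 20 GeV."""
    return all(
        delta_r(muon["eta"], muon["phi"], jet["eta"], jet["phi"]) > 0.4
        for jet in jets
        if jet["pt"] > 20.0
    )


def remove_jets_overlapping_electrons(jets, electrons, max_dr=0.2):
    """For each electron, drop its closest jet if that jet lies within max_dr."""
    removed = set()
    for el in electrons:
        if not jets:
            break
        distances = [delta_r(el["eta"], el["phi"], j["eta"], j["phi"]) for j in jets]
        closest = min(range(len(jets)), key=distances.__getitem__)
        if distances[closest] < max_dr:
            removed.add(closest)
    return [jet for i, jet in enumerate(jets) if i not in removed]
```

In this sketch the loose muon selection would correspond to skipping `muon_is_isolated`; the loose electron selection drops the pixel-hit requirement, which is not modelled here.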

5 Event selections

The tt̄ candidate event selection for the lepton plus jets channel starts by requiring exactly one offline-reconstructed electron or muon with p_T > 20 GeV and satisfying the object requirements given in Section 4 above. For the electron channel, the reconstructed electron must match a level-one electromagnetic trigger object with a 10 GeV p_T threshold within ΔR < 0.15. For the muon channel, the 10 GeV-threshold level-one trigger is required to have fired, but no matching requirement is imposed.² Events must have a reconstructed primary vertex with at least 5 tracks, and are discarded if any jet with p_T > 10 GeV at the EM scale fails jet quality cuts designed to reject jets arising from out-of-time activity or calorimeter noise [20]. These quality cuts remove a negligible fraction of simulated events.

For the tt̄ candidate selection, at least four jets with p_T > 20 GeV and |η| < 2.5 are required, at least one of which must be b-tagged. For the control samples studied in this note, the event selection is modified to accept events with one or more jets, with and without b-tagging requirements. Finally, the missing transverse energy must satisfy E_T^miss > 20 GeV in both electron and muon channels.
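The cut flow above can be summarised as a single predicate. The following is a minimal sketch under stated assumptions: the event is represented by a hypothetical dictionary whose keys (`n_selected_leptons`, `trigger_ok`, `n_vertex_tracks`, `has_bad_jet`, `jets`, `met`) are invented for illustration, and the channel-specific trigger requirements are collapsed into one boolean flag.

```python
def passes_event_selection(event, min_jets=4, require_btag=True):
    """Return True if a (hypothetical) event record passes the lepton plus jets cuts.

    min_jets=4, require_btag=True corresponds to the candidate selection;
    min_jets=1 with or without b-tagging corresponds to the control samples.
    """
    # Exactly one selected electron or muon, with the trigger requirement met
    if event["n_selected_leptons"] != 1 or not event["trigger_ok"]:
        return False
    # Reconstructed primary vertex with at least 5 tracks
    if event["n_vertex_tracks"] < 5:
        return False
    # Discard the event if any jet fails the quality cuts
    if event["has_bad_jet"]:
        return False
    # Count jets with p_T > 20 GeV and |eta| < 2.5
    good_jets = [j for j in event["jets"] if j["pt"] > 20.0 and abs(j["eta"]) < 2.5]
    if len(good_jets) < min_jets:
        return False
    # Optionally require at least one b-tagged jet among the counted jets
    if require_btag and not any(j["btag"] for j in good_jets):
        return False
    # Missing transverse energy above 20 GeV in both channels
    return event["met"] > 20.0


# Hypothetical two-jet event, used only to exercise the function
example_event = {
    "n_selected_leptons": 1,
    "trigger_ok": True,
    "n_vertex_tracks": 8,
    "has_bad_jet": False,
    "jets": [
        {"pt": 45.0, "eta": 0.3, "btag": True},
        {"pt": 28.0, "eta": -1.9, "btag": False},
        {"pt": 15.0, "eta": 0.1, "btag": False},  # below the 20 GeV threshold
    ],
    "met": 35.0,
}

in_signal_region = passes_event_selection(example_event)
in_control_region = passes_event_selection(example_event, min_jets=1, require_btag=False)
```

Keeping the jet multiplicity and b-tag requirements as parameters makes it straightforward to fill each jet-multiplicity bin with the same function.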

6 Estimate of QCD multi-jet background

The QCD multi-jet background is determined from the data using the matrix method, which works by defining two event samples: the ‘tight’ sample passing all event selection cuts defined above, and a ‘loose’ sample where the lepton identification cuts are relaxed as discussed in Section 4 above. The numbers of events in the tight and loose samples (N^tight and N^loose) can be broken down into contributions from ‘real’ leptons produced in W and Z decays, and ‘fake’ leptons from all other sources in QCD multi-jet events (leptons produced from heavy flavour decays inside jets, photon conversions, misidentified hadrons, etc.):

N^loose = N_real^loose + N_fake^loose,
N^tight = ε_real N_real^loose + ε_fake N_fake^loose,    (1)

where ε_real = N_real^tight / N_real^loose and ε_fake = N_fake^tight / N_fake^loose are the fractions of loose real and fake leptons that also satisfy the tight lepton identification criteria. These ratios are estimated using control samples enriched in real and fake leptons respectively. With this small initial data sample, ε_real is taken from Monte Carlo simulated Z → ℓℓ events, and ε_fake is taken from a real data control sample enriched in QCD multi-jet events, selected by requiring at least one jet with p_T > 20 GeV and E_T^miss < 10 GeV.

Using these estimates, equation (1), which can also be written in matrix form, can be solved to give an estimate of N_fake^tight, the number of fake leptons from QCD multi-jet events in the tight sample, together with the associated statistical error dependent on N^loose and N^tight. Since the ε_fake control sample is not pure, but has a small residual contamination from W and Z events, this procedure has to be iterated and typically converges after two iterations. In practice, the procedure is performed in bins to yield estimates as a function of jet multiplicity or kinematic variables, both before and after applying the b-tagging requirement.
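Eliminating the real-lepton yield from the two relations in equation (1) gives the closed form N_fake^tight = ε_fake / (ε_real − ε_fake) · (ε_real N^loose − N^tight). The sketch below evaluates it for a single bin. The error treatment is an assumption, not necessarily the one used in the note: the tight and loose-but-not-tight counts are taken as independent Poisson variables, and the uncertainties on the efficiencies, as well as the iteration for W and Z contamination, are ignored. The function name is hypothetical.

```python
import math


def matrix_method_fake_tight(n_loose, n_tight, eff_real, eff_fake):
    """Estimate the fake-lepton yield in the tight sample and its statistical error.

    Solves
        N_loose = N_real + N_fake
        N_tight = eff_real * N_real + eff_fake * N_fake
    for eff_fake * N_fake, where N_real and N_fake are loose-sample yields.
    """
    if not 0.0 <= eff_fake < eff_real <= 1.0:
        raise ValueError("need 0 <= eff_fake < eff_real <= 1")
    if n_tight > n_loose:
        raise ValueError("the tight sample must be a subset of the loose sample")

    scale = eff_fake / (eff_real - eff_fake)
    n_fake_tight = scale * (eff_real * n_loose - n_tight)

    # Rewrite in terms of the independent counts: tight, and loose-but-not-tight.
    # N_fake_tight = scale * (eff_real * n_not_tight - (1 - eff_real) * n_tight)
    n_not_tight = n_loose - n_tight
    variance = scale ** 2 * (
        eff_real ** 2 * n_not_tight + (1.0 - eff_real) ** 2 * n_tight
    )
    return n_fake_tight, math.sqrt(variance)


# Closure check with invented numbers: 100 real and 200 fake loose leptons,
# eff_real = 0.9 and eff_fake = 0.2, so N_tight = 90 + 40 = 130.
estimate, stat_error = matrix_method_fake_tight(300, 130, 0.9, 0.2)
```

In the closure check the function recovers the 40 fake tight leptons that were put in. The prefactor ε_fake / (ε_real − ε_fake) shows why the method loses precision when the two efficiencies are close to each other.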

Several other methods of measuring the QCD multi-jet background are under development, making use of alternative lepton selections and control samples, various fitting techniques, and complementary information such as the impact parameters of lepton candidate tracks with respect to the primary vertex.

At this preliminary stage, the different methods agree to within about 30% in regions where enough data are available to make a meaningful comparison. However, detailed studies with more data will be required before the systematic uncertainty on this background can be fully quantified.

² Due to the state of the muon trigger commissioning, the matching requirement is not fully efficient for this data sample.

7 Event distributions

The results of applying the modified lepton plus jets selection, requiring at least one jet with p_T > 20 GeV and imposing no b-tagging requirement, are shown in Figures 1 to 3. In these and the following plots, the data are shown by the points with error bars, compared to the sum of all expected contributions. The tt̄, single top and W/Z + jets contributions are taken from Monte Carlo simulation, whereas the QCD multi-jet background is estimated from the matrix method as discussed above. The uncertainty in the total expectation due to the QCD estimate statistical error is shown by the hatched area. No systematic uncertainties on the QCD or Monte Carlo-based estimates are shown.

Figure 1 shows the E_T^miss distributions, Figure 2 the transverse mass m_T of the lepton-E_T^miss system, calculated as described in [14], and Figure 3 the p_T of the leading jet. The data in these distributions show good agreement with expectations, giving confidence in the ability of the matrix method to predict the QCD multi-jet background contribution at low jet multiplicity, and clearly showing the W-boson peak in both electron and muon channels.
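For orientation, the transverse mass can be sketched as follows. This note defers to [14] for the exact definition; the code assumes the conventional form m_T = √(2 p_T^ℓ E_T^miss (1 − cos Δφ)), where Δφ is the azimuthal angle between the lepton and the missing transverse energy. The function name and arguments are hypothetical.

```python
import math


def transverse_mass(lepton_pt, lepton_phi, met, met_phi):
    """Transverse mass of the lepton-E_T^miss system (inputs in GeV and radians)."""
    # cos is periodic, so the azimuthal difference needs no explicit wrapping
    m_t_squared = 2.0 * lepton_pt * met * (1.0 - math.cos(lepton_phi - met_phi))
    # Guard against tiny negative values from floating-point rounding
    return math.sqrt(max(0.0, m_t_squared))


# Lepton and missing energy back to back, 40 GeV each
m_t_back_to_back = transverse_mass(40.0, 0.0, 40.0, math.pi)

# Lepton and missing energy collinear
m_t_collinear = transverse_mass(40.0, 1.2, 40.0, 1.2)
```

For W → ℓν decays this quantity has a kinematic edge near the W mass, which is the peak structure referred to in the text, while QCD multi-jet events tend to populate low values.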

Figure 4 shows the jet multiplicity for events passing the modified selection, both without and with the additional requirement that at least one jet be b-tagged. The untagged distributions (top row) show that significant samples of W + 1, 2, 3 jet events are becoming available, which will allow the simulation predictions for the 4-jet bin to be constrained from data. The purely simulation-based predictions of W + jets rates in the higher multiplicity bins have theoretical uncertainties of O(50%) [21], and within these uncertainties, the data are in good agreement with the predictions.

The bottom row of Figure 4 shows the b-tagged jet multiplicity distributions, where the ≥ 4-jet bin corresponds to the event selection used for the top candidate search. These distributions currently have very low statistics, which also affects the precision of the QCD background estimate. Nevertheless, good agreement between data and the Standard Model prediction is also seen here, and with more statistics it will become possible to constrain the fraction of W + jets events which have b quarks from data.

8 Conclusion

Building on the selection of first top candidate events in ATLAS described in [1], the first studies of control distributions, which will eventually allow the backgrounds in the lepton plus jets channel to be constrained, have been performed using a 295 nb⁻¹ data sample. Distributions for W production in association with jets have been shown, both before and after imposing a b-tag requirement. The contributions from QCD multi-jet events have been estimated from the data using the matrix method.

Within the large uncertainties at this early stage, the distributions are in agreement with expectations. However, further studies on a larger data sample will be needed to quantify the backgrounds to a level where a more conclusive statement on the observation of top quark production in the lepton plus jets channel in ATLAS can be made.

References

[1] The ATLAS Collaboration, Search for top pair candidate events in ATLAS at √s = 7 TeV, ATLAS-CONF-2010-063.

[2] G. Aad et al., ATLAS Collaboration, The ATLAS experiment at the CERN Large Hadron Collider, JINST 3 S08003 (2008).


[3] The ATLAS Collaboration, Luminosity Determination Using the ATLAS Detector, ATLAS-CONF-2010-60.

[4] S. Frixione and B.R. Webber, Matching NLO QCD computations and parton shower simulations, JHEP 06 (2002) 029, arXiv:hep-ph/0204244;

S. Frixione, P. Nason and B.R. Webber, Matching NLO QCD and parton showers in heavy flavour production, JHEP 08 (2003) 007, arXiv:hep-ph/0305252;

S. Frixione, E. Laenen and P. Motylinski, Single-top production in MC@NLO, JHEP 03 (2006) 092, arXiv:hep-ph/0512250.

[5] P.M. Nadolsky et al., Implications of CTEQ global analysis for collider observables, Phys. Rev. D78 (2008) 013004, arXiv:0802.0007 [hep-ph].

[6] S. Moch and P. Uwer, Theoretical status and prospects for top-quark pair production at hadron colliders, Phys. Rev. D78 (2008) 034003, arXiv:0804.1476 [hep-ph];

U. Langenfeld, S. Moch, and P. Uwer, New results for ttbar production at hadron colliders, arXiv:0907.2527 [hep-ph].

[7] S. Frixione, E. Laenen, P. Motylinski, B.R. Webber and C.D. White, Single-top hadroproduction in association with a W boson, JHEP 07 (2008) 029, arXiv:0805.3067 [hep-ph].

[8] M.L. Mangano, M. Moretti, F. Piccinini, R. Pittau and A.D. Polosa, ALPGEN, a generator for hard multiparton processes in hadronic collisions, JHEP 07 (2003) 001, arXiv:hep-ph/0206293.

[9] J. Pumplin et al., New generation of parton distributions with uncertainties from global QCD analysis, JHEP 07 (2002) 012, arXiv:hep-ph/0201195.

[10] The ATLAS Collaboration, Expected Performance of the ATLAS Experiment: Detector, Trigger and Physics, CERN-OPEN-2008-020, pages 874–878.

[11] G. Corcella et al., HERWIG 6.5: an event generator for Hadron Emission Reactions With Interfering Gluons (including supersymmetric processes), JHEP 01 (2001) 010, arXiv:hep-ph/0011363;

G. Corcella et al., HERWIG 6.5 release notes, arXiv:hep-ph/0210213.

[12] J.M. Butterworth et al., Multiparton interactions in photoproduction at HERA, Z. Phys. C72 (1996) 637.

[13] S. Agostinelli et al., Geant4 – a simulation toolkit, Nucl. Instr. Meth. A 506 (2003) 250;

J. Allison et al., Geant4 developments and applications, IEEE Transactions on Nuclear Science 53 No. 1 (2006) 270–278.

[14] The ATLAS Collaboration, Observation of W → ℓν and Z → ℓℓ production in proton-proton collisions at √s = 7 TeV with the ATLAS detector, ATLAS-CONF-2010-044.

[15] The ATLAS Collaboration, Muon Performance in Minimum Bias pp Collision Data at √s = 7 TeV with ATLAS, ATLAS-CONF-2010-036.

[16] M. Cacciari, G.P. Salam and G. Soyez, The anti-k_t jet clustering algorithm, JHEP 04 (2008) 063.

[17] The ATLAS Collaboration, Properties of Jets and Inputs to Jet Reconstruction and Calibration with the ATLAS Detector using Proton-Proton Collisions at √s = 7 TeV, ATLAS-CONF-2010-053.

[18] The ATLAS Collaboration, Jet energy scale and its systematic uncertainty in ATLAS for jets produced in proton-proton collisions at √s = 7 TeV, ATLAS-CONF-2010-056.


[19] The ATLAS Collaboration, Performance of the ATLAS Secondary Vertex b-tagging Algorithm in 7 TeV Collision Data, ATLAS-CONF-2010-042.

[20] The ATLAS Collaboration, Data-Quality Requirements and Event Cleaning for Jets and Missing Transverse Energy Reconstruction with the ATLAS Detector in Proton-Proton Collisions at a Center-of-Mass Energy of √s = 7 TeV, ATLAS-CONF-2010-038.

[21] M. Mangano, Understanding the Standard Model, as a bridge to the discovery of new phenomena at the LHC, CERN-PH-TH-2008-019, arXiv:0802.0026 [hep-ph].


Figure 1: Missing transverse energy E_T^miss distributions for the modified electron (left) and muon (right) plus jets selections requiring at least one jet with p_T > 20 GeV. The data are shown by the points with error bars, compared to the sum of all expected contributions, taken from Monte Carlo simulation (tt̄, single top, W and Z + jets) or estimated using a data-driven technique (QCD multi-jet). The hatched area shows the uncertainty on the total expectation due to the statistical error on the QCD background estimate.

Figure 2: Distributions of the transverse mass m_T of the lepton-E_T^miss system, for the modified electron (left) and muon (right) plus jets selection requiring at least one jet with p_T > 20 GeV. The data are shown by the points with error bars, compared to the sum of all expected contributions, taken from Monte Carlo simulation (tt̄, single top, W and Z + jets) or estimated using a data-driven technique (QCD multi-jet). The hatched area shows the uncertainty on the total expectation due to the statistical error on the QCD background estimate.


Figure 3: Leading jet p_T distributions for the modified electron (left) and muon (right) plus jets selection requiring at least one jet. The data are shown by the points with error bars, compared to the sum of all expected contributions, taken from Monte Carlo simulation (tt̄, single top, W and Z + jets) or estimated using a data-driven technique (QCD multi-jet). The hatched area shows the uncertainty on the total expectation due to the statistical error on the QCD background estimate.


Figure 4: Jet multiplicity distributions for the modified electron (left) and muon (right) plus jets selection, requiring at least one jet with p_T