
this background as well if the W decays leptonically, including via a leptonically decaying τ-lepton. The nominal W+jets process is modelled in the same way as Z → ll+jets.

Alternative samples for W+jets, Z+jets and diboson are generated with Powheg+Pythia8 using the AZNLO tune [92] and the CTEQ6L1 PDF set [93]. Finally, another alternative sample for Z+jets is considered which is modelled by MadGraph 5 [88] for the ME interfaced to Pythia 8 with the A14 tune.

All these MC generator setups, those producing the nominal samples as well as the alternative ones, contribute a significant amount to the total systematic uncertainty, which dominates this measurement. These uncertainties are further discussed in Section 6.5.3.

6.4 Sanity, closure and stress tests of the PDF method

Different tests have been performed to evaluate the robustness and precision of the new PDF likelihood method presented in Section 6.2. The tests work as follows:

1. Pseudo-data with a known b-tagging efficiency (ε_b^truth) is created by MC generators;

2. The LLH functions from Equations 6.7 and 6.8 are fitted to the pseudo-data;

3. The POI ε_b^pseudo-data is extracted from the fit and compared to ε_b^truth, similarly to the measurement based on real data.
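The three steps above can be sketched with a toy model. Here the binomial pseudo-data and the one-parameter maximum-likelihood "fit" are illustrative stand-ins for the full LLH functions of Equations 6.7 and 6.8; all names and numbers are invented for this sketch:

```python
import numpy as np

def run_pseudo_experiment(eps_b_truth, n_jets, fluctuate=False, rng=None):
    """Create pseudo-data from a known b-tagging efficiency and re-fit it.

    Toy stand-in for the PDF method: the pseudo-data is the number of
    b-tagged jets, and the 'fit' is the maximum-likelihood estimator of
    a binomial efficiency.
    """
    rng = rng or np.random.default_rng(42)
    n_tagged = eps_b_truth * n_jets         # build pseudo-data from the model itself
    if fluctuate:                           # optional Poisson smearing (second sanity test)
        n_tagged = rng.poisson(n_tagged)
    eps_b_fit = n_tagged / n_jets           # ML estimate of the efficiency (the POI)
    scale_factor = eps_b_fit / eps_b_truth  # SF = eps_b^measured / eps_b^truth
    return eps_b_fit, scale_factor

# Without fluctuations the method must find its own input exactly (SF = 1).
eps_fit, sf = run_pseudo_experiment(eps_b_truth=0.77, n_jets=100_000)
```

In the full method the fit extracts ε_b per pT bin from the two-jet tag categories, but the closure logic is the same: whatever efficiency built the pseudo-data must come back out of the fit.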

6.4.1 Sanity tests

At first, a very simple sanity check is performed. The LLH functions are used directly to build the pseudo-data from simulation. Thus, the PDF method should extract ε_b^truth from the fit or, correspondingly, the SF = ε_b^measured / ε_b^truth should be unity. This test should find its own input, and thus a closure of the distributions before and after the fit is expected.

As shown in Figure 6.9, this is exactly the case.

After this, a second sanity test is employed. In this check, the pseudo-data is fluctuated according to a Poisson distribution before the PDF method is applied. One would expect the SFs from the fit to still be compatible with unity within the statistical uncertainty of the pseudo-data. And indeed, this is observed in Figure 6.10. Moreover, this serves as a validation of the method with which the statistical uncertainty in data is estimated, described in Section 6.5.1.
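As a rough illustration of this second check, one can repeat many Poisson-fluctuated toy experiments and verify that the extracted SFs scatter around unity with a width matching the expected Poisson uncertainty. This is a simplified counting model, not the actual LLH fit, and all numbers are invented:

```python
import numpy as np

rng = np.random.default_rng(7)
eps_truth, n_jets, n_toys = 0.77, 50_000, 2000

# Poisson-fluctuate the expected tagged yield and re-extract the
# efficiency and scale factor for each toy experiment.
n_tagged = rng.poisson(eps_truth * n_jets, size=n_toys)
sfs = (n_tagged / n_jets) / eps_truth

# Expected relative statistical (Poisson) uncertainty of the tagged yield.
rel_stat = np.sqrt(eps_truth * n_jets) / (eps_truth * n_jets)

mean_sf = sfs.mean()  # should be compatible with unity
spread = sfs.std()    # should match rel_stat
```

If the spread of the fitted SFs reproduces the assumed Poisson uncertainty, this also validates the way the statistical uncertainty in data is estimated.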

6.4.2 Closure tests

In Section 6.2, the LLH functions were derived under the assumption that the b-tagging weight of a b-jet depends only on the pT bin it falls into. However, this is only an


Figure 6.9: The b-tagging efficiency SFs (left) and the bb yield correction factor (right) for the MV2c10 algorithm at the 77% WP. This test is performed using pseudo-data directly created from the LLH functions in Equations 6.7 and 6.8. Thus, a perfect closure is expected and observed.


Figure 6.10: The b-tagging efficiency SFs (left) and the bb yield correction factor (right) for the MV2c10 algorithm at the 77% WP. This test is performed using pseudo-data directly created from the LLH functions in Equations 6.7 and 6.8, but the pseudo-data is fluctuated using a Poisson distribution before the fit. Thus, a perfect closure within the statistical uncertainty of the pseudo-data is expected and observed.

approximation for actual collision data, since the MV2c10 algorithm is trained on numerous variables that may be correlated to, but do not depend entirely on, the pT of the jet, such as the pseudo-rapidity η. For example, both the pT and η distributions of the leading and subleading jets may be different in any given pT bin of the calibration [76].

Furthermore, there may be additional or hidden variables that have different distributions for the leading and subleading jets and thus affect the b-tagging algorithm, but are not considered at all [76].

To assess the impact of this assumption, a closure test is performed as follows: the nominal simulated samples of the considered physics processes are used both as the pseudo-data and as the MC input to the LLH function. In this test, however, the LLH function uses a significantly restricted fraction of information compared to the total MC. This test is depicted in Figure 6.11, which indeed shows a small non-closure effect.

Its highest value is 0.3% in the lowest pT bin, which is expected, because the b-tagging efficiency varies most strongly between the lower and upper thresholds of this bin, i.e. from 20 GeV to 30 GeV, compared to the other bins. Despite this small observed non-closure effect, it is always significantly smaller than the Poisson uncertainty that results from the limited pseudo-data statistics. Therefore, this effect is considered to be negligible.


Figure 6.11: The b-tagging efficiency SFs (left) and the bb yield correction factor (right) for the MV2c10 algorithm at the 77% WP for a closure test. In this test, the same MC-generated samples are used to create the pseudo-data and MC inputs to the LLH functions in the fit. Here, the LLH functions only include a significantly smaller fraction of information compared to the total MC.

A small non-closure effect is expected because of the physical differences between leading and subleading jets within any given pT bin, and this is indeed what is observed.

6.4.3 Stress tests of the simulation

This final test measures the impact that the MC generator choice has on the PDF fit method. The idea is to use an alternative t¯t sample combined with the nominal non-t¯t samples as pseudo-data. The ε_b value extracted from the fit is then compared to ε_b^truth to assess the dependence of the method on the MC-generated input samples. The alternative t¯t sample chosen for this stress test was generated using Sherpa v2.2.1, for three reasons:

1. Compared to the previous test, the alternative t¯t sample leads to a controlled difference between pseudo-data and simulated inputs for the LLH function.

2. The alternative t¯t sample generated with Sherpa is not used for the estimation of the modelling uncertainties in Section 6.5.3.

3. The MC generator setup of Sherpa is able to simulate the t¯t process beyond NLO precision in the ME. Because of this, the modelling of additional jets, especially light-jets, is improved and thus the setup is believed to predict data more precisely.

This procedure is performed twice, namely for the old PDF approach and the new method. The mis-identification efficiency of light-jets as b-jets, called the mis-tag rate, is corrected in all MC samples by applying appropriate SFs derived from a dedicated calibration in ATLAS, detailed in Ref. [94], as well as MC-to-MC SFs that bridge the difference between Powheg+Pythia8 and Sherpa v2.2.1. Therefore, the mis-tag rate for both generator setups should correspond to the one found in data. The comparison of the two PDF fit methods is shown in Figure 6.12.
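Schematically, the per-jet mis-tag correction combines the data-driven SF with an MC-to-MC SF. A minimal sketch, with purely illustrative efficiency values (none of these numbers come from the actual calibration):

```python
def mistag_weight(eps_data_sf, eps_powheg, eps_sherpa):
    """Per-jet mis-tag weight applied to the Sherpa sample (toy sketch).

    eps_data_sf : data/Powheg+Pythia8 SF from the dedicated light-jet calibration
    eps_powheg  : light-jet mis-tag efficiency in Powheg+Pythia8
    eps_sherpa  : light-jet mis-tag efficiency in Sherpa v2.2.1

    The MC-to-MC SF (eps_powheg / eps_sherpa) bridges the two generator
    setups, so that after weighting both correspond to the data mis-tag rate.
    """
    mc_to_mc_sf = eps_powheg / eps_sherpa
    return eps_data_sf * mc_to_mc_sf

# If Sherpa mis-tags 20% more often than Powheg+Pythia8, the per-jet
# weight is reduced accordingly.
w = mistag_weight(eps_data_sf=1.05, eps_powheg=0.010, eps_sherpa=0.012)
```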


Figure 6.12: The PDF method is performed twice using an alternative t¯t sample generated with Sherpa v2.2.1 combined with the remaining nominal non-t¯t samples as pseudo-data. The extracted b-tagging efficiency is compared to the true one from the MC input. The left plot shows the result with the old LLH functions, while the right plot shows the results with the new LLH functions. The green band represents the statistical uncertainty of the pseudo-data, taken from the Minuit package [95] while assuming Poissonian errors. The statistical uncertainty of the t¯t MC input is represented by an open rectangle and estimated using the bootstrap resampling technique [96] described in Section 6.5.2, which means performing the fit for each bootstrap replica and feeding it into the LLH function while leaving the pseudo-data unchanged. The vertical error bars are given by the sum of this squared statistical MC uncertainty and the squared t¯t modelling uncertainty that is taken from the actual measurement based on real data. This procedure is done for the nominal t¯t sample generated with Powheg interfaced to Pythia8 and repeated for an alternative setup, namely Powheg+Herwig 7.

Here, the statistical uncertainty of the pseudo-data is estimated in the same way as for the actual data in the calibration, that is by using Minuit [95] and assuming Poissonian errors, and is given by the green band. The statistical uncertainty of the t¯t MC sample is represented by the open rectangle and estimated using the same bootstrap resampling technique [96] described in Section 6.5.2. This means the measurement is repeated and fed into the LLH functions for each bootstrap replica while leaving the pseudo-data and non-t¯t samples untouched. The vertical error bars are the sum of the squared t¯t MC statistical uncertainty and the squared t¯t modelling uncertainty that is quoted in the actual calibration based on real data. Apart from these, no other uncertainties are taken into account, because only those related to the t¯t inputs are considered relevant for this comparison.
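The combination of the bootstrap-based MC statistical uncertainty with the modelling uncertainty can be sketched as follows. The Gaussian replica spread here is a stand-in for actually re-fitting each bootstrap replica, and all numbers are invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in: fitted SF per bootstrap replica of the ttbar MC input
# (in the real procedure each replica re-weights the MC events and the
# full LLH fit is repeated with the pseudo-data left unchanged).
n_replicas = 500
sf_replicas = 1.0 + rng.normal(0.0, 0.01, size=n_replicas)

mc_stat_unc = sf_replicas.std(ddof=1)  # spread of replica fits = MC stat. uncertainty
modelling_unc = 0.02                   # ttbar modelling uncertainty (illustrative)

# Vertical error bar: quadrature sum of the two components.
total_unc = np.hypot(mc_stat_unc, modelling_unc)
```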

The comparison of the two fit methods exhibits the following features:

• The statistical uncertainty of the pseudo-data is slightly higher with the new fit method. This is expected, because more information is extracted from data via the fit, namely the two jet flavour correction factors.

• The statistical uncertainty of the MC input is higher as well. This is also expected, since more information is extracted from the MC samples in the new approach. In particular, the two-jet flavour fraction templates, i.e. bb, bl, lb, and ll, are also needed for the LLH of the CRs to constrain them further in the likelihood fit, as opposed to the previous method, which only used the SR to do so.

• The new method highlights one of its key motivations, namely that the dependence on the modelling of the t¯t inputs is significantly smaller compared to the old fit method. The old configuration shows a negative slope as a function of the jet pT, while the new one does not. The reason for this is the different jet flavour compositions predicted by Sherpa compared to the prediction by Powheg interfaced to Pythia8. This could hint at a problem, namely that the central values of the b-tagging efficiencies in the calibration are biased if the jet flavour composition is different between actual data and simulation. With the new approach, however, the CRs are used to additionally constrain the jet flavour compositions in the SR and thus the new extraction method produces less biased b-tagging efficiency SFs.

• The new fit shows a peculiar behaviour in the last pT bin, but this has been found to originate from the statistical uncertainty of the t¯t sample simulated with Sherpa.

The problem is that its statistical uncertainty from the simulation is larger than the Poissonian uncertainty that is assumed in the determination of the pseudo-data statistical uncertainty. In other words: √N < √(Σ_i w_i²), where w_i refers to the MC event weights of the Sherpa t¯t sample.
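This comparison amounts to checking the effective statistics of the weighted sample. A minimal sketch with invented event weights:

```python
import numpy as np

# MC event weights in one pT bin (illustrative values; Sherpa weights
# can vary strongly event-by-event).
w = np.array([0.2, 3.5, 0.1, 2.8, 0.4, 1.9, 0.3, 2.6])

yield_mc = w.sum()                   # weighted yield N
poisson_unc = np.sqrt(yield_mc)      # sqrt(N), assumed for the pseudo-data
mc_stat_unc = np.sqrt((w**2).sum())  # sqrt(sum w_i^2), true weighted-MC uncertainty

# Equivalently, the effective number of events n_eff = N^2 / sum(w_i^2)
# falls below the weighted yield when the weights fluctuate strongly.
n_eff = yield_mc**2 / (w**2).sum()
```

Whenever the weights are non-uniform, mc_stat_unc exceeds poisson_unc, which is exactly the situation described for the last pT bin.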

Finally, the significantly decreased dependence on the modelling of the t¯t inputs outweighs the slightly increased statistical uncertainties of the data and MC inputs to the fit. Consequently, the total combined uncertainty is reduced with the new fit approach.

To conclude, the new PDF method behaves as expected. Both sanity and closure tests are passed without any significant issues, and thus no additional uncertainty is added to the measurement [76].