Likelihood discriminator - Identification of isolated photons at high energies at H1

Figures 7.10 and 7.11 show the output of the Likelihood method with different binnings, obtained by running the TMVApplication tool. Unfortunately, real data were available only for wheels 1 to 3.
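For orientation, the quantity plotted in these figures can be illustrated with a projective likelihood ratio, y = L_S / (L_S + L_B), where each likelihood is the product of one-dimensional probability densities of the input variables. The following is only a minimal sketch under stated assumptions: the binned PDFs, the variable ranges and the function names are hypothetical placeholders and are not taken from this analysis or from the TMVA code.

```python
def bin_index(x, lo, hi, nbins):
    """Index of the histogram bin containing x, clamped to the valid range."""
    i = int((x - lo) / (hi - lo) * nbins)
    return min(max(i, 0), nbins - 1)


def likelihood_ratio(event, sig_pdfs, bkg_pdfs, ranges):
    """Return y = L_S / (L_S + L_B) for one event.

    L_S and L_B are products of per-variable binned PDF values, i.e. the
    input variables are treated as uncorrelated (an assumption).
    """
    l_sig = 1.0
    l_bkg = 1.0
    for x, p_s, p_b, (lo, hi) in zip(event, sig_pdfs, bkg_pdfs, ranges):
        i = bin_index(x, lo, hi, len(p_s))
        l_sig *= p_s[i]
        l_bkg *= p_b[i]
    total = l_sig + l_bkg
    # If both likelihoods vanish, the event carries no information
    return 0.5 if total == 0.0 else l_sig / total


# Hypothetical PDFs for two shower shape variables, 4 bins each on [0, 1]
sig_pdfs = [[0.1, 0.2, 0.3, 0.4], [0.4, 0.3, 0.2, 0.1]]
bkg_pdfs = [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]]
ranges = [(0.0, 1.0), (0.0, 1.0)]

y_signal_like = likelihood_ratio((0.9, 0.1), sig_pdfs, bkg_pdfs, ranges)
y_background_like = likelihood_ratio((0.1, 0.9), sig_pdfs, bkg_pdfs, ranges)
print(f"signal-like event:     y = {y_signal_like:.3f}")
print(f"background-like event: y = {y_background_like:.3f}")
```

Values of y close to 1 are signal-like and values close to 0 are background-like, which is why signal accumulates in the bins on the right of the histograms.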

The distributions show that no clean separation between signal and background is possible, mostly because pions decay into two photons with very asymmetric energies¹, so that these photon pairs cannot be distinguished from single photons. Nevertheless, the discriminator distributions of background and single photons have different shapes. This makes it possible to fit the fractions of photon and background events.
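The relation in footnote 1 can be made concrete with a short calculation. The sketch below assumes a neutral pion decaying into two photons and uses the standard two-body kinematics E = (E_pion / 2)(1 ± β cos θ_CM); the pion energy of 10 GeV and the function names are illustrative choices, not values from this analysis.

```python
import math

M_PI0 = 0.13498  # GeV, neutral pion mass (rounded)


def photon_energies(e_pion, cos_theta_cm):
    """Lab energies (GeV) of the two photons from pi0 -> gamma gamma."""
    beta = math.sqrt(1.0 - (M_PI0 / e_pion) ** 2)
    e1 = 0.5 * e_pion * (1.0 + beta * cos_theta_cm)
    e2 = 0.5 * e_pion * (1.0 - beta * cos_theta_cm)
    return e1, e2


def opening_angle(e1, e2):
    """Lab opening angle (rad), from m^2 = 2 E1 E2 (1 - cos(angle))."""
    cos_open = 1.0 - M_PI0 ** 2 / (2.0 * e1 * e2)
    # Clamp against rounding before taking the arccosine
    return math.acos(max(-1.0, min(1.0, cos_open)))


for cos_cm in (0.0, 0.5, 0.9):
    e1, e2 = photon_energies(10.0, cos_cm)
    angle_deg = math.degrees(opening_angle(e1, e2))
    print(f"cos(theta_CM) = {cos_cm:.1f}: E1 = {e1:.2f} GeV, "
          f"E2 = {e2:.2f} GeV, opening angle = {angle_deg:.2f} deg")
```

For large |cos θ_CM| one photon carries almost all of the pion energy, so the calorimeter cluster is dominated by a single photon.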

The Likelihood histograms show good separation power for all wheels and all energy intervals, except for wheels 1 and 2. In these two wheels the signal and background histograms still differ slightly in shape, so that a separation is possible, though with a lower probability. In the energy region of 9 to 11 GeV in particular, the signal separates reasonably well in the bins on the right, which gives a good probability of correctly identifying single photons. The poorer separation power of wheels 1 and 2 can be explained by their granularity. The best separation is found in the forward wheels 4, 5 and 6 for the energy interval 9 to 13 GeV. There, signal and background form clear peaks and the probability of identifying single photons correctly is very high.

To draw firmer conclusions, the methods used should be tuned more carefully. In particular, the number of nodes of the neural network could have a large impact on the results, but increasing it requires a longer simulation time.

Real data are shown in figures 7.10a to 7.10i for illustration only. To draw a conclusion about how well the real data agree with the MC data, the signal and background samples need to be weighted, summed and normalized.
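The weighting, summing and normalizing step can be sketched as follows. This is a minimal illustration, assuming that signal and background MC are available as histograms with the same binning as the data and that per-sample weights are known; the weights, the histogram contents and the function name are hypothetical placeholders.

```python
def weighted_mc_sum(signal, background, w_sig, w_bkg, data):
    """Weight both MC histograms, add them bin by bin and scale the sum
    to the total number of data events."""
    if not (len(signal) == len(background) == len(data)):
        raise ValueError("all histograms need the same binning")
    summed = [w_sig * s + w_bkg * b for s, b in zip(signal, background)]
    total = sum(summed)
    if total <= 0:
        raise ValueError("weighted MC sum is empty")
    scale = sum(data) / total
    return [scale * x for x in summed]


# Hypothetical 6-bin discriminator histograms (placeholder numbers)
signal = [2, 5, 10, 20, 40, 80]
background = [60, 40, 25, 15, 10, 5]
data = [30, 22, 18, 17, 25, 43]

mc = weighted_mc_sum(signal, background, w_sig=1.0, w_bkg=1.0, data=data)
for i, (d, m) in enumerate(zip(data, mc)):
    print(f"bin {i}: data = {d:6.1f}   weighted MC sum = {m:6.1f}")
```

After this step the shapes of data and MC can be compared bin by bin, since both histograms contain the same total number of events.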

To give an idea of how well real data are described using the selected variables, figure 7.12 shows a separation obtained with a good data set and well-tuned methods that use input variables very similar to those in this study (figure 7.9 shows schematically the structure of that analysis and the variables used). The plots show the likelihood distribution for five wheels and six bins in transverse energy (5-15 GeV). In all bins the data are described well by the sum of the background and photon (signal) contributions. The scaling factors for the background and the signal are determined by the fit. The results show that the method also works at high transverse energies, where the separation power is poorer. These figures were kindly provided by Krzysztof Nowak [7].
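A fit that determines the scaling factors of signal and background can be illustrated with a two-template fit. The sketch below is an assumed, simplified stand-in (a chi-square fit solved analytically, with bin variances approximated by the data content); it is not necessarily the procedure used to produce figure 7.12, and all histogram contents and names are hypothetical.

```python
def fit_two_templates(data, sig_shape, bkg_shape):
    """Chi-square fit of data = n_sig * sig + n_bkg * bkg.

    Both shapes are normalized to unit area first, so the fitted scale
    factors are event yields. Returns (n_sig, err_sig, n_bkg, err_bkg, chi2).
    """
    if not (len(data) == len(sig_shape) == len(bkg_shape)):
        raise ValueError("all histograms need the same binning")
    s_norm, b_norm = sum(sig_shape), sum(bkg_shape)
    s = [x / s_norm for x in sig_shape]
    b = [x / b_norm for x in bkg_shape]
    w = [1.0 / max(d, 1.0) for d in data]  # 1 / variance per bin (assumption)

    # Normal equations of the chi-square: a 2x2 linear system
    a_ss = sum(wi * si * si for wi, si in zip(w, s))
    a_sb = sum(wi * si * bi for wi, si, bi in zip(w, s, b))
    a_bb = sum(wi * bi * bi for wi, bi in zip(w, b))
    r_s = sum(wi * si * di for wi, si, di in zip(w, s, data))
    r_b = sum(wi * bi * di for wi, bi, di in zip(w, b, data))
    det = a_ss * a_bb - a_sb * a_sb
    if det <= 1e-12 * a_ss * a_bb:
        raise ValueError("templates are too similar to be separated")

    n_sig = (r_s * a_bb - r_b * a_sb) / det
    n_bkg = (r_b * a_ss - r_s * a_sb) / det
    # Uncertainties from the inverse of the normal matrix
    err_sig = (a_bb / det) ** 0.5
    err_bkg = (a_ss / det) ** 0.5
    chi2 = sum(wi * (di - n_sig * si - n_bkg * bi) ** 2
               for wi, di, si, bi in zip(w, data, s, b))
    return n_sig, err_sig, n_bkg, err_bkg, chi2


# Hypothetical 6-bin discriminator histograms (placeholder numbers)
sig_shape = [2, 5, 10, 20, 40, 80]
bkg_shape = [60, 40, 25, 15, 10, 5]
data = [78, 55, 41, 38, 52, 96]

n_sig, err_sig, n_bkg, err_bkg, chi2 = fit_two_templates(data, sig_shape, bkg_shape)
ndf = len(data) - 2  # number of bins minus two fitted parameters
print(f"NSig = {n_sig:.0f} +- {err_sig:.0f}  NBkg = {n_bkg:.0f} +- {err_bkg:.0f}  "
      f"Chi2/ndf = {chi2:.1f} / {ndf}")
```

The more similar the two template shapes are, the smaller the determinant becomes and the larger the fitted uncertainties, which mirrors the poorer separation power at high transverse energies.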

[Figure 7.9 content: slide "Signal extraction" from the HaQ meeting, 08/02/05, Krzysztof Nowak. About 50% of the sample consists of background. A multivariate analysis (MVA) based on cluster shapes is used to fit the signal and background distributions. Cluster shapes: Radius, FirstLayerFr, HotCoreFr, HottestCellFr, Kurtosis, Symmetry. MVA methods: Likelihood, Neural Net, Range Search, ... Output: discriminator distribution (from 0 to 1) for signal and background.]

Figure 7.9: Schematic structure of the ongoing analysis of K. Nowak. Source [7].

¹ $E_{\gamma 1} \propto 1 + \beta\cos(\theta_{CM})$, $E_{\gamma 2} \propto 1 - \beta\cos(\theta_{CM})$


7 Results

Figure 7.10: Likelihood discriminator (6 bins) of signal (full, blue), background (dashed, red) and


Figure 7.11: Likelihood discriminator (100 bins) of signal (full, blue) and background (dashed, red).


[Figure 7.12 content: slide "Signal extraction" from the HaQ meeting, 08/02/05, Krzysztof Nowak. Grid of discriminator distributions (x axis from 0 to 1) showing data, signal (Sig) and background (Bkg). Rows are the E_T bins 5.0 < E_T < 6.0, 6.0 < E_T < 6.7, 6.7 < E_T < 7.5, 7.5 < E_T < 8.5, 8.5 < E_T < 10.0 and 10 < E_T < 15; columns are the η bins -1.0 < η < -0.6, -0.6 < η < 0.2, 0.2 < η < 1.0, 1.0 < η < 1.8 and 1.8 < η < 2.4. Each panel is annotated with the fitted NSig, NBkg and Chi2/ndf (for example NSig = 1094 +- 51, NBkg = 710 +- 49, Chi2/ndf = 6 / 8).]

Figure 7.12: Simulated versus real data. Columns=Wheel, rows=E_T. Source [7].

8 Conclusions

The results show that good separation power between single and double photon events can be reached at high transverse energies for wheels in the forward region of the detector. The selected shower shape variables R_T, R_L, E_HottestFrac and E_HottestL1Frac appear to be well chosen. To better understand the influence of each variable, further study of the topic is indispensable.

Fine-tuning the methods by grouping different variables and applying them to the analysis tools, in combination with larger data sets for signal, background and real data, should improve the separation power. Furthermore, other analysis methods (especially the MLP method) should be considered for a more appropriate or more complete analysis at higher energies.

A further step would be the calculation of a cross section, which would allow an even more precise statement about the separation power of different combinations of shower shape variables and analysis methods.

The results indicate that future analyses should definitely make use of the energy range above 10 GeV and of the longitudinal dimension of the shower.


9 Acknowledgments

Many thanks go to Katharina Müller for her great supervision, for help in almost any situation, for correcting this paper, for a glimpse into a physicist's life and for a very good time. Thanks also go to Carsten Schmitz and Krzysztof Nowak from DESY for their help and support with ROOT and H1 and for their valuable input.

A Appendix

A.1 Description of TMVA methods
