Fakultät für Physik

Institut für experimentelle Kernphysik

Prof. Dr. G. Quast, Prof. Dr. M. Feindt, Dr. A. Zupanc

Tutorial groups: G. Sieber, B. Kronenbitter, A. Heller
Issued: 10.05.2012, to be completed by 17.05.2012

Computer exercise for Moderne Methoden der Datenanalyse (Modern Methods of Data Analysis)

Exercise 3: Fitting

Fitting parametrised functions to measured data is important for testing models and determining their parameters. However, there are some pitfalls that can lead to wrong conclusions.

• Exercise 3.1: voluntary

First, we need some data to apply a fit to. Generate random numbers according to an exponential distribution exp(−x) for x > 0. Take uniformly distributed random numbers (gRandom->Rndm()) and apply the transformation method. Write 100 000 exponentially distributed random numbers to an ntuple (ROOT class TNtuple).
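The transformation method mentioned above can be sketched in plain C++ as follows. This is a minimal illustration, not the intended ROOT solution: std::mt19937 stands in for gRandom->Rndm(), a std::vector stands in for the TNtuple, and the function names expFromUniform and generateExponential are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Map a uniform number u in [0, 1) to an exponentially distributed x >= 0
// by inverting the cumulative distribution F(x) = 1 - exp(-x).
double expFromUniform(double u) {
    assert(u >= 0.0 && u < 1.0);
    return -std::log(1.0 - u);
}

// Generate n exponentially distributed numbers. std::mt19937 is a stand-in
// (assumption) for gRandom->Rndm(); in the exercise the values would be
// filled into a TNtuple instead of a std::vector.
std::vector<double> generateExponential(std::size_t n, unsigned seed) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    std::vector<double> values;
    values.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(expFromUniform(uniform(gen)));
    }
    return values;
}
```

A call such as generateExponential(100000, 12345u) returns the sample. Its mean should be close to 1, the expectation value of exp(−x).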

Plot the random numbers stored in the ntuple you created above. Use a histogram with 100 bins from 0 to 10. Have a look at the online documentation of the TTree::Draw method for this task. (Or take the macro getHisto.C provided together with this exercise.)

The plotted numbers can be interpreted as the measurements of decay times of a radioactive material. Fit the following function to the distribution:

f(t) = N · exp(−(t − t0)/τ)

The fit parameters are the normalisation N, a possible offset t0 and the lifetime τ. Use an object of type TF1 to define the fit function. Fit the histogram using its Fit method with the default options (χ² fit). You can find some information about fitting user-defined functions in the ROOT how-tos. Try out different start values for the three parameters. How does the result of the fit depend on the start values?

Print the correlations between the fitted parameters. See exercise 2 for help on how to obtain them, or consult the online documentation of TH1::Fit. What can be learned from the correlations in this case?

Improve the parametrisation of the fit function. What must it look like so that the number of measurements is one of the parameters? Repeat the fit with the improved parametrisation.

Reduced number of bins:

Make a histogram of the numbers from the ntuple with only 10 bins from 0 to 10. Fit the improved fit function to the histogram, once with the default options and once with the option "I". Compare the fitted parameters to the expected ones for both cases and explain the difference.
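As a starting point for the comparison, the sketch below computes the expected content of one bin in two ways: from the density at the bin centre times the bin width (what the default fit compares to the data) and from the integral of the density over the bin (what option "I" uses). It assumes a normalised exponential density with lifetime τ and no offset; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Density of the exponential distribution with lifetime tau.
double expDensity(double t, double tau) {
    assert(tau > 0.0);
    return std::exp(-t / tau) / tau;
}

// Expected fraction of entries in [lo, hi], approximated by the density at
// the bin centre times the bin width.
double binContentFromCentre(double lo, double hi, double tau) {
    return expDensity(0.5 * (lo + hi), tau) * (hi - lo);
}

// Expected fraction of entries in [lo, hi] from the exact integral of the
// density over the bin.
double binContentFromIntegral(double lo, double hi, double tau) {
    assert(tau > 0.0);
    return std::exp(-lo / tau) - std::exp(-hi / tau);
}
```

Evaluating both functions for the first bin with τ = 1, once with bin width 1 (10 bins) and once with bin width 0.1 (100 bins), shows how the two numbers differ as the bins become wider.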

Different methods:

Make three different histograms with 10, 1000 and 100000 entries from the ntuple, respectively. Use 1000 bins from 0 to 10. Fit the function to each histogram using the χ² method and the binned likelihood method. Compare the fitted parameters and the χ² value of both methods and try to explain the results.

Your own likelihood function in MINUIT:

Write an unbinned log likelihood fit for 10, 1000 and 100000 entries from the ntuple. For this, create an object of type TMinuit with the number of fit parameters as argument. Define a function that calculates the log likelihood, and set it in the TMinuit object with SetFCN.

As a starting point, consult the script example_minuit.cxx in Section 6.3 of the tutorial “Diving into ROOT”.

The function FCN has to have the following signature:

void anyName(Int_t& npar, Double_t* gin, Double_t& logLikelihood, Double_t* param, Int_t flag)

The first parameter of this function has to be the number of parameters, the third one the calculated log likelihood and the fourth one the array of fit parameters. The second and the last parameter can be ignored. Because MINUIT minimises the function, the value returned in the third parameter has to be the negative log likelihood. In the function you have to loop over the ntuple entries and sum up the log likelihood values. The i-th entry in the ntuple can be accessed by first calling the method GetEntry(i) and then GetArgs()[0] of the ntuple. The pointer to the ntuple and the number of entries have to be defined outside the function in global variables.
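The structure of such a function can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: a single fit parameter τ with the density exp(−t/τ)/τ and no offset, a global std::vector (the hypothetical gDecayTimes) in place of the global ntuple pointer, and plain int and double in place of ROOT's Int_t and Double_t. Since MINUIT minimises the function, the sketch returns the negative log likelihood.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Global data, standing in (assumption) for the global ntuple pointer and
// entry count described above.
std::vector<double> gDecayTimes;

// Negative log likelihood for the density p(t; tau) = exp(-t / tau) / tau.
// The argument list mirrors the FCN signature quoted above, written with
// plain C++ types instead of ROOT's Int_t and Double_t.
void negLogLikelihood(int& npar, double* gin, double& result,
                      double* param, int flag) {
    (void)npar; (void)gin; (void)flag;  // not needed here
    const double tau = param[0];
    assert(tau > 0.0);
    double sum = 0.0;
    for (double t : gDecayTimes) {
        sum += std::log(tau) + t / tau;  // -ln p(t; tau)
    }
    result = sum;
}
```

In the actual exercise, the loop would run over the ntuple entries using GetEntry(i) and GetArgs()[0], and the function would be registered with SetFCN.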

Set an initial parameter value and error with TMinuit::DefineParameter. The minimization is performed by TMinuit::Migrad.

Plot a histogram from 0 to 10 with the entries used in the unbinned log likelihood fit together with the fitted function normalized to the number of entries. Display the log likelihood value as a function of the fit parameter τ from 0.5 to 5. What can be learned from this plot about the errors of the fitted parameter?

• Exercise 3.2: Combination of correlated measurements (obligatory)

A common problem in science is the combination of several measurements. To calculate the average value, not only the errors of the individual measurements have to be taken into account but also the correlations between them. A wrong treatment of correlations or common systematic effects can lead to biased results. Here is an example:

At the LEP accelerator at CERN the mass of the W boson was measured in two different channels:

e⁺e⁻ → W⁺W⁻ → q1q2q3q4

e⁺e⁻ → W⁺W⁻ → lνq1q2

The experimental signature of the first channel, with four quarks, is four reconstructed jets in the detector. The second kind of reaction is identified by a lepton (electron or muon) and two jets. The neutrino is not detected. The measured W masses are:

4 jets channel:          mW = (80457 ± 30 ± 11 ± 47 ± 17 ± 17) MeV
lepton + 2 jets channel: mW = (80448 ± 33 ± 12 ±  0 ± 19 ± 17) MeV

The first two errors are the statistical and systematic experimental uncertainties. They are uncorrelated. The third error is an uncertainty from theory only present in the four jets channel. The fourth error is 100 % correlated because it comes from a common theoretical model. Also the last error which originates from the LEP accelerator is 100 % correlated between both measurements.

Construct a covariance matrix of the two W mass measurements taking into account all uncertainties and their correlations. Use this covariance matrix to define a χ² expression containing the average W mass m̄W as a free parameter. Determine m̄W and its error by minimizing the χ² expression with the TMinuit class.

For this exercise, you have to write your own χ² function to be minimised. See the hints in the last part of the previous exercise on how this can be done.
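The general shape of such a χ² function for two correlated measurements, together with its analytic minimum, can be sketched in plain C++. This is an illustration of the technique with hypothetical names (Combination, chi2Two, combineTwo), not the TMinuit solution asked for; the covariance matrix elements have to be constructed separately from the uncertainties.

```cpp
#include <cassert>
#include <cmath>

struct Combination {
    double mean;      // value of the average at the chi^2 minimum
    double variance;  // variance of the average
};

// chi^2(m) = (y - m)^T V^{-1} (y - m) for two measurements y1, y2 with the
// symmetric covariance matrix V = [[v11, v12], [v12, v22]].
double chi2Two(double m, double y1, double y2,
               double v11, double v12, double v22) {
    const double det = v11 * v22 - v12 * v12;
    assert(det > 0.0);
    const double d1 = y1 - m;
    const double d2 = y2 - m;
    return (v22 * d1 * d1 - 2.0 * v12 * d1 * d2 + v11 * d2 * d2) / det;
}

// Analytic minimum of chi2Two with respect to m.
Combination combineTwo(double y1, double y2,
                       double v11, double v12, double v22) {
    const double det = v11 * v22 - v12 * v12;
    assert(det > 0.0);
    // Elements of the inverse covariance matrix.
    const double i11 = v22 / det;
    const double i12 = -v12 / det;
    const double i22 = v11 / det;
    // d(chi^2)/dm = 0 gives a weighted mean; the weights are the row sums
    // of the inverse covariance matrix.
    const double w1 = i11 + i12;
    const double w2 = i12 + i22;
    const double wSum = w1 + w2;
    assert(wSum > 0.0);
    return {(w1 * y1 + w2 * y2) / wSum, 1.0 / wSum};
}
```

With made-up inputs y1 = 10 and y2 = 12, uncorrelated errors of 1 and 2, and a common error of 1, the matrix is V = [[2, 1], [1, 5]], and combineTwo(10.0, 12.0, 2.0, 1.0, 5.0) gives a mean of 10.4 with variance 1.8. Moving m by one standard deviation away from the minimum raises chi2Two by 1, which is a useful cross-check for the error reported by a numerical minimiser.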

• Exercise 3.3: voluntary

Two measurements y1 = 8.0 and y2 = 8.5 of the same physical quantity with an uncorrelated relative statistical error of 2 % and a common normalisation error of 10 % should be combined. Construct a covariance matrix and a χ² expression and determine its minimum with TMinuit or analytically.

Is the result you obtained reasonable? What could be the cause for the unexpected value?

Hint: Make a plot of the covariance ellipse in the (y1′, y2′) plane defined by

∆y^T V⁻¹ ∆y = c²,  ∆y = (y1 − y1′, y2 − y2′)^T

for c = 1 and c = 2, together with the line y1′ = y2′. V is the covariance matrix. To draw the ellipse, a TGraph object can be used. The points on the ellipse can be calculated as a function of the angle φ if ∆y is expressed in terms of φ and the radius r. (If you don’t want to implement this yourself, you can use the macro drawCovEllipse.C, which comes with this exercise.)
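One possible way to compute the points on the ellipse is sketched below in plain C++. It assumes the polar form ∆y = r (cos φ, sin φ), which gives r = c / sqrt(u^T V⁻¹ u) with u = (cos φ, sin φ). The function names quadForm and ellipsePoints are hypothetical, and this is not the content of drawCovEllipse.C.

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Quadratic form dy^T V^{-1} dy for a symmetric 2x2 covariance matrix
// V = [[v11, v12], [v12, v22]].
double quadForm(double d1, double d2, double v11, double v12, double v22) {
    const double det = v11 * v22 - v12 * v12;
    assert(det > 0.0);
    return (v22 * d1 * d1 - 2.0 * v12 * d1 * d2 + v11 * d2 * d2) / det;
}

// Points (y1', y2') on the ellipse dy^T V^{-1} dy = c^2 around (y1, y2).
// With dy = r (cos phi, sin phi) the radius is r = c / sqrt(u^T V^{-1} u),
// where u = (cos phi, sin phi).
std::vector<std::pair<double, double>> ellipsePoints(
    double y1, double y2, double v11, double v12, double v22,
    double c, int nPoints) {
    assert(nPoints > 0 && c > 0.0);
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<std::pair<double, double>> points;
    points.reserve(nPoints + 1);
    for (int i = 0; i <= nPoints; ++i) {  // i == nPoints closes the curve
        const double phi = twoPi * i / nPoints;
        const double u1 = std::cos(phi);
        const double u2 = std::sin(phi);
        const double r = c / std::sqrt(quadForm(u1, u2, v11, v12, v22));
        points.emplace_back(y1 - r * u1, y2 - r * u2);  // y' = y - dy
    }
    return points;
}
```

The first and second coordinates of the returned points can be copied into two arrays and passed to a TGraph for drawing.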