

4.1.1. Selection Criteria

Since most events have many combinatorial possibilities for reconstructing each intermediate particle, a set of selection constraints has to be applied. This limits the amount of computational work and saves time. In this first selection step, the amount of combinatorial background is already drastically reduced.

PID A particle identification (PID) can be performed using the detector response. For each reconstructed particle, six different probabilities are calculated, each corresponding to one of the »stable« charged particles. These are electrons, muons, kaons, pions, deuterons and protons. Their lifetime, and therefore their mean free path, is long enough that they do not decay inside the detector, which makes them stable in this experiment.

To calculate a PID value, the interaction of each particle with different sub-detectors is taken into consideration individually. For each sub-detector, six likelihoods for the stable hypotheses are determined.

\Delta \ln(L_\alpha) = \ln(L_\mathrm{hyp}) - \ln(L_\alpha) \qquad (4.1)

With eq. 4.1, a logarithmic difference of the likelihoods, combined from all sub-detectors, is calculated for a specific particle hypothesis. Lα is the sum of the likelihoods of the different sub-detectors for particle hypothesis α, while Lhyp is the summed likelihood for the hypothesis of the particle itself, which is arbitrarily chosen. The PID value can be extracted by normalizing ∆ln(L) to a scale from zero to one. This results in a powerful discriminator between opposing particle hypotheses [11].
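To illustrate how such a normalized PID value can be formed, the sketch below combines hypothetical per-sub-detector log-likelihoods into a single probability via a likelihood ratio; the numbers, the hypothesis keys and the helper pid_probability are invented for illustration and are not part of any specific framework.

```python
import numpy as np

# Hypothetical log-likelihoods ln(L) per sub-detector for one track, listed
# for three of the six stable-particle hypotheses (all values made up).
log_likelihoods = {
    "mu": [-12.1, -8.4, -9.0],
    "pi": [-13.5, -9.9, -10.2],
    "K":  [-15.0, -11.3, -12.8],
}

def pid_probability(hyp, log_likelihoods):
    """Combine the sub-detector likelihoods per hypothesis and normalize
    them to a value between zero and one (a simple likelihood ratio)."""
    combined = {h: np.exp(np.sum(ll)) for h, ll in log_likelihoods.items()}
    return combined[hyp] / sum(combined.values())

print(pid_probability("mu", log_likelihoods))  # close to 1 for a clear muon
```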

To find the best PID constraint that still preserves enough efficiency, the PID information for every reconstructed muon, kaon and pion in a sample of 100,000 signal MC events is analyzed. Only the probability of a muon candidate being a muon, and similarly for kaons and pions, is considered. Other variants, such as the probability of e.g. a muon candidate being a kaon, are not taken into account.

Since the data is generated, every particle along the decay chain is known. Only those particles that satisfy the following conditions are considered true candidates:

• For K and π:

– The particle candidate is a generated K/π
– The particle candidate's mother particle is a K*(892)+
– The particle candidate's grandmother particle is a B+

• For µ:

– The particle candidate is a generated µ
– The particle candidate's mother particle is a B+

All other particles are neglected because they do not matter for the main decay and are considered background.
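To make these truth-matching conditions concrete, the following sketch expresses them in Python; the candidate objects with pdg and mother attributes are assumptions for illustration and do not correspond to the interface of a particular analysis framework. The PDG codes themselves (13 for µ, 211 for π, 321 for K, 323 for K*(892)+, 521 for B+) are standard.

```python
# Assumed candidate structure: each candidate carries its generated PDG code
# and a reference to its generated mother particle (None if unavailable).
PDG_MU, PDG_PI, PDG_K = 13, 211, 321
PDG_K_STAR_PLUS, PDG_B_PLUS = 323, 521

def is_true_hadron(cand, pdg_self):
    """True K/pi candidate: generated K/pi from a K*(892)+ whose mother is a B+."""
    return (abs(cand.pdg) == pdg_self
            and cand.mother is not None
            and abs(cand.mother.pdg) == PDG_K_STAR_PLUS
            and cand.mother.mother is not None
            and abs(cand.mother.mother.pdg) == PDG_B_PLUS)

def is_true_muon(cand):
    """True muon candidate: generated muon whose mother is a B+."""
    return (abs(cand.pdg) == PDG_MU
            and cand.mother is not None
            and abs(cand.mother.pdg) == PDG_B_PLUS)
```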

The efficiency and purity for each constraint are calculated as

\mathrm{efficiency} = \frac{N(\mathrm{true}\,|\,\mathrm{selected})}{N(\mathrm{true})} \qquad (4.2)

\mathrm{purity} = \frac{N(\mathrm{true}\,|\,\mathrm{selected})}{N(\mathrm{true}\,|\,\mathrm{selected}) + N(\mathrm{false}\,|\,\mathrm{selected})} \qquad (4.3)

The index 'selected' indicates that only candidates fulfilling the PID constraint are counted, whereas the absence of this index means all candidates are included.
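The two figures of merit translate directly into code; the sketch below evaluates eqs. 4.2 and 4.3 for one hypothetical set of counts (the numbers are invented and only illustrate the calculation).

```python
def efficiency_and_purity(n_true, n_true_selected, n_false_selected):
    """Efficiency and purity as defined in eqs. (4.2) and (4.3)."""
    efficiency = n_true_selected / n_true
    purity = n_true_selected / (n_true_selected + n_false_selected)
    return efficiency, purity

# hypothetical counts for one PID cut value
eff, pur = efficiency_and_purity(n_true=100_000,
                                 n_true_selected=91_200,
                                 n_false_selected=4_300)
print(f"efficiency = {eff:.3f}, purity = {pur:.3f}")
```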


At each layer the data set is split into further subsets, and in the case of a good decision tree, one subset should contain either only true or only false candidates. Generally this is not the case and a single tree is not a very strong model.

To create a stronger model, multiple decision trees are combined in a process called boosting. To start with, a decision tree limited to a depth of a few layers is trained. Also referred to as a weak learner, this model will predict some of the data correctly and some of it incorrectly. This imperfect learner is then fed into the boosting algorithm, which tries to find another weak learner that compensates for the shortcomings of the first one. Both are combined to form another imperfect learner which is slightly better than its predecessor. This step can in principle be repeated indefinitely [18].
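As a minimal sketch of this boosting idea, the snippet below combines many shallow decision trees into one boosted classifier using scikit-learn; the choice of library, the tree depth and the number of estimators are assumptions for illustration, not necessarily what is used in this analysis.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier

# A weak learner: a decision tree limited to a depth of a few layers.
weak_learner = DecisionTreeClassifier(max_depth=3)

# Boosting repeatedly adds weak learners that correct the mistakes of the
# current ensemble; here that step is repeated 200 times.
bdt = AdaBoostClassifier(weak_learner, n_estimators=200)
```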

The combination of many weak learners, also referred to as estimators, results in a model that has a lot of separation power, referred to as a strong learner. After the training, the model can be applied to test data that is similar to the training data and has the same features. All data points in the test set are run through the tree and a probability for each one being a signal is calculated.

In the case of a good model, this divides the test set into two well-separated groups, with only few data points left at a probability around 50 %, which indicates that the model cannot classify these points well.

4.2.1. Multivariate Analysis Features

The data set that has to be classified is the generic background mixed with the signal MC, both of which are discussed in section 3.5.2. The set is split into two subsets of equal size, with the signal-MC candidates labeled as signal for the classifier. One subset is used to train the classifier, while the other one serves as a test set from which the results are later obtained. This split prevents the classifier from memorizing the training set and failing to generalize to similar events.
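A rough sketch of this splitting and training step is given below; the input file names, the 50/50 split via scikit-learn and the classifier settings are assumptions, since the actual analysis may use a different BDT implementation.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier

# Hypothetical inputs: one row of features per candidate, and a label that is
# 1 for signal-MC candidates and 0 for generic-background candidates.
X = np.load("features.npy")
y = np.load("is_signal.npy")

# split into two subsets of equal size: one for training, one for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)

bdt = GradientBoostingClassifier(max_depth=3, n_estimators=200)
bdt.fit(X_train, y_train)

# probability of each test candidate being signal
p_signal = bdt.predict_proba(X_test)[:, 1]
```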

To train the boosted decision tree, features from the data set have to be selected. There is no general way to always find the optimal features; they are selected through trial and error. Tab. 4.3 contains every feature used to train the boosted decision tree.

The histograms for each feature, comparing the signal shape to the different background shapes that form the generic background, are displayed in the appendix in fig. A.1.

Continuum Suppression To identify and suppress continuum events (e+e− → qq̄ with q = u, d, s, c), several sets of variables exist. The idea is to use the topological differences between continuum decays and real BB̄ events. If a ϒ(4S) is created, it has only a small momentum and is approximately at rest; it therefore decays almost isotropically into a BB̄ pair. If instead something other than a ϒ(4S) is created, the produced quark pair has enough momentum that the decay forms two back-to-back jets of daughter particles.

For particles in the event with momenta \vec{p}_i (i = 1, ..., N), the thrust axis \vec{T} is defined as the unit vector along which their total momentum projection is maximal. The magnitude of the thrust is

T_\mathrm{max} = \frac{\sum_{i=1}^{N} |\hat{T} \cdot \vec{p}_i|}{\sum_{i=1}^{N} |\vec{p}_i|} \qquad (4.9)
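A simple, brute-force way to evaluate eq. 4.9 is sketched below: trial axes are scanned over a grid of directions and the one with the largest summed projection defines the thrust. This is only an illustration of the definition, not the algorithm used by the analysis software.

```python
import numpy as np

def thrust(momenta, n_steps=180):
    """Approximate T_max = max over axes of sum|T_hat . p_i| / sum|p_i|
    by scanning trial axes on a coarse angular grid (illustration only)."""
    momenta = np.asarray(momenta, dtype=float)
    total = np.sum(np.linalg.norm(momenta, axis=1))
    best = 0.0
    for theta in np.linspace(0.0, np.pi, n_steps):
        for phi in np.linspace(0.0, np.pi, n_steps):  # half sphere suffices: |T_hat . p| is sign-invariant
            axis = np.array([np.sin(theta) * np.cos(phi),
                             np.sin(theta) * np.sin(phi),
                             np.cos(theta)])
            best = max(best, np.sum(np.abs(momenta @ axis)) / total)
    return best

# two back-to-back, jet-like particles give a thrust close to 1
print(thrust([[1.0, 0.1, 0.0], [-1.0, -0.1, 0.0]]))
```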