Basis extraction - A brief introduction to data integration by XDS

2 Introduction to the methods of crystallography

2.3 A brief introduction to data integration by XDS

2.3.2 Basis extraction

According to section 2.2.6, any reciprocal lattice vector can be written in the form S^∗₀ = h a^∗₁ + k a^∗₂ + l a^∗₃, where h, k, l are integers and a^∗₁, a^∗₂, a^∗₃ are basis vectors of the reciprocal lattice. The basis vectors as well as the reflection indices h, k, l have to be determined from the list of strong diffraction spots X_i⁰, Y_i⁰, Z_i⁰ (i = 1, ..., n). Ideally, each recorded spot corresponds to a reciprocal lattice vector S^∗₀ satisfying the Laue conditions (see section 2.2.4) after a crystal rotation by the angle ϕ with respect to the starting angle ϕ₀. XDS computes a reciprocal lattice vector, referring to the unrotated crystal, for the coordinates of each observed strong spot centroid in the following way:

$$S^*_0 = D(m_2, -Z^0)\,\bigl(S^0 - S_0\bigr)$$

where m₂ is the unit vector along the fixed axis around which the crystal is rotated and D(m₂, −Z⁰) is a rotation around m₂ by −Z⁰. Z⁰ can be derived from the partiality of reflections: the intensity of a reflection can be completely recorded on one image, or distributed among several adjacent images. The smaller the rotation increment ∆ϕ, the higher the partiality of most reflections.
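Taking Z⁰ as given for the moment, the mapping from a spot centroid to a reciprocal lattice vector can be sketched in a few lines of Python. This is a minimal illustration and not XDS code. It assumes that S₀ is the incident beam wave vector with |S₀| = 1/λ, that S⁰ is obtained by scaling the laboratory direction of the spot to length 1/λ, and that the detector has square pixels. The dictionary keys (`pixel_size`, `wavelength`, `s0`, `m2` and so on) are hypothetical names chosen for this example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate(v, axis, angle_deg):
    """Rotate v about the unit vector `axis` by angle_deg (Rodrigues' formula)."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    kxv = cross(axis, v)
    kv = dot(axis, v)
    return tuple(v[i] * c + kxv[i] * s + axis[i] * kv * (1.0 - c) for i in range(3))

def spot_to_reciprocal(x, y, z_deg, geom):
    """Return S*_0 = D(m2, -Z0)(S0' - S0) for a spot centroid (x, y, z_deg).

    `geom` is a hypothetical container for the detector axes d1, d2, d3,
    the detector origin, the crystal-to-detector distance, the pixel size,
    the wavelength, the incident beam vector s0 and the rotation axis m2.
    """
    d1, d2, d3 = geom["axes"]
    x0, y0 = geom["origin"]
    px = geom["pixel_size"]
    # Laboratory-frame position of the spot relative to the crystal.
    lab = tuple((x - x0) * px * d1[i] + (y - y0) * px * d2[i]
                + geom["distance"] * d3[i] for i in range(3))
    # Diffracted beam wave vector: same direction, length 1/wavelength.
    scale = 1.0 / (geom["wavelength"] * math.sqrt(dot(lab, lab)))
    s = tuple(scale * c for c in lab)
    diff = tuple(s[i] - geom["s0"][i] for i in range(3))
    # Rotate back to the unrotated crystal orientation.
    return rotate(diff, geom["m2"], -z_deg)
```

A spot that coincides with the direct beam position maps to the zero vector for any Z⁰, which is a convenient sanity check for the geometry description.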

Z⁰ can be derived from the sum of the fractions R_i of the total intensity recorded on each of the images over which the reflection is distributed, together with the known rotation increment ∆ϕ. For small ∆ϕ it can be shown that lim_{∆ϕ→0} Z⁰(ϕ) = ϕ, so Z⁰ represents the rotation ϕ with respect to the starting angle ϕ₀.

S⁰ can be derived from the coordinates X⁰, Y⁰, the detector origin X₀, Y₀ and the distance |F| in combination with the detector coordinate system {d₁, d₂, d₃}. The vectors S^∗₀ can thus be derived from known parameters. Since these vectors must fulfil the Laue conditions, a set of basis vectors a^∗₁, a^∗₂, a^∗₃ and reflection indices can be computed that explains all the measured reflections. Because the reciprocal lattice vectors S^∗_0i (i = 1, ..., n) derived from the list of strong diffraction spots often contain a number of aliens (spots arising from fluctuations of the background, from ice, or from satellite crystals), a robust method has to be used that is capable of recognizing the dominant lattice underlying the main part of the recorded spots. In XDS, a lattice basis is found by the following procedure:

First, the list of vectors S^∗_0i (i = 1, ..., n) is reduced to a small number m of low-resolution difference-vector clusters. Each cluster collects difference vectors S^∗_0i − S^∗_0j between pairs of reciprocal lattice vectors that are approximately equal. Then the three “best” linearly independent vectors a^∗₁, a^∗₂, a^∗₃ are selected that allow all difference vectors to be expressed as small integral multiples with respect to this chosen basis. From this triplet of reciprocal basis vectors, a reduced cell is derived as defined in [186].
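The idea of difference-vector clustering can be illustrated with a small sketch. It is a deliberately simplified stand-in, not the procedure implemented in XDS: short difference vectors are grouped greedily (treating d and −d as equivalent), and the "best" basis is approximated by the three most populated, linearly independent cluster centres. The function names, the tolerance `tol`, the length cut-off `max_len` and the independence threshold `rel_tol` are all assumptions made for this example.

```python
import math
from itertools import combinations

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def det3(a, b, c):
    return sum(x * y for x, y in zip(a, cross(b, c)))

def find_cluster(d, clusters, tol):
    """Return (cluster, signed d) for the first cluster matching d or -d, else None."""
    for cl in clusters:
        centre = tuple(s / cl["n"] for s in cl["sum"])
        for cand in (d, tuple(-x for x in d)):
            if norm(sub(cand, centre)) < tol:
                return cl, cand
    return None

def cluster_differences(vectors, max_len, tol):
    """Greedily cluster short difference vectors; return (centre, population) pairs."""
    clusters = []
    for a, b in combinations(vectors, 2):
        d = sub(a, b)
        if not 0.0 < norm(d) <= max_len:
            continue  # keep only short (low-resolution) difference vectors
        hit = find_cluster(d, clusters, tol)
        if hit is None:
            clusters.append({"sum": list(d), "n": 1})
        else:
            cl, cand = hit
            cl["sum"] = [s + x for s, x in zip(cl["sum"], cand)]
            cl["n"] += 1
    result = [(tuple(s / cl["n"] for s in cl["sum"]), cl["n"]) for cl in clusters]
    return sorted(result, key=lambda c: -c[1])  # most populated first

def independent(vectors, rel_tol=0.1):
    """Crude linear-independence check for one, two or three vectors."""
    scale = math.prod(norm(v) for v in vectors)
    if len(vectors) == 2:
        return norm(cross(*vectors)) > rel_tol * scale
    if len(vectors) == 3:
        return abs(det3(*vectors)) > rel_tol * scale
    return True

def pick_basis(clusters):
    """Take the three most populated, linearly independent cluster centres."""
    basis = []
    for centre, _count in clusters:
        if independent(basis + [centre]):
            basis.append(centre)
        if len(basis) == 3:
            break
    return basis
```

Ranking by population makes the sketch tolerant of a few aliens, because their difference vectors rarely repeat and therefore end up in sparsely populated clusters.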

2.3.3 Indexing

Now that a basis a^∗₁, a^∗₂, a^∗₃ of the reciprocal lattice is available, integral indices h_i, k_i, l_i are assigned to each reciprocal lattice vector S^∗_0i (i = 1, ..., n) by using the local indexing method [187]: The reciprocal lattice points are considered as the nodes of a tree connecting the n points to each other. The branches of the tree are the connections between the n points, and the branch length between nodes i and j is defined as

l_{i j} =1−exp

The function l_{ij} is 0 if none of the indices h^{ij}_k exceeds δ in absolute value and the differences between ξ_k^{ij} and the nearest integers h^{ij}_k are within ε. Typical values of these constraints are ε = 0.05 and δ = 5. Thus, reliable index differences are indicated by short branches. Starting with the arbitrary indices 0, 0, 0 for the root node, the local indexing method traverses the shortest tree (along the shortest branches) and thereby assigns each node the indices of its predecessor plus the small index differences between the two nodes. During traversal of the tree, each node is also given a subtree number. Starting with subtree 1 for the root node, each successor node is given the same subtree number as its predecessor if the length of the connecting branch is below a minimal length, l_{ij} < l_min. Otherwise its subtree number is incremented by 1. All nodes in the same subtree therefore have internally consistent reflection indices. Aliens are usually found in subtrees consisting of a small number of nodes, whereas the subtree with the largest number of nodes should represent all the lattice points caused by the dominant lattice. Finally, a constant index offset is determined for the largest subtree, such that the centroids of the observed reciprocal lattice points and their corresponding calculated vectors are as close as possible. This index offset is added to the indices of each reciprocal lattice point.
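The traversal described above can be sketched as a minimum spanning tree grown with Prim's algorithm. Several points in this example are assumptions rather than statements from the text: ξ_k^{ij} is taken to be the k-th component of the difference vector S^∗_0j − S^∗_0i in the chosen basis, h_k^{ij} is its nearest integer, and the penalty inside the exponential is a stand-in chosen only so that l_{ij} = 0 under the two conditions stated above (the exact expression is given in [187]). A new subtree number is drawn from a running counter, and the final determination of the constant index offset is omitted.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def real_basis(astar):
    """Real-space vectors a_k with a_k . a*_m = delta_km."""
    a1s, a2s, a3s = astar
    vol = dot(a1s, cross(a2s, a3s))
    return tuple(tuple(c / vol for c in v)
                 for v in (cross(a2s, a3s), cross(a3s, a1s), cross(a1s, a2s)))

def branch(p, q, abc, eps=0.05, delta=5):
    """Return (branch length, index difference) between nodes p and q.

    The penalty is an assumed stand-in for the expression in [187].
    """
    d = tuple(x - y for x, y in zip(q, p))
    xi = [dot(d, a) for a in abc]
    h = tuple(round(x) for x in xi)
    penalty = sum((max(abs(x - n) - eps, 0.0) / eps) ** 2
                  + max(abs(n) - delta, 0) ** 2 for x, n in zip(xi, h))
    return 1.0 - math.exp(-2.0 * penalty), h

def local_index(points, astar, l_min=0.5):
    """Assign indices and subtree numbers along a minimum spanning tree."""
    abc = real_basis(astar)
    indices = {0: (0, 0, 0)}   # arbitrary indices for the root node
    subtree = {0: 1}
    n_subtrees = 1
    # best[j] = (branch length, index difference, predecessor) for unvisited nodes
    best = {j: branch(points[0], points[j], abc) + (0,)
            for j in range(1, len(points))}
    while best:
        j = min(best, key=lambda k: best[k][0])  # shortest branch first
        length, h, parent = best.pop(j)
        indices[j] = tuple(a + b for a, b in zip(indices[parent], h))
        if length < l_min:
            subtree[j] = subtree[parent]
        else:
            n_subtrees += 1
            subtree[j] = n_subtrees
        for k in best:
            cand = branch(points[j], points[k], abc)
            if cand[0] < best[k][0]:
                best[k] = cand + (j,)
    return indices, subtree
```

For four points of a cubic lattice plus one point at half-integer positions, the lattice points end up in subtree 1 with consistent indices, while the odd point is placed in a subtree of its own.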

2.3.4 Integration

There are two procedures available for determining the integrated intensities: summation integration and profile fitting [188]. Summation integration simply adds the pixel values for all pixels lying within the area of a spot and then subtracts the estimated background contribution to the same pixels. Profile fitting assumes that the actual spot shape is known, and the intensity is derived by finding the scale factor that, when applied to the known profile, gives the best fit to the observed spot profile. This leads to a two-step integration process: first the profile of the spots is determined, and then the intensities fitting this profile are calculated.

For weak reflections, many of the pixels in the peak region of the spot will contain very little signal (Bragg intensity) but will contribute significantly to the noise because of the Poissonian variation in the background (see [162, chapter 11.2]). Profile fitting can improve the signal-to-noise ratio for this class of reflections significantly, because the intensity calculated by profile fitting is a weighted sum: the weight is highest on the pixels in the center of the spot, where the contribution of the Bragg diffraction is greatest, and very low on the peripheral pixels, where the Bragg diffraction is weakest. Thus, the standard deviation of the integrated intensity for weak reflections can be reduced compared with summation integration.

In XDS, integration is done using the profile fitting technique:

2.3.4.1 Spot extraction and standard profiles

A coordinate system (e₁, e₂, e₃), specific for each reflection, is introduced, which has its origin on the surface of the Ewald sphere at the terminus of the diffracted beam wave vector [189].

XDS then “learns” standard profiles from strong spots. Individual profiles are determined in nine equal parts of the detector and for every 5° of crystal rotation. Only strong spots that are close to their predicted location are taken into account.

For each of the reflections consisting of these strong spots, a three-dimensional domain of pixels belonging to the respective reflection is defined, and each pixel content is mapped into the coordinate system (e₁, e₂, e₃). Without this mapping, geometrical distortions would be introduced into the profiles of the different reflections (for an explanation see [162, chapter 11.3]).

At the end of the learning process, an intensity distribution {p_i | i ∈ D₀} of the observed profile is determined within the defined three-dimensional domain D₀.

2.3.4.2 Background

XDS determines the background by first sorting all pixels belonging to a reflection by increasing intensity. For weak or absent reflections, these values should represent a random sample drawn from a normal distribution. If this is not the case, the pixel with the largest intensity is removed, iteratively, until the distribution of the remaining smaller values is consistent with a normal distribution. By this method, pixels with unexpectedly high values, such as those from ice reflections, are also excluded. The background is then determined as the mean of the accepted pixels. For strong spots this value will be systematically overestimated, because some of the residual intensity extends into the accepted background pixels. This residual intensity can be estimated and is removed from the final calculated background value.
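The iterative rejection can be sketched as follows. The text does not specify which test of normality is applied, so this example substitutes a simple assumed criterion: the largest value is accepted once it lies within `k` standard deviations of the mean of the remaining values. The parameters `k` and `min_pixels` are hypothetical, and the correction for residual intensity in strong spots is not included.

```python
import statistics

def estimate_background(pixels, k=3.0, min_pixels=4):
    """Return (background, accepted pixel values) for one reflection.

    Assumed acceptance test: the largest remaining value must lie within
    k standard deviations of the mean of all smaller values.
    """
    values = sorted(pixels)  # increasing intensity
    while len(values) > min_pixels:
        rest = values[:-1]
        mean = statistics.fmean(rest)
        sigma = statistics.pstdev(rest)
        if values[-1] - mean <= k * sigma:
            break            # remaining sample looks consistent
        values.pop()         # reject the largest pixel and test again
    return statistics.fmean(values), values
```

With ten pixels scattered around 10 counts and two pixels at 150 and 200 counts, the two high pixels are rejected and the background estimate is the mean of the remaining ten.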


2.3.4.3 Intensity estimation

A problem that has not been mentioned yet is that the regions of neighbouring reflections may overlap. A decision then has to be made as to which pixel belongs to which reflection. XDS handles this problem with a simple strategy: pixels within the overlap region are assigned to the nearest spot.
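A nearest-spot assignment of this kind amounts to a few lines of code. The sketch below assumes that pixels and spot centroids are given as (x, y, frame) tuples and that plain Euclidean distance in these coordinates is an adequate measure of "nearest"; both are assumptions of this example, as is the function name.

```python
import math

def assign_to_nearest(pixels, centroids):
    """Map each pixel (x, y, frame) to the index of the closest spot centroid."""
    return {p: min(range(len(centroids)),
                   key=lambda s: math.dist(p, centroids[s]))
            for p in pixels}
```

For two spots centred at (0, 0, 0) and (10, 0, 0), a pixel at (4, 0, 0) is assigned to the first spot and a pixel at (6, 0, 0) to the second.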

If the intensity distribution {p_i | i ∈ D₀} is given, the intensity of each reflection can be determined by minimizing

$$\Psi(I) = \sum_{i \in D} \frac{(c_i - I\,p_i - b_i)^2}{v_i}, \qquad \sum_{i \in D_0} p_i = 1, \tag{2.27}$$

where c_i and v_i (i ∈ D) are the measured contents and variances of the pixels observed in a subdomain D ⊆ D₀ of the expected distribution, b_i is the background (see section 2.3.4.2) and I is the intensity to be determined. The variance v_i = b_i + I p_i is initially set to v_i = b_i, and the intensity estimate is then refined in an iterative process. A detailed description of the algorithms used by XDS can be found in [162, chapter 11.3].
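For fixed variances v_i, setting the derivative of Ψ(I) with respect to I to zero gives the weighted least-squares estimate I = Σ p_i (c_i − b_i)/v_i / Σ p_i²/v_i, with both sums running over D. The iteration described above can then be sketched as below. The number of cycles `n_iter` and the clamping of negative intensity estimates when updating the variances are assumptions of this example, and strictly positive background values are assumed so that the initial variances are non-zero.

```python
def fit_intensity(counts, profile, background, n_iter=5):
    """Estimate I by minimizing sum((c_i - I*p_i - b_i)**2 / v_i) over the given pixels.

    counts, profile and background hold c_i, p_i and b_i for the pixels in D.
    """
    variances = list(background)  # initial guess: v_i = b_i
    intensity = 0.0
    for _ in range(n_iter):
        num = sum(p * (c - b) / v
                  for c, p, b, v in zip(counts, profile, background, variances))
        den = sum(p * p / v for p, v in zip(profile, variances))
        intensity = num / den
        # Update v_i = b_i + I*p_i; a negative estimate is clamped (assumption).
        variances = [b + max(intensity, 0.0) * p
                     for b, p in zip(background, profile)]
    return intensity
```

Because the profile is normalized over D₀ while the fit uses only the pixels in D, the estimate remains on the correct scale when part of the spot is unobserved, whereas a plain sum over D would only recover the corresponding fraction of the intensity.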



In document: X-ray crystallographic analysis of the archaeal transcriptional regulator TrmB and development of a graphical user interface for the monochromatic diffraction data processing software XDS (pages 83-89)