
1.3 Challenges of reconstructing single particle cryoEM data

1.3.1 Contrast transfer function (CTF)

Optical aberrations in a TEM blur the recorded images. The electron dose and the spherical aberration, for example, affect the detected signal. In addition, the TEM is deliberately defocused to obtain better phase contrast in the images, and the phase shift introduced this way needs to be removed. The Point Spread Function (PSF) is used to correct for these kinds of defects. It describes how an optical system maps an ideal point source in the object onto the image. To correct a projection image, the image is convolved with the PSF. As mentioned in Theorem 2.2.5, a convolution in real space is equivalent to a multiplication in the Fourier domain. Since the convolution of 2D projection images is time-consuming, the projection images are Fourier transformed (see Theorem 2.2.3) and multiplied by the CTF, the Fourier space equivalent of the PSF. The CTF (see Figure 1.7) is an oscillating, sinusoidal function of the spatial frequencies. Multiplying by the CTF in Equation 1.1 corrects the displaced phases of the Fourier transformed image. If the CTF correction is skipped, a single particle projection image cannot be interpreted beyond the first zero crossing of the CTF [30].

$$\mathrm{CTF}(s) = \sin(\gamma(s)) \qquad (1.1)$$

where $s = \sqrt{s_x^2 + s_y^2}$ is the length of the two-dimensional spatial frequency vector and $\gamma$ is the phase with respect to the spatial frequencies. The wavelength $\lambda$ (see subsubsection 2.3.1.1) depends on the accelerating voltage of the electrons used for imaging. The CTF describes the defocus $\Delta f$ introduced by the setting of the objective lens of the TEM. A focused image exists when the beam converges on the back-focal plane. With underfocus and overfocus, the beam converges either above or below the back-focal plane. In Figure 1.7, two CTFs with different defocus settings are plotted. With increasing defocus, the wavelength of the sine waves decreases. The spherical aberration of a lens in the TEM, called $C_s$, is a constant of the microscope. It describes the inability of the lens to converge the beam to a single focal point at high angles, which blurs the resulting image. For cryo-prepared samples, the TEM is set to underfocus to enhance the contrast of the projection images. All three parameters $\lambda$, $\Delta f$, and $C_s$ are known from the microscope settings.

Figure 1.7: Synthetic CTF. Here, two CTFs are sketched. The CTF near focus is a slowly varying sinusoidal function. Here, the defocus is set to $\Delta f$ = 0.25 µm, which is close to the back focal plane. The CTF with underfocus corresponds to imaging with a higher defocus and varies much faster. Both sinusoidal functions are plotted with the same parameter settings. Parameters: $C_s$ = 2.7 mm, pixel size = 1 Å, $\lambda$ = 0.0197 Å.
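The dependence of the CTF on the defocus can be illustrated with a short numerical sketch. The phase expression used below, $\pi \lambda \Delta f s^2 - \frac{\pi}{2} C_s \lambda^3 s^4$, is a commonly used form and an assumption here, since sign conventions differ between texts. The function names and the second defocus value of 2 µm are hypothetical; the spherical aberration of 2.7 mm and the wavelength of 0.0197 Å follow Figure 1.7.

```python
import numpy as np

def ctf_1d(s, defocus, cs, wavelength):
    """Phase-contrast CTF sin(phase(s)); all lengths in Angstrom, s in 1/Angstrom.

    ASSUMPTION: phase(s) = pi*lambda*defocus*s^2 - (pi/2)*Cs*lambda^3*s^4,
    with positive defocus meaning underfocus.
    """
    s = np.asarray(s, dtype=float)
    phase = (np.pi * wavelength * defocus * s**2
             - 0.5 * np.pi * cs * wavelength**3 * s**4)
    return np.sin(phase)

def count_zero_crossings(values):
    """Number of sign changes in a sampled curve (exact zeros are skipped)."""
    signs = np.sign(values)
    signs = signs[signs != 0]
    return int(np.count_nonzero(signs[1:] != signs[:-1]))

CS = 2.7e7            # 2.7 mm expressed in Angstrom
WAVELENGTH = 0.0197   # Angstrom
s = np.linspace(0.0, 0.5, 4096)   # up to Nyquist for a pixel size of 1 Angstrom

near_focus = ctf_1d(s, 0.25e4, CS, WAVELENGTH)   # 0.25 micrometre
underfocus = ctf_1d(s, 2.0e4, CS, WAVELENGTH)    # 2 micrometre (hypothetical)

print(count_zero_crossings(near_focus), count_zero_crossings(underfocus))
```

Counting the sign changes up to the Nyquist frequency shows that the curve with the larger defocus oscillates faster, so its first zero crossing lies at a lower spatial frequency.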

Other defects of the TEM, e.g. astigmatism, change the defocus settings of the microscope. Astigmatism leads to different foci for rays in perpendicular planes. It results either from lenses with a non-uniform electromagnetic field [31] or from an aperture that is not perfectly centered. Additionally, astigmatism can arise from beam deflection caused by charges on dirty apertures. It creates elliptical Thon rings in the power spectrum of the micrograph (see Figure 1.8b). As a result, the defocus deviates depending on the direction in the power spectrum.

The new defocus values $\Delta f_{ast}$ are determined by fitting the rings of the CTF to the Thon rings, i.e. the rings in the power spectrum of the micrograph. The defocus $\Delta f$ in Equation 1.1 is altered to

$$\Delta f_{ast}(\theta) = \Delta f_u \cos^2(\theta - \theta_{ast}) + \Delta f_v \sin^2(\theta - \theta_{ast}), \qquad (1.2)$$

where $\Delta f_u$ and $\Delta f_v$ define the defocus along the minimal and maximal axes of the elliptical rings in the power spectrum (see Figure 1.8b). The angle $\theta_{ast}$ is the angle between the longest diameter of the ellipse, along which the defocus $\Delta f_u$ is defined, and the Cartesian coordinate system [30, 32].
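Equation 1.2 can be evaluated directly. The following minimal sketch uses hypothetical defocus values and a hypothetical astigmatism angle, and the function name is illustrative only. It shows that the defocus equals the first axis value along the astigmatism angle and the second axis value perpendicular to it.

```python
import numpy as np

def astigmatic_defocus(theta, defocus_u, defocus_v, theta_ast):
    """Direction-dependent defocus following Equation 1.2 (angles in radians)."""
    delta = np.asarray(theta, dtype=float) - theta_ast
    return defocus_u * np.cos(delta) ** 2 + defocus_v * np.sin(delta) ** 2

# Hypothetical values: defocus in micrometre, angle in radians.
DEFOCUS_U, DEFOCUS_V, THETA_AST = 1.8, 2.2, np.deg2rad(30.0)

theta = np.linspace(0.0, 2.0 * np.pi, 361)
defocus = astigmatic_defocus(theta, DEFOCUS_U, DEFOCUS_V, THETA_AST)

along_axis = astigmatic_defocus(THETA_AST, DEFOCUS_U, DEFOCUS_V, THETA_AST)
perpendicular = astigmatic_defocus(THETA_AST + np.pi / 2, DEFOCUS_U, DEFOCUS_V, THETA_AST)
print(float(along_axis), float(perpendicular))
```

Without astigmatism, both axis values coincide and the expression reduces to a constant defocus, so the Thon rings are circular.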

There are additional factors, e.g. amplitude contrast, which can further influence the image quality. The envelope function is introduced due to the spatial and temporal coherence of the beam. This function dampens the CTF, especially at high frequencies. Possible damping functions rely on the drift of the energy spread in the beam or on the instability of the current in a lens [30]. A state-of-the-art envelope function is based on the B-factor. Further details are given by Mallick et al. [30] and Zhang [32].
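The damping effect of a B-factor envelope can be sketched as follows. The Gaussian form $\exp(-B s^2 / 4)$ is the commonly used B-factor envelope and is an assumption here, as the text does not state the exact expression. The B-factor of 100 Å² and the placeholder oscillation standing in for the CTF are hypothetical.

```python
import numpy as np

def b_factor_envelope(s, b_factor):
    """Gaussian envelope exp(-B * s^2 / 4); s in 1/Angstrom, B in Angstrom^2.

    ASSUMPTION: this is the common B-factor form, not necessarily the exact
    envelope used in the cited references.
    """
    s = np.asarray(s, dtype=float)
    return np.exp(-b_factor * s**2 / 4.0)

s = np.linspace(0.0, 0.5, 512)
envelope = b_factor_envelope(s, b_factor=100.0)   # hypothetical B-factor

# Placeholder oscillation standing in for an undamped CTF.
oscillation = np.sin(2.0 * np.pi * 40.0 * s**2)
damped = envelope * oscillation

print(float(envelope[0]), float(envelope[-1]))
```

The envelope equals one at zero frequency and decays monotonically, so the high-frequency oscillations of the damped curve are suppressed most strongly.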

(a) Sketch of a 2D power spectrum with no astigmatism. The CTF is fitted to the power spectrum.

(b) Sketch of a 2D power spectrum with an astigmatism. The CTF is fitted to correct the astigmatism.

Figure 1.8: Correction of astigmatism. The teal rings correspond to the maximum peaks of the power spectrum of a micrograph. The CTF is fitted to these Thon rings.

1.3.2 Noise

The objective of an experiment is to measure a particular signal of interest and then to analyze and interpret it. The ideal signal in Figure 1.9 is the projection of the synthetic model. Here, the black parts of the image represent areas where no signal was detected, and the other parts correspond to pixels where a signal was generated by the 3D density model. In theory, this signal is considered to be the ideal or predicted signal. In cryo-EM, the ideal signal is the projection of a protein complex formed by the electrons. Given the resolving power of the TEM, the structures of protein complexes can theoretically be determined at atomic resolution. However, the average published resolution does not reach this theoretical potential of the method (see Figure 1.4). One difficulty is a random process disturbing the ideal projection signal.

Figure 1.9: Synthetic additive image noise. The first summand is a projection image of a simulated 3D density map. The map was noise free, so the projection image contains the predicted (i.e. ideal) signal. The second summand is a pure Gaussian-distributed noise image simulated in MATLAB. The sum of both images represents the measured signal. It is distorted due to a variety of effects.

On the experimental side, the measured signal deviates from the predicted signal. A variety of disturbances interfere with the signal of interest. Together, these disturbances are called noise. Noise leads to artifacts and unrealistic edges, or blurs out information [33]. The informative content of the noisy image in Figure 1.9 is reduced compared to the ideal projection. Interpreting the data based on the measured signal is therefore difficult and can lead to false assumptions about the underlying structure. In digital image processing, noise emerges from image acquisition, image coding, transmission, and processing of the data [33]. Contamination of a specimen can lead to a false signal. A faulty memory location, for example, can corrupt the digital image [33]. All these interferences add up to the noisy measured signal in Figure 1.9.

In general, disturbances are unpredictable and random, and they describe the combination of all physical components that interfere with the predicted signal. The characteristics of noise are modeled by probability distributions describing the random statistical processes. The most common noise distribution is the Gaussian (see Figure 1.9). There are also Poisson noise, uniform noise and impulse noise [33]. In signal processing, noise is often considered to be white or colored, where the power spectrum of the noise defines the color. White noise has a constant power spectrum, i.e. equal power in spatial frequency intervals of equal length. It is commonly modeled as normally distributed with zero mean and a variance of one. Colored noise, e.g. pink or blue noise, has different spectral properties than white noise. The noise component in image processing is modeled in two different ways. On the one hand, there is multiplicative noise, which depends on the signal. This type is more severe since it is not easily separated from the ideal signal. On the other hand, there is additive noise, as in Figure 1.9. Here, the noise is added on top of the signal and does not modify the predicted signal. In the image processing theory of SPA of cryo-EM data, the random processes are described by an additive model. A simple representation of a single particle projection image is

$$I = f + m, \qquad m \sim \mathcal{N}(\mu, \sigma^2), \qquad (1.3)$$

where the noise $m$ is Gaussian distributed with mean $\mu$ and variance $\sigma^2$. In Figure 1.9, the Gaussian noise image was added to the ideal projection image $f$, leading to the modified image $I$. This is similar to a single particle projection image, where a noise component is added to the underlying ideal signal.
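The additive model of Equation 1.3 can be sketched with NumPy as follows. The bright disk is a hypothetical stand-in for an ideal projection image, and the image size, seed and variable names are assumptions for illustration. The sketch also compares the noise power in a low and a high spatial frequency band to illustrate the flat power spectrum of white noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ideal projection f: a bright disk on a black background.
size = 128
yy, xx = np.mgrid[:size, :size]
ideal = ((xx - size / 2) ** 2 + (yy - size / 2) ** 2 < (size / 4) ** 2).astype(float)

# Gaussian noise m with mean mu and variance sigma^2, drawn per pixel.
mu, sigma = 0.0, 1.0
noise = rng.normal(loc=mu, scale=sigma, size=ideal.shape)

# Measured image I = f + m (Equation 1.3).
measured = ideal + noise

# Power spectrum of the noise, normalised so that white noise gives ~sigma^2.
power = np.abs(np.fft.fft2(noise)) ** 2 / noise.size
freq = np.fft.fftfreq(size)
radius = np.hypot(*np.meshgrid(freq, freq, indexing="ij"))
low_band = power[(radius > 0) & (radius < 0.25)].mean()
high_band = power[radius >= 0.25].mean()

print(low_band, high_band)
```

For white noise, both band averages are expected to lie close to the noise variance, whereas colored noise would show a systematic difference between the bands.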

To quantify the information content of an image, the ratio between the power of the signal and the power of the noise is determined. This ratio is called the signal-to-noise ratio (SNR):

$$\mathrm{SNR} = \frac{P_{\mathrm{Signal}}}{P_{\mathrm{Noise}}} \qquad (1.4)$$

An SNR equal to one indicates that the same amount of signal as noise is present in the data. The three images in Table 1.1 represent the identical underlying signal but different noise powers. The first image has about the same amount of power for noise and signal. Here, the SNR is close to one. The other two images contain a greater amount of additive noise. The second image, with an SNR of about 0.25, has about four times more noise power than signal power. The signal of the dinosaur tail has lost some visibility. In the third image, the tail is completely invisible. Identifying the signal in such images, which is often one aim of image processing, is difficult. A low SNR affects the quality of the image processing results. Therefore, a sufficiently large SNR is necessary to differentiate between the signal and the noise and, consequently, to extract the signal information correctly. The SNR of cryo-EM projection images is very small. It often ranges from 0.1 to 0.3. In principle, the SNR could be raised by increasing the number of electrons used for image acquisition. However, biological matter is sensitive to radiation, so a higher electron dose damages the structure of the protein complex.
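The relation between noise variance and SNR in Table 1.1 can be reproduced approximately with a small sketch. It assumes that power is measured as the variance of the image, which the text does not state explicitly. The disk image is a hypothetical stand-in for the synthetic projection, and the noise variances of 1, 4 and 16 only approximate the tabulated values.

```python
import numpy as np

def snr(signal, noise):
    """Ratio of signal power to noise power (Equation 1.4).

    ASSUMPTION: power is taken as the variance of the respective image.
    """
    return float(np.var(signal) / np.var(noise))

rng = np.random.default_rng(0)

# Hypothetical ideal projection, rescaled to zero mean and unit power.
size = 128
yy, xx = np.mgrid[:size, :size]
disk = ((xx - size / 2) ** 2 + (yy - size / 2) ** 2 < (size / 4) ** 2).astype(float)
signal = (disk - disk.mean()) / disk.std()

# The same signal combined with three increasing noise powers.
noise_variances = [1.0, 4.0, 16.0]
snr_values = [
    snr(signal, rng.normal(0.0, np.sqrt(variance), size=signal.shape))
    for variance in noise_variances
]

print(snr_values)
```

With a signal of unit power, the SNR is simply the reciprocal of the noise variance, which matches the pattern of the values in Table 1.1.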

The computational techniques that aim to remove additive noise from recorded data depend on the noise sources. Noise is often caused by multiple effects during image acquisition. The noise model describes all sources that cause the random disturbances of the signal. The quantization error, for example, emerges from the conversion of a continuous signal into a measured, discrete digital signal [33]. In general, any wave function is in theory a continuous function. The signal generated by electrons, however, can only be measured at finitely many time points. Therefore, there is a difference between the ideal signal to be detected and the discrete signal, which depends on the spatial scale. The mapping of the spatial frequencies to a pixel is not precise and furthermore reduces the quality of the signal information. Quantization error is often assumed to be additive white noise. Thus, it is important to learn about and understand the noise source before going into image processing.

                               Image 1    Image 2    Image 3
SNR                            1.0276     0.2569     0.0642
Variance of the noise image    0.9731     3.8924     15.5695

Table 1.1: SNR of synthetic data. Here, three projection images of a synthetic map with different SNR values are presented. All three contain the identical signal power. They differ in the power $\sigma^2$ of the additive noise component. With decreasing SNR, the features of the map become less visible. The tail of the dinosaur is a finer detail of the synthetic 3D model. For this particular feature, the additive noise power masks the power of the signal.
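The treatment of quantization error as additive noise can be illustrated with a uniform quantizer. This is a minimal sketch under assumptions: the random input values, the 256 quantization levels and the variable names are hypothetical and do not describe a particular detector. For a uniform quantizer with step size $\Delta$ and an input that varies over many steps, the error is approximately uniformly distributed with variance $\Delta^2 / 12$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical continuous signal values in [0, 1).
continuous = rng.uniform(0.0, 1.0, size=100_000)

# Uniform quantization to 256 levels (assumed, e.g. 8-bit storage).
levels = 256
step = 1.0 / levels
quantized = np.round(continuous / step) * step

# Quantization error, treated as an additive disturbance of the signal.
error = quantized - continuous

print(error.var(), step**2 / 12.0)
```

The measured error variance is close to the theoretical value, and the error never exceeds half a quantization step, which is why it is often modeled as a small additive noise term.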