Image Alignment - Image Processing - Deep Learning with Multi-Dimensional Medical Image Data

2.2 Image Processing

2.2.3 Image Alignment

Image alignment through registration algorithms is an important problem in medical image analysis. Often, several images of the same or several patients are aligned for visualization and direct comparison. For this purpose, different images with different coordinate systems need to be transformed into the same coordinate system. This can be performed for several scans of the same patient at different points in time, which allows for longitudinal comparison, for example, for monitoring disease progression.

In the context of deep learning with a longitudinal sequence of images, alignment is also important because the underlying convolutional operations initially exploit local context. Thus, ensuring that local context matches between time points should improve the methods’ performance. Another example is the alignment of images of the same patient acquired with different imaging techniques, often referred to as multi-modal image fusion. This can be relevant for the use of, for example, different MR imaging modalities in deep learning models. Also, registration between patients is relevant in the context of visualization and comparison. While alignment between patients can be helpful for deep learning models, it is also possible to force models to learn invariance towards incorrect patient alignment through data augmentation techniques.

During image registration, a fixed image $f_{I_F}: \mathbb{R}^{N_d} \to \mathbb{R}$ is aligned with a moving image $f_{I_M}: \mathbb{R}^{N_d'} \to \mathbb{R}$ using a transformation $t_{reg}: \mathbb{R}^{N_d} \to \mathbb{R}^{N_d'}$ with $N_d, N_d' \in \{2, 3\}$. This is an optimization problem where $t_{reg}$ is chosen such that $f_{I_M}(t_{reg}(x_c))$ becomes similar to $f_{I_F}(x_c)$ for all image coordinates $x_c$, as measured by a similarity metric.
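The optimization view can be illustrated with a toy example. The following is a minimal sketch, not a method from this thesis: it assumes a 2D image, restricts the transformation to integer translations, uses the sum of squared differences as the similarity measure, and shifts with wrap-around via `np.roll`. The function names `ssd` and `register_translation` are hypothetical.

```python
import numpy as np

def ssd(a, b):
    """Sum of squared intensity differences between two equally shaped images."""
    return float(np.sum((a - b) ** 2))

def register_translation(fixed, moving, max_shift=5):
    """Exhaustively search integer shifts (dy, dx) of the moving image
    and return the shift with the lowest cost together with that cost."""
    best_shift, best_cost = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # candidate[y, x] = moving[y - dy, x - dx], with wrap-around
            # at the borders (a simplification for this toy example)
            candidate = np.roll(moving, shift=(dy, dx), axis=(0, 1))
            cost = ssd(fixed, candidate)
            if cost < best_cost:
                best_shift, best_cost = (dy, dx), cost
    return best_shift, best_cost

rng = np.random.default_rng(0)
fixed = rng.random((32, 32))
# Simulate a misaligned acquisition: the moving image is the fixed image
# shifted by 3 rows and -2 columns.
moving = np.roll(fixed, shift=(3, -2), axis=(0, 1))

shift, cost = register_translation(fixed, moving)
print(shift, cost)
```

The search recovers the inverse of the simulated shift. Practical registration methods replace the exhaustive search with continuous optimization and interpolation.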

In general, registration algorithms can be categorized through three properties. The first property is the image dimensionality. Typically, there are 2D-2D, 2D-3D, and 3D-3D registration methods. 2D-3D can be relevant for the alignment of preoperative 3D CT scans with intraoperative 2D US or X-ray images. 2D-2D and 3D-3D registration are relevant for longitudinal alignment or multi-modal registration. The second property is the type of information that is used for registration. Typically, landmarks, curves and surfaces, or voxels are used for registration. For landmarks, curves, and surfaces, prior detection or segmentation of the relevant features for registration is required. Third, registration methods can be categorized by being parametric or nonparametric. Parametric methods include rigid, affine, and perspective transforms, which perform a global transformation of the moving image. Nonparametric transforms can also consider local deformation in the images to be registered.

In the context of this thesis, rigid transformations are particularly relevant. Here, the moving image is rotated and translated to match the fixed image, which assumes that the transformed objects in the image are rigid. The transformation $t_{reg}: \mathbb{R}^3 \to \mathbb{R}^3$ maps a coordinate vector $x_c = (x, y, z)^T$ using a rotation matrix $R_{rot} = (r_{ij})_{i,j=1,\dots,3} \in \mathbb{R}^{3 \times 3}$ and a translation vector $s = (s_x, s_y, s_z)^T \in \mathbb{R}^3$:

$$t_{reg}(x_c) = R_{rot}\, x_c + s \tag{2.35}$$

The rotation matrix $R_{rot}$ is defined by three rotation angles $\alpha_{rot}$, $\beta_{rot}$ and $\gamma_{rot}$. Thus, the entire rigid transformation is defined by the six parameters $\alpha_{rot}$, $\beta_{rot}$, $\gamma_{rot}$, $s_x$, $s_y$ and $s_z$.

An example application in the context of this thesis is the alignment of several longitudinal MRI scans for multiple sclerosis lesion activity segmentation, see Figure 2.15. Here, the images were taken several months apart, and different acquisition parameters such as slice orientation were chosen for each acquisition. Thus, the images need to be rotated and shifted for proper alignment. Another example of rigid image registration is the task of motion tracking and compensation, for example, in the context of intraoperative motion compensation with OCT. Here, patient or surgical tool movement causes a shift or rotation. Image registration can be used to estimate the translation and rotation between an initial image and an image acquired after the shift or rotation. In this thesis, this task is also addressed using deep learning methods.
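The six-parameter rigid transformation of Equation 2.35 can be sketched in a few lines. This is a minimal sketch under an assumed convention: the rotation matrix is composed as $R_z(\gamma) R_y(\beta) R_x(\alpha)$ with angles in radians, which is one common choice and not necessarily the composition used in this thesis. The function names are hypothetical.

```python
import numpy as np

def rotation_matrix(alpha, beta, gamma):
    """Compose R = Rz(gamma) @ Ry(beta) @ Rx(alpha), angles in radians.
    The composition order is an assumption made for this sketch."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    rx = np.array([[1.0, 0.0, 0.0], [0.0, ca, -sa], [0.0, sa, ca]])
    ry = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])
    rz = np.array([[cg, -sg, 0.0], [sg, cg, 0.0], [0.0, 0.0, 1.0]])
    return rz @ ry @ rx

def rigid_transform(x_c, angles, s):
    """Apply t_reg(x_c) = R_rot x_c + s to a single coordinate vector."""
    r_rot = rotation_matrix(*angles)
    return r_rot @ np.asarray(x_c, dtype=float) + np.asarray(s, dtype=float)

# Rotate the point (1, 0, 0) by 90 degrees about the z-axis,
# then translate it by one unit along x.
p = rigid_transform([1.0, 0.0, 0.0],
                    angles=(0.0, 0.0, np.pi / 2),
                    s=(1.0, 0.0, 0.0))
print(np.round(p, 6))
```

Because the matrix is a pure rotation, it is orthogonal with determinant one, so distances between points are preserved. This is the rigidity assumption stated above.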


Fig. 2.15: An example of longitudinal 3D spatio-temporal MRI data, both unregistered (top) and registered using rigid registration (bottom). For the unregistered images, a slice along the slice acquisition orientation is shown.

Other parametric registration methods extend rigid registration by additional degrees of freedom. For an affine transformation, shearing is also covered. The matrix $R \in \mathbb{R}^{3 \times 3}$ is no longer a pure rotation matrix, and all its nine entries are parameters, leading to a total of twelve parameters. For a perspective transform, perspective distortions are also considered, which leads to 15 parameters in total.
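The parameter counts can be made concrete with homogeneous coordinates. The sketch below is an illustration under assumptions, not the formulation used in this thesis. It writes the affine transform as a 4x4 matrix whose last row is fixed to (0, 0, 0, 1), which leaves 9 + 3 = 12 free entries. The perspective transform is a general 4x4 matrix applied up to scale, which leaves 16 - 1 = 15 free parameters. The function names are hypothetical.

```python
import numpy as np

def affine_matrix(a, s):
    """Build a 4x4 homogeneous matrix from a 3x3 matrix a (9 parameters)
    and a translation vector s (3 parameters)."""
    t = np.eye(4)
    t[:3, :3] = a
    t[:3, 3] = s
    return t

def apply_homogeneous(t, x_c):
    """Apply a 4x4 homogeneous transform and divide by the last coordinate."""
    x_h = t @ np.append(np.asarray(x_c, dtype=float), 1.0)
    return x_h[:3] / x_h[3]

# Affine example: shear x proportionally to y, then translate along x.
a = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
t_affine = affine_matrix(a, s=[2.0, 0.0, 0.0])
q_affine = apply_homogeneous(t_affine, [1.0, 2.0, 3.0])

# Perspective example: a non-trivial last row makes the scaling depend on z.
t_persp = t_affine.copy()
t_persp[3, :3] = [0.0, 0.0, 0.1]
q_persp = apply_homogeneous(t_persp, [1.0, 2.0, 3.0])

# Scaling the whole matrix does not change the result, which is why
# one of the 16 entries is not a free parameter.
q_scaled = apply_homogeneous(2.0 * t_persp, [1.0, 2.0, 3.0])
print(q_affine, q_persp, q_scaled)
```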

Nonparametric registration methods are useful if local deformations between the fixed and the moving image also need to be covered. This can be useful for intraoperative scenarios where soft tissue is being deformed and a preoperative scan needs to be registered to an intraoperative imaging modality. Nonparametric registration can also be used to estimate motion fields from spatio-temporal image data or to perform atlas-based image segmentation. For a nonparametric registration, the transformation $t_{reg}$ is characterized by a displacement field $u_d: \mathbb{R}^{N_d} \to \mathbb{R}^{N_d}$ which shifts the original coordinate vector $x_c$:

$$t_{reg}(x_c) = x_c - u_d(x_c) \tag{2.41}$$

In order to find a plausible deformation field $u_d$, regularization is also required when formulating the optimization problem. Thus, the optimization problem can be defined as

$$\min \; D_{reg}[f_{I_F}, f_{I_M} \circ t_{reg}] + \lambda_{reg}\, \mathrm{Regl}[t_{reg}] \tag{2.42}$$

where $D_{reg}$ is a distance metric, $\mathrm{Regl}$ is a regularization method, and $\lambda_{reg}$ is a weighting factor. The regularization method is chosen based on the specific problem. Regularization is also an important part of deep learning methods and is discussed further in Section 3.3.5.

Note that registration itself is a problem that can be solved using deep learning methods. This problem is not addressed in this thesis. An overview of these techniques is given by Haskins et al. [189].
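The role of the regularization term can be illustrated by evaluating the objective of Equation 2.42 for a few candidate displacement fields. The following is a minimal 1D sketch under assumptions that are not taken from this thesis: the distance metric is the sum of squared differences, the regularizer is the squared finite difference of $u_d$ (which penalizes non-smooth fields), and sampling uses linear interpolation with clamping at the borders. The function names are hypothetical.

```python
import numpy as np

def warp_1d(moving, u_d):
    """Sample the moving signal at t_reg(x_c) = x_c - u_d(x_c)
    using linear interpolation (values are clamped at the borders)."""
    x_c = np.arange(moving.size, dtype=float)
    return np.interp(x_c - u_d, x_c, moving)

def objective(fixed, moving, u_d, lam):
    """D_reg + lam * Regl, with SSD as the distance and the squared
    finite difference of u_d as the regularizer (assumed choices)."""
    distance = np.sum((fixed - warp_1d(moving, u_d)) ** 2)
    regularizer = np.sum(np.diff(u_d) ** 2)
    return float(distance + lam * regularizer)

x = np.arange(64, dtype=float)
fixed = np.exp(-0.5 * ((x - 30.0) / 4.0) ** 2)   # bump centered at 30
moving = np.exp(-0.5 * ((x - 34.0) / 4.0) ** 2)  # same bump centered at 34

u_zero = np.zeros_like(x)                      # no deformation
u_const = np.full_like(x, -4.0)                # smooth field that aligns the bumps
u_rough = u_const + np.where(x % 2 == 0, 0.5, -0.5)  # oscillating field

cost_zero = objective(fixed, moving, u_zero, lam=1.0)
cost_const = objective(fixed, moving, u_const, lam=1.0)
cost_rough = objective(fixed, moving, u_rough, lam=1.0)
print(cost_zero, cost_const, cost_rough)
```

The smooth field that aligns both signals has the lowest cost, while the oscillating field is penalized by the regularizer. An actual nonparametric method would minimize this objective over $u_d$ instead of comparing hand-picked candidates.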

All registration methods are applied by solving an optimization problem where a distance metric $D_{reg}$ and sometimes a regularization term $\mathrm{Regl}$ are part of a loss function to be minimized. The distance metric $D_{reg}$ depends on the information being used for registration. In the case of voxel-based registration, a typical example is the sum of squared intensity differences, which is defined as

$$\mathrm{SSD}_{reg} := \sum_{i=1}^{N} \left( f_{I_F}(x_c^i) - f_{I_M}(t_{reg}(x_c^i)) \right)^2 \tag{2.43}$$

where $N$ is the number of voxels in the image. This distance metric is useful if the two images’ intensities are within the same range, for example, for monomodal registration of longitudinal MRI scans. This condition is likely not fulfilled if, for example, different imaging modalities are being registered. For this purpose, mutual information can be used instead. The mutual information $M_I$ between the moving and the fixed image is defined by the entropy $H$ of the two images’ intensity distributions and of their joint intensity distribution

$$M_I = H_{f_{I_F}(x_c)} + H_{f_{I_M}(t_{reg}(x_c))} - H_{f_{I_F}(x_c),\, f_{I_M}(t_{reg}(x_c))} \tag{2.44}$$

where the images’ entropy of the intensity distribution is, in general, given by

$$H_{i_I^1} = -\sum_{l=1}^{a_1} p(i_I^1(l)) \log(p(i_I^1(l))) \tag{2.45}$$

$$H_{i_I^1, i_I^2} = -\sum_{l=1}^{a_1} \sum_{j=1}^{a_2} p(i_I^1(l), i_I^2(j)) \log(p(i_I^1(l), i_I^2(j))) \tag{2.46}$$

Here, $i_I^1(l)$ and $i_I^2(j)$ are the pixel intensities with amplitude sets $a_1$ and $a_2$, respectively. The probabilities $p(i_I^1(l))$, $p(i_I^2(j))$ and $p(i_I^1(l), i_I^2(j))$ can be estimated based on the relative frequency of the intensities $i_I^1(l)$ and $i_I^2(j)$. In contrast to the sum of squared differences metric, the goal is to maximize mutual information, since higher mutual information corresponds to more similar images.
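The difference between the two metrics can be illustrated numerically. The sketch below is an illustration under assumptions, not the implementation used in this thesis: the probabilities of Equations 2.45 and 2.46 are estimated from a joint histogram with 16 bins per image, the natural logarithm is used, and a contrast-inverted copy of an image stands in for a second modality. The function names are hypothetical.

```python
import numpy as np

def ssd(fixed, moving_warped):
    """Sum of squared intensity differences (to be minimized)."""
    return float(np.sum((fixed - moving_warped) ** 2))

def entropy(p):
    """Entropy of a discrete distribution; empty bins are skipped."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def mutual_information(fixed, moving_warped, bins=16):
    """M_I = H(fixed) + H(moving) - H(fixed, moving), with probabilities
    estimated from relative frequencies in a joint histogram."""
    joint, _, _ = np.histogram2d(fixed.ravel(), moving_warped.ravel(), bins=bins)
    p_joint = joint / joint.sum()
    p_fixed = p_joint.sum(axis=1)    # marginal distribution of the fixed image
    p_moving = p_joint.sum(axis=0)   # marginal distribution of the moving image
    return entropy(p_fixed) + entropy(p_moving) - entropy(p_joint.ravel())

rng = np.random.default_rng(0)
fixed = rng.random((64, 64))
# Same structure but inverted contrast, as a stand-in for another modality.
inverted = 1.0 - fixed
# Same intensities but randomly permuted, so the spatial structure is lost.
shuffled = rng.permutation(fixed.ravel()).reshape(fixed.shape)

ssd_inverted, ssd_shuffled = ssd(fixed, inverted), ssd(fixed, shuffled)
mi_inverted = mutual_information(fixed, inverted)
mi_shuffled = mutual_information(fixed, shuffled)
print(ssd_inverted, ssd_shuffled, mi_inverted, mi_shuffled)
```

In this toy setting, the sum of squared differences rates the perfectly aligned but contrast-inverted image worse than the shuffled one, whereas mutual information rates it far higher, because the inverted intensities are fully predictable from the fixed image.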

Summary. Although deep learning methods perform end-to-end image processing, image processing and preprocessing techniques are still relevant. Point operators can be used for simple image transformations such as normalization. Convolution-based filtering is the fundamental principle behind most deep learning methods for images. Here, a kernel is swept over the image, which reveals specific properties that depend on the type of kernel. In traditional image processing, kernels are defined to highlight features such as edges or to smooth images. For preprocessing, medical images often have to be aligned, for which registration methods can be employed.
