
5.2 OCT-Based Tissue Classification

5.2.1 Ophthalmology

In ophthalmology, typical tissue classification tasks include the detection of age-related macular degeneration (AMD), diabetic retinopathy (DR), or glaucoma. Most eye diseases affect the retina in some way, which is why retinal layer segmentation is also a common problem that is solved as a preprocessing step, followed by a more detailed examination [393].

Disease Classification. The first methods for OCT-based disease classification in ophthalmology relied on handcrafted feature extraction and conventional machine learning methods. Farsiu et al. obtained features by first segmenting the retina semi-automatically [134]. Then, features such as layer thickness and volume were calculated and used in a generalized linear model for disease classification. Sugmk et al. took a similar approach by using layer segmentation first, followed by the detection of drusen material [468]. Then, a binary classifier was applied for differentiating diabetic macular edema and AMD. Deng et al. relied on Gabor filter banks that undergo a nonlinear transformation as their features [102]. Then, several histograms were computed from the filtered regions using different scales and orientations and concatenated. Finally, the features were fed into random forests, SVMs, and FC-NNs for classifying the OCT images into AMD and normal cases. Another approach used unsupervised clustering to obtain a few descriptive image patches in a feature extraction stage [511]. These patches were used to build patch occurrence histograms for all training images, which were then used for supervised classification with a random forest. Wang et al. obtained features based on linear configuration patterns extracted from multi-scale OCT images, building feature pyramids [531]. Afterward, correlation-based feature selection was applied, followed by different classification models, including sequential minimal optimization, naive Bayes, SVMs, FC-NNs, and random forests. Other work focused more on diabetic macular edema in the context of DR [546]. For example, Alsaih et al. performed feature extraction with histograms of oriented gradients and local binary patterns, followed by principal component analysis for dimensionality reduction [16]. The authors used linear and kernel SVMs, as well as random forests, for classification. Lemaitre et al. took a similar approach, extracting features both from 3D OCT volumes and 2D patches [286].
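To illustrate this class of handcrafted pipelines, the sketch below computes a basic local binary pattern histogram, one of the texture descriptors used by Alsaih et al. [16], in plain NumPy. The descriptor variants, parameters, and classifiers in the cited works differ, so this is a conceptual sketch only; the resulting histogram would serve as the feature vector for an SVM or random forest.

```python
import numpy as np

def lbp_histogram(image):
    """Compute a basic 8-neighbor local binary pattern (LBP) histogram.

    Each interior pixel is compared against its 8 neighbors; a neighbor
    at least as bright as the center sets one bit of an 8-bit code. The
    normalized histogram of codes serves as a texture descriptor.
    """
    c = image[1:-1, 1:-1]  # center pixels
    # 8 neighbor shifts (dy, dx), clockwise from the top-left neighbor
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(c.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        neighbor = image[1 + dy:image.shape[0] - 1 + dy,
                         1 + dx:image.shape[1] - 1 + dx]
        codes |= ((neighbor >= c).astype(np.uint8) << bit)
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()  # normalize to a probability distribution

# Toy usage: a smooth gradient and a noisy patch yield distinct descriptors.
rng = np.random.default_rng(0)
smooth = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))
noisy = rng.random((32, 32))
f1, f2 = lbp_histogram(smooth), lbp_histogram(noisy)
```

In the cited pipelines, such descriptors were typically concatenated with other features (e.g., HOG) and compressed with PCA before classification.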

More recently, deep learning methods have been applied to OCT-based retinal disease classification. Ravenscroft et al. bridged the gap between handcrafted features and deep learning-based feature learning by using a CNN for feature extraction [398]. The learned feature maps were then transformed into feature vectors using histograms, followed by a typical classification stage with SVMs, FC-NNs, or random forests. Rasti et al. relied on an end-to-end approach using CNNs for AMD classification [397]. To improve performance, the authors combined CNNs with multiple input scale resolutions in an ensemble. A more recent approach by Rong et al. applied extensive data augmentation by generating multiple surrogate images using denoising and a masking process [408]. Performance was improved by averaging predictions over multiple surrogate images. Serener et al. distinguished dry and wet AMD using different standard models from the natural image domain, such as ResNet and AlexNet [437].

Very recently, Wu et al. proposed a modified CNN for the classification of several retinal diseases using an attention mechanism that focuses on critical regions within OCT slices [545]. Also, a preprocessing mechanism for noise reduction was employed before CNN processing. Wang et al. proposed a CNN architecture for AMD classification employing a DenseNet-like structure [521]. Additionally, features from multiple scales within the network are pooled into a global feature representation for the final classification. Das et al. followed a similar approach and also found that using features from multiple scales improves AMD classification performance [98]. Qiu et al. applied the idea of self-supervised iterative refinement training to AMD classification [390]. Here, during the CNN's training stage, a part of the training data is relabeled by the model, followed by training on the newly obtained labels. Saha et al. took a different approach to AMD classification by learning typical clinical biomarkers with a CNN instead of the typical disease labels [421]. An et al. explored transfer learning in the context of classifying wet and dry AMD, where they found that a model pretrained on typical AMD detection is well suited for the more fine-grained task [17]. A new approach by Yoo et al. proposed to combine OCT with color fundus imaging as a second modality for improved AMD classification [571]. The authors used a CNN for each modality to extract a feature vector that is then fed into a random forest classifier.

Overall, the number of deep learning approaches for retinal disease classification has been growing, particularly in very recent years. Conventional approaches considered very low-dimensional context, as they relied on a few features extracted from A-Scans. With the emergence of deep learning, almost all recent approaches perform 2D B-Scan-based processing. However, there are no studies considering a multi-dimensional perspective or comparing different data representations. Also, there are many different handcrafted architectures, and it is often unclear which architecture concept is preferable.

Retinal Layer Segmentation. Similar to early approaches for disease classification, earlier methods for retinal layer segmentation relied on conventional computer vision methods, often paired with classic machine learning. Koozekanani et al. addressed retinal boundary segmentation of 2D OCT images using edge detection and a Markov model [256]. Another approach relied on A-Scan-based processing, where the individual intensity profiles were thresholded to obtain a segmentation of four retinal layers [217]. Similarly, Shahidi et al. made use of intensity peaks within A-Scans to segment several retinal layers [440]. Lu et al. applied additional preprocessing and noise filtering before using edge detection for layer segmentation [316]. Cabrera et al. made use of a deformable model instead, which can also be used for the segmentation of fluid-filled regions and lesions around the retina [137]. The idea of higher-dimensional data processing was also introduced for classical methods by Garvin et al. [151]. Here, a graph was constructed based on detected edges in a 3D OCT volume, and an optimization process enforced surface smoothness. This was also applied to particularly difficult cases with serious pigment epithelial detachments [443]. A similar idea was pursued by Mishra et al., where the rough location of retinal layers is determined first, followed by a refinement stage [342]. Yazdanpanah et al. proposed an energy-minimizing active contour approach designed to deal particularly well with OCT images corrupted by high levels of noise [567]. The use of classical machine learning methods for retinal layer segmentation was proposed by Lang et al. [268]. Here, a random forest classifier was trained to predict each pixel's retinal layer class, followed by a boundary refinement algorithm to produce the final segmentation.
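The A-Scan-based idea behind the early intensity-profile methods [217, 440] can be sketched in a few lines: each image column (A-Scan) is treated as a 1D intensity profile, and its strongest local maxima are taken as layer boundaries. The synthetic B-Scan and the peak heuristics below are illustrative, not the cited authors' exact procedures.

```python
import numpy as np

def detect_boundaries(bscan, n_layers=2, min_separation=5):
    """Per A-Scan (column) boundary detection via intensity peaks.

    For each column, the `n_layers` strongest local maxima of the
    intensity profile are taken as layer boundary depths, a simplified
    version of classic peak-based retinal layer segmentation.
    """
    depth, width = bscan.shape
    boundaries = np.zeros((n_layers, width), dtype=int)
    for x in range(width):
        profile = bscan[:, x]
        # local maxima: strictly greater than both vertical neighbors
        is_peak = (profile[1:-1] > profile[:-2]) & (profile[1:-1] > profile[2:])
        peak_rows = np.nonzero(is_peak)[0] + 1
        # keep the strongest peaks, enforcing a minimum depth separation
        order = peak_rows[np.argsort(profile[peak_rows])[::-1]]
        chosen = []
        for r in order:
            if all(abs(r - c) >= min_separation for c in chosen):
                chosen.append(r)
            if len(chosen) == n_layers:
                break
        boundaries[:, x] = sorted(chosen)
    return boundaries

# Synthetic B-Scan: two bright horizontal layers at depths 20 and 40.
bscan = 0.05 * np.ones((64, 16))
bscan[20, :] = 1.0
bscan[40, :] = 0.8
bnd = detect_boundaries(bscan)
```

Real OCT data is far noisier, which is why the cited methods combined such peak detection with denoising, smoothness constraints, or graph-based optimization.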

The emergence of deep learning methods in the medical image domain also led to applications for OCT-based retinal layer segmentation. One of the first approaches to employ deep learning was presented by Fang et al. [131]. In this work, CNNs were fused with a graph search method. First, retinal boundaries were predicted by a 2D patch-based CNN predicting the layer class of each patch's center pixel. Second, a graph search constructed the final segmentation using the CNN's predicted probabilities. Ben-Cohen et al. also employed CNNs for retinal layer segmentation [40]. A key difference from Fang et al. is the prediction of an entire segmentation map instead of patch-wise processing. In particular, Ben-Cohen et al. showed that predicting segmentation maps outperforms the patch-based approach. Roy et al. proposed a CNN with an encoder-decoder structure for segmenting 2D OCT B-Scans into nine retinal layers [416]. The authors demonstrated significantly improved performance both over conventional segmentation methods and early deep learning-based methods. Schlegl et al. used a similar CNN architecture for segmenting macular fluid regions within 2D OCT images [427]. Shah et al. also employed CNNs for retinal layer segmentation; however, they proposed a different model output scheme [439]. Instead of predicting a segmentation map, the authors' model outputs a vector of layer positions, which produced more plausible results than a standard encoder-decoder model. A study by De Fauw et al. demonstrated the clinical applicability of deep learning-based OCT image analysis on a very large dataset [100]. First, a CNN is used for layer segmentation. Here, a 2.5D approach is employed where several neighboring 2D OCT slices are fed into the CNN, and the central 2D slice is segmented at the model output. Afterward, a classification model performs a disease diagnosis and outputs a treatment recommendation. Another approach incorporated uncertainty estimates into a CNN model by making use of the concept of Bayesian deep learning [435]. In terms of implementation, the authors rely on Monte Carlo dropout [143] for uncertainty quantification.
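The Monte Carlo dropout idea [143] is simple enough to sketch without a deep learning framework: dropout is kept active at test time, the forward pass is repeated many times, and the spread of the predictions serves as an uncertainty estimate. The toy one-layer model and its weights below are purely illustrative, not the cited architecture.

```python
import numpy as np

rng = np.random.default_rng(42)

# A toy one-layer "network" with fixed (hypothetical) weights:
# 10 input features mapped to 3 class logits.
W = rng.normal(size=(10, 3))

def mc_dropout_predict(x, T=200, p_drop=0.5):
    """Monte Carlo dropout: keep dropout active at inference.

    The input features are randomly masked T times; the mean of the
    resulting softmax outputs is the prediction, and their standard
    deviation serves as a per-class uncertainty estimate.
    """
    preds = []
    for _ in range(T):
        mask = rng.random(x.shape) >= p_drop      # Bernoulli keep-mask
        logits = (x * mask / (1 - p_drop)) @ W    # inverted dropout scaling
        e = np.exp(logits - logits.max())
        preds.append(e / e.sum())                 # softmax probabilities
    preds = np.stack(preds)
    return preds.mean(axis=0), preds.std(axis=0)

x = rng.normal(size=10)
mean_prob, uncertainty = mc_dropout_predict(x)
```

In a segmentation CNN, the same procedure is applied per pixel, yielding an uncertainty map alongside the segmentation.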

Very recent methods for retinal layer segmentation have introduced more incremental improvements to the overall task. Orlando et al. focused on segmenting the photoreceptor layer in particular [366]. The authors also relied on Monte Carlo dropout for uncertainty estimation. Masood et al. compared several conventional segmentation approaches to deep learning for choroid layer segmentation [325]. While the deep learning model did not come with any methodological innovation, a substantial performance improvement over conventional methods was demonstrated. Matovinovic et al. differentiated themselves from other work by making use of transfer learning for 2D OCT image segmentation [326]. The authors used a CNN pretrained on the ImageNet dataset as the model's encoder. This approach outperformed an encoder with standard random weight initialization. Ngo et al. took a very different approach, going back to classic feature extraction followed by an FC-NN that regresses the retinal boundaries [357]. The authors argue that this approach is more data-efficient and robust to noise than conventional deep learning approaches, and they show that it outperforms earlier encoder-decoder models for retinal layer segmentation. Another recent approach performed joint prediction of segmentation maps and boundary regression for improved performance over several state-of-the-art methods [195].
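One way boundary regression can be coupled to dense per-pixel outputs, in the spirit of joint approaches such as [195] and the boundary-vector output of Shah et al. [439], is a column-wise soft-argmax: per-pixel boundary scores are converted into differentiable, sub-pixel boundary positions that can be trained with a regression loss alongside a standard segmentation loss. The formulation below is an illustrative sketch, not the cited architectures.

```python
import numpy as np

def soft_argmax_boundary(logits):
    """Column-wise soft-argmax over boundary logits.

    `logits` has shape (depth, width): per A-Scan scores for where a
    layer boundary lies. The softmax-weighted mean of the row indices
    yields a differentiable, sub-pixel boundary position per column.
    """
    depth, _ = logits.shape
    e = np.exp(logits - logits.max(axis=0, keepdims=True))
    prob = e / e.sum(axis=0, keepdims=True)   # per-column softmax
    rows = np.arange(depth)[:, None]          # shape (depth, 1)
    return (prob * rows).sum(axis=0)          # expected row index per column

# Sharp logits peaked at row 12 in every column recover that boundary.
logits = np.full((32, 8), -10.0)
logits[12, :] = 10.0
boundary = soft_argmax_boundary(logits)
```

Because every step is differentiable, a regression loss on `boundary` can backpropagate into the same feature maps that drive the segmentation head.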

All in all, for retinal layer segmentation, high-dimensional data processing was adopted early, as classic methods considered 2D and 3D context to extract smooth layer boundaries. However, the deep learning approaches that followed did not make use of this concept. Most methods for retinal layer segmentation use 2D OCT slices as their model input, with few exceptions utilizing 2.5D or 3D by incorporating neighboring slices. A lot of effort is put into architecture design and the proposition of novel architectures. Similar to eye disease classification, different data representations are usually not compared, and the role of data dimensionality is not addressed. In this thesis, extending our previous work [166], we address retinal layer segmentation from a multi-dimensional perspective. In particular, we try to overcome the handcrafting of architectures, which has been used extensively both for retinal layer segmentation and disease classification, by introducing an efficient NAS approach for ophthalmic OCT data.

Tab. 5.2: Overview of related work on OCT-based retinal disease classification and layer segmentation with deep learning methods. All methods employed 2D OCT B-Scans or a few stacked neighboring slices (2.5D). Method distinction is largely based on model variations. Seg. refers to segmentation, and Reg. refers to regression.

Reference | Application | DL Method
Ravenscroft et al. (2017) [398] | AMD | CNN & RF/SVM
Rasti et al. (2017) [397] | AMD | CNN Ensemble
Rong et al. (2018) [408] | AMD | CNN & Surrogate Images
Serener et al. (2019) [437] | Wet/Dry AMD | Standard CNNs
Wu et al. (2020) [545] | AMD | CNN & Attention
Wang et al. (2019) [521] | AMD | CNN & Multi-Scale
Das et al. (2019) [98] | AMD | CNN & Multi-Scale
Qiu et al. (2019) [390] | AMD | CNN & Self-Supervision
Saha et al. (2019) [421] | AMD Biomarkers | CNN
An et al. (2019) [17] | Wet/Dry AMD | CNN & Transfer Learning
Yoo et al. (2019) [571] | AMD | 2 CNNs & RF
Fang et al. (2017) [131] | Retina | CNN & Graph Search
Ben-Cohen et al. (2017) [40] | Retina | Enc-Dec CNN
Roy et al. (2017) [416] | Retina | Enc-Dec CNN
Schlegl et al. (2018) [427] | Macular Fluid | Enc-Dec CNN
Shah et al. (2018) [439] | Retina | CNN Boundary Reg.
De Fauw et al. (2018) [100] | Retina & AMD | Two-Stage CNN
Sedai et al. (2018) [435] | Retina | Bayesian CNN
Orlando et al. (2019) [366] | Photoreceptor | Bayesian CNN
Masood et al. (2019) [325] | Choroid Layer | CNN
Matovinovic et al. (2019) [326] | Retina | CNN & Transfer Learning
Ngo et al. (2019) [357] | Retina | FC-NN
He et al. (2019) [195] | Retina | CNN Seg. & Reg.

A summary of deep learning approaches for retinal disease classification and layer segmentation is given in Table 5.2.