
6 Age Estimation

[Algorithm 4: Removal of sparse bone information. Input: Stack of masked knee MRIs]


Ultimately, a total of 18 model variants are trained and evaluated on the classification task, given the six selected ML classifiers (subsection 6.1.2) and the same three combinations of data as used for the regression task.

6.3 Method 3: CNN-MIXED

Method 3 is the last approach investigated in this work for the age estimation of young male adults. Like Method 2 (section 6.2), it is based on knee MRIs, and in addition it incorporates the numeric data from the subjects (AM and SKJ) into a new “multi-input and mixed data” CNN (Fig. 6.10). The data used by the multi-input CNN undergoes the data preparation (subsection 6.2.1) as well.

The CNN is designed as a neural network composed of two distinct branches, one for the image data and one for the combined numeric data acquired from the subjects.

The left branch is a copy of the CNN from Method 1 (Fig. 6.9), except for the last layer, which is replaced by a new FC-layer with five outputs. The right branch is a multi-layer perceptron (MLP). It has five input neurons to accept all the numeric data, i.e. weight, standing height, sitting height, LLL, and SKJ. At its core, the MLP has a single fully-connected hidden layer with ten neurons. Each neuron takes in the five inputs and applies a non-linearity using an ELU activation function. The hidden layer is followed by Dropout (p = 0.5) to counteract overfitting. The output layer is an FC-layer as well, with five neurons. Each of these neurons receives ten inputs from the hidden layer and generates one output by applying a further ELU activation function.

Figure 6.10: A “multi-input and mixed data” architecture to combine 2D knee MRIs (via the CNN-MRI branch) and numeric data (anthropometric measurements (AM) and score of the knee joint (SKJ), via the MLP branch) in one model
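The numeric-data branch described above can be sketched as a small module. This is a minimal illustration, not the thesis code; PyTorch is an assumption, as the text does not name the framework.

```python
# Sketch of the numeric-data (right) branch: five inputs (weight, standing
# height, sitting height, LLL, SKJ), one hidden FC-layer with ten neurons
# and ELU, Dropout with p = 0.5, and an FC output layer with five neurons
# followed by a further ELU.
import torch
import torch.nn as nn

class NumericBranch(nn.Module):
    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(5, 10)   # five numeric inputs -> ten neurons
        self.drop = nn.Dropout(p=0.5)    # regularization against overfitting
        self.out = nn.Linear(10, 5)      # five outputs, matching the CNN branch
        self.act = nn.ELU()

    def forward(self, x):
        x = self.act(self.hidden(x))
        x = self.drop(x)
        return self.act(self.out(x))

branch = NumericBranch()
features = branch(torch.randn(8, 5))     # batch of 8 subjects, 5 numeric values each
print(features.shape)                    # torch.Size([8, 5])
```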

The outputs of both branches are then concatenated and passed to a final FC-layer.

This layer uses the dense representation of the image and numeric data learned by both branches to regress the chronological age. A linear activation is used for this purpose. Ideally, the equal number of outputs of both branches leads to a balanced weighting of image and numeric data for age estimation.
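The concatenation-and-regression step can be sketched as follows. The CNN branch is stubbed with a stand-in module of the correct output size; only the wiring (two five-dimensional branch outputs, concatenated and passed to a linear FC head) reflects the description above.

```python
# Minimal sketch of the "multi-input and mixed data" idea: both branches emit
# five features, the concatenated ten features feed a final FC-layer with a
# linear activation that regresses chronological age.
import torch
import torch.nn as nn

class MixedModel(nn.Module):
    def __init__(self, cnn_branch, mlp_branch):
        super().__init__()
        self.cnn = cnn_branch            # image branch, 5 outputs
        self.mlp = mlp_branch            # numeric branch, 5 outputs
        self.head = nn.Linear(10, 1)     # linear activation -> age regression

    def forward(self, image, numeric):
        z = torch.cat([self.cnn(image), self.mlp(numeric)], dim=1)
        return self.head(z)              # shape (batch, 1): predicted age

# Stand-in branches with the correct output sizes, for illustration only
cnn_stub = nn.Sequential(nn.Flatten(), nn.LazyLinear(5))
mlp_stub = nn.Sequential(nn.Linear(5, 10), nn.ELU(), nn.Linear(10, 5), nn.ELU())
model = MixedModel(cnn_stub, mlp_stub)
age = model(torch.randn(4, 1, 32, 32), torch.randn(4, 5))
print(age.shape)  # torch.Size([4, 1])
```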

The hyperparameters for training are similar to the ones from Method 2 (subsection 6.2.3). The differences are in the learning rate, which is set to 0.0005, and the number of epochs, which is set to 100. In general, the optimization of the multi-input CNN proved to be challenging since both branches have to be adjusted by the optimizer. Convergence occurred early during training, which is why the number of epochs was limited to 100.
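A minimal sketch of this training configuration is shown below. The learning rate (0.0005) and epoch count (100) come from the text; the optimizer (Adam), the loss (L1/MAE), and the placeholder model and data are assumptions for illustration.

```python
# Hedged training-loop sketch: lr = 0.0005 and 100 epochs as stated above;
# optimizer and loss are assumed, not taken from the thesis.
import torch

model = torch.nn.Linear(10, 1)                        # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=0.0005)
loss_fn = torch.nn.L1Loss()                           # assumed regression loss
n_epochs = 100

x, y = torch.randn(16, 10), torch.randn(16, 1)        # dummy batch
for epoch in range(n_epochs):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)                       # mean absolute error
    loss.backward()
    optimizer.step()
```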

After the training of the CNN, the same steps as in Method 2 are performed for age regression and majority classification. The difference here is that AM and SKJ can no longer be used to train the ML algorithms. Moreover, numeric data is only available for subjects with an MRI examination in coronal slice orientation.

As a consequence, the sagittal MRIs are disregarded for this method. Finally, there are just two trainable model variants for Method 3, one for regression and one for classification.

6.4 Model Evaluation

The performance of all models from Method 1, Method 2, and Method 3 on age estimation is evaluated on the test set, i.e. the part of the data the model has never seen nor learned from. Furthermore, an unbiased estimate of the model performance is achieved with stratified k-fold cross-validation. For this work, k = 5 folds are generated by splitting the data into training, validation (only for Method 2 and Method 3), and test sets five times. Each time, the test set includes different subjects in order to obtain independent evaluations. Additionally, to enable a comparison between the three age estimation methods, the test set of a respective fold contains the same subjects, independent of the method chosen.

To attain a higher reliability of the estimate of the model performance, each fold is evaluated a total of ten times. This is necessary due to the stochastic nature of deep learning and most ML models: training a model several times on the same data will result in different predictions each time. The more robust the model, the smaller these differences will be. The final evaluation is therefore defined as an “extended” stratified 5-fold cross-validation. The stratification ensures that the age distribution is approximately equal across all sets and all folds.
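The extended stratified 5-fold scheme can be sketched with scikit-learn. The ages, the `evaluate` helper, and the seed are dummy placeholders, not values from the thesis; only the structure (five stratified folds, ten evaluations per fold) follows the text.

```python
# Sketch of "extended" stratified 5-fold cross-validation: five folds with
# approximately equal age distributions, each evaluated ten times to average
# out training stochasticity.
import numpy as np
from sklearn.model_selection import StratifiedKFold

ages = np.repeat(np.arange(14, 22), 13)[:100]   # dummy integer ages (stratification labels)

def evaluate(train_idx, test_idx, repeat):
    """Hypothetical placeholder for one training + test run returning a metric."""
    return float(repeat)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in skf.split(np.zeros(len(ages)), ages):
    for repeat in range(10):                    # ten evaluations per fold
        scores.append(evaluate(train_idx, test_idx, repeat))
print(len(scores))  # 50 evaluations in total (5 folds x 10 repeats)
```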

Age Regression

The following evaluation metrics are selected for the age regression task: MAE, standard deviation of the absolute error, maximum absolute error, and the percentage of samples within 1 year and 2 years of absolute error between the true and predicted chronological ages. The MAE is defined as:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|, \qquad (6.3)$$

where $y_i$ is the true chronological age of the $i$-th subject, $\hat{y}_i$ the corresponding prediction by the model, and $n$ the total number of subjects in the evaluation. As a reference for the existing variability, the values from a direct statistical evaluation of the training data (stat) are computed as $\hat{y}_i = \bar{y}$, where $\bar{y}$ is the mean age of the samples in the training set. Thus, stat merely predicts the mean age for all subjects in the training set.
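The regression metrics above, together with the stat baseline, can be computed in a few lines of NumPy. The ages below are dummy values for illustration; in the thesis, the stat baseline uses the mean age of the training set.

```python
# Age-regression metrics: MAE (Eq. 6.3), std of the absolute error, maximum
# absolute error, percentage within 1 and 2 years, and the "stat" baseline
# that predicts one mean age for every subject.
import numpy as np

y_true = np.array([16.2, 17.8, 19.1, 20.5, 18.0])   # dummy chronological ages
y_pred = np.array([16.9, 17.5, 18.2, 20.9, 18.8])   # dummy model predictions

abs_err = np.abs(y_true - y_pred)
mae = abs_err.mean()                                 # Eq. (6.3)
sd = abs_err.std()                                   # std of the absolute error
max_err = abs_err.max()                              # maximum absolute error
within_1y = (abs_err <= 1.0).mean() * 100            # % of samples within 1 year
within_2y = (abs_err <= 2.0).mean() * 100            # % of samples within 2 years

# stat baseline: predict the mean age for everyone (here: mean of the dummy set)
stat_pred = np.full_like(y_true, y_true.mean())
stat_mae = np.abs(y_true - stat_pred).mean()

print(round(mae, 2), round(stat_mae, 2))  # 0.62 1.18
```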