5.5 Experimental Estimation of Physical Activity in Firefighters

5.5.6 Discussion

during an exercise) strongly depends on the micro-climate. Moreover, early detection of critical temperature ranges is a useful metric for command and control in training and the field.

In other work, for similar reasons, only total acceleration (eq. 5.18) is used[146, 169]. In this work, body orientation (pitch and roll) was used in addition. Both the mean and the variance of pitch and roll contribute to deciding whether the participant is sitting, lying, or moving.

It can be assumed that this information is also useful for distinguishing various other activities. Consequently, future work should investigate the use of orientation-corrected acceleration measures. These could help to find features that generalize better because they contain more information than the flattened total acceleration signal.
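The contrast between total acceleration and orientation features can be illustrated with a minimal sketch. It assumes that eq. 5.18 denotes the Euclidean norm of the three accelerometer axes and uses a common tilt convention for pitch and roll; the axis assignment, the function names, and the sample values are illustrative assumptions, not the processing chain used in this work.

```python
import math
from statistics import mean, pvariance

def total_acceleration(ax, ay, az):
    """Magnitude of the acceleration vector (assumed form of eq. 5.18)."""
    return math.sqrt(ax * ax + ay * ay + az * az)

def pitch_roll(ax, ay, az):
    """Tilt angles in radians from a static accelerometer reading.

    The axis convention is an assumption; it depends on sensor mounting.
    """
    pitch = math.atan2(-ax, math.sqrt(ay * ay + az * az))
    roll = math.atan2(ay, az)
    return pitch, roll

def orientation_features(samples):
    """Mean and variance of pitch and roll over a window of (ax, ay, az)."""
    pitches, rolls = zip(*(pitch_roll(*s) for s in samples))
    return {
        "pitch_mean": mean(pitches),
        "pitch_var": pvariance(pitches),
        "roll_mean": mean(rolls),
        "roll_var": pvariance(rolls),
    }

# Two hypothetical static postures, in units of g.
upright = [(0.0, 0.0, 1.0)] * 4  # gravity along the z axis
lying = [(1.0, 0.0, 0.0)] * 4    # gravity along the x axis

upright_features = orientation_features(upright)
lying_features = orientation_features(lying)
```

In this example both postures yield the same total acceleration (1 g), while the mean pitch differs by 90 degrees. This is the information that is lost when only the magnitude is used.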

The experiment also revealed a systematic error if the PA estimation is based on acceleration data alone. Comparing both trials of the experiment, EE differed by 17.5±8.3 %. Likewise, HR and RR followed this trend (14.8±5.7 % and 23.8±6.3 %, respectively). In other words, the physiological markers reflect the increase in metabolic EE. In contrast, the acceleration signals of the two trials do not differ from each other. As a consequence, the percentage difference in RMSE between models based on accelerometer data alone and models based on physiological information alone is 16.4 % to 31.3 %. This highlights that accelerometer data alone lack information on intensity.

In summary, the results underline the necessity of physiological markers to reflect the dynamics or intensities of PA. Still, prediction relies on motion information to distinguish between sedentary and active behavior, as previously suggested by Altini et al.[14].

Furthermore, normalization for inter-individual differences turned out to be crucial. This concerns physiological markers such as HR, which is partly determined by genetic factors or individual CRF. Consequently, CRF should be added to every PA model because otherwise no predictions of the maximum metabolic EE can be made.

Moreover, slope features turned out to improve the estimations. In the model’s response, it could be seen that the slope acts as an offset to the estimation with respect to HR. That is, if HR rises (positive slope), EE is estimated higher. If HR decreases, the EE estimate is slightly lower. This is in agreement with the temporal characteristics of metabolic EE, which shows a certain inertia (sec. 2.1.3). Also, the model’s response shows a certain plateau as HR decreases. This is likely a reflection of the EPOC effect taking place during the cool-down phase at the end of the experiment.
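As a minimal sketch of such a slope feature, the following computes the least-squares slope of HR over a short window. The window length, the sampling interval `dt`, and the function name `window_slope` are assumptions made for illustration; the exact slope definition used in this work is not restated here.

```python
def window_slope(values, dt=1.0):
    """Least-squares slope of equally spaced samples (units per second).

    values: samples in a window, e.g. heart rate in bpm (hypothetical input)
    dt:     sampling interval in seconds (assumed constant)
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two samples are required")
    t = [i * dt for i in range(n)]
    t_mean = sum(t) / n
    v_mean = sum(values) / n
    num = sum((ti - t_mean) * (vi - v_mean) for ti, vi in zip(t, values))
    den = sum((ti - t_mean) ** 2 for ti in t)
    return num / den

# Hypothetical HR windows: rising, falling (cool-down), and steady.
rising = window_slope([100, 102, 104, 106])
falling = window_slope([150, 147, 144, 141])
steady = window_slope([120, 120, 120, 120])
```

A positive slope marks a rising HR and a negative slope a falling one, which is the sign information the model can use as an offset to the HR-based estimate.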

Having a closer look at the feature space, it can be seen that, due to the normalization layer, the use or adaptation of input features becomes more flexible. This can be illustrated by the examples of HR and RR. If one of the two values is missing, it can simply be replaced by (equated with) the other. This eliminates the compensating effect of having both inputs, but the model itself remains functional. The same applies to pitch, roll, and slope features. If these are fixed at their mean value (0.5), the prediction becomes less accurate, but no extreme outliers are to be expected.
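The substitution rule described above can be sketched as follows. The sketch assumes that all inputs are min-max normalized to the range 0 to 1; the normalization bounds, the input order, and the names `min_max` and `build_inputs` are hypothetical and only serve to illustrate the idea.

```python
NEUTRAL = 0.5  # mean value of a normalized feature, as stated in the text

def min_max(value, lo, hi):
    """Min-max normalization clipped to [0, 1] (assumed normalization)."""
    x = (value - lo) / (hi - lo)
    return min(1.0, max(0.0, x))

def build_inputs(hr=None, rr=None, pitch=None, roll=None, slope=None):
    """Assemble a normalized input vector, filling in missing features.

    A missing HR or RR is replaced by the other one; missing pitch, roll,
    or slope values are fixed at the neutral value 0.5.
    """
    if hr is None and rr is None:
        raise ValueError("at least one of HR or RR is required")
    hr_in = hr if hr is not None else rr
    rr_in = rr if rr is not None else hr
    rest = [NEUTRAL if v is None else v for v in (pitch, roll, slope)]
    return [hr_in, rr_in] + rest

# Hypothetical case: only HR is available (120 bpm on a 60 to 180 bpm scale).
hr_norm = min_max(120.0, 60.0, 180.0)
inputs = build_inputs(hr=hr_norm)
```

Because every substituted value lies inside the normalized range, the model is always evaluated within the region of the feature space it was trained on.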

Another vital aspect of the presented model arises from the nature of the utilized activation function. The ANN-based model uses sigmoid activation functions in the hidden layer (output layer is not activated, i.e. identity function is used). Because the sigmoid function is bounded, its use guarantees a fixed limit for the model’s response.

For this reason, even though the training sample is small and the model is thus prone to over-fitting, the model’s estimates can never drift towards an extreme. Given the weights learned for the data set used in this work, the estimation will always be in the range of 1.24 MET to 17.8 MET.
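The reason for the fixed limits can be made explicit with a small sketch. For a single hidden layer of sigmoid units and an identity output, the response is the output bias plus a weighted sum of values between 0 and 1. The lower limit is therefore the bias plus all negative output weights, and the upper limit is the bias plus all positive output weights. The weights below are hypothetical and are not the weights learned in this work, so the resulting range differs from the 1.24 MET to 17.8 MET stated above.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function with values in [0, 1]."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden sigmoid layer followed by a linear (identity) output."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return b_out + sum(w * h for w, h in zip(w_out, hidden))

def output_bounds(w_out, b_out):
    """Limits of the response, independent of the input values."""
    lower = b_out + sum(w for w in w_out if w < 0)
    upper = b_out + sum(w for w in w_out if w > 0)
    return lower, upper

# Hypothetical weights: two inputs, three hidden units.
W_HIDDEN = [[0.8, -0.3], [-1.2, 0.5], [0.4, 0.9]]
B_HIDDEN = [0.1, -0.2, 0.0]
W_OUT = [2.0, -1.0, 3.0]
B_OUT = 1.5

LOWER, UPPER = output_bounds(W_OUT, B_OUT)
```

Even for arbitrarily large or small inputs, `predict` stays between `LOWER` and `UPPER`, because each hidden unit saturates at 0 or 1.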

The model presented in this chapter shows close agreement (R² = 0.82, RMSE = 1.4; sec. 5.5.3.5) with the gold-standard reference (IC). Detailed analysis of the residuals, however, also revealed a certain variance. Therefore, it can be inferred that the model is not able to fully capture all dynamics of PA.

Nevertheless, the error across all participants, in terms of mean deviation, is small (−0.03 MET). Given the mean PA during the entire experiment (5.59 MET), this corresponds to an error of 0.5 % across all participants. At this point, however, it must be noted that this result applies to this very data set only (i.e. the sample under consideration), and does not necessarily hold for another sample population. Also, with this particular error metric (mean error), negative and positive errors cancel each other out.

Although the model’s response is close to perfect agreement with respect to the sample under consideration, MAPE across all participants is 11.0±9.5 %. This comparison across the different participants is a better estimate of the out-of-sample error than the Bland-Altman analysis. It must be acknowledged that this error is comparable in magnitude to that previously reported in other work (20±15 %[63]).
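The difference between the error metrics discussed above can be illustrated with a short sketch. The two estimate and reference values are made up for illustration; only the mean deviation (−0.03 MET) and the mean PA (5.59 MET) are taken from the text.

```python
import math

def mean_error(est, ref):
    """Signed mean deviation; positive and negative errors cancel."""
    return sum(e - r for e, r in zip(est, ref)) / len(ref)

def mape(est, ref):
    """Mean absolute percentage error in percent; errors do not cancel."""
    return 100.0 * sum(abs(e - r) / abs(r) for e, r in zip(est, ref)) / len(ref)

def rmse(est, ref):
    """Root mean squared error; always non-negative."""
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(ref))

# Hypothetical values in MET: one over-estimate and one under-estimate.
reference = [4.0, 6.0]
estimate = [5.0, 5.0]

me = mean_error(estimate, reference)
pct = mape(estimate, reference)
rms = rmse(estimate, reference)

# Relative mean deviation reported in the text: 0.03 MET of 5.59 MET.
relative_mean_deviation = abs(-0.03) / 5.59 * 100.0
```

Here the mean error is zero although each single estimate is off by 1 MET, whereas MAPE and RMSE remain clearly above zero. The relative mean deviation evaluates to roughly 0.5 %, matching the figure given above.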

Furthermore, it must be noted that it is not advisable to use the model in another context. The overall sample is too small and comes from a limited set of participants. Most importantly, no women participated in the experiment. The model is thus expected to show a bias error if it were applied to a new sample without re-training the weights. Nevertheless, it could be shown that the ANN-based model (and its hyper-parameters) is a qualified candidate for PA estimation in general. This is because the model provides accurate estimations of the mean PA in the long term (averaged across all participants).

The model was re-trained on a data set that was made publicly available by Gjoreski et al.[95]. It is found that the presented model does not cover all the dynamics in the ground truth data. Its variability is limited. Consequently, an inevitable bias error is observed. Nevertheless, on average, the estimated METs closely match the gold-standard ground truth from IC.

Here, it remains unclear whether the variation in the ground truth data comes from real metabolic changes or instead represents measurement uncertainty. It can be argued that, from a physiological point of view, the dynamics are exceptional. This is in agreement with considerations for the use of IC data[196]. There is no formal proof that variability recorded with IC does not reflect real physiological differences in EE. However, there is strong evidence that the variability is mainly measurement uncertainty.

In order to test this assumption of a noisy ground truth, the ground truth was smoothed using digital filters. This way, certain dynamics are removed, and the new deviation found is close to that reported by Gjoreski et al.[95] (RMSE: −0.004, MAE: 0.018) or Catal et al.[55] (RMSE: 0.089, MAE: −0.078).

Moreover, that result is achieved even without using the full feature set provided, using only HR, RR, and acceleration-peak count. Again, owing to the limited sample size, the model cannot be used in a generalized context. Yet, it becomes clear that the model’s entropic capacity is a good fit because it matches the task’s complexity. This is also consistent with recent findings presented by Lu et al.[146] (sec. 5.3).
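The effect of smoothing a noisy ground truth, as described above, can be sketched as follows. The text does not specify the digital filter used; a centered moving average is assumed here as one simple option, and the signal values are hypothetical.

```python
import math

def moving_average(signal, window=3):
    """Centered moving average; the window shrinks at the signal edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        segment = signal[lo:hi]
        smoothed.append(sum(segment) / len(segment))
    return smoothed

def rmse(est, ref):
    """Root mean squared error between estimate and reference."""
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(ref))

# Hypothetical signals in MET: a smooth model estimate and a reference
# that fluctuates around the same level from sample to sample.
estimate = [5.0] * 10
reference = [4.5, 5.5] * 5

rmse_raw = rmse(estimate, reference)
rmse_smoothed = rmse(estimate, moving_average(reference, window=3))
```

In this constructed case the deviation drops after smoothing because the fluctuations in the reference are removed while its mean level is preserved. Whether the removed dynamics are noise or real physiology cannot be decided by the filter itself.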