
Selection, validation, and clinical application of digital health metrics for the assessment of upper limb sensorimotor impairments


Academic year: 2022


Research Collection

Doctoral Thesis

Selection, validation, and clinical application of digital health metrics for the assessment of upper limb sensorimotor impairments

Author(s):

Kanzler, Christoph M.

Publication Date:

2020-06

Permanent Link:

https://doi.org/10.3929/ethz-b-000419442

Rights / License:

In Copyright - Non-Commercial Use Permitted

This page was generated automatically upon download from the ETH Zurich Research Collection. For more information please consult the Terms of use.


DISS. ETH NO. 26758

Selection, validation, and clinical application of digital health metrics for the assessment of upper limb sensorimotor impairments

A thesis submitted to attain the degree of
DOCTOR OF SCIENCES
(Dr. sc. ETH Zurich)

presented by
Christoph Matthias Kanzler
M.Sc. in Medical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
born on 04.07.1993
citizen of Germany

accepted on the recommendation of
Dr. Roger Gassert (examiner)
Dr. Olivier Lambercy (co-examiner)
Dr. Peter Feys (co-examiner)
Dr. Paolo Bonato (co-examiner)

2020


Acknowledgements

This thesis would not have been possible without the incredible and continuous support from students, collaborators, colleagues, friends, and family.

First and foremost though, I would like to express my deepest gratitude to my mentors Olivier Lambercy and Roger Gassert. When arriving 10 minutes late during my initial job interviews in Zurich, I would never have imagined that I would have the chance to experience such a fulfilling and fascinating scientific and personal adventure. I am extremely thankful for the opportunity you gave me, for the inspiration during our coffee breaks and meetings, and your advice and guidance during personal and scientific challenges. Thanks a lot for impressively caring about the people around you and for allowing me to learn from your ethical and personal values, your scientific and technical expertise, and your extremely thorough and critical way of thinking and working. I could not have wished for a more flourishing environment to pursue my PhD and look forward to more shared experiences, scientific and personal, in the future.

A special thanks also goes to Paolo Bonato. While not being directly involved in my PhD project, his mentoring during my time spent in his laboratory built the foundation for the success of my doctorate, by teaching me early on fundamental technical and personal skills as well as meticulous scientific practices. Paolo continued serving me with scientific and career advice, and truly hilarious moments during our many video conferences, for which I would like to express my thankfulness.

Further, I would like to thank my students Kathrin Studer, Sofia Martinez Gomez, Sascha Motazedi Tabrizi, Nadine Wehrle, Sarah Curry, Bruna Azevedo da Cunha Andrade, Janick Sidler, Leo Simovic, and Pietro Oldrati for their tremendous contributions to the project and the enjoyable hours spent together.

This thesis was a highly inter-disciplinary endeavour and was enabled through fruitful collaborations with multiple clinical and technical partners. A big thank you goes to Anne Schwarz and Janne Veerbeek, for sharing the pain of scanning 6129 scientific papers and transforming this frustration into a hopefully intriguing systematic review. I would like to further thank Jeremia Held and Andreas Luft for their continuing support during our collaborative clinical study. Moreover, I would like to emphasize the contribution of Ilse Lamers and Peter Feys, who always provided critical clinical input and impressed with their thorough and reliable way of working, for which I would like to thank both of you. Lastly, a big thank you to Giuseppe Averta, Cristina Piazza, Matteo Bianchi, Manuel Giuseppe Catalano, and Antonio Bicchi for our innovative collaboration, novel technical insights, and the delicious and joyful dinners in Italy. Also, this thesis would have way less fancy graphics without the creative expertise of Stefan Schneller - your contribution is very much appreciated.

One of the biggest joys during my doctorate was the time spent in the RELab universe with all colleagues and friends. Our extensive coffee breaks, both childish and philosophical discussions over lunch, and social as well as sport events made my doctorate a genuine pleasure.

My deepest appreciation goes to Bützer Senpai, who replenished my childlike joy for snowboarding, shared many traveling experiences with me, and, I’m sure, will continue staying a true friend in the future. Same goes for Julio Dueñas, as many of our lively moments have already entered our personal history books. Further, a special thanks goes to Dominik Wyser, for helping me to have a smooth start in the lab and sharing ups and downs over the years. A big thank you also to "Gymbro" Stefan, Werner, "Kiwi" Chris, Victoria, Mike, Gunda, Samara, Raffaele, Franziska, Volker, Camila, Alejandro, Gustl, Fabian, Matteo, and Maya for the great memories. Thank you also to the new RELab generation, including the Balgrist Boys (Jan & Jan), Monika, Jessica, and Giada, for bringing a new spirit and energy into the campus.

A further thank you goes to Xiaozhou, Tamara, and my flatmate Lara for your enduring friendship and support. A big thanks also to my perlimpinpins, Aymeric, Barja, Camille, and Francois: you are the most energetic and crazy people I know and it’s a pleasure to call you my friends and spend time with you. My appreciation also goes to Chris, for the very enjoyable car rides, the many culinary endeavours, and for bringing Franconia a small step closer to Zurich. A huge thank you to all my friends at home, and especially Susl, Phips, and Maxi, for being the port in the storm, for sharing unforgettable moments together, and for enduring me during my overly energetic periods.

Most importantly, I wish to deeply thank my family, my brothers, my parents, and my grandparents, for your enduring and unconditional support and love, and for always having an open ear for me in challenging and in joyful moments. This thesis is dedicated to you.

Zurich, 24th of March 2020 C. M. K.


This project received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 688857 (SoftPro).


Abstract

Assessments of impaired body functions, such as upper limb sensorimotor impairments in neurological disorders, and their functional impact are a fundamental part of the modern healthcare system. Specifically, assessments are essential to shed light on the often unknown mechanisms underlying impairments and their temporal evolution, to individualize and compare therapeutic interventions, and to provide documentation for insurances justifying therapy. Unfortunately, conventional assessments commonly applied in research settings are not sensitive enough to detect subtle physiological changes and to allow an unbiased modeling of impairments and recovery. Further, conventional assessments are often seen as a time-consuming burden in clinical practice, where instead physical examination by clinicians still remains the main mode of evaluation.

Digital health metrics promise to provide complementary endpoints for research through objective, sensitive, and traceable descriptions of human behaviour. This promises to enable a more accurate modeling and prediction of impairments and recovery, and to detect subtle deficits that might not be captured by the eye of the trained clinician. This might provide compelling arguments for the integration of assessments into daily clinical care.

However, in reality, the integration of digital health metrics into clinical research and practice remains limited, partly because the research community does not emphasize the selection and validation of digital health metrics. This is evident from the undefined state of the art on the evaluation of metrics and missing methodologies allowing a transparent and robust selection of clinically relevant metrics. Also, when evaluating upper limb sensorimotor impairments in neurorehabilitation, existing technology-aided approaches focus on characterizing impairments during isolated arm movements without functional context, thereby questioning the functional relevance of the assessment. Similarly, grasping abilities, which are essential when performing daily life activities, are often not considered in the assessment.

This work aimed to advance the selection, validation, and clinical application of digital health metrics for assessing upper limb sensorimotor impairments. In addition, it strived to establish the metrics from the Virtual Peg Insertion Test (VPIT) as complementary endpoints for clinical trials. The VPIT is a technology-aided assessment developed in 2010 at the Rehabilitation Engineering Laboratory of ETH Zurich and promises the time-efficient evaluation of functionally relevant sensorimotor impairments in arm and hand.

For this purpose, a large-scale systematic review was performed defining the state of the art on digital health metrics for assessing upper limb sensorimotor impairments. This revealed that a large variety of over 150 different metrics are used in the literature and that most metrics are only poorly evaluated according to their clinimetric properties, thereby questioning their responsiveness and robustness. Further, the review revealed a strong dependency of the clinimetric properties on the assessment context (task, measurement device, target population), thereby highlighting the need for context-specific procedures allowing an automated selection of optimal digital health metrics.

To address this, a novel task-specific and automated selection and validation framework for digital health metrics was defined. The framework builds on a use-case specific pathophysiological motivation of digital health metrics to represent clinically relevant impairments, models the influence of confounds from participant demographics, and evaluates the most important clinimetric properties (discriminant validity, structural validity, test-retest reliability, measurement error, learning effects). Applied to 77 kinematic and kinetic metrics collected with the VPIT in 120 unaffected and 89 neurological individuals, the framework made it possible to robustly discard metrics without information about the targeted impairments, leading to the selection of 10 core metrics. These assessed the severity of multiple sensorimotor impairments in a valid, reliable, and informative manner while being least susceptible to measurement error and learning effects. These results suggest that the framework provides a transparent, step-by-step, and weakly-supervised selection procedure based on clinically relevant evidence. Hence, this framework creates an interesting alternative to subjective consensus-based approaches and to machine learning-based algorithms that typically act as black boxes and do not consider clinically relevant evidence.

Building upon these core metrics, the concept that the VPIT captures functionally relevant impairments was evaluated in 30 post-stroke individuals. Indeed, significant correlations between conventional scales of activity limitations and the VPIT metrics were found, which indicated their functional relevance for tasks involving specific goal-directed arm and hand manipulations. Further, the robustness of three out of the ten VPIT core metrics was confirmed for this specific neurological subpopulation. Also, correlations with clinical impairment scales suggest the sensitivity of the metrics to spasticity and pathological joint coupling. This establishes the VPIT metrics as endpoints for clinical trials in post-stroke individuals and promises to optimize the functional benefits of neurorehabilitation.

Further, the feasibility of predicting neurorehabilitation outcomes in 11 persons with multiple sclerosis based on clinical data, digital health metrics, and machine learning was explored. By relying only on clinical routine data collected before the intervention, it was possible to predict the presence of considerable improvements in gross motor and manipulation abilities. Intriguingly, the digital health metrics from the VPIT were required to accurately predict the presence of improvements in fine manual dexterity. This emphasizes the potential of digital health metrics for more sensitive and fine-grained assessments than conventional scales.

Overall, this thesis made a strong contribution to increasing the clinical acceptance of digital health metrics and allowed establishing the VPIT metrics as robust secondary endpoints for clinical trials. Reaching beyond this thesis, further studies were initiated that attempt to exploit the untapped potential of digital health metrics and the VPIT in clinical practice.


Summary

Assessments of physical impairments and their impact on everyday activities play a fundamental role in the modern healthcare system, and especially in the neurorehabilitation of persons with sensorimotor impairments of the upper limbs. These assessments are important for better understanding the neural mechanisms of recovery from such impairments, for comparing therapies more precisely and tailoring them to the individual, and for providing documentation for insurers. However, the assessments used in current clinical research are not sensitive enough to measure physiological changes precisely or to enable an accurate prediction of therapy outcomes. In clinical practice, assessments are mostly not used, because they are time-consuming and often offer no visible added value to clinical staff.

Digital health metrics have the potential to provide new, complementary clinical endpoints, as they enable objective and sensitive measurements of human behaviour. Such metrics promise a more accurate modeling and prediction of therapy progress and the detection of subtle impairments that even experienced clinicians may not perceive. As a result, they could provide arguments for using assessments in daily clinical practice.

In reality, however, digital health metrics are not yet widespread. One reason is that the research community places too little focus on the selection and validation of such novel metrics. This is reflected in the currently unknown state of the art on the evaluation of digital health metrics and in the lack of methods enabling an automated, transparent, and robust selection of these metrics. In addition, the digital measurement of upper limb sensorimotor impairments often focuses on isolated arm movements without considering the hand or a functional context. As a consequence, the relevance of these assessments to everyday activities must be called into question.

This doctoral thesis aims to improve the selection, validation, and clinical application of digital health metrics, above all in the context of upper limb sensorimotor impairments. The work further builds on a technology-based assessment, the Virtual Peg Insertion Test (VPIT), which was introduced at the Rehabilitation Engineering Laboratory of ETH Zurich in 2010. The test promises to measure sensorimotor impairments in arm and hand that are relevant for everyday activities.


Therefore, a systematic review was first conducted on the state of the art in evaluating digital health metrics for measuring sensorimotor impairments of the upper limbs. It showed that more than 150 different metrics are used in research and that the majority of these metrics are only insufficiently validated. It also revealed that these metrics are strongly context-dependent, so that a separate validation is needed for every behavioural task, every measurement system, and every patient population. This highlights the need for new methods that enable a context-specific and automated selection and validation of digital health metrics.

For this reason, such a new algorithm was developed in this work. It is based on a context-specific, pathophysiological motivation of the digital health metrics, a modeling of potential demographic confounds, and the evaluation of the most important statistical properties of the metrics (discriminant and structural validity, repeatability, measurement error, learning effects). This approach was applied to 77 kinematic and kinetic health metrics recorded with the VPIT in 120 healthy and 89 neurological individuals. Metrics without clinically relevant information could thus be discarded, leading to the selection of 10 digital health metrics with optimal statistical properties. These metrics made it possible to describe the severity of several upper limb sensorimotor impairments in a robust and informative manner. These results suggest that the developed algorithm enables a transparent, step-by-step, and data-driven selection of digital health metrics. This offers researchers a new alternative, since existing approaches are often subjective and consensus-based, while existing algorithms are opaque and do not take any clinically relevant criteria into account.

Based on these 10 digital health metrics of the VPIT, a study with 30 persons after stroke was conducted to examine whether the sensorimotor impairments described by the VPIT are indeed related to everyday activities. A correlation analysis showed that the sensorimotor impairments described by the VPIT digital health metrics are relevant for activities that involve goal-directed movements and manipulations of the arms and hands. In addition, the statistical robustness of three of the digital health metrics was shown in this specific neurological subpopulation. Moreover, the metrics were able to measure the severity of spasticity and pathological joint synergies. This establishes the digital health metrics as secondary endpoints for clinical research and promises to optimize the benefit of neurorehabilitation for everyday activities involving the upper limbs. Furthermore, the feasibility of modeling and predicting therapy outcomes in 11 persons with multiple sclerosis, based on clinical data, digital health metrics, and machine learning, was shown. Using data recorded before the intervention, the presence of changes in the ability to perform gross movements and manipulations could be predicted. The digital health metrics were needed above all for predicting the presence of changes in fine manipulation activities. This underlines the potential of digital health metrics for a more sensitive and precise assessment of physical impairments.

In conclusion, this work makes a significant contribution to the clinical acceptance of digital health metrics. In addition, this dissertation makes it possible to provide the VPIT digital health metrics as secondary clinical endpoints in neurorehabilitation. Beyond this work, further clinical studies were initiated that will explore the untapped potential of digital health metrics in clinical practice.


Contents

Acknowledgements
Abstract (English/Deutsch)
List of figures
List of tables
Preface

1 Introduction
1.1 Dexterous sensorimotor control of arm and hand as a fundamental feature of human behaviour
1.2 Neurological disorders can impair upper limb sensorimotor control
1.3 Neurorehabilitation and the importance of assessments
1.4 The limitations of conventional assessments
1.5 Digital health metrics promise more fine-grained and sensitive assessments
1.6 Technology-aided assessment platforms and the Virtual Peg Insertion Test
1.7 Aims of this thesis

I Digital health metrics for assessing upper limb sensorimotor impairments

2 Systematic review on kinematic assessments of upper limb movements
2.1 Abstract
2.2 Introduction
2.3 Methods
2.4 Results
2.5 Discussion
2.6 Conclusions

3 A framework for selecting digital health metrics: use-case VPIT
3.1 Abstract
3.2 Introduction
3.3 Results
3.4 Discussion
3.5 Methods

II Clinical application of the VPIT digital health metrics

4 Assessment of functionally relevant sensorimotor impairments post-stroke
4.1 Abstract
4.2 Introduction
4.3 Methods
4.4 Results
4.5 Discussion
4.6 Conclusions

5 Personalized prediction of rehabilitation outcomes in multiple sclerosis
5.1 Abstract
5.2 Introduction
5.3 Methods
5.4 Results
5.5 Discussion
5.6 Conclusions

III General discussion

6 Exploring novel avenues for the VPIT
6.1 Reducing intra-subject variability in the VPIT
6.2 Towards the assessment of individuals with severe sensorimotor impairments
6.3 Expanding the target population of the VPIT to upper limb prosthesis users

7 Synthesis, thesis contributions, and dissemination
7.1 Synthesis and thesis contributions
7.2 Dissemination

8 Outlook and closing
8.1 Digital health metrics for assessing upper limb sensorimotor impairments
8.2 Clinical application of the VPIT digital health metrics
8.3 Closing

A Appendix: Systematic review on kinematic assessments of upper limb movements after stroke

B Appendix: A framework for selecting digital health metrics: use-case VPIT
B.1 Supplementary Analysis
B.2 Supplementary Methods
B.3 Supplementary Results

C Appendix: Functionally relevant assessment of upper limb sensorimotor impairments post-stroke

D Appendix: Personalized prediction of rehabilitation outcomes in multiple sclerosis

Bibliography


List of figures

1.1 The Virtual Peg Insertion Test (VPIT).
2.1 PRISMA flow diagram of the systematic literature search.
2.2 Overview of tasks used for assessing upper limb kinematics post-stroke.
2.3 Overview of the usage of kinematic metrics and their clinimetric properties for 2D pointing tasks.
2.4 Overview of the usage of kinematic metrics and their clinimetric properties for 2D drawing tasks.
2.5 Overview of the usage of kinematic metrics and their clinimetric properties for the 3D pointing tasks.
2.6 Overview of the usage of kinematic metrics and their clinimetric properties for 3D reach-to-grasp tasks.
3.1 Overview of the data-driven framework.
3.2 Data-driven selection and validation of metrics: example of task completion time.
3.3 Partial correlation analysis.
3.4 Sensitivity of metrics to disability severity in stroke subjects.
3.5 Sensitivity of metrics to disability severity in MS subjects.
3.6 Sensitivity of metrics to disability severity in ARSACS subjects.
4.1 Example correlations between impairments (VPIT, Fugl-Meyer Upper Extremity) and activity limitations (Box and Block Test).
4.2 Clinimetric evaluation of the VPIT metrics: example log jerk transport.
5.1 Approach for prediction of neurorehabilitation outcomes in persons with multiple sclerosis.
5.2 Exemplary machine learning model predicting neurorehabilitation outcomes using pre-intervention data.
6.1 The Physical Peg Insertion Test (PPIT).
6.2 The Virtual Peg Insertion Test (VPIT) in combination with a passive arm weight support device.
6.3 Application of the Virtual Peg Insertion Test (VPIT) in upper limb prosthesis users.
8.1 Visual representation of the VPIT core metrics.
B.1 Selection of parameter lambda for the LASSO.
B.2 Temporal segmentation of kinematic and kinetic trajectories.
C.1 Test and retest scores for the sensor-based metrics of the VPIT.
C.2 Bland-Altman plots for the sensor-based metrics of the VPIT.
C.3 Learning effects in the VPIT metrics for the most affected side.
C.4 Learning effects in the VPIT metrics for the less affected side.
C.5 Intra-subject variability of the VPIT metrics.


List of tables

2.1 Recommendations for kinematic upper limb assessments post-stroke.
3.1 Demographics and clinical characteristics of the study population.
3.2 Results for the data-driven selection of kinematic metrics.
3.3 Results for the data-driven selection of kinetic metrics.
3.4 Structural validity: exploratory factor analysis.
4.1 Characterization of impairments and activity limitations.
4.2 Correlation between conventional scales and VPIT metrics for the most affected side.
4.3 Test-retest reliability: intra-class correlation (ICC) coefficients and smallest real differences (SRD).
5.1 Clinical information on persons with multiple sclerosis.
5.2 Predicting intervention outcomes using pre-intervention data and decision tree models.
A.1 Characteristics of all included studies of the systematic review.
A.2 Risk of bias for the investigated clinimetric properties.
B.1 Results for the metric selection using LASSO and comparison with their clinimetric properties.
B.2 Results for the metric selection using SLIM and comparison with their clinimetric properties.
B.3 Results for the metric selection using the random forest and comparison with their clinimetric properties.
B.4 Influence of potential confounds on each sensor-based metric.
C.1 Demographic and clinical information for all post-stroke subjects.
C.2 Confidence intervals for correlation between conventional scales and VPIT metrics for the most affected side.
C.3 Quantification of learning effects.
D.1 Predicting intervention outcomes using data collected pre-intervention and a linear regression model.
D.2 Predicting intervention outcomes using data collected pre-intervention and a k-nearest neighbor model.
D.3 Predicting intervention outcomes using data collected pre-intervention and random forest models.


Preface

This thesis reports on the selection, validation, and clinical application of digital health metrics for the assessment of upper limb sensorimotor impairments, with a special emphasis on the Virtual Peg Insertion Test, a technology-aided platform for evaluating upper limb sensorimotor impairments in clinical environments. To provide the reader with sufficient background to understand the importance of assessing upper limb sensorimotor impairments with digital health metrics, this thesis starts with sketching the neuroscientific mechanisms underlying the control of human movements for both the physiological and the pathological case. Subsequently, the reader is introduced to basic neurorehabilitation and assessment concepts that are currently applied in clinical research and practice. The caveats of these assessments motivate the need for more sensitive and robust assessments, in the form of digital health metrics, that are expected to positively shape the mechanistic understanding of sensorimotor recovery and can help to lay out the foundation for improving neurorehabilitation outcomes.


1 Introduction


1.1 Dexterous sensorimotor control of arm and hand as a fundamental feature of human behaviour

The smooth, precise, and well-coordinated sensorimotor control of the human arm and hand had a fundamental role in human evolution, providing the basis for the development of simple and sophisticated tools and our unique ability to ignite fires. Also in modern times, the self-determination and quality of life of humans rely on our ability to engage the upper limb in everyday life activities (Yozbatiran et al., 2006). The main organ orchestrating actions of the upper limb is the central nervous system (CNS), which can be structured into the three components spinal cord, brainstem, and cortex (Scott, 2004). These are able to seamlessly control the high-dimensional human musculoskeletal apparatus and elicit smooth and coordinated goal-directed movements.

Over the last decades, a thorough neuroscientific understanding of these control mechanisms was achieved by observing and explaining specific behavioural landmarks during upper limb movements. One of them is the speed-accuracy tradeoff, which states that individuals can selectively emphasize either the speed or the accuracy of a movement, and that an increase in one is accompanied by a decrease in the other (Fitts and Peterson, 1964; Bogacz et al., 2010). Also, the velocity profiles of goal-directed movements were found to be consistently bell-shaped, independent of the actuated joint as well as movement distance and direction (Morasso, 1981; Flash and Hogan, 1985). When manipulating and transporting objects, a tight predictive temporal coupling between changes in grip force and load force, the latter described as the vertical force resulting from lifting an object against gravity, has been identified as a behavioural landmark (Johansson and Westling, 1988; Flanagan and Wing, 1993; Flanagan and Tresilian, 1994).
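Two of these landmarks have compact and widely used mathematical descriptions, which are sketched here in Python purely for illustration. The first function follows the index-of-difficulty formulation of Fitts' law for the speed-accuracy tradeoff, and the second evaluates the velocity of a minimum-jerk trajectory, one common model of bell-shaped velocity profiles. The function names and the coefficients a and b are assumptions chosen for this example and are not values reported in this thesis.

```python
import math


def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time (s) according to Fitts' law.

    distance: distance to the target centre, width: target width (same unit).
    a (s) and b (s/bit) are illustrative placeholders, not fitted values.
    """
    index_of_difficulty = math.log2(2.0 * distance / width)  # in bits
    return a + b * index_of_difficulty


def min_jerk_velocity(t, distance, duration):
    """Velocity of a minimum-jerk point-to-point movement at time t.

    The position follows distance * (10*tau^3 - 15*tau^4 + 6*tau^5),
    with tau = t / duration; this function returns its time derivative.
    """
    tau = t / duration  # normalized time in [0, 1]
    return (distance / duration) * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)


if __name__ == "__main__":
    # Smaller targets (higher accuracy demands) lead to longer movement times.
    for width in (0.04, 0.02, 0.01):
        print(f"width {width:.2f} m -> {fitts_movement_time(0.2, width):.3f} s")

    # Sample the bell-shaped velocity profile of a 0.2 m movement lasting 1 s.
    for step in range(11):
        t = step / 10
        print(f"t = {t:.1f} s, v = {min_jerk_velocity(t, 0.2, 1.0):.4f} m/s")
```

In this sketch, the velocity is zero at movement onset and offset, symmetric around mid-movement, and peaks at 1.875 times the mean velocity (0.375 m/s for a 0.2 m movement lasting 1 s).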

These and other observations led to the formation of multiple theories about the exact implementation of control schemes in the CNS. One of them suggests that the CNS controls such movements by relying on internal forward and inverse models that estimate the required spatio-temporal patterns of neural commands to optimally achieve a specific movement task, leading to bell-shaped velocity profiles (Shadmehr and Mussa-Ivaldi, 1994; Wolpert et al., 1998; Scott, 2004). Such models can also explain the coupling between grip and load force, as estimates about the object properties, such as weight, and currently performed body movements can be used to internally estimate the anticipated load forces. For such models, sensory feedback can be used to compare the expected with the actual task outcome, thereby allowing the internal control policies to be optimized, for example when environmental adaptations are needed or in case of noise in the neural transmission system (Shadmehr and Mussa-Ivaldi, 1994; Harris and Wolpert, 1998; Faisal et al., 2009). The concept of neural noise has also been used to explain the speed-accuracy tradeoff and the omnipresent bell-shape in velocity profiles, by arguing that a linear relationship exists between the amplitude of a motor command and the transmission noise, thereby intrinsically preprogramming smooth movements (Harris and Wolpert, 1998; Faisal et al., 2009). Given the redundancy of the human musculoskeletal apparatus, it is further suggested that the sensorimotor system relies on low-dimensional synergies that are flexibly combined to jointly actuate multiple degrees of freedom in a coordinated and optimal fashion (Santello et al., 2016). It is believed that these theories explaining CNS mechanisms will help to understand human behaviour not only in regular physiological, but also in pathological conditions.

1.2 Neurological disorders can impair upper limb sensorimotor control and reduce quality of life

Acute and progressive neurological disorders, such as stroke, multiple sclerosis, and hereditary ataxic conditions, affect the ability of the CNS to smoothly control upper limb movements and strongly impede an individual’s ability to perform daily life activities, thereby often reducing their quality of life (Dowling et al., 1997; Lawrence et al., 2001; Kister et al., 2013; Jayadev and Bird, 2013; Yozbatiran et al., 2006). Such disorders are among the leading causes of acquired adult disability (Benjamin et al., 2019). Their incidence rates and prevalence are moderate, with for example approximately 13.7 million new stroke cases globally each year and approximately 2.2 million individuals worldwide suffering from multiple sclerosis (Johnson et al., 2019; Wallin et al., 2019). Hereditary ataxias are rare (Jayadev and Bird, 2013); for example, individuals with one specific ataxia subtype, autosomal recessive spastic ataxia of Charlevoix-Saguenay (ARSACS), are found almost exclusively in northeastern Quebec, Canada (Braekeleer et al., 1993). Nevertheless, these disorders place a significant socioeconomic burden on society and individuals (Di Carlo, 2009). Moreover, given the anticipated strong increase in the average population age over the next decades and the fact that age is a primary risk factor for certain neurological disorders (Boysen et al., 1988; United Nations. Department of Economic and Social Affairs, 2019), one can expect their prevalence to rapidly grow over the coming years (World Health Organization, 2006). Hence, the already considerable socioeconomic burden of neurological disorders is projected to further surge in the future (Gooch et al., 2017).

Neurological damage can lead to heterogeneous phenotypes of upper limb impairments, which can be clustered into the clinical syndromes apraxia, ataxia, and paresis (Sathian et al., 2011). Apraxia can be defined as deficits in higher-order motor cognition and the inability to correctly run previously memorized motor sequences (Humphreys et al., 1997; West et al., 2008). Ataxia is broadly described as impaired spatio-temporal control of coordinated movements resulting from damage to the cerebellum and its connections in the brainstem (Sathian et al., 2011). Lastly, paresis disrupts the ability to precisely activate spinal motor neurons and manifests on a behavioural level as weakness, spasticity (abnormal increase in muscle tone), and the abnormal coupling between joints (Sathian et al., 2011). Overall, these impairments lead to inaccurate and slower goal-directed movements in neurologically affected subjects (Hardwick et al., 2016). In addition, these movements are often less smooth. This is evident from their velocity profiles, which deviate from the commonly observed single-bell shape and instead feature multiple peaks resulting from corrective movements (Krebs et al., 1998; Rohrer et al., 2002, 2004). Also, depending on lesion location and extent, the predictive coupling between grip and load forces can be impaired in neurological disorders, thereby inhibiting the efficacy of object manipulations (Hermsdörfer et al., 2003; Brandauer et al., 2008).

There is a multitude of suggested neural control mechanisms attempting to explain these behavioural alterations. Clearly, central signal generation is inhibited in brain areas affected by cell death, leading to reduced drive at the muscular level (Sathian et al., 2011). Depending on the condition, cell death can result from a blockage of blood flow in the brain, the rupturing of blood vessels, the demyelination of nerves, or the inability to express proteins important for neural functioning (Dowling et al., 1997; Jayadev and Bird, 2013; Sacco et al., 2013). In addition to a disrupted generation of neural signals, their transmission through the major neural pathways, namely the corticospinal tract, can be distorted depending on the lesion location (Lemon, 2008; Baker, 2011). This can lead to increased reliance on residual pathways, for example the reticulospinal tract, that have less focused spinal motor neuron projections, thereby leading to increased transmission noise and the involuntary co-activation of adjacent muscle groups (Dewald et al., 1995; Dewald and Beer, 2001; Sukal et al., 2007). Also, abnormal reflex activity and increased muscle tone, potentially due to altered supraspinal inhibition of spinal muscle fibers, contribute to the observed abnormal movements (Sommerfeld et al., 2004; Mukherjee and Chakravarty, 2010). Overall, these impairments often make the neural commands generated by internal models inappropriate for the intended motor task. These impairments can further be accentuated by somatosensory deficits resulting from suboptimal neural transmission over the primary afferent pathways and an inability of the CNS to process afferent inputs in a meaningful manner (Carey, 1995; Tyson et al., 2008). This can additionally challenge the updating and optimization of the internal models. Within the concept of synergies, the observed impaired movement patterns can be explained by a reduction in the available motor repertoire in terms of low-dimensional building blocks and the inability to flexibly combine the remaining ones to achieve task success (Santello et al., 1998; Santello and Lang, 2015).

1.3 Neurorehabilitation and the importance of assessments

While an advanced healthcare system exists for managing acute stroke, approximately 36% of individuals with first-ever stroke still show persistent and long-term disability (Hankey et al., 2002). Pharmacotherapy is the main mode of treatment for individuals with multiple sclerosis, but these drugs commonly only help to temporarily slow the rate of disease progression, still leading to long-term disability and premature death (Lublin, 2005; Conway and Cohen, 2010; Ontaneda et al., 2017). Despite chronic disability and premature deaths also being observed in hereditary ataxias, the knowledge about the effect of potential therapies remains limited (Trujillo-Martín et al., 2009; Alonso et al., 2013; Girard et al., 2012). Inter-disciplinary neurorehabilitation approaches, relying on for example occupational therapy and physiotherapy, have shown promise to reduce upper limb disability in neurological disorders, both in acute and chronic settings (Langhorne et al., 2011; Rietberg et al., 2011; Beer et al., 2012; Pollock et al., 2014; Veerbeek et al., 2014; French et al., 2016; World Health Organization, 2006; Ward et al., 2019). While such approaches are well established in post-stroke care, neurorehabilitation for persons with multiple sclerosis or hereditary ataxias started receiving attention only rather recently (Carr and Shepherd, 1989; Beer et al., 2012; Rietberg et al., 2011; Martins et al., 2013; Lamers et al., 2016). Interventions usually focus either on restoring aspects of upper limb function or on more successfully completing daily life activities with the reduced set of available body functions through compensatory movements (World Health Organization, 2001; Ward et al., 2019; Wolf et al., 2010).

However, the effects of most neurorehabilitation interventions on the level of disability are typically only moderate (Rietberg et al., 2011; Beer et al., 2012; Pollock et al., 2014; Veerbeek et al., 2014; French et al., 2016). This partly results from the largely unknown influence of intervention parameters, such as type, timing, dose, and intensity, even though a clear focus on increasing therapy dose and intensity currently exists in the research community (Han et al., 2013; Krakauer and Carmichael, 2017; Lamers et al., 2019; Ward et al., 2019). Similarly, the precise neural mechanisms underlying successful neurorehabilitation remain unclear, but are in general attributed to neuroplasticity, which can be seen as the ability of the CNS to adapt and reorganize itself, both biologically and functionally, as a result of injury to itself, practice, and environmental fluctuations (Hosp and Luft, 2011; Lipp and Tomassini, 2015; Jones, 2017; Ward, 2017; Maier et al., 2019). This lack of knowledge about sensorimotor recovery is further illustrated by the recent questioning of the proportional recovery rule, a long-standing viewpoint in post-stroke research (Prabhakaran et al., 2008; Krakauer and Marshall, 2015; Hawe et al., 2018; Hope et al., 2019; Kundert et al., 2019; Senesh and Reinkensmeyer, 2019; Lohse et al., 2020; van der Vliet et al., 2020). This idea assumed the presence of a mainly biological mechanism that drives upper limb recovery in the first three months after stroke proportionally to the level of initial impairment, but has been shown to be inaccurate on an individual level (Hawe et al., 2018; Hope et al., 2019; Kundert et al., 2019; Senesh and Reinkensmeyer, 2019; Lohse et al., 2020; van der Vliet et al., 2020). Instead, it seems more likely that a multi-modal, heterogeneous set of clinical and demographic variables is required to accurately understand and predict individual neurorehabilitation outcomes (Heinemann et al., 1994; Langdon and Thompson, 1999; Grasso et al., 2005).

One of the fundamental building blocks to provide more effective therapies and to elucidate their unclear neurological mechanisms are assessments (World Health Organization, 2001, 2006; Albert and Kesselring, 2011; Beer et al., 2012). Within this thesis, assessments are defined as a description of body function and structures (impairments) or the spectrum of activities an individual can perform (activity limitations) (World Health Organization, 2001). In clinical practice, neurorehabilitation is an iterative, circular process that builds upon multiple assessment and therapy steps (World Health Organization, 2006; Beer et al., 2012; Albert and Kesselring, 2011). Initially, assessments focus on identifying the clinical problems and needs of an individual, the evaluation of the rehabilitation potential and the prognosis, and the definition of therapy goals. Subsequently, the individual is assigned to a specific intervention that is tailored to his or her needs. After the intervention, the next iteration of the cycle is triggered, as assessments are needed to evaluate the impact of the intervention and re-identify problems, needs, potential, and therapy goals (World Health Organization, 2006). On a higher level, assessments are also needed to justify the prescription of additional therapy to insurance providers and to provide internal and external quality control for hospitals.

In research settings, it is of special interest to sensitively compare the subtle differential impact of therapeutic approaches on impairments and activity limitations, for example within large-scale randomized controlled trials (Langhorne et al., 2011; Lambercy et al., 2016; Lamers et al., 2014; Burridge et al., 2019). This is achieved by measuring the level of disability before and after an intervention, thereby providing evidence about its efficacy. Such measures are further required to better disentangle the mechanisms underlying sensorimotor impairments and recovery. Specifically, they allow probing specific brain functions and structures and their evolution over time, thereby shedding light on the neural substrates orchestrating recovery. Also, assessments are essential to stratify individuals into homogeneous subgroups for clinical trials, such that the effect of therapies can be sensitively measured and does not get masked by inter-subject variability.

1.4 The limitations of conventional assessments challenge advances in the field of neurorehabilitation

In clinical practice, the most common way healthcare practitioners assess disability in individuals with neurological disorders is through visual observation and physical examination. Based on the extensive experience of clinicians and physiotherapists, this information is used to select and adapt therapies deemed optimal for an individual (World Health Organization, 2001, 2006). Even though these experiences are typically summarized and archived in the electronic health record of each patient, more standardized measures of disability are required to provide convincing arguments in favor of additional treatment to insurance providers and hospital management, and as outcome measures for clinical research. Early efforts to better describe the overall level of disability relied on ordinal scales that require healthcare practitioners to probe and rate specific body functions, indicating for example the presence or absence of limb ataxia (Kurtzke, 1983a; Brott et al., 1989; Dewey et al., 1999). Similar scales were also defined to more specifically dissect impairments in the upper limb, such as the one introduced by Fugl-Meyer in 1975 that characterizes the presence of pathological joint coupling by eliciting and judging specific combinations of shoulder, arm, and hand movements (Fugl-Meyer et al., 1975). Moreover, time-based tasks were introduced that focus on the ability to use the set of available body functions to achieve goal-directed tasks, such as the grasping and transfer of blocks over a barrier (Box and Block Test) or the insertion of nine pegs into nine holes (Nine Hole Peg Test) introduced by Mathiowetz in 1985 (Mathiowetz et al., 1985a,b). Also, self-reports in the form of questionnaires were developed to better understand the perceived limitations of individuals in their daily life (Granger et al., 1986). Such approaches based on subjective rating or evaluation of time-based tasks will be referred to as conventional assessments.

Conventional assessments are only rarely used in clinical practice as their application is often too time-consuming given the short amount of time that clinicians and therapists can spend per patient (Swinkels et al., 2011; Lang et al., 2013; Burridge et al., 2019). In addition, the immediate benefit of conventional assessments to experienced clinical personnel might be questionable, as such personnel often assume that they can observe most of the impairments and activity limitations themselves without the need for standardized tools, which might consequently be deemed a time-consuming burden. Albeit not well suited for clinical practice, it is impressive that these conventional assessments introduced 35-45 years ago remain the most commonly applied methodology to describe upper limb disability in research studies to date (Lamers et al., 2014; Alt Murphy et al., 2015). There are several factors contributing to these success stories: Conventional assessments have high usability, as they can be easily used by healthcare practitioners in a standardized manner at different sites and they only rely on simple and cost-efficient measurement setups. This is further facilitated by the availability of publicly accessible reference values from unaffected individuals, often referred to as normative data. Equally important, the outcome measures of conventional assessments can be intuitively interpreted and directly associated with a specific aspect of impaired body functions or activity limitations. For example, the Fugl-Meyer assessment is well known to describe upper limb paresis and especially the abnormal coupling between joints, with 0 indicating the worst and 66 indicating the best score (Fugl-Meyer et al., 1975; Gladstone et al., 2002). Similarly, the Box and Block Test (BBT) as well as the Nine Hole Peg Test (NHPT) can be intuitively associated with manual and finger dexterity, respectively, with increasing values indicating increasing performance for the BBT and vice versa for the NHPT (Mathiowetz et al., 1985a,b).

Given that these scales are intuitively applicable and repeatable, it is not surprising that they mostly have good to excellent clinimetric properties (Gladstone et al., 2002; Platz et al., 2005; Lemmens et al., 2012; Lamers et al., 2014; Alt Murphy et al., 2015; Feys et al., 2017; Burridge et al., 2019). Within this thesis, clinimetric properties are defined as statistical requirements aiming to describe the ability of an assessment to repeatedly and sensitively measure impairments, for example in longitudinal studies. Clinimetric properties build upon a characterization of intra- and inter-subject as well as intra- and inter-rater variability to define statistical constructs such as discriminant validity, reliability, measurement noise, and responsiveness (Prinsen et al., 2016, 2018). Conceptually, to yield a sensitive assessment, intra-subject variability should be minimized, inter-subject variability maximized, and intra- as well as inter-rater variability minimized. Hence, one of the reasons conventional assessments tend to receive positive evaluations of their clinimetric properties is the use of ordinal scales with a small choice of rating options, which is inherently accompanied by low intra-subject variability and low inter-rater variability. Based on these evaluations, it seems that conventional assessments provide intuitively interpretable, simply usable, and standardized tools that can help to characterize upper limb neurorehabilitation outcomes in research trials.
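To make the interplay between intra- and inter-subject variability concrete, the sketch below computes one common reliability statistic, the one-way random-effects intraclass correlation coefficient ICC(1,1), together with the standard error of measurement (SEM) and the smallest real difference (SRD) as common descriptors of measurement noise. These statistics are given as widely used examples and are not necessarily the ones applied in this thesis. The function names and the test-retest values are hypothetical.

```python
import numpy as np

def icc_oneway(scores):
    """One-way random-effects ICC(1,1) for an (n_subjects, k_sessions) array."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    subject_means = scores.mean(axis=1)
    grand_mean = scores.mean()
    # Between-subject mean square (inter-subject variability)
    ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    # Within-subject mean square (intra-subject variability across sessions)
    ms_within = np.sum((scores - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

def sem_and_srd(scores):
    """Standard error of measurement and smallest real difference (95% level)."""
    scores = np.asarray(scores, dtype=float)
    sem = np.std(scores, ddof=1) * np.sqrt(1.0 - icc_oneway(scores))
    return sem, 1.96 * np.sqrt(2.0) * sem

# Hypothetical test-retest scores (rows: subjects, columns: sessions)
stable = [[10, 10], [20, 20], [30, 30]]
slightly_noisy = [[10, 12], [20, 18], [30, 31], [40, 39]]
very_noisy = [[10, 20], [20, 10], [30, 40], [40, 30]]

for name, data in [("stable", stable), ("slightly noisy", slightly_noisy), ("very noisy", very_noisy)]:
    sem, srd = sem_and_srd(data)
    print(f"{name}: ICC = {icc_oneway(data):.3f}, SEM = {sem:.2f}, SRD = {srd:.2f}")
```

In this sketch, reliability approaches 1 when repeated sessions of the same subject agree and subjects differ strongly from each other, which mirrors the conceptual requirement stated above.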

However, a multitude of major issues arises when starting to critically dissect the properties of conventional assessments. While the ordinal nature of scales seems to be beneficial when evaluating clinimetric properties and suffices to create a general picture about the level of sensorimotor disability, these scales impede the ability of the assessment to sensitively characterize subtle physiological changes. This is especially problematic given that clinical interventions often show small effects on the level of disability that might not be captured by ordinal scales (Rietberg et al., 2011; Pollock et al., 2014; French et al., 2016). Most importantly, ordinal scales are inherently bounded, meaning that there are two mathematical ends to the scale (e.g., for the Fugl-Meyer assessment: maximum impairment at 0, no impairment at 66) that do not necessarily correspond to the physiological limits of impairments. This introduces so-called ceiling or floor effects, as evident in the Fugl-Meyer assessment, where it is well known that individuals with a full score in the test can still show sensorimotor impairments (Gladstone et al., 2002). This has drastic effects when attempting to statistically model sensorimotor recovery, as ceiling effects artificially lower the observed variances, which inflates correlation coefficients of the commonly applied linear models. This was one of the major factors contributing to the potential misinterpretation of the proportional recovery model (Hawe et al., 2018; Hope et al., 2019; Kundert et al., 2019; Senesh and Reinkensmeyer, 2019; Lohse et al., 2020). The fact that such major limitations exist, despite positively validated clinimetric properties, also suggests flaws in the validation process that can be attributed to the application of inappropriate statistics and the absence of a universally valid gold standard about the presence and severity of specific sensorimotor impairments. This illustrates that establishing and validating assessments is one of the fundamental challenges in the field of neurorehabilitation.
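The inflating effect of a ceiling on correlation coefficients can be illustrated with a small simulation. The sketch below is a toy example under stated assumptions and does not reproduce the analyses of the cited studies: baseline scores and recovery gains are drawn independently of each other, so that on an unbounded scale the change is unrelated to the initial impairment. All distributions, parameter values, and names are hypothetical; only the scale maximum of 66 is taken from the text.

```python
import numpy as np

FM_MAX = 66.0  # upper bound of the Fugl-Meyer upper limb scale

def simulate_recovery(n=5000, seed=0):
    """Toy simulation: recovery gain is independent of the baseline score."""
    rng = np.random.default_rng(seed)
    baseline = rng.uniform(0.0, 60.0, n)    # hypothetical initial scores
    true_gain = rng.normal(35.0, 10.0, n)   # hypothetical gain, independent of baseline
    initial_impairment = FM_MAX - baseline

    latent_followup = baseline + true_gain                     # unbounded scale
    observed_followup = np.clip(latent_followup, 0.0, FM_MAX)  # bounded scale
    observed_change = observed_followup - baseline

    # Correlation between initial impairment and change, without and with ceiling
    r_latent = np.corrcoef(initial_impairment, true_gain)[0, 1]
    r_observed = np.corrcoef(initial_impairment, observed_change)[0, 1]
    return r_latent, r_observed, np.var(latent_followup), np.var(observed_followup)

r_latent, r_observed, var_latent, var_observed = simulate_recovery()
print(f"correlation without ceiling: {r_latent:.2f}, with ceiling: {r_observed:.2f}")
print(f"follow-up variance without ceiling: {var_latent:.1f}, with ceiling: {var_observed:.1f}")
```

In this setting, every individual who reaches the maximum score shows an observed change that equals the initial impairment exactly. The bounded scale therefore reduces the variance of the follow-up scores and produces a strong apparent relationship between initial impairment and change, although none was simulated.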

As an additional caveat, conventional assessments of impairments typically probe specific body functions in an isolated manner without considering their functional relevance. While some of the captured deficits show strong functional relevance, such as abnormal joint coupling as assessed by Fugl-Meyer, this is less clear for other symptoms, such as spasticity (Rabadi and Rabadi, 2006; Wei et al., 2011; Dietz and Sinkjaer, 2007; Hoonhorst et al., 2015). Spasticity is typically assessed by clinical personnel who passively move a limb of a patient and rate the level of subjectively perceived resistance on a scale from 0 to 4 (Katz and Rymer, 1989). Historically, the reduction of spasticity was often defined as one important therapy goal and was a major research focus. However, whether spasticity even has major negative functional impact is strongly debated in the research community, and Dietz and Sinkjaer even argued that a reduction in spasticity would accentuate paresis, thereby impeding functional abilities (Barnes, 2001; Richardson, 2002; Woldag and Hummelsheim, 2003; Zackowski et al., 2004; Sommerfeld et al., 2004; Dietz and Sinkjaer, 2007). This apparent dissociation between certain impairments and their functional relevance cannot be captured by conventional assessments, as impairment-focused approaches do not consider an activity context and activity-focused assessments, which are often only time-based, cannot isolate specific impairments. This has led to an overemphasis of the symptom spasticity and, similarly, to an unclear functional relevance of certain neuromotor control principles, such as synergies as low-dimensional basic building blocks (de Rugy et al., 2013; Krakauer and Carmichael, 2017).

Overall, it seems that conventional assessments have, despite positively evaluated clinimetric properties, conceptual limitations that challenge their usage in clinical practice and hinder the exploration of neuroscientific and clinical research questions that are essential for advancing the field of neurorehabilitation.

1.5 Digital health metrics promise more fine-grained and sensitive assessments

Digital health metrics are seen as a potential solution to answer the many limitations of conventional scales and to provide novel, complementary endpoints for clinical trials (Lambercy et al., 2016; Patel et al., 2012; Nordin et al., 2014; De Los Reyes-Guzmán et al., 2014; Alt Murphy and Häger, 2015; Ellis et al., 2016; Kwakkel et al., 2017; Wang et al., 2017a; Tran et al., 2018; Ona Simbana et al., 2019; Kourtis et al., 2019). Herein, digital health metrics are defined as discrete one-dimensional metrics extracted through multiple signal processing steps from health-related sensor data, for example for the purpose of characterizing upper limb sensorimotor impairments.

Health-related sensor data typically describe physical constructs, for example kinematics (e.g., velocity), that are naturally defined on ratio scales. This implies that they do not have any mathematical boundaries (i.e., are defined from minus to plus infinity) and their resolution is only limited by the characteristics of the sensing element. Hence, digital health metrics promise to be more responsive to physiological changes than conventional assessments relying on ordinal scales. In addition, the absence of ceiling effects might also enable a more accurate prediction of rehabilitation outcomes compared to conventional scales.

One of the novel challenges that digital health technologies bring is that a virtually endless number of metrics can be extracted per assessment task and that the interpretation of the often abstract metrics is challenging and context-dependent (Nordin et al., 2014; Balasubramanian et al., 2012, 2015). For example, the same metric describing movement smoothness (e.g., number of velocity peaks) can be extracted for both a goal-directed planar movement and an explorative 3D movement, but its interpretation will differ between the conditions and also depend on the entire signal processing pipeline (e.g., data segmentation and filtering). This is a fundamentally different situation than for conventional assessments, where outcomes are always directly coupled and unique to an assessment task (e.g., Fugl-Meyer score). In addition, while the ratio scales of digital health metrics promise greater sensitivity, this is inherently accompanied by an increase in intra-subject variability that might negatively affect the evaluation of clinimetric properties. Hence, digital health metrics require even more thorough validation than conventional assessments and require a shift in methodology towards a task- and context-dependent evaluation of metrics.
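The dependence of a smoothness metric on the signal processing pipeline can be sketched with the number of velocity peaks as an example. The code below is an illustrative assumption and not the processing used for the VPIT: velocity profiles are synthesized from minimum-jerk submovements, a 10 Hz ripple stands in for measurement noise or tremor, and a simple moving average stands in for the filtering step. All function names, the peak height threshold, and the filter window are hypothetical.

```python
import numpy as np

def min_jerk(t, onset, duration, amplitude):
    """Minimum-jerk velocity of one submovement; zero outside its time window."""
    tau = np.clip((t - onset) / duration, 0.0, 1.0)
    return (amplitude / duration) * (30 * tau**2 - 60 * tau**3 + 30 * tau**4)

def count_velocity_peaks(v, min_height_ratio=0.1):
    """Count local maxima that exceed a fraction of the maximum velocity."""
    v = np.asarray(v, dtype=float)
    mid = v[1:-1]
    is_peak = (mid > v[:-2]) & (mid >= v[2:]) & (mid >= min_height_ratio * v.max())
    return int(np.sum(is_peak))

def moving_average(v, window):
    """Simple low-pass filter used here as a stand-in for the filtering step."""
    return np.convolve(v, np.ones(window) / window, mode="same")

t = np.linspace(0.0, 1.8, 181)                      # 100 Hz sampling
single = min_jerk(t, 0.0, 1.0, 1.0)                  # one smooth movement
double = single + min_jerk(t, 0.8, 1.0, 1.0)         # added corrective submovement
rippled = single + 0.2 * np.sin(2 * np.pi * 10 * t) * (t <= 1.0)  # same movement plus ripple

n_single = count_velocity_peaks(single)
n_double = count_velocity_peaks(double)
n_raw = count_velocity_peaks(rippled)
n_filtered = count_velocity_peaks(moving_average(rippled, window=10))
print(n_single, n_double, n_raw, n_filtered)
```

The same underlying movement yields a different number of peaks before and after filtering, which illustrates why a metric value can only be interpreted together with the task and the processing steps that produced it.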

However, this high need for validation of digital health metrics is not reflected in the focus of the research community. This is evident from the largely unknown state of the art on the clinimetric properties and pathophysiological interpretation of digital health metrics. Also, appropriate methodologies for selecting robust digital health metrics out of the plethora of available ones are scarce (Kwakkel et al., 2017; Shishov et al., 2017).

1.6 Technology-aided assessment platforms and the Virtual Peg Insertion Test

A variety of technology-aided assessment platforms, including clinic-bound and wearable approaches, has been proposed to record digital health metrics of upper limb sensorimotor impairments (Lambercy et al., 2016; Nordin et al., 2014; De Los Reyes-Guzmán et al., 2014; Alt Murphy and Häger, 2015; Ellis et al., 2016; Tran et al., 2018; Ona Simbana et al., 2019; Kourtis et al., 2019). Given that this thesis focuses on clinic-bound approaches, their state of the art will be concisely summarized in the following, whereas the reader is referred to literature for an overview of wearable sensing systems (Patel et al., 2012; Kanzler et al., 2015; Mullan et al., 2015; Kanzler et al., 2016; Eskofier et al., 2017; Wang et al., 2017a; Kourtis et al., 2019).

Assessment platforms aimed at use in clinical environments typically rely on optical motion capture systems, robotic exoskeletons, or robotic end-effectors, which permit recording traceable behavioural data during standardized movement and manipulation tasks. While optical motion capture systems allow accurately capturing movements of any body part of interest, their setup is time-consuming and therefore difficult to integrate into clinical environments. Instead, robotic systems are more readily applicable. These commonly include a virtual reality environment displaying a behavioural task, sensing and actuating elements, and control strategies ensuring well-guided interaction with the user (Krebs et al., 2002; Riener et al., 2005; Scott and Dukelow, 2011). This enables the haptic rendering of virtual reality environments and perturbations, thereby allowing specific sensorimotor impairments of interest to be probed. Exoskeleton approaches, such as the KINARM, have shown impressive results when precisely dissecting sensorimotor impairments into different underlying components with high granularity (Scott, 1999; Coderre et al., 2010; Tyryshkin et al., 2014; Lowrey et al., 2014; Semrau et al., 2013, 2017). Further, end-effector devices, such as the MIT-MANUS or the MEMOS, allowed a thorough characterization of sensorimotor deficits with more cost-efficient hardware (Krebs et al., 1998; Krebs et al., 2014; Rohrer et al., 2002, 2004; Colombo et al., 2005, 2008, 2010, 2014).

However, these approaches do not yet exploit the full potential of robotic technologies. Existing systems, both end-effectors and exoskeletons, typically provide arm weight support to the user, which artificially reduces the influence of arm weakness on upper limb movements and therefore neglects an important part of disability (Ellis et al., 2008). Moreover, most of the existing approaches focus on characterizing sensorimotor impairments during isolated arm movements without functional context, thereby questioning the functional relevance of the assessment. More recently, researchers started emphasizing the functional context, but again relied on time-consuming and complex measurement setups (Alt Murphy et al., 2012, 2013; Baak et al., 2015; Johansson and Häger, 2019). In addition, hand function and the ability to precisely coordinate grip forces are often not considered, which would be important to better relate impairments to activity limitations.

At the Rehabilitation Engineering Laboratory of ETH Zurich, the Virtual Peg Insertion Test (VPIT, Figure 1.1) was proposed in 2010 as a next generation technology-aided assessment and forms the foundation for the work presented in this thesis. The VPIT is expected to provide a time- and cost-efficient assessment of sensorimotor impairments in arm and hand within the functional context of a virtual object manipulation task (Emery et al., 2010). The approach relies on a haptic end-effector, an instrumented handle able to capture grip forces, and a virtual reality environment rendering a goal-directed pick and place task. The feasibility of the VPIT has been successfully tested through pilot studies in persons with mild to moderate sensorimotor impairments resulting from stroke, multiple sclerosis, or hereditary ataxia (Fluet et al., 2011, 2012; Lambercy et al., 2013; Gagnon et al., 2014; Hofmann et al., 2016; Tobler-Ammann et al., 2016).

1.7 Aims of this thesis

This work aims to advance the selection, validation, and clinical application of digital health metrics for assessing upper limb sensorimotor impairments with a special emphasis on establishing the VPIT as a novel and innovative technology-aided assessment. More specifically, the aims were 1) to systematically review the state of the art on digital health metrics of upper limb sensorimotor impairments and their clinimetric properties. In addition, the goal was 2) to define a novel methodology enabling the automated identification and selection of robust digital health metrics from technology-aided assessments, allowing the definition of a valid core set of metrics for the VPIT. Relying on these core metrics, the objective was further 3) to evaluate the concept that the VPIT informs on sensorimotor impairments that are functionally relevant. Lastly, the aims were 4) to inspect the feasibility of predicting neurorehabilitation outcomes with the VPIT metrics and 5) to explore novel avenues for the design and application of the VPIT that can further enhance its impact.

It is expected that establishing the state of the art on digital health metrics and their clinimetrics could create awareness that most technology-aided assessments are not sufficiently validated and trigger more research in this highly important area. In addition, an automated framework for selecting optimal digital health metrics could help to strongly support the validation of technology-aided assessments and pave the way for their clinical integration. Moreover, the application of the metric selection framework to the VPIT could help to establish a novel tool for a sensitive and efficient assessment of arm and hand sensorimotor impairments that are functionally relevant, thereby promising superior endpoints for clinical trials. In combination with an accurate prediction of neurorehabilitation outcomes, this would improve the prioritization of therapy and enable a better identification of therapy targets. Lastly, the proposal and pilot testing of novel designs and applications for the VPIT could build the foundation for its long-term success as a technology-aided assessment well suited for clinical trials and routine care.

This dissertation is structured into three parts, with the first part focusing on the selection and validation of digital health metrics, the second part on the clinical application of a core set of VPIT metrics, and the third part providing a general discussion and outlook.

[Figure 1.1, panel (a): VPIT setup, with labels for the haptic device, the force-sensing handle, the user-controlled cursor, and the virtual pegs. Panel (b): representative VPIT kinematics and kinetics from an unaffected (left) and a post-stroke (right) individual, color-coded by grip force (N) on a scale from 0 to 20.]

Figure 1.1 – The Virtual Peg Insertion Test (VPIT) is a technology-aided assessment platform aiming to characterize sensorimotor impairments in arm and hand during a functional object manipulation task. It relies on a haptic end-effector, a force-sensing handle, and a virtual reality environment (a). This allows sensing kinematic and kinetic behavioural data (b). The approach was introduced in 2010 at the Rehabilitation Engineering Laboratory of ETH Zurich and built the foundation for this dissertation.


Part I Digital health metrics for assessing upper limb sensorimotor impairments: state of the art, a novel selection framework, and a core set for the VPIT


2 Systematic review on kinematic assessments of upper limb movements after stroke

Anne Schwarz*, Christoph M. Kanzler*, Olivier Lambercy, Andreas R. Luft, Janne M. Veerbeek

*authors contributed equally.

Stroke: A Journal of Cerebral Circulation, 2019.

The authors thank Dr. Werner L. Popp for the insightful discussions and Christopher Jarrett for language editing.

This project has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement number 688857 (SoftPro), from the Swiss State Secretariat for Education, Research and Innovation (contract number 15.0283-1), and the P&K Pühringer Foundation. The funders had no role in study design, data collection, and analysis, decision to publish, or preparation of the manuscript.

In collaboration with Anne Schwarz, Christoph M. Kanzler led the study design, implementation and data collection, data analysis, and writing of the manuscript.

Final publication is available at https://doi.org/10.1161/STROKEAHA.118.023531. The online-only supplementary material can also be found at the same link, and its most relevant parts were summarized in the appendix of this thesis.


2.1 Abstract

Background Assessing upper limb movements post-stroke is crucial to monitor and understand sensorimotor recovery. Kinematic assessments are expected to enable a sensitive quantification of movement quality and distinguish between restitution and compensation. The nature and practice of these assessments are highly variable and used without knowledge of their clinimetric properties. This presents a challenge when comparing and interpreting the results. The purpose of this review was to summarize state-of-the-art kinematic upper limb assessments post-stroke, with respect to the assessment task, measurement system, and performance metrics and their clinimetric properties. Subsequently, we aimed to provide evidence-based recommendations for future applications of upper limb kinematics in stroke recovery research.

Methods A systematic search was conducted in PubMed, Embase, CINAHL, and IEEE Xplore. Studies investigating clinimetric properties of applied metrics were assessed for risk of bias using the COSMIN checklist. The quality of evidence for metrics was determined according to the GRADE approach.

Results A total of 225 studies (N=6197) using 151 different kinematic metrics were identified and allocated to five task and three measurement system groups. Thirty studies investigated clinimetrics of 62 metrics: reliability (n=8), measurement error (n=5), convergent validity (n=22), and responsiveness (n=2). The metrics task/movement time, number of movement onsets, number of movement ends, path length ratio, peak velocity, number of velocity peaks, trunk displacement, and shoulder flexion/extension received a positive evaluation for one clinimetric property.

Conclusions Studies on kinematic assessments of upper limb sensorimotor function are poorly standardized and rarely investigate clinimetrics in an unbiased manner. The provided evidence-based recommendations for the choice of task, measurement system, and kinematic metrics aim to increase standardization in stroke research. Further high-quality studies evaluating clinimetric properties are needed to validate kinematic assessments, with the long-term goal of elucidating upper limb sensorimotor recovery post-stroke.

Study registration https://www.crd.york.ac.uk/prospero/ (CRD42017064279).


2.2 Introduction

Deficits in upper limb sensorimotor function are experienced by about 80% of stroke patients early after symptom onset (Langhorne et al., 2009). Despite the availability of acute medical treatment and rehabilitation, upper limb impairment persists in about 60% of patients 6 months post-stroke (Nijland et al., 2010). These impairments can include muscle weakness, loss of inter-joint coordination, and changes in muscle tone and sensation, which subsequently reduce the ability to use the upper limb when performing daily activities and increase dependency (Langhorne et al., 2011; Veerbeek et al., 2011). Understanding upper limb sensorimotor recovery post-stroke is required to optimize therapy outcomes by developing effective interventions. One constraint impeding this understanding is the lack of standardized and responsive approaches to define and measure stroke-related upper limb deficits and their evolution (Kwakkel et al., 2017).

Traditionally, upper limb deficits post-stroke are evaluated using established clinical assessments such as the upper extremity subscale of the Fugl-Meyer Assessment (FMA-UE) (Fugl-Meyer et al., 1975; Gladstone et al., 2002) and the Action Research Arm Test (ARAT) (Carroll, 1965; Lang et al., 2006a). A drawback of these assessments is that they are insufficiently sensitive to capture the quality of sensorimotor performance due to the use of ordinal scales. This impedes the ability to clearly distinguish behavioral restitution from compensation (Chen et al., 2009; Lin et al., 2010; Jones, 2017; Twitchell, 1951), which is essential to understand neurological mechanisms of sensorimotor recovery post-stroke. Kinematic assessments promise to overcome these drawbacks by providing objective metrics that have the potential to sensitively capture movement quality and enable the monitoring of compensatory movements (Cirstea and Levin, 2000; Krebs et al., 2014; Lambercy et al., 2016). However, a variety of different tasks, measurement systems, and kinematic metrics are used in clinical research. This limits comparability and, by extension, the potential for meta-analyses that are needed to establish a knowledge foundation about the mechanisms of upper limb recovery. Furthermore, information about clinimetric properties such as reliability, measurement error, validity, and responsiveness of metrics derived from kinematic assessments is essential to confirm their physiological interpretation and robustness, and thereby suitability for stroke recovery research.

Previous reviews summarized the use of kinematic metrics for the upper limb (De Los Reyes-Guzmán et al., 2014; Alt Murphy et al., 2015; Ellis et al., 2016; Wang et al., 2017b; Tran et al., 2018) and their physiological interpretation (Nordin et al., 2014). However, they focused only on specific measurement systems, or did not differentiate metrics according to assessment tasks (Alt Murphy et al., 2015; Nordin et al., 2014), factors which are likely to influence the interpretation of kinematic metrics (Subramanian et al., 2010a). In addition, the majority of these reviews were not performed in a systematic way or did not rely on guidelines such as PRISMA for reporting systematic reviews and COSMIN for assessing risk of bias and grading the evidence (Moher et al., 2009; Mokkink et al., 2018). Despite the importance of characterizing clinimetric properties, only two reviews investigated clinimetrics, but these focused solely on
