Depending on the initial clinical or research purpose of the health data collection, and on the degree of structuring and quality-motivated pre-processing, data sets can be grouped into the following categories:
1. Clinical data extracted from clinic information systems into a data lake;
2. Basic routine data (core data set);
3. Specific routine data (extended data set, e.g., disease specific);
4. Specific routine data (imaging data, lab data);
5. Clinical data registries, including HSM (Hochspezialisierte Medizin Register);
6. Cohorts;
7. Clinical study data (clinical research project, clinical trial and clinical observational study data);
8. Patient self-reported and wearable device data;
9. Molecular or -omics data (generated in the hospitals, clinical grade);
10. Molecular or -omics data (generated in the research facility, research grade);
11. Citizen/Consumer health data, lifestyle data, social media data, wearable devices;
12. Reference data sets of all kinds (environmental data, potential exposure to noxious agents, geographical data, statistical data…).
Different health-related research data can further be characterized based on the following criteria:
i. Type of data;
ii. Original purpose;
iii. Degree of processing;
iv. Degree of structuring;
v. Quality control;
vi. Data protection status;
vii. Return of actionable findings;
viii. Data governance and ethics requirements;
ix. Related research stakeholders;
x. Utility for PH research, utility for clinical research;
xi. IP scenario recommendation.
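The criteria above can be read as the fields of a simple record describing one data category. The following is a minimal sketch in Python, not part of any SPHN or PHRT specification: the class name `DataCategoryProfile`, its field names and the helper `missing_criteria` are hypothetical and chosen only for illustration. The example values are copied from the "Cohorts" row of Table 10, which does not state a research utility, so that field is left empty.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DataCategoryProfile:
    """One data category described by the characterization criteria.

    All names here are hypothetical; they only mirror the list of criteria.
    """
    type_of_data: str
    original_purpose: str
    degree_of_processing: str
    degree_of_structuring: str
    quality_control: str
    data_protection_status: str
    return_of_actionable_findings: str
    governance_and_ethics: str
    research_stakeholders: str
    ip_scenarios: Tuple[int, ...]
    # Utility for PH research / clinical research; optional in this sketch.
    research_utility: Optional[str] = None


def missing_criteria(profile: DataCategoryProfile) -> List[str]:
    """Return the names of criteria that have not been filled in yet."""
    return [f.name for f in fields(profile) if getattr(profile, f.name) is None]


# Example values taken from the "Cohorts" row of Table 10.
cohorts = DataCategoryProfile(
    type_of_data="Cohorts",
    original_purpose="Primarily assessed for research purposes",
    degree_of_processing="Low",
    degree_of_structuring="Highly structured",
    quality_control="High",
    data_protection_status="Coded/de-identified",
    return_of_actionable_findings=(
        "Possible, after de-coding, depends on initial specific consent"
    ),
    governance_and_ethics="PIs; informed consent or legal basis",
    research_stakeholders="Cohort consortia, PI",
    ip_scenarios=(2, 3),
)

print(missing_criteria(cohorts))
```

A record of this kind makes it explicit which criteria still have to be defined for a category before its further use can be decided.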
For each data type, examples of SPHN and PHRT projects are presented (where applicable). The further use of these data will depend on the above-mentioned criteria and needs to be defined for each category (Table 10).
Table 10: Data Types.
Definitions of the column headings:
Original purpose: Purpose for which the data is/was initially captured.
Degree of structuring, degree of processing, quality of data: The degree to which the data is structured in the context of the original use from a research perspective, encompassing completeness, consistency, validity and accuracy (low, medium, high).
Degree of linkage to patient identity: The degree of linkage to patient identity in the original data.
Return of actionable findings: Findings can be returned/reported back to the patient/study participant/citizen.
Data governance, ethics requirements: Who/which institution decides if data can be used for research; what kind of consent possibilities are there for the specific data types?
Related research stakeholders and potential collaborative partners: Data are often pre-structured or quality-controlled by clinicians, clinical specialists or consortia. This effort needs to be considered, as well as the clinical scientific expertise of the latter stakeholders. They should be considered as collaborative research partners.
IP scenario recommendation (SPHN DTUA template): In the DTUA document, three scenarios are defined: 1. The RECIPIENT is the owner of the RESULT(S); 2. The RECIPIENT alone is the owner of the RESULT(S), but the PROVIDER is granted a license on the RESULT(S) and/or receives a portion of the revenues from the commercialization; 3. The IP is jointly owned by the PARTIES (tbd, categories are first proposals).

| Type of data | Original purpose | Degree of structuring | Degree of processing | Quality of data | Degree of linkage to patient identity | Return of actionable findings | Data governance, ethics requirements | Related research stakeholders and potential collaborative partners | IP scenario recommendation (SPHN DTUA template) | SPHN Projects |
|---|---|---|---|---|---|---|---|---|---|---|
| Examples of hospital clinical data (routine data) | | | | | | | | | | |
| Clinical data extracted from clinic information systems into data lake | Clinical use | Heterogeneously structured, unstructured | Very high | Low to medium, heterogeneous (medium for clinical purpose, low for research purpose) | Identifying | Possible | Hospitals; general consent or informed consent | none | 1 or 2 or 3 | SwissPKcdw, Create, PRIMA, SPO |
| Basic routine data (core dataset) | Clinical use | Mainly structured | High | Medium, heterogeneous | Identifying, but coded after transfer in data warehouse | Possible | Hospitals; general consent or informed consent | none | 2 or 3 | SPO, PSSS, ImmunoRep, Frailty, SHFN |
| Specific routine data (extended dataset, e.g. disease specific) | Clinical use | Structured or unstructured | High | Low to medium, heterogeneous | Identifying, but coded after transfer in data warehouse | Possible | Hospitals; general consent or informed consent | none | 2 or 3 | SPO, PSSS, ImmunoRep, Frailty, SHFN |
| Specific routine data (imaging data, lab data) | Clinical use | Mainly structured | Medium | Medium | Identifying, but coded after transfer in data warehouse | Possible | Hospitals; general consent or informed consent | Variable, potentially none, but more likely clinical specialists, consortia of clinical specialists | 2 or 3 | SOIN, IMAGINE |
| Clinical data registries, including HSM (Hochspezialisierte Medizin Register) | Clinical use mainly in specific disease areas, standardized clinical parameters, quality and monitoring of patient care | Mainly structured | Low | Medium | Identifying, but coded after transfer in data warehouse | Possible, after de-coding | Hospitals or PIs; general consent or informed consent | Clinical specialists, consortia of clinical specialists | 2 or 3 | SPO |
| Clinical research data | | | | | | | | | | |
| Cohorts | Primarily assessed for research purposes | Highly structured | Low | High | Coded/de-identified | Possible, after de-coding, depends on initial specific consent | PIs; informed consent or legal basis (e.g. Transplantation, Cancer) | Cohort consortia, PI | 2 or 3 | SACR |
| Clinical study data (clinical research project, clinical trial and clinical observational study data) | Primarily assessed for research purposes | Highly structured | Low to medium | High | Coded/de-identified or anonymous | Possible, after de-coding, depends on initial specific consent | Hospitals or PIs; informed consent or general consent | Cohort consortia, PI | 2 or 3 | PRECISE, SHFN, Frailty |
| Patient self-reported and wearable device data | Primarily assessed for research or quality purposes | Structured or unstructured | High | Medium | Identifying | Possible | Patients | none | 1 or 2 or 3 | |
| Molecular data | | | | | | | | | | |
| Molecular or -omics data (generated in the hospitals, clinical grade) | Clinical use | Structured | Low to medium | High | Identifying or coded/de-identified | Possible, after de-coding | Hospitals; general consent or informed consent | Clinical specialists, consortia of clinical specialists | 2 | SPO, SOCIBP |
| Molecular or -omics data (generated in the research facility, research grade) | Research use | Structured | Low to medium | Medium | Coded/de-identified | Possible, after de-coding, depends on initial specific consent | PIs (depending on the DTUA/MTA); general consent or informed consent | Scientific consortia | 2 | PRECISE, PSSS, ImmunoRep |
| Healthy citizen data | | | | | | | | | | |
| Citizen/consumer health data, lifestyle data, social media data, wearable devices | Diverse | Structured or unstructured | High | Diverse | Identifying | Possible | Citizen | Citizen data, PI | 1 | |
| Reference data | | | | | | | | | | |
| Reference data sets of all kinds (environmental data, potential exposure to noxious agents, geographical data, statistical data…) | Diverse | Structured | Low | High | Anonymous | Not applicable | Not applicable | Epidemiologist, consortia of clinical specialists | 2 or 3 | SACR |
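The IP scenario recommendation column of Table 10 can be read as a lookup from data type to the DTUA scenarios recommended for it. The following is a minimal sketch in Python under stated assumptions: the enum `IpScenario`, its member names, the dictionary `RECOMMENDED_SCENARIOS` with its abbreviated keys, and the function `data_types_allowing` are all hypothetical. Only the scenario numbers per data type are taken from the table.

```python
from enum import IntEnum
from typing import Dict, FrozenSet, List


class IpScenario(IntEnum):
    """The three DTUA scenarios; member names are illustrative labels."""
    RECIPIENT_OWNS = 1                    # RECIPIENT owns the results
    RECIPIENT_OWNS_PROVIDER_LICENSED = 2  # PROVIDER gets license and/or revenue share
    JOINT_OWNERSHIP = 3                   # IP jointly owned by the PARTIES


S1, S2, S3 = IpScenario(1), IpScenario(2), IpScenario(3)

# Scenario numbers copied from Table 10; keys are abbreviated row labels.
RECOMMENDED_SCENARIOS: Dict[str, FrozenSet[IpScenario]] = {
    "Clinical data from data lake": frozenset({S1, S2, S3}),
    "Basic routine data": frozenset({S2, S3}),
    "Specific routine data (extended)": frozenset({S2, S3}),
    "Specific routine data (imaging, lab)": frozenset({S2, S3}),
    "Clinical data registries": frozenset({S2, S3}),
    "Cohorts": frozenset({S2, S3}),
    "Clinical study data": frozenset({S2, S3}),
    "Patient self-reported and wearable device data": frozenset({S1, S2, S3}),
    "Molecular or -omics data (clinical grade)": frozenset({S2}),
    "Molecular or -omics data (research grade)": frozenset({S2}),
    "Citizen/consumer health data": frozenset({S1}),
    "Reference data sets": frozenset({S2, S3}),
}


def data_types_allowing(scenario: IpScenario) -> List[str]:
    """Return, sorted by name, the data types for which a scenario is recommended."""
    return sorted(
        name
        for name, scenarios in RECOMMENDED_SCENARIOS.items()
        if scenario in scenarios
    )


for name in data_types_allowing(IpScenario.RECIPIENT_OWNS):
    print(name)
```

Since the source marks the scenario categories as first proposals (tbd), such a lookup should be treated as a snapshot of the table rather than a fixed rule.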