DATA QUALITY ASSURANCE AND CONTROL USING THE EXAMPLE OF IAGOS-RH
DATAR Workshop | Forschungszentrum Jülich | 13th Nov 2018
SUSANNE ROHS | FORSCHUNGSZENTRUM JÜLICH
Contact: s.rohs@fz-juelich.de
WHY IS QUALITY ASSURANCE AND CONTROL IMPORTANT?
SOYUZ ROCKET LAUNCH TO ISS ON OCT. 11, 2018
The flight was aborted 118 seconds after liftoff when one of the rocket’s four side boosters failed to eject properly, colliding with the rocket’s second stage, damaging the lower portion of the rocket and sending the entire assembly into a spin, triggering an automated abort that jettisoned the crewed capsule.
“The cause of the abnormal separation was the lid of the oxidizer tank’s nozzle in Block D did not open due to a deformation (6-degree bend) of the contact sensor during assembly of the package at the Baikonur Cosmodrome,”
=> The accident investigation commission says that it has developed new guidelines and checks to ensure that future Soyuz rockets do not run into similar problems.
TWO ASPECTS:
• To prevent errors from happening
• To identify and correct errors that have taken place.
QC VS QA
QC (quality control):
• used to verify the quality of the output
• by inspection, measurement etc.
• measures and determines the quality level of products or services.
• It is a process itself.
QA (quality assurance):
• process of managing for quality
• by planning and documenting processes to assure quality (e.g. quality plans, inspection, test plans).
• complete system to assure the quality of products or services.
• It is not only a process, but a complete system including QC.
MILLENNIUM PROJECT
The Millennium Project is a global participatory think tank established in 1996 under the American Council for the United Nations University that became independent in 2009 and has grown to 63 Nodes around the world.
WHY IS ENVIRONMENTAL DATA QUALITY ASSURANCE AND CONTROL IMPORTANT?
Fisher, C. and Kingma (2001) Criticality of data quality as exemplified in two disasters, Information and Management, 39, 2, 109-116
WHY DO WE NEED QA/QC?
https://atmos.washington.edu/~davidc/
1985: DISCOVERY OF ANTARCTIC OZONE HOLE
https://www.dataone.org/best-practices/
DATA LIFECYCLE
QC
European Research Infrastructure for Earth observation by passenger aircraft since 2014
Regular in situ global-scale monitoring of essential climate variables: H2O, O3, CO, NOx, CO2, CH4, aerosols, clouds
Long-term deployment envisaged (> 20 years)
Today, 8 long-haul aircraft (IAGOS-CORE) and one flying laboratory (IAGOS-CARIBIC)
Open data policy; visit www.iagos.org
Provision of data in near real time for Copernicus and other services
Data available since 1995
> 20 years of UTH data
> 58000 flights, 500 flights/year/aircraft
IN-SERVICE AIRCRAFT FOR A GLOBAL OBSERVING SYSTEM
Association Internationale sans but lucratif
IAGOS-CORE CAPACITIVE HYGROMETER
Established technique (balloon soundings)
Low maintenance requirements
Regular pre- and post-flight calibrations traceable to frost point mirror
In operation since 1995
IAGOS data flow
Calibration Centres: calibration and maintenance data
IAGOS Data Centre, hosted by AERIS (CNES-CNRS/INSU) in Toulouse: calibration data, metadata, near-real-time data for Copernicus model validation
IAGOS Data Services
IAGOS data products
Available: O3, CO, H2O (CORE NRT); O3, CO (CARIBIC L1)
In preparation: Cloud Index (CORE NRT); NOx/NO2 (CORE L1)
Level Description
L0A raw data
L0B automatically validated data
NRT NRT for Copernicus use, bad data removed
L1 data validated by PI (preliminary data)
L2 calibrated data (final data)
L3 averaged data and climatologies
L4 added-value products
The IAGOS central database is hosted by AERIS (CNES-CNRS/INSU) in Toulouse.
Data access is free and open; the database can be accessed at www.iagos-data.fr
QAQC of IAGOS Measurements: IAGOS Scientific Symposium @ University Manchester, UK, 17-19 Oct. 2016
Problem:
Huge amount of data => manual control of data not manageable
Concept for automated data processing:
1) No entries should ever be removed from the original (raw) data set.
2) Traceability: version control and storage of metadata
3) Relevant input information stored in txt files:
- flag definition
- header
- qcChain, ppChain, directories
- definition of limit values
4) Modular approach
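The four points of this concept can be illustrated with a minimal Python sketch. It is not the IAGOS-RH code: the `Record` class, the two flag codes, the module registry and the text format of the chain definition (one module name per line) are all assumptions made for illustration. The sketch only shows the idea that raw values are never modified, that flags are stored alongside them, and that the chain of QC modules is read from a text definition.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical flag codes for this sketch only (not the IAGOS flag list).
GOOD, BAD = 0, 1


@dataclass
class Record:
    """Raw values stay untouched; QC results live in a parallel flag list."""
    raw: List[float]
    flags: List[int] = field(default_factory=list)
    history: List[str] = field(default_factory=list)  # simple traceability log

    def __post_init__(self):
        if not self.flags:
            self.flags = [GOOD] * len(self.raw)


def global_range(rec: Record, limits: Dict[str, float]) -> None:
    """Flag values outside [validMin, validMax] without deleting them."""
    for i, value in enumerate(rec.raw):
        if not (limits["validMin"] <= value <= limits["validMax"]):
            rec.flags[i] = BAD


# Registry of available QC modules; a new test is added by registering it here.
QC_MODULES: Dict[str, Callable[[Record, Dict[str, float]], None]] = {
    "GlobalRange": global_range,
}


def parse_chain(text: str) -> List[str]:
    """Read a chain definition: one module name per line, '#' starts a comment."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            names.append(line)
    return names


def run_chain(rec: Record, chain: List[str], limits: Dict[str, float]) -> Record:
    """Apply the modules in the given order and log each step."""
    for name in chain:
        QC_MODULES[name](rec, limits)
        rec.history.append(name)
    return rec


chain = parse_chain("# qcChain (assumed format)\nGlobalRange\n")
rec = run_chain(Record(raw=[50.0, 250.0, -3.0]), chain,
                {"validMin": 0.0, "validMax": 200.0})
```

After the run, `rec.raw` still holds all three original values, while `rec.flags` marks the second and third as bad and `rec.history` records which modules were applied.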
IAGOS data flow
Inputs: L0B data; calibration coefficients; AC-Nr., Package-Nr. and ICH-Sensor-Nr.
Partners involved: Enviscope, Toulouse, FZ-Jülich
Processing at FZJ: automated processing, visualisation and flagging; manual verification
Destination Toulouse: upload data (with flags) to the server in Toulouse; make data available for NRT users (bad data removed)
Main-Routine
Starts the IAGOS-RH data processing.
Processing steps:
1) Import- and Preprocess Manager
• Import raw data
• Preprocess data
• QC check: Voltage range…
2) Calibration-Manager
• Mean or linear or inflight calibration coefficients
3) QC-Manager
• ImpossibleTime
• GlobalRange
• RegionalRange
• VerticalSpike
• Rate of change
• Stationarity
• Accuracy
• RH specific tests
• Flag inheritance test
4) (Manual QC)
5) Finalise data
INPUT:
• Txt file with raw data
• Database
• Default Settings
• QC Flag List
OUTPUT:
• Raw data in netcdf format
• Calibrated data
• netcdf data after automatic QC
• (netcdf data after manual QC)
• QC Flag List
• QC txt-files
• Visualisation
• NRT
Test, criterion and parameter checked:
1) Wrong time: dateMin ≤ TIME ≤ dateMax, with dateMin = beginning of IAGOS and dateMax = now; TIME monotonically increasing. Parameter checked: TIME
2) Global range test: validMin ≤ PARAMETER ≤ validMax. Parameters checked: TEMP, RH etc.
3) Regional range test: regionalRangeMin ≤ PARAMETER ≤ regionalRangeMax. Parameters checked: Spec. Humidity, T vs. T(aircraft)
4) Vertical spike test: spikes among adjacent triplets of samples, |V_n - (V_{n+1} + V_{n-1})/2| - |(V_{n+1} - V_{n-1})/2| ≤ threshold. Parameters checked: RH, TEMP, PRES
5) Rate of change: |V_n - V_{n-1}| + |V_n - V_{n+1}| ≤ 2*(threshold). Parameter checked: PRES
6) Stationarity: number of consecutive points ≤ 24*(60/delta_t), with delta_t the sampling interval in minutes. Parameters checked: TEMP, PRES, RH
7) Confidence range, noise: data have reduced accuracy or precision. Parameters checked: TEMP, PRES, RH
8) Accuracy (pre- and postflight calibration): difference of data calculated with pre- and postflight calibration coefficients ≤ threshold. Parameter checked: RH
9) Avionics flag: combines avionics flags time, lat, lon, pressure and TAS_AC
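Tests 2, 4, 5 and 6 above can be written down directly from their formulas. The following Python sketch is an illustration only, not the IAGOS-RH implementation: the function names are hypothetical, `True` means "test passed", the first and last sample pass the spike and rate-of-change tests because they lack a neighbour, and the stationarity test assumes that "consecutive points" means consecutive identical values.

```python
def global_range(values, valid_min, valid_max):
    """Test 2: validMin <= value <= validMax."""
    return [valid_min <= v <= valid_max for v in values]


def vertical_spike(values, threshold):
    """Test 4: |Vn - (Vn+1 + Vn-1)/2| - |(Vn+1 - Vn-1)/2| <= threshold."""
    ok = [True] * len(values)
    for n in range(1, len(values) - 1):
        prev, cur, nxt = values[n - 1], values[n], values[n + 1]
        score = abs(cur - (nxt + prev) / 2) - abs((nxt - prev) / 2)
        ok[n] = score <= threshold
    return ok


def rate_of_change(values, threshold):
    """Test 5: |Vn - Vn-1| + |Vn - Vn+1| <= 2 * threshold."""
    ok = [True] * len(values)
    for n in range(1, len(values) - 1):
        change = abs(values[n] - values[n - 1]) + abs(values[n] - values[n + 1])
        ok[n] = change <= 2 * threshold
    return ok


def stationarity(values, delta_t_min):
    """Test 6: a run of identical values (assumed meaning) may contain
    at most 24 * (60 / delta_t) points, delta_t in minutes."""
    max_run = 24 * (60 / delta_t_min)
    ok = [True] * len(values)
    start = 0
    for i in range(1, len(values) + 1):
        # A run ends at the end of the series or when the value changes.
        if i == len(values) or values[i] != values[start]:
            if i - start > max_run:
                for j in range(start, i):
                    ok[j] = False
            start = i
    return ok


# Made-up example series, for illustration only.
rh = [40.0, 41.0, 80.0, 42.0, 43.0]
pres = [250.0, 251.0, 262.0, 252.0, 253.0]

range_ok = global_range([50.0, 250.0, -3.0], 0.0, 200.0)
spike_ok = vertical_spike(rh, threshold=5.0)
rate_ok = rate_of_change(pres, threshold=8.0)
stat_ok = stationarity([1.0, 1.0, 1.0, 1.0, 2.0], delta_t_min=480)
```

In the example series the isolated jump in the middle fails both the spike and the rate-of-change test, while its neighbours pass. With a very tight rate-of-change threshold the neighbours of a jump fail as well, since each of them has one large difference to the outlier.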
RH SPECIFIC QUALITY CHECKS
1) Volt signal of RHL and T confounded
2) Same Volt signal for RHL and T
3) RHL has imaginary part
4) ∆T between T_FZJ and T_AC too high
5) Sensor too cold (< -40°C)
6) RHL at cruise altitude too humid
7) Only one calibration (or no calibration)
8) Time between pre- and post-calibration too long
9) Range of Volt signals too flat
To be continued…
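A few of the RH-specific checks listed above (2, 3, 4 and 5) can be sketched for a single sample as follows. This is a hedged illustration, not the IAGOS-RH routine: the function and argument names are hypothetical, and the limit for the temperature difference (`max_delta_t`) is an assumed placeholder because the slide gives no value. Only the -40°C limit is taken from the list.

```python
def rh_specific_checks(rhl, t_fzj_c, t_ac_c, u_rh, u_t, max_delta_t=5.0):
    """Return the list of RH-specific checks that one sample fails.

    rhl      relative humidity over liquid water (may be complex if the
             conversion produced an imaginary part)
    t_fzj_c  sensor temperature T_FZJ in degrees Celsius
    t_ac_c   aircraft temperature T_AC in degrees Celsius
    u_rh     Volt signal of the humidity channel
    u_t      Volt signal of the temperature channel
    max_delta_t  assumed limit for |T_FZJ - T_AC| (placeholder value)
    """
    reasons = []
    if u_rh == u_t:
        reasons.append("same Volt signal for RHL and T")
    if isinstance(rhl, complex) and rhl.imag != 0:
        reasons.append("RHL has imaginary part")
    if abs(t_fzj_c - t_ac_c) > max_delta_t:
        reasons.append("delta T between T_FZJ and T_AC too high")
    if t_fzj_c < -40.0:
        reasons.append("sensor too cold")
    return reasons


good = rh_specific_checks(35.0, -30.0, -31.0, u_rh=2.1, u_t=1.4)
bad = rh_specific_checks(complex(35, 1), -45.0, -30.0, u_rh=2.0, u_t=2.0)
```

Returning the list of failed checks, rather than a single pass/fail value, keeps the reason for flagging available for later steps.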
FLAGGING SCHEME FOR IAGOS
Flag  Qualifier       Selection      Information
      (Measurements)  (Environment)  (Operations)
X.X   unvalidated                    unvalidated
..P   valid                          preliminary, passed automatic tests
..D   valid                          delayed mode
...   valid                          final validated data
..N   valid                          valid data, but noise exceeds threshold
..L   valid                          valid data with larger uncertainty
..Z   valid                          valid data with drift in zero measurement
..T   valid                          T aircraft used
I.X   invalid                        reason unknown
I.R   invalid                        out of range
I.S   invalid                        stationarity
I.O   invalid                        outlier (spike)
I.J   invalid                        step
I.N   invalid                        noisy
I.L   invalid                        larger uncertainty
I.Z   invalid                        drift in zero measurement
I.M   invalid                        malfunction of instrument
Different approaches for flagging:
• Keep it simple: valid / invalid (limited)
• Or: information with reason for flagging
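The three-character flags of this scheme can be decoded with a small lookup, shown here as a Python sketch. The dictionaries copy the flag table; the function names and the handling of the middle character (selection, always "." in the listed flags and therefore ignored here) are assumptions for illustration. The helper `is_valid` corresponds to the simple valid/invalid approach, while `decode_flag` keeps the reason for flagging.

```python
# First character: qualifier (measurements).
QUALIFIER = {"X": "unvalidated", ".": "valid", "I": "invalid"}

# (first character, third character) -> information (operations).
INFORMATION = {
    ("X", "X"): "unvalidated",
    (".", "P"): "preliminary, passed automatic tests",
    (".", "D"): "delayed mode",
    (".", "."): "final validated data",
    (".", "N"): "valid data, but noise exceeds threshold",
    (".", "L"): "valid data with larger uncertainty",
    (".", "Z"): "valid data with drift in zero measurement",
    (".", "T"): "T aircraft used",
    ("I", "X"): "reason unknown",
    ("I", "R"): "out of range",
    ("I", "S"): "stationarity",
    ("I", "O"): "outlier (spike)",
    ("I", "J"): "step",
    ("I", "N"): "noisy",
    ("I", "L"): "larger uncertainty",
    ("I", "Z"): "drift in zero measurement",
    ("I", "M"): "malfunction of instrument",
}


def decode_flag(flag):
    """Split a flag such as 'I.O' into (qualifier, information) texts."""
    if len(flag) != 3:
        raise ValueError(f"flag must have three characters: {flag!r}")
    qualifier, _selection, info = flag
    key = (qualifier, info)
    if key not in INFORMATION:
        raise ValueError(f"unknown flag: {flag!r}")
    return QUALIFIER[qualifier], INFORMATION[key]


def is_valid(flag):
    """Simple approach: reduce the flag to valid / not valid."""
    return decode_flag(flag)[0] == "valid"


spike = decode_flag("I.O")
final = decode_flag("...")
```

A user who only needs good data can filter with `is_valid`, while a PI investigating a problem can use the full decoded text.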
CURRENT POSITION OF IAGOS AIRCRAFT
Example:
Quicklook NRT
THE QUALITY ASSURANCE PLAN (QAP)
A QAP covers the full data lifecycle, from acquisition through publication, and can:
• Identify requirements for:
  - Field and lab methods and equipment that meet data-collection standards
  - Data standards, structure, and domains consistent with community conventions
  - Periodic data-quality assessment using defined quality metrics
• Describe a structure for data storage that can also facilitate checking for errors and help to document data quality
• Establish data-quality criteria and data-screening processes for all of the data
• Include quality metrics that can determine current data-quality status
• Establish a plan for 'data quality assessments' as part of the data flow
• Contain a process for handling data corrections
Kick-off meeting in Jena, Germany
ADAPTING QA CONCEPT OF WMO/GAW
IAGOS Data Base
IAGOS Instrument
IAGOS Calibration Laboratory
IAGOS QA/SAC = IGAS-WP4
CONCEPT QA/QC EVALUATION & HARMONISATION
EVALUATION AND HARMONISATION OF DATA QUALITY IN IAGOS
SOPs (Standard Operating Procedures)
1. Instrument layout and operation
2. Calibration procedure and traceability
3. Calculation of results from raw (L0) to final (L2)
4. Uncertainty analysis
5. Maintenance
6. Validation and flagging scheme
7. Storage of data
QA/QC Protocols
1. Performance over flight period
2. Regular calibration
3. Internal consistency: IAGOS A/C by A/C
4. External consistency: IAGOS A/C with other platforms
5. Development and use of automatic tools to match in time and space (incl. use of trajectory analysis)
Regular QA/QC & Assessment Reports
For each measured compound:
• Collect all QA/QC protocols over a period of 1-2 years
• Prepare a regular (every 1-2 years) QA/QC report
• Internal review of the QA/QC report by the IAGOS PIs
• Prepare a regular (every 5 years) QA/QC assessment report
• Review by a panel of external experts
• Feedback to the IAGOS Data Base on impact on archived data
Implementation of QA/QC into IAGOS & WMO/GAW
Migration of the WP4 concept into a WMO/GAW QA/SAC (SAC = Science Activity Centre), which means:
I. Establishment of the WP4 QA/QC concept into operation as part of IAGOS-AISBL
II. Link to the WMO-GAW QA/QC infrastructure with an IAGOS-QA/SAC, incl. link to its SAGs (Scientific Advisory Groups)
Internal consistency
Objective: Automatic detection of "coincident" profiles within 1-3 hours
Individual quicklooks for comparisons between 2 profiles (O3 and CO)
Example: Sept. 2006, Frankfurt, MOZAIC vs MOZAIC, Delta_t = 9 minutes
The dashed line is the 1:1 line and the grey shading represents the total instrument uncertainties.
The grey shading represents the "comparable" records.
<1K <0.25 <25°
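The automatic detection of coincident profiles can be sketched as a search for pairs of profiles at the same airport whose time difference stays below a limit. The following Python sketch is an assumption-laden illustration, not the tool used for IAGOS: the tuple layout, the flight identifiers and the example times are made up, and only the time criterion (up to 3 hours) is taken from the slide. The additional criteria for "comparable" records are not implemented.

```python
from datetime import datetime, timedelta
from itertools import combinations


def coincident_pairs(profiles, max_delta=timedelta(hours=3)):
    """Find pairs of profiles at the same airport within max_delta.

    profiles: list of (flight_id, airport, profile_time) tuples
    returns:  list of (flight_id_a, flight_id_b, time_difference)
    """
    ordered = sorted(profiles, key=lambda p: p[2])
    pairs = []
    for a, b in combinations(ordered, 2):
        same_airport = a[1] == b[1]
        different_flights = a[0] != b[0]
        if same_airport and different_flights and (b[2] - a[2]) <= max_delta:
            pairs.append((a[0], b[0], b[2] - a[2]))
    return pairs


# Hypothetical example data (identifiers and times are invented).
profiles = [
    ("flight_A", "Frankfurt", datetime(2006, 9, 10, 8, 0)),
    ("flight_B", "Frankfurt", datetime(2006, 9, 10, 8, 9)),
    ("flight_C", "Frankfurt", datetime(2006, 9, 10, 13, 30)),
    ("flight_D", "Toulouse", datetime(2006, 9, 10, 8, 5)),
]
pairs = coincident_pairs(profiles)
```

Here only flights A and B form a coincident pair (9 minutes apart); flight C is too late and flight D is at a different airport. For large archives the quadratic pair search would be replaced by a sorted sliding window, which this sketch omits for brevity.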
IGAC : 26 – 30 September 2016, Breckenridge, CO, USA
MOZAIC-IAGOS consistency
70% for O3 on average, 90% for CO; same consistency over the 20 (12) year period.
IAGOS can be considered as the continuation of MOZAIC with the same data quality of O3 and CO measurements: a single data set to calculate trends from 1994 (2002).
Blot et al., in prep.
Nedelec et al., Tellus-B 2015
93% 81%
delta_t < 3 hours
32 & 55 vertical profiles, over Frankfurt between July 2011 and December 2012
Internal Consistency of RH by MCH & ICH:
Direct Matching in Space and Time
1. Natural variability of H2O already:
• > 20 % over radius = 100 km
• > 10 % over radius = 20 km
• < 1% over radius < 1 km
2. When matching in time and space, H2O internal consistency cannot be assessed on a statistical basis, but only by careful flight-by-flight comparison and by use of trajectory analysis
Source: Offermann et al., JGR, 2002
External Consistency MCH & ICH
CIRRUS-2006: MCH
AIRTOSS-2013: ICH
Agreement of MCH and ICH with reference instruments (FISH, OJSTER, SEALDH) within 5% RHL uncertainty
No bias at transition from MCH to ICH instruments
Neis et al., Tellus 2015; Neis et al., AMT 2015
Research flight intercomparison against Ly(α), on board a Learjet operated by GFD/Enviscope
Implementation IAGOS-QA/QC Concept into Infrastructures of IAGOS & WMO/GAW
• Tasks of the IAGOS-QA/SAC:
1. Watch over SOPs
2. Collect on a regular basis (every 1-2 years) all QA/QC protocols, which should contain all information on performance (pre-, in- and post-flight operation), calibration, and internal and external consistency
3. Every 1-2 years: internal review by instrument PIs
4. Every 5 years: preparation of an assessment report on the performance of each instrument by its PI
5. Review of assessment reports by external experts
• Report to IAGOS-AISBL and alert about possible impact on archived data in the IAGOS Data Base
Each compound measured by an IAGOS instrument needs:
• Standard Operating Procedures (SOPs)
• Transparency and traceability to well-established standards
• Guidelines for storage of its measured data in the IAGOS Data Base
• Regular (≈ yearly) documentation of QA/QC protocols on calibration and consistency
• Regular (≈ every 5 years) assessment reports of the QA/QC documentation