DATA PROCESSING AND ANALYSIS
13 NOV 2018 | SABINE SCHRÖDER, IEK-8
TOOLS FOR DATA ANALYSIS
OUTLINE
1. Motivation
2. Tools and data standards
3. Commands, Interpreter, Programming: When to use what?
4. Tools in command line operators/viewers
5. Tools in interpreted languages
6. Tools in compiled languages
7. Summary
MOTIVATION
TOOLS AND DATA STANDARDS
A chosen data format influences the variety of available tools.
Sticking to data standards enhances the availability of reusable tools.
Keep up to date: Not only data formats develop, but also tools!
[Figure: one standard data format used by tool1, tool2, tool3, tool4, …]
COMMANDS, INTERPRETERS, COMPILED PROGRAMS
• Ease of learning
command line tool:
> print_helloWorld
interpreter:
>>> print("Hello, world!")
programming:
PROGRAM Hello
WRITE(*,*) "Hello, world!"
END PROGRAM Hello
• Getting up and running
command line tool:
• download
interpreter:
• download and installation of interpreter
• learn syntax of interpreter
programming:
• license compiler
• install compiler
• learn about compiler
• learn syntax of programming language
• after programming, compile, then run
• Speed: a productivity vs. performance tradeoff
command line tool:
> ncwa --no_tmp_fl -y max -v tpot test.nc max_tpot.nc
interpreter:
>>> from netCDF4 import Dataset
>>> rootgrp = Dataset("test.nc","r")
>>> rootgrp.variables["tpot"][:,:,:,:].max()
programming:
PROGRAM Maxi
USE netcdf
st=nf90_open("test.nc", NF90_NOWRITE, ncid)
st=nf90_inq_varid(ncid, "tpot", tpotId)
st=nf90_get_var(ncid, tpotId, tpot)
• Scope of use
• Requirements
TOOLS IN COMMAND LINE OPERATORS/VIEWERS
NCDUMP, NCGEN
ncgen: from CDL to
• netCDF-3
• netCDF-4
• C/F77/JAVA program
[Figure labels: Record, Variable, Header]
TOOLS IN COMMAND LINE VIEWERS
NCVIEW
TOOLS IN COMMAND LINE VIEWERS
PANOPLY
TOOLS IN COMMAND LINE OPERATORS NCO (NETCDF OPERATORS)
http://nco.sourceforge.net/
• standalone, command-line programs to
derive new fields
compute statistics
hyperslab
manipulate metadata
regrid
• input:
netCDF, HDF, DAP
flat files
Operators:
ncap2 (arithmetic processor (version 2))
ncatted (attribute editor)
ncbo (binary operator)
ncclimo (climatology generator)
ncecat (ensemble concatenator)
nces (ensemble statistics)
ncflint (file interpolator)
ncks (kitchen sink)
ncpdq (permute dimensions quickly)
ncra (record average)
ncrcat (record concatenator)
ncremap (remapper)
ncrename (renamer)
ncwa (weighted averager)
ncap2
• arithmetically processes netCDF files
• instructions from command line or from file
• treats missing values
• definition of dimensions possible
• can link to the GNU Scientific Library (GSL)
• can create derived fields
example:
ncap2 -v -s 'a=3;b=4;c=sqrt(a^2+b^2)' in.nc
ncatted
append, create, delete, modify, and overwrite attributes
example:
ncatted -a history,global,a,c,'Data version 2.0\n' in.nc
ncbo
four operations:
Addition, Subtraction, Multiplication, Division
example:
ncbo --op_typ=sub 86_0112.nc 85_0112.nc 86m85_0112.nc
ncclimo
• Climatology modes:
annual, monthly, daily
• seasons:
jfm,amj,jas,ond,on,fm,djf,mam,jja,son,ann
• number of timesteps-per-day in output
• automatic filename creation and splitting of output files
example:
ncclimo -C ann -m cism -h h -c caseid -s 1851 -e 1900 -i drc_in -o drc_out
ncecat
create one output file:
• Record Aggregation – NetCDF3
• Group Aggregation – NetCDF4
example:
ncecat -u realization 85_0[1-5].nc 85.nc
nces
gridpoint statistics on variables across
• an ensemble of input-files
• input groups within each file
statistics:
avg     Mean value
sqravg  Square of the mean
avgsqr  Mean of sum of squares
max     Maximum value
min     Minimum value
mabs    Maximum absolute value
mebs    Mean absolute value
mibs    Minimum absolute value
rms     Root-mean-square (normalized by N)
rmssdn  Root-mean-square (normalized by N-1)
sqrt    Square root of the mean
tabs    Sum of absolute values
ttl     Sum of values
example:
nces 85_0[1-5].nc 85.nc
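The statistics listed for nces can be written out for a single grid point. The following is only a sketch in plain Python that follows the one-line descriptions in the list; the function name `ensemble_stats` and the sample values are made up for illustration, and NCO's handling of missing values and weights is not covered.

```python
import math


def ensemble_stats(values):
    """Statistics for one grid point across ensemble members.

    Follows the short descriptions in the list above. Needs at least
    two values (rmssdn divides by N-1) and a non-negative mean (sqrt).
    """
    n = len(values)
    total = sum(values)                      # ttl
    mean = total / n                         # avg
    sum_sq = sum(v * v for v in values)      # sum of squares
    abs_vals = [abs(v) for v in values]
    return {
        "avg": mean,
        "sqravg": mean ** 2,
        "avgsqr": sum_sq / n,
        "max": max(values),
        "min": min(values),
        "mabs": max(abs_vals),
        "mebs": sum(abs_vals) / n,
        "mibs": min(abs_vals),
        "rms": math.sqrt(sum_sq / n),
        "rmssdn": math.sqrt(sum_sq / (n - 1)),
        "sqrt": math.sqrt(mean),
        "tabs": sum(abs_vals),
        "ttl": total,
    }


# Illustrative values only (four hypothetical ensemble members).
stats = ensemble_stats([1.0, -2.0, 3.0, 6.0])
```

Note the difference between sqravg (square of the mean) and avgsqr (mean of the squares); the two are easy to confuse.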
ncflint
linear combination of input files:
• weighted average
• normalized weighted average
• interpolation
example:
ncflint -i time,86 85.nc 87.nc 86.nc
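The interpolation case of ncflint amounts to a linear combination whose two weights follow from the coordinate values. A minimal sketch in plain Python, assuming fields are simple lists of numbers; the helper names `interpolation_weights` and `linear_combination` and the field values are hypothetical and not part of NCO.

```python
def interpolation_weights(x1, x2, target):
    """Weights (w1, w2) so that w1*x1 + w2*x2 == target and w1 + w2 == 1."""
    span = x2 - x1
    if span == 0:
        raise ValueError("the two input coordinates must differ")
    w2 = (target - x1) / span
    w1 = 1.0 - w2
    return w1, w2


def linear_combination(field1, field2, w1, w2):
    """Element-wise w1*field1 + w2*field2."""
    return [w1 * a + w2 * b for a, b in zip(field1, field2)]


# Mirrors the example above: time 86 lies halfway between 85 and 87.
w1, w2 = interpolation_weights(85, 87, 86)
field_86 = linear_combination([10.0, 20.0], [14.0, 30.0], w1, w2)
```

With other weights the same `linear_combination` covers the weighted-average case.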
ncks
file creation and conversion with special features:
• extract
• hyperslab
• multi-slab
• sub-set
• translate
example:
ncks -d time,5 -d lat,,0.0 -d lon,260.0,45.0 -d lev,1000.0 in.nc out.nc
ncpdq
two distinct functions:
• packing (pdq: Pack Data Quietly)
• dimension permutation (pdq: Permute Dimensions Quickly)
example:
packing:
ncpdq in.nc out.nc
dimension permutation:
ncpdq -a lon,-lat in.nc out.nc
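Packing, the first ncpdq function, stores floating-point data as small integers plus two attributes. The sketch below uses the common netCDF convention unpacked = packed * scale_factor + add_offset. How scale and offset are chosen here (midpoint offset, 65534 levels for a 16-bit integer) is an assumption made for illustration and should be checked against the NCO documentation; `pack` and `unpack` are hypothetical helper names.

```python
def pack(values, n_levels=65534):
    """Map floats to integers in [-32767, 32767] (fits a 16-bit short).

    Assumption: offset is the midpoint of the data range and the range is
    divided into n_levels steps.
    """
    lo, hi = min(values), max(values)
    add_offset = 0.5 * (lo + hi)
    # A constant field has no range; any non-zero scale works then.
    scale_factor = (hi - lo) / n_levels if hi > lo else 1.0
    packed = [round((v - add_offset) / scale_factor) for v in values]
    return packed, scale_factor, add_offset


def unpack(packed, scale_factor, add_offset):
    """netCDF convention: unpacked = packed * scale_factor + add_offset."""
    return [p * scale_factor + add_offset for p in packed]


# Illustrative temperatures in kelvin (made-up values).
values = [250.0, 273.15, 310.5]
packed, scale_factor, add_offset = pack(values)
restored = unpack(packed, scale_factor, add_offset)
```

Packing is lossy: each restored value can differ from the original by up to half a scale step.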
ncra
computes statistics of record variables across an arbitrary number of input-files
statistics:
avg     Mean value
sqravg  Square of the mean
avgsqr  Mean of sum of squares
max     Maximum value
min     Minimum value
mabs    Maximum absolute value
mebs    Mean absolute value
mibs    Minimum absolute value
rms     Root-mean-square (normalized by N)
rmssdn  Root-mean-square (normalized by N-1)
sqrt    Square root of the mean
tabs    Sum of absolute values
ttl     Sum of values
example:
ncra -d time,11,13 85.nc 86.nc 87.nc 8512_8602.nc
ncrcat
concatenates record variables across an arbitrary number of input-files
example:
ncrcat 85.nc 86.nc 87.nc 88.nc 89.nc 8589.nc
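The difference between record concatenation (ncrcat) and ensemble concatenation (ncecat) is easiest to see on toy data. In this sketch each "file" is just a Python list of records along the time dimension; the function names and values are illustrative and not NCO code.

```python
def record_concatenate(*files):
    """ncrcat-like: join inputs along the existing record (time) dimension."""
    joined = []
    for records in files:
        joined.extend(records)
    return joined


def ensemble_concatenate(*files):
    """ncecat-like: stack inputs along a new outermost (ensemble) dimension."""
    return [list(records) for records in files]


# Two hypothetical inputs with two time records each.
year_85 = [1.0, 2.0]
year_86 = [3.0, 4.0]

along_time = record_concatenate(year_85, year_86)      # one longer time axis
as_ensemble = ensemble_concatenate(year_85, year_86)   # new dimension of size 2
```

Record concatenation keeps the number of dimensions and lengthens the time axis; ensemble concatenation keeps the time axis and adds a dimension.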
ncremap
remap to a grid specified by a map file (weight file), a destination grid file, or a template file (data file on destination grid)
example:
ncremap -d dst.nc in.nc out.nc
ncrename
renames dimensions, variables, attributes, and groups
example:
ncrename -d lon,longitude -v lon,longitude in.nc
ncwa
computes statistics on variables in a single file over arbitrary dimensions, with options to specify weights, masks, and normalization
statistics:
avg     Mean value
sqravg  Square of the mean
avgsqr  Mean of sum of squares
max     Maximum value
min     Minimum value
mabs    Maximum absolute value
mebs    Mean absolute value
mibs    Minimum absolute value
rms     Root-mean-square (normalized by N)
rmssdn  Root-mean-square (normalized by N-1)
sqrt    Square root of the mean
tabs    Sum of absolute values
ttl     Sum of values
example:
ncwa -y max -v tpot test.nc max_tpot.nc
TOOLS IN COMMAND LINE OPERATORS CDO (CLIMATE DATA OPERATORS)
https://code.mpimet.mpg.de/projects/cdo
• collection of command-line operators to
manipulate
analyse
climate and NWP model data
(more than 600 operators available)
• input:
GRIB 1/2
netCDF 3/4
Collections:
• Information
• File operations
• Selection/Conditional selection
• Comparison
• Modification
• Arithmetic
• Statistical values/Correlation/Regression/EOF
• Interpolation
• Transformation
• Import/Export
• Miscellaneous/NCL
Information:
• Dataset information
• Comparison of two datasets
• Number of parameters/levels/years/months/dates/timesteps/…
• Show standard_names/attributes/levels/date information/…
• Grid description
File operations:
• Copy datasets
• Concatenate datasets
• Replace variables
• Merge datasets
• Split by codenumber/levels/grids/hours/days/…/time selection
• Distribute/collect horizontal grid
Selection/Conditional selection:
• Select/delete fields
• Select parameters/levels/grids
• Select timesteps/hours/days/…
• Select lat-lon-box/index-box
• Select/delete grid cells
• Resample grid
• Use mask file with conditions (ifthen/ifnotthen/ifthenelse/…)
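Conditional selection with a mask (ifthen/ifnotthen/ifthenelse) can be pictured as an element-wise choice. The following plain-Python sketch shows the idea on lists; the `MISSING` marker and the function signatures are assumptions for illustration, not the CDO implementation.

```python
# Hypothetical missing-value marker used only in this sketch.
MISSING = -9.0e33


def ifthen(cond, field):
    """Keep field where the condition is non-zero (and not missing)."""
    return [f if (c != 0 and c != MISSING) else MISSING
            for c, f in zip(cond, field)]


def ifnotthen(cond, field):
    """Keep field where the condition is zero."""
    return [f if c == 0 else MISSING for c, f in zip(cond, field)]


def ifthenelse(cond, field_true, field_false):
    """Take field_true where cond is non-zero, field_false where it is zero."""
    result = []
    for c, t, f in zip(cond, field_true, field_false):
        if c == MISSING:
            result.append(MISSING)
        elif c != 0:
            result.append(t)
        else:
            result.append(f)
    return result


# Illustrative mask and fields (made-up values).
mask = [1.0, 0.0, MISSING]
selected = ifthen(mask, [5.0, 6.0, 7.0])
inverse = ifnotthen(mask, [5.0, 6.0, 7.0])
merged = ifthenelse(mask, [5.0, 6.0, 7.0], [50.0, 60.0, 70.0])
```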
Comparison:
eq   Equal
ne   Not equal
le   Less equal
lt   Less than
ge   Greater equal
gt   Greater than
eqc  Equal constant
nec  Not equal constant
lec  Less equal constant
ltc  Less than constant
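The comparison operators (eq, ne, le, lt, ge, gt and their constant variants such as eqc or lec) produce a mask field: 1 where the comparison holds, 0 where it does not. A small sketch of that behaviour in plain Python, ignoring missing values; `compare_fields` and `compare_constant` are hypothetical helper names.

```python
import operator

# Operator names as in the list above, mapped to Python comparisons.
COMPARISONS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "le": operator.le,
    "lt": operator.lt,
    "ge": operator.ge,
    "gt": operator.gt,
}


def compare_fields(name, field_a, field_b):
    """Element-wise comparison of two fields; result is a 1/0 mask."""
    op = COMPARISONS[name]
    return [1.0 if op(a, b) else 0.0 for a, b in zip(field_a, field_b)]


def compare_constant(name, field, constant):
    """Comparison against a constant; name carries a trailing 'c' (e.g. 'lec')."""
    op = COMPARISONS[name[:-1]]
    return [1.0 if op(a, constant) else 0.0 for a in field]


# Illustrative values only.
mask_gt = compare_fields("gt", [1.0, 5.0, 3.0], [2.0, 4.0, 3.0])
mask_lec = compare_constant("lec", [1.0, 5.0, 3.0], 3.0)
```

Such masks are typically the input for the conditional selection operators.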
Modification:
• Modify metadata, fields or part of a field in a dataset
• Set attributes/date/time bounds/grids/levels/missing value/valid range
• Invert latitudes/levels
• Shift x/y
• Mask regions/boxes
Arithmetic:
• Arithmetically process datasets via expressions or expression scripts
• Operators (abs, sqrt, acos, log10, …)
• Operate on two fields (add/sub/min/…)
• Monthly/multiyear [hourly/daily/monthly/seasonal] arithmetics (add, sub, mul, div)
• Days per month (add, sub, …)
Statistical values/Correlation/Regression/EOF:
• Cumulative values
• Ensemble/field/zonal/meridional/gridbox/vertical/time selection/running/time/hourly/monthly/yearly/seasonal/multiyear statistics
• Correlation in grid/time
• Covariance in grid/time
• Regression
• Trends
• EOF calculations
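As one instance of the running statistics mentioned for CDO, a running mean over a time series can be sketched in a few lines of plain Python. The function name, window length and series are chosen for illustration only; this is not how CDO implements it.

```python
def running_mean(series, window):
    """Mean over each run of `window` consecutive timesteps.

    Returns len(series) - window + 1 values.
    """
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


# Illustrative series of five timesteps, window of three.
smoothed = running_mean([1.0, 2.0, 3.0, 4.0, 5.0], 3)
```

The output is shorter than the input because only complete windows are averaged.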
Interpolation:
• remapping of horizontal fields to a new grid
• interpolation of 3D variables from hybrid model levels to height or pressure levels
• interpolation in time between time steps and years
• linear/bilinear/cubic interpolation
• nearest neighbor/distance-weighted average remapping
Transformation:
• spectral to gridpoint and vice versa
• divergence and vorticity to U and V wind and vice versa
• D and V to velocity potential and stream function
Import/Export:
• import and export data files which cannot be read or written directly with CDO
Miscellaneous/NCL:
• Filtering (band/low/highpass)
• Create pressure/temperature values for hydrostatic atmosphere
• Potential temperature to in-situ temperature (and vice versa)
• Histogram
• Frost/strong wind/strong breeze/strong gale/hurricane days
• GrADS data descriptor file
• ECHAM post processor
TOOLS IN INTERPRETED LANGUAGES NCL (NCAR COMMAND LANGUAGE)
ncl 0> a=addfile("test.nc","r") (also OPeNDAP possible)
TOOLS IN INTERPRETED LANGUAGES NCL
• General NCL routines
• Input/output
• Math and statistics
• Earth Science
• Visualization
General NCL routines:
Array creation, manipulation, query
Group creation, query
List routines
String
System
Type conversion
Variable query, manipulation
Input/output:
File input/output
Printing
Math and statistics:
General applied math
Bootstrap
Cumulative distribution functions
Empirical orthogonal functions
ESMF regridding
Extreme values
Heat stress
Interpolation
Ngmath routines
Random number generators
Regridding
Singular value decomposition
Earth Science:
Climatology
CESM
Crop
Heat-stress
Date
Drought
Lat/lon functions
Metadata/missing values
Meteorology
Oceanography
Visualization:
Graphics routines
Color
Object manipulation
Workstation
TOOLS IN INTERPRETED LANGUAGES NCL
Climatology
TOOLS IN INTERPRETED LANGUAGES
GRADS (GRID ANALYSIS AND DISPLAY SYSTEM)
ga-> sdfopen test.nc
Data Descriptor File entries:
DSET, CHSUB, DTYPE, INDEX, STNMAP, TITLE, UNDEF, UNPACK, FILEHEADER, XYHEADER, XYTRAILER, THEADER, HEADERBYTES, TRAILERBYTES, XVAR, YVAR, ZVAR,
STID, TVAR, TOFFVAR, CACHESIZE, OPTIONS, PDEF, XDEF, YDEF, ZDEF, TDEF, EDEF, VECTORPAIRS, VARS, ENDVARS, ATTRIBUTE METADATA, COMMENTS
TOOLS IN INTERPRETED LANGUAGES (PY)FERRET
yes? use test.nc
(USE is an alias for SET DATA/FORMAT=CDF;
also OPeNDAP possible)
To output a variable in NetCDF:
yes? LIST/FORMAT=CDF variable_name
(If a filename is not specified for writing, Ferret will generate one.)
TOOLS IN INTERPRETED LANGUAGES R
> library(ncdf4)
> ncin <- nc_open('test.nc')
TOOLS IN INTERPRETED LANGUAGES
PYTHON
(HTTP://UNIDATA.GITHUB.IO/NETCDF4-PYTHON/)
>>> from netCDF4 import Dataset
>>> rootgrp = Dataset("test.nc","r")
Functions:
chartostring, date2index, date2num, getlibversion, num2date, stringtoarr, stringtochar
Classes:
CompoundType, Dataset, Dimension, EnumType, Group, MFDataset, MFTime, VLType, Variable
Class CompoundType:
__init__
Class Dataset:
__init__, close, createCompoundType, createDimension, createEnumType, createGroup, createVLType, createVariable, delncattr, filepath, get_variables_by_attributes, ncattrs, renameAttribute, renameDimension, renameGroup, renameVariable, set_always_mask, set_auto_chartostring, set_auto_mask, set_auto_maskandscale, set_auto_scale, set_fill_off, set_fill_on, setncattr, setncattr_string
Class Dimension:
__init__, group, isunlimited
Class EnumType:
__init__
Class Group:
__init__, close, createCompoundType, createDimension, createEnumType, createGroup, createVLType, createVariable, delncattr, ncattrs, renameAttribute, renameDimension, renameGroup, renameVariable, set_always_mask, set_auto_chartostring, set_auto_mask, set_auto_maskandscale, set_auto_scale, set_fill_off, set_fill_on
Class MFDataset:
__init__, createCompoundType, createDimension, createEnumType, createGroup, createVLType, createVariable, delncattr, filepath, get_variables_by_attributes, getncattr, isopen, renameGroup, renameVariable, set_always_mask, set_auto_chartostring, set_auto_mask, set_auto_maskandscale, set_auto_scale, set_fill_off, set_fill_on, setncattr, setncattr_string, setncatts, sync
Class MFTime:
__init__, ncattrs, set_auto_chartostring, set_auto_mask, set_auto_maskandscale, set_auto_scale, typecode
Class VLType:
__init__
Class Variable:
__init__, assignValue, chunking, delncattr, endian, filters, getValue, get_dims, get_var_chunk_cache, set_always_mask, set_auto_chartostring, set_auto_mask, set_auto_maskandscale, set_auto_scale, set_collective, set_var_chunk_cache, setncattr, setncattr_string, setncatts, use_nc_get_vars
The Python Data Stack
TOOLS IN INTERPRETED LANGUAGES PYTHON (EXCURSUS: PANGEO)
The Pangeo Platform
• Foster collaboration around the open source scientific python ecosystem for ocean / atmosphere / land / climate science.
• Support the development of domain-specific geoscience packages.
• Improve scalability of these tools to handle petabyte-scale datasets on HPC and cloud platforms.
TOOLS IN INTERPRETED LANGUAGES IDL (INTERACTIVE DATA LANGUAGE)
IDL> result=ncdf_open("test.nc")
proprietary programming tool from Exelis Visual Information Solutions, Inc., a subsidiary of Harris Corporation
Simplified Interface
NCDF_GET - Retrieve variables and attributes from a NetCDF file.
NCDF_LIST - Print out a list of variables and attributes from a NetCDF file.
NCDF_PUT - Create or modify a NetCDF file.
Creating NetCDF Files
NCDF_CREATE: Call this procedure to begin creating a new file.
The new file is put into define mode.
NCDF_DIMDEF: Create dimensions for the file.
NCDF_VARDEF: Define the variables to be used in the file.
NCDF_ATTPUT: Optionally, use attributes to describe the data.
NCDF_CONTROL, /ENDEF: Call NCDF_CONTROL and set the ENDEF keyword to leave define mode and enter data mode.
NCDF_VARPUT: Write the appropriate data to the netCDF file.
Reading NetCDF Files
NCDF_IS_NCDF: Check if one or more input files are in NetCDF-3 format.
NCDF_OPEN: Open an existing netCDF file.
NCDF_PARSE: Return an ordered hash containing object information and data from a NetCDF-3 file.
NCDF_INQUIRE: Call this function to find the format of the netCDF file.
NCDF_DIMINQ: Retrieve the names and sizes of dimensions in the file.
NCDF_VARINQ: Retrieve the names, types, and sizes of variables in the file.
NCDF_ATTNAME: Optionally, retrieve attribute names.
NCDF_ATTINQ: Optionally, retrieve the types and lengths of attributes.
TOOLS IN COMPILED LANGUAGES
FORTRAN90
(HTTPS://WWW.UNIDATA.UCAR.EDU/SOFTWARE/NETCDF/NETCDF-4/NEWDOCS/NETCDF-F90.HTML)
USE netcdf
st=nf90_open("test.nc", NF90_NOWRITE, ncid)
st=nf90_inq_varid(ncid, "tpot", tpotId)
st=nf90_get_var(ncid, tpotId, tpot)
Topics: Datasets, Groups, Dimensions, User Defined Data Types, Compound Types, Variable Length Array, Opaque Type, Enum Type
Datasets:
NF90_STRERROR, NF90_INQ_LIBVERS, NF90_CREATE, NF90_OPEN, NF90_REDEF, NF90_ENDDEF, NF90_CLOSE, NF90_INQUIRE Family, NF90_SYNC, NF90_ABORT, NF90_SET_FILL
Groups:
NF90_INQ_NCID, NF90_INQ_GRPS, NF90_INQ_VARIDS, NF90_INQ_DIMIDS, NF90_INQ_GRPNAME_LEN, NF90_INQ_GRPNAME, NF90_INQ_GRPNAME_FULL, NF90_INQ_GRP_PARENT, NF90_DEF_GRP
Dimensions:
NF90_DEF_DIM, NF90_INQ_DIMID, NF90_INQUIRE_DIMENSION, NF90_RENAME_DIM
User Defined Data Types:
NF90_INQ_TYPEIDS, NF90_INQ_TYPE, NF90_INQ_USER_TYPE
Compound Types:
NF90_DEF_COMPOUND, NF90_INSERT_COMPOUND, NF90_INSERT_ARRAY_COMPOUND, NF90_INQ_COMPOUND, NF90_INQ_COMPOUND_FIELD
Variable Length Array:
NF90_DEF_VLEN, NF90_INQ_VLEN, NF90_FREE_VLEN
Opaque Type:
NF90_DEF_OPAQUE, NF90_INQ_OPAQUE
Enum Type:
NF90_DEF_ENUM, NF90_INSERT_ENUM, NF90_INQ_ENUM, NF90_INQ_ENUM_MEMBER, NF90_INQ_ENUM_IDENT
Variables:
NF90_DEF_VAR, NF90_DEF_VAR_CHUNKING, NF90_INQ_VAR_CHUNKING, NF90_DEF_VAR_FILL, NF90_INQ_VAR_FILL, NF90_DEF_VAR_DEFLATE, NF90_INQ_VAR_DEFLATE, NF90_DEF_VAR_FLETCHER32, NF90_INQ_VAR_FLETCHER32, NF90_DEF_VAR_ENDIAN, NF90_INQ_VAR_ENDIAN, NF90_INQUIRE_VARIABLE
Attributes:
NF90_PUT_ATT, NF90_INQUIRE_ATTRIBUTE, NF90_GET_ATT, NF90_COPY_ATT, NF90_RENAME_ATT, NF90_DEL_ATT
SUMMARY/OUTLOOK
• No single tool meets all requirements for atmospheric data.
• Don’t use tools blindly!
• Keep up-to-date!
Data format: CSV
Tools/APIs/libs:
• Tools: MS Excel (proprietary)
• a lot of public-domain tools (like csvkit, …) available
• csv: Python module (import csv)
• no special FORTRAN API needed (list-directed input and/or formatted output will do the trick)
Reference: http://docs.python.org/library/csv
Data format: NASA Ames
Tools/APIs/libs:
• nastools: Python module (import nastools)
Reference: https://files.pythonhosted.org/packages/e3/3c/3bbdd20ad05f737e4c4df3d8ac5b8ee7ad172d6fcf16096500e2f5dfb3f1/NAStools-0.1.2.tar.gz
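For the CSV entry, the Python standard-library csv module is often all that is needed. A minimal sketch, assuming a small in-memory table; the column names (time, station, tpot) and all values are invented for illustration.

```python
import csv
import io

# Invented sample data standing in for a CSV file on disk.
SAMPLE = (
    "time,station,tpot\n"
    "2018-11-13T00:00,A,285.2\n"
    "2018-11-13T06:00,A,284.7\n"
    "2018-11-13T12:00,A,287.9\n"
)


def read_column(text, column):
    """Read one numeric column from CSV text using the header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [float(row[column]) for row in reader]


tpot = read_column(SAMPLE, "tpot")
max_tpot = max(tpot)
```

For a real file, `open(path, newline="")` would replace `io.StringIO(text)`.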
Data format: BUFR
Tools/APIs/libs:
• ecCodes: BUFR tools (bufr_count, bufr_dump, bufr_ls, bufr_get, bufr_compare, bufr_copy, bufr_filter)
• ecCodes: Python module (import eccodes)
• ecCodes: F90 module (use eccodes; linking with -leccodes_f90 -leccodes)
Reference: https://confluence.ecmwf.int/display/ECC
Data format: GRIB
Tools/APIs/libs:
• ecCodes: GRIB tools (grib_compare, grib_copy, grib_count, grib_dump, grib_filter, grib_get, grib_get_data, grib_index_build, grib_ls, grib_set, grib_to_netcdf)
• Tools: CDO
• ecCodes: Python module (import eccodes)
• ecCodes: F90 module (use eccodes; linking with -leccodes_f90 -leccodes)
References: https://confluence.ecmwf.int/display/ECC, https://code.mpimet.mpg.de/projects/cdo/