
DATA BASICS

12 NOV 2018 | SABINE SCHRÖDER, IEK-8

KINDS OF DATA

OUTLINE

1. Motivation

2. Data values and operations

3. Array data structures

4. Metadata

5. Digital data formats: human readable vs. binary

6. Example: geo-scientific data and coordinate systems

7. Summary/Outlook

MOTIVATION

DATA VALUES AND OPERATIONS

[Figure sequence with the labels quantitative, qualitative and ordinal, and the number rows 1 2 3 4, 4 3 2 1 and 3 31 17 19]

Type: positive integer

Range: 0 .. ∞

Plausible operations: +(1)

Type: integer

Range: −∞ .. ∞

Plausible operations: +, -, =, !=, <, >


DATA VALUES AND OPERATIONS

DIGITAL REPRESENTATION


Elementary/primitive data types:

• bit/logical/boolean
• byte
• int/integer (short, int, long, INTEGER*, unsigned)
• float/real (real, double, REAL*)
• complex
• char/character
• pointer

Excursus: floating point numbers

• represented with sign, mantissa and exponent, e.g. +3.241592E-27

• not all real numbers are representable (rounding)
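A minimal sketch of the two floating point bullets, assuming IEEE 754 double precision (which standard Python floats use on common platforms). The variable names are illustrative only.

```python
import math
import sys

# A float is stored as sign, mantissa and exponent (base 2 internally).
x = 3.241592e-27
mantissa, exponent = math.frexp(x)      # x == mantissa * 2**exponent
reconstructed = math.ldexp(mantissa, exponent)

# Not all real numbers are representable, so results are rounded.
sum_is_exact = (0.1 + 0.2 == 0.3)       # False for IEEE 754 doubles
machine_epsilon = sys.float_info.epsilon  # gap between 1.0 and the next float

print(mantissa, exponent, reconstructed == x)
print(sum_is_exact, machine_epsilon)
```

The decimal notation +3.241592E-27 is only the printed form; the stored mantissa and exponent are binary.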


Non-primitive data types:

• enumerations

• structures
  - composition/union: elements of different data types
  - collection of elements of the same data type → arrays (special case: string)
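The non-primitive types listed above could look as follows in Python. This is only a sketch; `Season`, `Measurement` and the station name are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Season(Enum):
    """Enumeration: a fixed set of named values."""
    WINTER = 1
    SPRING = 2
    SUMMER = 3
    AUTUMN = 4


@dataclass
class Measurement:
    """Structure: composition of elements of different data types."""
    station: str     # string: special case of an array (of characters)
    value: float
    season: Season


# Array: collection of elements of the same data type.
values: List[float] = [3.0, 31.0, 17.0, 19.0]

m = Measurement(station="hypothetical-station", value=values[0], season=Season.WINTER)
characters = list(m.station)   # a string can be viewed as an array of characters
print(m, len(characters))
```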

ARRAY DATA STRUCTURES

[Figure: arrays of increasing dimension, up to 5D]

Address arithmetic:

$\mathrm{address}(A_{i,j}) = B + (i \cdot n_{col} + j) \cdot l$

B: base address of the array in memory

i: index of row

j: index of column

n_col: number of columns

l: size of one array element
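The address formula can be checked with a small sketch. It assumes row-major storage and zero-based indices; `element_address` and the example numbers are hypothetical, and `itemsize` stands for the element size l in bytes.

```python
def element_address(base: int, i: int, j: int, ncol: int, itemsize: int) -> int:
    """Address of A[i][j] for a row-major 2D array starting at `base`."""
    return base + (i * ncol + j) * itemsize


# Hypothetical example: 4 columns, 8-byte elements, array starts at address 1000.
addr = element_address(base=1000, i=2, j=3, ncol=4, itemsize=8)
print(addr)  # (2 * 4 + 3) * 8 = 88 bytes after the base address
```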


[Figure: swapping out and swapping in between main memory and secondary memory]

inefficient access:

1. A[0][0]

2. A[1][0]

3. A[2][0]

better:

1. A[0][0]

2. A[0][1]

3. A[0][2]

Timing for A(1000,1000,1000):

inefficient: 28.45 s

efficient: 2.85 s
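Why the second access order is better can be seen from the memory offsets it touches. The sketch below assumes row-major storage, as in the address formula; `flat_offsets` is a hypothetical helper, and it does not reproduce the timings quoted above.

```python
def flat_offsets(nrow: int, ncol: int, row_first: bool) -> list:
    """Row-major memory offsets of a 2D array in the order they are visited."""
    if row_first:
        # inner loop over the column index j: A[0][0], A[0][1], A[0][2], ...
        order = [(i, j) for i in range(nrow) for j in range(ncol)]
    else:
        # inner loop over the row index i: A[0][0], A[1][0], A[2][0], ...
        order = [(i, j) for j in range(ncol) for i in range(nrow)]
    return [i * ncol + j for i, j in order]


better = flat_offsets(3, 4, row_first=True)
inefficient = flat_offsets(3, 4, row_first=False)
print(better)       # consecutive offsets: neighbouring memory is reused
print(inefficient)  # jumps of ncol elements between accesses
```

With large arrays the jumps in the second order touch a new memory region on almost every access, which is what makes swapping expensive.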

ARRAY DATA STRUCTURES (EXCURSUS)

[Figure slides; no text recoverable]

METADATA

Information that describes other data

Types of metadata:

• descriptive metadata: standard_name, unit, parameter_measurement_method

• administrative metadata: creation_date, modification_date

• structural metadata: relationships

Answers questions about data like

• What?

• When?

• Where?

• Who?

• How?

• Which?

• (Why?)
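A minimal sketch of how the three types of metadata might sit next to the data they describe. The attribute names come from the list above; all values and the `describe` helper are hypothetical.

```python
record = {
    "data": [271.3, 272.8, 270.9],  # hypothetical values
    "metadata": {
        "descriptive": {
            "standard_name": "air_temperature",
            "unit": "K",
            "parameter_measurement_method": "hypothetical method",
        },
        "administrative": {
            "creation_date": "2018-11-12",
            "modification_date": "2018-11-12",
        },
        "structural": {
            "relationships": ["hypothetical parent dataset"],
        },
    },
}


def describe(rec: dict) -> str:
    """Answer the 'What?' question from the descriptive metadata."""
    d = rec["metadata"]["descriptive"]
    return f"{d['standard_name']} [{d['unit']}]"


print(describe(record))
```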


DIGITAL DATA FORMATS

Human readable (text):

• readable
• intuitively usable
• rarely further information needed
• every system has a free simple editor
• every character (digit) consumes one byte
• can be compressed
• automatic compression usually leads to binary format
• electronic processing needs conversion (parser)
• encrypting should not be intuitively readable

Machine readable (binary):

• not understandable without aids
• not usable without tools or programming
• further information (at least about the format) needed
• tools might not be free or not available for the system
• compact → disk space, bandwidth
• fast processing (but: byte ordering)
• encrypting is easy and NOT human readable
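The difference between the two formats can be sketched with Python's standard `struct` module. The example value is taken from the floating point excursus; storing it as an 8-byte double is an assumption made for illustration.

```python
import struct

value = 3.241592e-27

# Human readable: every character consumes one byte and must be parsed again.
text = repr(value).encode("ascii")
parsed = float(text.decode("ascii"))

# Binary: fixed 8 bytes for a double, but the byte order must be known.
little = struct.pack("<d", value)   # little-endian
big = struct.pack(">d", value)      # big-endian
unpacked = struct.unpack("<d", little)[0]

print(text, len(text))
print(little.hex(), big.hex(), len(little))
```

Reading the little-endian bytes as big-endian would give a different number, which is why further information about the format is needed for binary data.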

EXAMPLE: GEO-SCIENTIFIC DATA AND COORDINATE SYSTEMS

Horizontal coordinate systems:

nx=900, ny=451

longitudes(nx) = 0, 0.4, 0.8, 1.2, ... 358, 358.4, 358.8, 359.2, 359.6 [degrees_east]

latitudes(ny) = -90, -89.6, -89.2, -88.8, ... 88.4, 88.8, 89.2, 89.6, 90 [degrees_north]
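The regular grid above can be generated from its sizes alone. The sketch assumes a constant spacing of 0.4 degrees, which is inferred from the listed values and not stated explicitly.

```python
nx, ny = 900, 451
step = 0.4  # assumed grid spacing in degrees, inferred from the listed values

# round() removes floating point noise such as 1.2000000000000002
longitudes = [round(i * step, 1) for i in range(nx)]          # degrees_east
latitudes = [round(-90.0 + j * step, 1) for j in range(ny)]   # degrees_north

print(longitudes[:4], longitudes[-1])
print(latitudes[:4], latitudes[-1])
```

Only nx, ny, the start values and the spacing need to be stored to describe such a grid; the coordinate arrays themselves are redundant.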

ncells=1310720

center_longitudes(ncells) = -3.141593, … 3.141593 [radian]

center_latitudes(ncells) = -1.568645, ... 1.568645 [radian]

nv=3

longitude_vertices(ncells, nv) = -1.570796, … 1.570796

Vertical coordinate systems:

nlevel=42

level(nlevel) = 1, 3, 14, 25, 36, 47, … 84952, 95898, 100369 [Pa]

$\sigma = \frac{p}{p_s}$

formula: sigma(n,k,j,i) = p(k)/ps(n,j,i)

nx=900 [index:i], ny=451 [index:j]

nlevel=42 [index:k]

ntimes=240 [index:n]

longitudes(nx) = 0, 0.4, 0.8, 1.2, ... 358, 358.4, 358.8, 359.2, 359.6 [degrees_east]

latitudes(ny) = -90, -89.6, -89.2, -88.8, ... 88.4, 88.8, 89.2, 89.6, 90 [degrees_north]

p(nlevel) = 100369, 95898, … 47, 36, 25, 14, 3, 1 [Pa]

times(ntimes) = 2018-03-18 12, 2018-03-19 12, … 2018-11-11 12, 2018-11-12 12 [datetime]

ps(ntimes,ny,nx) = … [Pa]
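A sketch of the sigma formula for a single grid point and time step. Only the pressure levels actually listed are used (the elided middle levels are left out), and the surface pressure `ps_example` is a hypothetical value.

```python
def sigma(p_level: float, ps: float) -> float:
    """sigma = p / ps for one pressure level and one surface pressure."""
    return p_level / ps


# Levels listed in the slide, in Pa; the elided levels in between are omitted.
p = [100369.0, 95898.0, 47.0, 36.0, 25.0, 14.0, 3.0, 1.0]
ps_example = 100369.0  # hypothetical surface pressure in Pa

sigmas = [sigma(pk, ps_example) for pk in p]
print(sigmas)
```

In the full data set ps depends on time, latitude and longitude, so sigma(n,k,j,i) has to be evaluated for every combination of those indices.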


$\sigma = \frac{p}{p_s}$

$p_i = A_i \times p_0 + B_i \times p_s$

formula: p(n,k,j,i) = a(k)*p0 + b(k)*ps(n,j,i)

nx=900 [index:i], ny=451 [index:j]

nlevel=42 [index:k]

ntimes=240 [index:n]

longitudes(nx) = 0, 0.4, 0.8, 1.2, ... 358, 358.4, 358.8, 359.2, 359.6 [degrees_east]

latitudes(ny) = -90, -89.6, -89.2, -88.8, ... 88.4, 88.8, 89.2, 89.6, 90 [degrees_north]

times(ntimes) = 2018-03-18 12, 2018-03-19 12, … 2018-11-11 12, 2018-11-12 12 [datetime]

ps(ntimes,ny,nx) = … [Pa]

p0 = 100000 [Pa]

a(nlevel) = 0, 3.68387e-05, 0.00037, 0.00138, ... [-]

b(nlevel) = 0.99882, 0.99582, 0.99114, ... [-]
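The hybrid formula p = a(k)*p0 + b(k)*ps can be evaluated for the first coefficients listed above. The sketch uses only the first three levels, and the surface pressure `ps_example` is a hypothetical value for one grid point and time step.

```python
P0 = 100000.0  # reference pressure p0 in Pa, as given above

# First three coefficients from the lists above (dimensionless).
a = [0.0, 3.68387e-05, 0.00037]
b = [0.99882, 0.99582, 0.99114]


def hybrid_pressure(a_k: float, b_k: float, ps: float, p0: float = P0) -> float:
    """Pressure on a hybrid level: p = a(k) * p0 + b(k) * ps."""
    return a_k * p0 + b_k * ps


ps_example = 100000.0  # hypothetical surface pressure in Pa
pressures = [hybrid_pressure(a_k, b_k, ps_example) for a_k, b_k in zip(a, b)]
print(pressures)
```

Where a(k) is 0 the level follows the surface pressure like a pure sigma level; the a(k)*p0 term adds a part that does not depend on ps.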



SUMMARY/OUTLOOK

A data type is defined by
- type of value
- type of mathematical, relational or logical operations

Choosing a data type depends on
- representation of relevant real-world aspects
- intended use of the digital data (access time and efficiency)
- available data space

Metadata
- information about other data:
  * descriptive
  * administrative
  * structural

Digital data
- human readable
- binary

How to make sure digital data is interpreted in the right way?

→ data standards (next lecture)
