Kapiel 2.2 Simple Geometry for Computers

(1)

www.martinwerner.de

Kapiel 2.2

Simple Geometry for Computers

(2)

www.martinwerner.de

Computers ?

 An algorithm is an abstract sequence of actions that should be performed in order to solve a specific problem.

 A computer program is a formulation of an algorithm in a form that can be executed by a computer

 A process is a running instance of an program.

A computer is a system able to execute a process.

 Usual notion of computer

 Smartphones, etc.

 Quantum Computers (Annealing and Beyond)

 People as well (Mechanical Turk?)

(3)

www.martinwerner.de

Electronic Compute Devices

 A simple computer consists of

 An Arithmetic Logical Unit (ALU), which is capable of performing calculations between registers

 A set of registers for holding values for calculations as well as

 Memory pointers (Stack Pointer, Instruction Pointer, ...)

 Flags (Overflow, Underflow, Zero, …)

 Some amount of Random Access Memory (RAM) to hold

 Binary programs

 Data

 Some ports to interface to the user

 Keyboard

(4)

www.martinwerner.de

Basic Computer

A very simple computer: Global memory is just a sequence of memory cells (array) of fixed size units (e.g., 64 bits). The CPU loads one instruction after

another for execution by an ALU for execution, which can interface to (hopefully) another area of memory

…

Program Data

Store Load

Register ALU Fetch

Arithmetic-Logical Unit Arithmetics (Compute) Logic (Compare)

Jumps (Control) Load Data

Load & Store Memory

(5)

www.martinwerner.de

Computational Complexity

 A rough estimate of the number of computations needed to perform an algorithm.

 Often only given approximate relative to the size of the input

 Constant Time O(1)

think: the very same set of instructions needs to be executed.

Involved loops have a constant size

 Linear Time O(n)

think: the number of instructions is a multiple of N plus some terms of lower scaling (e.g., some constants)

 Quadratic Time O(n²)

think: we are most likely relating all pairs of elements of the input

(6)

148

www.martinwerner.de

Example (Constant Time)

Assuming that the dimension is fixed, say 2, we compute

− + −

 Two Substractions

 Two Multiplications (Squares)

 One addition

 One Square Root

Assuming that multiplications take 3 units of time (e.g., Fast Multipliers in current processors), additions and subtractions take one one unit of time and square root takes three units of time as well, we would expect

3 ⋅ 1 + 2 ⋅ 3 + 3 = 12

units of time. However, the real time is more complicated (where does the memory come from, how much did the computer already pre-

compute, etc.)

(7)

www.martinwerner.de

C++ Implementation

// Simple Example of Euclidean Distance

#include<math.h>

double distance(double x1, double y1, double x2, double y2) {

return sqrt( (x2-x1)*(x2-x1)+(y2-y1)*(y2-y1));

}

(8)

www.martinwerner.de

Real-World (No Optimizations, Generic x86_64)

distance(double, double, double, double):

push rbp mov rbp, rsp sub rsp, 48

movsd QWORD PTR [rbp-8], xmm0 movsd QWORD PTR [rbp-16], xmm1 movsd QWORD PTR [rbp-24], xmm2 movsd QWORD PTR [rbp-32], xmm3 movsd xmm0, QWORD PTR [rbp-24]

movapd xmm1, xmm0

subsd xmm1, QWORD PTR [rbp-8]

movsd xmm0, QWORD PTR [rbp-24]

mulsd xmm1, xmm0

movapd xmm2, xmm0

mulsd xmm0, xmm2 addsd xmm0, xmm1 call sqrt

movq rax, xmm0

mov QWORD PTR [rbp-40], rax movsd xmm0, QWORD PTR [rbp-40]

leave ret

Each line is a CPU instruction Many more instructions than we computed.

MOV* are memory movements, we did not look into those

CALL sqrt actually calls a function SQRT (though Intel CPUs support sqrt in CPU) Lets change the compiler options from -O0 (exactly as written C++) to

-Ofast (do all you are sure about to make it fast)

(9)

www.martinwerner.de

Real-World (No Optimizations, Generic x86_64)

distance(…):

subsd xmm3, xmm1 subsd xmm2, xmm0 mulsd xmm2, xmm2 movapd xmm0, xmm3 mulsd xmm0, xmm3 addsd xmm2, xmm0 sqrtsd xmm0, xmm2 ret

Okay, pretty much what we expected.

But there is another MOV instruction.

SQRT is done in CPU

(10)

www.martinwerner.de

Real-World (x86_64 SSE / AVX)

distance(…):

subsd xmm3, xmm1 xmm3.low = y2-y1 subsd xmm2, xmm0 xmm2.low = x2-x1 mulsd xmm2, xmm2 xmm2 = (x2-x1)² movapd xmm0, xmm3 xmm0 = y2-y1

mulsd xmm0, xmm3 xmm0 = (y2-y1)²

addsd xmm2, xmm0 xmm2 = sum of (…)

sqrtsd xmm0, xmm2 xmm0 = sqrt (…)

ret

(11)

www.martinwerner.de

Takeaways

 The difference of the two implementation was bound to a very small detail:

 The CPU architecture is 64 bits

 The arguments are 64 bits

 The squares are, therefore, 128 bits, represented with two CPU units.

 SSE allows for this (mmx registers are 128 bit) while CPU needs to copy around two values each time

 Remember: The estimated performance in terms of number of

operations is already overruled by the hardware architecture turning more than 26 instructions (base x86, in addition sqrt function) to 8

(12)

www.martinwerner.de

Remarks

The runtime of the previous example is – in addition – changed by the load of the computer. If there is a spare thread in the core that is actually executing, it is not unlikely that the two substractions are going to be

executed in parallel.

Conclusion: Complexity of Algorithms is a bad vehicle for estimating performance. You need careful real-world implementations for that.

However, it is a better than nothing machinery and helps you to compare two algorithms at least very roughly.

(13)

www.martinwerner.de

Clash of Cultures

This is actually related to a common clash of cultures

 Engineers like real-world thing.

 They feel this was more convincing

 They are right w.r.t computer program performance, as we have seen on the previous slides

 Results are, however, very vague even if public references exist:

how much more effort was put into optimizing the presented case as opposed to the reference.

„My algorithm is better than related work because I took care to optimize it while I just hacked together related work.“

(14)

www.martinwerner.de

Clash of Cultures

This is actually related to a common clash of cultures

 Scientists like idealized things.

 The claims are unquestionably true, supported by proofs

 They might be wrong with details, but on a larger scope, the conclusions are valid

 Their claims might, however, not be realizable or the relation between a claim and a real problem can be vague.

„My algorithm is better than related work because it has a lower theoretical complexity.“

(15)

www.martinwerner.de

Representing Geometry: Points

The basic element of Geometry is a Point.

It is usually expressed as an array-like object such that you can access the element coordinates.

It might as well be represented in object-oriented style

Computationally, the best representation might be an aligned memory block:

X Y Z

(16)

www.martinwerner.de

Lines

Sequences of Points are then concatenations of point arrays into a contiguous memory array

Interleaved: Points stay togehter. Pro: point components are near and the array is extendible. Con: You cannot access only one component efficiently.

X Y Z X Y Z

Interleaved

X X Y Y Z Z

Component-Wise

(17)

www.martinwerner.de

Polygons

Polygons (and many other geometric objects) contain a variable number of elements (e.g., points)

Two abstract strategies can be distinguished:

Modeling the length or Implicit Length Model the length

Delimiter

X Y Z X Y Z X Y Z

3

X Y Z X Y Z X Y Z NaN X Y Z

4 X Y Z

(18)

www.martinwerner.de

Abstraktion

 Im Rest der Vorlesung (außer bei bestimmten Algorithmen) werden wir nicht konkrete Datentypen besprechen

 Anstelle dessen gehen wir davon aus, dass die Datentypen während der Implementierung geeignet gewählt werden.

 Wir schauen also mehr auf die vom Datentyp zu erfüllenden Eigenschaften (Konzepte in C++)

Beispiel: Die beiden Polygonen verhalten sich geometrisch und semantisch identisch. Möchte man allerdings auf das n-te Polygon

zugreifen, erlaubt der erste Ansatz eine Iteration bei der man die Daten schnell überspringt (lese Länge, lese Speicherplatz, wo nächste Länge stehen sollte, …), während man im zweiten Ansatz alle Daten

anschauen muss, um die Begrenzungen zu finden.

(19)

www.martinwerner.de

Kapitel 2.3

OGC Simple Feature Geometry

(20)

www.martinwerner.de

Simple Features (vereinfacht, reduziert)

(21)

www.martinwerner.de

Geometry Class Hierarchy

(22)

www.martinwerner.de

Some Restrictions Made

 Geometry objecs are 0D, 1D, 2D, living all in coordinate spaces of 2D, 3D, or 4D space. Coordinates are named x,y,z,m

 Interpretation bound to coordinate reference systems

 Elements of one Geometry object should be from the same CRS

 Z is typically elevation, m is reseverved for „measurement“

(23)

www.martinwerner.de

What all Geometries support (excerpt)

Some properties like dimensionalities, SRID=Spatial Reference ID, Type

[instantiable type like Point, etc.], Enevlope = Coordinate-Parallel Box containing

(24)

www.martinwerner.de

Some Queries (equality, disjoint, intersection, touching, crossing, containment, overlapping, etc.). The locate functions are for subsetting based on the

measurement value.

(25)

www.martinwerner.de

Some operations for analysis. Buffer enlarges the object a bit, convex hull creates a convex closure, the other four (intersection, union, difference,

(26)

www.martinwerner.de

WKT and WKB

Simple Feature Access standardizes a language to express geometries in compact (e.g., binary) and human-readable (e.g., text) form.

WKB: binary version „Well-Known Binary“

WKT: text version „Well-Known Text“

The following slides give the SFC definition of the classes we have seen in our simple geometry chapter and an example in WKT.

(27)

www.martinwerner.de

Point

A point has up to 4 parameters, where X, Y, Z are spatial coordinates and M is reserved for „measurement“

(28)

www.martinwerner.de

Curve / LINESTRING

SFC defines a single curve called „LineString“ which assumes linear

interpolation as defined for our „Line“. It is closed if start and end point coincide.

It is simple if it does not pass through the same point twice. It is a ring if it is closed and simple.

(29)

www.martinwerner.de

Simple, Closed, Ring…

a) Simple, not closed; b) not simple, not closed c) Simple and closed, hence a ring

(30)

www.martinwerner.de

POLYGON

 One exterior boundary, Zero or more interior boundaries

 Triangle is three vertex polygon without interior boundary

 Exterior boundaries are counter-clockwise, Interior are clockwise

 Constraints

 Topologically closed

 No two rings in the boundaries cross, they are allowed to touch at a point (tangent)

 All rings are non-degenerate (e.g., P = Closure (interior(P))

 Interior is connected

 Exterior is not connected, each compact connected component is called a whole and is bounded by an interior boundary

(31)

www.martinwerner.de

Some valid examples

These polygons are valid SFC polygons.

(32)

www.martinwerner.de

These are not valid (as a single polygon). Why?

(33)

www.martinwerner.de

Too restrictive?

Why those many constraints?

This is a traditional tradeoff explained by some examples:

 Less constraints:

 Positive: The output of many algorithms using polygons is naturally a polygon

 Negative: The input does not have good properties meaning that we need to handle many special cases in algorithms

and vice versa

(34)

www.martinwerner.de

MULTI-Classes

 All geometries have a MULTI* class defined which is basically an array of the geometry

 MULTIPOINT: multiple points

 MULTIPOLYGON: multiple polygons

 MULTILINESTRING: multiple linestring curves

 …

(35)

www.martinwerner.de

MULTIPOLYGONS

The assumptions are similar to the polygon with holes case, see the SFC standard for details. Image shows valid MULTIPOLYGON instances

(36)

www.martinwerner.de

Invalid MULTIPOLYGONs

These are invalid as a single instance of a MULTIPOLYGON with similar

philosophy as explained for polygons: no spikes, no degeneracies, no overlap, touch only in a single point.

(37)

www.martinwerner.de

Kapitel 2.4

Relational Operators on SFC

(38)

www.martinwerner.de

Relational Operator

The topological relation of geometries can be defined for all pairs of

geometries by intuition. However, a more general and, thereby, simpler approach is given by the following construction:

 For each geometry, define

 The interior

 The exterior

 The boundary

We have done so for the presented geometries. We can forget the high- order structure of some boundaries (e.g., we defined boundaries of

polygons to be the union of the rings) and treat those three objects as (infinite) point sets.

(39)

www.martinwerner.de

9 Intersections

Given two geometries, each of them gives rise to those three sets and we can form three intersections. We record only the dimension of the intersection giving

(40)

182

www.martinwerner.de

Example: Polygons (without inner rings)

First, realize that some values are constrained:

The boundaries can intersect in a few points or nowhere (dimension 1 or -1)

The inner can intersect with a two-dimensional intersection (dim=2) or not at all (dim=-1)

Algorithm: Compute the set of points where the rings intersect (by line-line intersection, which we will learn later)

If this is empty, the matrix is defined from this fact

If it contains a single element, the polygons touch, the matrix is known

If it contains more than one element, the interior of the intersection is nonempty and we are done with the matrix

(41)

www.martinwerner.de

Case 1: No Intersection

Interior Boundary Exterior

Interior -1 -1 -1

Boundary -1 -1 -1

Exterior -1 -1 -1

(42)

www.martinwerner.de

Case 2: Touching

Interior -1 -1 2

Boundary -1 0 1

Exterior 2 1 -1

(43)

www.martinwerner.de

Case 3: Intersection

Interior 2 1 2

Boundary 1 0 1

Exterior 2 1 2

(44)

www.martinwerner.de

Defining Relations

 A relation is now defined as a pattern on this matrix

 A pattern consists of the following pattern values (one for each location in the matrix).

 T: the intersection is non-empty, ignore dimension

 F: the intersection is empty

 *: we don‘t care, any value is good

 0,1,2: the value must match exactly

 These patterns are encoded in row-major order (e.g., one row after another) and used in the Relate query as shown on the next slide

(45)

www.martinwerner.de

Example: Overlap

relation = geometry::Relate(A,B,”TTT”)

Read: The interior intersection is nonempty and so is the intersection of one interior with the other exterior and vice versa. The rest, we don’t care.

This relation is also known as “Overlap”.

Interior T * T

Boundary * * *

Exterior T * *

(46)

www.martinwerner.de

Some other relations

Simple Relations

(Topological) Equality T*F**FFF*

Disjoint FF*FF****

Contains T*****FF*

Multiple Pattern Relations

Touches / Meets FT******* or F**T***** or F***T****

Covers T*****FF* or

*T****FF* or

***T**FF* or

****T*FF*

(47)

www.martinwerner.de

A few examples

(48)

www.martinwerner.de

WKT Representations

WKT is formally defined by OGC by giving the BNF specification. In this way, you can create rock-solid parsers from this.

The following examples are taken from the OGC standard document.

They show how to use WKT to represent certain geometries.

(49)

www.martinwerner.de

WKT Examples

(50)

www.martinwerner.de

WKT Examples

(51)

www.martinwerner.de

WKT Examples

(52)

www.martinwerner.de

WKB

 OGC defines as well a binary representation for all the simple

features. Of course, this is more memory-efficient and faster to use in applications.

 For this, an interleaving memory alignment has been chosen.

 Whole object assumption

 If you use a simple feature, you use it completely. You don‘t use parts of it…

(53)

www.martinwerner.de

Kapitel 2.5

Examples of Constructive Operations for SFC

(54)

www.martinwerner.de

Buffer

The Buffer of a Geometry contains the geometry and all points that are within distance d from the Geometry.

(55)

www.martinwerner.de

The convex hull of a geometry is the smallest convex polygon enclosing the geometry.

(56)

www.martinwerner.de

Difference

The difference as defined by OGC: A representation (in this case two polygons) of the area of A that is not a subset of B.

− =

(57)

www.martinwerner.de

Set Operations

Operation A geometric object that…

Intersection Represents the intersection of the point sets of both geometries

Union Represents the union of the point sets of both geometries

Difference Represents the difference of the point sets of both geometries

SymDifference Represents the symmetric difference of the point sets of both geometries

(58)

www.martinwerner.de

Software Coverage

Simple Features are supported by all major GIS systems Relational DBMS with GIS Extension

 PostGIS, MySQL, MS SQL, most major database systems Command Line Tools and Libraries

 GDAL/OGR supports this through SQL extensions to most of the invovled programs

Programming / Scripting Languages (suggestions for starter)

Sprache Library Remark

R sf, sp Functional view

Python Shapely Slow, ideal for learning

C++ Boost::Geometry Fast, Simple & Correct

(59)

www.martinwerner.de

Shapely Game

Implement a software for the following interactive aspects:

- Draw two shapes, define the relation you want (or take from a library) - Draw the result

- Run the operations

- Export the result to WKT (and load in QGIS as a quick test)

(60)

www.martinwerner.de

Kapitel 3:

Geographic Information Systems

(61)

www.martinwerner.de

Our main tool (for learning)

Layers

Browser

Map

Navigation

Tools

Coordinates Projections

(62)

www.martinwerner.de

Layer

A unit of information in map generation (cartography, mapping) Typically, either

 a raster together with a spatial reference

 or a dataset of simple features

A map is composed as the composite of an ordered set of layers.

Typically, a cartograph creates novel layers by geoprocessing, assigns styles, and creates a map which transports the information to a human user.

(63)

www.martinwerner.de

Google Maps Basislayer

(64)

www.martinwerner.de

Google Maps

… + Verkehr

(65)

www.martinwerner.de

Google Maps

…+ Fotos

(66)

www.martinwerner.de

Google Maps

… + …

(67)

www.martinwerner.de

LOD bei CityGML

LOD0

• Höhen

• Gebäudeumrisse

• Straßen

(68)

210

www.martinwerner.de

LOD bei CityGML

LOD0

• Höhen

• Straßen LOD1

• Gebäudehöhen

Abb: T. Kolbe, G. Gröger, L. Plümer. “CityGML–Interoperable access to 3D city models”. In: First

(69)

www.martinwerner.de

LOD bei CityGML

LOD0

• Höhen

• Gebäudehöhen LOD2

• Gebäudestrukturen

(70)

212

www.martinwerner.de

LOD bei CityGML

LOD0

• Höhen

• Gebäudestrukturen LOD3

• Wandstrukturen

• Fenster

Abb: T. Kolbe, G. Gröger, L. Plümer. “CityGML–Interoperable access to 3D city models”. In: First

(71)

www.martinwerner.de

LOD bei CityGML

LOD0

• Höhen

• Gebäudestrukturen LOD3

• Wandstrukturen

• Fenster LOD5

Kapiel 2.2 Simple Geometry for Computers

Kapiel 2.2

Simple Geometry for Computers

distance(…):

subsd xmm3, xmm1 subsd xmm2, xmm0 mulsd xmm2, xmm2 movapd xmm0, xmm3 mulsd xmm0, xmm3 addsd xmm2, xmm0 sqrtsd xmm0, xmm2 ret

distance(…):

subsd xmm3, xmm1 xmm3.low = y2-y1 subsd xmm2, xmm0 xmm2.low = x2-x1 mulsd xmm2, xmm2 xmm2 = (x2-x1)² movapd xmm0, xmm3 xmm0 = y2-y1

mulsd xmm0, xmm3 xmm0 = (y2-y1)²

addsd xmm2, xmm0 xmm2 = sum of (…)

sqrtsd xmm0, xmm2 xmm0 = sqrt (…)

ret

Kapitel 2.3

OGC Simple Feature Geometry

Kapitel 2.4

Relational Operators on SFC

relation = geometry::Relate(A,B,”T*T***T**”)

Kapitel 2.5

Examples of Constructive Operations for SFC

− =

Kapitel 3:

Geographic Information Systems

relation = geometry::Relate(A,B,”TTT”)