www.martinwerner.de
Kapiel 2.2
Simple Geometry for Computers
www.martinwerner.de
Computers ?
An algorithm is an abstract sequence of actions that should be performed in order to solve a specific problem.
A computer program is a formulation of an algorithm in a form that can be executed by a computer
A process is a running instance of an program.
A computer is a system able to execute a process.
Usual notion of computer
Smartphones, etc.
Quantum Computers (Annealing and Beyond)
People as well (Mechanical Turk?)
www.martinwerner.de
Electronic Compute Devices
A simple computer consists of
An Arithmetic Logical Unit (ALU), which is capable of performing calculations between registers
A set of registers for holding values for calculations as well as
Memory pointers (Stack Pointer, Instruction Pointer, ...)
Flags (Overflow, Underflow, Zero, …)
Some amount of Random Access Memory (RAM) to hold
Binary programs
Data
Some ports to interface to the user
Keyboard
www.martinwerner.de
Basic Computer
A very simple computer: Global memory is just a sequence of memory cells (array) of fixed size units (e.g., 64 bits). The CPU loads one instruction after
another for execution by an ALU for execution, which can interface to (hopefully) another area of memory
…
Program Data
Store Load
Register ALU Fetch
Arithmetic-Logical Unit Arithmetics (Compute) Logic (Compare)
Jumps (Control) Load Data
Load & Store Memory
www.martinwerner.de
Computational Complexity
A rough estimate of the number of computations needed to perform an algorithm.
Often only given approximate relative to the size of the input
Constant Time O(1)
think: the very same set of instructions needs to be executed.
Involved loops have a constant size
Linear Time O(n)
think: the number of instructions is a multiple of N plus some terms of lower scaling (e.g., some constants)
Quadratic Time O(n²)
think: we are most likely relating all pairs of elements of the input
148
www.martinwerner.de
Example (Constant Time)
Assuming that the dimension is fixed, say 2, we compute
− + −
Two Substractions
Two Multiplications (Squares)
One addition
One Square Root
Assuming that multiplications take 3 units of time (e.g., Fast Multipliers in current processors), additions and subtractions take one one unit of time and square root takes three units of time as well, we would expect
3 ⋅ 1 + 2 ⋅ 3 + 3 = 12
units of time. However, the real time is more complicated (where does the memory come from, how much did the computer already pre-
compute, etc.)
www.martinwerner.de
C++ Implementation
// Simple Example of Euclidean Distance
#include<math.h>
double distance(double x1, double y1, double x2, double y2) {
return sqrt( (x2-x1)*(x2-x1)+(y2-y1)*(y2-y1));
}
www.martinwerner.de
Real-World (No Optimizations, Generic x86_64)
distance(double, double, double, double):
push rbp mov rbp, rsp sub rsp, 48
movsd QWORD PTR [rbp-8], xmm0 movsd QWORD PTR [rbp-16], xmm1 movsd QWORD PTR [rbp-24], xmm2 movsd QWORD PTR [rbp-32], xmm3 movsd xmm0, QWORD PTR [rbp-24]
movapd xmm1, xmm0
subsd xmm1, QWORD PTR [rbp-8]
movsd xmm0, QWORD PTR [rbp-24]
subsd xmm0, QWORD PTR [rbp-8]
mulsd xmm1, xmm0
movsd xmm0, QWORD PTR [rbp-32]
movapd xmm2, xmm0
subsd xmm2, QWORD PTR [rbp-16]
movsd xmm0, QWORD PTR [rbp-32]
subsd xmm0, QWORD PTR [rbp-16]
mulsd xmm0, xmm2 addsd xmm0, xmm1 call sqrt
movq rax, xmm0
mov QWORD PTR [rbp-40], rax movsd xmm0, QWORD PTR [rbp-40]
leave ret
Each line is a CPU instruction Many more instructions than we computed.
MOV* are memory movements, we did not look into those
CALL sqrt actually calls a function SQRT (though Intel CPUs support sqrt in CPU) Lets change the compiler options from -O0 (exactly as written C++) to
-Ofast (do all you are sure about to make it fast)
www.martinwerner.de
Real-World (No Optimizations, Generic x86_64)
distance(…):
subsd xmm3, xmm1 subsd xmm2, xmm0 mulsd xmm2, xmm2 movapd xmm0, xmm3 mulsd xmm0, xmm3 addsd xmm2, xmm0 sqrtsd xmm0, xmm2 ret
Okay, pretty much what we expected.
But there is another MOV instruction.
SQRT is done in CPU
www.martinwerner.de
Real-World (x86_64 SSE / AVX)
distance(…):
subsd xmm3, xmm1 xmm3.low = y2-y1 subsd xmm2, xmm0 xmm2.low = x2-x1 mulsd xmm2, xmm2 xmm2 = (x2-x1)² movapd xmm0, xmm3 xmm0 = y2-y1
mulsd xmm0, xmm3 xmm0 = (y2-y1)²
addsd xmm2, xmm0 xmm2 = sum of (…)
sqrtsd xmm0, xmm2 xmm0 = sqrt (…)
ret
www.martinwerner.de
Takeaways
The difference of the two implementation was bound to a very small detail:
The CPU architecture is 64 bits
The arguments are 64 bits
The squares are, therefore, 128 bits, represented with two CPU units.
SSE allows for this (mmx registers are 128 bit) while CPU needs to copy around two values each time
Remember: The estimated performance in terms of number of
operations is already overruled by the hardware architecture turning more than 26 instructions (base x86, in addition sqrt function) to 8
www.martinwerner.de
Remarks
The runtime of the previous example is – in addition – changed by the load of the computer. If there is a spare thread in the core that is actually executing, it is not unlikely that the two substractions are going to be
executed in parallel.
Conclusion: Complexity of Algorithms is a bad vehicle for estimating performance. You need careful real-world implementations for that.
However, it is a better than nothing machinery and helps you to compare two algorithms at least very roughly.
www.martinwerner.de
Clash of Cultures
This is actually related to a common clash of cultures
Engineers like real-world thing.
They feel this was more convincing
They are right w.r.t computer program performance, as we have seen on the previous slides
Results are, however, very vague even if public references exist:
how much more effort was put into optimizing the presented case as opposed to the reference.
„My algorithm is better than related work because I took care to optimize it while I just hacked together related work.“
www.martinwerner.de
Clash of Cultures
This is actually related to a common clash of cultures
Scientists like idealized things.
The claims are unquestionably true, supported by proofs
They might be wrong with details, but on a larger scope, the conclusions are valid
Their claims might, however, not be realizable or the relation between a claim and a real problem can be vague.
„My algorithm is better than related work because it has a lower theoretical complexity.“
www.martinwerner.de
Representing Geometry: Points
The basic element of Geometry is a Point.
It is usually expressed as an array-like object such that you can access the element coordinates.
It might as well be represented in object-oriented style
Computationally, the best representation might be an aligned memory block:
X Y Z
www.martinwerner.de
Lines
Sequences of Points are then concatenations of point arrays into a contiguous memory array
Interleaved: Points stay togehter. Pro: point components are near and the array is extendible. Con: You cannot access only one component efficiently.
X Y Z X Y Z
Interleaved
X X Y Y Z Z
Component-Wise
www.martinwerner.de
Polygons
Polygons (and many other geometric objects) contain a variable number of elements (e.g., points)
Two abstract strategies can be distinguished:
Modeling the length or Implicit Length Model the length
Delimiter
X Y Z X Y Z X Y Z
3
X Y Z X Y Z X Y Z NaN X Y Z
4 X Y Z
www.martinwerner.de
Abstraktion
Im Rest der Vorlesung (außer bei bestimmten Algorithmen) werden wir nicht konkrete Datentypen besprechen
Anstelle dessen gehen wir davon aus, dass die Datentypen während der Implementierung geeignet gewählt werden.
Wir schauen also mehr auf die vom Datentyp zu erfüllenden Eigenschaften (Konzepte in C++)
Beispiel: Die beiden Polygonen verhalten sich geometrisch und semantisch identisch. Möchte man allerdings auf das n-te Polygon
zugreifen, erlaubt der erste Ansatz eine Iteration bei der man die Daten schnell überspringt (lese Länge, lese Speicherplatz, wo nächste Länge stehen sollte, …), während man im zweiten Ansatz alle Daten
anschauen muss, um die Begrenzungen zu finden.
www.martinwerner.de
Kapitel 2.3
OGC Simple Feature Geometry
www.martinwerner.de
Simple Features (vereinfacht, reduziert)
www.martinwerner.de
Geometry Class Hierarchy
www.martinwerner.de
Some Restrictions Made
Geometry objecs are 0D, 1D, 2D, living all in coordinate spaces of 2D, 3D, or 4D space. Coordinates are named x,y,z,m
Interpretation bound to coordinate reference systems
Elements of one Geometry object should be from the same CRS
Z is typically elevation, m is reseverved for „measurement“
www.martinwerner.de
What all Geometries support (excerpt)
Some properties like dimensionalities, SRID=Spatial Reference ID, Type
[instantiable type like Point, etc.], Enevlope = Coordinate-Parallel Box containing
www.martinwerner.de
What all Geometries support (excerpt)
Some Queries (equality, disjoint, intersection, touching, crossing, containment, overlapping, etc.). The locate functions are for subsetting based on the
measurement value.
www.martinwerner.de
What all Geometries support (excerpt)
Some operations for analysis. Buffer enlarges the object a bit, convex hull creates a convex closure, the other four (intersection, union, difference,
www.martinwerner.de
WKT and WKB
Simple Feature Access standardizes a language to express geometries in compact (e.g., binary) and human-readable (e.g., text) form.
WKB: binary version „Well-Known Binary“
WKT: text version „Well-Known Text“
The following slides give the SFC definition of the classes we have seen in our simple geometry chapter and an example in WKT.
www.martinwerner.de
Point
A point has up to 4 parameters, where X, Y, Z are spatial coordinates and M is reserved for „measurement“
www.martinwerner.de
Curve / LINESTRING
SFC defines a single curve called „LineString“ which assumes linear
interpolation as defined for our „Line“. It is closed if start and end point coincide.
It is simple if it does not pass through the same point twice. It is a ring if it is closed and simple.
www.martinwerner.de
Simple, Closed, Ring…
a) Simple, not closed; b) not simple, not closed c) Simple and closed, hence a ring
www.martinwerner.de
POLYGON
One exterior boundary, Zero or more interior boundaries
Triangle is three vertex polygon without interior boundary
Exterior boundaries are counter-clockwise, Interior are clockwise
Constraints
Topologically closed
No two rings in the boundaries cross, they are allowed to touch at a point (tangent)
All rings are non-degenerate (e.g., P = Closure (interior(P))
Interior is connected
Exterior is not connected, each compact connected component is called a whole and is bounded by an interior boundary
www.martinwerner.de
Some valid examples
These polygons are valid SFC polygons.
www.martinwerner.de
These are not valid (as a single polygon). Why?
www.martinwerner.de
Too restrictive?
Why those many constraints?
This is a traditional tradeoff explained by some examples:
Less constraints:
Positive: The output of many algorithms using polygons is naturally a polygon
Negative: The input does not have good properties meaning that we need to handle many special cases in algorithms
and vice versa
www.martinwerner.de
MULTI-Classes
All geometries have a MULTI* class defined which is basically an array of the geometry
MULTIPOINT: multiple points
MULTIPOLYGON: multiple polygons
MULTILINESTRING: multiple linestring curves
…
www.martinwerner.de
MULTIPOLYGONS
The assumptions are similar to the polygon with holes case, see the SFC standard for details. Image shows valid MULTIPOLYGON instances
www.martinwerner.de
Invalid MULTIPOLYGONs
These are invalid as a single instance of a MULTIPOLYGON with similar
philosophy as explained for polygons: no spikes, no degeneracies, no overlap, touch only in a single point.
www.martinwerner.de
Kapitel 2.4
Relational Operators on SFC
www.martinwerner.de
Relational Operator
The topological relation of geometries can be defined for all pairs of
geometries by intuition. However, a more general and, thereby, simpler approach is given by the following construction:
For each geometry, define
The interior
The exterior
The boundary
We have done so for the presented geometries. We can forget the high- order structure of some boundaries (e.g., we defined boundaries of
polygons to be the union of the rings) and treat those three objects as (infinite) point sets.
www.martinwerner.de
9 Intersections
Given two geometries, each of them gives rise to those three sets and we can form three intersections. We record only the dimension of the intersection giving
182
www.martinwerner.de
Example: Polygons (without inner rings)
First, realize that some values are constrained:
The boundaries can intersect in a few points or nowhere (dimension 1 or -1)
The inner can intersect with a two-dimensional intersection (dim=2) or not at all (dim=-1)
Algorithm: Compute the set of points where the rings intersect (by line-line intersection, which we will learn later)
If this is empty, the matrix is defined from this fact
If it contains a single element, the polygons touch, the matrix is known
If it contains more than one element, the interior of the intersection is non- empty and we are done with the matrix
www.martinwerner.de
Case 1: No Intersection
Interior Boundary Exterior
Interior -1 -1 -1
Boundary -1 -1 -1
Exterior -1 -1 -1
www.martinwerner.de
Case 2: Touching
Interior Boundary Exterior
Interior -1 -1 2
Boundary -1 0 1
Exterior 2 1 -1
www.martinwerner.de
Case 3: Intersection
Interior Boundary Exterior
Interior 2 1 2
Boundary 1 0 1
Exterior 2 1 2
www.martinwerner.de
Defining Relations
A relation is now defined as a pattern on this matrix
A pattern consists of the following pattern values (one for each location in the matrix).
T: the intersection is non-empty, ignore dimension
F: the intersection is empty
*: we don‘t care, any value is good
0,1,2: the value must match exactly
These patterns are encoded in row-major order (e.g., one row after another) and used in the Relate query as shown on the next slide
www.martinwerner.de
Example: Overlap
relation = geometry::Relate(A,B,”T*T***T**”)
Read: The interior intersection is nonempty and so is the intersection of one interior with the other exterior and vice versa. The rest, we don’t care.
This relation is also known as “Overlap”.
Interior Boundary Exterior
Interior T * T
Boundary * * *
Exterior T * *
www.martinwerner.de
Some other relations
Simple Relations
(Topological) Equality T*F**FFF*
Disjoint FF*FF****
Contains T*****FF*
Multiple Pattern Relations
Touches / Meets FT******* or F**T***** or F***T****
Covers T*****FF* or
*T****FF* or
***T**FF* or
****T*FF*
www.martinwerner.de
A few examples
www.martinwerner.de
WKT Representations
WKT is formally defined by OGC by giving the BNF specification. In this way, you can create rock-solid parsers from this.
The following examples are taken from the OGC standard document.
They show how to use WKT to represent certain geometries.
www.martinwerner.de
WKT Examples
www.martinwerner.de
WKT Examples
www.martinwerner.de
WKT Examples
www.martinwerner.de
WKB
OGC defines as well a binary representation for all the simple
features. Of course, this is more memory-efficient and faster to use in applications.
For this, an interleaving memory alignment has been chosen.
Whole object assumption
If you use a simple feature, you use it completely. You don‘t use parts of it…
www.martinwerner.de
Kapitel 2.5
Examples of Constructive Operations for SFC
www.martinwerner.de
Buffer
The Buffer of a Geometry contains the geometry and all points that are within distance d from the Geometry.
www.martinwerner.de
The convex hull of a geometry is the smallest convex polygon enclosing the geometry.
www.martinwerner.de
Difference
The difference as defined by OGC: A representation (in this case two polygons) of the area of A that is not a subset of B.
− =
www.martinwerner.de
Set Operations
Operation A geometric object that…
Intersection Represents the intersection of the point sets of both geometries
Union Represents the union of the point sets of both geometries
Difference Represents the difference of the point sets of both geometries
SymDifference Represents the symmetric difference of the point sets of both geometries
www.martinwerner.de
Software Coverage
Simple Features are supported by all major GIS systems Relational DBMS with GIS Extension
PostGIS, MySQL, MS SQL, most major database systems Command Line Tools and Libraries
GDAL/OGR supports this through SQL extensions to most of the invovled programs
Programming / Scripting Languages (suggestions for starter)
Sprache Library Remark
R sf, sp Functional view
Python Shapely Slow, ideal for learning
C++ Boost::Geometry Fast, Simple & Correct
www.martinwerner.de
Shapely Game
Implement a software for the following interactive aspects:
- Draw two shapes, define the relation you want (or take from a library) - Draw the result
- Run the operations
- Export the result to WKT (and load in QGIS as a quick test)
www.martinwerner.de
Kapitel 3:
Geographic Information Systems
www.martinwerner.de
Our main tool (for learning)
Layers
Browser
Map
Navigation
Tools
Coordinates Projections
www.martinwerner.de
Layer
A unit of information in map generation (cartography, mapping) Typically, either
a raster together with a spatial reference
or a dataset of simple features
A map is composed as the composite of an ordered set of layers.
Typically, a cartograph creates novel layers by geoprocessing, assigns styles, and creates a map which transports the information to a human user.
www.martinwerner.de
Google Maps Basislayer
www.martinwerner.de
Google Maps
… + Verkehr
www.martinwerner.de
Google Maps
…+ Fotos
www.martinwerner.de
Google Maps
… + …
www.martinwerner.de
LOD bei CityGML
LOD0
• Höhen
• Gebäudeumrisse
• Straßen
210
www.martinwerner.de
LOD bei CityGML
LOD0
• Höhen
• Gebäudeumrisse
• Straßen LOD1
• Gebäudehöhen
Abb: T. Kolbe, G. Gröger, L. Plümer. “CityGML–Interoperable access to 3D city models”. In: First
www.martinwerner.de
LOD bei CityGML
LOD0
• Höhen
• Gebäudeumrisse
• Straßen LOD1
• Gebäudehöhen LOD2
• Gebäudestrukturen
212
www.martinwerner.de
LOD bei CityGML
LOD0
• Höhen
• Gebäudeumrisse
• Straßen LOD1
• Gebäudehöhen LOD2
• Gebäudestrukturen LOD3
• Wandstrukturen
• Fenster
Abb: T. Kolbe, G. Gröger, L. Plümer. “CityGML–Interoperable access to 3D city models”. In: First
www.martinwerner.de
LOD bei CityGML
LOD0
• Höhen
• Gebäudeumrisse
• Straßen LOD1
• Gebäudehöhen LOD2
• Gebäudestrukturen LOD3
• Wandstrukturen
• Fenster LOD5