
(1)

Peer-to-Peer

Data Management

Hans-Dieter Ehrich

Institut für Informationssysteme

Technische Universität Braunschweig

http://www.ifis.cs.tu-bs.de

(2)

8. Structured P2P Networks

The transparencies of this chapter are based on the package

Structured Peer-to-Peer Networks by

Wolf-Tilo Balke and Wolf Siberski 31.10.2007

● Original slides partially provided by

K. Wehrle, S. Götz, S. Rieche

(University of Tübingen)

(3)

8. Structured P2P Networks

1. Distributed Management and Retrieval of Data
   1. Comparison of strategies for data retrieval
   2. Central server
   3. Flooding search
   4. Distributed indexing
   5. Comparison of lookup concepts
2. Fundamentals of Distributed Hash Tables
   1. Distributed management of data
   2. Addressing in Distributed Hash Tables
   3. Routing
   4. Data Storage
3. DHT Mechanisms
   1. Node Arrival
   2. Node Failure / Departure
4. DHT Interfaces
5. Example: Chord

(4)

Distributed Management and Retrieval of Data

Essential challenge in (most) Peer-to-Peer systems:

Location of data items distributed among the participating systems

 Where shall the item be stored by the provider?

 How does a requester find the actual location of an item?

Scalability: keep the complexity for communication and storage scalable

Robustness and resilience in case of faults and frequent changes

[Figure: Distributed system of nodes (peer-to-peer.info, planet-lab.org, berkeley.edu, …); the provider asks "I have item 'D'. Where shall I place 'D'?", the requester asks "I want item 'D'. Where can I find 'D'?"]

(5)

Comparison of Strategies for Data Retrieval

Strategies to store and retrieve data items in distributed systems

Central server

Flooding search

Distributed indexing

[Figure: Same scenario as before, a distributed system of nodes with a provider asking where to place item "D" and a requester asking where to find it]

(6)

Transmission: D Node B

“Where is D ?”

“A stores D”

Node A Node B

Server S

“A stores D”

“A stores D”

Approach I: Central Server

● Simple strategy: Central Server

Server stores information about locations

 Node A (provider) tells server that it stores item D

 Node B (requester) asks server S for the location of D

 Server S tells B that node A stores item D

 Node B requests item D from node A
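To make the protocol above concrete, here is a minimal Python sketch (all names such as CentralIndexServer are illustrative, not from any real system): the server keeps only the key-to-provider mapping, while the content itself stays on the provider nodes.

```python
# Minimal sketch of the central-server strategy (illustrative names).
# The server stores only *locations*; the items stay on the provider nodes.

class CentralIndexServer:
    def __init__(self):
        self.index = {}                 # item key -> provider node, O(N) state

    def publish(self, key, provider):
        self.index[key] = provider      # node A: "I store item D"

    def lookup(self, key):
        return self.index.get(key)      # node B: "Where is D?" -> "A stores D"

class Node:
    def __init__(self, address, server):
        self.address = address
        self.items = {}                 # locally stored content
        self.server = server

    def store(self, key, data):
        self.items[key] = data
        self.server.publish(key, self)  # register the location at the server

    def request(self, key):
        provider = self.server.lookup(key)      # O(1): "just ask the server"
        return provider.items.get(key) if provider else None

server = CentralIndexServer()
a, b = Node("nodeA", server), Node("nodeB", server)
a.store("D", "content of D")
print(b.request("D"))                   # -> "content of D"
```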

(7)

Approach I: Central Server

Advantages

Search complexity of O(1) – “just ask the server”

Complex and fuzzy queries are possible

Simple and fast

Problems

No scalability

 O(N) node state in the server

 O(N) network and system load on the server

Single point of failure or attack (also a target for lawsuits ;-))

Non-linearly increasing implementation and maintenance costs

(in particular for achieving high availability and scalability)

Central server not suitable for systems with massive numbers of users

But overall, …

Best principle for small and simple applications!

(8)

Approach II: Flooding Search

Fully Distributed Approach

Central systems are vulnerable and do not scale

Unstructured Peer-to-Peer systems follow opposite approach

No information about the location of content

Content is stored only at the node providing it

Retrieval of data

No routing information for content

Necessity to ask as many systems as possible / necessary

Approaches

 Flooding: high traffic load on the network, does not scale

 Highest-degree search: quick search through large areas, but a large number of messages is needed for unique identification

(9)

& Transmission: D  Node B

“I have D ?”

“B searchesD”

Node B

“I store D”

 Fully Decentralized Approach: Flooding Search

 No information about location of data in the intermediate systems

 Necessity for broad search

 Node B (requester) asks neighboring nodes for item D

- Nodes forward request to further nodes (breadth-first search / flooding)

 Node A (provider of item D) sends D to requesting node B

 

Approach II: Flooding Search
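A minimal sketch of flooding search over an unstructured overlay, assuming a simple in-memory graph of Peer objects (illustrative names); the TTL bound is what keeps the otherwise unbounded message load in check.

```python
# Sketch of flooding (breadth-first) search in an unstructured overlay.
# Peers hold only their own content and know only their neighbors.
from collections import deque

class Peer:
    def __init__(self, name):
        self.name = name
        self.neighbors = []
        self.items = {}                 # content stored locally only

def flood_search(start, key, ttl=4):
    """Breadth-first search from 'start', at most 'ttl' hops."""
    seen = {start}
    queue = deque([(start, 0)])
    messages = 0
    while queue:
        peer, hops = queue.popleft()
        if key in peer.items:           # provider found -> would send item back
            return peer, messages
        if hops == ttl:
            continue
        for n in peer.neighbors:        # forward the query to all neighbors
            if n not in seen:
                seen.add(n)
                messages += 1
                queue.append((n, hops + 1))
    return None, messages               # possible false negative if ttl too small

a, b, c = Peer("A"), Peer("B"), Peer("C")
a.neighbors, b.neighbors, c.neighbors = [b], [a, c], [b]
c.items["D"] = "content of D"
provider, cost = flood_search(a, "D")
print(provider.name, cost)              # -> C 2
```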

(10)

Motivation Distributed Indexing – I

● Communication overhead vs. node state

[Figure: Plot of communication overhead vs. node state. Flooding sits at O(1) node state but O(N) communication overhead (bottleneck: communication overhead; false negatives). A central server sits at O(N) node state but O(1) communication overhead (bottlenecks: memory, CPU, network; availability). Is there a scalable solution between both extremes, around O(log N) / O(log N)?]

(11)

Motivation Distributed Indexing – II

● Communication overhead vs. node state

[Figure: The same plot, now with the Distributed Hash Table placed between both extremes at O(log N) node state and O(log N) communication overhead]

 Scalability: O(log N)

 No false negatives

 Resistant against changes

(failures, attacks, short-time users)

(12)

Distributed Indexing

Goal is scalable complexity for

Communication effort: O(log(N)) hops

Node state: O(log(N)) routing entries

[Figure: The query for H(„my data“) = 3107 is routed in O(log(N)) steps through the overlay of nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485 to the node storing the data; each node keeps O(log(N)) routing entries to other nodes]

(13)

Distributed Indexing

Approach of distributed indexing schemes

Data and nodes are mapped into same address space

Intermediate nodes maintain routing information to target nodes

 Efficient forwarding to „destination“ (content – not location)

 Definitive statement of existence of content

Problems

Maintenance of routing information required

Fuzzy queries not primarily supported (e.g., wildcard searches)

[Figure: Two copies of the overlay ring (nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485) illustrating lookups for H(„my data“) = 3107]

(14)

8. Structured P2P Networks

1. Distributed Management and Retrieval of Data
   1. Comparison of strategies for data retrieval
   2. Central server
   3. Flooding search
   4. Distributed indexing
   5. Comparison of lookup concepts
2. Fundamentals of Distributed Hash Tables
   1. Distributed management of data
   2. Addressing in Distributed Hash Tables
   3. Routing
   4. Data Storage
3. DHT Mechanisms
   1. Node Arrival
   2. Node Failure / Departure
4. DHT Interfaces
5. Example: Chord

(15)

Fundamentals of Distributed Hash Tables I

● Characteristics of Hash Tables

Basic idea: keys are mapped via a common function to smaller fingerprints (hashes)

 Every number defines a position in an array (bucket)

 Keys mapped onto the same hash are put into the same bucket

Look-up works by hashing the query and searching the respective bucket

Hash Function

 A poor choice leads to clustering, i.e., the probability of keys mapping to the same hash bucket (collision) is high and performance degrades

 Good choices should be easy to compute, result in few collisions, and show a uniform distribution of hash values

Hash Tables

 provide constant-time O(1) lookup on average, regardless of the number of items in the table
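As a small illustration of the bucket idea, a sketch using Python's built-in hash() as the hash function: keys are mapped to a fixed number of buckets, and a lookup searches only the bucket selected by the hash.

```python
# Sketch of a bucketed hash table: hash(key) selects the bucket,
# lookup searches only that bucket -> O(1) on average.

class HashTable:
    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:                # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))     # collision: same bucket, extra entry

    def get(self, key):
        for k, v in self._bucket(key):  # search only the responsible bucket
            if k == key:
                return v
        return None

table = HashTable()
table.put("my data", "134.2.11.68")
print(table.get("my data"))             # -> 134.2.11.68
```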

(16)

Fundamentals of Distributed Hash Tables II

● Challenges for designing Distributed Hash Tables

Desired Characteristics

 Flexibility

 Reliability

 Scalability

Equal distribution of content among nodes

 Crucial for efficient lookup of content

Permanent adaptation to faults, joins, exits of nodes

 Assignment of responsibilities to new nodes

 Re-assignment and re-distribution of responsibilities

in case of node failure or departure

(17)

Distributed Management of Data

Sequence of operations

1. Mapping of nodes and data into same address space

► Peers and content are addressed using flat identifiers (IDs)

► Common address space for data and nodes

► Nodes are responsible for data in certain parts of the address space

► Association of data to nodes may change since nodes may disappear

2. Storing / Looking up data in the DHT

► Search for data = routing to the responsible node

 Responsible node not necessarily known in advance

 Deterministic statement about availability of data

(18)

Addressing in Distributed Hash Tables

● Step 1: Mapping of content/nodes into a linear space

Usually: 0, …, 2^m − 1 >> number of objects to be stored

Mapping of data and nodes into an address space (with a hash function)

 E.g., Hash(String) mod 2^m: H(„my data“) → 2313

Association of parts of the address space to DHT nodes

Often, the address space is viewed as a circle.

[Figure: Circular address space 0 … 2^m − 1 divided into ranges such as 611–709, 1008–1621, 1622–2010, 2011–2206, 2207–2905, 2906–3484, 3485–610, each assigned to a DHT node; H(Node X) = 2906, H(Node Y) = 3485, data item "D" with H("D") = 3107]
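The mapping of this step can be sketched as follows (illustrative helper names, SHA-1 assumed as the hash function): node addresses and data keys are hashed into the same m-bit space, and the responsible node is the first node ID at or after the data ID on the circle.

```python
# Sketch of step 1: map nodes and data into the same 2^m address space
# and assign each data ID to the responsible node on the circle.
import hashlib

M = 12                                   # small identifier space: 0 .. 2^12 - 1

def h(text, m=M):
    """Hash a string into the m-bit identifier space."""
    digest = hashlib.sha1(text.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def responsible_node(data_id, node_ids):
    """First node ID at or after data_id, wrapping around the circle."""
    candidates = sorted(node_ids)
    for node_id in candidates:
        if node_id >= data_id:
            return node_id
    return candidates[0]                 # wrap-around: smallest node ID

nodes = {h(addr): addr for addr in
         ["peer-to-peer.info", "planet-lab.org", "berkeley.edu"]}
data_id = h("my data")
owner = responsible_node(data_id, nodes.keys())
print(data_id, "->", owner, nodes[owner])
```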

(19)

Association of Address Space with Nodes

● Each node is responsible for part of the value range

Often with redundancy (overlapping of parts)

Continuous adaptation

Real (underlay) and logical (overlay) topology are (mostly) uncorrelated

Logical view of the Distributed Hash Table vs. mapping onto the real (underlay) topology

[Figure: Overlay ring of nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485 mapped onto the underlying physical network]

Node 3485 is responsible for data items in the range 2907 to 3485

(in case of a Chord DHT)

(20)

Step 2: Routing to a Data Item

● Step 2:

Locating the data (content-based routing)

● Goal: Small and scalable effort

► O(1) with centralized hash table

 But:

Management of a centralized hash table is very costly (server!)

► Minimum overhead with distributed hash tables

 O(log N): DHT hops to locate object

 O(log N): number of keys and routing information per node (N = # nodes)

(21)

Step 2: Routing to a Data Item

● Routing to a K/V-pair

Start the lookup at an arbitrary node of the DHT

Route to the requested data item (key)

[Figure: The lookup for key H(„my data“) = 3107 starts at an arbitrary initial node and is routed along the overlay (611, 709, 1008, 1622, 2011, 2207, 2906, 3485) to node 3485, which manages keys 2907–3485 and holds the pair (3107, (ip, port)); the value is a pointer to the location of the data]

(22)

Step 2: Routing to a Data Item

● Getting the content

K/V-pair is delivered to the requester

Requester analyzes the K/V-tuple

(and downloads the data from its actual location – in case of indirect storage)

[Figure: For H(„my data“) = 3107, node 3485 sends (3107, (ip, port)) to the requester; in case of indirect storage, the requester then fetches the data from its actual location via Get_Data(ip, port)]

(23)

Association of Data with IDs – Direct Storage

● How is content stored on the nodes?

Example: H("my data") = 3107 is mapped into the DHT address space

● Direct storage

Content is stored on the node responsible for H("my data")

 Inflexible for large content – o.k. if the amount of data is small (< 1 kB)

[Figure: Data item D with H_SHA-1(„D“) = 3107 is stored directly on the responsible node (134.2.11.68) of the ring]

(24)

Association of Data with IDs – Indirect Storage

● Indirect storage

Nodes in a DHT store tuples (key, value)

 Key = Hash(„my data“) → 2313

 Value is often the real storage address of the content:

(IP, port) = (134.2.11.140, 4711)

More flexible, but one more step to reach the content

[Figure: The responsible node stores the tuple (H_SHA-1(„D“) = 3107, 134.2.11.68); the data item D itself resides on node 134.2.11.68]
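A sketch contrasting the two storage variants on a single responsible node (illustrative names, not a real DHT implementation): with direct storage the node keeps the content itself, with indirect storage it keeps only a (key, (IP, port)) pointer to where the content actually lives.

```python
# Sketch of direct vs. indirect storage on the responsible DHT node.

class DHTNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.store = {}                          # key -> value

    # Direct storage: the value is the content itself (o.k. for small data).
    def put_direct(self, key, content):
        self.store[key] = content

    # Indirect storage: the value is only a pointer to the real location.
    def put_indirect(self, key, ip, port):
        self.store[key] = (ip, port)

    def get(self, key):
        return self.store.get(key)

node = DHTNode(3485)                             # manages e.g. keys 2907-3485
node.put_direct(3107, "small data item D")
node.put_indirect(2313, "134.2.11.140", 4711)

print(node.get(3107))                            # -> the content itself
value = node.get(2313)                           # -> ('134.2.11.140', 4711)
print("fetch content from", value)               # extra step: contact that host
```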

(25)

8. Structured P2P Networks

1. Distributed Management and Retrieval of Data
   1. Comparison of strategies for data retrieval
   2. Central server
   3. Flooding search
   4. Distributed indexing
   5. Comparison of lookup concepts
2. Fundamentals of Distributed Hash Tables
   1. Distributed management of data
   2. Addressing in Distributed Hash Tables
   3. Routing
   4. Data Storage
3. DHT Mechanisms
   1. Node Arrival
   2. Node Failure / Departure
4. DHT Interfaces
5. Example: Chord

(26)

Node Arrival

● Joining of a new node

1. Calculation of the node ID
2. New node contacts the DHT via an arbitrary node
3. Assignment of a particular hash range
4. Copying of the K/V-pairs of that hash range (usually with redundancy)
5. Binding into the routing environment

[Figure: New node with ID 3256 (134.2.11.68) joins the ring of nodes 611, 709, 1008, 1622, 2011, 2207, 2906, 3485]

(27)

Node Failure / Departure

● Failure of a node

Use of redundant K/V pairs (if a node fails)

Use of redundant / alternative routing paths

Key-value usually still retrievable if at least one copy remains

● Departure of a node

Partitioning of hash range to neighbor nodes

Copying of K/V pairs to corresponding nodes

Unbinding from routing environment

(28)

8. Structured P2P Networks

1. Distributed Management and Retrieval of Data
   1. Comparison of strategies for data retrieval
   2. Central server
   3. Flooding search
   4. Distributed indexing
   5. Comparison of lookup concepts
2. Fundamentals of Distributed Hash Tables
   1. Distributed management of data
   2. Addressing in Distributed Hash Tables
   3. Routing
   4. Data Storage
3. DHT Mechanisms
   1. Node Arrival
   2. Node Failure / Departure
4. DHT Interfaces
5. Example: Chord

(29)

DHT Interfaces

Generic interface of distributed hash tables

Provisioning of information

 Publish(key,value)

Requesting of information (search for content)

 Lookup(key)

Reply

 value

DHT approaches are interchangeable (with respect to the interface)

[Figure: A distributed application issues Put(Key, Value) and Get(Key) → Value against the Distributed Hash Table layer (CAN, Chord, Pastry, Tapestry, …), which spans nodes 1 … N]
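The generic interface can be written down as a small abstract base class; this is a sketch with illustrative names, not the API of any concrete DHT library. CAN, Chord, Pastry etc. would differ only in how they route a key to the responsible node behind put/get.

```python
# Sketch of the generic DHT interface: publish/put, lookup/get, value reply.
from abc import ABC, abstractmethod

class DistributedHashTable(ABC):
    @abstractmethod
    def put(self, key: int, value: bytes) -> None:
        """Publish(key, value): store the pair at the responsible node."""

    @abstractmethod
    def get(self, key: int) -> bytes | None:
        """Lookup(key): route to the responsible node and return the value."""

class LocalDHT(DistributedHashTable):
    """Trivial single-process stand-in, just to show interchangeability."""
    def __init__(self):
        self.data = {}

    def put(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

dht: DistributedHashTable = LocalDHT()           # could be Chord, CAN, Pastry, ...
dht.put(3107, b"pointer or document")
print(dht.get(3107))
```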

(30)

Comparison: DHT vs. DNS

● Comparison DHT vs. DNS

Traditional name services follow fixed mapping

 DNS maps a logical node name to an IP address

DHTs offer flat / generic mapping of addresses

 Not bound to particular applications or services

 „value“ in (key, value) may be

o an address

o a document

o or other data …

(31)

Comparison: DHT vs. DNS

Domain Name System

Mapping:

symbolic name → IP address

Is built on a hierarchical structure with root servers

Names refer to administrative domains

Specialized in searching for computer names and services

Distributed Hash Table

Mapping: key → value, which can easily realize DNS

Does not need a special server

Does not require a special name space

Can find data that is located independently of computers

(32)

Conclusions

● Properties of DHTs

Use of routing information for efficient search for content

Keys are evenly distributed across nodes of DHT

 No bottlenecks

 A continuous increase in the number of stored keys can be handled

 Failure of nodes can be tolerated

 Survival of attacks possible

Self-organizing system

Simple and efficient realization

Supporting a wide spectrum of applications

 Flat (hash) key without semantic meaning

 Value depends on application

(33)

Next …

● Specific examples of Distributed Hash Tables

Chord

UC Berkeley, MIT

Pastry

Microsoft Research, Rice University

CAN

UC Berkeley, ICSI

P-Grid

EPFL Lausanne

… and there are plenty of others: Kademlia, Symphony, Viceroy, …

(34)

8. Structured P2P Networks

1. Distributed Management and Retrieval of Data
   1. Comparison of strategies for data retrieval
   2. Central server
   3. Flooding search
   4. Distributed indexing
   5. Comparison of lookup concepts
2. Fundamentals of Distributed Hash Tables
   1. Distributed management of data
   2. Addressing in Distributed Hash Tables
   3. Routing
   4. Data Storage
3. DHT Mechanisms
   1. Node Arrival
   2. Node Failure / Departure
4. DHT Interfaces
5. Example: Chord

(35)

Chord

Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan (2001)

(36)

Chord: Overview

● Early and successful algorithm

● Simple & elegant

easy to understand and implement

many improvements and optimizations exist

Proposed by Ion Stoica et al. in 2001

● Main responsibilities:

Routing

 Flat logical address space: l-bit identifiers instead of IP addresses

 Efficient routing in large systems: log(N) hops with N total nodes

Self-organization

 Handle node arrival, departure, and failure

(37)

Chord: Topology

● Hash-table storage

put (key, value) inserts data into Chord

Value = get (key) retrieves data from Chord

● Identifiers

Derived from hash function

 E.g. SHA-1, 160-bit output → 0 <= identifier < 2^160

Key associated with data item

 E.g. key = sha-1(value)

ID associated with host

 E.g. id = sha-1 (IP address, port)
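A sketch of deriving Chord identifiers with SHA-1 as suggested above; the exact string encoding of (IP address, port) is an assumption made here for illustration.

```python
# Sketch: deriving Chord IDs and keys from SHA-1 (160-bit identifier space,
# all arithmetic modulo 2^160).
import hashlib

BITS = 160
MOD = 2 ** BITS

def sha1_id(data: bytes) -> int:
    """Interpret the 160-bit SHA-1 digest as an integer identifier."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big") % MOD

def node_id(ip: str, port: int) -> int:
    # Encoding of (IP, port) is an assumption; any stable encoding works.
    return sha1_id(f"{ip}:{port}".encode())

def key_of(value: bytes) -> int:
    return sha1_id(value)                # key = sha-1(value)

print(hex(node_id("134.2.11.68", 4711)))
print(hex(key_of(b"my data")))
```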

(38)

Chord: Topology

● Keys and IDs on ring, i.e., all arithmetic modulo 2^160

● (key, value) pairs managed by clockwise next node: successor

[Figure: Chord ring over identifier space 0–7 with nodes 0, 1 and 3 and keys 1, 2 and 6; successor(1) = 1, successor(2) = 3, successor(6) = 0]
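The successor relation on the small example ring (nodes 0, 1, 3 in identifier space 0..7) in a few lines; the helper name successor is illustrative.

```python
# Sketch: successor(k) on the small example ring (identifier space 0..7).

NODES = [0, 1, 3]                        # node identifiers on the ring

def successor(k, nodes=NODES):
    """Clockwise next node at or after identifier k (with wrap-around)."""
    for n in sorted(nodes):
        if n >= k:
            return n
    return min(nodes)                    # wrap around past the largest node

for key in (1, 2, 6):
    print(f"successor({key}) = {successor(key)}")
# -> successor(1) = 1, successor(2) = 3, successor(6) = 0
```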

(39)

Chord: Topology

● Topology determined by links between nodes

Link: knowledge about another node

Stored in routing table on each node

● Simplest topology: circular linked list

Each node has link to clockwise next node

[Figure: Ring over identifier space 0–7 with nodes 0, 1, 3, each node linking to its clockwise next node]

(40)

Chord: Routing

● Primitive routing:

Forward query for key x until successor(x) is found

Return result to source of query

● Pros:

Simple

Little node state

● Cons:

Poor lookup efficiency:

N/2 hops on average, i.e. O(N) hops (with N nodes)

Node failure breaks the circle

[Figure: Ring over identifier space 0–7 with nodes 0, 1, 3 and keys 1, 2, 6; the query "Key 6?" is forwarded node by node around the ring until it reaches node 0 = successor(6)]

(41)

Chord: Routing

● Advanced routing:

Store links to z next neighbors

Forward queries for k to farthest known predecessor of k

For z = N: fully meshed routing system

 Lookup efficiency: O(1)

 Per-node state: O(N)

Still poor scalability

● Scalable routing:

Linear routing progress scales poorly

Mix of short- and long-distance links required:

 Accurate routing in node’s vicinity

 Fast routing progress over large distances

 Bounded number of links per node

(42)

Chord: Routing

● Chord’s routing table: finger table

Stores log(N) links per node

Covers exponentially increasing distances:

 Node n: entry i points to successor(n + 2^i) (i-th finger)

[Figure: Ring over identifier space 0–7 with nodes 0, 1, 3; each node's finger table lists, for i = 0, 1, 2, the start (n + 2^i) and its successor, together with the keys the node is responsible for, e.g. node 0 has finger starts 1, 2, 4 with successors 1, 3, 0]
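A sketch of building the finger tables of the example ring by the rule stated above, entry i = successor(n + 2^i); the successor helper is the same simplified one as in the earlier sketch.

```python
# Sketch: finger table construction, entry i = successor(n + 2^i) (mod 2^m).

M = 3                                    # identifier space 0 .. 2^3 - 1
NODES = [0, 1, 3]

def successor(k, nodes=NODES):
    for cand in sorted(nodes):
        if cand >= k:
            return cand
    return min(nodes)

def finger_table(n, m=M, nodes=NODES):
    table = []
    for i in range(m):
        start = (n + 2 ** i) % (2 ** m)  # i-th finger start
        table.append((i, start, successor(start, nodes)))
    return table

for n in NODES:
    print(n, finger_table(n))
# e.g. node 0: starts 1, 2, 4 -> successors 1, 3, 0
```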

(43)

Chord: Routing

● Chord’s routing algorithm:

Each node n forwards a query for key k clockwise

 To the farthest finger preceding k

 Until n = predecessor(k) and successor(n) = successor(k)

 Return successor(n) to the source of the query

[Figure: Ring over identifier space 0–63 with nodes including 4, 7, 13, 14, 16, 19, 23, 26, 30, 37, 39, 42, 45, 49, 52, 54, 56, 60, 63; lookup(44) is forwarded along fingers (each finger table lists i, 2^i, the target n + 2^i and the link actually used) until it reaches node 45 = successor(44): lookup(44) = 45]
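A sketch of this routing rule on the 0..63 example ring, in a simplified single-process model where every node's fingers are computed from a global node list; the helper names are illustrative, not Chord's actual RPC interface.

```python
# Sketch: Chord lookup, forwarding to the farthest finger preceding key k.
# Single-process model of the 0..63 example ring; names are illustrative.

M = 6                                    # identifier bits -> ring size 2^6 = 64
RING = 2 ** M
NODES = sorted([4, 7, 13, 14, 16, 19, 23, 26, 30, 33, 37,
                39, 42, 45, 49, 52, 54, 56, 60, 63])

def successor(k):
    """Owning node of key k: first node ID >= k (with wrap-around)."""
    for n in NODES:
        if n >= k % RING:
            return n
    return NODES[0]

def node_successor(n):
    """Clockwise next node strictly after node n."""
    return successor((n + 1) % RING)

def fingers(n):
    """Finger i of node n points to successor(n + 2^i)."""
    return [successor((n + 2 ** i) % RING) for i in range(M)]

def in_interval(x, a, b):
    """True if x lies in the open ring interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)

def closest_preceding_finger(n, k):
    for f in reversed(fingers(n)):       # try the farthest finger first
        if in_interval(f, n, k):
            return f
    return n

def find_successor(start, k):
    n = start
    while not in_interval(k, n, node_successor(n) + 1):   # k not in (n, succ(n)]
        nxt = closest_preceding_finger(n, k)
        n = nxt if nxt != n else node_successor(n)
    return node_successor(n)

print(find_successor(52, 44))            # -> 45, i.e. lookup(44) = 45
```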

(44)

Chord: Self-Organization

● Handle changing network environment

Failure of nodes

Network failures

Arrival of new nodes

Departure of participating nodes

● Maintain consistent system state for routing

Keep routing information up to date

 Routing correctness depends on correct successor information

 Routing efficiency depends on correct finger tables

Failure tolerance required for all operations

(45)

Chord: Failure Tolerance: Storage

● Layered design

Chord DHT mainly responsible for routing

Data storage managed by application

 persistence

 consistency

 fairness

● Chord soft-state approach:

Nodes delete (key, value) pairs after timeout

Applications need to refresh (key, value) pairs periodically

Worst case: data unavailable for refresh interval after node failure

(46)

Chord: Failure Tolerance: Routing

● Finger failures during routing

query cannot be forwarded to finger

forward to previous finger (do not overshoot destination node)

trigger repair mechanism: replace finger with its successor

● Active finger maintenance

periodically check liveness of fingers

replace with correct nodes on failures

trade-off: maintenance traffic vs. correctness & timeliness

[Figure: Ring over identifier space 0–63 illustrating finger-failure handling for a lookup near key 44 (nodes 42, 45, 49)]

(47)

Chord: Failure Tolerance: Routing

● Successor failure during routing

Last step of routing can return a failed node to the source of the query -> all queries for that successor fail

Store n successors in successor list

 successor[0] fails -> use successor[1] etc.

 routing fails only if n consecutive nodes fail simultaneously

● Active maintenance of successor list

periodic checks similar to finger table maintenance

crucial for correct routing
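A minimal sketch of the successor-list fallback (illustrative names): routing state survives as long as at least one of the n listed successors is still reachable.

```python
# Sketch: successor-list fallback when successor[0] has failed.

def first_alive_successor(successor_list, alive):
    """Return the first successor that is still reachable, else None."""
    for node in successor_list:
        if node in alive:                # in reality: a ping / liveness check
            return node
    return None                          # only if n consecutive nodes failed

successors_of_42 = [45, 49, 52]          # n = 3 redundant successors
alive_nodes = {49, 52, 54}               # 45 has just failed

print(first_alive_successor(successors_of_42, alive_nodes))   # -> 49
```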

(48)

Chord: Node Arrival

● New node picks ID

● Contact existing node

● Construct finger table via standard routing/lookup()

● Retrieve (key, value) pairs from successor

[Figure: Ring over identifier space 0–7 with existing nodes 0, 1, 3 and their finger tables; the new node 6 builds its own finger table (starts 7, 0, 2 with successors 0, 0, 3) and retrieves the (key, value) pairs it is now responsible for from its successor]

(49)

Chord: Node Arrival

● Examples for choosing new node IDs

random ID: equal distribution assumed but not guaranteed

hash IP address & port

place new nodes based on

 load on existing nodes

 geographic location, etc.

● Retrieval of existing node IDs

Controlled flooding

DNS aliases

Published through web

etc.

[Figure: A joining node picks ID = rand() = 6 and resolves an entry point via DNS (entrypoint.chord.org? → 182.84.10.23) to contact the existing ring of nodes 0, 1, 3]

(50)

Chord: Node Arrival

● Construction of finger table

iterate over finger table rows

for each row: query entry point for successor

standard Chord routing on entry point

● Construction of successor list

add immediate successor from finger table

request successor list from successor

[Figure: The new node 6 asks the entry point for succ(7), succ(0), succ(2) and receives succ(7) = 0, succ(0) = 0, succ(2) = 3 for its finger table (starts 7, 0, 2); it then builds its successor list (0, 1) from its successor 0's list (1, 3)]
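A sketch of the join steps on the small example ring: the new node 6 fills each finger-table row by asking the entry point for succ(n + 2^i) and then takes over the (key, value) pairs it is now responsible for from its successor. The global dict standing in for the ring is an illustration, not how a real deployment would look.

```python
# Sketch: node 6 joins the 0..7 example ring via an entry point.
# A global dict stands in for the existing ring; succ() models the
# entry point answering lookups with standard Chord routing.

M = 3
RING = 2 ** M
existing = {0: {6: "value stored under key 6"}, 1: {}, 3: {}}   # node -> its (key, value) pairs

def succ(k):
    nodes = sorted(existing)
    return next((n for n in nodes if n >= k % RING), nodes[0])

def join(new_id):
    # 1) construct the finger table by querying the entry point row by row
    finger_table = [succ((new_id + 2 ** i) % RING) for i in range(M)]
    # 2) retrieve the (key, value) pairs the new node is now responsible for
    old_owner = succ(new_id)             # the new node's successor
    existing[new_id] = {}                # new node is now part of the ring
    moved = {k: v for k, v in existing[old_owner].items() if succ(k) == new_id}
    for k in moved:
        del existing[old_owner][k]
    existing[new_id] = moved
    return finger_table

print(join(6))                           # finger starts 7, 0, 2 -> successors 0, 0, 3
print(existing)                          # key 6 now lives on node 6
```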

(51)

Chord: Node Departure

● Deliberate node departure

clean shutdown instead of failure

● For simplicity: treat as failure

system already failure tolerant

soft state: automatic state restoration

state is lost briefly

invalid finger table entries: reduced routing efficiency

● For efficiency: handle explicitly

notification by departing node to

 successor, predecessor, nodes at finger distances

copy (key, value) pairs before shutdown

(52)

Chord: Summary

● Complexity

Messages per lookup: O(log N)

Memory per node: O(log N)

Messages per management action (join/leave/fail): O(log² N)

● Advantages

Theoretical models and proofs about complexity

Simple & flexible

● Disadvantages

No notion of node proximity and proximity-based routing optimizations

Chord rings may become disjoint in realistic settings

● Many improvements published

e.g. proximity, bi-directional links, load balancing, etc.

(53)

The Architectures of 1st and 2nd Gen. P2P

Client-Server

1. Server is the central entity and only provider of service and content. Network managed by the server
2. Server as the higher-performance system
3. Clients as the lower-performance system
Example: WWW

Peer-to-Peer

1. Resources are shared between the peers
2. Resources can be accessed directly from other peers
3. Peer is provider and requestor (servent concept)

Unstructured P2P: Centralized P2P

1. All features of Peer-to-Peer included
2. Central entity is necessary to provide the service
3. Central entity is some kind of index/group database
Example: Napster

Unstructured P2P: Pure P2P

1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. No central entities
Examples: Gnutella 0.4, Freenet

Unstructured P2P: Hybrid P2P

1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. Dynamic central entities
Example: Gnutella 0.6, JXTA

Structured P2P: DHT-Based

1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. No central entities
4. Connections in the overlay are “fixed”
Examples: Chord, CAN

(54)

Reminder: Distributed Indexing

● Communication overhead vs. node state

[Figure: Plot of communication overhead vs. node state. Flooding: O(1) node state, O(N) communication overhead (bottleneck: communication overhead; false negatives). Central server: O(N) node state, O(1) communication overhead (bottlenecks: memory, CPU, network; availability). The Distributed Hash Table sits between both extremes at O(log N) / O(log N)]

 Scalability: O(log N)

 No false negatives

 Resistant against changes

(failures, attacks, short-time users)

(55)

Comparison of Lookup Concepts

System                  | Per-Node State | Communication Overhead | Fuzzy Queries | No False Negatives | Robustness
Central Server          | O(N)           | O(1)                   | yes           | yes                | no
Flooding Search         | O(1)           | O(N²)                  | yes           | no                 | yes
Distributed Hash Tables | O(log N)       | O(log N)               | no            | yes                | yes
