Wolf-Tilo Balke Christoph Lofi
Institut für Informationssysteme
Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de
Distributed Data Management
5 Unstructured P2P Networks
4.3 Byzantine Agreements
5.0 Introduction and History
5.1 The First Generation
– Centralized P2P
– Pure P2P
5.2 The Second Generation
– Hybrid P2P
• But what happens when the autonomy of nodes increases?
– Think loosely coupled federated database!
• Or even worse: P2P databases!
• Nodes or communication may start to misbehave!
– Malicious Behavior
• A node may aim at sabotaging the whole system just to harm it
– Some people are just nasty or want to weaken the system for other reasons
– e.g. claim a failure for each sub-transaction the node was responsible for ⇒ all global transactions involving that node fail
– Malfunctions
• The more the autonomy increases, the more difficult it is to detect if a given system behavior is a real answer or a malfunction
4.3 Byzantine Agreements
• What to do if trust cannot be assured?
– Byzantine agreements!
• Addresses Byzantine Fault Tolerance
– Defends against Byzantine failures, i.e. nodes in a distributed system failing by producing wrong results
– Sane systems correct the output of failing ones
• Unanimous agreement of sane systems needed
4.3 Byzantine Agreements
• Assumptions of the “historical” problem
– Agreement
• No two “good” generals agree on different outcomes
– Validity
• If all “good” generals start with the belief they are ready to attack, then the only possible outcome is to attack
– Termination
• All “good” generals eventually decide
• ‘Generals’ can be peers, database nodes, circuit switches, etc.
4.3 Byzantine Agreements
• For what percentage of malicious nodes can protocols be designed?
– Triple Modular Redundancy: n > 3f nodes
• Assuming f treacherous generals (malicious peers),
we need at least (3f+1) peers to come to an agreement
• This means: System does not work when 1/3 or more of all generals are malicious
– M. Castro, B. Liskov: Practical Byzantine Fault Tolerance. Operating Systems Design and
Implementation, 1999
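The n > 3f bound is simple arithmetic; a minimal sketch (the function name `max_faulty` is our own):

```python
def max_faulty(n):
    """Largest number f of malicious peers an n-peer system can
    tolerate under the n > 3f requirement (i.e. n >= 3f + 1)."""
    return (n - 1) // 3

# 4 peers tolerate 1 traitor; with 3 peers no traitor is tolerable
print(max_faulty(4))  # 1
print(max_faulty(3))  # 0
```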
4.3 Byzantine Agreements
• How to prove:
– Reduce to Commander and Lieutenant problem
• Unsolvable for 1 commander and 2 lieutenants if one of them is malicious
• Sketch: a system with only 3 peers
– Each starts with an initial value (0 or 1)
– One peer is malicious
– Good nodes need to agree upon a value (0 or 1)
– Nodes act solely based on messages coming in along incident edges
• Assume there exists an algorithm that allows good nodes to agree
4.3 Byzantine Agreements
[Figure: three peers N1, N2, N3 with initial values (0 or 1)]
• Assume that N1 is a good peer
– Scenario 1: N3 is treacherous
• N2 relates that it is in state 0 to N1 and N3
• But N3 relates to N1 that N2 is in state 1
– Scenario 2: N2 is treacherous
• N2 relates that it is in state 0 to N1 and that it is in state 1 to N3
• N3 relates to N1 that N2 is in state 1
4.3 Byzantine Agreements
[Figure: peers N1, N2, N3 and the values related between them]
• Obviously N1 cannot distinguish the two scenarios
– In both cases it would have to decide for a value of 0 for the respective loyal peer
4.3 Byzantine Agreements
[Figure: the two scenarios — in both, N1 receives N2=0 directly from N2 and N2=1 via N3; in Scenario 1 N3 lies about N2's value, in Scenario 2 N2 sends inconsistent values]
• Now look at N3 in scenario 2
• Remember in scenario 2 N2 is treacherous
– N2 relates that it is in state 0 to N1 and that it is in state 1 to N3
– N1 relates to N3 that it is in state 1
• N3 would have to decide for a
value of 1 and thus vote with the loyal peer N1
• Contradiction: in scenario 2, the loyal peers N1 and N3 would decide on different values (0 and 1)
4.3 Byzantine Agreements
[Figure: Scenario 2 message exchange — N2 sends N2=0 to N1 and N2=1 to N3; N1 relays its value 1 to N3]
• One peer starts the agreement process by broadcasting its value (commander)
– Whenever a message is supposed to be sent, but a peer does not send it, this is detected and a default value is assumed
• Echo the result to all other peers
• Do this for more peers than can be malicious
– Algorithm is recursive with (f+1) levels
• Bottom case: No traitors
– the commander broadcasts its initial value
– every other process decides on the value it receives
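The recursive scheme can be illustrated with a small simulation of Lamport's OM(m) algorithm for four generals (one commander, three lieutenants, f = 1). This is a sketch under our own assumptions: the traitor model (a traitor sends alternating values) and all names are our own simplifications.

```python
from collections import Counter

def majority(values):
    # Majority vote; ties resolved by first-seen value (default rule)
    return Counter(values).most_common(1)[0][0]

def om(m, commander, lieutenants, value, traitors):
    """One run of OM(m); returns the value each lieutenant holds."""
    # Step 1: commander sends its value; a traitor sends alternating values
    received = {}
    for idx, lt in enumerate(lieutenants):
        received[lt] = (idx % 2) if commander in traitors else value
    if m == 0:
        return received
    # Step 2: each lieutenant relays its received value via OM(m-1)
    relayed = {lt: {} for lt in lieutenants}
    for lt in lieutenants:
        others = [o for o in lieutenants if o != lt]
        sub = om(m - 1, lt, others, received[lt], traitors)
        for o in others:
            relayed[o][lt] = sub[o]
    # Step 3: each lieutenant decides by majority over all values seen
    return {lt: majority([received[lt]] + list(relayed[lt].values()))
            for lt in lieutenants}

lts = ["L1", "L2", "L3"]
# Loyal commander, traitor L3: the loyal lieutenants still agree on 1
print(om(1, "C", lts, 1, traitors={"L3"}))
# Treacherous commander: all loyal lieutenants still agree with each other
print(om(1, "C", lts, 1, traitors={"C"}))
```

With n = 4 and f = 1 the n > 3f condition holds, so both runs end in agreement among the loyal lieutenants; with only n = 3 this would fail, as shown by the proof sketch above.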
4.3 Byzantine Agreement (n > 3f)
• Idea: Amplify the original message over different
channels starting from (f+1) commanders
• echo_broadcast(node C, message m)
– C sends [initial,C,m] to all nodes
– Every recipient replies with [echo,C,m] to all and ignores subsequent [initial,C,m’]
– Upon receiving [echo,C,m] from more than (n+f)/2 distinct nodes, a node accepts m from C
• Terminates? Yes: all non-malicious nodes accept (n-f) messages and exit both wait phases.
• If the system is initially proper (all non-malicious nodes have the same value m) then every such node terminates the algorithm with M=m.
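The acceptance rule can be sketched as a small echo collector. This is a sketch, not a full protocol implementation; the class name is our own, and we assume "strictly more than (n+f)/2" as the acceptance threshold:

```python
class EchoReceiver:
    """Tracks [echo, C, m] messages; accepts m once more than
    (n + f) / 2 distinct nodes have echoed it."""
    def __init__(self, n, f):
        self.threshold = (n + f) // 2 + 1  # strictly more than (n+f)/2
        self.echoes = {}                   # (C, m) -> set of echo senders
        self.accepted = None

    def on_echo(self, sender, c, m):
        senders = self.echoes.setdefault((c, m), set())
        senders.add(sender)
        if self.accepted is None and len(senders) >= self.threshold:
            self.accepted = (c, m)
        return self.accepted

# n = 4, f = 1: acceptance needs echoes from 3 distinct nodes
r = EchoReceiver(4, 1)
r.on_echo("N1", "C", "attack")
r.on_echo("N2", "C", "attack")
print(r.on_echo("N3", "C", "attack"))  # ('C', 'attack')
```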
4.3 Byzantine Agreement (n > 3f)
Distributed Data Management – Christoph Lofi – IfIS – TU Braunschweig
Ci: M := Mi
for k = 1 to (f+1) do
  (* Phase 1: SEND *)
  broadcast M;
  wait to receive M-messages from (n-f) distinct processes;
  proof := set of received messages;
  count(1) := number of received messages with M = 1;
  if count(1) > (n-2f) then M := 1 else M := 0;
  (* Phase 2: ECHO *)
  echo_broadcast [M, proof];
  wait to accept [M, proof]-messages, with a correct proof, from (n-f) distinct processes;
  count(1) := number of accepted messages with M = 1;
  Compute_new_vote(sk);
  if (sk = 0 and count(1) ≥ 1) or (sk = 1 and count(1) ≥ (2f+1)) then M := 1
  else M := 0;
• If the Commander is not malicious (agreement by majority vote)
4.3 Example: Four Generals
• If the Commander is malicious (no agreement possible)
4.3 Example: Four Generals
• Partition nodes into three groups, with at least 1 and at most 1/3 of the nodes in each group
• Theorem: A Byzantine agreement can be solved in a network G of n nodes while tolerating f
faults if and only if
– n > 3f and
– connectivity(G) > 2f
• connectivity(G) > 2f means G stays connected after the removal of any 2f (or fewer) nodes; i.e. at least 2f+1 nodes must be removed to disconnect G (or reduce it to a trivial 1-node graph)
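Both conditions of the theorem can be brute-force checked on tiny graphs; a minimal sketch (function names are our own, and the subset enumeration is exponential, so this is only feasible for very small graphs):

```python
from itertools import combinations

def reachable(start, nodes, adj):
    # BFS restricted to the surviving node set
    seen, todo = {start}, [start]
    while todo:
        u = todo.pop()
        for v in adj.get(u, ()):
            if v in nodes and v not in seen:
                seen.add(v)
                todo.append(v)
    return seen

def tolerates(nodes, edges, f):
    """Check the theorem's conditions: n > 3f and connectivity(G) > 2f."""
    if len(nodes) <= 3 * f:
        return False
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    # connectivity > 2f: removing any <= 2f nodes must leave G connected
    for k in range(1, 2 * f + 1):
        for removed in combinations(nodes, k):
            rest = set(nodes) - set(removed)
            if len(rest) > 1 and reachable(next(iter(rest)), rest, adj) != rest:
                return False
    return True

k4 = [("A","B"),("A","C"),("A","D"),("B","C"),("B","D"),("C","D")]
print(tolerates(["A","B","C","D"], k4, f=1))   # True: K4 is 3-connected
path = [("A","B"),("B","C"),("C","D")]
print(tolerates(["A","B","C","D"], path, f=1)) # False: removing B disconnects
```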
4.3 Four Generals
• Byzantine Generals
– Defends against failing or malicious nodes in a distributed system by unanimous agreement
– Very expensive due to high message overhead
• Clever implementations necessary, e.g.
M. Castro, B. Liskov: Practical Byzantine Fault Tolerance and Proactive Recovery. ACM Transactions on Computer Systems, 2002
– Used in many modern systems requiring distributed fault tolerance
• Fault-tolerant systems in e.g. aeronautics
• Software protocols like Bitcoin
4.3 Four Generals
• Peer To Peer (P2P) Systems
– P2P systems were popularized in 1999 by Napster for sharing MP3s
– Base Problem: How can resources easily be shared within a highly volatile and decentralized network of independent and autonomous peers (nodes)?
• There is a (potentially) large number of peers
• Peers may join or leave the network any time
• Only rudimentary features necessary
5.0 Peer-To-Peer Systems
• What can be shared?
– Information
• File & document sharing
– Bandwidth
• Load balancing
• Shared bandwidth
– Storage space
• DAS, NAS, SAN
• Storage networks
– Computing Power
5.0 Peer-To-Peer Systems
• What is a P2P network?
– A virtual overlay network for sharing resources
• Virtual and physical network are logically independent
• Mostly IP based
– Usually decentralized and self-organizing
– Peers can transfer data directly without intermediate servers
5.0 What is Peer-To-Peer?
• “Virtual” signaling network established via TCP connections between the peers
• Characteristics of the overlay topology:
– Completely independent from the physical network
– Separate addressing and routing scheme
– No relation between physical network edges and overlay network edges
– Overlay network can be seen as graph
• Peers as nodes
5.0 Overlay Networks
5.0 Overlay Networks
• The topology of the overlay network may show different properties
– May be centralized or decentralized
– May use strict structures or may be unstructured
– May be flat or organized in hierarchies
– We will use these properties later to classify P2P systems!
• In this lecture only unstructured networks
5.0 Overlay Networks
• P2P technology was enabled by various technological and social developments
– Performance increase of home user’s personal computers
• When P2P systems became established in 1999, the average computing performance of a home PC was comparable to high-end servers of the late 80s
– General availability of high-speed internet
• In 1999, DSL connections were introduced
• Flat rate models gained momentum
5.0 Towards P2P
• Late 1960s: Establishment of the ARPANET
– “Advanced Research Projects Agency Network”
• Based on the concept of the “Intergalactic Computer Network”
of Prof. J.C.R. Licklider
– Funded by DARPA
• Share computing resources and documents between US research facilities
– The rumor that ARPANET was built in order to control the military after a nuclear war is NOT true!
– Most popular applications
• Email (1971), FTP (1974), and Telnet (1969), all following the client/server model
– Central steering committee to organize the network
– Later became “the internet”
5.0 Towards P2P
• 1979: Development of the UseNet protocol
– Newsgroup application to organize content
– Newsgroup server network exhibits some P2P characteristics
• No central server but a server network (compare to super-peer-networks)
• Clients only communicate with a server which may reroute the requests to other servers
– Different groups for different content
• Initially, only text messages
• Later: infamous BIN groups usually distributing copyrighted music, software or movies
5.0 Towards P2P
• ~1990 rush of the general public to join the Internet
– The WWW is invented at CERN by Tim Berners-Lee
– Centrally hosted, interlinked websites are state of the art
• Illegal file sharing using warez sites…
5.0 Towards P2P
• Northeastern University, Boston, June 1999
– Shawn Fanning (19) and Sean Parker (20) invent Napster
• Problem: Both liked to share music and software (for free…)
• But: warez sites, UseNet binary groups, and IRC bots were very painful to use
– Bad search, broken links, tiny retention caches, low bandwidth, etc.
• Idea: establish a system offering powerful search
capabilities, no broken links, and performance which increases with the number of users!
5.0 Towards P2P
• Basic Idea of Napster
– Users store music on their home PCs
– Users connect to the Napster server and provide a list of all songs they currently have
– Users can query the Napster server for any song
• Result: a list of all users currently possessing that song
– User can download the song directly from another user
• Peer-to-Peer!
5.0 Towards P2P
5.0 Towards P2P
• Napster Inc. initially aimed at being a market place for digital music
– Like iTunes today
– Napster tried multiple times
to establish usage agreements with record labels, but they failed
• No legal business model for selling single songs possible
• Labels felt threatened by Napster’s fast growth
• Negotiations were stopped by the labels
5.0 Towards P2P
• December 1999:
– RIAA files a lawsuit against Napster Inc.
• Target of the RIAA: the central lookup server of Napster
• February 2001:
– 2.79 billion files exchanged via the Napster network per month
• July 2001: Napster Inc. is convicted
– Napster has to stop the operation of the Napster server
– Napster network breaks down
– BUT: Already a number of promising successors available
5.0 Towards P2P
• May 2002:
– Bertelsmann tries to buy Napster assets for $85 million
– An American court blocks the transaction and forces Napster to liquidate all assets
• Roxio buys the logo and name in the bankruptcy auction
– Roxio owned an iTunes-like store called “pressplay” which was rebranded with the Napster corporate design
– Launch of new Napster in October 2003 as a centralized paid-subscription service
• Not very successful because it launched shortly after iTunes and without hardware support
– Sold in 2008 to Best Buy for $121M
5.0 Towards P2P
• Generally, the RIAA lawsuit is considered a big failure
– Napster could have become an early iTunes, if labels had cooperated
– The lawsuit gave birth to even more dangerous software, e.g., Gnutella
• Open source
• Fully decentralized
– A Gnutella network cannot be shut down
– No company to sue, no servers to disconnect, …
– P2P piracy became even stronger after Napster was convicted due to publicity
5.0 Towards P2P
• The “hot” years for P2P have been 1999-2008
• In 2006, nearly 70% of all network traffic was attributed to P2P traffic
– Nowadays, P2P traffic declines in favor of video streaming and social networks...
5.0 P2P Development
• First generation peer-to-peer networks tried simple paradigms to build the network
– Centralized directory model:
all content listed in a central directory whose server also is used as a central point of connection
– Pure peer-to-peer model:
there is no central authority; peers only connect to their neighbors in the network
5.1 The First Generation
• Centralized directory model
– Index service provided centrally by a coordinating entity
– Search requests are issued to the coordinating entity
• Returns a list of peers having the desired files available for download
– Requesting peer obtains respective files directly from the peer offering them
• Characteristics
– Lookup of existing documents can be guaranteed
– Index service as single point of failure
5.1 Centralized Directories
[Figure: central database with an index of all shared files; peers X and Y each share several MP3 files on their clients; a query for “Prince – Purple Rain” is answered by the index]
5.1 Example: Napster
• All peers are connected to a central entity
– Central entity is necessary to provide network services
• Joining the network: central server is also the bootstrap-server
• Central entity can be established as a server farm, but remains one single entry point (also single point of failure)
• All signaling connections are directed to central entity
– Central entity is some kind of index/group database
– Central entity has a lookup/routing table
• Peers establish connections between each other on demand to exchange user data
5.1 Centralized P2P
• Peer ↔ central entity: special P2P protocol, e.g., Napster protocol
– Registering/logging on to the overlay
– Finding content
– Updating shared content information
– Update the routing tables
• Peer ↔ peer: HTTP
– Exchanging the actual content
5.1 Protocols Used
5.1 Centralized Topology
[Figure: centralized topology — peers connected to the central entity; TCP connections between peers established on demand]
• Application-level, client-server protocol over point-to-point TCP
• Participants
– Napster hosts/peers
– Client Service
• Login
• Data-requests
• Download-requests
– P2P Service
• Data-transfer
– Napster index server
• Pure server
5.1 How Does Napster Work?
[Figure: central Napster index server; Napster hosts signal to the index server and transfer data directly between each other]
• General Header Structure
5.1 Napster Messages
HEADER (4 bytes):
– <Payload Length> (2 bytes)
– <Function> (2 bytes): describes the message type (e.g. login, search, …)
PAYLOAD: describes the parameters of the message (e.g. IDs, keywords, …)
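The 4-byte header can be framed with Python's `struct` module; a sketch under our own assumptions (the little-endian byte order and the `frame` helper name are not from the slides):

```python
import struct

LOGIN, SEARCH = 0x02, 0xC8  # function codes from the slides

def frame(function, payload):
    """Build a Napster-style message: 2-byte payload length,
    2-byte function code, then the ASCII payload.
    Little-endian byte order is an assumption of this sketch."""
    body = payload.encode("ascii")
    return struct.pack("<HH", len(body), function) + body

msg = frame(LOGIN, 'lkn 54332 6699 "nap v0.8" 9')
length, function = struct.unpack("<HH", msg[:4])
print(length, hex(function))  # 27 0x2
```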
5.1 Napster Initialization
1: LOGIN (Function: 0x02)
   Payload: <Nick> <Password> <Port> <Client-Info> <Link-Type>
   Example: lkn 54332 6699 „nap v0.8“ 9
2: LOGIN ACK (Function: 0x03)
3: NOTIFICATION OF SHARED FILE (Function: 0x64)
   Payload: „<Filename>“ <MD5> <Size> <Bitrate> <Freq> <Time>
   Example: „band - song.mp3“ 3f3a3... 5674544 128 44100 342
[Figure: Napster host (IP: 001, Nick: LKN) exchanging LOGIN(0x02), LOGIN ACK(0x03) and NOTIFICATION(0x64) with the central Napster index server (client/server service)]
5.1 Napster: File Requests
1: SEARCH (Function: 0xC8)
   Payload: [FILENAME CONTAINS Search Criteria] [MAX_RESULTS Max] [LINESPEED <Compare> Link-Type] [BITRATE <Compare> Bitrate] [FREQ <Compare> Freq]
   Example: FILENAME CONTAINS song MAX_RESULTS 100 LINESPEED AT_LEAST 6 BITRATE AT_LEAST 128
2: SEARCH RESPONSE (Function: 0xC9)
   Payload: Filename <MD5> <Size> <Bitrate> <Freq> <Time> <Nick> <IP> <Link-Type>
[Figure: Napster host (IP: 002) sending a SEARCH(0xC8) to the central Napster index server and receiving a SEARCH RESPONSE(0xC9)]
– Sample message sequence chart for a Napster server with one requesting and one providing peer
5.1 Napster Signaling
Participants: Napster Peer (Req), Napster Server, Napster Peer (Prov)
– Login: [0x24|0x02|…] / Login Ack: [0x00|0x03|…]
– Notif: [0x46|0x64|…] (×3, announcing shared files)
– Search: [0x7E|0xC8|…]
– Response: [0xC4|0xC9|…] (×2)
– HTTP: GET[Filename] (directly between the peers)
• Search Request
– User sends out a music file request and Napster searches its central database
5.1 Wrap-Up
• Search Response
– The Napster Server sends back a list of peers that share the file
5.1 Wrap-Up
• File Download
– The requesting user downloads the file directly from the computer of another Napster user via HTTP
5.1 Wrap-Up
• Advantages
– Fast and complete lookup (one-hop lookup)
– Central managing/trust authority
– Easy bootstrapping
• Disadvantages
– Single Point of Failure is a bottleneck and makes it easily attackable
– Central server in control of all peers
• Usage
– Application areas: file sharing, VoIP (SIP, H.323)
– Systems: Skype, Audiogalaxy, WinMX
5.1 Discussion: Centralized P2P
• March 2000: Nullsoft releases Gnutella for free
– Nullsoft planned to release the source code under a GPL license a couple of days later
• Developed by Justin Frankel and Tom Pepper
– Nullsoft’s mother company AOL cancels the distribution and further development of Gnutella a day after its release
• AOL merged with Time Warner shortly after buying NullSoft for $100M
– The Gnutella protocol is reverse engineered and distributed under GPL license
– Many compatible clients and various forks of Gnutella are developed
– Became extremely popular after Napster had to shut down
5.1 Gnutella
• August 2001
– Users adapt very fast to the breakdown of Napster
– Already 3.05 billion files exchanged per month via the Gnutella network
• 2001
– Invention of structured and hybrid P2P networks
• Gnutella scaled badly, new network paradigms necessary
• e.g. KaZaA, which quickly gains popularity
• August 2002
– Amount of exchanged data in KaZaA (FastTrack) decreases, caused by a high number of defective files
• weak hash keys to identify files provoked file collisions
– EDonkey and Gnutella regain popularity
5.1 Gnutella
• May 2003
– BitTorrent is released
– BitTorrent quickly becomes most popular file sharing protocol
• Middle of 2003
– Beyond the exchange of content, new concepts are developed to use P2P also for other applications
– Skype, a Voice-over-P2P application, is developed
• 2005:
– Major efforts are made to increase the reliability of P2P- searches, also in mobile networks, …
5.1 Gnutella
– In 2005, eBay buys Skype for communication between bidders and sellers for $2.6 billion
– In 2009, an investor group buys 65% of Skype for $1.9 billion
– Plans are to turn Skype into an independent company again in 2010
• But Skype is instead bought by Microsoft for $8.5B in 2011
– 13% of all international phone calls are handled by Skype in 2010
5.1 Skype
• Which protocols are used?
– Traffic measured between 2002 and 2004 in the Abilene backbone
5.1 Gnutella
• The base idea of Gnutella was to avoid the weaknesses of Napster
• Result:
– Fully decentralized, unstructured and flat P2P network
– Initially: All peers are equal!
• Gnutella 0.4
• Thus called a pure P2P system
5.1 Gnutella
• Pure P2P systems have following characteristics
– Decentralized
• There is no central authority
• Any peer can be removed without loss of functionality
– Unstructured
• Overlay network is constructed randomly without any structure
• All peers are equal
5.1 Pure P2P
5.1 Pure P2P: Graphs
[Figure: sample graph of a pure P2P overlay — peers as nodes with connections between them; a major component and separate sub-networks]
• To query pure networks, a Flooded Request Model is used
– Search request is passed on to neighbors
– Neighbors forward the message to their respective neighbors
– When a node can answer the request, a result notification is sent back to the requesting node
– The requesting peer can then establish a direct connection to any peer which sent a result notification
5.1 Pure P2P: Flooding
• Request flooding relies on message forwarding
– If a peer receives a request, it usually forwards the request to all its neighbors
• Forwarding every message to all neighbors in an uncontrolled fashion will soon overload the network
– One node could spam the whole network
– Messages can be caught in infinite cycles
• Restrictions needed!
– Each message has a maximum time-to-live (TTL) and a hop counter
• The hop counter is initially set to 0
• Each forwarded message will have the hop counter increased by 1
• A message with TTL=hop counter is not forwarded and dies
• TTL thus limits the maximum distance a message can travel
– Prevents spamming the whole network
5.1 Pure P2P: Flooding
– Every message which is forwarded is cached by the forwarding peer for a short time
– Message cache is used to prevent message cycles
• Don’t forward a message which you already forwarded!
5.1 Pure P2P: Flooding
• Response messages are routed back to the original requester using the same message trail
– Use message caches to perform back-tracking
• For each forwarded message stored in the cache, also store the node from which the message was received
• If a response message is received, look up the respective request message in cache
– Forward response to the node which sent in the request
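The flooding rules above (TTL, hop counter, message cache, backward routing) can be sketched in a small simulation; this is our own simplified model, and all class and function names are assumptions:

```python
from collections import deque

class Peer:
    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbors = []
        self.seen = {}  # message id -> neighbor it came from (message cache)

def flood_query(start, keyword, ttl):
    """Flood a query from `start`; hits are routed back along the
    message trail stored in each peer's cache."""
    msg_id = object()                   # stands in for a message GUID
    hits = []
    queue = deque([(start, None, 0)])   # (peer, came_from, hops)
    while queue:
        peer, came_from, hops = queue.popleft()
        if msg_id in peer.seen:
            continue                    # already forwarded: drop (no cycles)
        peer.seen[msg_id] = came_from
        if keyword in peer.files:
            # back-track the trail recorded in the message caches
            trail, p = [peer.name], peer
            while p.seen[msg_id] is not None:
                p = p.seen[msg_id]
                trail.append(p.name)
            hits.append((peer.name, list(reversed(trail))))
        if hops < ttl:                  # TTL reached: the message dies
            for nb in peer.neighbors:
                queue.append((nb, peer, hops + 1))
    return hits

a, b, c, d = Peer("A"), Peer("B"), Peer("C"), Peer("D", files={"song"})
a.neighbors = [b]; b.neighbors = [a, c]; c.neighbors = [b, d]; d.neighbors = [c]
print(flood_query(a, "song", ttl=3))  # [('D', ['A', 'B', 'C', 'D'])]
```

With `ttl=2` the same query would return no hits, illustrating how a too-short TTL restricts the search to a sub-network.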
5.1 Pure P2P: Flooding
• Effects of flooding technique
– Fully decentralized, no central lookup server
• No single point of failure or control
– Unreliable lookup (no guarantees)
• Time-to-live limits the maximum distance a query can travel
• Query may be restricted to a sub-network not containing the desired results
– System doesn't scale well
• Number of messages increases drastically with the number of peers
• Peers with low-bandwidth connections are rendered useless
5.1 Pure P2P: Flooding
5.1 Pure P2P: Flooding
[Figure series — legend: Requesting Peer, Peer; messages: Query […], Query-Hit […], Requested Data]
• Step 1: Broadcast of Query[XYZ, TTL = 3, …]
• Step 2: 1st Query-Hit; broadcast continues with Query[XYZ, TTL = 2, …]
• Step 3: 2nd Query-Hit; broadcast continues with Query[XYZ, TTL = 1, …]
• Step 4: 3rd and 4th Query-Hit; [TTL = 0] no further broadcast
• Step 5: Establish HTTP connection to a providing peer
• Step 6: HTTP Get[XYZ, …, …]; download data
• How can a new node join a pure P2P network?
– No central server
– Network is volatile
• Bootstrapping necessary
– Usually not part of the protocol specification
• Implemented by client
– Necessary to know at least one active participant of the network
• Otherwise no participation in the overlay is possible for a new node
5.1 Pure P2P: Bootstrapping
• The address of an active node can be retrieved by different means
– Bootstrap cache
• Try to establish a connection to any node known from the last user session
– Stable nodes
• Connect to a “well known host” which is usually always in the network
– Bootstrap server
• Ask a bootstrap server to provide a valid address of at least one active node
• Realizations:
– FIFO of all node-addresses which recently used this bootstrap server (a node which just connected is assumed to be still active)
– Random pick of addresses which recently connected via this server to the overlay
5.1 Pure P2P: Bootstrapping
– Broadcast on the IP layer
• Use multicast channels
• Use IP broadcasting
– limited to local network
– Bootstrap lists
• Maintain a list of potential bootstrap servers outside the network, e.g. on a website
– Used by most file sharing clients
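The FIFO realization of a bootstrap server can be sketched as follows (a simplified model; the class name, capacity, and the example addresses are our own illustration):

```python
from collections import deque

class BootstrapServer:
    """FIFO bootstrap cache: peers that connected recently are assumed
    to still be active and are handed out to newcomers."""
    def __init__(self, capacity=100):
        self.recent = deque(maxlen=capacity)  # oldest entries drop out

    def register(self, address):
        # A node that just connected goes to the back of the FIFO
        self.recent.append(address)

    def get_peers(self, count=3):
        # Hand out the most recently seen addresses
        return list(self.recent)[-count:]

bs = BootstrapServer()
for addr in ["10.0.0.1:6346", "10.0.0.2:6346", "10.0.0.3:6346"]:
    bs.register(addr)
print(bs.get_peers(2))  # ['10.0.0.2:6346', '10.0.0.3:6346']
```

The random-pick realization mentioned above would simply replace `get_peers` with a `random.sample` over the cache.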
5.1 Pure P2P: Bootstrapping
• Bootstrapping
– Via bootstrap-server (host list from a web server)
– Via peer-cache (from previous sessions)
– Via well-known host
• Routing
– Completely decentralized
– Reactive protocol: routes to content providers are only established on demand, no content announcements
– Requests: flooding (limited by TTL and GUID)
– Responses: routed (Backward routing with help of GUID)
• Content transfer connections (temporary)
– Based on HTTP
– Out of band transmission
5.1 Pure P2P: Summary
• How is pure P2P implemented in Gnutella 0.4?
• Application-level, peer-to-peer protocol over point-to-point TCP
– Router Service
• Flood incoming requests
– regard TTL!
• Route responses for other peers
– Regard GUID of message
– Keep alive messages (PING/PONG)
– Content responses (QUERYHIT)
– Lookup Service
• Initialize Queries requests
• Initialize keep alive requests
– Download service
• Establish direct connection for download
5.1 Gnutella 0.4
[Figure: Gnutella 0.4 overlay — peers (G) connected by TCP connections]
• Five steps
– Connect to at least one active peer
• address received from bootstrap
– Explore your neighborhood
• PING/PONG protocol
– Submit Query with a list of keywords to your neighbors
• Neighbors forward the query
– Receive QUERYHIT messages
• Select the most promising QUERYHIT message
– Connect to providing peer for file transfer
5.1 Gnutella 0.4
• Ping-Pong Messages
– Each “Ping” message is answered by a “Pong” message
– Keep-Alive-Ping-Pong
• Simple messages with TTL one sent to neighbors
• Tests if neighbor is offline / disconnected / overloaded
– Exploration-Ping-Pong
• Used to explore and gather
information of a node’s neighborhood
• Higher TTL
• Pings are forwarded, Pongs are returned carrying information on other peers
– Uptime, Bandwidth, number of shared files, etc.
• Store information about neighboring nodes in a peer cache
5.1 Gnutella 0.4
• Exploiting the Ping-Pong cache
– Use stable peers in the next bootstrap process
– Boost your connectivity
• Add additional direct links to strong remote neighbors
– Compensate direct neighbor failures
• Just reconnect to a remote neighbor
• Ping-Pong protocols use a lot of bandwidth and are avoided in most modern protocols
5.1 Gnutella 0.4
5.1 Gnutella 0.4
[Figure: measurements taken at the LKN in May 2002]
5.1 Gnutella 0.4
General Header Structure:
MESSAGE HEADER (23 bytes):
– GnodeID (16 bytes)
– Function (1 byte): describes the message type (e.g. ping/pong, search, …)
– TTL (1 byte)
– Hops (1 byte)
– Payload Length (4 bytes)
PAYLOAD: describes the parameters of the message (e.g. IDs, keywords, …)
• GnodeID: unique 128-bit ID of the host
• TTL (Time-To-Live): number of nodes a message may pass before it is killed
5.1 Gnutella 0.4
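The 23-byte message header can be packed with Python's `struct` module; a sketch under our own assumptions (the little-endian byte order for the payload length and the helper name are not from the slides):

```python
import struct

# 23-byte Gnutella 0.4 message header:
# GnodeID (16 bytes), Function (1), TTL (1), Hops (1), Payload Length (4)
HEADER = "<16sBBBI"  # "<" disables padding; little-endian is an assumption
assert struct.calcsize(HEADER) == 23

def make_header(gnode_id, function, ttl, hops, payload_len):
    return struct.pack(HEADER, gnode_id, function, ttl, hops, payload_len)

PING = 0x00
hdr = make_header(b"\x01" * 16, PING, 7, 0, 0)  # PING has no payload
print(len(hdr))  # 23
```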
PING (Function: 0x00): no payload
PONG (Function: 0x01): Port (2 bytes), IP Address (4 bytes), Nb. of shared files (4 bytes), Nb. of KBytes shared (4 bytes)
QUERY (Function: 0x80): Minimum Speed (2 bytes), Search Criteria (n bytes)
QUERY HIT (Function: 0x81): Nb. of Hits (1 byte), Port (2 bytes), IP Address (4 bytes), Speed (1 byte), Result Set (n bytes: File Index (4 bytes), File Name (n bytes)), GnodeID (16 bytes)
5.1 Gnutella 0.4
• Flooding: Received PINGS and QUERIES must be forwarded to all connected Gnodes
• PINGS or QUERIES with the same FUNCTION ID and GNODE ID as previous messages are destroyed (avoid loops)
• Save origin of received PINGs and QUERIES
• Increase Hops by 1
• If Hops equals TTL, kill the message
• PONG and QUERY HIT are forwarded to the origin of the according PING or QUERY
• Basic Routing Principle: “Enhanced” Flooding
5.1 Gnutella 0.4: Ping-Pong
[Figure: four Gnodes (ID: 1000/IP: 001, ID: 2000/IP: 002, ID: 3000/IP: 003, ID: 4000/IP: 004); message sequence: 17 Gnutella Connect, 18 Gnutella OK, 19–22 and 25–26 PING forwarded, 20/23/24/27/28 PONG messages carrying IP:004, IP:001 and IP:003 returned along the reverse path]
Gnode 2000 establishes a connection to 4000
5.1 Gnutella 0.4: Ping-Pong
[Figures: sample Gnutella 0.4 network (peers 1–8) and the corresponding message sequence chart with Gnu-Con/OK handshakes and PING/PONG forwarding between the peers]
• Disadvantages
– High signaling traffic due to flooding
– Low bandwidth nodes may become bottlenecks
– No search guarantees
– Overlay topology not optimal
• no complete view available
• no coordinator
– If not adapted to the physical structure, network load is sub-optimal
• Zigzag routes
• Loops
• Advantages
– No single point of failure
– Can be adapted to physical network
– Can provide anonymity
• Routing anonymous, direct connection for transfer
• Application areas
– File-sharing (Freenet, Gnutella, GNUnet)
– Context-based routing systems
5.1 Pure P2P: Discussion
• A major problem of pure P2P systems is the limited scalability
– Main reason
• Random network layout
– Possibly degenerated network with high diameters and potentially small bisection width
– Especially, weak nodes may easily become bottlenecks
– Request messages often don’t reach their intended destinations (TTL too short) or clog the whole network (TTL too long)
• Also, in reality, not all peers are equal
– Weak modem peers constantly going on and off
5.2 Hybrid P2P
• Idea of, e.g. Gnutella 0.6:
– Take advantage of “stronger” peers, minimize damage “weak” peers can do
– Strong peers are promoted to super peers or ultra peers
• Have high uptime
• Possess high-bandwidth, low-latency network connections
• High computational power
• High storage capacity
5.2 Hybrid P2P
• Hybrid P2P uses a hierarchical network layout
– Super peers form a pure P2P network among themselves
– All other peers (Leaf peers) directly attach to one super-peer
– Super-peer network acts as distributed file index
• Super-peers request file lists from their leaf peers
• i.e. each super peer “knows” what is offered by its leaf peers
– Queries are distributed in the super-peer subnet only
– Combination of Pure and Central P2P!
5.2 Hybrid P2P
• Network characteristic, compared to pure P2P
– Hub based network
– Reduces the signaling load without reducing the reliability
– Election process to select and assign super peers
• Voluntarily or by statistics
– Super peers:
• high node degree (degree>>20, depending on network size)
– Leaf nodes:
• Connected to one or more super peers (degree < 7)
5.2 Hybrid P2P
5.2 Hybrid P2P
[Figure: sample graph of a hybrid P2P overlay — super peers and leaf nodes; major component, separate sub-networks, hub connections (2nd hierarchy)]
• Bootstrapping:
– Via bootstrap-server (hosted list from a web server)
• Contains super peer addresses
– Via peer-cache (from previous sessions)
– Registration of each leaf node at the super peer it connects to
• e.g. it announces its shared files to the super peer
• Super peer updates routing tables
– Table containing which file is shared by which node
– Super-peer may perform some load balancing
• Hand peer over to another super peer if super peers are unbalanced
• Suggest a node to be promoted to a super peer
5.2 Hybrid P2P
• Routing
– Partly decentralized
• Leaf nodes send request to a Super peer
• Super peer distributes this request in the Super peer layer
• If a Super peer has information about a matching file shared by one of its leaf nodes, it sends this information back to the
requesting leaf node (backward routing)
– Hybrid protocol (reactive and proactive)
• Routes to content providers are only established on demand;
content announcements from leaf nodes to their super peers
– Routing within super peer layer equal to Pure P2P
5.2 Hybrid P2P
• Signaling connections (stable, as long as neighbors do not change):
– Based on TCP
– Keep-alive Ping-Pong
– Content search
• Content transfer connections (temporary):
– Based on HTTP
– Out of band transmission (directly between leaf nodes)
• Out-of-band ≡ not using signaling routes
5.2 Hybrid P2P
• Query Requests
– Leafnode sends request to super peer
– A super peer receiving a request looks up in its routing tables whether content is offered by one of its leaf nodes
• If yes, response message is returned to request sender with information on the node offering the content
– Back-track-routing of responses is similar to pure P2P
– Additionally, the super peer forwards the request to the super peer network via flooding
• Flooding similar to pure P2P, but messages remain in the super peer network (i.e. TTL, hopcounters, message caches, etc)
• No query communication with leaf nodes necessary due to routing tables
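The super-peer lookup described above can be sketched as follows (a simplified model; the class and method names are our own, and the super-peer layer flood is shown as a plain recursive call instead of real messages):

```python
class SuperPeer:
    """Minimal super-peer sketch: keeps a routing table mapping
    keywords to the leaf nodes that announced them."""
    def __init__(self, name):
        self.name = name
        self.index = {}        # keyword -> set of leaf names
        self.neighbors = []    # other super peers

    def announce(self, leaf, keywords):
        # Leaf nodes proactively announce their shared content
        for kw in keywords:
            self.index.setdefault(kw, set()).add(leaf)

    def query(self, keyword, ttl=2, seen=None):
        # Check the own leaf index first, then flood the super-peer layer only
        seen = seen if seen is not None else set()
        seen.add(self.name)
        hits = set(self.index.get(keyword, ()))
        if ttl > 0:
            for sp in self.neighbors:
                if sp.name not in seen:
                    hits |= sp.query(keyword, ttl - 1, seen)
        return hits

s1, s2 = SuperPeer("S1"), SuperPeer("S2")
s1.neighbors, s2.neighbors = [s2], [s1]
s1.announce("L1", ["song-a"])
s2.announce("L7", ["song-a", "song-b"])
print(sorted(s1.query("song-a")))  # ['L1', 'L7']
```

Note that leaf nodes never see the query traffic: only their announcements reach the super peer, which answers on their behalf.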
5.2 Hybrid P2P: Routing
5.2 Hybrid P2P: Flooding
[Figure series — legend: Leafnode, Ultrapeer; a leaf node sends its query to its ultrapeer, the query is flooded within the ultrapeer layer only, and hits are routed back to the requesting leaf node]
5.2 Hybrid P2P: Ping-Pong
Sample Gnutella 0.6 network and message sequence chart according to the sample network:
[Figure: super peers S1–S3 with leaf nodes L1–L7; sequence of Gnu-Con/OK handshakes, PING/PONG, RTU (route table update), QUERY and QUERY-HIT messages]
5.2 Gnutella 0.6: Topology
[Figure left: abstract network structure of a part of the Gnutella network (222 nodes), measured on 01.08.2002]
[Figure right: geographical view of the same part of the network; the numbers depict the node numbers from the abstract view]
• Content requests and responses
– QUERY (as defined in Gnutella 0.4)
– QUERY_HIT (as defined in Gnutella 0.4)
• Keep alive
– PING (as defined in Gnutella 0.4) – PONG (as defined in Gnutella 0.4)
• Announcement of shared content
– ROUTE_TABLE_UPDATE (0x30), Reset variant (0x0): to clear the routing table and to set a new routing table for one leafnode
– ROUTE_TABLE_UPDATE (0x30), Patch variant(0x1): to update and set a new routing table with a certain number of entries (e.g. new shared files)
5.2 Gnutella 0.6: Messages
ROUTE_TABLE_UPDATE, Reset variant (byte layout):
– Byte 0: Variant
– Bytes 1–4: Table_Length
– Byte 5: Infinity
ROUTE_TABLE_UPDATE, Patch variant (byte layout):
– Byte 0: Variant
– Byte 1: Seq_No
– Byte 2: Seq_Size
– Byte 3: Compressor
– Byte 4: Entry_Bits
– Bytes 5–(n+4): DATA
• Disadvantages
– Still high signaling traffic because of decentralization
– No definitive statement possible whether content is unavailable or just not found
– Overlay topology not optimal, as
• no complete view available
• no coordinator
– Difficult to adapt to physical network completely because of hub structure
• Advantages
– No single point of failure
– Can provide anonymity
• Application areas
– File-sharing (Gnutella, eDonkey, Kazaa)
5.2 Hybrid P2P: Discussion
Summary
Client-Server:
1. Server is the central entity and only provider of service and content; network managed by the server
2. Server as the higher performance system
3. Clients as the lower performance systems
Example: WWW

Peer-to-Peer:
1. Resources are shared between the peers
2. Resources can be accessed directly from other peers
3. Peer is provider and requestor (servent concept)

Unstructured P2P — Centralized P2P:
1. All features of Peer-to-Peer included
2. Central entity is necessary to provide the service
3. Central entity is some kind of index/group database
Example: Napster

Unstructured P2P — Pure P2P:
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. No central entities
Examples: Gnutella 0.4, Freenet

Unstructured P2P — Hybrid P2P:
1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. Dynamic central entities
Examples: Gnutella 0.6, JXTA

(Structured P2P — Pure and Hybrid — covered later)