(1)

Peer-to-Peer

Data Management

Hans-Dieter Ehrich

Institut für Informationssysteme

Technische Universität Braunschweig

http://www.ifis.cs.tu-bs.de

(2)

7. Unstructured P2P Networks

The transparencies of this chapter are based on the package

Unstructured Peer-to-Peer Networks by

Wolf-Tilo Balke and Wolf Siberski 24.10.2007

Original slides partially provided by

Rüdiger Schollmeier

Jörg Eberspächer

(3)

7. Unstructured P2P Networks

History

Overlay Network Characteristics

Network Types

Centralized P2P

Pure P2P

Hybrid P2P

(4)

Review: Driving Forces of P2P – File Sharing

Sharing of otherwise unused resources

Storage

Bandwidth

(Processing)

No central control

No single point of failure

No administrative efforts

Difficult to attack with judicial means

(5)

Unstructured P2P Networks

History

Overlay Network Characteristics

Network Types

Centralized P2P

Pure P2P

Hybrid P2P

(6)

How It All Began: From Arpanet to Peer-to-Peer

1. How It All Began: From Arpanet to Peer-to-Peer
2. The Napster Story
3. Gnutella and its Relatives: Fully Decentralized Architectures

[Figure: Most relevant P2P applications in the year 2001: 1) Freenet, 2) Buzzpad, 3) WuWu]

(7)

From ARPANET to Peer-to-Peer

Late 1960s: Establishment of the ARPANET

Goal: share computing resources and documents between US research facilities

The logical network matched the physical network to a large extent

Applications: FTP and Telnet → client/server model

Central steering committee to organize the network

1979: Development of the UseNet protocol

Newsgroup application to organize content

Newsgroup server network exhibits P2P characteristics

Self organizing approach to add and remove newsgroup servers

Fully distributed content replication

Still client/server application with respect to endpoints

~1990 rush of the general public to join the Internet

Applications following the client/server approach: WWW, email, streaming

Straightforward model to administrate and control the content distribution

(8)

The Napster Story

1. How It All Began: From Arpanet to Peer-to-Peer
2. The Napster Story
3. Gnutella and its Relatives: Fully Decentralized Architectures

[Figure: Most relevant P2P applications in the year 2001: 1) Freenet, 2) Buzzpad, 3) WuWu]

(9)

The Napster Story

May 1999: Disruption of the Internet community (first generation of P2P)

Introduction of Napster

Users not only consume and download content but also offer and provide content to other participants

Users establish a virtual network, entirely independent from physical network and administrative authorities or restrictions

Basis: UDP and TCP connections between the peers

December 1999: RIAA files a lawsuit against Napster Inc.

Target of the RIAA: the central lookup server of Napster

February 2001: 2.79 billion files exchanged via the Napster network per month

July 2001: Napster Inc. is convicted

Napster has to stop the operation of the Napster server

Napster network breaks down

BUT: Already a number of promising successors available

(10)

Centralized Directory Model

Centralized Directory Model

The index service is provided centrally by a coordinating entity.

Search request is issued to the coordinating entity which delivers a list of peers having the desired files available for download.

Requesting peer obtains the respective files directly from the peer offering them.

Characteristics

Lookup of existing documents can be guaranteed.

Index service is a “Single Point of Failure”.

Centralized P2P system

Representative: Napster
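
The centralized directory model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `CentralIndex`, its method names, the peer names and the file names are hypothetical and are not part of the Napster protocol. The index stores only which peer offers which file; the files themselves stay on the peers.

```python
from collections import defaultdict


class CentralIndex:
    """Toy central directory: maps file names to the peers that share them."""

    def __init__(self):
        self._peers_by_file = defaultdict(set)

    def register(self, peer, filenames):
        # A peer announces the files it shares; the files stay on the peer.
        for name in filenames:
            self._peers_by_file[name.lower()].add(peer)

    def unregister(self, peer):
        # Drop every entry of a peer that logs off.
        for peers in self._peers_by_file.values():
            peers.discard(peer)

    def search(self, keyword):
        # Return the peers offering any file whose name contains the keyword.
        keyword = keyword.lower()
        hits = set()
        for name, peers in self._peers_by_file.items():
            if keyword in name:
                hits |= peers
        return sorted(hits)


index = CentralIndex()
index.register("mueller", ["Prince - Kiss.mp3"])          # hypothetical file
index.register("arayama", ["Prince - Purple Rain.mp3"])   # hypothetical file
result = index.search("purple rain")
```

The requesting peer would then contact one of the returned peers directly for the download. If the index goes away, no lookup is possible at all, which is the single point of failure named above.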

(11)

Centralized Directory Model - Example

[Figure: Mr. Müller and Mr. Arayama each share several MP3 files on their clients. A central database holds the index of all shared files. The query "? Prince purple rain ?" is sent to the central database, which answers "! Prince purple rain ! @ Mr. Arayama".]

(12)

Gnutella and its Relatives

1. How It All Began: From Arpanet to Peer-to-Peer
2. The Napster Story
3. Gnutella and its Relatives: Fully Decentralized Architectures

[Figure: Most relevant P2P applications in the year 2001: 1) Freenet, 2) Buzzpad, 3) WuWu]

(13)

Gnutella

March 2000: Nullsoft releases Gnutella as an open source project

Major developer: Gene Kan

In addition to servent functionality, the peers also take over routing tasks

Becomes extremely popular after Napster has to shut down

October 2000: introduction of hierarchical routing layers.

Gnutella 0.6: Ultrapeer concept

Increases the scalability significantly

Variety of similar fully decentralized P2P-protocols followed soon:

Audiogalaxy

FastTrack/KaZaA

iMesh

Freenet

(14)

Flooded Request Model

Flooded Request Model:

Atomic P2P system

Without central coordination authority (all peers are equal).

Search request is passed on to neighbors.

If they cannot answer the request, they pass it on to various other nodes until a predetermined search depth (TTL = time-to-live) is reached.

When requested file has been located, positive search results are sent to the requesting entity.

Requesting peer can then download the desired file directly from the entity which is offering it.

Characteristics:

Fully decentralized, no central lookup server

→ no single point of failure or control

Unreliable lookup (no guarantees)

System doesn't scale.

Pure P2P system

Representative: Gnutella 0.4
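
The flooded request model can be illustrated with a small breadth-first sketch. The function `flood_search`, the example graph and the file table are assumptions made for this illustration; as a simplification, every peer forwards the request until the search depth is reached, whether or not it could answer it.

```python
from collections import deque


def flood_search(graph, files, start, wanted, ttl):
    """Return the peers within `ttl` hops of `start` that hold `wanted`."""
    seen = {start}                    # peers that already handled this request
    frontier = deque([(start, 0)])
    hits = []
    while frontier:
        peer, hops = frontier.popleft()
        if peer != start and wanted in files.get(peer, ()):
            hits.append(peer)
        if hops == ttl:
            continue                  # search depth reached: do not forward
        for neighbor in graph[peer]:
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return hits


# Hypothetical overlay: A-B, A-C, B-D, D-E; only E holds the file.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B", "E"], "E": ["D"]}
files = {"E": {"purple rain"}}
near = flood_search(graph, files, "A", "purple rain", ttl=2)
far = flood_search(graph, files, "A", "purple rain", ttl=3)
```

With a TTL of 2 the file on E is not found although it exists, which is the "unreliable lookup" listed under the characteristics; raising the TTL finds it at the price of more messages.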

(15)

Mr. Müller is searching for

Prince No central

Database

?

?

?

?

?

?

? ? ?

http://www.gnutelliums.com/

Flooded Request Model - Example

?

Mr. Arayama serves Prince

(16)

Super-peer based Flooding

Super-peer based Flooding:

Hierarchical P2P system

Super-peers (Ultrapeers) form pure P2P subnet

All other peers (Leaf nodes) directly attach to one super-peer

Super-peer network acts as distributed file index

o Super-peers request file list from their leaf peers

Queries are distributed in super-peer subnet only

Combination of Pure and Central P2P

Characteristics:

Fully decentralized, no central lookup server

→ no single point of failure or control

System scales much better

Unreliable lookup

(but more success due to smaller network)

Hybrid P2P system

Representative: Gnutella 0.6
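
A minimal sketch of super-peer based flooding follows. The class `SuperPeer` and all names in the example are hypothetical; the sketch only shows the two ideas named above: each super-peer indexes the files of its own leaf nodes, and queries travel in the super-peer subnet only.

```python
class SuperPeer:
    """Toy super-peer: indexes its leaves and floods queries to super-peers."""

    def __init__(self, name):
        self.name = name
        self.neighbors = []      # other super-peers
        self.leaf_index = {}     # leaf name -> set of shared file names

    def attach_leaf(self, leaf, filenames):
        # The super-peer obtains the file list of a leaf that attaches to it.
        self.leaf_index[leaf] = set(filenames)

    def query(self, wanted, ttl, seen=None):
        # Answer from the local leaf index, then forward to super-peers only.
        seen = set() if seen is None else seen
        seen.add(self.name)
        hits = [leaf for leaf, names in sorted(self.leaf_index.items())
                if wanted in names]
        if ttl > 0:
            for sp in self.neighbors:
                if sp.name not in seen:
                    hits += sp.query(wanted, ttl - 1, seen)
        return hits


s1, s2 = SuperPeer("S1"), SuperPeer("S2")
s1.neighbors.append(s2)
s2.neighbors.append(s1)
s1.attach_leaf("mueller", [])
s2.attach_leaf("arayama", ["purple rain"])
found = s1.query("purple rain", ttl=1)
```

Leaf nodes never see the query in this sketch; only the much smaller super-peer subnet is flooded.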

(17)

Super-peer based Flooding - Example

[Figure: Mr. Müller is searching for Prince. The query (?) is flooded through the super-peer network until it reaches Mr. Arayama, who serves Prince. Source: http://www.gnutelliums.com/]

(18)

Gnutella and its Relatives: The Story Goes on

August 2001

Users adapt very fast to the breakdown of Napster

Already 3.05 billion files exchanged per month via the Gnutella network

Year 2001

Invention of structured P2P networks (regular instead of random network graph)

August 2002

Amount of exchanged data in KaZaA decreases, caused by a high number of defective files (reason: weak hash keys to identify files)

Edonkey and Gnutella regain popularity

May 2003

Bittorrent is released

Soon causes majority of the observed traffic, due to its efficiency

Middle of 2003

Beyond the exchange of content, new concepts are developed to use P2P also for other applications

Skype, a Voice-over-P2P application, is developed

Today:

Major efforts are made to increase the reliability of P2P-searches, also in mobile networks, …

In 2005 Ebay buys Skype to use the paradigm for the communication between bidders and sellers

(19)

1st and 2nd Generations of P2P

Client-Server

1. Server is the central entity and only provider of service and content. Network managed by the server.
2. Server as the higher-performance system.
3. Clients as the lower-performance system.
Example: WWW

Peer-to-Peer

1. Resources are shared between the peers.
2. Resources can be accessed directly from other peers.
3. Peer is provider and requestor (servent concept).

Unstructured P2P | Structured P2P

Centralized P2P | Pure P2P | Hybrid P2P | Pure P2P | Hybrid P2P

Centralized P2P

1. All features of Peer-to-Peer included
2. Central entity is necessary to provide the service
3. Central entity is some kind of index/group database
Example: Napster

Pure P2P

1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. No central entities
Examples: Gnutella 0.4, Freenet

Hybrid P2P

1. All features of Peer-to-Peer included
2. Any terminal entity can be removed without loss of functionality
3. Dynamic central entities
Examples: Gnutella 0.6, JXTA

1st Gen. 2nd Gen.

(20)

Unstructured P2P Networks

History

Overlay Network Characteristics

Network Types

Central P2P: Napster

Pure P2P: Gnutella 0.4

Hybrid P2P: Gnutella 0.6

(21)

Overlay Networks

“Virtual” signaling network established via TCP connections between the peers

Characteristics of the overlay topology:

completely independent from physical network

Separate addressing and routing scheme

No relation between physical network edges and overlay network edges

May include hierarchies (hub network) (e.g. rendezvous peers in JXTA)

May include centralized elements (star network) (lookup server in Napster)

May be a completely randomized network (Gnutella 0.4) (randomly meshed network)

Overlay network can be seen as graph

Peers as nodes

Conceptual connections as edges

(22)

General Characteristics of 1st and 2nd Gen. P2P

1st and 2nd Generation P2P systems are overlay architectures, with the following characteristics:

TCP/IP based

Decentralized and self organizing (with possible centralized elements)

Content:

Distributed “randomly” on the network, with several replicas (due to popularity)

Content stays at provider peer

Content transfer:

o Out of band, i.e. on separate connections and not via signaling connections

o Mostly via HTTP

Employ distributed shared resources (data storage, bandwidth)

Generally two kinds of requests:

Content requests: to find content in the overlay

Keep-alive requests: stay connected in the overlay

Initially developed for file-sharing

Various realizations exist

(23)

Basic Routing Behavior

Request messages:

Include a hop-counter, a GUID (Globally Unique Identifier) and a TTL (Time-To-Live) in the header

TTL determines along how many hops a message may be forwarded

Are flooded in the overlay network

Every node forwards every incoming message to all neighbors except the neighbor it received the message from

Exceptions: see below

Request messages terminate, if

Same message-type with same GUID is received more than once (loop!!)

Hop-counter=TTL

Response messages:

Include a hop-counter, a GUID and a TTL (Time-to-Live) in the header

GUID is the same as of the initializing request message

Are routed back to the requestor along the same path on which the request message reached the responding peer

every peer has to store the GUID of each request for a certain amount of time

No flooding to save resources
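
The routing rules above (flooding with a hop counter and TTL, duplicate detection by GUID, backward routing of responses) can be sketched as follows. The class `Peer`, the helper `link` and the four-peer example network are assumptions made for this illustration; messages are delivered by direct method calls instead of TCP connections.

```python
import uuid


class Peer:
    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbors = []
        self.seen = {}       # request GUID -> neighbor the request came from
        self.results = []    # responses that arrived back at the requestor

    def request(self, wanted, ttl):
        guid = uuid.uuid4().hex
        self.seen[guid] = None           # None marks this peer as the origin
        for n in self.neighbors:
            n.on_request(guid, wanted, hops=1, ttl=ttl, sender=self)
        return guid

    def on_request(self, guid, wanted, hops, ttl, sender):
        if guid in self.seen:
            return                       # same GUID again: loop, drop message
        self.seen[guid] = sender         # remember the way back
        if wanted in self.files:
            sender.on_response(guid, provider=self.name)
        if hops == ttl:
            return                       # hop counter reached TTL: stop
        for n in self.neighbors:
            if n is not sender:
                n.on_request(guid, wanted, hops + 1, ttl, sender=self)

    def on_response(self, guid, provider):
        previous = self.seen[guid]
        if previous is None:
            self.results.append(provider)         # back at the requestor
        else:
            previous.on_response(guid, provider)  # one hop back, no flooding


def link(a, b):
    a.neighbors.append(b)
    b.neighbors.append(a)


a, b, c, d = Peer("A"), Peer("B"), Peer("C"), Peer("D", files={"xyz"})
link(a, b)
link(b, c)
link(c, a)   # closes a loop A-B-C
link(c, d)
a.request("xyz", ttl=3)
```

The loop A-B-C shows why each peer has to store the GUID for a while: the copy of the request that comes around the loop is recognized and dropped, and the same table tells each peer where to send the response.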

(24)

Basic Bootstrapping

Mostly not part of the protocol specification

Necessary to know at least one active participant of the network

Otherwise no participation in the overlay is possible for a new node

Address (TCP) of an active node can be retrieved by different means:

Bootstrap cache: Try, one after another, to establish a connection to a node seen in a previous session

Bootstrap server:

Connect to a “well known host”, which almost always participates

Ask a bootstrap server to provide a valid address of at least one active node

Realizations:

o FIFO of all node-addresses which recently used this bootstrap (a node which just connected is assumed to be still active)

o Random pick of addresses which recently connected via this server to the overlay (+ no loops, -may be outdated)

Broadcast on the IP layer

Use multicast channels

Use IP broadcasting (-limited to local network)
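
The two bootstrap-server realizations mentioned above can be sketched like this. The class `BootstrapServer`, its capacity parameter and the example addresses are hypothetical; a real bootstrap server would also need a network interface, which is left out here.

```python
import random
from collections import deque


class BootstrapServer:
    """Toy bootstrap server that remembers recently seen node addresses."""

    def __init__(self, capacity=4):
        self._recent = deque(maxlen=capacity)   # oldest entries fall out first

    def announce(self, address):
        # A node that just connected is assumed to be still active.
        if address in self._recent:
            self._recent.remove(address)
        self._recent.append(address)

    def newest(self, count=1):
        # Queue-based realization: hand out the most recently seen addresses.
        return list(self._recent)[-count:][::-1]

    def random_pick(self, count=1, rng=random):
        # Random realization: any remembered address, possibly outdated.
        pool = list(self._recent)
        return rng.sample(pool, min(count, len(pool)))


server = BootstrapServer(capacity=3)
for addr in ["10.0.0.1:6346", "10.0.0.2:6346", "10.0.0.3:6346", "10.0.0.4:6346"]:
    server.announce(addr)
latest = server.newest(2)
picked = server.random_pick(2)
```

A new node would try the returned addresses one after another and keep the ones that answer in its own bootstrap cache for the next session.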

(25)

Unstructured P2P Networks

History

Overlay Network Characteristics

Network Types

Central P2P: Napster

1. Basic Characteristics
2. Signaling Characteristics
3. Discussion

Pure P2P: Gnutella 0.4

Hybrid P2P: Gnutella 0.6

(26)

Definition of centralized P2P

All peers are connected to central entity

Peers establish connections between each other on demand to exchange user data (e.g. mp3 compressed data)

Central entity is necessary to provide network services

Central entity is some kind of index/group database

Central entity is lookup/routing table

(27)

Basic Characteristics of centralized P2P

Bootstrapping: Bootstrap-server = central server

Central entity can be established as a server farm, but one single entry point = single point of failure (SPOF)

All signaling connections are directed to central entity

Peer ↔ central entity: P2P protocol, e.g. Napster protocol

To find content

To log on to the overlay

To register

To update the routing tables

To update shared content information

Peer ↔ peer: HTTP

To exchange content/data

(28)

Topology of Centralized P2P

[Figure legend: servent; connection between 2 servents (TCP); connection between router & servent; connection between routers (core)]

(29)

Napster: How Does It Work

Application-level, client-server protocol over point-to-point TCP

Participants:

Napster Hosts/peers

Client Service

Login

Data-requests

Download-requests

P2P Service

Data-transfer

Napster Indexserver

Pure Server

Five steps:

Connect to Napster Server

Upload your list of files (push) to server

Query Indexserver with a list of keywords to search the full list with

Select “best” of correct answers

Connect to providing host/peer

[Figure: central Napster index server connected to four Napster hosts; data transfer takes place directly between hosts]

(30)

Napster Message Structure

General Header Structure:

HEADER (4 bytes): <Payload Length> (2 bytes) | <Function> (2 bytes)

PAYLOAD

<Function>: describes the message type (e.g. login, search, …)

PAYLOAD: describes parameters of the message (e.g. IDs, keywords, …)
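
As an illustration of the message structure (a 4-byte header made of a 2-byte payload length and a 2-byte function code, followed by the payload), here is a small packing sketch. The little-endian byte order, the ASCII encoding and the helper names `pack_message` and `unpack_message` are assumptions of this sketch, not statements from the slides.

```python
import struct

# Assumed layout: 2-byte payload length, then 2-byte function code,
# both little-endian (the byte order is an assumption of this sketch).
HEADER = struct.Struct("<HH")


def pack_message(function, payload):
    """Prefix an ASCII payload with the 4-byte header."""
    body = payload.encode("ascii")
    return HEADER.pack(len(body), function) + body


def unpack_message(data):
    """Split a raw message into (function, payload text)."""
    length, function = HEADER.unpack_from(data)
    body = data[HEADER.size:HEADER.size + length]
    return function, body.decode("ascii")


# Login (function 0x02) with hypothetical login parameters.
raw = pack_message(0x02, 'lkn 54332 6699 "nap v0.8" 9')
function, text = unpack_message(raw)
```

Because the length is carried in the header, a receiver can read exactly four bytes first and then knows how many payload bytes belong to the same message.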

(31)

Napster: Initialization

1: LOGIN (Function: 0x02)

<Nick> <Password> <Port> <Client-Info> <Link-type>

Example: LOGIN(0x02) lkn 54332 6699 „nap v0.8“ 9

2: LOGIN ACK (Function: 0x03)

3: NOTIFICATION OF SHARED FILE (Function: 0x64)

„<Filename>“ <MD5> <Size> <Bitrate> <Freq> <Time>

Example: NOTIFICATION(0x64) „band - song.mp3“ 3f3a3... 5674544 128 44100 342

[Figure: Napster host (IP: 001, Nick: LKN) and central Napster index server; client/server service]

(32)

Napster: File Request Procedure

1: SEARCH (Function: 0xC8)

[FILENAME CONTAINS „Search Criteria“] [MAX_RESULTS <Max>] [LINESPEED <Compare> <Link-Type>] [BITRATE <Compare> “<Bitrate>”] [FREQ <Compare> “<Freq>”]

Example: SEARCH(0xC8) FILENAME CONTAINS „song“ MAX_RESULTS 100 LINESPEED „AT LEAST“ 6 BITRATE „AT LEAST“ „128“ FREQ „EQUAL TO“ „44100“

2: SEARCH RESPONSE (Function: 0xC9)

„<Filename>“ <MD5> <Size> <Bitrate> <Freq> <Time> <Nick> <IP> <Link-Type>

[Figure: Napster host (IP: 002, Nick: MIT) and central Napster index server]
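
The SEARCH payload is a plain text string assembled from optional clauses. A minimal sketch of such an assembly is shown below; the function `build_search` and its parameter names are hypothetical, and straight double quotes are used in place of the typographic quotes on the slide.

```python
def build_search(filename, max_results=None, linespeed=None,
                 bitrate=None, freq=None):
    """Assemble a SEARCH payload; comparisons are (compare, value) pairs."""
    parts = [f'FILENAME CONTAINS "{filename}"']
    if max_results is not None:
        parts.append(f"MAX_RESULTS {max_results}")
    if linespeed is not None:
        compare, link_type = linespeed
        parts.append(f'LINESPEED "{compare}" {link_type}')
    if bitrate is not None:
        compare, value = bitrate
        parts.append(f'BITRATE "{compare}" "{value}"')
    if freq is not None:
        compare, value = freq
        parts.append(f'FREQ "{compare}" "{value}"')
    return " ".join(parts)


payload = build_search(
    "song",
    max_results=100,
    linespeed=("AT LEAST", 6),
    bitrate=("AT LEAST", 128),
    freq=("EQUAL TO", 44100),
)
```

The resulting string would be sent as the payload of a message with function code 0xC8.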

(33)

Summary of Napster Signaling

Sample message sequence chart for one Napster server with one requesting peer (Req) and one providing peer (Prov). Signaling messages are shown as [payload length|function|…].

Between peer and Napster server (Napster protocol):

Login: [0x24|0x02|…]
Login Ack: [0x00|0x03|…]
Notif: [0x46|0x64|…] (three times)
Search: [0x7E|0xC8|…]
Response: [0xC4|0xC9|…] (twice)

Between requesting peer and providing peer (HTTP):

HTTP: GET[Filename]
OK[data]

(34)

Napster: Wrap-Up I

1. Search Request: The user sends out a music file request and Napster searches its central database.

(35)

Napster: Wrap-Up II

2. Search Response: The Napster server sends back a list of peers that share the file.

(36)

Napster: Wrap-Up III

3. File Download: The requesting user downloads the file directly from the computer of another Napster user via HTTP.

(37)

Centralized P2P: Discussion

Disadvantages

Single point of failure → easily attackable

Bottleneck

Potential of congestion

Central server in control of all peers

Advantages

Fast and complete lookup (one hop lookup)

Central managing/trust authority

No keep alive necessary, beyond content updates

Application areas

File Sharing

VoIP (SIP, H.323)

Conceptually: "Social Web" applications (eBay, YouTube, del.icio.us, etc.)

Systems

BitTorrent, Audiogalaxy, WinMX

(38)

Unstructured P2P Networks

History

Overlay Network Characteristics

Network Types

Central P2P: Napster

Pure P2P: Gnutella 0.4

1. Basic Characteristics
2. Signaling Characteristics
3. Discussion

Hybrid P2P: Gnutella 0.6

(39)

Definition of Pure P2P

Any terminal entity can be removed without loss of functionality

No central entities employed in the overlay

Peers establish connections between each other randomly

To route request and response messages

To insert request messages into the overlay

(40)

Model of Pure P2P Networks

[Figure: sample graph with a major component and separate sub networks]

Degree distribution (according to the sample graph):

\[
p(d) =
\begin{cases}
c \cdot d^{-1.4}, & 0 < d \le 7 \\
0, & \text{in any other case}
\end{cases}
\qquad \text{with } c = \left( \sum_{d} \frac{p(d)}{c} \right)^{-1}
\]

\[
\text{average: } \bar{d} = 2.2, \qquad \operatorname{var}(d) = 1.63
\]
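
Assuming the degree distribution on this slide is the truncated power law p(d) = c · d^(-1.4) for 0 < d ≤ 7 (and 0 otherwise), the normalization constant and the average degree can be checked numerically. The function name `degree_distribution` is hypothetical; the sketch only verifies the average of about 2.2 stated on the slide.

```python
def degree_distribution(exponent=1.4, max_degree=7):
    """Return ({d: p(d)}, c) for p(d) = c * d**(-exponent), 1 <= d <= max_degree."""
    weights = {d: d ** -exponent for d in range(1, max_degree + 1)}
    c = 1.0 / sum(weights.values())       # normalization: sum of p(d) is 1
    return {d: c * w for d, w in weights.items()}, c


p, c = degree_distribution()
mean_degree = sum(d * prob for d, prob in p.items())
print(f"c = {c:.4f}, average degree = {mean_degree:.2f}")
```

Under these assumptions c is roughly 0.50 and the average degree rounds to 2.2, so most peers in the model have only one or two overlay neighbors.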

(41)

Basic Characteristics of Pure P2P

Bootstrapping:

Via bootstrap-server (host list from a web server)

Via peer-cache (from previous sessions)

Via well-known host

No registration

Routing:

Completely decentralized

Reactive protocol: routes to content providers are only established on demand, no content announcements

Requests: flooding (limited by TTL and GUID)

Responses: routed (Backward routing with help of GUID)

Signaling connections (stable, as long as neighbors do not change):

Based on TCP

Keep-alive

Content search

Content transfer connections (temporary):

Based on HTTP

Out of band transmission

(42)

Topology of Pure P2P

[Figure legend: servent; connection between 2 servents (TCP); connection between router & servent; connection between routers (core)]

(43)

Gnutella 0.4: How Does It Work

Application-level, peer-to-peer protocol over point-to-point TCP

Participants:

Gnutella peers/servents

Router Service

Flood incoming requests (regard TTL!)

o Keep alive

o Content

Route responses for other peers (regard GUID of message)

o Keep alive (PING/PONG)

o Content (QUERY/QUERYHIT)

Data-requests

Download-requests

Lookup Service

Initialize Data requests

Initialize keep alive requests

“Server”-Service

Serve Data-requests (HTTP)

Five steps:

Connect to at least one active peer (address received from bootstrap)

Explore your neighborhood (PING/PONG)

Submit Query with a list of keywords to your neighbors (they forward it)

Select “best” of correct answers (which we receive after a while)

Connect to providing host/peer

[Figure: Gnutella peers/servents (G) connected to each other by TCP connections]

(44)

The Gnutella Network

Measurements taken at the LKN in May 2002

(45)

Gnutella Message Structure

General Header Structure:

MESSAGEHEADER (23 bytes): GnodeID (16 bytes) | Function (1 byte) | TTL (1 byte) | Hops (1 byte) | Payload Length (4 bytes)

Function: describes the message type (e.g. login, search, …)

Payload: describes parameters of the message (e.g. IDs, keywords, …)

GnodeID: unique 128-bit ID of any host

TTL (Time-To-Live): number of servents a message may pass before it is killed

Hops: number of servents a message has already passed
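
The 23-byte header (16-byte GnodeID, then one byte each for function, TTL and hops, then a 4-byte payload length) maps directly onto a fixed binary layout. The sketch below assumes a little-endian payload length; the helper names `pack_header` and `forward` are hypothetical, and `forward` applies the per-hop rule that hops is increased by 1 and the message is killed once hops reaches the TTL.

```python
import os
import struct

# Assumed wire layout: 16-byte ID, 1-byte function, 1-byte TTL, 1-byte hops,
# 4-byte payload length (little-endian); "<" disables alignment padding.
HEADER = struct.Struct("<16sBBBI")


def pack_header(gnode_id, function, ttl, hops, payload_length):
    return HEADER.pack(gnode_id, function, ttl, hops, payload_length)


def forward(header):
    """Return the header for the next hop, or None if the message is killed."""
    gnode_id, function, ttl, hops, length = HEADER.unpack(header)
    hops += 1
    if hops >= ttl:
        return None              # hops reached the TTL: do not forward
    return HEADER.pack(gnode_id, function, ttl, hops, length)


ping = pack_header(os.urandom(16), 0x00, ttl=3, hops=0, payload_length=0)
once = forward(ping)
twice = forward(once)
```

In this example the message survives two forwarding steps and is dropped at the third.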

(46)

Gnutella Messages

PING (Function: 0x00)

No payload

PONG (Function: 0x01)

Port (2 bytes) | IP Address (4 bytes) | Nb. of shared files (4 bytes) | Nb. of Kbytes shared (4 bytes)

QUERY (Function: 0x80)

Minimum Speed (2 bytes) | Search Criteria (n bytes)

QUERY HIT (Function: 0x81)

Nb. of Hits (1 byte) | Port (2 bytes) | IP Address (4 bytes) | Speed (1 byte) | Result Set (n bytes) | GnodeID (16 bytes)

Result Set entry: File Index (4 bytes) | File Name (n bytes)
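
As an example of a message payload, the PONG fields listed on this slide (port 2 bytes, IP address 4 bytes, number of shared files 4 bytes, number of shared Kbytes 4 bytes) can be packed as shown below. The field order, the little-endian integers, the helper names and the example values are assumptions of this sketch.

```python
import socket
import struct

# Field sizes from the slide: port (2), IP address (4), shared files (4),
# shared Kbytes (4). Little-endian integers are an assumption of this sketch.
PONG = struct.Struct("<H4sII")


def pack_pong(port, ip, shared_files, shared_kbytes):
    return PONG.pack(port, socket.inet_aton(ip), shared_files, shared_kbytes)


def unpack_pong(payload):
    port, raw_ip, shared_files, shared_kbytes = PONG.unpack(payload)
    return port, socket.inet_ntoa(raw_ip), shared_files, shared_kbytes


payload = pack_pong(6346, "192.0.2.4", 12, 48000)   # hypothetical values
decoded = unpack_pong(payload)
```

A PONG built this way tells the pinging peer where another active servent can be reached and how much it shares.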

(47)

Gnutella Routing

Basic Routing Principle: „Enhanced“ Flooding

Flooding: Received PINGs and QUERYs must be forwarded to all connected Gnodes

PINGs or QUERYs with the same FUNCTION ID and GNODE ID as previous messages are destroyed (avoid loops)

• Save origin of received PINGs and QUERYs

• Increase Hops by 1

• If Hops equals TTL, kill the message

PONG and QUERY HIT are forwarded to the origin of the corresponding PING or QUERY

(48)

Gnutella Connection Setup

Gnode 2000 establishes a connection to 4000

Nodes: GNODE ID 1000 (IP: 001), GNODE ID 2000 (IP: 002), GNODE ID 3000 (IP: 003), GNODE ID 4000 (IP: 004)

Message sequence (numbers as in the figure):

17 Gnutella Connect
18 Gnutella OK
19 PING
20 PONG/IP:004
21 PING
22 PING
23 PONG/IP:001
24 PONG/IP:003
25 PING
26 PING
27 PONG/IP:001
28 PONG/IP:003

(49)

Summary of the Signaling in Gnutella 0.4

[Figure: sample Gnutella 0.4 network with peers 1 to 8, and the sample message sequence chart according to the sample network, showing the Gnu-Con, OK, PING and PONG messages exchanged between Peer1 to Peer8]

(50)

Gnutella Wrap-Up I

[Figure legend for wrap-up I to VI: requesting servent; servent; Query […]; Query-Hit […]; requested data]

Broadcast Query[XYZ, TTL = 3, …]

(51)

Gnutella Wrap-Up II

1. Query-Hit

Broadcast Query[XYZ, TTL = 2, …]

(52)

Gnutella Wrap-Up III

2. Query-Hit

Broadcast Query[XYZ, TTL = 1, …]

(53)

Gnutella Wrap-Up IV

3. + 4. Query-Hit

[TTL = 0] no further broadcast

(54)

Gnutella Wrap-Up V

Establish HTTP connection

(55)

Gnutella Wrap-Up VI

HTTP connection: Get[XYZ, …, …]

Download data

(56)

Discussion

Disadvantages

High signaling traffic, because of decentralization

Modem nodes may become bottlenecks

Overlay topology not optimal, as

no complete view available,

no coordinator

If not adapted to the physical structure, delay and total network load increase

Zigzag routes

loops

Advantages

No single point of failure

Can be adapted to physical network

Can provide anonymity

Can be adapted to special interest groups

Application areas

File-sharing

Context based routing (see chapter about mobility)

Systems

Freenet, Gnutella, Gnunet

(57)

Unstructured P2P Networks

History

Overlay Network Characteristics

Network Types

Central P2P: Napster

Pure P2P: Gnutella 0.4

Hybrid P2P: Gnutella 0.6

1. Basic Characteristics
2. Signaling Characteristics
3. Discussion

(58)

Definition of Hybrid P2P

Main characteristic, compared to pure P2P: Introduction of another dynamic hierarchical layer

Hub based network

Reduces the signaling load without reducing the reliability

Election process to select and assign Superpeers

Superpeers: high degree (degree>>20, depending on network size)

Leafnodes: connected to one or more Superpeers (degree<7)

Superpeer

leafnode

(59)

Model of Hybrid P2P Networks

[Figure: sample graph with a major component, separate sub networks and hub connections (2nd hierarchy); legend: Superpeer, leafnode]

Degree distribution (according to the sample graph):

\[
p(d) =
\begin{cases}
c \cdot d^{-1.4}, & 1 < d \le 7 \\
c \cdot 1^{-1.4} - 0.05, & d = 1 \\
c \cdot 0.05, & d = 20 \\
0, & \text{in any other case}
\end{cases}
\qquad \text{with } c = \left( \sum_{d} \frac{p(d)}{c} \right)^{-1}
\]

\[
\text{average: } \bar{d} = 2.8, \qquad \operatorname{var}(d) = 3.55
\]

(60)

Basic Characteristics of Hybrid P2P

Bootstrapping:

Via bootstrap-server (host list from a web server)

Via peer-cache (from previous sessions)

Via well-known host

Registration of each leafnode at the Superpeer it connects to, i.e. it announces its shared files to the Superpeer

Routing:

Partly decentralized

Leafnodes send request to a Superpeer

Superpeer distributes this request in the Superpeer layer

If a Superpeer has information about a matching file shared by one of its leafnodes, it sends this information back to the requesting leafnode (backward routing)

Hybrid protocol (reactive and proactive): routes to content providers are only established on demand;

content announcements from leafnodes to their Superpeers

Routing within Superpeer layer equal to Pure P2P

Signaling connections (stable, as long as neighbors do not change):

Based on TCP

Keep-alive

Content search

Content transfer connections (temporary):

Based on HTTP

Out of band transmission (directly between leafnodes)

(61)

Gnutella 0.6 Network Organization

New connection/network setup

Upon connection to the network via a Superpeer, each node is a leafnode

It announces its shared content to the Superpeer it connected to

Superpeer thus updates its routing tables

Election mechanism decides which node becomes a Superpeer or a leafnode (depending on capabilities (storage, processing power), network connection, the uptime of a node, …), if

Too many nodes are connected to one Superpeer

A Superpeer leaves the network

Too few nodes are connected to a Superpeer

(62)

Gnutella 0.6 Routing

Content requests:

Leafnode sends request to Superpeer

Superpeer looks up in its routing tables whether the content is offered by one of its leafnodes. In this case the request is forwarded to this node.

Additionally the Superpeer increases the hop counter and forwards this request to the Superpeers it is connected to.

To enable backward routing, the peer has to store the GUID of the message together with the information from which peer it received the request in the previous hop

If a Superpeer receives such a request from another Superpeer, this request is handled the same way as if it had been received from one of its leafnodes

After the hop counter of the request reaches the TTL value, it is not forwarded any further (prevents loops)

Content responses:

If a leafnode receives a request, it double-checks whether it shares the file (should be the case, as long as the routing tables of the Superpeer are correct)

In case of success, the leafnode sends a content reply back to the requesting peer, by sending it back to that node (Superpeer) it received the message from (backward routing)

Hop by hop the message can thus be routed back to the requesting node

Content exchange:

Directly between the leafnodes, via HTTP connections

(63)

Topology of Hybrid P2P

[Figure, left: Abstract network structure of a part of the Gnutella network (222 nodes); geographical view given by the figure on the right, measured on 01.08.2002]

[Figure, right: Geographical view of a part of the Gnutella network (222 nodes); the numbers depict the node numbers from the abstract view (figure on the left, measured on 01.08.2002). Labelled nodes: 3, 7, 18, 39, 43, 100, 116, 118]

• Virtual network not matched to physical network. See path from node 118 to node 18.

• Superpeer (hub) structure clearly visible in abstract view

(64)

Gnutella 0.6 Messages

Content requests and responses

QUERY (defined as in Gnutella 0.4)

QUERY_HIT (defined as in Gnutella 0.4)

Keep alive:

PING (defined as in Gnutella 0.4)

PONG (defined as in Gnutella 0.4)

Announcement of shared content:

ROUTE_TABLE_UPDATE (0x30), Reset variant (0x0): to clear the routing table and to set a new routing table for one leafnode

Byte 0: Variant | Bytes 1-4: Table_Length | Byte 5: Infinity

ROUTE_TABLE_UPDATE (0x30), Patch variant (0x1): to update and set a new routing table with a certain number of entries (e.g. new shared files)

Byte 0: Variant | Byte 1: Seq_No | Byte 2: Seq_Size | Byte 3: Compressor | Byte 4: Entry_Bits | Bytes 5 to n+4: DATA
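
The two ROUTE_TABLE_UPDATE payload layouts can be written down as fixed binary formats: the reset variant with a 1-byte variant, a 4-byte table length and a 1-byte infinity value, and the patch variant with five 1-byte fields followed by n data bytes. The little-endian table length, the helper names and the example values below are assumptions of this sketch.

```python
import struct

# Byte offsets follow the slide; little-endian Table_Length is an assumption.
RESET = struct.Struct("<BIB")      # Variant, Table_Length, Infinity
PATCH = struct.Struct("<BBBBB")    # Variant, Seq_No, Seq_Size, Compressor, Entry_Bits


def pack_reset(table_length, infinity):
    return RESET.pack(0x0, table_length, infinity)


def pack_patch(seq_no, seq_size, compressor, entry_bits, data):
    return PATCH.pack(0x1, seq_no, seq_size, compressor, entry_bits) + data


def variant_of(payload):
    # The first payload byte selects the variant.
    return "reset" if payload[0] == 0x0 else "patch"


reset = pack_reset(table_length=65536, infinity=7)          # hypothetical values
patch = pack_patch(1, 1, 0, 4, data=b"\x00" * 8)            # hypothetical values
```

A leafnode would send such a payload inside a message with function code 0x30 to announce its shared content to its Superpeer.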

(65)

Summary of the Signaling in Gnutella 0.6

[Figure: sample Gnutella 0.6 network with Superpeers S1 to S3 and leafnodes L1 to L7, and the sample message sequence chart according to the sample network, showing the Gnu-Con, OK, RTU, PING, PONG, QUERY and QUHIT messages]

(66)

Gnutella 0.6: How Does It Work I

Leafnode Ultrapeer

(67)

Gnutella 0.6: How Does It Work II

Leafnode Ultrapeer

(68)

Gnutella 0.6: How Does It Work III

Leafnode Ultrapeer

(69)

Gnutella 0.6: How Does It Work III

Leafnode Ultrapeer

(70)

Gnutella 0.6: How Does It Work IV

Leafnode Ultrapeer

(71)

Gnutella 0.6: How Does It Work V

Leafnode Ultrapeer

(72)

Discussion

Disadvantages

Still high signaling traffic, because of decentralization

No definitive statement possible if content is not available or not found

Overlay topology not optimal, as

no complete view available,

no coordinator

If not adapted to the physical structure, delay and total network load increase

Zigzag routes

Loops

Difficult to adapt to physical network completely because of hub structure

Advantages

No single point of failure

Can provide anonymity

Can be adapted to special interest groups

Application areas

File-sharing

Context based routing (see chapter about mobility)

Systems

Gnutella, eDonkey, Kazaa

(73)

Topology combinations

Each approach comes with a different set of advantages/disadvantages

Suitability depends on application context

Combination of approaches

Use different techniques for different application aspects

Example: Skype

Centralized P2P for Login/Account Mgmt.

o Routed by super-nodes if necessary

Attempts to establish direct Voice over IP connections

Hybrid P2P to route through firewall, between NATs, etc.

Figure from Salman A. Baset and Henning G. Schulzrinne: An Analysis of the Skype Peer-to-Peer Internet Telephony Protocol, INFOCOM2006

(75)

Outlook

Structured Networks

Distributed Hash Table (DHT) Basics

DHT Algorithms

DHT Dynamics

File Distribution Networks
