
(1)

Peer-to-Peer

Data Management

Hans-Dieter Ehrich

Institut für Informationssysteme

Technische Universität Braunschweig

http://www.ifis.cs.tu-bs.de

(2)

6. Peer-to-Peer Basics

The transparencies of this chapter are based on the package "Characteristics and Applications of Peer-to-Peer Infrastructures" by Wolf-Tilo Balke and Wolf Siberski, 24.10.2007

(3)

Overview

1. Status Quo: Networks (Over)Filled with Peer-to-Peer Traffic

2. Driving Forces Behind Peer-to-Peer

3. Applications and Classification of P2P

4. What is shared?

5. Markets and Revenue Generation

6. Where is P2P technology reasonable?

[Figure: most relevant P2P applications in the year 2001: 1) Freenet, 2) Buzzpad, 3) WuWu]

(4)

What is P2P?

P2P systems are overlay architectures, with the following characteristics:

Two logically separate networks

Mostly IP based

Decentralized and self organizing

Employ distributed shared resources (computing power and data storage)

Initially developed for file-sharing

Various realizations

Common basis for signaling: IP (TCP and UDP)

Common basis for data transmission: HTTP or special protocols based directly on IP

Use flooding in the overlay to a certain extent
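
The flooding mentioned in the last point can be sketched as follows. This is a minimal illustration under stated assumptions, not the protocol of any particular system: the five-peer `overlay`, the function name `flood_search`, and the hop limit `ttl` are hypothetical. A query is forwarded to all overlay neighbors, each peer handles it only once, and forwarding stops when the hop budget is used up.

```python
from collections import deque


def flood_search(overlay, start, wanted, ttl):
    """Return the peers holding `wanted` that a flooded query reaches.

    overlay: dict mapping peer -> (set of neighbor peers, set of shared items)
    ttl: maximum number of overlay hops the query may travel
    """
    seen = {start}                    # peers that already handled the query
    hits = set()
    queue = deque([(start, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        neighbors, items = overlay[peer]
        if wanted in items:
            hits.add(peer)
        if remaining == 0:
            continue                  # hop budget exhausted: do not forward
        for nxt in neighbors:
            if nxt not in seen:       # suppress duplicate deliveries
                seen.add(nxt)
                queue.append((nxt, remaining - 1))
    return hits


# Hypothetical overlay: a chain A - B - D - E plus a leaf C attached to A.
overlay = {
    "A": ({"B", "C"}, set()),
    "B": ({"A", "D"}, {"song.mp3"}),
    "C": ({"A"}, set()),
    "D": ({"B", "E"}, set()),
    "E": ({"D"}, {"song.mp3"}),
}
```

With a small `ttl` the query from A only finds the copy at B; the copy at E is found once the hop limit is raised to 3. The limit trades completeness of the search against the signaling traffic it causes.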

(5)

Impacts of P2P

Rising flow sizes (60 kbyte -> 2 Gbyte)

30%-60% of the traffic in the Abilene backbone is caused by P2P applications

70% of the traffic in the German Research Network (DFN) is caused by P2P applications.

T-Online observes an increasing symmetry at the access-level.

LRZ (Munich Network Center) observes an increasing symmetry between the US and Europe

(6)

Impacts of P2P at the Abilene Backbone

[Figure: traffic portions in % per week (0% to 100%), 18.02.2002 to 18.08.2004, split into Unidentified, Data_Transfers, and File_Sharing; annotations: "Only signaling" and "Possible data transfers"]

Abilene: core of the Internet2 infrastructure, connecting 190 US universities and research centers

Unidentified + data_transfers + file_sharing cause 90% of the traffic

Unidentified traffic and data_transfers increased significantly

-> Part of the P2P traffic is hidden (port hopping, …)

-> Some P2P applications use port 80 and are therefore counted as data_transfers

(7)

Impacts of P2P at the Abilene Backbone

P2P Traffic amount (only signaling)

Is still high (~50 TByte per week)

Becomes a constant part of the traffic (since end 2002)

Slumps are assumed to be caused by

Port closures (firewalls, NATs)

Court verdicts (Napster case, …)

Data transfers are presumably caused to a large extent by P2P applications

[Figure: traffic in TByte per week (0 to 300), 18.02.2002 to 18.08.2004, split into Unidentified, Data_Transfers, and File_Sharing]

(8)

Reason for These Experiences

(10)

Driving Forces Behind Peer-to-Peer

Development of the terminal capabilities:

1992:

Average hard disk size: ~0.3 Gbyte

Average processing power (clock frequency) of personal computers: ~100 MHz

2002:

Average hard disk size: 100 Gbyte

2007:

Average processing power (clock frequency) of personal computers: ~3 GHz

-> Personal computers have capabilities comparable to servers in the 1990s

(11)

Driving Forces Behind Peer-to-Peer

Development of the communication networks:

Early 1990s: private users start to connect to the Internet via 56 kbps modems

1997/1998: first broadband connections for residential users become available (cable modems with up to 10 Mbps)

1999: introduction of DSL and ADSL connections

Data rates of up to 8.5 Mbps via common telephone connections become available

The deregulation of the telephone market shows first effects with significantly reduced tariffs, due to increased competition on the last mile

-> Bandwidth is plentiful and cheap!

(12)

Development of P2P Applications

[Figure: data volumes in % per week (0% to 100%), 18.02.2002 to 18.08.2004; legend: Freenet, Direct Connect++, Carracho, Blubster, Neo-Modus, FastTrack, WinMX, Shoutcast, Audiogalaxy, eDonkey2000, Hotline, Gnutella, BitTorrent; labeled areas: BitTorrent, FastTrack, Gnutella, eDonkey, Shoutcast]

Traffic portions of the different P2P applications and protocols, measured per week in the Abilene backbone from 18.02.2002 until 18.10.2004

(14)

Applications and Classification of P2P

● Abstract definition of the peer-to-peer paradigm

[A peer-to-peer system is] a self-organizing system of equal, autonomous entities (peers) [which] aims for the shared usage of distributed resources in a networked environment avoiding central services.

Andy Oram (ed.). Peer-to-Peer: Harnessing the Power of Disruptive Technologies. O'Reilly, 2001.

(15)

Applications and Classification of P2P

Conventional Classification of P2P

File Sharing (Napster, Gnutella, Freenet)

Grid Computing (SETI@home)

Instant Messaging (ICQ, AIM)

Collaboration (Groove Workspace)

-> No clear distinction, in some cases even misleading

Classification by Means of Shared Resources

Information

Files

Bandwidth

Storage space

Processor cycles

-> P2P applications can be classified by shared resources

(17)

What is shared?

1. Information

File Sharing and Document Management

Presence Information

Collaboration

2. Bandwidth

Increased Load Balancing

Shared Use of Bandwidth

3. Storage Space

DAS, NAS, SAN

P2P Storage Networks

4. Processor Cycles

High Performance Computing

(18)

Information (1/5)

File sharing

Classical application of P2P systems

► Users offer files (music, videos, etc.) for free download

► The application provides a unified view

► Napster, Gnutella & Co

First large scale occurrence of digital copyright infringement

► Strong reactions from industry, e.g. Recording Industry Association of America (RIAA)

(19)

Information (2/5)

Distribution of Software/Updates

Basic idea of distributing software updates or patches in a P2P fashion

Obviously used for obtaining updates for P2P client software (Gnutella & Co)

But also for a wide variety of other software distributions

Prominent examples

Patches for the game "World of Warcraft" by Blizzard Entertainment

Linux company Lindows distributes their Linspire (prev. LindowsOS) via P2P

Technology used

Today mostly BitTorrent (Block-based File Swarming)

Microsoft's Avalanche (File Swarming with Network Coding)
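
The idea behind block-based file swarming can be illustrated with a short sketch. It is a simplification under stated assumptions and not the BitTorrent wire protocol: the function names, the block size, and the `offers` list are hypothetical. The distributor publishes one hash per block (SHA-1 is used here, as in the original BitTorrent metainfo format), so a peer can accept blocks from arbitrary other peers, in any order, and still detect corrupted data.

```python
import hashlib


def split_into_blocks(data, block_size):
    """Cut the file content into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_hashes(blocks):
    """Per-block checksums that the distributor publishes with the file."""
    return [hashlib.sha1(block).hexdigest() for block in blocks]


def assemble(expected_hashes, offers):
    """Rebuild the file from (index, block) pairs received from any peers.

    Blocks whose hash does not match are ignored. Returns None while
    at least one block is still missing.
    """
    result = [None] * len(expected_hashes)
    for index, block in offers:
        if result[index] is not None:
            continue                  # block already obtained elsewhere
        if hashlib.sha1(block).hexdigest() == expected_hashes[index]:
            result[index] = block
    if any(block is None for block in result):
        return None
    return b"".join(result)
```

Because every block is verified on its own, a downloader can pass on the blocks it already holds to other peers before its own download is complete.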

(20)

Information (3/5)

Document Management

Usually centrally organized

But

► A large portion of the documents created in a company is distributed among desktop PCs without a central repository having any knowledge of their existence.

Solution

► P2P networks which create a connected repository from the local data on the individual peers.

► Indexing and categorization of data by each peer on the basis of individually selected criteria.

► Self-organized aggregation of information from areas of knowledge.

(21)

Information (4/5)

Presence Information

Important role in the self-organization of P2P networks and in scenarios related to omnipresent computers and information availability (ubiquitous computing).

Provides information about which peers and which resources are available in the network.

● Example: Instant Messaging Systems

P2P application which essentially uses presence information.

Peers announce via the network whether or not they are available for communication.

http://www.trillian.cc/
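
How presence information might be tracked can be sketched as follows. This is only an illustration: the class `PresenceTable`, its timeout value, and the peer names are assumptions and do not describe how any concrete instant messaging system works. Each peer announces itself periodically, and a peer that has not been heard from within the timeout is treated as unavailable.

```python
class PresenceTable:
    """Availability list built from periodic announcements of the peers."""

    def __init__(self, timeout):
        self.timeout = timeout        # seconds without announcement = offline
        self.last_seen = {}           # peer -> time of last announcement

    def announce(self, peer, now):
        """Record that `peer` reported itself as available at time `now`."""
        self.last_seen[peer] = now

    def leave(self, peer):
        """Record an explicit sign-off."""
        self.last_seen.pop(peer, None)

    def online(self, now):
        """Peers whose last announcement is recent enough."""
        return sorted(peer for peer, seen in self.last_seen.items()
                      if now - seen <= self.timeout)
```

In a pure P2P setting every peer would keep such a table for the peers it knows; in a server-based setting the table lives on the server.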

(22)

Information (5/5)

Collaboration

Members of working groups can communicate synchronously, conduct joint online meetings and edit shared documents.

Groupware offers functions like instant messaging, file sharing, notification, co-browsing, whiteboards, voice conferences and databases with real-time synchronization.

Client/server based groupware has to be set up and administered on the server for each working group.

P2P groupware avoids additional administrative tasks and central data management:

► All of the data created is stored on each peer and is synchronized automatically.

► Users can set up shared working environments for virtual teams (so-called shared spaces).

► Users can invite other users to work in these teams.

(23)

Bandwidth (1/4)

Typical Centralized Approach

Files are held on the server of an information provider.

Files are transferred from there to the requesting client.

Spontaneous increases in demand exert a negative influence on the

[Figure: unicast; a single sender delivers the file via routers to each receiver]

(24)

Bandwidth (2/4)

● Increased Load Balancing

Achieve increased load balancing by taking advantage of transmission routes which are not being fully exploited.

Peer-to-Peer Unicast:

Initial requests for files have to be served by a central server.

Further requests can be automatically forwarded to peers within the network which have already received and replicated these files.

Sample application: Skype

[Figure: peer-to-peer unicast; the sender and the routers connect peers that each act as receiver and sender]
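
The peer-to-peer unicast idea described on this slide can be sketched as follows. This is a minimal illustration: the class `OriginServer`, the round-robin choice of the source, and the file name are assumptions. The first request is served by the central server; later requests are redirected to peers that already hold a replica.

```python
class OriginServer:
    """Central server that hands later requests over to replicating peers."""

    def __init__(self, files):
        self.files = dict(files)
        self.replicas = {name: [] for name in self.files}  # peers with a copy
        self.served_by_server = 0
        self._next = 0

    def request(self, peer, name):
        """Return (source, content); source is 'server' or a peer name."""
        holders = self.replicas[name]
        if holders:
            # Spread the load over all known replicas (round robin).
            source = holders[self._next % len(holders)]
            self._next += 1
        else:
            source = "server"
            self.served_by_server += 1
        if peer not in holders:
            holders.append(peer)      # the requester now holds a replica
        return source, self.files[name]
```

For three consecutive requests only the first one loads the server; the others are answered by earlier requesters, which is the load-balancing effect the slide refers to.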

(25)

Bandwidth (3/4)

Increased Load Balancing

Achieve increased load balancing by taking advantage of transmission routes which are not being fully exploited.

Information Channel Approach:

[Figure: information channel approach; peers linked by info channels, with items marked "new"]

(26)

Bandwidth (4/4)

Shared Use of Bandwidth

P2P approaches also facilitate the shared use of the bandwidth provided by the information providers.

Segmentation Approach:

[Figure: segmentation approach; a document (Doc) split into Part 1, Part 2, and Part 3, which are distributed among the peers]
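
A possible reading of the segmentation approach is sketched below. The sketch is an assumption-laden illustration: the function names, the peer names, and the least-loaded selection rule are hypothetical. Each part of a document is requested from a different peer where possible, so that no single uplink has to carry the whole document.

```python
def plan_downloads(num_parts, peers_with_parts):
    """Assign every part to the least-loaded peer that offers it.

    peers_with_parts: dict mapping peer -> set of part indices it holds
    Returns a dict mapping part index -> chosen peer.
    """
    load = {peer: 0 for peer in peers_with_parts}
    plan = {}
    for part in range(num_parts):
        candidates = [peer for peer, parts in peers_with_parts.items()
                      if part in parts]
        if not candidates:
            raise LookupError(f"no peer offers part {part}")
        # Prefer the peer with the fewest assignments; break ties by name.
        chosen = min(candidates, key=lambda peer: (load[peer], peer))
        load[chosen] += 1
        plan[part] = chosen
    return plan


def reassemble(parts):
    """Join the downloaded parts (dict index -> bytes) in index order."""
    return b"".join(parts[index] for index in sorted(parts))
```

With parts 0, 1, and 2 standing for Part 1 to Part 3, a document held completely by one peer and partially by two others is fetched from all three peers in parallel.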

(27)

P2P Storage Networks (1/5)

Centralized Design Concepts Used to Store Data in a Company

Direct Attached Storage (DAS)

Network Attached Storage (NAS)

Storage Area Networks (SAN)

Disadvantages:

► Inefficient use of the available storage.

► Additional load on the company network.

► Necessity for specially trained personnel.

► Additional backup solutions.

(28)

P2P Storage Networks (2/5)

A P2P Storage Network is a cluster of computers, formed on the basis of existing networks, which share all storage available in the network

Examples: PAST, Pasta, OceanStore

Organization:

Each peer receives a public/private key pair

The public key is used to create an unambiguous identification number for each peer (with the aid of a hash function)

Each peer must make available some of its own storage, or pay a fee

Corresponding to its contribution, each peer is assigned a maximum volume of data which can be added to the network

A file is assigned an unambiguous identification number (hash function from the name or the content and the public key of the owner)

Storing the file and searching for it in the network takes place in the manner described for the document routing model
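
The identification scheme described above can be sketched as follows. This is a minimal sketch under stated assumptions: SHA-256 is chosen only for illustration, the public keys are plain placeholder byte strings rather than real key material, and the names `peer_id`, `file_id`, and `Quota` are hypothetical. The systems named on the slide each define their own details.

```python
import hashlib


def peer_id(public_key):
    """Derive a peer's identification number from its public key (bytes)."""
    return int.from_bytes(hashlib.sha256(public_key).digest(), "big")


def file_id(name, owner_public_key):
    """Derive a file's identification number from its name and its owner."""
    digest = hashlib.sha256()
    digest.update(name.encode("utf-8"))
    digest.update(owner_public_key)
    return int.from_bytes(digest.digest(), "big")


class Quota:
    """Storage a peer may use, proportional to what it contributes."""

    def __init__(self, contributed_bytes, ratio=1.0):
        self.limit = int(contributed_bytes * ratio)
        self.used = 0

    def try_store(self, size):
        """Account for a new file; refuse it if the quota would be exceeded."""
        if self.used + size > self.limit:
            return False
        self.used += size
        return True
```

Including the owner's key in the file hash means that two owners can store files with the same name without colliding on one identification number.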

(29)

P2P Storage Networks (3/5)

Buildup

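[Figure: buildup of the network; a joining peer obtains ID 1 via the hash function, sends "Hello" messages, and records the peers with ID 3, ID 4, and ID 25 as neighbors; the network also contains the peers ID 8, ID 10, and ID 17]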

(30)

P2P Storage Networks (4/5)

Store Documents

[Figure: storing a document; the hash function maps the document to ID 11, and the document is forwarded between the peers ID 1, ID 3, ID 4, ID 8, ID 10, ID 17, and ID 25 using their neighbor lists]

(31)

P2P Storage Networks (5/5)

Retrieve Documents

[Figure: retrieving a document; the request for ID 11 (requestor: 1) is forwarded between the peers ID 1, ID 3, ID 4, ID 8, ID 10, ID 17, and ID 25 using their neighbor lists]
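
Storing and retrieving a document by its ID can be sketched with a simple greedy forwarding rule. The sketch rests on assumptions: the peer IDs and the document ID 11 are taken from the figures, but the neighbor lists below are hypothetical and not read off the slides, and "numerically closest ID" is only one possible routing metric. Each peer forwards the request to the neighbor whose ID is closest to the document ID until no neighbor is closer than the peer itself.

```python
def route(neighbors, start, key):
    """Forward greedily towards the peer whose ID is closest to `key`."""
    path = [start]
    current = start
    while True:
        best = min(neighbors[current], key=lambda n: (abs(n - key), n))
        if abs(best - key) >= abs(current - key):
            return path               # no neighbor is closer: stop here
        current = best
        path.append(current)


class StorageNetwork:
    def __init__(self, neighbors):
        self.neighbors = neighbors
        self.store = {peer: {} for peer in neighbors}

    def put(self, start, key, document):
        """Store the document on the peer where routing ends."""
        path = route(self.neighbors, start, key)
        self.store[path[-1]][key] = document
        return path

    def get(self, start, key):
        """Route a request from the requestor and return (document, path)."""
        path = route(self.neighbors, start, key)
        return self.store[path[-1]].get(key), path


# Hypothetical neighbor lists for the peer IDs shown in the figures.
neighbors = {
    1: [3, 4, 25],
    3: [1, 4, 10],
    4: [1, 3, 8],
    8: [4, 10, 17],
    10: [3, 8, 17],
    17: [8, 10, 25],
    25: [1, 17],
}
```

In this example a document with ID 11 inserted at peer 1 travels via peers 4 and 8 to peer 10, and a later request from peer 25 reaches the same peer via peer 17. Greedy forwarding like this only works if all starting points end at the same peer, which real systems ensure through the way neighbor lists are constructed.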

(32)

Processor Cycles

Increasing Requirements for High Performance Computing

e.g. in the fields of bio-informatics, logistics, or the financial sector

Available Computing Power of the Networked Entities often Unused

Using P2P Applications to Bundle Processor Cycles:

Forming a cluster of independent, networked computers in which the individual computer is transparent and all networked nodes are combined into a single logical computer

Achieve computing power which even the most expensive supercomputers can scarcely provide

“Grid Computing”

Examples:

Popular example: SETI@home

► Calculations during the idle processor cycles of participating peers

Advanced vision of grid computing: Globus Toolkit

► Standardized middleware for grid applications

Note: The core of SETI@home is a classical client/server application
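
The bundling of processor cycles can be illustrated with a small sketch. It makes several assumptions: worker threads stand in for participating peers, the task (counting primes) and all function names are hypothetical, and the central coordinator mirrors the client/server core mentioned in the note rather than a pure P2P design. A job is cut into independent work units, the units are handed out, and the partial results are combined.

```python
from concurrent.futures import ThreadPoolExecutor


def make_work_units(numbers, unit_size):
    """Cut the job into independent work units."""
    return [numbers[i:i + unit_size]
            for i in range(0, len(numbers), unit_size)]


def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def count_primes(unit):
    """Work done by one peer on one work unit."""
    return sum(1 for n in unit if is_prime(n))


def run_on_peers(units, num_peers):
    """Hand the units to `num_peers` workers and combine their results."""
    with ThreadPoolExecutor(max_workers=num_peers) as pool:
        return sum(pool.map(count_primes, units))
```

The approach fits problems whose work units need no communication with each other; tightly coupled computations benefit far less from loosely connected peers.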

(34)

Financial Motivation in P2P Systems

P2P applications often lack revenue generation

Needed? Usually barter structures are instantiated (principle of reciprocity)

Revenue model of P2P

Currently only indirect revenues (e.g. ads, cross-selling)

Viable direct business models are sought

Key Questions

Who are the players?

What open issues are to be solved?

How can parties recover their costs and earn a margin of profit?

(35)

P2P Business Applications need Revenue Creation

Instant Messaging

Direct message exchange

At least two interaction partners

Services like AIM have to provide infrastructure for about 200 million users

Grid Computing

Offering of computing resources

Digital Content Sharing

Exchange of content

Additional functionality connected with content

Collaboration

Work or play in ad hoc groups

Support regarding coordination and cooperation

(36)

Application Style vs. Service Style

P2P Application Style

Packaged solutions (e.g. Lotus Instant Messaging, Groove)

Set of common definitions (e.g. .NET, Gnutella)

P2P Service Style

Services based on P2P interaction model

No once-bought-used-forever model

(37)

P2P Interaction Styles

[Figure: P2P interaction styles; the providing interaction partner provides the object of interaction, the receiving interaction partner receives it, the legal owner of the object owns the rights of it, and the mediating service facilitates the interaction]

(38)

Business Models vs. Revenue Models

Business Model:

Totality of processes and arrangements that define a company's approach to commercial markets in order to sell services and/or goods and generate profits.

Revenue Model:

All arrangements that permit the participants in business interactions to charge fees, paid by one or several other participants, in order to cover costs and add a margin to create profit.

The revenue model is part of the business model

(40)

Revenue Models

Indirect Revenue Models

Product is free of charge

Gain received from third party

Realisations: Advertisement, Affiliate Model, Bundling

Direct Revenue Models

Receipts come directly from customer

Realisations: Sales, Transaction fees, Subscription

(41)

Requirements of a Revenue Model

● Differentiated Charging

Charge according to criteria of usage

Prerequisite for efficient revenue models

-> Intense usage leads to high charges

● Allocation Effectiveness

Revenue stream to the appropriate receiver

-> Party that has incurred the cost receives revenue
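
The two requirements can be made concrete with a small calculation. All figures, party names, and function names below are hypothetical and serve only as an illustration: `bill` charges each user in proportion to usage (differentiated charging), and `allocate` passes the revenue on in proportion to the cost each party incurred (allocation effectiveness).

```python
def bill(usage, price_cents_per_unit):
    """Differentiated charging: each user pays according to usage.

    usage: dict mapping user -> units consumed
    Returns a dict mapping user -> charge in cents.
    """
    return {user: units * price_cents_per_unit
            for user, units in usage.items()}


def allocate(total_cents, costs):
    """Allocation effectiveness: split revenue by incurred cost.

    costs: dict mapping party -> cost incurred (any common unit)
    Amounts are rounded down to whole cents.
    """
    total_cost = sum(costs.values())
    return {party: total_cents * cost // total_cost
            for party, cost in costs.items()}
```

For example, a user with twenty times the usage pays twenty times the charge, and a provider that carried three quarters of the cost receives three quarters of the revenue.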

(42)

Revenue models for… Instant Messaging (1/3)

● Features of Instant Messaging

Text and/or voice message exchange between peers

Services like “Buddy list” and other functionalities

-> Services don't have to be central

Object: message

Owner/Provider: peer (sender of the message)

Receiver: peer

Mediator: instant messaging service

(43)

Revenue models for… Instant Messaging (2/3)

Topologies

Server provides service: not P2P from a technological point of view

Server only lists buddies: communication self-governed by peers

Pure P2P topology: message exchange and buddy list service decentralised, no server involved

[Figure: communication between peers in the three topologies]

(44)

Revenue models for… Instant Messaging (3/3)

● Revenue model for IM provided in application style

License fees

Optional professional services

● Revenue model for IM provided in service style

Subscription fees

► Undifferentiated -> Not efficient

► Fees per log on -> Not very efficient; hard to realise in pure topology

► Usage dependent -> Efficient; only problem-free in C/S topology

(45)

Revenue models for… Digital Content Sharing (1/3)

● Features of digital content sharing

Prominent example: Exchange of entertainment media files

But: Sharing of any content possible, in particular for decentralised knowledge management

Streaming of Content

Catalogue service

Object: digital content

Owner: provider or third party

Provider: peer

Receiver: peer

Mediator: digital content sharing service (not necessarily required)

(46)

Revenue models for… Digital Content Sharing (2/3)

● Revenue model for DCS as application style

License fees

Consulting services

● Revenue model for DCS as service style

a) Legal owner is not identical with provider

Membership/Subscription fees

Fees per log on

Matchmaking fees

-> Legally problematic: the legal owner of the rights of the exchanged files is not a participant in the transaction and therefore cannot cover his costs

(47)

Revenue models for… Digital Content Sharing (3/3)

● Revenue model for DCS as service style

b) Legal owner is not identical with provider but the owner receives compensation

► Billing step implemented into content exchange

► Mediator -> aggregating middleman

► P2P-Distribution: No clear economic value for owners

c) Legal owner is identical with provider

► Differentiated charging and owner is compensated

► Providers don't sell the object but limited rights to its usage

In all cases: Additional content protection scheme is needed to enforce payment!

(48)

Revenue models for… Grid Computing (1/3)

● Features of grid computing

► Utilization of distributed computing resources

► Often C/S-based and thus not true P2P from a technological point of view, but complex problems are solved by more or less independent peers

► Pure understanding: Peers can provide and demand resources

Object: computing resources

Owner/Provider: peer providing the resources

Receiver: peer using the computing resources

Mediator: management of resource provision, often a central server application

(49)

Revenue models for… Grid Computing (2/3)

● Revenue model for GC as application style

-> Enterprise software sale

License fees

Professional services

● Revenue model for GC as service style

-> Public internet exchange or cross-company

a) Compensating the Mediator

Management of Grid on behalf of a third party:

Cost of mediating service + margin has to be charged

Management of Grid by Receiver:

Business utilisation has to cover the cost

(50)

Revenue models for… Grid Computing (3/3)

● Revenue model for GC as service style

b) Compensating the Provider

Often: providing for free or for a part of the results

Desirable: monetary reimbursement

► Pay-per-use model feasible from a technical point of view

► Problem: high transactional costs for payment

► Highly efficient methods for micropayment needed

► Problem: Financial incentives may be incapable of attracting providers

No problems regarding allocation effectiveness and efficiency, but

Problem of micropayment

Problem of sufficient business value
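
Why transaction costs are a problem for pay-per-use can be shown with a short calculation. The prices and fees in the example are hypothetical, as is the function name `net_revenue`; the point is only the ratio between the payment per use and the fixed fee per payment transaction.

```python
def net_revenue(price_cents, uses, fee_cents, uses_per_settlement):
    """Provider revenue after payment fees, all amounts in cents.

    price_cents: amount earned per use of the resource
    fee_cents: fixed cost of one payment transaction
    uses_per_settlement: number of uses paid for in one transaction
    """
    settlements = -(-uses // uses_per_settlement)   # ceiling division
    return price_cents * uses - fee_cents * settlements
```

With 1 cent per use and a 25 cent transaction fee, settling each of 1000 uses separately loses money, while settling every 500 uses in one payment keeps almost the full amount. Aggregating many small claims before payment is therefore one way to reduce the overhead.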

(51)

Revenue models for… Collaboration (1/2)

● Features of Collaboration

Providing functions beyond email and workflow

Supporting standard groupware applications

P2P adds flexibility, e.g. ad hoc working groups

Here: groupware applications used in business context

-> defined and authenticated members

Object: message or document

Owner/Provider: partner in workgroup

Receiver: partner in workgroup

Mediator: collaboration server (not necessarily required)

(52)

Revenue models for… Collaboration (2/2)

● Revenue model for Collaboration as application style

► Licensing models

► High demand for professional services

● Revenue model for Collaboration as service style

-> hosted as a service

► Undifferentiated -> Not efficient

► Fees for buddy list/catalogue service etc. -> Not very efficient; hard to realise in the pure topology

► Transaction-based fees (e.g. transferred data) -> Efficient; only problem-free in C/S topology

► Further consideration: User or group based bills possible

(53)

Discussion (1/3)

● Revenue models for P2P application style are not different from those for traditional application style

● Differentiated charging difficult for IM and Collaboration Groupware

Providers -> Infrastructure

● Allocation effectiveness difficult for DCS

DCS affects copyrights belonging to third party

● GC suffers from overhead of micropayments

Accounting centre required

Strategies for different parties to increase their revenue?

(54)

Discussion (2/3) – How to increase Revenue?

Instant messaging

Bundling with Interactive agents

Providing Location based services

Multiple service levels (example: Skype / SkypeOut)

Digital content sharing

► Try to own communities

► Bundling digital content with other goods (example: concert tickets)

► …

What further possibilities are conceivable?

(55)

Discussion (3/3) – How to increase Revenue?

Grid computing

Bundling is no solution and micropayment may not be feasible

Barter-like structures: provide information goods as reimbursement

Collaboration

► Bundling similar to IM

► Multiple service levels

► …

Build ‘Closed Communities’:

P2P technology, but strong access control

(57)

To Peer-to-Peer or not to Peer-to-Peer

● Often Discussed Problem: Where is P2P Really Needed?

Multiple classification systems have been designed to judge how suitable a P2P solution might be for a particular problem

● E.g. in the Form of Decision Trees

(58)

Conclusions

Based on characteristics of a wide range of P2P systems including budget, resource relevance, trust, rate of system change, criticality

“…the characteristics that motivate a P2P solution are limited budget, high relevance of the resource, high trust between nodes, a low rate of system change, and a low criticality of the solution. We believe that the limited budget requirement is the most important motivator.”

M. Roussopoulos, M. Baker, D. Rosenthal, T. Giuli, P. Maniatis and J. Mogul: “2 P2P or not 2 P2P?”, IPTPS 2004

http://www.springerlink.com/content/bvx594yud8rd2gfp/

Indeed, many centralized large-scale systems can ‘kill the problem with iron’, e.g. Web search engines like Google, Yahoo, etc.
