Lect5-Peer2Peer
2023
PEER-TO-PEER SYSTEMS
1. Characteristics of Peer-to-Peer Systems
2. The Napster File System
3. BitTorrent
P2P Basics
Main characteristics of peer-to-peer systems:
Each user contributes resources to the system.
All the nodes have the same functional capabilities and responsibilities
(although they may differ in the resources they contribute).
Correct operation does not depend on the existence of any centrally
administered system.
Key issues:
Choice of strategy for
the placement of data and their replicas across many hosts;
the access to data
such that
the workload of nodes and communication lines is balanced;
availability of data is ensured.
Important!
Availability of processes/computers in peer-to-peer systems is a
problem!
Services cannot rely on guaranteed access to a host.
Peer-to-Peer Systems - History
Volunteer computing
Offering idle CPU cycles for HPC applications
Early pioneer: SETI@home (1999) https://setiathome.berkeley.edu/
Downloading and processing radioastronomy data packets as a useful
screensaver application (at a time when electricity was cheap...)
Later (early 2000s): Grid computing middleware, such as GLOBUS toolkit
More stable scenarios and resources; organizations as contributors
A predecessor of modern cloud computing
File/data sharing
First Pioneer: Napster (1999)
The index is centralized!
Later Systems: Freenet, Gnutella, Kazaa, BitTorrent
Only semi-centralized or completely distributed
Better anonymity, scalability, fault tolerance
Blockchain: a secure, decentralised database;
Bitcoin is implemented on top of a blockchain.
The Napster File Sharing System
Napster provided a globally-scalable information storage and
retrieval service for digital music (.mp3) files.
Napster was the first to demonstrate the feasibility of a peer-to-peer
solution on a large scale.
Napster, as an open service, was shut down in July 2001 as a result of
lawsuits over copyright issues.
Problems with Napster
Centralised index:
Scaling problem (server capacity and network bandwidth).
Anonymity of operators is not possible:
legal responsibility for copyright issues can be put on operators
maintaining the central index.
A completely distributed index can provide better scaling and
anonymity.
Napster provided solutions neither for the consistency of replica
updates nor for guaranteed availability.
This was not a problem because of the particular application,
music files:
Music files are immutable (they do not change after being created),
so there is no need to keep replicas consistent.
If a file is unavailable at a certain moment,
it can be downloaded later.
Later systems, like BitTorrent, tried to solve some of the above
problems by applying various specific ad-hoc solutions.
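To make the centralised-index design discussed above concrete, here is a minimal sketch in Python. The class name CentralIndex, its methods, and the peer addresses are hypothetical and only for illustration; this is not Napster's actual protocol. It shows that the index stores only who has which file, while the files themselves stay on the peers.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central index: maps a file name to the peers that hold it."""

    def __init__(self):
        self._holders = defaultdict(set)

    def register(self, peer, filenames):
        # A peer announces the files it is willing to share.
        for name in filenames:
            self._holders[name].add(peer)

    def unregister(self, peer):
        # Called when a peer leaves; peers may disappear at any time.
        for peers in self._holders.values():
            peers.discard(peer)

    def lookup(self, filename):
        # Returns the peers to contact; the file never passes through the index.
        return sorted(self._holders.get(filename, set()))


index = CentralIndex()
index.register("10.0.0.1:6699", ["song_a.mp3", "song_b.mp3"])
index.register("10.0.0.2:6699", ["song_a.mp3"])
print(index.lookup("song_a.mp3"))   # both peers hold song_a.mp3
index.unregister("10.0.0.1:6699")
print(index.lookup("song_b.mp3"))   # the only holder has left
```

Every lookup goes through this single object, which is exactly why server capacity limits scaling and why the operator of the index is easy to identify.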
BitTorrent
Like Napster, BitTorrent is a peer-to-peer file-sharing
application,
but it is much more decentralized than Napster
Designed by Bram Cohen; first release 2001; several versions followed
* How to find the .torrent file corresponding to the file one is interested in
is not part of the protocol: use a web search engine, or go to specialised pages
(e.g. PirateBay, but also many other less controversial ones).
BitTorrent
Step 0: Search for the .torrent
file and save it;
BitTorrent
The tracker is the computer in charge of managing the transfer of a file:
The tracker’s URL is extracted from the .torrent file
It keeps track of the connected computers;
it helps the computers in the swarm connect to each other
by sharing their IP addresses.
NB the file is not downloaded from the tracker!
The tracker coordinates the swarm.
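The tracker's role described above can be sketched as a small in-memory service. This is a simplified illustration, not the real tracker protocol: the class Tracker, the method announce, the identifier "torrent-x", and the addresses are all assumptions made for the example.

```python
class Tracker:
    """Hypothetical in-memory tracker: one swarm (set of peer addresses) per torrent."""

    def __init__(self):
        self._swarms = {}

    def announce(self, torrent_id, peer_addr):
        # Record the announcing peer and return the other peers in the swarm.
        # Only addresses are exchanged; no file data goes through the tracker.
        swarm = self._swarms.setdefault(torrent_id, set())
        others = sorted(swarm - {peer_addr})
        swarm.add(peer_addr)
        return others


tracker = Tracker()
first = tracker.announce("torrent-x", "192.0.2.1:6881")
second = tracker.announce("torrent-x", "192.0.2.2:6881")
third = tracker.announce("torrent-x", "192.0.2.1:6881")
print(first, second, third)
```

The first peer to announce gets an empty list; each later peer learns the addresses of those already in the swarm and then contacts them directly.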
BitTorrent
Tit-for-tat reward system
The reward system tries to discourage peers from only downloading
without contributing any uploads:
In order to receive files, you also have to give.
Clients reward other clients who upload, preferring to send data
to clients who contribute more upload bandwidth
the more files you share with others,
the faster your downloads are.
After you have got the whole file, you should continue to run the
client
you stay as a potential seeder which others can use
your rates improve in the tit-for-tat system.
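The tit-for-tat idea above can be sketched as a ranking step: a client uploads to the peers that have recently sent it the most data. The function name choose_unchoked, the slots parameter, and the sample figures are assumptions for illustration. Real clients use a more elaborate policy, for example also giving newcomers an occasional chance, which this sketch omits.

```python
def choose_unchoked(received_bytes, slots=4):
    """Return the peers to upload to: those that recently sent us the most data.

    received_bytes: dict mapping a peer name to the bytes it sent us recently.
    slots: how many peers we are willing to upload to at once (assumed value).
    """
    # Sort by amount received (descending); break ties by name for determinism.
    ranked = sorted(received_bytes.items(), key=lambda item: (-item[1], item[0]))
    # Peers that contributed nothing are never rewarded in this sketch.
    return [peer for peer, amount in ranked[:slots] if amount > 0]


recent = {"peer_a": 500, "peer_b": 0, "peer_c": 1200, "peer_d": 300}
print(choose_unchoked(recent, slots=2))
```

With two slots, the two peers that uploaded the most are served, and a peer that uploaded nothing is never chosen regardless of the number of slots.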
Napster vs. BitTorrent:
Scaling, Availability, Developments
Napster:
centralized indexing service: if it fails, no file can be located;
the whole file is downloaded from the same peer:
if that peer fails, the download fails.
Potentially reduced scalability and availability
BitTorrent:
no indexing system (just need a .torrent file);
pieces of the file are downloaded (in parallel) from multiple
seeders and leechers from the swarm.
Increased scalability, availability, performance.
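As a rough illustration of downloading pieces from several peers at once, the sketch below spreads piece requests over the peers that hold each piece. The function assign_pieces and the peer names are hypothetical, and the load-balancing rule is a simplification chosen for the example; actual BitTorrent clients select pieces with other policies (such as preferring rare pieces).

```python
def assign_pieces(availability):
    """Map each piece index to one peer that has it, balancing load across peers.

    availability: dict mapping a peer name to the set of piece indices it holds.
    """
    load = {peer: 0 for peer in availability}
    plan = {}
    all_pieces = sorted(set().union(*availability.values()))
    for piece in all_pieces:
        holders = [p for p in availability if piece in availability[p]]
        # Pick the holder with the fewest pieces assigned so far (ties by name).
        chosen = min(holders, key=lambda p: (load[p], p))
        plan[piece] = chosen
        load[chosen] += 1
    return plan


availability = {
    "seeder": {0, 1, 2, 3},
    "leecher_1": {0, 1},
    "leecher_2": {2},
}
plan = assign_pieces(availability)
print(plan)
```

Because leechers hold some pieces too, the seeder no longer has to serve the whole file, and the loss of one peer affects only the pieces assigned to it.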
BitTorrent:
Scaling, Availability, Developments
A potential point of failure is the tracker supervising the swarm!
Multi-tracker implementations:
Multiple trackers can be used for one torrent;
they are specified in the .torrent file.
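A client can exploit multiple trackers by trying them in turn until one answers. The sketch below assumes a caller-supplied contact function that raises OSError when a tracker is unreachable; announce_with_failover, fake_contact, and the example.com-style URLs are hypothetical names used only for illustration.

```python
def announce_with_failover(tracker_urls, contact):
    """Try each tracker in order; return (url, peers) from the first that answers."""
    errors = []
    for url in tracker_urls:
        try:
            return url, contact(url)
        except OSError as exc:
            # This tracker is unavailable; remember why and try the next one.
            errors.append((url, exc))
    raise ConnectionError(f"all trackers failed: {errors}")


def fake_contact(url):
    # Stand-in for a real network request; the first tracker is "down".
    if url == "http://tracker-a.example/announce":
        raise OSError("unreachable")
    return ["192.0.2.7:6881"]


urls = ["http://tracker-a.example/announce", "http://tracker-b.example/announce"]
used, peers = announce_with_failover(urls, fake_contact)
print(used, peers)
```

The swarm stays reachable as long as at least one listed tracker responds, so the tracker is no longer a single point of failure.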