MSC Thesis
Candidate: Michele RULLO (1328929)
Supervisor: Leonardo QUERZONI
Academic Year 2015/2016
Acknowledgement
I wish to thank Giada, from the bottom of my heart, for being present
in the most important decisions of my life.
Finally, thanks to all those wonderful people who played and keep playing
a fundamental role in my life.
Abstract
1 Introduction 3
1.1 Anonymous Networks . . . . . . . . . . . . . . . . . . . . 6
3.2.3 Task Execution . . . . . . . . . . . . . . . . . . . 35
3.2.4 Output Production . . . . . . . . . . . . . . . . . 35
4 Validation 37
4.1 Gateway Container . . . . . . . . . . . . . . . . . . . . . 39
4.2 Logic Container . . . . . . . . . . . . . . . . . . . . . . . 41
4.3 Validation Objectives . . . . . . . . . . . . . . . . . . . . 43
4.4 Validation Results . . . . . . . . . . . . . . . . . . . . . . 44
4.5 Conclusions and Future Work . . . . . . . . . . . . . . . . 52
B Architecture: Scripts 68
B.1 Network Tests: Scripts . . . . . . . . . . . . . . . . . . . 73
Chapter 1
Introduction
Around 40% of the world population has an Internet connection today, and
we expect this number to grow in the near future. As more people get
connected every day, the Internet has brought about a real revolution in
information exchange. In this fast-paced context, it has become crucial
to protect data from potential attackers and eavesdroppers. To address
this problem, security measures have to be taken into account.
In order to secure our system we must ensure three particular properties:
constantly be accessible, performing hardware and software maintenance.
Backups, RAID, redundancy, high-availability clusters, etc. are the
usual methods to guarantee availability.
pothetical adversary attempting to attack the crawler.
This work ends by discussing the results achieved and the current
limitations of the architecture.
1.1 Anonymous Networks
"An anonymous P2P communication system is a peer-to-peer distributed
application in which the nodes or participants are anonymous or pseudony-
mous. Anonymity of participants is usually achieved by special routing
overlay networks that hide the physical location of each node from other
participants." [1]
You’re Not Paying for It; You’re the Product", in the sense that many
companies are actually interested in collecting your personal data while
you navigate the web, with or without your consent (and they successfully
do so on a daily basis).
The next chapter presents an overview of the Tor network, introducing
the basic concepts needed to fully understand its limitations.
Chapter 2
the next relay. This guarantees anonymity, since each intermediate node
is only able to decrypt the top layer of the packet, and is thus unable
to trace back the original sender or discover the actual end-point of
the communication.
Tor allows establishing circuits of arbitrary length. However, to reduce
the performance overhead, the default length is set to the minimum number
of hops that guarantees anonymity, that is 3.
It is not known whether a longer circuit actually improves the
security/performance balance. As the paper by Bauer et al. states:
"Choosing a path length for low latency anonymous networks that opti-
mally balances security and performance is an open problem." [2]
Figure 2.1: Tor Scheme
when communicating between onion routers. It is worth mentioning that
the short-term keys are periodically rotated, to further increase security.
Tor relays exchange fixed-size messages named cells via TLS connections.
Cells have a size of 512 bytes, and they come in two types: control
cells and relay cells, both consisting of a header and a payload. A relay
cell carries end-to-end data, while a control cell is used to give
instructions to the node that receives it. In particular, there are 4
types of control commands for a control cell:
• relay teardown: to close a stream.
While a relay cell contains all the above-mentioned metadata, a control
cell contains only the circuit ID and a command.
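The fixed-size cell layout just described can be sketched in a few lines. The 2-byte circuit ID and 1-byte command follow the original Tor design document; the helper names are ours and purely illustrative:

```python
import struct

CELL_LEN = 512          # fixed cell size in bytes
CIRCID_LEN = 2          # circuit identifier (2 bytes in the original design)
CMD_LEN = 1             # command byte
PAYLOAD_LEN = CELL_LEN - CIRCID_LEN - CMD_LEN  # 509 bytes

def pack_cell(circ_id: int, command: int, payload: bytes = b"") -> bytes:
    """Build a fixed-size cell: header (circID + command) plus padded payload."""
    assert len(payload) <= PAYLOAD_LEN
    body = payload.ljust(PAYLOAD_LEN, b"\x00")   # pad to the fixed size
    return struct.pack("!HB", circ_id, command) + body

def unpack_cell(cell: bytes):
    """Split a cell back into (circID, command, payload)."""
    assert len(cell) == CELL_LEN
    circ_id, command = struct.unpack("!HB", cell[:3])
    return circ_id, command, cell[3:]

cell = pack_cell(0x0A0B, 2, b"hello")
assert len(cell) == 512
```

The fixed size is what makes all cells look alike on the wire, which is precisely why traffic observers cannot distinguish them by length.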
The actual establishment of a circuit is accomplished by taking advantage
of the Diffie-Hellman key exchange, sending control and relay cells among
the involved nodes. For a detailed explanation of how the circuit is
constructed, please refer to the Tor design document. It is worth
mentioning that all the TCP requests/responses are handled using the
SOCKS protocol, which makes it possible to carry TCP streams across
multiple proxies.
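Since all TCP streams are carried over SOCKS, it may help to see how a SOCKS5 client greeting and CONNECT request are laid out on the wire (per RFC 1928; the helper functions below are our own illustration, not part of any Tor library):

```python
import struct

def socks5_greeting() -> bytes:
    # version 5, one auth method offered: 0x00 (no authentication)
    return b"\x05\x01\x00"

def socks5_connect(host: str, port: int) -> bytes:
    """CONNECT request using a domain name (ATYP=0x03), so the proxy
    resolves the hostname itself -- important to avoid DNS leaks."""
    name = host.encode("ascii")
    return (b"\x05"            # SOCKS version
            b"\x01"            # CMD: CONNECT
            b"\x00"            # reserved
            b"\x03"            # ATYP: domain name
            + bytes([len(name)]) + name
            + struct.pack("!H", port))

req = socks5_connect("example.onion", 80)
assert req[:4] == b"\x05\x01\x00\x03"
```

Letting the proxy resolve the hostname (rather than resolving locally) is also how Tor clients reach .onion addresses at all.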
2.1 Threat Model
From the anonymity point of view, it is important to define the actual
capabilities of an attacker. We remark that the goal of a malicious
user concerns de-anonymization rather than decrypting flowing packets.
Focusing on general issues is useful, but it is not sufficient to outline
a realistic threat.
In this section two types of attacker are analyzed: the single-node
attacker and the multi-node attacker. Both models aim at disclosing
the original IP address of the sender.
to assume the malicious user has enough resources to perform such an
attack.
We identify four different scenarios:
• The attacker is an exit node: this case is similar to the first one,
in which the attacker impersonates a guard node. If the end-to-end
traffic is not encrypted, the attacker is able to sniff the traffic
going toward the recipient.
It is worth mentioning that traffic analysis techniques are quite
ineffective unless the attacker is able to directly sniff the traffic
flowing between our crawler and the target.
Since a network packet must eventually enter (exit) the Tor network, one
of the major issues concerns who controls the guard (exit) node. Assume
that an attacker owns both a guard and an exit node. If the end-to-end
connection between Alice (our crawler) and Bob (our target) is not
encrypted, the attacker may inject "tags" (e.g. an HTTP tag/comment)
into the network packets that flow between Alice and Bob. Exploiting
this technique, the attacker may de-anonymize Alice by correlating the
traffic that flows between the guard and the exit nodes by means of the
previously injected tags.
This issue can be mitigated by forcing end-to-end encryption (TLS/SSL
or VPN) or by checking integrity (hashes). However, since this might not
always be possible (e.g. Bob does not support secure protocols), a proxy
between Tor and Bob might be used.
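The integrity-checking mitigation mentioned above can be sketched as follows. This is a simplified illustration using SHA-256; in practice the reference digest must be obtained over a trusted channel, otherwise the attacker can simply replace it too:

```python
import hashlib

def content_digest(data: bytes) -> str:
    """Hash the end-to-end payload so injected tags can be detected."""
    return hashlib.sha256(data).hexdigest()

def is_untampered(received: bytes, trusted_digest: str) -> bool:
    # Any tag injected by a malicious exit node changes the digest
    return content_digest(received) == trusted_digest

page = b"<html>original content</html>"
trusted = content_digest(page)
tampered = page + b"<!-- injected tag -->"
assert is_untampered(page, trusted)
assert not is_untampered(tampered, trusted)
```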
Another solution might be running a proxy as a bridge between Alice
and the Tor network.
Recent research points out that traffic analysis is a powerful tool which
can successfully lead to user de-anonymization. Several complex attacks
based on this technique have been studied, and some of them have proved
to be extremely effective [5][7]. Most of these attacks rely on traffic
correlation and pattern analysis: by observing the flowing encrypted
data, an attacker owning a certain number of malicious nodes may
de-anonymize a particular user through the observed data patterns.
In certain situations, the attacker might also overload certain nodes to
route Alice's traffic toward nodes he owns.
Tor does not guarantee protection against end-to-end timing correlation.
Assuming the attacker is sniffing Alice and Bob at the same time, he
might be able to correlate traffic by observing request/response patterns.
The effectiveness of this kind of attack depends heavily on the amount
of resources owned by the attacker. Indeed, assuming the attacker owns a
high number of nodes in the network (i.e. a Sybil attack), it becomes
easier to successfully identify traffic. To mitigate this problem, data
patterns must be concealed. In the specific case of a net crawler, a
solution might be to simulate human-like traffic, adjusting the packet
flow in a proper way. It is worth mentioning that Tor relays communicate
by exchanging cells, which are fixed-size packets (512 bytes).
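The idea of simulating human-like traffic can be sketched as follows; the delay distribution and its bounds are illustrative assumptions, not parameters taken from this thesis:

```python
import random
import time

def human_delay(min_s: float = 1.0, max_s: float = 8.0) -> float:
    """Pick a randomized pause between requests to blur timing patterns."""
    return random.uniform(min_s, max_s)

def crawl(urls, fetch, sleep=time.sleep):
    """Fetch each URL, pausing a human-like interval in between."""
    results = []
    for url in urls:
        results.append(fetch(url))
        sleep(human_delay())
    return results

# Example with a dummy fetcher and no real sleeping:
pages = crawl(["u1", "u2"], fetch=lambda u: u.upper(), sleep=lambda s: None)
assert pages == ["U1", "U2"]
```

Randomizing inter-request delays only blurs coarse timing patterns; it does not defeat an adversary who can observe both ends of the connection.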
2.1.2.3 Sybil Attacks
"[...]they signed up around 115 fast non-exit relays, all running on 50.7.0.0/16
or 204.45.0.0/16. Together these relays summed to about 6.4% of the
Guard capacity in the network. Then, in part because of our current
guard rotation parameters, these relays became entry guards for a signif-
icant chunk of users over their five months of operation." [4]
2.2 Defending against Node Compromise
While it is important to take into account vulnerabilities of the Tor
architecture, we should also consider the possible compromise of a Tor
relay or, in the worst case, of our Crawler/Proxy. Defending the
controllable perimeter should not be overlooked, since successful attacks
on our nodes might lead to the disclosure of our identity. Operating
systems like Tails [12] or Whonix [11] aim to avoid de-anonymization
even if the machine is compromised.
In particular, Whonix is characterized by an interesting design. As
shown in the figure, it is composed of two modular blocks: a Whonix-
Workstation and a Whonix-Gateway. While the former is the actual OS,
the latter determines the network protocol used. As mentioned in the
official documentation, it basically works by forcing all the traffic
coming from the Workstation through the Gateway, ensuring that no
packets will flow using a different network protocol.
As we will see, the structure of Whonix inspired the design of the
architecture presented in this thesis work.
While Whonix is meant to be installed on static machines, Tails is
designed to be portable:
"Tails is a live system that aims to preserve your privacy and anonymity.
It helps you to use the Internet anonymously and circumvent censorship
almost anywhere you go and on any computer but leaving no trace unless
you ask it to explicitly."[12]
2.3 Hidden Services Vulnerabilities
A Hidden Service is a server which hosts web services within the Tor
network. It is uniquely identified by an alphanumeric hash ending with
.onion.
Now, assume Alice wants to connect to Bob’s hidden service. Both
parties want to stay anonymous and communicate in a secure way. It is
clear that Alice and Bob must talk in an indirect manner. Setting up
such a Tor circuit involves six steps:
• Alice is now able to contact the directory server and get the
information about Bob’s hidden service. Furthermore, Alice randomly
picks a relay which will act as a "bridge" between her and Bob. This
relay is named the rendezvous point. Alice shares a one-time secret
with the rendezvous point.
• Alice sends the one-time secret and the rendezvous address to Bob,
encrypting the message using Bob’s public key. Alice picks one
introduction point in order to indirectly send the message to Bob.
• Bob connects to the rendezvous point providing the one-time se-
cret.
• Alice and Bob are now able to communicate through the ren-
dezvous point.
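The role of the one-time secret in the steps above can be illustrated with a toy sketch. All names here are ours; real rendezvous cookies and circuit handling are part of the Tor protocol and far more involved:

```python
import os

class RendezvousPoint:
    """Toy model: the relay only bridges the two sides whose
    one-time secrets (cookies) match."""
    def __init__(self):
        self.waiting = {}

    def alice_registers(self) -> bytes:
        cookie = os.urandom(20)          # one-time secret shared with Alice
        self.waiting[cookie] = "alice-circuit"
        return cookie

    def bob_connects(self, cookie: bytes) -> bool:
        # Bob learned the cookie via the introduction point
        return self.waiting.pop(cookie, None) is not None

rp = RendezvousPoint()
cookie = rp.alice_registers()
assert rp.bob_connects(cookie)          # matching secret: circuit is joined
assert not rp.bob_connects(cookie)      # a cookie is valid only once
```

Note that the rendezvous point never learns who Alice or Bob are; it only matches opaque secrets, which is what preserves anonymity on both sides.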
"We found that we can identify the users’ involvement with hidden ser-
vices with more than 98% true positive rate and less than 0.1% false
positive rate with the first attack, and 99% true positive rate and 0.07%
false positive rate with the second. [...] we show that we can correctly
determine which of the 50 monitored pages the client is visiting with 88%
true positive rate and false positive rate as low as 2.9%, and correctly
deanonymize 50 monitored hidden service servers with true positive rate
of 88% and false positive rate of 7.8% in an open world setting."
Figure 2.4: Hidden Service
2.4 Tor Performance Evaluation
Due to Tor's complexity, it is natural to wonder how performance is
affected by the large amount of overhead needed to handle the circuits.
A summary of [6] illustrates the main issues concerning Tor performance.
In the mentioned paper, six reasons for low performance have been
identified:
• Tor flow control does not work properly, meaning that low-traffic
streams (such as web browsing) do not co-exist well with high-traffic
ones (bulk transfers).
• Tor relays put too much traffic into the network with respect to
the traffic they actually forward.
Tor performance has an important impact on our crawler architecture,
since we have to know in advance whether a certain crawling task can be
accomplished in a reasonable time.
As a first option, an analytical approach (queueing theory) was
considered to properly model Tor performance, but due to the highly
heterogeneous nature of Tor (being composed of relays with very
different performance characteristics), it would not have led to
interesting results. Running empirical tests directly on the Tor
network was therefore deemed the best option.
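A back-of-the-envelope feasibility check of the kind implied above might look like this; the numbers are placeholders, not measurements from our tests:

```python
def crawl_time_hours(pages: int, avg_page_kb: float, bandwidth_kbps: float,
                     per_request_latency_s: float) -> float:
    """Rough estimate: transfer time plus per-request circuit latency."""
    transfer_s = pages * avg_page_kb / bandwidth_kbps
    latency_s = pages * per_request_latency_s
    return (transfer_s + latency_s) / 3600.0

# E.g. 10,000 pages of ~100 KB over a ~50 KB/s Tor circuit with ~2 s latency:
hours = crawl_time_hours(10_000, 100, 50, 2.0)
assert 10 < hours < 12   # roughly 11 hours
```

Even this crude model shows why empirical bandwidth and latency figures matter: the per-request latency term alone can dominate the total crawl time.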
Figure 2.5: Bandwidth Test - Day 1 - Average and Standard Deviation
It is worth underlining that the packet loss measurement has been
omitted, since it did not provide useful information (always 0%). The
obtained results had a remarkable impact on the design of our software
architecture: as mentioned before, we designed the crawler to work on a
virtual machine that is, basically, a proxy. To further increase
anonymity, one might want to deliver tasks to it through an anonymous
network (i.e. Tor), so it becomes crucial to have a clear picture of how
performance is affected. Indeed, as we will see in the next section, the
design of our crawler takes these results into account.
Chapter 3
want the crawler to be technology-agnostic, in such a way that it can
perform any type of task regardless of the programming language or
environment. Furthermore, we want the possibility to specify the
network protocol to use for every single task. For instance, we would
like to execute two tasks that contain the same crawling logic but
different network specifications (e.g. crawl the target passing through Tor).
3.1 Design
it up properly. Docker implements its own caching logic to avoid
re-downloading container images. This is a positive aspect from a
performance point of view, since we can avoid sending the whole container
to the crawler (which would result in a huge throughput bottleneck).
• Input files
implementation.
The task output is then moved into the sftp folder, ready to be retrieved.
3.2 Task Execution Pipeline
The following steps are executed:
3. Task execution
4. Output production
to avoid running a task with incomplete input.
A script called executor is in charge of accomplishing the following steps:
For further information about the implementation, refer to the appendix
at the end of this document.
Chapter 4
Validation
This chapter aims to show how the proposed architecture has been tested,
illustrating the technical details needed to understand how it works.
The software has been primarily tested on a CentOS virtual machine;
however, thanks to Docker's portability, it runs on any other platform
supporting Docker. It is worth mentioning that it has also been tested
on an Ubuntu virtual machine without any modification.
The main difficulty encountered lies in defining a communication
standard between the logic and the gateway containers. Indeed, it is
important to underline that the logic container is not aware of the
actual implementation of the gateway, since the latter might not always
be the same due to the modularity constraint we set from the beginning.
Recalling that a Docker container behaves, in effect, like a lightweight
virtual machine, it has its own IP address. Thanks to Docker's ability
to link two containers, it is possible to let them communicate as if
they were on the same network. Therefore, we have the following scenario:
It is crucial that the ports exposed by the gateway are always the
same, since the logic has to know where to send its traffic. Before ex-
plaining exactly what ports to expose, we first illustrate how the gateway
is implemented, and what kind of traffic it expects to forward.
4.1 Gateway Container
To validate the architecture, three gateway containers have been designed
and realized:
They coincide with the three main types of proxies that are useful for a
web crawler. Indeed, depending on the requirements, one configuration
might suit better than the others.
The pass-through gateway acts as an elementary proxy, forwarding all the
incoming traffic from the logic container to the crawling target. It has
been implemented by building a Docker container which installs Squid in
order to improve performance:
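A minimal sketch of such a pass-through gateway container might look like the following; the exposed port, the `squid.conf` file, and the Squid options are assumptions for illustration, not the exact listing used in the validation:

```dockerfile
FROM ubuntu:latest
# Squid acts as a caching forward proxy for the logic container
RUN apt-get -y update && apt-get install -y squid
COPY squid.conf /etc/squid/squid.conf
# Standard Squid port; the logic container sends its traffic here
EXPOSE 3128
CMD ["squid", "-N"]
```

Keeping the exposed port fixed across all gateway implementations is what lets the logic container stay agnostic about which gateway it is linked to.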
The Tor gateway pulls an image from the Docker Hub called torprivoxy
(arulrajnet/torprivoxy) which natively exposes three ports:
4.2 Logic Container
As already mentioned, the logic container is in charge of executing the
crawling task. Two tasks are executed for validation purposes:
The second task involves one of the most widely used web crawling
frameworks, Scrapy:
The reason behind this choice is to report a log file resulting from an
actual crawling task, effectively showing how the architecture performs
in a realistic scenario. In particular, a crawling task targeting
stackoverflow.com has been executed, retrieving a list of the most
upvoted questions on the renowned website.
4.3 Validation Objectives
The validation tests have been executed according to the following cri-
teria:
4.4 Validation Results
The following steps are followed in order to test the architecture:
87.16.237.48
109.163.234.5
One can verify that the returned IP address is associated with a German
ISP. Lastly, the output for the gateway implementing a VPN client:
176.126.237.217
HttpAuthMiddleware, DownloadTimeoutMiddleware,
UserAgentMiddleware, RetryMiddleware,
DefaultHeadersMiddleware, MetaRefreshMiddleware,
HttpCompressionMiddleware, RedirectMiddleware,
CookiesMiddleware, HttpProxyMiddleware,
ChunkedTransferMiddleware, DownloaderStats
2016-05-09 18:41:51 [scrapy] INFO: Enabled spider middlewares:
HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware,
UrlLengthMiddleware, DepthMiddleware
2016-05-09 18:41:51 [scrapy] INFO: Enabled item pipelines:
2016-05-09 18:41:51 [scrapy] INFO: Spider opened
2016-05-09 18:41:51 [scrapy] INFO: Crawled 0 pages (at 0 pages/min),
scraped 0 items (at 0 items/min)
2016-05-09 18:41:51 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2016-05-09 18:41:51 [scrapy] DEBUG: Crawled (200)
<GET http://stackoverflow.com/questions?sort=votes> (referer: None)
2016-05-09 18:41:52 [scrapy] DEBUG: Crawled (200)
<GET http://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-an-unsorted-array>
(referer: http://stackoverflow.com/questions?sort=votes)
2016-05-09 18:41:52 [scrapy] DEBUG: Scraped from
<200 http://stackoverflow.com/questions/11227809/why-is-processing-a-sorted-array-faster-than-an-unsorted-array>
{..RESPONSE BODY..}
Clearly, the output reports a 200 HTTP code confirming that the crawling
task was successful, and the output is exactly the same for each
gateway type.
The last test aims to show that the architecture forwards each packet
through the gateway: we expect tcpdump to show DNS requests when using
the pass-through gateway, while no such logs are expected when using a
Tor or a VPN container. Indeed, in the last two cases, the traffic is
never directly routed to the crawling target.
Figure 4.1: DNS Leak Test: Pass-through Gateway
Figure 4.2: DNS Leak Test: Tor Gateway
Figure 4.3: DNS Leak Test: VPN Gateway
As one can notice from the pictures, the results are clear: no DNS
requests are captured by tcpdump when using the Tor and VPN gateways,
while this is not the case for the pass-through gateway.
4.5 Conclusions and Future Work
Being an experimental software architecture, it is worth underlining its
main current limitations:
containers, ready to be plugged in. Indeed, one of the most important
issues for a web crawler concerns the need to receive frequent updates.
A centralized repository allows maintaining and updating the containers
efficiently.
Appendix A
Architecture: Technical
Guide
This chapter illustrates how to interact with the architecture step by
step, showing how the validation test has been executed. These are the
steps needed to correctly deploy the virtual machine and launch a
crawling task:
• Input delivery
• Output retrieval
A.1 Architecture Deployment
As already mentioned, the architecture has been tested on a CentOS
virtual machine, using VirtualBox. In order to make our machine
reachable via sftp/ssh, we need to set up two network adapters, as shown
in the pictures.
Figure A.2: Virtual Machine setup: network adapter 2
By default, sftp and ssh daemons are launched at startup, therefore
we can connect to the virtual machine using:
sftp root@<ip_address>
Or:
ssh root@<ip_address>
A.2 Start Background Service
Using ssh, we are able to start the virtual machine controller as shown
in the picture. The controller now waits for the input to be delivered;
once it finds the file done in the sftp folder, it launches executor.sh,
which is in charge of building the dockerfiles and launching the input task.
A.3 Input Delivery
Now, using sftp or ssh we can deliver the following items:
• Gateway container
• Logic container
• Input files
• "done" file
Figure A.4: Virtual Machine setup: input files delivery
Figure A.5: Virtual Machine setup: delivery of "done" file
Figure A.6: Virtual Machine setup: crawling task running
A.3.1 Output Retrieval
It is now possible to retrieve the output directly from the sftp folder.
This step is shown in figure A.7 using scp.
A.4 Accessing the VM through Tor
Assuming we want to connect to our architecture from an Ubuntu machine
(the operations for a different distribution are very similar), the
first step is to install Privoxy and start it:
sudo /etc/init.d/privoxy start
Or (with systemd):
sudo service privoxy start
By default, Privoxy listens on port 8118, therefore we launch ssh with
the following options:
ssh -L 8118:localhost:8118 <username>@<ip_address>
A.5 Architecture: Organization
The architecture has the following structure:
Arch
├── sftp
│   ├── logic
│   │   ├── Dockerfile
│   │   └── scripts
│   ├── gateway
│   │   ├── Dockerfile
│   │   └── scripts
│   └── input
├── controller
│   ├── controller.sh
│   └── executor.sh
└── shared
– logic: contains the dockerfile and the scripts needed to set the
logic container up.
– gateway: contains the dockerfile and the scripts needed to set
the gateway container up.
– input: contains the task logic to be launched by the crawler.
• controller : folder containing the scripts needed to transparently
launch crawling tasks as soon as they arrive.
• shared : shared folder between the host machine and the containers.
Reserved to the system.
Appendix B
Architecture: Scripts
Controller code (to be launched with root privileges):
for entry in "$sftp_dir"/*
do
    # As soon as the confirmation file arrives,
    # launch executor.sh, then delete input files
    # in sftp folder
    if [ "$entry" = "$done_file" ]; then
        bash executor.sh
        # Delete input files
        rm -rf "$logic_folder"
        rm -rf "$gateway_folder"
        rm -rf "$input_folder"
        rm "$done_file"
    fi
done
done
Executor code:
#!/bin/bash
logic_folder=/home/mike/docker/shared/logic
gateway_folder=/home/mike/docker/shared/gateway
input_folder=/home/mike/docker/shared/input

# Get input
cp -a /home/mike/docker/sftp/logic /home/mike/docker/shared
cp -a /home/mike/docker/sftp/gateway /home/mike/docker/shared
cp -a /home/mike/docker/sftp/input /home/mike/docker/shared

# Run Dockerfiles
docker build -t arch/logic /home/mike/docker/shared/logic/
docker build -t arch/gateway /home/mike/docker/shared/gateway/

# Launch logic container
docker run --privileged --rm --link gateway --name logic \
    -v /home/mike/docker/shared/input:/shared arch/logic \
    bash -c "route del default; route add default gw gateway eth0; sudo sh scripts/script.sh"

# Kill gateway
docker stop gateway
docker rm gateway

# Remove images
docker rmi arch/gateway
docker rmi arch/logic

# Produce output
mv /home/mike/docker/shared/input/* /home/mike/docker/sftp

# Remove input files
rm -rf "$logic_folder"
rm -rf "$gateway_folder"
rm -rf "$input_folder"
Gateway container Dockerfile:
FROM ubuntu:latest
RUN apt-get -y update && apt-get install -y iptables
COPY scripts /scripts
B.1 Network Tests: Scripts
#!/bin/bash
# Run with sudo!
array=( http://speedtest.ftp.otenet.gr/files/test100k.db
        http://speedtest.ftp.otenet.gr/files/test1Mb.db
        http://speedtest.ftp.otenet.gr/files/test10Mb.db )

while true
do
    for i in "${array[@]}"
    do
        echo Bandwidth: $i
        wget -O /dev/null $i 2>&1 | grep -o "[0-9.]\+ [KM]*B/s"
        echo Latency: $i
        time wget -q -O /dev/null $i 2>&1 | grep elapsed
        echo \n\n
    done
    echo Sleeping 6 minutes
    sleep 6m
done

Proxychains version (the same measurements routed through Tor):

while true
do
    for i in "${array[@]}"
    do
        echo Bandwidth: $i
        (proxychains wget -O /dev/null $i) 2>&1 | grep -o "[0-9.]\+ [KM]*B/s"
        echo Latency: $i
        (proxychains time wget -q -O /dev/null $i) 2>&1 | grep elapsed
        echo \n\n
    done
    echo Sleeping 6 minutes
    sleep 6m
done
Appendix C
Figure C.1: Bandwidth Test - Day 1 - 10 a.m.
Figure C.2: Bandwidth Test - Day 1 - 12 a.m.
Figure C.3: Bandwidth Test - Day 1 - 14 p.m.
Figure C.4: Bandwidth Test - Day 1 - 16 p.m.
Figure C.5: Bandwidth Test - Day 1 - Average and Standard Deviation
Figure C.6: Bandwidth Test - Day 2 - 10 a.m.
Figure C.7: Bandwidth Test - Day 2 - 12 a.m.
Figure C.8: Bandwidth Test - Day 2 - 14 p.m.
Figure C.9: Bandwidth Test - Day 2 - 16 p.m.
Figure C.10: Bandwidth Test - Day 2 - Average and Standard Deviation
Figure C.11: Bandwidth Test - Day 3 - 10 a.m.
Figure C.12: Bandwidth Test - Day 3 - 12 a.m.
Figure C.13: Bandwidth Test - Day 3 - 14 p.m.
Figure C.14: Bandwidth Test - Day 3 - 16 p.m.
Figure C.15: Bandwidth Test - Day 3 - Average and Standard Deviation
Figure C.16: Bandwidth Test - Day 4 - 10 a.m.
Figure C.17: Bandwidth Test - Day 4 - 12 a.m.
Figure C.18: Bandwidth Test - Day 4 - 14 p.m.
Figure C.19: Bandwidth Test - Day 4 - 16 p.m.
Figure C.20: Bandwidth Test - Day 4 - Average and Standard Deviation
Bibliography
[1] Wikipedia, Anonymous P2P, https://en.wikipedia.org/wiki/Anonymous_P2P
[2] Kevin Bauer, Joshua Juen, Nikita Borisov, Dirk Grunwald, Douglas
Sicker, and Damon McCoy, On the Optimal Path Length for Tor
[4] Roger Dingledine, Tor security advisory: "relay early" traffic
confirmation attack, https://blog.torproject.org/blog/tor-security-advisory-relay-early-traffic-confirmation-attack
[5] Albert Kwon, Mashael AlSabah, David Lazar, Marc Dacier, Srinivas
Devadas, Circuit Fingerprinting Attacks: Passive Deanonymization of Tor
Hidden Services
[7] Aaron Johnson, Chris Wacek, Rob Jansen, Micah Sherr, Paul Syverson,
Users Get Routed: Traffic Correlation on Tor by Realistic Adversaries