Remote File Path Traversal
Attacks for Fun and Profit
Dr. Dharma Ganesan
Disclaimer
● The opinions expressed here are my own and not the views of my employer
● The source code fragments shown here can be reused, but
○ they come without any warranty, and I accept no responsibility for failures
● Do not apply the exploit discussed here to other systems
○ without obtaining authorization from their owners
Goal
● Demonstrate how attackers can steal information from servers
● Present an anti-pattern that enables file path traversal attacks
● Discuss how to prevent file path traversal attacks (in C)
● Present some metrics that compare the number of lines of code before and after patching
Intended Audience
● Anyone interested in foundations of secure programming (in C)
● The exploits discussed here are well known to the security community
● But I hope they are still informative for newcomers to software security
Context: Client-Server Architectural Style
● Clients send a request for a file to the server
○ Clients can be web browsers, telnet clients, other web clients, etc.
● The server sends the requested file to the clients
○ The server can be any program (e.g., a web server) that responds to requests
● Of course, the server should not disclose non-public files to the clients
Context: Client-Server Architectural Style...
[Diagram: Request (public) File → Server; Server → Response File]
Threat: Attackers could steal files from the private folder of the server.
Caution: This threat applies not only to the web but to any distributed system.
System under attack and Tools
● System under attack: the web server used here comes from Jon Erickson's book
Hacking - The Art of Exploitation
● The web server vulnerability presented here is not discussed in the book
○ These slides complement the book by exploiting a different vulnerability
● Tools
○ Telnet as the web client (part of standard Linux installations)
■ Telnet has its own security problems, but it is fine for this demo
○ You may also use any other proxy or browser plugin (e.g., developer tools)
A simple web application - Proof of Concept
What are the attack surfaces for stealing private files?
- No forms to submit
- No JavaScript
Take some time to think before looking at the rest of the slides
Right click to view page source of index.html
<html>
<img src="image.jpg">
</html>
So image.jpg must be the smiley face
Under-the-hood steps to just smile
1. Request: “GET / HTTP/1.1” from the browser to the server
2. Response: index.html from the server to the browser
3. Request: “GET /image.jpg HTTP/1.1” from the browser to the server
4. Response: contents of image.jpg from the server to the browser
5. Request: the browser sends an implicit request for the URL icon (favicon)
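The request line in steps 1 and 3 is all the server needs to decide which file to send. A minimal sketch of pulling the path out of such a line is shown here; extract_get_path is a hypothetical helper written for illustration and is not code from the demo server.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the demo server): copy the path from a
 * request line such as "GET /image.jpg HTTP/1.1" into out.
 * Returns 0 on success, -1 if the line is not a GET request or the
 * path does not fit in out. */
int extract_get_path(const char *request, char *out, size_t outlen) {
    const char *start, *end;
    size_t len;

    if (request == NULL || out == NULL || outlen == 0)
        return -1;
    if (strncmp(request, "GET ", 4) != 0)
        return -1;                      /* only GET is handled here */

    start = request + 4;                /* path begins after "GET " */
    end = strchr(start, ' ');           /* path ends before " HTTP/x.y" */
    len = end ? (size_t)(end - start) : strlen(start);
    if (len == 0 || len >= outlen)
        return -1;

    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

Note that the extracted path is whatever the client typed, so everything after this point has to treat it as untrusted input.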
Confirm these steps - take a peek at server’s log
● Just to be sure, let’s look at a fragment of our web server’s log:
Got request from 127.0.0.1:50628 "GET / HTTP/1.1"
    Opening './webroot/index.html'   200 OK
Got request from 127.0.0.1:50630 "GET /image.jpg HTTP/1.1"
    Opening './webroot/image.jpg'   200 OK
● The log shows that the browser sent two requests, as expected
● The files are delivered relative to the (public) webroot directory
Attack surface to reach the server’s private files
● What if we target the path of one of the GET requests?
● For example, take the second GET request and tamper with the file name
● Core idea: instead of image.jpg, what if we request some other file?
Let’s steal arbitrary file contents (telnet client)
● Instead of a web browser, let’s use a telnet client
○ Telnet has its own security problems but for our purpose it is fine.
● # telnet localhost 80 (i.e., connect to our web server listening at port 80)
...
GET / HTTP/1.1 (I typed this command)
HTTP/1.0 200 OK
Server: Tiny webserver
<html>
<img src="image.jpg">
</html>
Let’s steal server’s /etc/passwd using a telnet client
# telnet localhost 80
...
Connected to localhost.
Escape character is '^]'.
GET /../../../etc/passwd HTTP/1.1
HTTP/1.0 200 OK
Server: Tiny webserver
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
...
(Contents of /etc/passwd exposed)
The web server log fragment and analysis
# ./tinyweb
Accepting web requests on port 80
Got request from 127.0.0.1:50750 "GET /../../../etc/passwd HTTP/1.1"
    Opening './webroot/../../../etc/passwd'   200 OK
● The attacker was able to get out of the public root directory!
● The tiny web server is clearly vulnerable to remote file content disclosure
● It appears that the server uses strcat to append the incoming file name to the web root
○ We will confirm this by looking into the source code of the web server
Analysis of the web server code fragment
if(strncmp(request, "GET ", 4) == 0) // GET request
    ptr = request+4; // ptr is the URL.
...
    strcpy(resource, WEBROOT); // Begin resource with web root path
    strcat(resource, ptr); // ptr points to the input url string in GET
...
}
● The anti-pattern is that the server concatenates the public web root with the input
filename (which can be evil)
● The resource variable contains the filename as a string, but it is not
evaluated/canonicalized before the file is opened
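The effect of the anti-pattern can be reproduced in isolation. The sketch below assumes WEBROOT is "./webroot" (the value seen in the log output); join_naive is a hypothetical name, and snprintf is used instead of strcpy/strcat only so that the example cannot overflow its buffer.

```c
#include <stddef.h>
#include <stdio.h>

#define WEBROOT "./webroot"   /* assumed value, matching the log output */

/* Reproduces the anti-pattern: the request path is appended to the web
 * root as-is, with no canonicalization or validation.
 * Returns 0 on success, -1 if the result would not fit in resource. */
int join_naive(const char *url_path, char *resource, size_t len) {
    int n = snprintf(resource, len, "%s%s", WEBROOT, url_path);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For the request path /../../../etc/passwd this produces ./webroot/../../../etc/passwd. The first ".." cancels "webroot" and the remaining ones climb above the server's working directory, which is why the open call lands outside the public folder.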
How to fix this vulnerability? (high-level idea)
● This anti-pattern is not difficult to find; the important part is fixing it
○ For large systems, grep for strcpy followed by strcat with variable names (e.g. filename)
○ grep -A 1 "strcpy" -r * | grep "strcat" …
● Canonicalize after combining the public webroot with the input filename
● Evaluate whether the canonicalized file is within the webroot
● If yes, we are safe and can disclose the content
● Otherwise, raise a generic error that the file is not found
Canonicalize and then validate filenames - core idea
● canonicalize_file_name is a library function (a GNU extension in glibc)
● For example, if the input string is ./webroot/../../../etc/passwd
● the output of canonicalize_file_name is /etc/passwd (in my case)
○ Do not forget to free the memory returned by canonicalize_file_name
● Check whether the canonicalized file name starts with the canonical path of the public dir
○ See starts_with_substr in the appendix
○ C has no dedicated starts-with function, but strncmp(str, prefix, strlen(prefix)) == 0 performs the same check
❖ If you find bugs in my patch (see the appendix), please contact me
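A prefix check can be written with strncmp from the standard library. The sketch below also shows one pitfall of a plain prefix test on directories: a sibling whose name merely begins with the web root's name passes it. The function names and the /srv/webroot paths are hypothetical, and both arguments are assumed to be canonical already, with no trailing '/'.

```c
#include <stddef.h>
#include <string.h>

/* Plain prefix test built on strncmp. */
int has_prefix(const char *str, const char *prefix) {
    return strncmp(str, prefix, strlen(prefix)) == 0;
}

/* Directory-aware variant: the match must end at a path boundary, so
 * "/srv/webroot" does not accept "/srv/webroot_private/key". */
int is_under_dir(const char *path, const char *dir) {
    size_t n = strlen(dir);
    if (strncmp(path, dir, n) != 0)
        return 0;
    /* path has at least n characters here, so path[n] is readable */
    return path[n] == '/' || path[n] == '\0';
}
```

With these definitions, has_prefix("/srv/webroot_private/key", "/srv/webroot") is true while is_under_dir rejects the same pair, which is the behavior a web root check needs.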
The patched web server stopped the exploit
# telnet localhost 80
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET /../../../etc/passwd HTTP/1.1
HTTP/1.0 404 NOT FOUND
Server: Tiny webserver
<html><head><title>404 NOT Found</title></head><body><h1>URL not found</h1></body></html>
Connection closed by foreign host
The patched web server log fragment and analysis
# ./tinyweb_secure
Accepting web requests on port 80
Got request from 127.0.0.1:50816 "GET /../../../etc/passwd HTTP/1.1"
input file name = /etc/passwd
    Unsafe file './webroot/../../../etc/passwd'   404 Not Found
● The server knows that the attacker is reaching into non-public directories
● The server successfully stopped the attacker
Number of lines before and after patching
# wc -l tinyweb_secure.c tinyweb.c
224 tinyweb_secure.c (after patching)
122 tinyweb.c (before patching)
● To my surprise, the patched version (tinyweb_secure.c) has nearly twice as much code
as the original version (tinyweb.c): 224 vs. 122 lines, about 1.8 times
○ Comments are inlined using “//”, so they do not contribute much to the metrics
● This suggests to me that secure coding (in C) takes roughly twice the
coding effort of “traditional” coding
○ If I add my code review and testing effort, it is at least three times more expensive!
● More study is needed on other systems to confirm my claims on effort
○ Maybe the C language and its small standard library contribute to more application code
○ Or maybe I did not patch it in a compact way, but I doubt it
Space of inputs for traditional vs secure programs
[Figure labels: Evil input values, Valid input values]
● Traditional programs handle only valid input values well
● Secure coding requires programs to handle evil input values, too
● The problem is that the threats (evil inputs) have to be identified up front, and
○ the software has to be designed to resist and recover
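One way to identify a class of evil inputs up front is a lexical screen that runs before any file system call. The sketch below is a complementary check and is not part of the patch shown in these slides; has_dotdot_component is a hypothetical name. It does not replace canonicalization, because it knows nothing about symbolic links.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if any '/'-separated component of path is exactly "..",
 * and 0 otherwise. Names that merely contain dots are not flagged. */
int has_dotdot_component(const char *path) {
    const char *p = path;
    while (*p) {
        size_t seg = strcspn(p, "/");     /* length of the current component */
        if (seg == 2 && p[0] == '.' && p[1] == '.')
            return 1;
        p += seg;
        if (*p == '/')
            p++;                          /* skip the separator */
    }
    return 0;
}
```

A server could reject such requests early with the same generic 404 and still run the canonicalize-and-compare check on everything that passes.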
Conclusion and broader applicability
● Using string concatenation to construct file names can be dangerous
○ This anti-pattern should be avoided
● The server should canonicalize file names and check the resulting file names
○ Otherwise attackers will get into private directories and steal files
● File path exploitation is not specific to web applications
● Any client-server architecture must close this attack surface
● Using TLS between clients and server will not stop the attack
○ In this case, TLS will just help attackers to securely download private files :)
● Firewalls do not usually stop file path exploitation payloads
References
1. HTTP: https://developer.mozilla.org/en-US/docs/Web/HTTP
2. OWASP: https://www.owasp.org/index.php/Path_Traversal
3. Jon Erickson, Hacking - The Art of Exploitation
a. The web server of this book is exploited for buffer overflows in the book
b. My slides show a different vulnerability not discussed in the book AFAIK
4. Robert Seacord, Secure coding in C and C++.
a. There is a nice chapter dedicated to file system exploits but my slides show a detailed demo
b. Usage of strcat to construct and test filenames is also well-explained
Questions/Comments
dharmalingam.ganesan11@gmail.com
Appendix - Implementation to fix the vulnerability
How to fix this vulnerability?
strcpy(resource, WEBROOT); // Begin resource with web root path
strcat(resource, ptr); // and join it with resource path pointed to by ptr.
if(is_safe_file(resource) == 1) { // Inside the web root? (-1 means error, so test for 1)
    fd = open(resource, O_RDONLY, 0); // Try to open the file.
    printf("\tOpening \'%s\'\t", resource);
}
else {
    printf("\tUnsafe file \'%s\'\t", resource); // Hacker is attacking us.
    fd = -1;
}
My implementation of is_safe_file
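/* Returns 1 if the canonicalized filename is inside the web root.
 * Returns -1 if filename is NULL or a path cannot be resolved.
 * Returns 0 otherwise.
 */
int is_safe_file(char *filename) {
    char *realFname;
    char *fullwebRootPath;
    size_t rootLen;
    int status = -1;
    if(NULL == filename)
        return status;
    fullwebRootPath = getWebrootFullPath();
    if(fullwebRootPath != NULL) {
        realFname = canonicalize_file_name(filename); // NULL if the file does not exist
        if(realFname != NULL) {
            status = starts_with_substr(realFname, fullwebRootPath);
            // The prefix must end at a directory boundary; otherwise a
            // sibling directory such as <webroot>_private would also match.
            rootLen = strlen(fullwebRootPath);
            if(status == 1 && realFname[rootLen] != '/' && realFname[rootLen] != '\0')
                status = 0;
            printf("input file name = %s\n", realFname);
        }
        free(realFname);
        free(fullwebRootPath);
    }
    return status;
}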
My implementation of starts_with_substr
/* Returns 1 if the given str starts with the given prefix.
 * Returns -1 if the arguments are invalid. Otherwise, returns 0.
 */
int starts_with_substr(char *str, char *prefix) {
    if(NULL == str || NULL == prefix)
        return -1;
    while(*prefix) {
        if(*str++ != *prefix++) return 0;
    }
    return 1;
}
My implementation of the get web root path
/* Returns the full web root file path. The caller must free the returned memory.
 * Returns NULL on failure.
 */
char* getWebrootFullPath(void) {
    long size;
    char *cwd; /* current working dir */
    char *webrootPath = NULL;
    char *webrootCanoPath = NULL;
    size = pathconf(".", _PC_PATH_MAX);
    if (size <= 0) size = 4096; /* no limit reported; fall back to an assumed size */
    if ((cwd = (char *)malloc((size_t)size)) == NULL) return NULL;
    if(getcwd(cwd, (size_t)size)) {
        size = strlen(cwd) + strlen(WEBROOT) + 2; // +2 is for '/' and '\0'
        if ((webrootPath = (char*)malloc((size_t)size)) == NULL) {
            free(cwd);
            return NULL;
        }
        strcpy(webrootPath, cwd);
        strcat(webrootPath, "/");
        strcat(webrootPath, WEBROOT);
        webrootCanoPath = canonicalize_file_name(webrootPath); // only when the path was built
    }
    free(cwd);
    free(webrootPath);
    return webrootCanoPath;
}
