Digital Forensic Investigation

MSc #11761
2018

Dr. Ali Hadi


Documents and File Metadata

"You see but you do not observe"


– Sherlock Holmes
What’s Covered?

• Overview..
• File Identification..
• Understanding Metadata..
• Types of Metadata..
• Useful Data!..
• Additional Embedded Metadata..
• Temporary Files..
• Ghiro – Image Forensics..

Overview

The greatest challenges for a digital investigator come when:

• The evidence is in plain sight but can’t be found
• Or isn’t recognized for what it is
A file isn’t what it says it is.
• e.g., a JPEG file might be renamed with an AVI extension

More complex techniques employed by the bad guys include:


• Embedding files within files (alternate data streams)
• Or burying small files in the Windows Registry
File Identification..
o File identification is the process of determining what kind of file it is.
o Windows file systems use the file extension as a file identifier
o e.g., IMAGE.JPG is presumed to be an image file using JPEG compression.
o File systems do not enforce extension rules.
o The default action (double-click) is then chosen based on the extension.


File Identification.. Cont.

1
• Changing the extension does nothing to alter the structure of the file.

2
• Changing the extension does not affect the ability of the correct host
application to open the file.

3
• There is something in the file structure that determines whether or not an
application can open a file.
File Structure..

1
• Any file stored by any file system must have a similar
structure
• e.g., an MS Word document can be opened with Word for
Macintosh, Word for Windows, or OpenOffice.

2
• Extensions act as a superficial identifier of file type.

3
• Files use several internal identifiers to introduce
themselves (file metadata and file structure).
Metadata..
o Files can contain two types of metadata that applications use
to recognize and open the file:
o Internal metadata is contained within the file and can
consist of a binary string or a text string
o Metadata containers (the MFT attributes, the file header,
magic number).
File Header..

1
• First string of data read by applications when first asked to load the file.

2
• If the header does not correspond to the file type identified, the
application may have difficulty loading the file.

3
• The header provides the starting point for carving files.

4
• A humanly readable file has humanly readable headers and EOF markers.

5
• A binary file has binary metadata.

Source: Digital Archaeology: The Art and Science of Digital Forensics
Magic Numbers..

o Magic numbers are another method of structuring a header.


o Many programs use the magic number as the first step in
identifying a file type.
o Digital forensic examiners can use a disk editor to view a file
and examine the magic number to identify the file type.
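The magic-number check described above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a forensic tool: the signature table is a small, non-exhaustive sample, and the function names are hypothetical.

```python
# A few well-known signatures found at offset 0 (illustrative, not exhaustive).
MAGIC_NUMBERS = {
    b"\xff\xd8\xff": "JPEG image",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"%PDF": "PDF document",
    b"PK\x03\x04": "ZIP container (also DOCX/XLSX/PPTX)",
    b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1": "OLE2 compound file (legacy DOC/XLS/PPT)",
}


def identify_by_magic(header: bytes) -> str:
    """Return a description for the first signature the header starts with."""
    for magic, description in MAGIC_NUMBERS.items():
        if header.startswith(magic):
            return description
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first bytes of a file and classify it, ignoring its extension."""
    with open(path, "rb") as handle:
        return identify_by_magic(handle.read(16))
```

A JPEG renamed to movie.avi would still be reported as "JPEG image", because only the leading bytes are consulted and the extension is never read.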
What’s Metadata?..
1
• Defined as data that describes data and can exist in
multiple forms.
2
• OSes maintain information about files in various
repositories.
3
• The NTFS file system makes use of a series of metadata files.
4
• Individual files can contain information stored within the
file that defines the file.
5
• Many document and image management software
solutions maintain large amounts of information
Types of Metadata..
System metadata
• Generated by the file system or document management system
Substantive metadata
• Defines modifications to a document
Embedded metadata
• Embedded by the application that creates or edits the file
External metadata
• Another form of metadata that exists that is important
to the investigator. Ideas?
System Metadata..

o All file systems maintain vast amounts of information about
the files and directories stored on the volumes they control.
o Knowing how and where the system metadata is stored is essential for
the digital investigator
System Metadata.. Values.
o The ability to prove the existence of a deleted document and to research
the timeline of a document.
o OS metadata does not help identify contents of files.
o A critical piece of information found here is the modify/access/create
(MAC) data
o Disks formatted with NTFS offer the additional attribute of entry modified
(EM).
o Notes the last time the MFT entry was modified.
o MAC information is valuable for creating a timeline of events.
o The tools used by a forensic investigator are tested and verified to not alter
MAC data.
o All files stored on any file system are stamped with MAC times.
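As a rough illustration of where MAC data comes from, the sketch below reads the timestamps the operating system exposes for a file using Python's standard `os.stat`. The function name and dictionary keys are hypothetical. Note the platform caveat in the comments: `st_ctime` does not mean the same thing on every system, and this is not a substitute for a validated forensic tool working on an image.

```python
import os
from datetime import datetime, timezone


def mac_times(path: str) -> dict:
    """Return the timestamps the OS reports for a file, as UTC datetimes."""
    info = os.stat(path)

    def as_utc(timestamp: float) -> datetime:
        return datetime.fromtimestamp(timestamp, tz=timezone.utc)

    times = {
        "modified": as_utc(info.st_mtime),
        "accessed": as_utc(info.st_atime),
        # On Unix this is the last metadata change; on Windows it has
        # historically held the creation time.
        "changed_or_created": as_utc(info.st_ctime),
    }
    # Some platforms (macOS, BSD, newer Python on Windows) expose a birth time.
    birth = getattr(info, "st_birthtime", None)
    if birth is not None:
        times["born"] = as_utc(birth)
    return times
```

In practice an examiner would run this kind of query against a read-only mount of a forensic image rather than the live evidence drive.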
System Metadata.. Create.

The Create Attribute
• Is generated the first time that the file is saved to the file
system
• It is not necessarily the date that the file was originally
saved
• The create attribute serves only as supplemental evidence to
support other findings
System Metadata.. Create.. Cont.
Two things affect the create date:
• A user copies a file from one location to another
‒ The two files are identical, but each will have a different create date
‒ The source file shows the date it was initially saved
‒ The new copy shows the time and date that it was first saved
to the target drive.
• A file system utility allows a user to intentionally
modify the attribute
System Metadata.. Access.
Access Attribute
• Is the most volatile attribute of a file.
• Any time a user views, opens, copies, or backs up a file,
this attribute is modified by the file system
Access time is also updated by:
• Each time an executable is run
• The activity of antivirus scanning software
• Right-clicking on a file in Explorer and
selecting Properties
Note:
• Using the proper utilities, it is possible to identify
the previous ten times that a document was accessed
System Metadata.. Modify.

Modify time stamp
• Arguably the most valuable of the time/date
attributes contained within a file.
• Tells when the contents of the file were
last altered.
• Actions that change the access and create
attributes do not impact modify times.
• The act of moving or copying a file has no
impact on the file's modify time.
• It will, however, impact the attributes of the folder
containing the files.
System Metadata.. Modify.. Cont.
o SOURCE FILE ENTITY — C:\Documents\NOVEL.DOC
Create time remains the same, access time is reset, modify
time remains the same.
o SOURCE FILE CONTAINER — C:\Documents
Create time remains the same, access time is reset, modify
time remains the same.
o DESTINATION FILE ENTITY — C:\User\Documents\NOVEL.DOC
Create time is reset, access time is reset, modify time
remains the same.
o DESTINATION FILE CONTAINER — C:\User\Documents
Create time remains the same, access time is reset, modify
time is reset.
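The copy behaviour listed above can be observed with a small experiment. The sketch below is an assumption-laden stand-in: it uses Python's `shutil.copy2`, which attempts to preserve the modify time the way the slide describes for a file copy, while the new file's other timestamps are assigned by the operating system at copy time. The function name and returned keys are hypothetical.

```python
import os
import shutil


def copy_and_compare(source: str, destination: str) -> dict:
    """Copy a file and report the timestamps relevant to the MAC discussion."""
    source_info = os.stat(source)
    # copy2 copies the contents and then tries to carry over the modify time.
    shutil.copy2(source, destination)
    copy_info = os.stat(destination)
    return {
        "source_mtime": source_info.st_mtime,
        "copy_mtime": copy_info.st_mtime,
        # Assigned by the OS when the copy was written (creation time on
        # Windows, metadata-change time on Unix).
        "copy_ctime": copy_info.st_ctime,
    }
```

Comparing `source_mtime` with `copy_mtime` shows the modify time surviving the copy, while `copy_ctime` reflects the moment the new copy landed on the target volume.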
System Metadata.. Entry Modified.
Entry modified attribute (NTFS)
• Is modified each time any of the
other three attributes is changed for
any reason.
• It says that something in the
metadata that comprises the MFT
entry for the file has changed.
• No indication of which attribute
changed.

o MAC time stamps can all be easily viewed in Windows
Explorer or in one of the Linux file browsers.
o The entry modified attribute is not so easily viewed.
System Metadata.. Using MAC.
o One of the first things an investigator does when approaching a new
inquiry is to ask “Who did what, and when did they do it?”.
o Once a specific time has been identified
o It might be possible to identify the users who had access to the data
o Or to begin the search for who might have gained access from beyond
the network
o A typical investigation will involve many different file types
and events.
o Commercial forensic software features a timeline functionality
that allows multiple queries to be processed simultaneously.
The Sleuth Kit Tool...
o The Sleuth Kit can generate timelines on virtually any file
system.
o A file system utility collects the temporal data and writes it into a
single file, called the body file.
o A pipe-delimited ASCII text file that contains one line for each
file, listing MAC data.
o The mactime program can be used to build a timeline based
on the body file parameters.
o The Sleuth Kit includes the ability to build the timelines as do
all commercial forensic suites.
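To make the body file idea concrete, here is a minimal sketch that parses body-file lines and sorts the timestamps into a timeline. It assumes the Sleuth Kit 3.x field order (MD5|name|inode|mode|UID|GID|size|atime|mtime|ctime|crtime) and file names that do not themselves contain a pipe character. It is a simplified illustration of what mactime does, not its actual implementation, and the function names are hypothetical.

```python
# Field order assumed from the Sleuth Kit 3.x body file format.
FIELDS = ["md5", "name", "inode", "mode", "uid", "gid",
          "size", "atime", "mtime", "ctime", "crtime"]
TIME_FIELDS = ("atime", "mtime", "ctime", "crtime")


def parse_body_line(line: str) -> dict:
    """Split one pipe-delimited body file line into a record."""
    parts = line.rstrip("\n").split("|")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = dict(zip(FIELDS, parts))
    for key in TIME_FIELDS:
        record[key] = int(record[key])  # Unix epoch seconds
    return record


def build_timeline(lines) -> list:
    """Return (timestamp, event type, file name) tuples in time order."""
    events = []
    for line in lines:
        if not line.strip():
            continue
        record = parse_body_line(line)
        for key in TIME_FIELDS:
            if record[key] > 0:  # zero means the time is not recorded
                events.append((record[key], key, record["name"]))
    return sorted(events)
```

Each file contributes up to four events, so a single sorted list interleaves activity across every file in the body file.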
Timeline Sources..
o Timelines are produced from more than one source (file
access and creation times).
o Data could be extracted from:
• MAC data
• System logs
• Event logs
• E-mails
• Internet history
• File metadata
Embedded Sensitive Metadata..
o Majority of applications that create or edit user files generate
important metadata that is contained within the file
structure.
o Some of it is viewable by the user, while some of it can only be examined
with special utilities.
o Some of the embedded metadata is generated by the
application, while some of it can be added by the user.
o Metadata cannot always be taken at face value, but requires
intelligent analysis by the investigator.
MS PowerPoint Metadata..
Useful Data!..
Sample of useful data that can be extracted from a document's metadata:
• User name
• User initials
• Organization name (if configured on system)
• Computer name
• Document storage location
• Names of previous authors
• Revision log (Word, Excel)
• Version log (Word)
• Template file name (Word, PowerPoint)
• Hidden text (Word, Excel)
• Globally Unique Identifiers (GUIDs)
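Some of the fields listed above can be read directly from modern Office files. The sketch below assumes an Office Open XML file (DOCX, XLSX, PPTX), which is a ZIP container holding a `docProps/core.xml` part; legacy OLE2 files (DOC, XLS, PPT) store their properties differently and are not handled. The function name and the selection of fields are hypothetical choices for illustration.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespaces used by the core properties part.
NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Output label -> qualified element name inside docProps/core.xml.
CORE_FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def read_core_properties(source) -> dict:
    """Extract author and revision metadata from an OOXML file or file object."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    properties = {}
    for label, tag in CORE_FIELDS.items():
        element = root.find(tag, NAMESPACES)
        if element is not None and element.text:
            properties[label] = element.text
    return properties
```

As the slides caution, these values are set by the application or the user and should not be taken at face value without corroboration.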
Temporary Files..
o Modern operating systems rely on temporary files.
o Once a temporary file is no longer needed, the OS deletes it,
but not all temporary files are successfully deleted.
o By knowing the extension and default locations of temporary
files created by specific applications, the investigator can go
looking for specific types of files.
Temporary Files.. Cont.

Circumstances that result in temporary files:

1
• Files created by desktop applications to facilitate
editing (undo files, scratch files, and so forth)

2
• Backwardly compatible applications that require swap
files in order to run on the current system

3
• Spooler files created when a print job is sent to the
printer
Temporary Files.. Cont.
o Internet Explorer and all other Internet browsers keep a
cache of files from recently visited Web sites.
o The page will load more quickly if the user returns to that site
o Many applications create “auto save” files
o If the system crashes while the user is working on a document,
unsaved work will not necessarily be lost.

o Note: the default locations listed can be changed by the user


Temporary Files
File Extension    Description
crdownload        Google Chrome incomplete download file
part              Partial download file
tmp               Temporary file
cvr               Microsoft Office/Outlook crash report file
cache             Generally cache
lck               Lock file
fsf               Microsoft Office cache file
sqlite-journal    Mozilla Firefox file
dat               Microsoft Internet Explorer cache
as$               Microsoft Word temporary file
fuse-hidden       Samba file
bak               Microsoft Word
Temporary Files.. Cont.
o Investigators should be aware of:
o The applications running on the target system
o The types of files created by each application
o Whether the temporary files are deleted by the application or
the OS

o These files are treated as any other deleted file and, if not
overwritten by later files, can still be recovered by most
forensic software.
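Searching a mounted image for leftover temporary files by extension can be sketched as follows. The extension set is taken from the table in these slides, and the function name is hypothetical. The sketch only sees files that are still allocated; recovering deleted temporary files requires forensic software or carving.

```python
import os

# Extensions from the temporary-file table in these slides.
TEMP_EXTENSIONS = {
    "crdownload", "part", "tmp", "cvr", "cache", "lck", "fsf",
    "sqlite-journal", "dat", "as$", "fuse-hidden", "bak",
}


def find_temporary_files(root: str) -> list:
    """Walk a directory tree and list files whose extension marks them as temporary."""
    matches = []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            # Take everything after the last dot, case-insensitively.
            extension = name.rsplit(".", 1)[-1].lower() if "." in name else ""
            if extension in TEMP_EXTENSIONS:
                matches.append(os.path.join(folder, name))
    return sorted(matches)
```

Because extensions are not enforced by the file system, hits from this kind of search are leads to be confirmed by examining the file contents.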
Tools..
• Metadata Analyzer
• http://regex.info/exif.cgi
• Ghiro, http://www.imageforensic.org/
• https://metashieldanalyzer.elevenpaths.com/
• FOCA, https://www.elevenpaths.com/labstools/foca/
• Python Tools: check…..
• http://www.file-extensions.org/filetype/extension/name/temporary-files
Ghiro – Image Forensics..
Ghiro: Analysis Results
References..
• Incident Response & Computer Forensics 3E, McGraw Hill, 2014
• Handbook of Digital Forensics and Investigations, Eoghan Casey
• SIFT, http://computer-forensics.sans.org/community/downloads
• Digital Forensics Research Workshop, http://www.dfrws.org/
• Forensics Wiki, http://www.forensicswiki.org/wiki/Main_Page
Carving …

“Data can survive what’s carved in rocks!"


– Me
What’s Covered?

• Overview

• What Is File Carving?

• Why Do We Need File Carving?

• Types of File Carving

• How

• Techniques and Tools

Overview
• Files continue to exist on most media until overwritten!
• Files cannot be opened without reforming them into their
original structure
• Traditional data recovery methods rely on file system
structures like file tables to recover data that has been
deleted
• But, what if the file system structures are corrupted?
What Is File Carving?

File Carving
• Forensic technique that recovers files based merely on file
structure and content from raw data and without any
matching file system metadata
‒ E.g., recover deleted file from unallocated disk space
Why Do We Need File Carving?
• Data still exists but cannot be correctly interpreted due to
absent or damaged metadata

Examples:
• File system corruption
• Formatted device
• Unknown file formats
• Files deleted (whether intentionally or not)
Caution #1
• File carving is not just about storage media and file systems!
‒ E.g., carving from network traffic, memory dumps, etc.
Caution #2
• Carving is not just about files either!

• Examples
‒ Strings from memory
‒ Code from malware
‒ Single packets from the network
Types of File Carving

• File header based carving
• Header-footer based carving
• File structure based carving
• Metadata based carving


Existing Techniques… Based on Thumbnail

A Method for Recovering JPEG Files Based on Thumbnail
• Recovers a fragmented JPEG file by mapping between the structure of the
original image and its thumbnail

Carving Thumbnail/s and Embedded JPEG Files Using Image Pattern Matching
• Employs a fixed and unique hex pattern (UHP) in the JFIF and Exif
formats to detect the thumbnails and embedded JPEG files, as
pre-processing data to simplify the restoring of JPEG files.
• PattrecCarve can distinguish and recognize the structure of the
thumbnails, embedded JPEG, or original JPEG file
Existing Techniques… Absence of File System

Identification of Fragmented JPEG Files in the Absence of File Systems
• This technique presents a way to carve four types of JPEG file format
based on bit pattern matching and statistical byte frequency.
Carving Concepts: Basic
• Beginning of file is not overwritten
• File is not fragmented
• File is not compressed
Carving Concepts: Advanced
• Dealing with fragmented files
• Fragments not sequential
• Fragments out of order
• Fragments missing
Carving Techniques

• Header-footer or header-“maximum file size” carving


• File structure based carving
• Content based carving
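Header-footer carving with a "maximum file size" fallback can be sketched for JPEG as below. This is a minimal illustration under the basic assumptions from the previous slides (file not fragmented, beginning not overwritten). The function name and default size limit are hypothetical, and real carvers such as Foremost or Scalpel do considerably more validation.

```python
JPEG_HEADER = b"\xff\xd8\xff"   # start-of-image marker plus first marker byte
JPEG_FOOTER = b"\xff\xd9"       # end-of-image marker


def carve_jpegs(data: bytes, max_size: int = 10 * 1024 * 1024) -> list:
    """Return (offset, bytes) pairs for each JPEG candidate found in raw data."""
    carved = []
    start = data.find(JPEG_HEADER)
    while start != -1:
        limit = min(len(data), start + max_size)
        end = data.find(JPEG_FOOTER, start + len(JPEG_HEADER), limit)
        if end != -1:
            # Header-footer carving: cut from the header through the footer.
            stop = end + len(JPEG_FOOTER)
            carved.append((start, data[start:stop]))
            resume = stop
        else:
            # Header-"maximum file size" carving: no footer within the limit.
            carved.append((start, data[start:limit]))
            resume = start + len(JPEG_HEADER)
        start = data.find(JPEG_HEADER, resume)
    return carved
```

One known weakness of this naive approach: a JPEG with an embedded thumbnail contains an inner end-of-image marker, so cutting at the first footer can truncate the outer image.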
FAT File System - Before deletion -

Root Directory
Name            Size          Cluster #
Image1.png      501 bytes     4
Exam.doc        4008 bytes    7
Syllabus.doc    1655 bytes    5
Notes.txt       1533 bytes    6

File Allocation Table
Cluster #    Content
3            FREE
4            EOF
5            90
6            91
7            8
8            93
…            …
90           EOF
91           EOF
92           FREE
93           94
94           EOF
95           FREE
96           FREE

Disk clusters: 3 4 5 6 7 8 … 90 91 92 93 94 95 96
Exam.doc occupies clusters 7, 8, 93, and 94.
FAT File System - After deletion -

Root Directory
Name            Size          Cluster #
Image1.png      501 bytes     4
_xam.doc        4008 bytes    7
Syllabus.doc    1655 bytes    5
Notes.txt       1533 bytes    6

File Allocation Table
Cluster #    Content
3            FREE
4            EOF
5            90
6            91
7            FREE
8            FREE
…            …
90           EOF
91           EOF
92           FREE
93           FREE
94           FREE
95           FREE
96           FREE

Disk clusters: 3 4 5 6 7 8 … 90 91 92 93 94 95 96
The Exam.doc data is still in clusters 7, 8, 93, and 94, which are now marked FREE.
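The before and after tables can be modelled with a small simulation that follows a cluster chain through the FAT. The table values are copied from the slides; the dictionary representation, the `FREE`/`EOF` string markers, and the function name are simplifications chosen for illustration and do not reflect the on-disk FAT encoding.

```python
FREE = "FREE"
EOF = "EOF"

# Cluster number -> FAT entry, as shown on the "before deletion" slide.
FAT_BEFORE = {
    3: FREE, 4: EOF, 5: 90, 6: 91, 7: 8, 8: 93,
    90: EOF, 91: EOF, 92: FREE, 93: 94, 94: EOF, 95: FREE, 96: FREE,
}

# Deleting Exam.doc marks every cluster in its chain as FREE.
FAT_AFTER = {**FAT_BEFORE, 7: FREE, 8: FREE, 93: FREE, 94: FREE}


def follow_chain(fat: dict, start: int) -> list:
    """Return the clusters of a file by following FAT entries from its first cluster."""
    chain = []
    cluster = start
    # Stop on unknown or free clusters, and guard against loops.
    while cluster in fat and fat[cluster] != FREE and cluster not in chain:
        chain.append(cluster)
        if fat[cluster] == EOF:
            break
        cluster = fat[cluster]
    return chain
```

Before deletion, `follow_chain(FAT_BEFORE, 7)` yields the full chain for Exam.doc. After deletion the directory entry still records start cluster 7 and the size, but `follow_chain(FAT_AFTER, 7)` is empty because the links are gone, which is why a fragmented file like this one is hard to recover from file system structures alone.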
Tools
o Foremost
o Forensic Toolkit (FTK)
o PhotoRec
o Recover My Files
o Scalpel
o EnCase
o Bulk extractor
o Revit
Foremost
• Written for the US Government Center for Information Systems Security
Studies and Research
• Linux CLI data carving utility
• Can extract files in their entirety from unallocated space, which means
that the file must be contiguously allocated
• Header/footer based, but does examine block content to verify file type
• Can work on a live system or a forensic image file
Scalpel

1
• Complete rewrite of Foremost

2
• Used in file carving in order to carve files in a frugal manner:
‒ Reducing the amount of time to search for header and footer.
‒ Reducing the number of bytes written in the carved files.
‒ Reducing the copying from memory to memory

3
• First, load a configuration file stating specs for files to be carved
• One full pass records locations for file headers and footers in a DB
• For each chunk, a work queue ensures the same data is not read twice
• Files are written as read, resulting in a small memory footprint

4
• This results in at most two full passes over the disk
Bulk extractor

• A high-performance carving and feature extraction tool
that uses bulk data analysis to allow the triage and rapid
exploitation of digital media.

• bulk_extractor is a program that extracts email
addresses, credit card numbers, URLs, and other types of
information from any kind of digital evidence.

• A stream-based forensic tool, which means it scans the entire
media from beginning to end without seeking the disk
head, and is fully parallelized, allowing it to work at the
maximum I/O capabilities of the underlying hardware.
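The idea of feature extraction from raw data, independent of any file system, can be illustrated with a simple pattern scan. The sketch below is an in-memory simplification and is not bulk_extractor's implementation: the regular expression is a deliberately loose assumption about what an email address looks like, and the function name is hypothetical.

```python
import re

# Loose pattern for ASCII email addresses inside arbitrary binary data.
EMAIL_PATTERN = re.compile(
    rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"
)


def extract_emails(data: bytes) -> list:
    """Return (byte offset, address) pairs for every email-like string found."""
    return [
        (match.start(), match.group().decode("ascii"))
        for match in EMAIL_PATTERN.finditer(data)
    ]
```

Recording the byte offset alongside each feature matters: it lets the examiner map a hit back to a sector in the original evidence, which addresses one of the carving limitations listed on the next slide.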
Nothing’s Perfect!
o Time consuming
o Many unreadable, invalid, and partial results
o More data out than in
o No offset/sector reference to input data
o Quality of the tooling is unclear
EnCase VS PhotoRec
Crumbs?
• Nothing’s perfect, but a little is better than nothing

It’s some dessert, right?!
Assignment(s)
o Check the given assignments in our shared directory.
References
• The evolution of file carving – the benefits and problems of forensics recovery, Anandabrata Pal and Nasir Memon, IEEE Signal Processing Magazine, 26(2):59–71, March 2009
• https://www.dfrws.org/2006/challenge/index.shtml
• http://www.dfrws.org/2007/challenge/index.shtml
• https://shankaraman.wordpress.com/tag/carving-files-from-wireshark-packets/
• Measuring and Improving the Quality of File Carving Methods, S.J.J. Kloet, 2007
• Data Carving Concepts, Antonio Merola, SANS Institute, 2008
• Word Search Pic, https://en.wikipedia.org/wiki/Word_search
• http://www.slideshare.net/rzirnste/advances-in-file-carving
• Hybrid File Carving Technique: JPEG Files, Esra’a Al-Shammari, 2016
• Digital media triage with bulk data analysis and bulk_extractor, Simson L. Garfinkel, 2013
• Digital Archaeology: The Art and Science of Digital Forensics, Michael W. Graves, 2013
• Bulk extractor pic, http://www.integritie.com/images/icons/bulkextract_100x100.jpg
• Foremost Pic, http://i.cmpnet.com/ddj/samag/images/sam0309a/sam0309a.gif
