Design and Implementation of a Targeted Data Extraction System for Mobile Devices

Fifteenth IFIP WG 11.9 International Conference on Digital Forensics, Orlando, USA
Abstract
The amount of data stored on smartphones and other mobile devices has increased phenomenally
over the last decade. Moreover, smartphones have become an essential personal item worldwide. As
a result, there has been a spike in the use of these devices for documenting different scenarios that are
encountered by the users as they go about their daily lives. In many situations, for example, in cases
of accidents, malicious intimidation through text messages or multiple shootings, a smart phone can
be used to document the incident. This data, which may take the form of a picture, a video, or an
audio clip, can be of significant forensic interest to an investigator. In many situations, the owner
of the phone may be willing to provide the investigator access to this data (through a documented
consent agreement). Such consent is usually contingent upon the fact that not all the data available on
the phone may be extracted for analysis, either due to privacy concerns or due to personal reasons.
Courts have also opined in several cases that investigators must limit data extracted, so as to focus on
only “relevant information” for the investigation at hand. Thus, only selective (or filtered) data should
be extracted as per the consent available from the witness/victim (user). This paper describes the design
and implementation of such a targeted data extraction system for mobile devices. It assumes consent of
the user and implements state-of-the-art filtering using machine learning techniques. This system can
be used to identify and extract selected data from smartphones, in real time, at the scene of the crime.
We describe the complete targeted data extraction system (TDES) and report results of experiments
conducted with both iOS and Android based systems.
I. INTRODUCTION
With the rapid growth of smartphones and tablets, it is becoming essential for law enforcement
to effectively conduct forensic analysis on such devices. These mobile devices now have so much
data on them that they have in essence become personal data repositories and the privacy of this
data is a serious concern. A recent ruling of the US Supreme Court (Riley v. California, 573 U.S.
[2014]) and subsequent rulings arising from this landmark case indicate that in order to search a
smartphone, it may not be enough to have a warrant for the search but it may also be required to
restrict the search to the specific items on the device that relate to the crime being investigated.
In this paper we describe the design and implementation of a novel forensically sound system
that can do this targeted (selective) data extraction. Commercial tools such as Cellebrite UFED
Physical Analyzer [12] have great utility, but they target a different use case and
their search features do not currently support many of the capabilities, including particularly the
AI capabilities, that we have integrated into our system. An additional objective of our design
and implementation is to ensure that the cost to law enforcement agencies is minimal, as we use
only free and/or open-source software tools as much as possible. To the best of our knowledge,
there is no such system that can match our capabilities.
Our design is motivated by the following observations. In many incidents, such as an accident,
a victim being maliciously texted, or a shooting videotaped by multiple bystanders, a mobile
device may have captured significant data that is
of forensic interest to an investigator and the owner of the device may wish to provide the
investigator the data (through a consent form). However, the owner may be reluctant to have all
the data on the phone extracted for analysis due to privacy concerns and personal reasons. As
discussed previously, the courts have held that the investigator must limit the amount of data that
is collected to only what is “relevant.” Thus only selective (or filtered) data should be extracted.
For example, at a crime scene or subsequent to a crime, victims and witnesses consent to
assist law enforcement by allowing analysis of their smartphones. Law enforcement personnel
assigned to the crime, using a consent agreement, now wish to scan only relevant files on the
smartphones so as to find sufficient evidence to support arrest and conviction of the perpetrators
of the crime. The smartphone data that is extracted must be relevant to the crime as defined
in the consent form. The scenario described is important in understanding our assumptions. We
assume that the mobile devices are voluntarily provided, and thus it is not necessary to break
into the phone (by rooting or jailbreaking) nor to crack a PIN code, etc.
In this paper we describe the design and implementation of a prototype software system that
can do targeted data extraction from mobile devices (iOS or Android based) in a forensically
sound manner, driven by input provided by either a first responder or a forensic examiner. The
system would run on a laptop and would connect to a mobile device. The input provided could
be based either on consent or on a warrant. Our software system would reduce the number of
files collected through both analysis of the file metadata as well as analysis of the content of the
files. The forensic soundness of the system processes is developed using the eDiscovery reference
model EDRM [24], as well as dynamic (live) analysis forensic techniques that have been
proposed for network and cloud forensics [22, 38]. Essentially, we validate the processes performed
by our system rather than rely on traditional dead forensic approaches that are not appropriate
for such mobile device analysis.
According to Jansen et al. [20], extraction methods are typically classified as manual, physical
or logical. In manual extraction, a person looks through the data on the phone in order to
determine what is relevant. In physical extraction the goal is to get all possible data on the
device by making a bit-by-bit copy. Logical extraction makes a bit-by-bit copy of the logical
storage objects (for example files and directories) by using the system calls of the underlying
operating system.
For our system we use techniques that support logical extraction or file system extraction
rather than physical extraction. We also assume that the relevant data has not been stored in
hidden or deleted files. Instead, we focus on the core technology of targeted data extraction
including how to define what is to be extracted, having a convenient and clear user interface to
describe this data and ensuring that the data extracted is done in a forensically sound manner.
An additional goal for our system is that the targeted data extraction process be extremely fast.
The vast majority of the US mobile device market (about 97%) uses some version of Apple's
iOS or Google's Android OS. Thus, in this paper we focus on smartphones
running recent versions of Apple iOS and Android OS. In the rest of this paper we describe the
design and implementation of our targeted data extraction system (TDES). It supports a digital
forensics investigator in collecting relevant data using: (1) metadata filtering rules such as the
specific date and time, location, and type of data to be extracted; and (2) content-based filtering
using artificial intelligence (AI) and machine learning techniques, such as the exclusion or inclusion
of pictures having pornographic content or messages that are abusive in nature. Additionally,
the TDES enforces a proper chain-of-custody approach with appropriate guarantees of evidence
preservation that have probative value in a court of law.
To the best of our knowledge there are no tools that are capable of doing online targeted
data extraction as we do. All the tools currently available commercially or otherwise create a
complete backup of the device and allow the investigator to query the backup offline. Moreover,
none of these tools has the capability of filtering data using machine-learning-based content
filters.
There are a number of tools available for full data acquisition from Android and iOS based
devices. Some of the most important commercial tools for smartphone forensics are: Cellebrite
UFED Physical Analyzer [12], Paraben Electronic Evidence Examiner [35], Oxygen Forensic
[34], AccessData Mobile Phone Examiner Plus [2], Micro Systemation XRY [31], Magnet
Acquire [26], and BlackBag Mobilyze [10]. These tools aim to acquire as much data as possible
for further analysis and provide both physical and logical acquisitions. Nevertheless, on-line or
off-line selective data acquisition methods have not yet been integrated into such tools.
There has been considerable research on forensic data extraction and analysis in the last
decade. Some of this work is targeted towards extraction of information about specific types of
artifacts, for example acquisition of data from cloud drives or social networking applications:
Roussev et al. [38], Al Mutawa et al. [3], Anglano [5], Mahajan et al. [28], Ruan et al.
[39] and Maus et al. [29]. Other work has been directed towards general forensic extraction
issues for mobile devices: Husain et al. [19], Quick and Alzaabi [36]. In the last few years the
idea of “real time triage” has become important and there has been some work on frameworks
for building such systems: Roussev et al. [37], Cantrell et al. [11] and Walls et al. [47].
Interested readers can consult Scrivens and Lin [40] for an in-depth discussion on Android
forensic analysis and Morrissey and Campbell [30] for forensic analysis of iOS based devices.
Another aspect of forensics that has garnered the attention of researchers is privacy, both
in the context of digital forensics in general and mobile forensics in particular: Aminnezhad et al.
[4] and Stirparo and Kounelis [43].
Machine learning (see Murphy [32]) and its applications have gained a lot of attention
lately. Deep learning, which refers to the development of models based on training neural networks
(see LeCun et al. [25]), has been successfully applied to building systems that outperform
humans in areas as diverse as image recognition [23] and natural language translation [13].
More recently, open-source frameworks such as Caffe [21], Theano [9] and TensorFlow [1]
have been developed that make it easy to implement deep representational learning using
neural networks. Furthermore, with the advent of smartphones equipped with state-of-the-art
processors, it is now possible to run trained deep-learning models on these phones for several
different tasks, such as face detection, image analysis and classification, using frameworks such
as Inception [45], NSFW [27], MobileNet [18], and CoreML [6].
Our system consists of three different subsystems: the data identification system, the data
acquisition system and the data validation system. The data identification system is responsible
for identifying the most relevant files based on metadata and content. The input to this system
is driven broadly by the contents of a consent form and fine-tuned by the investigator using a
user interface that we have designed for this purpose.
Data on the smartphones consists of several different types, such as photos (images), videos,
messages, contact lists, etc. We consider these as the basic categories of data. Each category is
associated with metadata that describes aspects of the data such as time (when was that image
put on the device), location (where was the image taken), sender and receiver (for text and
multimedia messages), etc. Notice that by metadata we simply mean information about a data
category that is stored by the OS and can be used to filter data in that category. It must be
noted that the metadata is different from the content. As a concrete example, we are able to
extract photos based on date ranges such as “photos taken within the last week.” This date range
uses metadata about the photos. However, if we want to find only photos containing cats, then
we need to do additional content based filtering. The identification system uses state of the art
algorithms in machine learning, natural language processing and data mining for content based
filtering.
The data acquisition system interacts with the identification system to retrieve targeted files in
one or more phases from a target smartphone, in a forensically sound manner. Note that in our
case, acquisition includes what is often termed data collection, so that after the acquisition step,
the acquired data is the desired evidence. It must be noted that the data identification system and
the data acquisition system work with each other (what data to extract and actually extracting
the data, respectively) and are in fact inextricably linked.
Our data acquisition system consists of two parts: the “TDES manager,” which resides on
a portable bootable drive (we have used a USB memory stick in our implementation), and the
“TDES app,” which is deployed on the phone. First, the manager boots
up in Windows 10 OS when connected to any laptop or computer. Second, the target smartphone
is connected to the same laptop or computer and the TDES app is downloaded onto the target
smartphone. The user interface of the app allows the investigator to provide the input to the data
identification system from the target phone itself. Finally, the filtered data from the target phone
is transferred back to the TDES manager.
The data validation system ensures that (1) data is transferred in a forensically sound manner
from the TDES app to the TDES manager, and includes appropriate hashing to ensure the
integrity of the data; (2) a log timeline is generated that documents the steps taken by the TDES
system during the “live analysis;” and (3) a report is generated that documents the queries by
the investigator, the data analysis performed in response to the query (including the time) and
the data selected. The data validation system is integrated into the identification and acquisition
systems through the TDES app and the TDES manager.
Going forward we describe our system in detail. For ease of exposition and because of the
fact that the data identification and acquisition abstractions are highly coupled, we describe them
together as the targeted data extraction system.
In order to develop a model for targeted data extraction for incidents where only limited data
must be extracted, we considered a list of potential scenarios, suggested by investigators (personal
communication), wherein consent might be natural and filtering of data would be very useful.
Some of the scenarios we considered are as follows:
• A car accident where a bystander has taken photos or a video of what occurred.
• An overdosing incident where the victim has contact information on the phone about the
dealers and/or the drugs taken.
• A suicide where the victim’s phone may contain relevant text/photos.
• A situation wherein a couple breaks up and one of the parties involved harasses the other
using a smartphone.
• An incident of domestic violence where the pictures and videos of the bruises inflicted on
the victim by the perpetrator can be used as evidence.
• An incident of public mass violence where many individuals have captured videos or photos
of the crowd and the goal is to analyze these to locate the possible perpetrators and the
devices used in the attack.
• Isolated shooting incidents involving multiple parties where photos or videos were taken.
For example, in several cases of shootings, bystanders or companions recorded the interaction on
their smartphones (Steinbuch and Tacopino [42]). In the case of the Boston marathon bombings,
the amount of digital evidence was overwhelming (Stroud [44]) and automated selective data
extraction may have been very useful. Based on the aforementioned scenarios, the different
types of data that might be of use to the forensic investigator are as follows:
• User created data: Contacts and address book, SMS, MMS, Calendar, Voice Memos, Notes,
Photographs, Video/Audio, Maps and location information, Voice Mails and Stored files,
etc.
• Internet related data: Browsing history, emails, social networking data, etc.
• Third party application data: Messaging data (text, voice, video, pictures) from What-
sApp, Facebook, Skype, etc.
As mentioned before, we concentrate on two types of devices: iPhones and Android based
smartphones. Recall that the TDES consists of an on-device “TDES app” and a “TDES manager.”
The “app” is deployed on the target device and is responsible for filtering and transferring the
data to the “manager.” We call this “On-device Acquisition.” The reason for using an app is
that only information that passes the app's filters is actually transferred from the phone. Thus, no
other data on the phone is ever pushed to the manager. However, it must be noted that for some
data types on the iPhone, it is not possible to selectively extract the relevant data without either
“jailbreaking” the phone or using the iTunes backup system. As our use case entails consent
from the user, jailbreaking is not an option because it voids the manufacturer warranty. Thus
our only choice for such a situation is to use the iTunes backup system for selective extraction;
we call this “backup acquisition.” In this latter case, we must initially move data to the TDES
manager before selectively extracting the data that has been defined as needed.
There are several issues that arise when using the TDES model: (1) how does the app actually
extract the data; (2) how do we decide what is to be extracted; and (3) how do we get the app
on the target phone and the extracted data back to the manager. We discuss each of these in the
following sections.
As previously discussed, data on the phones is of several categories, with each category having
several possible metadata fields. Table I shows what the TDES app can extract, using metadata filtering,
when deployed on the target device. The first part of the table shows what can be extracted from
both iPhones and Android phones. The second part of Table I shows what can be extracted by
Android and not by iPhone. For example, photos captured by third party apps such as Facebook
or WhatsApp can be extracted from Android phones but not from iPhones. The third part of
Table I shows examples of data that our app currently cannot extract from either iOS or Android
phones.
1) Filtering for iOS: System interfaces for iOS devices are delivered in the form of packages.
These packages are referred to as frameworks. A framework comprises a dynamic shared library
and supporting files for that library. Frameworks are added to an iOS application project using
XCode. Cocoa Touch is a user interface framework used for building programs to run on iOS
platforms. Figure 1 shows the iOS frameworks used in the TDES application. A brief description
of each of the Apple frameworks we have used in the TDES mobile application is as follows:
• The Photos framework provides direct access to the photo and video assets managed by the
iPhone Photos app.
• The AVKit framework provides a high-level interface for playing video content.
• The CoreLocation framework provides location and heading information to apps.
• The EventKit framework provides an interface for accessing calendar events on a user’s
device.
• The Contacts framework provides access to the user’s contacts and functionality for orga-
nizing contact information.
Using the above frameworks, the TDES iOS application accesses data stored in the user data
partition such as photos, videos, contacts and calendar events. Note that the frameworks require
the iPhone user to give explicit permission to the TDES app for such data access. It is also to
be noted that Apple has focused on enforcing greater privacy capability for iPhone users; there
are no frameworks currently provided that give third-party applications (such as our app) the
ability to access data such as Messages, Call Logs and Web History.
2) Filtering for Android: The Android operating system is a stack of software components
which is roughly divided into five sections namely Application Layer, Application Framework,
Libraries, Android Runtime and Linux Kernel (see Figure 2). The TDES app is deployed
in the Application Layer along with other native applications. It uses services provided by
the Application Framework which includes the Content Provider, Activity Manager, Resource
Manager and the View system (Google [17]). The Content Provider class offers granular control
over the permissions for accessing data from other applications and allows our application to
access a range of data such as messages, photos, videos, contacts, calendar events and call
logs. The Activity Manager, Resource Manager and the View system are used for design and
implementation of the TDES app.
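As a concrete illustration of this mechanism, the sketch below shows how an Android app can query the SMS content provider with a date-based metadata filter. It is a minimal sketch, not code from the TDES implementation: the class and method names are our own, the app is assumed to hold the READ_SMS permission, and error handling is omitted.

```java
import android.content.ContentResolver;
import android.database.Cursor;
import android.provider.Telephony;

import java.util.ArrayList;
import java.util.List;

public final class SmsMetadataFilter {
    /** Returns the bodies of SMS messages received at or after sinceMillis (epoch ms). */
    public static List<String> messagesSince(ContentResolver resolver, long sinceMillis) {
        List<String> bodies = new ArrayList<>();
        String[] projection = {Telephony.Sms.ADDRESS, Telephony.Sms.BODY, Telephony.Sms.DATE};
        // Metadata filter: only rows whose DATE column falls in the requested range
        // are returned by the provider, so unselected messages never leave the phone.
        String selection = Telephony.Sms.DATE + " >= ?";
        String[] args = {Long.toString(sinceMillis)};
        try (Cursor c = resolver.query(Telephony.Sms.CONTENT_URI, projection,
                                       selection, args, Telephony.Sms.DATE + " DESC")) {
            while (c != null && c.moveToNext()) {
                bodies.add(c.getString(c.getColumnIndexOrThrow(Telephony.Sms.BODY)));
            }
        }
        return bodies;
    }
}
```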
We start this section by briefly discussing how to deploy machine learning models on mobile
phones. Trained models are developed through any supervised ML technique, including learning
using deep neural nets. One can use models developed by others that are open source, improve
their accuracy by retraining, or develop models from scratch using large amounts
of training data. A trained model can be viewed as a data structure containing estimates of
the relevant parameters (both for classification as well as regression). A trained model can be
incorporated into an app (iOS or Android) by using an appropriate framework. Initially we
have used the trained models from Inception-v3 [14], MobileNet [18] and OpenNSFW [27]
incorporating them into our TDES app for various classification problems using photos and
videos. Our current prototype allows us to identify photos containing weapons, people, vehicles,
drugs, websites, skin exposure and gadgets.
Filtering for iOS: For on-device content based filtering we use Apple's CoreML framework [6],
which is used to integrate machine learning models into an app (see Figure 3). CoreML includes
support for various related frameworks, such as Vision and GameplayKit [6].
Filtering for Android: In order to do content filtering for Android based phones, we use
the TensorFlow Lite framework [46]. Integration of trained models into the application requires
a different kind of setup as compared to iOS. The trained model and the labels are used in
conjunction with a shared object file (libtensorflow_inference.so) that is written in C++. Furthermore,
in order to interface with the Android platform we need to use a Java API, distributed
as a JAR file named libandroid_tensorflow_inference_java.jar (Shekhar [41], Abadi et al. [1]).
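The following minimal sketch shows how a trained image classifier can be invoked from Java through the TensorFlowInferenceInterface class mentioned above. The asset path, tensor names, input size and label count are assumptions that depend on the particular exported graph; they are not values taken from the TDES implementation.

```java
import android.content.res.AssetManager;
import org.tensorflow.contrib.android.TensorFlowInferenceInterface;

public class PhotoContentFilter {
    // Assumed names: the real values depend on how the model graph was exported.
    private static final String MODEL_FILE = "file:///android_asset/mobilenet.pb";
    private static final String INPUT_NODE = "input";
    private static final String OUTPUT_NODE = "output";
    private static final int INPUT_SIZE = 224; // MobileNet's usual input resolution

    private final TensorFlowInferenceInterface inference;

    public PhotoContentFilter(AssetManager assets) {
        // Loads the frozen model from the app's assets via libtensorflow_inference.so.
        inference = new TensorFlowInferenceInterface(assets, MODEL_FILE);
    }

    /** Returns per-label scores for one preprocessed RGB image (normalized floats). */
    public float[] classify(float[] pixels, int numLabels) {
        inference.feed(INPUT_NODE, pixels, 1, INPUT_SIZE, INPUT_SIZE, 3);
        inference.run(new String[] {OUTPUT_NODE});
        float[] scores = new float[numLabels];
        inference.fetch(OUTPUT_NODE, scores);
        return scores; // the app keeps a photo only if its top label matches the filter
    }
}
```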
Filtering for iOS: For iPhones (iOS) we use iTunes Backup to cover the categories in the
second and third sections of Table I for which on-device acquisition cannot be done. In general,
iOS security mechanisms don’t allow applications (such as our TDES app) running on-device
to extract certain types of content from the device. Therefore, we were required to make use of
external tools and techniques which would allow us to extract more information. The technique
we used is “iTunes backup acquisition.” We spent time determining the appropriate tool to use
and decided on idevicebackup2 which is contained in the libimobiledevice suite. This suite is open
source and free. Using it, we obtain call logs, messages and contacts through this off-device
system. We next describe the structure of the information in some detail, and how we extract
what we need, to illustrate the complexity involved: the iTunes backup file organization is
quite complicated.
In the backup, resources are indexed by their hashes (SHA-1). In the file named “Manifest” we
can see which hash represents a resource. For example, consider locating the Call History database,
as sketched below:
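The example that originally appeared here is not reproduced, so the following Java sketch is our hedged reconstruction of the lookup. It assumes the Manifest is the SQLite database Manifest.db used by recent iTunes backups, with a Files table mapping (domain, relativePath) pairs to fileIDs computed as the SHA-1 of the string "domain-relativePath"; the call-history path shown is the commonly reported location and may differ across iOS versions, and the sketch assumes a SQLite JDBC driver is on the classpath.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ManifestLookup {
    public static void main(String[] args) throws Exception {
        // Assumed domain/path of the call-history database; may vary by iOS version.
        String domain = "HomeDomain";
        String relativePath = "Library/CallHistoryDB/CallHistory.storedata";

        // Option 1: look the fileID up in the Manifest.db catalogue.
        try (Connection c = DriverManager.getConnection("jdbc:sqlite:Manifest.db");
             PreparedStatement ps = c.prepareStatement(
                 "SELECT fileID FROM Files WHERE domain = ? AND relativePath = ?")) {
            ps.setString(1, domain);
            ps.setString(2, relativePath);
            try (ResultSet rs = ps.executeQuery()) {
                if (rs.next()) {
                    System.out.println("fileID from manifest = " + rs.getString(1));
                }
            }
        }

        // Option 2: recompute the index as SHA-1("domain-relativePath").
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        byte[] digest = sha1.digest(
            (domain + "-" + relativePath).getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        System.out.println("computed fileID = " + hex);
    }
}
```

As the text notes, looking the ID up in the manifest (Option 1) is more robust, since the hash for a given database may change.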
The complexity arises because this is essentially a data mining problem: knowing where to look and
finding the relevant fields is not obvious from the data structures and is not well documented.
Similar complexity is involved in getting the Messages and Contacts databases (which are needed
to obtain caller ID information for messages and call logs). An additional complication is that the
fileID hashes of the databases described here may change, so we decided to look up the IDs
within the manifest file.
Filtering for Android: For Android based phones, as far as we know, everything that can
be filtered off-device can also be filtered on-device, and hence we have only implemented on-device
filtering techniques. Note that, as depicted in Table I, there are third party applications whose
data cannot be retrieved using on-device acquisition for Android phones when the device is not
rooted. In our experiments on both rooted and unrooted phones, we have not found any Android
backup tool (including ADB), analogous to iTunes backup for iOS, that would allow us
to extract the data from third party apps (see the third section of Table II).
D. TDES Communication
One of the most important aspects of the TDES system is the communication between the
TDES Manager and the target smartphone. We decided to implement a similar paradigm for
both iOS and Android based phones. This is shown in Figure 4. In our model the investigator
is provided with a portable TDES Boot Drive (currently we use a USB stick) that is preloaded
with a number of items including: the boot loader for Windows 10 OS, the TDES manager and
other tools necessary to install the TDES App to the target smartphone. All extracted data will
be sent back to the Boot Drive by the TDES app and reports on this data will also reside on the
Drive. The investigator can use any available computer to boot into the TDES manager which
runs off an isolated environment on the portable TDES Boot Drive. After booting up, the TDES
manager needs to have Internet access if the target device is an iPhone. Succinctly, the steps for
targeted data extraction are as follows:
• The boot drive containing the TDES manager is inserted into a laptop and Windows 10 OS
boots up and the “Manager” program starts running.
• A wired USB connection is made to the smartphone. App installation happens in this step,
except that for iPhones a hotspot is needed to: (a) code sign the App by connecting to Apple
servers; and (b) trust the developer, by connecting again to Apple.
After the App gets downloaded to the smartphone, the phone can be disconnected from the
laptop.
• A wireless or wired two-way communication channel is set up between the Manager and
the TDES app for data transfer.
• The targeted data extracted by the TDES App is exported to the TDES Manager and reports
are generated for the extracted data.
The exported data is stored in a specific folder that is created on the file system of the
boot drive (described later). In our implementation we ensure that no copies of the data to be
exported are stored on the user’s phone in any intermediate form. We next describe the protocols
for TDES App installation on the target device and then we consider the data transfer protocol
for exchanging data between the TDES App and the TDES Manager.
1) iOS TDES App Installation: The iOS kernel has control over user processes and applications
which can be run on the iOS device. Only applications from sources approved by Apple
can be run on non-jailbroken iOS devices. In order to make sure that the source is a
known and approved one and that the app has not been corrupted, iOS requires that all executable
code must be signed with a certificate issued by Apple. Applications that come pre-installed in
the iOS devices have already been signed by Apple. Third-party applications also need to be
signed with a certificate issued by Apple to prevent loading of any tampered or self-modifying
code [7].
To code sign the app, the developer's certificate and provisioning profile must be set
in XCode. In most cases, we can rely on XCode's automatic code signing,
which requires that a developer must specify a code signing identity in the build settings of the
project. But when we try to install an app on an iPhone (without using XCode) for in-the-field
evidence extraction, we must go through a process called “Side-loading.” Side-loading is the
installation of an application on a mobile device without using the device’s official application-
distribution method which is Apple’s App Store.
Our method of choice for side-loading is by using “Impactor,” a tool from Cydia [16]. We first
generate an “.ipa” file of the TDES App using the XCode Archive utility. IPA stands for iOS
App Store Package and is an application archive file which stores an iPhone app. Each .ipa file
contains a compressed binary for the ARM architecture and can only be installed on an iPhone,
iPod Touch, or iPad. In order to code-sign, Impactor logs into the Apple Developer Center and
downloads the developer’s provisioning profile and the iOS development certificate. Note that
logging into the Apple Developer Center requires an Internet connection. Then Impactor signs
the .ipa contents in a depth first manner starting with the deepest folder level, making its way
up to the top level folder. Once this signing is completed, Impactor installs TDES App onto the
specified device. Note that all these tasks discussed have been automated using an AutoHotKey
script [8] that runs after the TDES Manager boots, thus requiring no actions by the investigator.
2) Android TDES App Installation: The Android OS requires that every application being
installed on a device must be signed. The signing process hashes the files within the application,
in turn generating a versioned application. This process ensures that if an application were
maliciously tampered with and then redistributed as an update, the update would fail,
because there is no feasible way to replicate the key used in the signing process. Therefore, as
long as the application developed is signed and does not attempt to update another application,
it can be self-signed. The key used to sign the application is stored in a jks (Java KeyStore)
file, which can be created using the Java Keytool. Once the key has been generated, a versioned
application can be created through Android Studio.
The output of the completed compilation is an apk file, which is the standard Android OS
application extension and is used for installing any application on the Android device. In order to
install the TDES App, its apk file must be on the target device. This can be done either using a
wired or a wireless connection. Note that for Android no other authentication is necessary. Once
the apk file is on the target device, the TDES App can be installed. For simplicity and ease of
use we use ADB (Android Debug Bridge), a command line utility provided within the Android
SDK that allows for communication between the host computer and a target device, to install the
TDES App automatically. The only restriction to using ADB is that the target device must first
be in USB Debugging Mode and once the installation is complete, this mode can be turned off.
3) TDES Data Transfer Protocol: An important consideration for us while developing and
using the App-Manager communication channel was to ensure that the selected data extracted
using the app is sent with forensic integrity, so that any modification of the data can
be detected. Furthermore, if during the chain of custody of the data any changes were made,
inadvertently or purposefully, the owner of the phone would be able to verify this. We do this
through hashing various portions of the data extracted as well as sending a “final hash” value
to the user’s email address by the TDES manager.
For iOS: For iPhones we use a socket-based data transfer protocol analogous to that for
Android phones (described below). Since we needed a hotspot in any case, we implemented the
App-Manager communication channel to transfer the data either wirelessly or over a wired
connection.
For Android: We use ADB for communication between the host computer and the target device.
Simple file transfers can be done once communication is established, but ADB also provides more
capable commands. In particular, ADB allows for something called port forwarding.
Simply put, this redirects data passing through the specified port on the host computer to the
specified port on the target device. With this setup complete, the target device can now create
client sockets as needed to transfer data as many times as necessary. Android applications are
natively written in Java so standard networking packages can be imported on both the target
device app and the host computer application. Specifically, we import java.net.*, which allows
ServerSockets and Sockets to be created. The ServerSocket waits for a client Socket to connect
from the target device, and we use input and output streams to gather the data.
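The following minimal Java sketch illustrates the Manager (host) side of this exchange; the port number and output filename are illustrative assumptions, and the corresponding ADB port mapping (e.g., adb reverse tcp:5555 tcp:5555 when the host listens) must be established beforehand.

```java
import java.io.FileOutputStream;
import java.io.InputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class ManagerReceiver {
    public static void main(String[] args) throws Exception {
        // Port 5555 is an assumption; it must match the port mapped via ADB.
        try (ServerSocket server = new ServerSocket(5555);
             Socket client = server.accept();                // blocks until the TDES app connects
             InputStream in = client.getInputStream();
             FileOutputStream out = new FileOutputStream("It1_Photos.json")) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);                        // stream the payload to the boot drive
            }
        }
    }
}
```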
E. User Interface
Our user interface runs as part of the App on the target device. The interface lets an investigator
define the selected data to be extracted from the phone. Although we may eventually converge
on one type of user interface, it is currently somewhat different on the iPhone and the Android
phone. In our current implementation, we assume that a consent form has been manually signed
by both the investigator and the user which allows extraction of the selected data. Note that
the investigator would actually decide on the specifics using the app interface discussed below.
However, in future work we could also automate and print out consent form(s) during the
extraction process.
Before turning to the user interfaces, we note an interesting and useful feature called bookmarking
that we have implemented in the TDES App. Suppose a dataset has been extracted
using a set of filters. The investigator setting up these filters can display the results of the
filtering and do a quick data review on the phone itself before deciding what data to export
to the TDES manager. For example, if a set of images of weapons in a certain time range has
been selected, the investigator can do a review of the images to decide which subset of these
are relevant to the investigation by bookmarking the relevant set. Although bookmarking seems
like an excellent idea, a problem arises in that the data extracted is not identical to what was
specified; moreover, the investigator has now seen data that may not even be exported,
thus violating the privacy of the user to some extent. After discussions with a former prosecutor
and current defense attorney (personal communication) it became clear that bookmarking was a
useful feature as it provides a mechanism for incorporating the experience of the investigator in
the selection process. At the same time the attorney also agreed that in some cases bookmarking
may have the potential to introduce bias into the evidence collection process to the extent of not
including exculpatory evidence. Thus, in our implementation we provide the ability to turn off
bookmarking; in cases where bookmarking is done, we create two versions of the data, one
with the bookmarking and one without, and export both sets back to the TDES manager.
1) iPhone App Interface: Figure 5 shows the iPhone TDES app interface. It starts with the
choice of “when” which defines date-ranged options consisting of “today,” “last week,” or “last
month.” An arbitrary date range can also be provided. In the next screen, the user is given
the “where” option, and a choice of locations is provided, such as the current location, a location
within a certain number of miles, or a location determined by city, state, or zip code.
The next screen defines the “what” or data type that is to be extracted. Choices are images,
videos, calendar, call logs, messages and contacts. For each of the data types, the next screen
defines the “filtering option” (primarily machine learning based content filtering) which provides
the ability to further qualify the extraction for the selected data type. For example, if the selected
data type is image or video, the content filtering options that we support are either the inclusion
or the exclusion of: weapons, places, vehicles, drugs, websites, gadgets, skin exposure, porn and
favorites. Thus, as an example, if the “exclude skin-exposure” option is selected then our app
would filter out such images for display or export. The final screen shows what is to be done
with the evidence: display on the device, export, or both. The choice “Export” is essentially
getting the extracted information back to the TDES manager.
2) Android App Interface: In contrast to the TDES iPhone App, the TDES Android App starts
with the screen for specifying the data categories to be extracted. The Android App supports the
same categories of data as the iPhone App as shown in the second screen in Figure 6. Selecting
any of these data types leads to a new screen with another set of choices providing further
filtering options based on metadata as well as content specific to the data type selected. For
example:
• Call logs can be further filtered by name and number as well as by date and time.
• Contacts can be further filtered by name and number.
• Messages can be further filtered by name and number as well as by date and time.
• Videos and images can be further filtered by location, date, time and finally by content.
The options for content based filtering are identical to the options available for the iPhone
as discussed in Section IV-E1.
The Android interface also has provisions (as does the iPhone interface) for displaying or
exporting the extracted data back to the TDES manager.
For the TDES data transfer we defined a common interface, using the JSON object format [15],
for both types of phones. The JSON structure allows us to describe
the extracted data as well as additional information such as hashes when used and any other
reporting information that we may wish to collect using the TDES app. For example, as part of
the report, we indicate the time that the TDES app started to run, times when the extractions
were completed, etc. Although the data transfer is primarily from the App to the Manager, we do
get a few pieces of information from the Manager to the App such as Investigator name, device
owner’s name and case number. The Android TDES App can extract additional information such
as IMEI, phone number, and phone email address. For iOS this information must be entered in
the Manager. Figure 7 in Appendix A shows a sample report generated for an iPhone.
1) TDES Directory Structure on the Boot Drive: The directory structure that we create for
storing the evidence on the boot drive, ensuring the integrity of the data, and for reporting purposes is
shown in Figure 8-(A) in Appendix A. In this figure, Case directories are created for each case
that the investigator is handling. Inside this directory, the actual full report is the file Report.html.
We describe the file Final.json in the next section when we describe the .json files. The extracted
information is stored as one or more iterations of filtering requests made by the investigator. For
example after the investigator has exported a set of selected data, he/she may wish to do another
filtering, either through omission or maybe for some other combinations of metadata/content
filtering on the target phone. For each iteration, we see that each category of data has a directory
associated with it and a .json file associated with this directory.
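Since Figure 8-(A) is not reproduced here, the sketch below illustrates the kind of layout just described. Apart from Report.html, Final.json and the Itn_*.json files named in the text, the directory and file names are our illustrative assumptions.

```
Case_001/                      one directory per case
├── Report.html                full report for the case
├── Final.json                 holds the hash of each iteration's Itn_Hashes.json
├── Iteration_1/
│   ├── Photos/                exported photo files for this iteration
│   ├── It1_Photos.json        metadata and f_hash for each exported photo
│   └── It1_Hashes.json        j_hash of every It1_*.json file
└── Iteration_2/
    └── ...                    one subtree per filtering iteration
```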
2) JSON Format for Data Transfer: JSON format is used to describe the structure of the data
that is exported so that the Report Manager can create one or more appropriate reports easily.
We create reports in the HTML format.
An example of a JSON format that we have defined is illustrated in Figure 8-(B) in Ap-
pendix A. Suppose we have retrieved a series of photos using our metadata and content filters.
Information about each photo and auxiliary information will be transferred along with the actual
file of the image. Note that both the TDES iPhone app and the TDES Android app would
create the information in the same format. Once this information is transferred to the TDES
Manager on the boot drive, the Report Manager will use the full information to create the actual
report. Various hashes are also transferred as part of the JSON files. As can be seen in the file
It1_Photos.json, the file is structured into arrays of arrays containing (Key, Value) pairs. For
example, creation date is a key and its value is the string 01-01-2017. It should be clear that
a lot of data is exported in the JSON file. Also note two other important points. The
key filename has a value string associated with it that is the name of the actual photo image.
The actual image, however, is not stored as part of the JSON file but is stored as a separate file
as defined by the key exportpath. Note also that the hash value of the actual photo file is stored
within this JSON structure and is defined by the key f_hash.
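The fragment below is an illustrative reconstruction of one photo record in It1_Photos.json, based on the keys named in the text (filename, creation date, exportpath, f_hash); the enclosing structure and any other key names are assumptions, and the actual files are organized as arrays of (Key, Value) arrays as described above.

```json
{
  "category": "Photos",
  "iteration": 1,
  "artifacts": [
    {
      "filename": "It1_Photo_Camera_1.jpeg",
      "creation date": "01-01-2017",
      "exportpath": "Case_001/Iteration_1/Photos/It1_Photo_Camera_1.jpeg",
      "f_hash": "<SHA-1 of the exported photo file>"
    }
  ]
}
```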
3) Hashing and Data Integrity: We use SHA-1 hashes to ensure the integrity of the data
transferred to the boot drive and during subsequent chain of custody. This is typically done by
law enforcement for digital forensics data. We can easily change to SHA-2 or other hash types
if preferable for investigators. The hashes that we create are as follows. For each filename that
is defined in a JSON file we create a hash associated with it, called the f_hash. Consider for
example the It1_Photos.json file in Figure 8 in Appendix A and the key filename with value
It1_Photo_Camera_1.jpeg. There is an f_hash associated with it (shown in the figure), since the
actual file is stored in a separate location. Thus any file in our directory that is not a JSON file has
its hash value stored in some JSON file. Next, for every JSON file there is a JSON hash (j_hash)
associated with that file. For example, the hash value computed on the file It1_Photos.json is
stored as the key It1_Photos_j_hash in the file It1_Hashes.json. For each iteration n, the hash
of Itn_Hashes.json is stored in the Final.json file, also shown in Figure 8 in Appendix A. The
hash of Final.json is called the Final hash. Note that the Final hash ensures that no file in a
Case directory can be modified without detection. The Final hash computed by the TDES App
is sent to the TDES Manager and stored in Report.html. Note that the TDES Manager could
independently compute the Final hash to check if there were any changes during the transfer.
We compute hashes at intermediate points for several reasons, including ease of granular data
transfer and verification that the transfer is correct. Checking extracted files against
“known” files [33] is also simplified. The TDES Manager also emails a copy of the Final hash
to the owner of the smartphone as well as the investigator.
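A minimal Java sketch of this hash chain is given below; the helper and the file paths are illustrative, but the f_hash / j_hash / Final hash relationships follow the description above.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;

public class HashChain {
    /** Hex-encoded SHA-1 of a file; used for f_hash, j_hash and the Final hash alike. */
    static String sha1(Path file) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-1");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) md.update(buf, 0, n);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) hex.append(String.format("%02x", b));
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // f_hash: hash of an exported artifact, recorded inside It1_Photos.json.
        String fHash = sha1(Paths.get("It1_Photo_Camera_1.jpeg"));
        // j_hash: hash of the JSON file itself, recorded in It1_Hashes.json.
        String jHash = sha1(Paths.get("It1_Photos.json"));
        // Final hash: hash of Final.json, which covers every Itn_Hashes.json.
        String finalHash = sha1(Paths.get("Final.json"));
        System.out.printf("f_hash=%s%nj_hash=%s%nFinal hash=%s%n", fHash, jHash, finalHash);
    }
}
```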
V. EXPERIMENTS
We conducted a series of experiments to test our prototype system (Tables IV through VI for
iOS; Tables VII and VIII for Android) for initial accuracy and speed with respect to metadata
and content based filtering. Note that for metadata filtering we would expect 100% accuracy,
since we use the native frameworks provided by Apple and Android. However, we also manually
checked that the metadata filtering was correct.
We also compared the performance of our system against two commercial tools (Paraben
EEE [35] and Magnet Axiom [26]; Table IX) currently used by law enforcement. As pointed
out in Section II, neither of these tools is capable of doing selective data extraction in the
manner we have described. Essentially, these tools do a physical acquisition of the data on the
phone and then allow the user to analyze the extracted data off-line. This paradigm is used
irrespective of the category of the data being analyzed.
In our experiments we used three Android based phones and three iPhones. The detailed
configurations of the hardware used for the experiments are summarized in Table III. Apple
devices Dev-I and Dev-III belong to authors of this paper and contain real user data. Dev-II
contains synthetic data (non-copyrighted, available for reuse, and drawn from the Internet).
Similarly, Android devices Dev-V and Dev-VI contain real user data (and belong to the authors)
and Dev-IV contains synthetic data. The table shows the total number of artifacts on the devices
for each category of data. Our TDES boot drive was a SanDisk Extreme 128 GB stick and the
laptop used was a ThinkPad X1 Carbon running Windows 7 (the laptop's native OS was never
used in the experiments).
TABLE III: Devices used in the experiments, with total artifact counts per category

| Model/Version | Storage | CPU | NIC | Photos | Videos | Messages | Call Logs | Contacts | Calendar |
|---|---|---|---|---|---|---|---|---|---|
| Dev-I: iPhone-8 (iOS v11.2.1) | 256GB | Apple A10 Fusion | Lightning port | 10,307 | 178 | 208 | 482 | 1102 | 148 |
| Dev-II: iPhone-7 (iOS v11.2.5) | 128GB | Apple A10 Fusion | Lightning port | 2,621 | 109 | 5 | 155 | 6 | 46 |
| Dev-III: iPhone-6 Plus (iOS v11.2.2) | 16GB | Apple A8 | Lightning port | 2,566 | 102 | 15,978 | 714 | 384 | 265 |
| Dev-IV: Samsung Galaxy S7 (v7.0, Nougat) | 32GB | Octa-core Exynos | Micro-USB 2.0 | 100 | 6 | 37 | 7 | 20 | 17 |
| Dev-V: Moto G3 (v6.0, Marshmallow) | 16GB | Quad-core 1.4 GHz Cortex-A53 | Micro-USB 2.0 | 191 | 7 | 25,420 | 429 | 1889 | 780 |
| Dev-VI: Samsung Galaxy S7 Edge (v7.0, Nougat) | 32GB | Quad-core Snapdragon 820 | Micro-USB 2.0 | 249 | 22 | 13,362 | 500 | 240 | 337 |
| SanDisk Extreme PRO CZ88 (boot drive) | 128GB | - | USB 3.0 | - | - | - | - | - | - |
| ThinkPad X1 Carbon (4th gen) (laptop) | 512GB | Intel Core i7-6600U | USB 3.0 | - | - | - | - | - | - |
A. iOS
Table IV shows the results for a series of experiments (we used only Dev-I and Dev-II for
this series) for on-device metadata based filtering for iPhones. Each experiment (shown in the
first column numbered from 1 to 14) defines a category of data to be extracted and the filters
used for the selection. For example, for experiments #12 and #13 both Photos and Videos are
extracted for the date ranges, as indicated in the metadata filtering column. For each device we
show the total number of artifacts selected based on the filter as a fraction of the total number of
artifacts on the device. For example on Dev-I for experiment #12 there were 91 photos retrieved
from 10,307 photos on the device. The metadata filtering was 100% accurate based on checking
the devices manually and using features of the phones such as Photos Album count. The time
to display the data on the target device and export the data to the TDES Manager is indicated
in the columns “Display Time” and “Export Time” respectively for each device. Note that the
observed times show that the system can be used for practical in-field targeted data extraction.
The size of the exported data is also shown in Table IV. Note that for these experiments we
used the wired connection for data transfer.
TABLE IV: On-device metadata based filtering on iPhones

| Category | Metadata Filter | Dev-I Artifacts | Dev-II Artifacts | Display Time-I | Display Time-II | Export Time-I | Export Time-II | Size-I | Size-II |
|---|---|---|---|---|---|---|---|---|---|
| 1-Photos | Date: 12/24/17 - 12/27/17 | 2/10,307 | 90/2,621 | 0.7sec | 0.38sec | 3.58sec | 33.63sec | 2.33MB | 12.4MB |
| 2-Photos | Date: 12/26/17 | 1/10,307 | 10/2,621 | 0.45sec | 0.38sec | 0.41sec | 5.58sec | 123KB | 1.51MB |
| 3-Photos | Location: Within 10 miles* | 418/10,307 | 86/2,621 | 1.21sec | 0.65sec | 42m64sec | 5m37sec | 822MB | 183MB |
| 4-Videos | Date: 09/1/17 - 01/31/18 | 34/178 | 61/109 | 1.20sec | 3.02sec | 51m11sec | 13m6sec | 1.38GB | 446MB |
| 5-Videos | Date: 08/31/17 | 1/178 | 1/109 | 0.48sec | 0.44sec | 72.55sec | 6.93sec | 42.9MB | 6.47MB |
| 6-Videos | Location: Within 10 miles* | - | 9/109 | - | 13.35sec | - | 4m28sec | - | 135MB |
| 7-Videos | Location: Current Location* | 4/178 | - | 0.2sec | - | 17m2sec | - | 405MB | - |
| 8-Contacts | Name: “Puppy” | 3/1102 | - | 2.57sec | - | 0.6ms | - | - | - |
| 9-Contacts | Name: “Robert” | - | 1/6 | - | 2.06sec | - | 0.8ms | - | - |
| 10-Contacts | Number: +*(***)***-*** | 1/1102 | 1/6 | 0.12sec | 0.02sec | 0.8ms | 0.8ms | - | - |
| 11-Calendar | Date: 01/01/18 - 01/15/18 | 19/148 | 1/46 | 0.14sec | 0.05sec | 0.6ms | 0.7ms | - | - |
| 12-Photos, Videos | Date: 08/30/17 - 09/15/17, Location: Any | 91/10,307 (P), 1/178 (V) | 149/2,621 (P), 5/109 (V) | 0.7sec | 0.56sec | 4m1sec | 2m23sec | 236MB | 81.9MB |
| 13-Photos, Videos | Date: 08/31/17, Location: Within 50 miles | 9/10,307 (P), 1/178 (V) | 0/2,621 (P), 1/109 (V) | 0.73sec | 0.35sec | 1m01sec | 12.61sec | 51MB | 6.47MB |
| 14-Videos | Date: Last Week, Location: Within 10 miles | 3/178 (V) | 4/5 (V) | 0.4sec | 0.41sec | 1m2sec | 1m19sec | 47MB | 44.6MB |
Table V shows results of a series of experiments (using only Dev-III) for backup based
metadata filtering for messages and call logs. For these categories of data for iPhones, as
previously discussed, we must use extraction via iTunes on the TDES Manager and hence there
is no export time. Note however that the filtering is still specified by the investigator on the
TDES App. The accuracy of the metadata filtering is again 100%, as expected, based on manual
analysis using iTunes.
Table VI shows results for a series of experiments (using only Dev-II) for various combinations
of metadata and content based filtering for Photos. There were 2,621 photos on the device, with
109 photos in the date range 12/25/17 to 12/29/17 and 72 of these taken on 12/25/17. We used
the Inception-v3 model for content filtering of items shown in the third column. In rows 1-3 of the
table we focused on weapons. Since none of the weapon photos were taken within the specified
location (metadata filter within 10 miles) of the phone, content filtering in the 2nd row was not
applied. In rows 4-9 we focused on a variety of items that could be relevant for law enforcement.
The confusion matrix (columns TP, FN, FP, TN) for each row shows retrieval results for this
series of experiments. The “Accuracy Measure” column, computed as (TP+TN)/(TP+FN+FP+TN),
summarizes how well the Inception-v3 model does for the content filtering. Note that one of our future goals
is to create and train our own neural network models for law enforcement use. We also show
the time to display and export the extracted photos in the relevant columns.
B. Android
Table VII shows the results for a series of experiments (we used Dev-IV and Dev-V) for
on-device metadata based filtering for Android phones. We illustrate the results for a variety
of combinations of data categories and meta filters. As for the experiments for iPhones, each
experiment (numbered from 1 to 18) shows the categories of data and the meta filters used in
columns one and two. Note that the display and export times are quite reasonable. For example,
in experiment 18 for Dev-V, device artifacts totaling 236 MB took 18.2 seconds to export.
We next show a series of experiments for extracting photos using both metadata filtering
and content based filtering, shown in Table VIII. We used Dev-IV and the MobileNet model
from TensorFlowLite. The confusion matrices and accuracy measures are shown as well as the
display and export times. Again, the results are good but it is clear that it will be useful to
explore developing specific ML models for law enforcement targets.
TABLE VII: On-device metadata based filtering on Android phones

| Category | Meta-Filter | Dev-IV Artifacts | Dev-V Artifacts | Display Time-IV (sec) | Display Time-V (sec) | Export Time-IV (sec) | Export Time-V (sec) | Size-IV | Size-V |
|---|---|---|---|---|---|---|---|---|---|
| 1-Photos | Date: 04/07/17 - 02/05/18 | 100/100 | 101/191 | 2.06 | 0.83 | 12.03 | 13.13 | 10.3 MB | 12.3 MB |
| 2-Photos | Date: 02/03/18 - 02/05/18 | 22/100 | 2/191 | 0.86 | 0.31 | 4.65 | 2.32 | 9.50 MB | 6.56 MB |
| 3-Photos | Location: Current Location | 4/100 | 1/191 | 0.56 | 0.63 | 2.5 | 10.58 | 4.50 MB | 46.1 MB |
| 4-Videos | Date: 12/19/17 - 02/03/18 | 1/6 | 3/7 | 0.45 | 0.89 | 1.08 | 3.91 | 9.45 MB | 16.6 MB |
| 5-Videos | Location: Current Location | 6/6 | 7/7 | 1.69 | 0.90 | 10.76 | 13.09 | 144 MB | 190 MB |
| 6-Calendar | Date: 05/29/17 - 05/30/17 | 1/17 | 85/780 | 0.62 | 1.03 | 1.02 | 2.40 | 1 KB | 13 KB |
| 7-Calendar | Date: 06/01/17 - 06/27/17 | 5/17 | 1/780 | 0.83 | 1.85 | 1.25 | 1.03 | 4 KB | 2 KB |
| 8-Messages | Date: 08/01/17 - 09/20/17 | 37/37 | 987/25,420 | 0.69 | 1.23 | 3.46 | 13.34 | 8 KB | 16 KB |
| 9-Messages | Name: aaabb | 14/37 | 32/25,420 | 1.02 | 1.23 | 3.56 | 14.23 | 5 KB | 7 KB |
| 11-Messages | Number: +*(***)***-*** | 1/37 | 5/25,420 | 0.42 | 0.92 | 1.02 | 1.25 | 2 KB | 4 KB |
| 12-Call Logs | Date: 08/07/17 - 08/08/17 | 4/7 | 8/429 | 0.22 | 0.23 | 1.28 | 3.20 | 4 KB | 5 KB |
| 13-Call Logs | Name: aaabb | 1/7 | 9/429 | 0.22 | 0.49 | 1.02 | 6.59 | 2 KB | 5 KB |
| 14-Call Logs | Number: +*(***)***-*** | 2/7 | 11/429 | 0.47 | 0.89 | 1.99 | 11.2 | 2 KB | 6 KB |
| 15-Messages, Photos, Videos | Messages: Number: +*(***)***-***; Photos: Date: 01/28/18 - 02/05/18; Videos: Location: Current Location | 4/37 (M), 23/100 (P), 6/6 (V) | 100/25,420 (M), 6/191 (P), 7/7 (V) | 1.02 | 1.25 | 11.02 | 14.08 | 154.8 MB | 199.1 MB |
| 16-Photos, Videos | Date: 12/20/17 - 01/16/18 (both) | 2/100 (P), 1/6 (V) | 5/191 (P), 2/7 (V) | 0.56 | 0.98 | 5.02 | 9.68 | 58.89 MB | 94.02 MB |
| 17-Messages, Call Logs | Messages: Date: 12/12/17 - 02/05/18; Call Logs: Number: +*(***)***-*** | 1/37 (M), 1/7 (CL) | 1000/25,420 (M), 8/429 (CL) | 0.44 | 1.02 | 0.98 | 3.89 | 3 KB | 258 KB |
| 18-Messages, Calendar, Photos, Videos | Messages, Calendar: Date: 09/12/17 - 09/29/17; Photos, Videos: Location: Current Location | 15/37 (M), 2/17 (C), 4/100 (P), 6/6 (V) | 300/25,420 (M), 5/780 (C), 1/191 (P), 7/7 (V) | 1.89 | 1.65 | 10.2 | 18.2 | 148.5 MB | 236.1 MB |
As mentioned before, for comparison purposes we compared our system to two commercial
tools, Paraben [35] and Magnet Axiom [26]. We used Android Dev-VI and iPhone Dev-II for
the experiments. The results are shown in Table IX. Note that in the table, App Installation Time
(AIT) denotes the time it takes from the instant the device (iOS or Android) is connected to the
Laptop running the extraction tool to the instant a choice can be made for data selection. Also
note that for the TDES application, the exported data was recorded on a flash drive running the
TDES manager whereas for both Paraben and Magnet Axiom, the exported data was stored on
the hard drive of the laptop, which would also account for a difference in the times.
For Android TDES App the installation time was 14 seconds. Note that for Android based
phones Magnet Axiom uses backup based acquisition and hence for extraction of any artifact
the user needs to spend this time creating the backup. Thus, for example, to extract
the Call Logs, first the backup needs to be created, which takes 29 minutes, and then the
Call Logs can be obtained from the backup in 1 minute 17 seconds. For the TDES App the
corresponding time is 14 seconds for the TDES App installation and 1 second for the export. For
Paraben the corresponding time is 5 seconds for the initialization and 40 seconds for the export.
For both Paraben and Magnet Axiom the only choices available for acquisition are the broad
categories of data as shown in Table IX. Paraben does not have the option of extracting Photos
and Videos separately but rather has an option for extracting all media artifacts. However, in our
experiments we observed that selecting this option resulted in extraction of only the metadata
for the media artifacts and not the artifacts themselves. For Magnet Axiom there is a separate
option of exporting Video Artifacts, which is a preview of each video as a PNG file.
For iOS TDES App the installation time was 52 seconds. For iOS extraction both Paraben and
Magnet Axiom first create a backup before any data export can be done. However for Paraben
the backup creation happens with the application initialization (10 minutes) whereas for Magnet
Axiom there is a separate backup creation step that takes 38 minutes 54 seconds after 9 minutes
of initialization. Thus, for example, when extracting Contacts and Calendar data, the TDES App takes
52 seconds for initialization and 3.8 milliseconds for the data export for both categories of data.
However, the corresponding times are 10 minutes for initialization and 4 seconds for export for
Paraben, and 47 minutes 54 seconds for initialization and 0.6 seconds for data export for
Magnet Axiom. For the TDES App, backup acquisition is only required when extracting Call Logs
or Messages.
VI. CONCLUSIONS

In this paper, we have described the design and development of a system that can do targeted
data extraction from smartphones based on both metadata filtering and content based filtering.
Our novelty is not only what we proposed to do but also how we were able to make it work
effectively for both iOS and Android based devices.
Our current work assumes that the phone is voluntarily provided to law enforcement; however, much of it does not actually require this voluntary aspect. For example, a court order might compel an individual to allow law enforcement to obtain selected types of data from the phone and to provide the passcode. Our work would be incomplete in that setting because the user might have deleted data before handing over the phone, and we would need to jailbreak/root the device to recover deleted data where possible. Another application arises when a phone is seized from a non-cooperative party without the passcode being known but a memory dump can be performed; if the file system is intact, our system could be adapted to obtain selected information. We plan to explore these scenarios in future work.
We have demonstrated the value of ML models for selective data extraction. So far we have used open-source models adapted to our needs. In future work we intend to explore retraining existing models and developing models specific to the needs of law enforcement, trained on either newly created or existing data sets. We also propose to extend our ML techniques with natural language processing for selective extraction of data from chat applications.
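One plausible route to such law-enforcement-specific models is transfer learning: freeze a pretrained MobileNet backbone and retrain a small classification head on a labeled forensic data set. The Keras sketch below shows one way this might look; the class count, directory layout, and hyperparameters are assumptions made purely for illustration.

# Hedged sketch of retraining MobileNet for custom forensic categories.
# The class count, directory layout, and hyperparameters are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import MobileNet

NUM_CLASSES = 5  # hypothetical number of forensic categories

# Frozen pretrained backbone; only the new head is trained.
base = MobileNet(weights="imagenet", include_top=False,
                 input_shape=(224, 224, 3))
base.trainable = False

inputs = layers.Input(shape=(224, 224, 3))
x = tf.keras.applications.mobilenet.preprocess_input(inputs)
x = base(x, training=False)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dropout(0.2)(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)
model = Model(inputs, outputs)

model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# "forensic_images/" with one subdirectory per class is an assumed layout.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "forensic_images/", image_size=(224, 224), batch_size=32)
model.fit(train_ds, epochs=5)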
At the current time we have fully functioning prototypes for iPhones and Android phones.
We plan to provide these to law enforcement for testing and feedback before adding additional
features and capabilities to our TDES system.
APPENDIX

REFERENCES
[1] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. TensorFlow: A system for large-scale machine learning. In OSDI, volume 16, pages 265–283, 2016.
[2] AccessData. AccessData mobile solutions. https://accessdata.com/products-services/mobile-solutions#, 2018. Accessed: 2018-02-06.
[3] Noora Al Mutawa, Ibrahim Baggili, and Andrew Marrington. Forensic analysis of social
networking applications on mobile devices. Digital Investigation, 9:S24–S33, 2012.
[4] Asou Aminnezhad, Ali Dehghantanha, and Mohd Taufik Abdullah. A survey on privacy
issues in digital forensics. International Journal of Cyber-Security and Digital Forensics
(IJCSDF), 1(4):311–323, 2012.
[5] Cosimo Anglano. Forensic analysis of WhatsApp Messenger on Android smartphones. Digital Investigation, 11(3):201–213, 2014.
[6] Apple. Core ML: Apple developer documentation. https://developer.apple.com/documentation/coreml, 2017.
[7] Apple. White paper on iOS security. Technical report, 2018.
[8] AutoHotkey. https://autohotkey.com/, 2018. Accessed: 2018-02-07.
[9] James Bergstra, Frédéric Bastien, Olivier Breuleux, Pascal Lamblin, Razvan Pascanu,
Olivier Delalleau, Guillaume Desjardins, David Warde-Farley, Ian Goodfellow, Arnaud
Bergeron, et al. Theano: Deep learning on GPUs with Python. In NIPS 2011, BigLearning
Workshop, Granada, Spain, volume 3. Citeseer, 2011.
[10] BlackBag. BlackBag Mobilyze. https://www.blackbagtech.com/software-products/mobilyze.html, 2018. Accessed: 2018-02-06.
[11] Gary Cantrell, David Dampier, Yoginder S Dandass, Nan Niu, and Chris Bogen. Research
toward a partially-automated, and crime specific digital triage process model. Computer
and Information Science, 5(2):29, 2012.
[12] Cellebrite. Cellebrite. http://www.cellebrite.com/Mobile-Forensics, 2018. Accessed: 2016-
04-10.
[13] Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi
Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using
RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078, 2014.
[14] François Chollet et al. Keras. https://github.com/fchollet/keras, 2015.
[15] Douglas Crockford. The application/json media type for JavaScript Object Notation (JSON). RFC 4627, 2006.
[16] Cydia. Cydia for iOS. https://www.cydiaios7.com/, 2018.
[17] Google. Android developer manual. https://developer.android.com/about/versions/oreo/
index.html, 2017.
[18] Andrew G Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias
Weyand, Marco Andreetto, and Hartwig Adam. MobileNets: Efficient convolutional neural
networks for mobile vision applications. arXiv preprint arXiv:1704.04861, 2017.
[19] Mohammad Iftekhar Husain, Ibrahim Baggili, and Ramalingam Sridhar. A simple cost-
effective framework for iPhone forensic analysis. In Digital Forensics and Cyber Crime,
pages 27–37. Springer, 2010.
[20] Rick Ayers, Sam Brothers, and Wayne Jansen. Guidelines on mobile device forensics. NIST Special Publication 800-101, 2014.
[21] Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross
Girshick, Sergio Guadarrama, and Trevor Darrell. Caffe: Convolutional architecture for
fast feature embedding. In Proceedings of the 22nd ACM international conference on
Multimedia, pages 675–678. ACM, 2014.
[22] Suleman Khan, Abdullah Gani, Ainuddin Wahid Abdul Wahab, Muhammad Shiraz, and
Iftikhar Ahmad. Network forensics: Review, taxonomy, and open challenges. Journal of
Network and Computer Applications, 2016.
[23] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep
convolutional neural networks. In Advances in neural information processing systems, pages
1097–1105, 2012.
[24] D Lawton, R Stacey, and G Dodd. E-discovery in digital forensic investigations. Technical report, CAST publication, 2014.
[25] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444,
2015.
[26] Magnet. Magnet acquire. https://www.magnetforensics.com/magnet-acquire/, 2018. Ac-
cessed: 2018-02-06.
[27] J Mahadeokar et al. Open NSFW. https://github.com/yahoo/open_nsfw, 2017.
[28] Aditya Mahajan, MS Dahiya, and HP Sanghvi. Forensic analysis of instant messenger applications on Android devices. arXiv preprint arXiv:1304.4915, 2013.
[29] Stefan Maus, Hans Höfken, and Marko Schuba. Forensic analysis of geodata in Android smartphones. In International Conference on Cybercrime, Security and Digital Forensics, http://www.schuba.fh-aachen.de/papers/11-cyberforensics.pdf, 2011.
[30] Sean Morrissey and Tony Campbell. iOS Forensic Analysis: for iPhone, iPad, and iPod
touch. Apress, 2011.
[31] MSAB. https://www.msab.com/, 2018. Accessed: 2018-02-06.
[32] Kevin P Murphy. Machine learning: a probabilistic perspective. MIT press, 2012.
[33] NIST. NSRL, 2017. URL https://www.nist.gov/software-quality-group/national-software-reference-library-nsrl. Accessed: 2018-03-28.
[34] Oxygen Forensics. https://www.oxygen-forensic.com/en/, 2018. Accessed: 2018-02-06.
[35] Paraben. https://shop.paraben.com/index.php?id_product=125&controller=product, 2018. Accessed: 2018-02-06.
[36] Darren Quick and Mohammed Alzaabi. Forensic analysis of the Android file system YAFFS2, 2011.
[37] Vassil Roussev, Candice Quates, and Robert Martell. Real-time digital forensics and triage.
Digital Investigation, 10(2):158–167, 2013.
[38] Vassil Roussev, Andres Barreto, and Irfan Ahmed. Forensic acquisition of cloud drives.
arXiv preprint arXiv:1603.06542, 2016.
[39] Keyun Ruan, Joe Carthy, Tahar Kechadi, and Mark Crosbie. Cloud forensics. In IFIP
International Conference on Digital Forensics, pages 35–46. Springer, 2011.
[40] Nathan Scrivens and Xiaodong Lin. Android digital forensics: data, extraction and analysis.
In Proceedings of the ACM Turing 50th Celebration Conference-China, page 26. ACM,
2017.
[41] Amit Shekhar. Android TensorFlow machine learning example. https://blog.mindorks.com/,
2017.
[42] Yaron Steinbuch and Joe Tacopino. Woman records horrific scene after boyfriend is fatally
shot by police, 2016. URL http://nypost.com/2016/07/07/woman-live-streams-bloody-
aftermath-of-police-involved-shooting/. Accessed: 2017-09-25.
[43] Pasquale Stirparo and Ioannis Kounelis. The mobileak project: Forensics methodology for
mobile application privacy assessment. In Internet Technology And Secured Transactions,