Netdroid: Summarizing Network Behavior of Android Apps For Network Code Maintenance
Abstract: Network access is one of the most common features of Android applications. Statistics show that almost 80% of Android apps ask for network permission and thus may have some network-related features. Android apps may access multiple servers to retrieve or post various types of data, and the code that handles such network features often needs to change as a result of server API evolution or changes in the content of the transferred data. Since network code is used by multiple features, maintenance of network-related code is often difficult because the code may be scattered across the code base, and it may not be easy to predict the impact of a code change on the network behavior of an Android app. In this paper, we present an approach to statically summarize network behavior from the byte code of Android apps. Our approach is based on string taint analysis, and generates a summary of network requests by statically estimating the possible values of network API arguments. To evaluate our technique, we applied it to the top 500 Android apps from the official Google Play market, and the results show that our approach is able to summarize network behavior for most apps efficiently (on average less than 50 seconds per app). Furthermore, we performed an empirical evaluation on 8 real-world maintenance tasks extracted from bug reports of open-source Android projects on Github. The empirical evaluation shows that our technique is effective in locating relevant network code.

I. INTRODUCTION

In this network era, most software accesses remote servers for various reasons, and mobile apps, such as Android apps, often use the network extensively due to the limited computation power of mobile devices and the need to access real-time data. For example, travel apps such as Orbitz¹ and Expedia² need to request availability information of hotel rooms and air tickets; messaging apps such as SnapChat³ and Facebook Messenger⁴ need to exchange text / multi-media messages through network servers; even simple flashlight apps such as Tiny Flashlight⁵ collect user data and send them to remote servers for usage-pattern study and advertisement. Indeed, statistics on Android permissions [35] show that the network permission is the most popular permission among Android apps, and about 80% of Android apps request it.

¹ https://play.google.com/store/apps/details?id=com.orbitz
² https://play.google.com/store/apps/details?id=com.expedia.bookings
³ https://play.google.com/store/apps/details?id=com.snapchat.android
⁴ https://play.google.com/store/apps/details?id=com.facebook.orca
⁵ https://play.google.com/store/apps/details?id=com.devuni.flashlight

When Android apps evolve, the maintenance of network-related code is often an important task in the maintenance process. First of all, due to various app features related to the network, a substantial portion of code in Android apps is often about sending network requests and processing network responses. Furthermore, as client-side code, an Android application and the server-side code on the remote servers form a whole system. However, the two portions of code often do not evolve simultaneously. It is common that Android app code and server-side code are maintained by different development groups, especially when the server-side code also serves web user interfaces. In many other cases, an Android app may use various third-party web services (e.g., Google and Facebook services for related features, Admob services for advertisement), and the evolution of third-party services is usually out of the control of Android app developers. Client developers often use mocking techniques [20] to keep some control over these dependencies during the development phase, but they eventually need to adapt their code to accommodate changes in third-party services.

The maintenance of network-related code in Android apps is often tedious and error-prone for two major reasons. First of all, since many software features in Android apps require interaction with the network, network-related code is often scattered across different components, and thus it is difficult to locate the code to be revised. Second, for better flexibility and re-usability of code, developers often have to dynamically concatenate constant strings and user inputs to generate a network request. The generation process often involves complicated string operations and invocations of network-related APIs. In such a scenario, developers may not have an intuitive understanding of the network requests generated by the code, or of how their changes may affect the network behavior of the app.

To sum up, maintenance of network code is an important task in the evolution of Android apps, and developers can benefit a lot from techniques that summarize the network behavior of Android apps as more intuitive models, and from techniques that provide traceability from these models to the code base. In this paper, we propose a novel fully-automatic approach to generate such models (we refer to them as traceable network summaries) for an Android app. In our approach, the traceable network summary of an app describes all possible network requests with a sequence of string constants extracted from the source code. Examples of such summaries are presented at the end of Section II. All the string constants in the signatures have their

⁶ It should be noted that although the detailed design and implementation of our approach is for Android apps, the basic idea of our approach may be applicable to other mobile apps or even PC applications.
⁷ http://java.decompiler.free.fr/. We do not decompile the Jar file but directly use the Java byte code in our approach; we decompile the code here only for ease of understanding.
    10 localObject7 = ((StringBuilder)localObject7).toString();
    11 localStringBuilder.<init>((String)localObject7);
    12 HttpGet localHttpGet = new org/apache/http/client/methods/HttpGet;
    13 localObject7 = localStringBuilder.toString();
    14 localObject1 = localHttpGet;
    15 localObject2 = localObject7;
    16 ((HttpGet)localObject1).<init>((String)localObject2);

The second code sample is from Y.class⁸ of the Youtube app. The following code generates a URI, which is later packaged and sent to the Internet.

    ...
    1 Object localObject1 = new android/net/Uri$Builder;
    2 ((Uri.Builder)localObject1).<init>();
    3 String str1 = "http";
    4 localObject1 = ((Uri.Builder)localObject1).scheme(str1);
    5 str1 = "gdata.youtube.com";
    6 localObject1 = ((Uri.Builder)localObject1).authority(str1);
    7 str1 = "feeds";
    9 localObject1 = ((Uri.Builder)localObject1).appendPath(str1);
    10 str1 = "api";
    11 localObject1 = ((Uri.Builder)localObject1).appendPath(str1);
    12 localObject1 = ((Uri.Builder)localObject1).build();
    13 b = (Uri)localObject1;
    ...

⁸ The class name is renamed due to obfuscation in the standard Android build process.

From the two code samples above, we have the following observations. First of all, developers do manipulate strings in the code to generate the content of network requests. Second, the second code sample shows that some of the packet generation APIs (e.g., Uri.Builder) can be more complex than a method that takes a single string argument (e.g., ((HttpGet)g).<init>("...");). In particular, when such API methods are used to generate the content of network requests, an object takes multiple string arguments in multiple method invocations, and generates the content by concatenating the arguments. Based on the above observations, we choose to leverage string analysis as the basic technique in our approach. Furthermore, we propose techniques to handle various complex request-generation APIs. For the above examples, our approach is able to generate two constant-string sequences (as a part of the summaries of the two apps) as below:

    Sequence 1:
    Host:ggtrack.org
    START -> http://ggtrack.org/SM1c?device_id= -> &adv_sub -> END

    Sequence 2:
    Host:gdata.youtube.com
    START -> http:// -> gdata.youtube.com -> /feeds -> /api -> END

III. APPROACH

The overview of our approach is presented in Figure 1. From the figure, we can see that the input of our approach is an apk file and a prepared list of network API methods with their grammar templates. The output of our approach is a network summary, which is in the form of a set of string constant sequences. As mentioned in the introduction, the 5 major components of our approach are Dex2Jar, network invocation handling, string taint analysis, grammar combination, and summary generation. In this section, we will first introduce the representation of traceable network summaries, and then describe in detail the design of the last 4 components.

Fig. 1. The overview of our approach

A. Representation of Network Summaries

In our paper, a traceable network summary is represented as a set of constant string sequences. Each constant string sequence has two reserved constants, “START” and “END”, as its beginning and ending constants, and is in the form below.

$$\mathrm{START} \rightarrow C_1 \rightarrow \dots \rightarrow C_i \rightarrow \dots \rightarrow C_n \rightarrow \mathrm{END} \tag{1}$$

It should be noted that, in our approach, we use such a sequence to represent all network requests that contain $C_1, \dots, C_i, \dots, C_n$ in sequence. In other words, if the alphabet of characters in all network requests is $\Sigma$, the sequence in Formula 1 represents all network requests in the form below.

$$\Sigma^* C_1 \Sigma^* \dots \Sigma^* C_i \Sigma^* \dots \Sigma^* C_n \Sigma^* \tag{2}$$

Compared with the arbitrary automata used in existing string (taint) analyses [5] for value summarization, the restricted form in Formula 1 has better readability for developers, because the former often consist of hundreds of states and transitions (as illustrated in TransVis [32]), while the latter is bounded by the number of constants defined in the program. Furthermore, the summary of an app is the union of all sequences. Therefore, it is possible to map one or several sequences to a certain RESTful API at the server side. For multiple sequences with common prefixes, it is straightforward to combine them as a tree.

TABLE I
EXAMPLES OF NETWORK API METHODS IN ANDROID SYSTEM

    API                                                              Type
    android.net.Uri$Builder: android.net.Uri$Builder scheme          combining
    android.net.Uri$Builder: android.net.Uri$Builder authority       combining
    android.net.Uri$Builder: android.net.Uri$Builder appendPath      combining
    java.io.OutputStream: void write                                 conditional
    java.io.ObjectOutputStream: void writeObject                     conditional
    java.io.ObjectOutputStream: void writeChars                      conditional
    java.io.ObjectOutputStream: void writeUTF                        conditional
    org.apache.http.client.methods.HttpGet: void <init>              simple
    org.apache.http.client.methods.HttpPost: void <init>             simple
    org.apache.http.client.methods.HttpPut: void <init>              simple
    org.apache.http.client.methods.HttpDelete: void <init>           simple
    org.apache.http.client.methods.BasicHttpRequest: void <init>     simple
    ...                                                              ...

B. Handling Network-API-Method Invocations

In this subsection, we introduce how we handle network-API-method invocations. Our work includes two parts. The first part is building a list of network API specifications that are able to model the semantics of these APIs (i.e., modeling how they generate the network-packet contents by concatenating their arguments). We use API grammar templates to specify the semantics of each API. An API grammar template is a context-free grammar with parameters. The parameters represent the parameters of the API, the start variable of the grammar template represents the generated content of the network request, and the productions in the template model the semantics of the API method. For example, the API grammar template for the API method “java.net.URL(String protocol, String host, String path)” is shown below.

    S0 -> S1 S2
    S1 -> <protocol> "://"
    S2 -> <host> <path>

The grammar shows that, in the generated URL, the value of host will appear before path, and the constant string “://” will be added to format the request. When NetDroid locates an invocation of the API, it will replace “protocol”, “host” and
“path” in the grammar with their corresponding real arguments in the API invocation.

To generate the list of API grammar templates, we need to manually study the possible network libraries in the Android system. Since we focus on HTTP requests in this paper, we consider only the network API methods that may generate HTTP requests. In the Android system, there are three sources of network API methods: Java network libraries, Apache network libraries, and Android network libraries. It should be noted that, for each Android app, the Android system requires all its required third-party libraries to be packaged within the app. This design decision is made to help separate apps at runtime into sandboxes for better system security. It also means that it is sufficient for us to collect the network API methods in the Android system, because the byte code of all third-party network libraries must be packaged within the apk and can be directly analyzed.

Given the list of network API methods and their grammar templates, our network-API-method handling component locates network API invocations, and binds arguments to parameters in the grammar templates. For each network API method, we refer to the grammar template with bound arguments as an API grammar segment. When we locate an invocation of the API, such as URL u = new URL("http", str1, str2), we generate an API grammar segment for the invocation as below. In the API grammar segment, the parameters are replaced with arguments, so that we can later calculate the context-free grammars g1 and g2 for the arguments str1 and str2 with string taint analysis, and combine the API grammar segment with g1 and g2. In the grammar segment, we use <:str1> to represent an argument str1.

    S0 -> S1 S2
    S1 -> "http" "://"
    S2 -> <:str1> <:str2>

The API grammar segment can handle simple network API methods that take all arguments at one time. However, as we showed in Section II, there are some more complicated network API methods. We divide these API methods into two categories: combining network APIs and conditional network APIs. Combining network APIs are those network APIs that may generate a network packet by concatenating multiple arguments from multiple invocation sites. For example, in our second code sample in Section II, the class Uri.Builder can generate a URI by sequentially invoking three methods: scheme(String str), authority(String str), and appendPath(String str). Then a URI is generated based on the arguments of all the method invocations. Conditional network APIs are usually general IO API methods, and whether their arguments will be sent as the content of a network request depends on the code context. For example, the method write(byte[] bytes) defined in the class java.io.OutputStream can be used to send data to either the network or the file system. Whether the method sends data to the network or to the file system depends on whether the OutputStream belongs to a socket or a file. In our work, we manually examined the API documentation⁹ of the Android SDK, and identified the network API methods based on whether at least one of their parameters is sent to the network. Table I lists some popular API methods of each type, and we introduce how we handle these two types of complicated network API methods in Section III-B1 and Section III-B2, respectively.

⁹ https://developer.android.com/reference/packages.html

1) Combining Network API Methods: Combining network API methods are typically a group of methods defined in a request-generating class (e.g., Uri.Builder). An instance (object) of the class, after being initialized, will call methods in this group one or more times to acquire all the data to be put into a network request. Therefore, to handle combining network API methods, we need to trace the life cycle of request-generating instances, and build an API grammar segment for the instance as a whole. Consequently, the API grammar templates for combining network API methods need to consider the current state of the request-generating object. We present the API grammar templates of the methods in Uri.Builder below. In the templates, the parameter <head> denotes the start variable of the current grammar, and represents the current state of the packet-generating object. In the grammar, S[i] denotes a new nonterminal that is different from all existing nonterminals.

    <init>:
        S0 -> ""
    scheme(String scheme):
        S1 -> <head> <scheme> "://"
    authority(String authority):
        S2 -> <head> <authority>
    appendPath(String path):
        S[i] -> <head> "/" <path>

Algorithm 1 shows the worklist-based algorithm we use to concatenate grammar segments of combining network API methods.

    Algorithm 1 Concatenation of Grammar Segments for Combining Network API Methods
    Require:
        G is the inter-procedure control flow graph
        M is a map from network-API-methods to grammar templates
        start is an initialization statement of a network-request-generating instance
    Ensure:
        S is the concatenated grammar segment for start
     1: worklist ← ∅
     2: worklist.enqueue(start)
     3: while worklist ≠ ∅ do
     4:     current ← worklist.pop()
     5:     S′ ← S ∪ M.get(current)
     6:     if S′ ≠ S then
     7:         S ← S′
     8:         suc ← G.successors(current)
     9:         worklist.enqueueAll(suc)
    10:     end if
    11: end while

During the analysis, we first locate all initializations of the request-generating objects (e.g., the invocation of Uri.Builder.<init> at Line 2 of code sample 2 in Section II). Then we use the initialization statement as the starting node for our algorithm. We first generate an API grammar segment for it according to the API grammar template. For example, we generate S0->"" for Uri.Builder.<init>. Then, we leverage standard context-sensitive inter-procedure data flow analysis [24] to trace the generated object obj. Along the data flow, when obj invokes a combining network API method (e.g., scheme(String scheme)), we merge the API grammar segment of the combining network API invocation with the current grammar segment of obj. As an example, for code sample 2 in Section II, we will have the following API grammar segment.

    S0 -> ""
    S1 -> S0 <:str1-3> "://"
    S2 -> S1 <:str1-5>
    S3 -> S2 "/" <:str1-7>
    S4 -> S3 "/" <:str1-10>

When there are branches, there can be multiple start variables for obj because we add start variables (i.e., S[i]) along all paths. In this case, we add a final start variable into the grammar segment of obj that derives all the current start variables of the grammars. Furthermore, we add a new non-terminal for each method invocation at a different location, so the same non-terminal will be added when a method invocation is analyzed for the second time, and the analysis will converge even with loops in the data flow. As an example, when our analysis goes through the following code sample, it will add S0->"" to the grammar segment at Line 1, and add S1->S0 <:str1> to the grammar segment the first time it goes through Line 3. The second time it goes through Line 3, S1->S1 <:str1> will be added, because S1 is the current <head>, and the new non-terminal will still be S1. The analysis then reaches a fixed point the third time Line 3 is processed.

    1 ((Uri.Builder)localObject1).<init>();
    2 while(...)
    3     localObject1 = ((Uri.Builder)localObject1).appendPath(str1);
    4

2) Conditional Network API Methods: To handle conditional network API methods, we need to determine whether the object that invokes a conditional network API method is under a network-related context (e.g., determine whether an output stream belongs to a socket). The API grammar
templates for such APIs are not special, but we must correctly differentiate the network-API-method invocations that are under a network-related context from the invocations that are not. NetDroid leverages data dependence analysis on the objects that invoke conditional network APIs. Specifically, NetDroid checks whether the data flow of the object reaches any point in the byte code where the object is related to another indication API (which indicates a network-related context). For example, when an OutputStream object ob has a data dependency with an invocation of the API java.net.Socket: getOutputStream(), NetDroid will determine that all the conditional network API invocations of ob are real network API invocations. In the network API handling step, NetDroid considers only the located real network API invocations, and ignores all other conditional network API invocations.

C. Apply String Taint Analysis

The third component of NetDroid uses string taint analysis to estimate the possible values of all the network arguments in the located network API invocations. String taint analysis [32] [30] is able to estimate the possible values of a given string variable in the code, and trace values back to their origins in the code. By analyzing the data flow of string variables and string concatenations, for a given string variable v, string taint analysis is able to generate a context-free grammar, whose language represents the possible values of v, and whose code-location attributes on the terminals record the origin of values.

After string taint analysis is applied, NetDroid is able to generate a context-free grammar for each argument of each network-API-method invocation. As an example, for the argument localObject2 in Line 16 of code sample 1, after this step, we can generate a grammar for it as below. The language of this grammar represents the possible values of localObject2. In the grammar, “<???>” denotes any string, because the phone number is read from the phone storage, and string taint analysis is not able to estimate it.

    S0 -> "http://ggtrack.org/SM1c?device_id="
    S1 -> S0 <???>
    S2 -> S1 "&adv_sub"
    S3 -> S2 <???>

D. Grammar Combination

The grammar combination component, for each network API invocation inv, combines the API grammar segment of inv with the grammar of each argument of inv. This process is straightforward: we replace the arguments in the API grammar segment of inv with the start variables of the grammar of each argument. For example, in code sample 2, we can combine the API grammar segment shown in Section III-B1 with four simple grammars generated by string taint analysis, and get the combined grammar as below.

    S0 -> ""
    S1 -> S0 S1-3 "://"
    S2 -> S1 S1-5
    S3 -> S2 "/" S1-7
    S4 -> S3 "/" S1-10
    S1-3 -> "http"
    S1-5 -> "gdata.youtube.com"
    S1-7 -> "feeds"
    S1-10 -> "api"

E. Extracting Traceable Network Summaries

Finally, we need to extract the network summaries from the set of combined grammars generated by the grammar combination component. To generate signatures of constant string sequences, we enumerate all the limited derivation trees of the grammar (i.e., we derive only once for recursive nonterminals). Then, for each derivation tree, we generate a sequence of constant strings by ignoring the terminals which are not constant strings (i.e., “<???>”). It should be noted that, according to our definition in Section III-A, ignoring the non-constant strings and deriving only once for recursive nonterminals generates a conservative approximation of the original grammar.

Using the approach above, we can generate a set of constant-string sequences from each combined grammar. Then, we merge all these sets, and compare the constant-string sequences to remove duplicates. If we can find common prefixes (e.g., host names), we merge all the constant-string sequences with a common prefix into a tree for better presentation. Figure 2 shows part of a prefix tree summary generated from the Flister app. This part of the summary shows that the app is accessing 3 domains, and for each domain, the summary provides the paths and parameter templates used. Thus, if the parameter names or paths are changed, it would be easy for Flister developers to find what needs to be changed.

F. Trace from Summaries to Code

After a summary is generated, since all the string constants involved in the summary have their code locations recorded, it is straightforward to trace from the constant strings in the summary to code locations. When the network request format needs to be changed due to server-side code changes or API changes, developers can simply trace from the string constants in the affected string-constant sequences.

IV. EVALUATION

To evaluate our approach, we implemented it as a prototype called NetDroid¹⁰ based on the Soot framework [28], and carried out an experiment on real-world apps from the Android market.

¹⁰ Available at http://xywang.100871.net/netdroid

A. Research Questions

To evaluate the effectiveness of our approach, we first need to study whether our approach is applicable and efficient on real-world apps. Then we need to evaluate the quality of the summaries generated by our approach. Specifically, we should evaluate the quality of the generated summaries in real-world maintenance tasks of network code. Therefore, we try to answer the following three research questions.

• RQ1: Is our approach robust and efficient enough to handle most Android projects?
• RQ2: Is our approach able to generate meaningful network summaries?
• RQ3: How effective are our generated network summaries in helping developers perform real-world network-code-related maintenance tasks?

TABLE II
BASIC INFORMATION OF MAINTENANCE TASKS

B. Applicability

To answer the first research question, we applied our prototype NetDroid to 500 Android apps from the Google Play Market. The 500 apps are top ranked in the Google market and request the network access permission. We refer to this set of Android apps as Top-500-Set below. The size of the apk files in Top-500-Set ranges from 30KB to 120.6MB, and the total size of the 500 apk files is 5.7GB. We also report the size of the Jar file translated from the apk file, because the apk file usually contains not only code, but also supporting files such as figures or even video snippets. Therefore, the sizes of apk files may not be precise indications of the scalability of our approach. In contrast, the size of the generated Jar file is a more precise indication because the file contains only Java byte code. The size of the translated Jar files ranges from 7KB to 8.9MB, and the total size is 346MB.

NetDroid is able to successfully process 455 Android apps, which is 91.0% of the 500 Android apps. For the remaining 45 apps, NetDroid fails because of a failure either in the translation from apk files to Jar files with Dex2Jar, or in the loading phase of Soot (due to errors in the translated Java byte code). We further investigated the Java byte code of these classes and found that some Java byte code fails to pass the type check of Soot. Such failures are due to the imprecision in the process of translating Dalvik byte code to Java byte code. Since our tool is designed for developers, such failures may not happen in practice, because developers can choose to compile their project to Java byte code and directly apply NetDroid to it without translation. We integrate the translation from apk files to Jar files in our tool so that developers do not have to change their build configuration, and so that we are able to evaluate our tool on top Android apps for which we do not have source code.

Among the 455 apps that NetDroid can successfully process, NetDroid generates an invalid summary for 33 of the apps. A summary is invalid if it does not have any constant string sequence, or if all of its constant string sequences are empty. The reason for invalid network summaries is that the values of the network arguments in the located network API invocations come from the Android system library, so they are estimated as any string by string taint analysis, and therefore an invalid summary is generated. It may be helpful to build an Android system model and further trace to the constant strings in the Android system. However, if the values of network arguments eventually come from user input, it may be impossible to generate precise network summaries for those parts statically.
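The validity criterion stated above can be written as a small predicate. The following is a minimal sketch, assuming a summary is held as a list of constant-string sequences with the reserved START and END constants omitted; the class name `SummaryValidity` and its methods are hypothetical illustrations and are not part of NetDroid.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: the text does not show NetDroid's real data structures. */
final class SummaryValidity {

    /** A sequence is empty when no constant lies between START and END. */
    static boolean isEmptySequence(List<String> constants) {
        return constants.isEmpty();
    }

    /**
     * A summary is valid when it has at least one constant string sequence
     * that is not empty. A summary with no sequences, or with only empty
     * sequences, is invalid.
     */
    static boolean isValid(List<List<String>> summary) {
        for (List<String> sequence : summary) {
            if (!isEmptySequence(sequence)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Sequence 2 from Section II, without the START and END markers.
        List<List<String>> youtube = Arrays.asList(
                Arrays.asList("http://", "gdata.youtube.com", "/feeds", "/api"));
        // Every argument was estimated as "any string", so no constant remains.
        List<List<String>> onlyUnknown = Arrays.asList(
                Collections.<String>emptyList());

        System.out.println(isValid(youtube));
        System.out.println(isValid(onlyUnknown));
        System.out.println(isValid(Collections.<List<String>>emptyList()));
    }
}
```

Under this reading, a summary whose arguments all resolve to “<???>” falls into the invalid case, which corresponds to the situation described above for values that come from the Android system library.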
To sum up, NetDroid is able to successfully generate valid network summaries for 422 apps from the Top-500-Set, which is 84.4%. Furthermore, the failure reasons of the remaining 78 apps show that our approach has the potential to perform better in practice when source code is available.

C. Supporting Maintenance Tasks

Although we can use the top apps downloaded from the official Google Play Market to evaluate the efficiency and robustness of NetDroid, we can evaluate the helpfulness of the generated summaries only with open-source Android projects, because the latter have a publicly available version history from which we can collect real-world maintenance tasks.

To perform our study, we first collected 8 real-world maintenance tasks related to network code. It should be noted that, although maintenance of network code is common, it is difficult to identify such tasks for two reasons. First of all, a lot of network code maintenance tasks are not bug fixes, because when developers know about the changes in server-side code or third-party web services, they may directly change the code so that the error is not released in publicly available versions. In addition, Github does not allow search on code commit messages, so it is not possible for us to search for network-code-related code commits directly. Instead, we have to search in the bug reports (Github allows searching of bug reports) to find network-code-related bug fixes. Second, network-related code is often used in various features, so the bug reports related to network code may take various forms, and it is difficult to find them with specific keywords. During our search, we found that the API change of third-party web services is one common reason for network-code-related maintenance, so we searched Github bug reports with the names of popular third-party web services such as Twitter, Facebook, and Google.

The collected tasks are presented in Table II. In the table, columns 1-7 present the task ID, the project where the bug report is from, the bug report number, a description of the change in the RESTful API, the size of the source code base, the relevant RESTful API, and the code change locations in the code version history. We use the code revision locations in the version history as the ground truth of our empirical study.

The result of our study is presented in Table III. Columns 1-9 present the task ID, the number of code locations reported by the summary, the number of code locations actually revised, the number of true positives, false negatives, and false positives, and the precision, recall, and F score. From the results, we have the following observations.

First of all, our approach is able to generate summaries for all the 8 apps involved in the maintenance tasks. Second, by

It should be noted that, when tracing from constant strings in our summaries back to the code, developers may not revise the string constant itself, but perform the change at a specific point on the data flow from the string constant to the network packet. Therefore, NetDroid provides all code locations along the data flow path from the code location of the string constant to the network request generation APIs. Such a strategy causes some false positives, but it largely reduces the number of false negatives. Furthermore, since the reported code locations are still few and lie along the same data flow, they should not impose a big burden on developers when identifying the correct location to perform the change.

We further studied the reasons for the false positives and false negatives generated by our approach and describe them as follows.

False Positives. Our approach generates 47 false positives in total. Among these false positives, 40 are along the data flow path from the string constant to the network API. As we mentioned above, these false positives are not very harmful because they help developers understand how the network request is generated and decide where to change the code. The remaining 7 false positives are due to the imprecision of our analysis. Since our analysis uses approximation when analyzing string operations, it may mistakenly include irrelevant string constants in the network summary.

False Negatives. Our approach generates only 3 false negatives. One of the false negatives, in task 8, is due to the usage of reflection to perform network method calls, which we cannot handle for now. The remaining 2 false negatives are due to the concatenation of user input. Our analysis reports only code locations between the code location of the affected string constants and the network API. Therefore, user input is traced only after it is concatenated with string constants. In the two false negatives, the two revised code locations are in code that processes only user input, so our approach was not able to locate them.

D. Threats to Validity

The main threat to construct validity is that, in our empirical evaluation on software maintenance tasks, our assumption about developers’ knowledge may differ from the developers’ actual knowledge. To reduce this threat, we assume developers know only the names of changed parameters / methods in the RESTful API, and we then search the network summaries and trace back to the code based on only those names. Therefore, the usage scenario of NetDroid in our empirical evaluation should not be easier than the actual scenario. The main threat to internal validity is that the reported evaluation results may be only applicable to the apps used in our evaluation. We use keywords of popular web services such as Facebook
tracing from the summaries to code, our technique is able to and Twitter to more effectively find network-code-related bug
cover 11 of 14 revised code locations, with 47 false positive fixes. We note that these keywords may cause our results
code locations in total. This indicates that, when a server biased to maintenance tasks involving third-party web services,
restful API is changed, the developers need to examine about but we believe that such maintenance tasks are common and
5 code locations to find the actual code location to be changed, they are not significantly different from the network-related
which is very reasonable in a debugging process. code maintenance tasks that do not involve third-party web
services. To reduce this threat, we use a large set of top apps to generate network
summaries. Furthermore, we choose apps from different domains in the study on software
maintenance tasks, so that it is more likely that our results are also applicable to other
apps.

TABLE III
CODE LOCATION FOR MAINTENANCE TASKS

Task  Located  Changed  TP  FN  FP  P(%)  R(%)  F(%)
1     13       2        2   0   11  15.4  100   26.7
2     5        1        1   0   4   20    100   33.3
3     7        1        1   0   6   14.2  100   24.9
4     2        1        1   0   1   50    100   66.7
5     1        1        1   0   0   100   100   100
6     21       2        1   1   20  4.8   50    8.8
7     3        1        1   0   2   33.3  100   50.0
8     6        5        3   2   3   33.3  40    36.3

V. DISCUSSION

A. Limitations

Our current approach to generating network summaries has the following major limitations.

First of all, our approach is based on the network-related API methods presented in
Table I. Therefore, our approach cannot handle the cases where a different set of network
APIs is used. This limitation can be overcome by extending our API set.

Second, since our approach is based on static analysis of Java code, it cannot handle
Android apps that send out network requests with native code. Additionally, it cannot
handle dynamic code features such as runtime code loading and reflection, as shown in our
evaluation.

Third, our approach focuses on the generation of network requests. However, there is
another portion of code that parses network responses and processes the data. Such code is
also network related and may evolve frequently due to the evolution of server-side code or
third-party web services. Our current approach is not able to support maintenance of such
code.

B. Dynamic Approaches to Network Behavior Summarization

From our evaluation, we can see that our static approach is able to automatically generate
network summaries for most of the apps, and that it is able to cover the whole code base of
the app, so it is able to find some network behaviors that are very difficult to reveal
dynamically. However, the static approach may not be precise enough for some apps, may
generate many false positives and some invalid summaries, and cannot handle dynamically
loaded code. Therefore, a dynamic approach to network behavior summarization may well
complement our approach. With proper testing of the Android apps, network traffic
collection, and tainting of data sent to the network, dynamically generated traceable
network summaries may resolve some limitations of our approach.

VI. RELATED WORK

In this section, we discuss the work related to our paper. These research efforts mainly
fall into three categories: network behavior summarization, Android code analysis, and
string analysis.

A. Network Behavior Summarization

In the area of network security, an important problem is to identify which application is
generating the traffic on the network. Therefore, various techniques have been developed to
summarize the network behavior of applications. Though the techniques serve a totally
different purpose, they are relevant to the research in this paper.

Statistical-information based approaches [19] [12] [3] mainly use the statistical
information or the contents of the network traffic (e.g., packet size, data transferring
rate, packet intervals) to perform a protocol/domain classification of network traffic.
These approaches are able to identify network traffic belonging to applications of certain
domains, such as database applications, video players, etc. However, similar to port-based
approaches, these approaches are also coarse-grained and cannot support application-level
network-traffic classification.

Content-based approaches are able to support application-level network-traffic
classification by matching the payload of network packets with pre-generated signatures of
specific applications. One necessary and challenging step in these approaches is to
generate signatures for a large number of applications. Sen et al. [26] proposed to use
content-based signatures to identify the P2P network traffic of different P2P applications.
These signatures are constructed manually through careful reverse engineering of the P2P
applications. The other group of approaches tries to extract content-based network
signatures of an application from a large amount of network traffic of the application.
There have been many efforts in this line of work focusing on generating the network
signatures of worms from their collected network traffic. These efforts (e.g., Autograph
[17], EarlyBird [27], PolyGraph [14]) basically extract common byte flows in worms’ network
traffic and generate a content-based signature (in the form of a string or a regular
expression) for a certain worm or a group of worms. More recently, Park et al. [22]
proposed to use the Longest Common Subsequence (LCS) algorithm to generate a fingerprint of
an application from the packets’ content in the application’s network traffic. Recently,
Perdisci et al. [23] proposed a clustering-based approach to generate a signature for a
group of malware sharing similar network behavior. This approach generates signatures for
various HTTP-based software (not limited to worms, but also including other software
applications such as adware and spyware). Dai et al. [6] further extend this approach to
Android apps. Although the above network-trace based signature-generation approaches are
fully automatic during the signature-extraction phase, all of these efforts require a large
amount of annotated representative network traffic for the application under study.
Therefore, they all need manual generation of network traffic or the
accumulation of network traffic from a monitored network, both of which require a
relatively long time and much cost. Compared with the above approaches, our approach
leverages and adapts string analysis techniques to statically generate the content-based
signatures of Android apps without requiring any annotated network traffic. This advantage
is especially important for the signature generation of Android apps because of the huge
number of existing Android apps and the rapid development of new Android apps.

Network behavior summarization of applications based on system-level behavior (e.g., system
calls) is another well-studied area. Most of the approaches in this area execute the
application in a monitored environment and collect system event sequences as the behavior
signature of the application [4] [25]. Therefore, network accesses are usually recorded as
simple system calls without considering the content sent to or received from the network.
Recently, Bayer et al. [2] proposed an approach to cluster malware based on system
behavior. Their approach takes into account more detailed network traffic information, but
the considered information is still limited to high-level information such as the names of
downloaded files.

B. Analysis of Mobile Applications

Our work is also related to the security analysis of mobile applications. This area is an
emerging field in academic research, and some of the recent representative research efforts
are presented below. PiOS [7] is a static analysis framework for iOS, which is able to
check for leaks of sensitive information by combining data flow analysis and slicing
techniques. Stowaway [11] is an automatic tool that is able to determine whether an Android
application requests more permissions than it actually requires. The tool is based on a
pre-generated mapping from Android system APIs to Android permissions. Enck et al. [10]
analyzed the permission system and the permission combinations of the Android system to
collect a list of dangerous permission patterns and developed Kirin, a service which
identifies Android applications requesting dangerous permissions, so that the users can be
warned when installing them. Later, Enck et al. [9] further proposed ded, a decompiler for
Android applications, which is able to convert Dalvik Virtual Machine code to JVM code, and
then decompile the JVM code using existing Java decompilers. As for dynamic techniques,
TaintDroid [8] dynamically monitors the information flow in Android applications by
tracking the propagation of taints throughout the Android system. Apex [21] and TISSA [36]
are two recent advancements over the current Android permission system that provide more
fine-grained permission control and dynamic permission adjustability. These works mainly
focus on information leaking or permissions instead of network analysis, and none of these
efforts is able to generate network signatures for Android apps.

C. String Analysis

In the field of static program analysis, the research effort most related to our work is
string analysis. String analysis is an improvement over data-flow analysis [15].
Christensen et al. [5] first suggested string analysis, which is an approach for obtaining
the possible values of a string variable. Since then, string analysis has been widely used
in various areas, especially for detecting and sanitizing SQL injection vulnerabilities and
cross-site scripting vulnerabilities. Halfond and Orso [13] used string analysis to detect
and neutralize SQL injection attacks. Minamide [18] first applied string analysis to web
applications. He also first suggested simulating string operations in the extended CFG with
FSTs, and implemented a string analyzer on PHP code to predict the possible values of
dynamically generated web pages. Later, Wassermann and Su first developed string-taint
analysis [32] to more precisely detect the above two kinds of vulnerabilities [33]. After
that, Wassermann and Su [34] further extended their work, and developed an approach to
generating test cases for security vulnerabilities. Our previous work [29] [31] extends
string taint analysis with conditional and dynamic features. Kiezun et al. [16] further
improved their approach by considering strings that flow through the database. Compared to
these approaches, we apply string analysis to statically generating network signatures,
which is a totally different problem. We further proposed techniques to handle obfuscation
and various network APIs.

VII. CONCLUSIONS

In this paper, we propose a novel approach to statically generate traceable network
summaries for Android apps. Our approach is based on grammar templates of network API
methods and string taint analysis, and we further propose new techniques to handle complex
network API invocations and to generate signatures from string-operation grammars. We
evaluate our approach on the top 500 Android apps and 8 maintenance tasks from open-source
Android projects. The results show that our approach is able to efficiently generate
signatures with high quality for most of the apps. The empirical study on maintenance tasks
shows that the signatures generated by our approach are able to help developers precisely
and quickly locate the code locations in network-related code maintenance tasks. There are
several directions to further improve our work, which are listed below.

First of all, we plan to apply our approach to a larger set of apps and real-world
maintenance tasks. We also plan to adapt our approach to GUI-based Java software projects
and evaluate our adapted approach on open-source Java applications.

Second, NetDroid currently generates some invalid signatures because it cannot trace into
the system library. It is inefficient to analyze the whole Android system, and we plan to
build an Android system model for the network signature generation problem, so that we can
reduce the invalid signatures.

Third, our tool NetDroid is not able to process some apps due to failures in loading some
classes in the Java byte code. We may extend our tool to tolerate such failures or avoid
such failures by enhancing the tool that converts Dalvik byte code to Java byte code.
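To make the notion of a traceable network summary more concrete, the following is a minimal sketch, not NetDroid's implementation. It assumes a toy abstract domain in which every possible request string carries the set of code locations whose constants or inputs contributed to it. All names (`Traced`, `const`, `unknown`, `concat`, `join`, `locate`), the example URL, and the file locations are hypothetical, and a finite set of strings stands in for the grammar-based approximation used by string taint analysis.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Traced:
    """One possible string value plus the code locations that contributed to it."""
    text: str
    origins: frozenset


def const(text, location):
    """A string constant that appears at a given code location."""
    return {Traced(text, frozenset([location]))}


def unknown(location):
    """A value that is not known statically (e.g. user input), written as '*'."""
    return {Traced("*", frozenset([location]))}


def concat(left, right):
    """Abstract concatenation: every left value followed by every right value."""
    return {
        Traced(a.text + b.text, a.origins | b.origins)
        for a, b in product(left, right)
    }


def join(*branches):
    """Merge the values that reach a point along alternative control-flow paths."""
    return set().union(*branches)


def locate(summary, name):
    """Trace back: code locations behind every request that mentions `name`."""
    locations = set()
    for value in summary:
        if name in value.text:
            locations |= value.origins
    return locations


def demo_summary():
    """Build the summary of a hypothetical request URL assembled in three steps."""
    base = const("https://api.example.com/", "Api.java:12")
    path = join(
        const("search?q=", "Search.java:40"),
        const("lookup?id=", "Lookup.java:22"),
    )
    return concat(concat(base, path), unknown("Search.java:41"))


if __name__ == "__main__":
    for location in sorted(locate(demo_summary(), "q=")):
        print(location)
```

In this toy model, searching for the parameter name `q=` returns the constant that defines it together with every other location on the path to the request, which mirrors why most reported false positives lie on the data flow path. Code that only processes user input before concatenation never enters the summary, which mirrors the two false negatives caused by user input.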
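The precision, recall, and F score columns of Table III follow from the true positive, false negative, and false positive counts. The sketch below recomputes them using the standard definitions, which we assume are the ones intended. The class and function names are hypothetical; the counts are copied from the TP, FN, and FP columns of the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskResult:
    task: int
    tp: int  # revised locations that were reported
    fn: int  # revised locations that were missed
    fp: int  # reported locations that were not revised

    @property
    def precision(self) -> float:
        reported = self.tp + self.fp
        return self.tp / reported if reported else 0.0

    @property
    def recall(self) -> float:
        revised = self.tp + self.fn
        return self.tp / revised if revised else 0.0

    @property
    def f_score(self) -> float:
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if p + r else 0.0


# TP / FN / FP counts as printed in Table III (tasks 1 to 8).
TABLE_III = [
    TaskResult(1, 2, 0, 11),
    TaskResult(2, 1, 0, 4),
    TaskResult(3, 1, 0, 6),
    TaskResult(4, 1, 0, 1),
    TaskResult(5, 1, 0, 0),
    TaskResult(6, 1, 1, 20),
    TaskResult(7, 1, 0, 2),
    TaskResult(8, 3, 2, 3),
]


def totals(results):
    """Sum the counts over all tasks (task id 0 marks the aggregate row)."""
    return TaskResult(
        0,
        sum(r.tp for r in results),
        sum(r.fn for r in results),
        sum(r.fp for r in results),
    )


def inspected_per_hit(result):
    """Reported locations a developer examines per revised location found."""
    return (result.tp + result.fp) / result.tp


if __name__ == "__main__":
    overall = totals(TABLE_III)
    print(f"TP={overall.tp} FN={overall.fn} FP={overall.fp}")
    print(f"precision={overall.precision:.1%} recall={overall.recall:.1%}")
    print(f"locations examined per hit: {inspected_per_hit(overall):.1f}")
```

For task 1 this gives a precision of 15.4% and an F score of 26.7%, as in the table. Aggregated over all eight tasks, the counts give 11 true positives among 58 reported locations, or roughly 5.3 reported locations per revised location found, consistent with the "about 5 code locations" stated in the evaluation.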
REFERENCES

[1] Dex2jar, http://developer.android.com/design/patterns/app-structure.html.
[2] U. Bayer, P. M. Comparetti, C. Hlauschek, C. Krügel, and E. Kirda. Scalable,
     behavior-based malware clustering. In NDSS, 2009.
[3] L. Bernaille, R. Teixeira, I. Akodkenou, A. Soule, and K. Salamatian. Traffic
     classification on the fly. SIGCOMM Comput. Commun. Rev., 36:23–26, April 2006.
[4] A. Bose, X. Hu, K. G. Shin, and T. Park. Behavioral detection of malware on mobile
     handsets. In Proceedings of the 6th international conference on Mobile systems,
     applications, and services, MobiSys ’08, pages 225–238, New York, NY, USA, 2008. ACM.
[5] A. Christensen, A. Møller, and M. Schwartzbach. Precise analysis of string expressions.
     In Proc. SAS, pages 1–18, 2003.
[6] S. Dai, A. Tongaonkar, X. Wang, A. Nucci, and D. Song. NetworkProfiler: Towards
     automatic fingerprinting of Android apps. In INFOCOM, 2013 Proceedings IEEE,
     pages 809–817, April 2013.
[7] M. Egele, C. Kruegel, E. Kirda, and G. Vigna. PiOS: Detecting privacy leaks in iOS
     applications. In NDSS, 2011.
[8] W. Enck, P. Gilbert, B.-G. Chun, L. P. Cox, J. Jung, P. McDaniel, and A. Sheth.
     TaintDroid: An information-flow tracking system for realtime privacy monitoring on
     smartphones. In OSDI, pages 393–407, 2010.
[9] W. Enck, D. Octeau, P. McDaniel, and S. Chaudhuri. A study of Android application
     security. In Proceedings of the 20th USENIX Security Symposium, 2011.
[10] W. Enck, M. Ongtang, and P. D. McDaniel. On lightweight mobile phone application
     certification. In ACM Conference on Computer and Communications Security,
     pages 235–245, 2009.
[11] A. P. Felt, E. Chin, S. Hanna, D. Song, and D. Wagner. Android permissions
     demystified. In Proceedings of the 18th ACM conference on Computer and communications
     security, CCS ’11, pages 627–638, New York, NY, USA, 2011. ACM.
[12] P. Haffner, S. Sen, O. Spatscheck, and D. Wang. ACAS: automated construction of
     application signatures. In Proceedings of the 2005 ACM SIGCOMM workshop on Mining
     network data, MineNet ’05, pages 197–202, New York, NY, USA, 2005. ACM.
[13] W. G. J. Halfond and A. Orso. AMNESIA: Analysis and monitoring for neutralizing
     SQL-injection attacks. In Proc. ASE, pages 174–183, 2005.
[14] J. Newsome, B. Karp, and D. Song. Polygraph: Automatically generating signatures for
     polymorphic worms. In Proceedings of the 2005 IEEE Symposium on Security and Privacy,
     pages 226–241, Washington, DC, USA, 2005. IEEE Computer Society.
[15] J. Kam and J. Ullman. Global data flow analysis and iterative algorithms. Journal of
     the ACM (JACM), 23(1):158–171, January 1976.
[16] A. Kiezun, P. J. Guo, K. Jayaraman, and M. D. Ernst. Automatic creation of SQL
     injection and cross-site scripting attacks. In Proc. ICSE, pages 199–209, 2009.
[17] H.-A. Kim and B. Karp. Autograph: toward automated, distributed worm signature
     detection. In Proceedings of the 13th conference on USENIX Security Symposium -
     Volume 13, SSYM’04, pages 19–19, Berkeley, CA, USA, 2004. USENIX Association.
[18] Y. Minamide. Static approximation of dynamically generated web pages. In Proc. WWW,
     pages 432–441, 2005.
[19] A. W. Moore and D. Zuev. Internet traffic classification using Bayesian analysis
     techniques. In Proceedings of the 2005 ACM SIGMETRICS international conference on
     Measurement and modeling of computer systems, SIGMETRICS ’05, pages 50–60, New York,
     NY, USA, 2005. ACM.
[20] S. Mostafa and X. Wang. An empirical study on the usage of mocking frameworks in
     software testing. In Quality Software (QSIC), 2014 14th International Conference on,
     pages 127–132. IEEE, 2014.
[21] M. Nauman, S. Khan, and X. Zhang. Apex: extending Android permission model and
     enforcement with user-defined runtime constraints. In Proceedings of the 5th ACM
     Symposium on Information, Computer and Communications Security, ASIACCS ’10,
     pages 328–332, New York, NY, USA, 2010. ACM.
[22] B.-C. Park, Y. J. Won, M.-S. Kim, and J. W. Hong. Towards automated application
     signature generation for traffic identification. In NOMS, pages 160–167, 2008.
[23] R. Perdisci, W. Lee, and N. Feamster. Behavioral clustering of HTTP-based malware and
     signature generation using malicious network traces. In Proceedings of the 7th USENIX
     conference on Networked systems design and implementation, NSDI’10, pages 26–26,
     Berkeley, CA, USA, 2010. USENIX Association.
[24] T. Reps, S. Horwitz, and M. Sagiv. Precise interprocedural dataflow analysis via graph
     reachability. In Proceedings of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of
     Programming Languages, pages 49–61, 1995.
[25] K. Rieck, T. Holz, C. Willems, P. Düssel, and P. Laskov. Learning and classification
     of malware behavior. In DIMVA, pages 108–125, 2008.
[26] S. Sen, O. Spatscheck, and D. Wang. Accurate, Scalable In-Network Identification of
     P2P Traffic Using Application Signatures. In WWW2004, May 2004.
[27] S. Singh, C. Estan, G. Varghese, and S. Savage. Automated worm fingerprinting. In
     Proceedings of the 6th conference on Symposium on Operating Systems Design &
     Implementation - Volume 6, pages 4–4, Berkeley, CA, USA, 2004. USENIX Association.
[28] R. Vallée-Rai, P. Co, E. Gagnon, L. Hendren, P. Lam, and V. Sundaresan. Soot - a Java
     bytecode optimization framework. In Proceedings of the 1999 Conference of the Centre
     for Advanced Studies on Collaborative Research, pages 13–, 1999.
[29] X. Wang, L. Zhang, T. Xie, H. Mei, and J. Sun. Locating need-to-translate constant
     strings in web applications. In Proceedings of the eighteenth ACM SIGSOFT
     international symposium on Foundations of software engineering, pages 87–96. ACM,
     2010.
[30] X. Wang, L. Zhang, T. Xie, H. Mei, and J. Sun. Locating need-to-externalize constant
     strings for software internationalization with generalized string-taint analysis.
     IEEE Transactions on Software Engineering, 39(4):516–536, 2013.
[31] X. Wang, L. Zhang, T. Xie, Y. Xiong, and H. Mei. Automating presentation changes in
     dynamic web applications via collaborative hybrid analysis. In Proc. FSE, 2012.
[32] G. Wassermann and Z. Su. Sound and precise analysis of web applications for injection
     vulnerabilities. In Proc. PLDI, pages 32–41, 2007.
[33] G. Wassermann and Z. Su. Static detection of cross-site scripting vulnerabilities. In
     Proc. ICSE, pages 171–180, 2008.
[34] G. Wassermann, D. Yu, A. Chander, D. Dhurjati, H. Inamura, and Z. Su. Dynamic test
     input generation for web applications. In Proc. ISSTA, pages 249–260, 2008.
[35] K.-K. Yap, T.-Y. Huang, M. Kobayashi, Y. Yiakoumis, N. McKeown, S. Katti, and
     G. Parulkar. Making use of all the networks around us: A case study in Android. In
     Proceedings of the 2012 ACM SIGCOMM Workshop on Cellular Networks: Operations,
     Challenges, and Future Design, pages 19–24, 2012.
[36] Y. Zhou, X. Zhang, X. Jiang, and V. W. Freeh. Taming information-stealing smartphone
     applications (on Android). In Proceedings of the 4th international conference on
     Trust and trustworthy computing, TRUST’11, pages 93–107, Berlin, Heidelberg, 2011.
     Springer-Verlag.