IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, VOL. 18, NO. 2, JUNE 2021

DNS Tunneling Detection by Cache-Property-Aware Features
Naotake Ishikura, Daishi Kondo, Member, IEEE, Vassilis Vassiliades, Iordan Iordanov, and Hideki Tode, Member, IEEE

Abstract—Many enterprises are under threat of targeted attacks aiming at data exfiltration. To launch such attacks, in recent years, attackers with their malware have exploited a covert channel that abuses the domain name system (DNS) named DNS tunneling. Although several research efforts have been made to detect DNS tunneling, the existing methods rely on features that advanced tunneling techniques can easily obfuscate by mimicking legitimate DNS clients. Such obfuscation would result in data leakage. To tackle this problem, we focused on a “trace” left by DNS tunneling that cannot be easily hidden. In the context of data exfiltration by DNS tunneling, the malware connects directly to the DNS cache server and the generated DNS tunneling queries produce cache misses with absolute certainty. In this study, we propose a DNS tunneling detection method based on the cache-property-aware features. Our experiments show that one of the proposed features can efficiently characterize the DNS tunneling traffic. Furthermore, we introduce a rule-based filter and a long short-term memory (LSTM)-based filter using this proposed feature. The rule-based filter achieves a higher rate of DNS tunneling attack detection than the LSTM one, which instead detects the attack more quickly, while both maintain a low misdetection rate.

Index Terms—Cache-property-aware features, data exfiltration, DNS tunneling, targeted attacks.

Manuscript received October 30, 2020; revised March 18, 2021 and April 28, 2021; accepted May 4, 2021. Date of publication May 10, 2021; date of current version June 10, 2021. This project has received funding from JSPS KAKENHI, Grant Number JP19K24351, the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement No. 739578 and the Government of the Republic of Cyprus through the Deputy Ministry of Research, Innovation and Digital Policy. This article was presented in part at the ICIN [1]. The associate editor coordinating the review of this article and approving it for publication was C. Fung. (Corresponding author: Daishi Kondo.)
Naotake Ishikura and Daishi Kondo are with Department of Computer Science and Intelligent Systems, Osaka Prefecture University, Sakai 599-8531, Japan (e-mail: [email protected]; [email protected] u.ac.jp).
Vassilis Vassiliades is with the CYENS Centre of Excellence, 1500 Nicosia, Cyprus (e-mail: [email protected]).
Iordan Iordanov is with Corpy & Co., Tokyo 113-0033, Japan (e-mail: [email protected]).
Hideki Tode is with the Department of Computer Science and Intelligent Systems, Osaka Prefecture University, Sakai 599-8531, Japan (e-mail: [email protected]).
Digital Object Identifier 10.1109/TNSM.2021.3078428
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

I. INTRODUCTION

VARIOUS protocols exist on the Internet, and by exploiting their vulnerabilities, attackers using their malware launch targeted attacks that cause data exfiltration [2], [3]. One of the ways attackers can exfiltrate data from an enterprise network commences with sending emails that include the malware as seemingly harmless attachments to the employees of targeted enterprises. By opening these malicious emails, the employees unfortunately infect their computers with malware. This mistake establishes communication channels between the attackers and their malware. Then, the attackers can remotely control the malware and steal confidential information from the infected enterprises. This data leakage puts enterprises at a great disadvantage and affects profitability drastically.

In recent years, attackers with malware have launched this form of attack by exploiting a covert channel that abuses the domain name system (DNS) known as DNS tunneling [4]. DNS tunneling is a security threat used to tunnel data and commands by exploiting a domain name in DNS queries and the corresponding DNS responses. It is one of the top DNS-based attacks [5]. Between April and September 2014, the attackers stole 56 million debit and credit card numbers from the American retailer, Home Depot [6], and several attacks were launched against a Middle Eastern government organization in August 2018 [7]. In general, enterprises enforce access control of ports and protocols that are not usually utilized (e.g., peer-to-peer (P2P) file sharing such as BitTorrent) for employees. In addition, in a quarantine network that installs trusted middleboxes, end-to-end encrypted communications can be decrypted and inspected by middleboxes [8] that can identify malicious activities. However, because DNS is an indispensable protocol for implementing many services, such as content distribution, its use is not restricted and is poorly managed. Therefore, the DNS operation unfortunately provides attackers with malware an opportunity to realize targeted attacks through DNS tunneling.

To detect DNS tunneling, several countermeasures have been proposed [9]–[24]. Indeed, these methods are effective for detecting tunneling traffic from malware, such as Morto worm [25], or DNS tunneling tools such as dnscat2 [26]. However, these countermeasures are built using features that can be easily obfuscated by advanced DNS tunneling techniques. For instance, steganography can hide leaked data in the fully qualified domain name (FQDN) of the tunneling query, which makes the FQDN look legitimate and invalidates filters relying on its features. Thus, this obfuscation would result in data leakage.

To address this problem, we focus on the nature of DNS tunneling. To successfully exfiltrate data attached to the domain name of a DNS query, the DNS cache server to which the malware connects directly must avoid producing a cache hit in the server; otherwise, the data cannot be leaked outside of
the enterprises. In other words, leaking data through DNS tunneling would trigger a cache miss on the DNS cache server. However, cache servers exist to exploit the natural tendency of humans to request the same information multiple times. We hypothesize that the number of queries satisfies Zipf’s law [27]. Based on this hypothesis, DNS tunneling violates normal human behavior because it requires the cache to be bypassed, a clear indication that cache misses are the actual footprint of a DNS tunneling attack. Therefore, we believe that this cache property is more tolerant than the features used in conventional methods to counter feature obfuscation.

Considering the above facts, we propose three features derived from the cache property: cache hit ratio, access hit ratio, and access miss count. Through extensive experiments, we demonstrate that the access miss count addresses some shortcomings of the hit ratios of both the cache and access and clearly characterizes DNS tunneling traffic. Therefore, it is useful for designing and implementing a solid DNS firewall against DNS tunneling. Based on this knowledge, we introduce a rule-based filter and a long short-term memory (LSTM) [28]-based filter using the proposed feature. The rule-based filter achieves a higher detection rate of DNS tunneling attack than the LSTM filter, which instead detects the attack faster, while both maintain a low misdetection rate.

To the best of our knowledge, our previous work [1] was the first to analyze cache-property-aware features, and this paper is an extended version. We extend the previous work with the following contributions.
• performing a comprehensive survey of DNS tunneling research in terms of attack and detection methods,
• introducing a new cache-property-aware feature, access miss count, and comparing this feature with the cache hit ratio and access hit ratio,
• proposing a rule-based filter and an LSTM filter based on the access miss count against DNS tunneling, and
• evaluating the performance of the filters created by a large legitimate training dataset composed of more than 350,000 DNS queries on the test dataset including legitimate queries and DNS tunneling ones.

The remainder of this paper is organized as follows. Section II is the summary of the basics of DNS tunneling and several existing studies on the attack and detection methods of DNS tunneling. Section III proposes cache-property-aware features for DNS tunneling detection while Section IV performs monitoring and analysis of these features. Using one of the features of DNS tunneling, a rule-based filter and an LSTM filter are proposed in Section V. The performance of the filters is evaluated in Section VI. We discuss our findings in Section VII and conclude the paper in Section VIII.

II. BACKGROUND

A. DNS Tunneling Basics

DNS tunneling bypasses firewalls to send and receive data by exploiting the domain names included in the DNS query and the corresponding DNS response. The data and commands are tunneled between the malware and the attacker in the context of targeted attacks causing data exfiltration (Fig. 1).

Fig. 1. An overview of DNS tunneling.

Assuming a domain name attacker.com is shared to create a covert channel between the attacker and the malware that has infiltrated the enterprise network, to obtain a command from the attacker to search confidential information in the enterprise network, the malware generates an FQDN (get_command).attacker.com and sends it as a DNS query to the DNS cache server in the enterprise network (Step 1). Following the usual process of resolving an FQDN, the DNS cache server iteratively queries the root (Steps 2 and 3), the com (Steps 4 and 5), and the attacker.com DNS server (Step 6). Then, the attacker.com DNS server obtains the request (get_command) and replies with a suitable DNS response containing the command to the malware via the DNS cache server (Steps 7 and 8). After repeating the process of obtaining a new command and sending an answer to the command, the malware eventually leaks the confidential information collected by the attacker in the same manner (i.e., by including the information to be leaked in the domain name).

When a DNS client resolves the domain name by sending the DNS query, the query first reaches the DNS cache server. If the corresponding DNS response is cached in the server, it is a cache hit, that is, the response is directly returned from the server; otherwise, it is a cache miss, that is, the DNS query is forwarded to the upstream DNS servers (Fig. 2). For the malware to send malicious DNS queries to the attacker effectively, the queries must not cause a cache hit on the DNS cache server; this is a fundamental characteristic of DNS tunneling. In this study, we assumed that the exfiltrated data are similar to credit card information (such an attack scenario is also considered in [18], [21], [23]), and all the FQDNs generated to leak such data are unique.
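The cache-miss property described in Section II-A can be illustrated with a toy model. The following Python sketch is only an illustration under stated assumptions: the `ToyDnsCache` class, the fixed 300 s TTL, the hex-label encoding with a sequence number, and the example names are all hypothetical and are not taken from the paper or from any tunneling tool.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDnsCache:
    """Hypothetical stand-in for a DNS cache server: FQDN -> expiry time (s)."""
    ttl: float = 300.0  # assumed fixed TTL, for illustration only
    entries: dict = field(default_factory=dict)

    def resolve(self, fqdn: str, now: float) -> bool:
        """Return True on a cache hit and False on a cache miss."""
        expiry = self.entries.get(fqdn)
        if expiry is not None and now < expiry:
            return True  # answered from the cache, nothing goes upstream
        # Cache miss: the query would be forwarded upstream; cache the answer.
        self.entries[fqdn] = now + self.ttl
        return False


def exfiltration_fqdns(secret: bytes, domain: str, chunk: int = 8) -> list:
    """Encode `secret` as hex labels; a sequence number keeps each FQDN unique."""
    hexed = secret.hex()
    parts = [hexed[i:i + chunk] for i in range(0, len(hexed), chunk)]
    return [f"{seq}-{part}.{domain}" for seq, part in enumerate(parts)]


cache = ToyDnsCache()
legit = ["www.example.org", "mail.example.org"] * 5  # repeated lookups
tunnel = exfiltration_fqdns(b"4111111111111111", "attacker.com")

legit_hits = sum(cache.resolve(q, now=float(i)) for i, q in enumerate(legit))
tunnel_hits = sum(cache.resolve(q, now=20.0 + i) for i, q in enumerate(tunnel))
print(f"legitimate: {legit_hits}/{len(legit)} hits, tunneling: {tunnel_hits}/{len(tunnel)} hits")
```

In this toy run, the repeated legitimate names miss only on first use, whereas every tunneling FQDN is new to the cache and therefore misses, which is the behavior the proposed features are meant to capture.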
Fig. 2. Structure of DNS cache.

Some researchers [29]–[31] have evaluated the performance (e.g., throughput) of DNS tunneling tools such as iodine [32] and dns2tcp [33]. Raman et al. [34] proposed a DNS tunneling method and measured the maximal throughput.

B. Conventional Attack Methods

The existing DNS tunneling tools and malware datasets examined in related works [9]–[24] are summarized in Table I. In the attacks detected and evaluated in these related works, the time interval between consecutive tunneling queries was sometimes short; furthermore, in some of the works, anomalous/malware datasets were presented without description. For instance, Ellens et al. [10] customized iodine and performed a simulation to generate 36,389 flows of Command & Control (C&C) communication in 1.55 h, or more than 6.5 C&C flows per second. In addition, some of the attacks examined produced encrypted FQDNs that were longer than average. These characteristics make it easy to detect the existing DNS tunneling. However, it is questionable whether conventional detection methods are effective against stealthy DNS tunneling attacks with low throughput and legitimate-looking FQDNs.

TABLE I
DNS TUNNELING TOOLS AND MALWARE DATASET

Meanwhile, several researchers have discussed DNS tunneling that produces queries slowly and does not exhibit characteristics such as encryption. Xu et al. [13] converted commands for C&C communication in anomalous FQDNs to seemingly legitimate labels used in general services (e.g., www, mail, and ftp that correspond to the commands). Such an attack method can achieve C&C communication by obfuscating the features related to the FQDN used to detect attacks, as described in Section II-C. Paxson et al. [11] embedded one-bit information in an FQDN (e.g., www.attacker.com transfers 0 and mail.attacker.com transfers 1 at an interval of one query per day). This method can obfuscate the features derived from the time interval and FQDN for detecting the attack, as described in Section II-C. Then, data exfiltration can be realized by adding the sequence number replaced with a general character string to the above FQDN. It is difficult to distinguish these DNS tunneling attacks using the existing methods.

C. Conventional Detection Methods

DNS tunneling is detected using classifiers and features. A summary of the classifiers used in the related works [9]–[24] is presented in Table II. The most common method is rule-based detection based on a defined threshold for a certain feature. However, rule-based models can neither create complicated rules nor deal with tunneling tools designed to bypass the defined rules. In conventional research, machine learning techniques are often adopted as classifiers. In the case of supervised learning, the detection method is effective only for specific malware and DNS tunneling tools, which limits the versatility of the models. In recent years, unsupervised learning has been adopted to detect DNS tunneling, as it is significantly more suitable for anomaly detection than supervised learning because the training process requires only legitimate data and it is applicable to a wider range of problems. However, DNS tunneling cannot be detected without using a feature that can distinguish anomalous traffic from legitimate traffic because unsupervised learning only detects the outliers for legitimate traffic.

TABLE II
CLASSIFIERS

The features (the inputs for the classifiers) proposed in the related works [9]–[24] are summarized in Table III. In general, there are two types of analyses for DNS tunneling detection: payload analysis and traffic analysis. The payload analysis is an evaluation of a DNS query and/or the corresponding DNS response and traffic analysis is an evaluation of DNS traffic over a monitoring period (e.g., in terms of time and number of samples). These features are used as thresholds or fed to machine learning models to build countermeasures for DNS tunneling detection. However, the above-mentioned detection methods rely on features that attackers and their malware can easily obfuscate by mimicking benign entities. For instance, an analysis of character frequencies and entropy can be bypassed using steganography. Such an obfuscation would result in data
leakage. Therefore, it is necessary to propose a resilient feature to detect DNS tunneling by focusing on a “trace” of DNS tunneling that cannot be easily concealed.

TABLE III
FEATURES

The output of the classifiers in the related works [9]–[24] is the label (i.e., legitimate or malicious/anomalous). The classifiers are evaluated based on performance metrics, such as the true positive rate and accuracy, which are computed on the basis of the confusion matrix. The criteria for making the feature vectors (inputs) for classifiers in the related works are summarized in Table IV. The focus in most existing studies is on a query and/or response and their legitimateness.

TABLE IV
CRITERIA FOR MAKING FEATURE VECTORS (INPUTS) FOR CLASSIFIERS

III. CACHE-PROPERTY-AWARE FEATURES FOR DNS TUNNELING DETECTION

Fujiwara et al. [47] reported that the cache hit ratio on the DNS cache server¹ inside the University of Tsukuba in November 2011 was 75.1%. During a DNS tunneling attack, the cache hit ratio is expected to decrease because the malicious DNS queries generated to exfiltrate data cause cache misses, as discussed in Section II-A. To the best of our knowledge, this characteristic of DNS tunneling has not been investigated in related works, as demonstrated in the review in Section II-C. In this section, we propose three features based on the cache property to identify DNS tunneling traffic.

¹ The authors defined the cache hit ratio as (Total # of client queries that do not cause any queries to authoritative DNS servers)/(Total # of client queries).

A. Cache Hit Ratio

The first feature we propose is the cache hit ratio $\mathrm{CHR}_n$ on the DNS cache server, which is defined as follows:

\[ \mathrm{CHR}_n = \frac{1}{n} \cdot N_{\mathrm{CH}}^{n}. \]

Here, $n$ is the number of queries under observation (i.e., a window size), and $N_{\mathrm{CH}}^{n}$ is the number of successful cache hits within the $n$ observed queries. Note that in this paper, we define the cache hit as a state in which the response to a query from the DNS client is discovered in the connected DNS cache server without sending any queries to authoritative DNS servers. Experimental results using time-series data are reported in Section IV-B, where the plots of CHR derived from all generated queries in a sliding window manner are shown.

CHR is a naive feature derived from the cache property to identify DNS tunneling traffic; it has two shortcomings. First, CHR is not improved by the caches of resource records that a client query rarely looks up. According to [47, Table 2], 90.7% of queries from the DNS clients were for the A and AAAA records, and therefore, caching these resource records can increase CHR. However, resource records, such as NS
records, which are not often looked up by DNS clients, are also cached to mitigate the load of authoritative DNS servers. Such records play no role in DNS tunneling detection (i.e., unnecessary caches for DNS tunneling detection); therefore, they might induce a decrease in CHR.

Second, in addition to the use of caching algorithms, such as least recently used (LRU), caches are evicted based on their time to live (TTL). Fujiwara et al. [47] show that defining a low TTL value (≤300 s), where DNS-based wide-area load balancing is achieved [48], [49], decreases the CHR. Furthermore, the TTL itself essentially causes a cache miss and does not aid in characterizing the DNS tunneling traffic. It is therefore difficult to determine whether a decrease in the CHR is attributable to DNS tunneling or the inherent shortcomings of the CHR.

B. Access Hit Ratio

To compensate for the shortcomings of the CHR described in Section III-A, we first propose an access entry that inspects client queries and stores minimal information, only the FQDNs. The entry eviction policy is the LRU. When an FQDN in the client query is found in a given list of access entries, this can be considered as an access hit. Table V summarizes a comparison between the cache entry and the access entry.

TABLE V
COMPARISON BETWEEN A CACHE ENTRY AND AN ACCESS ENTRY

We propose a second feature, which we call the access hit ratio $\mathrm{AHR}_n$, defined by the following formula:

\[ \mathrm{AHR}_n = \frac{1}{n} \cdot N_{\mathrm{AH}}^{n}. \]

Here, $n$ is the number of queries under observation (i.e., a window size), and $N_{\mathrm{AH}}^{n}$ is the number of successful access hits within $n$ queries. Experimental results obtained from the time-series data are reported in Section IV-B, showing the plots for the AHR derived from every generated query in a sliding window manner.

However, malware can intentionally increase the AHR by sending a large volume of legitimate DNS queries whose FQDNs are normally expected to be stored in the access entry. In this case, the malware can send tunneling queries while hiding its activity, which is a shortcoming of AHR.

C. Access Miss Count

To compensate for the shortcoming of the AHR described in Section III-B, we focus only on the access misses and propose a third feature, which we call the access miss count $\mathrm{AMC}_t$, defined by the following formula:

\[ \mathrm{AMC}_t = N_{\mathrm{AM}}^{t}. \]

Here, $t$ is a time interval whose unit is seconds, and $N_{\mathrm{AM}}^{t}$ is the number of access miss queries in that time interval. Experimental results from the time-series data are reported in Section IV-B; the plots for the AMC derived from every generated query are shown in a sliding window manner.

Fig. 3 summarizes a calculation example of CHR, AHR, and AMC. These experimental results are shown in Section IV-B. As described in Section III-A, not only the tunneling queries but also the legitimate ones cause cache misses, and it is difficult to distinguish whether a decrease in the CHR is caused by a malicious event such as DNS tunneling. By proposing an access entry and AHR, we compensate for the shortcomings of the CHR. However, malware can intentionally replay legitimate DNS queries to increase the AHR. In addition, both ratio-based features normalize the number of cache hits and access hits, respectively, which means that the number of access misses itself cannot be evaluated based on these features. Finally, AMC, a count-based feature, solves these shortcomings.

Fig. 3. Calculation example of CHR, AHR, and AMC ($n = 100$, $t = 600$).

IV. MONITORING AND ANALYSIS OF CACHE-PROPERTY-AWARE FEATURES

A. Experimental Setup

For our DNS traffic monitoring experiments, we installed a DNS cache server on the local network of our laboratory at Osaka Prefecture University and captured the DNS traffic generated on the cache server by laboratory members. To produce DNS tunneling traffic in our laboratory, we set up an authoritative DNS server, a DNS tunneling client, and a DNS tunneling server. We assumed that the tunneling client was legitimate. However, unfortunately, it was infected by malware (i.e., installed a DNS tunneling client). Therefore, the client produced both legitimate and tunneling traffic. In our experiments, while generating the tunneling traffic, the client produced legitimate DNS traffic by browsing the Web and launching some background applications such as Slack. The authoritative DNS server delegated the domain name for DNS tunneling to the DNS tunneling server, which
made the DNS cache server forward malicious queries generated by the tunneling client to the tunneling server. We used dnscat2 [26] as a DNS tunneling tool for both the tunneling client and server. Note that because of the nature of DNS tunneling (i.e., data is exfiltrated via cache misses), we would have obtained the same results (CHR, AHR, and AMC) using different tools when the parameters of the tunneling query transmission interval and the tunneling traffic generation period were set to the same values. Before performing the experiments on tunneling, we created a list of cache entries and access entries by capturing DNS traffic from 21 clients in our laboratory for 31 days. The parameters for the experiments are presented in Table VI. We prepared three data exfiltration scenarios in terms of the tunneling query transmission interval: Scenarios 1, 2, and 3, with transmission intervals of 1, 10, and 100 s, respectively. We used a list of cache entries only for Scenario 1, which demonstrated the effectiveness of AHR and AMC against the shortcomings of the CHR. We omitted the experiments for Scenarios 2 and 3 because the results for Scenario 1 (the “easiest” case for DNS tunneling detection) already indicated the shortcomings of the CHR.

TABLE VI
PARAMETERS FOR TUNNELING EXPERIMENTS

B. Results

The scatter plots of the time-series data for the CHR collected from all the clients in Scenario 1, and the AHR and AMC collected from all the clients in Scenarios 1, 2, and 3, for $n = 100$ and $t = 600$ are shown in Fig. 4. Fig. 5 shows the traffic of the tunneling client extracted from Fig. 4. To compute the CHR and AHR of all the clients, a memory to store the latest $n$ queries was prepared for each client, and the first CHR and AHR were calculated after the arrival of $n$ queries. To compute the AMC of all the clients, a memory to store the latest queries within $t$ was prepared for each client, and the first AMC was calculated after $t$ s. The red curves in Figs. 4 and 5 indicate that the CHR, AHR, and AMC were affected by the DNS tunneling traffic generated by the tunneling client. These figures illustrate that both the CHR and AHR decreased, whereas the AMC increased when DNS tunneling traffic was produced.

Fig. 4. Time-series data of $\mathrm{CHR}_{100}$ of all the clients in Scenario 1 and $\mathrm{AHR}_{100}$ and $\mathrm{AMC}_{600}$ of all the clients in Scenarios 1, 2, and 3.

Fig. 5. Time-series data of $\mathrm{CHR}_{100}$ of the tunneling client in Scenario 1 and $\mathrm{AHR}_{100}$ and $\mathrm{AMC}_{600}$ of the tunneling client in Scenarios 1, 2, and 3.
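The per-client bookkeeping described above (a memory of the latest n queries for the ratio features, and a memory of the queries within the last t seconds for the AMC) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the class names, the access-entry capacity, and the tiny window sizes in the usage lines are assumptions chosen for readability. The CHR would be computed in the same way as the AHR, with a cache-hit flag in place of the access-hit flag.

```python
from collections import OrderedDict, deque


class AccessEntries:
    """LRU-evicted set of FQDNs (access entries); the capacity is an assumption."""

    def __init__(self, capacity=10000, seed=()):
        self.capacity = capacity
        self.fqdns = OrderedDict((f, None) for f in seed)

    def lookup(self, fqdn):
        """Return True on an access hit; record the FQDN either way."""
        hit = fqdn in self.fqdns
        if hit:
            self.fqdns.move_to_end(fqdn)  # mark as most recently used
        else:
            self.fqdns[fqdn] = None
            if len(self.fqdns) > self.capacity:
                self.fqdns.popitem(last=False)  # evict least recently used
        return hit


class ClientFeatures:
    """Per-client sliding windows for AHR_n and AMC_t."""

    def __init__(self, n=100, t=600.0):
        self.n, self.t = n, t
        self.last_hits = deque(maxlen=n)  # hit flags of the latest n queries
        self.miss_times = deque()         # timestamps of recent access misses
        self.first_seen = None

    def update(self, timestamp, hit):
        """Feed one query; return (AHR_n, AMC_t), None while a window is filling."""
        if self.first_seen is None:
            self.first_seen = timestamp
        self.last_hits.append(hit)
        if not hit:
            self.miss_times.append(timestamp)
        # Drop misses that fell out of the last t seconds.
        while self.miss_times and self.miss_times[0] <= timestamp - self.t:
            self.miss_times.popleft()
        ahr = sum(self.last_hits) / self.n if len(self.last_hits) == self.n else None
        amc = len(self.miss_times) if timestamp - self.first_seen >= self.t else None
        return ahr, amc


# Hypothetical trace: four repeated legitimate lookups, then four unique tunneling FQDNs.
entries = AccessEntries(seed=["www.example.org"])
feats = ClientFeatures(n=4, t=10.0)
trace = [(float(s), "www.example.org") for s in range(4)]
trace += [(10.0 + s, f"{s}-data.attacker.com") for s in range(4)]
history = [feats.update(ts, entries.lookup(q)) for ts, q in trace]
print(history[-1])
```

With these toy parameters, the AHR falls from 1.0 to 0.0 and the AMC rises to 4 once the unique tunneling names arrive, mirroring the direction of change reported for Figs. 4 and 5.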
As Figs. 4(a) and 5(a) show, the CHR cannot clearly establish whether its decrease during the monitoring period was caused by the tunneling, because of the two drawbacks of the cache entries discussed in Section III-A. In our 31-day dataset, 99.7% of the queries from all the clients were for A and AAAA records, whereas 51.2% of the cache entries were for A and AAAA records. Fig. 6 shows the CDF of the TTL of unique A and AAAA records in our 31-day dataset, and the CDF indicates that the TTLs of 43.7% of the A and AAAA records were less than 300 s. These statistics are related to the two drawbacks discussed in Section III-A.

Fig. 6. Cumulative distribution functions (CDF) of TTL (s) of unique A and AAAA records in our 31-day dataset.

By contrast, Figs. 4(b), 5(b), 4(c), and 5(c) clearly identify the decrease in the AHR caused by tunneling. These figures indicate that the AHR effectively addressed the drawbacks of the CHR and can successfully identify tunneling traffic. As shown in Figs. 4(d) and 5(d), we observed that during the tunneling traffic generation period, the AHR did not decrease drastically; therefore, the AHR could not identify the tunneling traffic when the tunneling query transmission interval was large, which is a vulnerability of AHR.

As described in Section III-B, one shortcoming of the AHR is that it can be increased by malware that sends a large volume of legitimate DNS queries with FQDNs that are normally expected to be stored in the access entry, thus concealing the tunneling queries. To overcome this vulnerability, we focused only on the access misses and computed the AMC. Figs. 4(e), 5(e), 4(f), 5(f), 4(g), and 5(g) clearly show the increase in the AMC owing to the tunneling. Compared to the AHR, the AMC can characterize tunneling traffic more effectively because it ignores legitimate traffic that increases AHR during the tunneling traffic generation period. These figures indicate an increase in the AMC, even when only legitimate traffic was generated. This was caused by the fact that new Web content that was not captured among the DNS traffic for the 31 days was accessed. Accessing new content sends a certain number of queries that are not included among the access entries in a short period of time, which drastically increases the AMC. In addition, we confirmed that some websites install crawlers to resolve domain names. For instance, the website http://www.guide2research.com/ resolves many FQDNs of conference Web sites and might retrieve some information about the conferences.

Fig. 7 illustrates the CDFs of query transmission interval of the tunneling client in Scenarios 1, 2, and 3, excluding the tunneling query transmission interval. From the CDFs, we observed that 62.4%, 58.3%, and 48.3% of the intervals were less than 1 s in Scenarios 1, 2, and 3, respectively. Considering a higher threshold, 73.7%, 69.9%, and 61.8% of the intervals were less than 10 s for each scenario, and 96.3%, 94.5%, and 93.3% of the intervals were less than 100 s for each scenario. Therefore, our parameter settings generated tunneling queries at a reasonable time interval, compared to general legitimate query traffic.

Fig. 7. CDFs of query transmission interval (s) of the tunneling client in Scenarios 1, 2, and 3, which exclude the tunneling query transmission interval.

Fig. 8. Minimum $\mathrm{AHR}_n$ for legitimate and tunneling traffic in Scenarios 1, 2, and 3.

Fig. 9. Average $\mathrm{AHR}_n$ for legitimate and tunneling traffic in Scenarios 1, 2, and 3.

Figs. 8 and 9 show the minimum and average $\mathrm{AHR}_n$ for tunneling and legitimate traffic in Scenarios 1, 2, and 3. Here, the minimum and average $\mathrm{AHR}_n$ for the tunneling traffic were calculated based on (a) the traffic produced by the tunneling client for 20 min during the tunneling traffic generation period,
1210 IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT, VOL. 18, NO. 2, JUNE 2021

and (b) the first n queries after generating the last tunneling query. The minimum and average AHR_n for legitimate traffic in Scenarios 1, 2, and 3 were computed based on traffic from all the clients, except the above (a) and (b) traffic. The minimum AHR_n for tunneling traffic in Scenarios 1 and 2 was lower than that for legitimate traffic, which means that the traffic can be easily classified. By contrast, in Scenario 3 (large tunneling query transmission interval), it was impossible to classify the traffic, as the corresponding minimum AHR values of legitimate traffic were lower than those of the tunneling traffic. The average AHR_n for the tunneling traffic in Scenarios 1, 2, and 3 was lower than that for legitimate traffic, which indicates that the AHR_n decreased because of the tunneling traffic.

Figs. 10 and 11 show the maximum and average AMC_t for the tunneling and legitimate traffic in Scenarios 1, 2, and 3. Here, the maximum and average AMC_t for the tunneling traffic are calculated based on (a) the traffic produced by the tunneling client for 20 min during the tunneling traffic generation period and (b) the queries produced within t seconds after the tunneling traffic generation period. The maximum and average AMC_t for the legitimate traffic in Scenarios 1, 2, and 3 were computed based on traffic from all the clients, except the above (a) and (b) traffic. The maximum AMC_t for the tunneling traffic in Scenario 1 was higher than that for the legitimate traffic in Scenarios 1, 2, and 3, which indicated that the traffic could be easily classified. When t increased (t ≥ 500 s), the maximum AMC_t for tunneling traffic in Scenario 2 was higher than that for the legitimate traffic in Scenarios 1, 2, and 3. By contrast, in Scenario 3 (large tunneling query transmission interval), it was impossible to classify the traffic, as the corresponding maximum AMC values of the legitimate traffic were higher than those of the tunneling traffic. The average AMC_t for the tunneling traffic in Scenarios 1, 2, and 3 was higher than that for legitimate traffic in Scenarios 1, 2, and 3, which indicates that the AMC_t increased because of the tunneling traffic. From the average AMC_t for the tunneling traffic in Scenarios 1, 2, and 3, the average number of access misses for 20 min was approximately 2.

Fig. 10. Maximum AMC_t for legitimate and tunneling traffic in Scenarios 1, 2, and 3.

Fig. 11. Average AMC_t for legitimate and tunneling traffic in Scenarios 1, 2, and 3.

Fig. 12 shows the queried FQDN ranking versus the number of DNS queries from all the clients, which can be obtained by analyzing our 31-day dataset. Each FQDN was ranked based on the number of DNS queries that contained it. Fig. 12 indicates that popular FQDNs are repeatedly requested by the clients, roughly adhering to Zipf's law [27]. This fact supports the hypothesis introduced in Section I and the results for the AHR and AMC shown in this section (i.e., access misses do not often happen under typical human behavior, and once tunneling queries are generated, the corresponding access misses occur; consequently, the AHR decreases and the AMC increases, which indicates DNS tunneling traffic).

Fig. 12. FQDN ranking vs. the number of DNS queries.

V. FILTERS BASED ON CACHE-PROPERTY-AWARE FEATURES AGAINST DNS TUNNELING

In this section, based on the results presented in Section IV-B, we implement a rule-based filter and an LSTM filter using the AMC as a cache-property-aware feature. The proposed monitoring and filtering system (Fig. 13) should be deployed on the DNS cache server in the enterprise network. We assume that DNS clients are required to use the DNS cache server installed in the enterprise, so that their activities can be monitored and managed as a risk-hedging measure.
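The AMC feature that both filters rely on can be illustrated with a minimal sketch. It assumes that an access miss is a query for an FQDN absent from the access list, that a miss adds the FQDN to the list, and that AMC_t counts one client's misses within the last t seconds. The function name, the data layout, and the example domain names are hypothetical and are not taken from the authors' implementation.

```python
from collections import deque


def access_miss_counts(queries, access_list, t):
    """Return AMC_t at each query of one client.

    queries:     list of (timestamp_in_seconds, fqdn), sorted by time
    access_list: set of FQDNs seen so far (hypothetical representation)
    t:           sliding-window length in seconds
    """
    known = set(access_list)      # work on a copy of the initial list
    recent_misses = deque()       # timestamps of misses still in the window
    counts = []
    for timestamp, fqdn in queries:
        if fqdn not in known:     # access miss
            recent_misses.append(timestamp)
            known.add(fqdn)       # assumption: a miss populates the list
        # Forget misses that are t seconds old or older.
        while recent_misses and recent_misses[0] <= timestamp - t:
            recent_misses.popleft()
        counts.append(len(recent_misses))
    return counts


# Hypothetical traffic: two tunneling-style queries between legitimate ones.
example = [
    (0, "www.example.com"),
    (10, "a1b2c3.tunnel.example"),
    (20, "d4e5f6.tunnel.example"),
    (200, "www.example.com"),
]
amc_100 = access_miss_counts(example, {"www.example.com"}, t=100)
```

In this toy trace the count rises while previously unseen FQDNs are queried and returns to zero once the window has passed, which is the behavior the filters in this section threshold on.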
Fig. 13. Overview of the proposed monitoring and filtering system.

A. Rule-Based Filter

Our rule-based filter is based on an anomaly detection model that defines a threshold value for the AMC. Specifically, the filter classifies a DNS query as anomalous when the AMC exceeds a preconfigured threshold value. To extract the threshold values, we collect only legitimate DNS traffic, create access lists, and compute the AMC for the DNS query generated by each DNS client. The AMC values are collected (from all the clients) into a training dataset. The classifiers' thresholds are determined by taking user-defined percentile values from this dataset. For example, the threshold computed for the 99th percentile corresponds to the minimum AMC from the top 1%.

B. Long Short-Term Memory Filter

The filter is modeled using LSTM networks [28], a type of recurrent neural network. LSTMs differ from feedforward neural networks (which are universal, nonlinear function approximators [50]), as they also have feedback connections that serve as a type of “memory” for what the network has already seen. Such networks are suitable for use when the input consists of data sequences such as network traffic. Compared to standard recurrent neural networks, LSTMs can handle very long sequences, which makes them well suited to this study. They are widely deployed in the industry.

The proposed LSTM filter is based on an anomaly detection pipeline for temporal data. The rationale behind the design of this pipeline is that a predictive model of normal behavior has a low prediction error when fed with normal input and a higher prediction error for abnormal input. We implement the pipeline as follows. In the same manner as described in Section V-A, we first collect only legitimate DNS traffic, create access lists, and compute the AMC for the DNS query generated by each DNS client. Then, we collect the AMC values into a training dataset. We use the AMC dataset to train an LSTM model and obtain its next-step predictions. We then compute the prediction error based on the model's predictions and the actual AMC values. Because legitimate DNS traffic itself contains outliers, the prediction error tends to be higher at these outliers and can be used to create a filter. Therefore, we create a filter by constructing a binary classifier that takes the prediction error as the input. The classifier's threshold above which the input is deemed an “anomaly” is computed by a user-defined percentile value. For example, the threshold computed for the 99th percentile corresponds to the minimum prediction error from the top 1%. During deployment, the filter monitors the DNS traffic, computes the AMC for each DNS client, and predicts the next-step AMC using the trained LSTM. When the prediction error exceeds the threshold computed for the binary classifier, the filter produces an alert because such queries can be anomalous.

To summarize, the difference between the rule-based and LSTM-based filters is that the former uses a threshold computed directly from the AMC, whereas the latter uses a threshold computed from the prediction error of an AMC-forecasting LSTM model.

C. Guides to Deploy Filters

Note that multiple thresholds can be computed to achieve flexibility and easily customize the sensitivity of the filter during deployment. In general, there is a tradeoff when choosing a threshold value (lowering it increases both the true and false positive rates); therefore, there is no optimal threshold value. It all comes down to the false positive rate deemed acceptable by the enterprise network operators.

The proposed monitoring and filtering system should be implemented on the DNS cache server, and the number of trained models equals the number of DNS cache servers in the enterprise network. For instance, if an enterprise network installs several DNS cache servers for each department, corresponding models should be created for each of them. This model creation policy stems from the fact that the locality of the DNS traffic differs for each DNS cache server [51], which makes it necessary to tune the model for each of them. Because our proposed system monitors multiple DNS clients in parallel, the system has to create a number of instances of the trained model equal to the number of clients. The required storage for an LSTM-based filter depends on the number of units, weights, and clients (see Appendix A for a detailed explanation).

VI. EVALUATION OF FILTERS

A. Experimental Setup

In this section, we describe the procedure for creating and evaluating the rule-based and LSTM filters. To create the filters, we used the 31-day DNS traffic dataset introduced in Section IV-A. The initial access entries were created using the data of the first 24 days. The access entries were then used to calculate AMC_t (t = 100, 200, ..., 1200) for each DNS query generated by each DNS client for the remaining seven days. The training set comprised more than 350,000 legitimate queries. AMC_t was not computed for the first t s because the data were not sufficient. The computed AMC_t vectors, which were time-series data, comprised the training dataset for creating the filters.

The additional procedures for building the LSTM filters can be explained as follows. We preprocessed the dataset using the
standard scaler.² The training dataset obtained from each client was divided according to the batch size, and data that were not consistent with the batch size were excluded. The training dataset was fed into a stateful LSTM, which can remember past trends better than the stateless LSTM. As for the parameters of the LSTM, the timestep was set to 100, the batch size was set to 32, the input dimension was set to 1 because only AMC_t was used, and the number of LSTM cells was set to 1, 2, 4, 8, and 16. We used the Adam optimizer [52]. The loss function was the mean squared error, and when the difference between the current error and the previous error was less than 10^-3 or the number of epochs reached 30, we terminated training. The LSTM was trained on a client-by-client basis, and the state of the model was reset every time the client was switched.

The evaluation method of the filters is described below. For the rule-based filters, as mentioned in Section V-A, we determined multiple thresholds from the training dataset to classify a query as an attack or not on the basis of the AMC_t. In our experiments, we considered 10,000 thresholds (from 0.01% to 100% with 0.01% increments). For the LSTM filters, we used the trained LSTM model to predict the training set and calculate the prediction error, that is, the squared error between the actual and predicted AMC_t. As mentioned in Section V-B, multiple thresholds can be used to classify a prediction error as an attack or not. In our experiments, we considered 10,000 thresholds (from 0.01% to 100% with 0.01% increments).

We then extracted the AMC_t for each DNS query from the DNS traffic data generated in Scenarios 1, 2, and 3, which were described in Section IV-A. The initial access list used to obtain AMC_t was created from the 31-day DNS traffic dataset described in Section IV-A. Similar to the training, the AMC_t was not computed for the first t s because of insufficient data. The test dataset for evaluating the filters was the computed AMC_t, which was time-series data. The rule-based filters directly classified the activity of the test dataset using their computed thresholds. On the other hand, the LSTM-based filters classified the activity by first predicting it, then computing the prediction error, and finally comparing the error with their corresponding thresholds. For both filters, we evaluated the receiver operating characteristic (ROC) curve (this curve was plotted using the true positive rate³ and false positive rate⁴), area under the curve (AUC) score, and accuracy.⁵ The speed of the attack detection was defined as follows:

$$x_{\{\mathrm{filter},t\}} = N_{\mathrm{FN}}^{\{\mathrm{filter},t\},1} + N_{\mathrm{FN}}^{\{\mathrm{filter},t\},2} + N_{\mathrm{FN}}^{\{\mathrm{filter},t\},3}$$

and

$$\mathrm{Speed}_{\{\mathrm{filter},t\}} = 1 - \frac{x_{\{\mathrm{filter},t\}} - x_{\min}}{x_{\max} - x_{\min}}.$$

Here, x_{filter,t} is the summation of N_FN^{{filter,t},1}, N_FN^{{filter,t},2}, and N_FN^{{filter,t},3}, which are the numbers of false negatives required for the model {filter, t} (there were 72 models because we prepared one rule-based model and five LSTM models for 12 values of t) to create the first alarm (as the filter initially predicted the input as legitimate) in Scenarios 1, 2, and 3, respectively, and x_min and x_max were the minimum and maximum of x_{filter,t}, respectively. When Speed_{filter,t} is 1.0, the model {filter, t} is the one that raises the earliest detection alarm. Note that in this paper, positives indicate that the queries are (a) the ones produced by the tunneling client for 20 min during the tunneling traffic generation period or (b) the ones produced within t seconds after the tunneling traffic generation period (the red curve illustrates the queries in Figs. 4 and 5). The other queries were labeled negatives.

² We also used the power transformer to preprocess the dataset; however, the models created using the preprocessed dataset exhibited worse prediction performance (see Appendix B).
³ (# of true positives)/(# of true positives + # of false negatives).
⁴ (# of false positives)/(# of false positives + # of true negatives).
⁵ (# of true positives + # of true negatives)/(# of queries).

Fig. 14. ROC curves for the test dataset of Scenarios 1, 2, and 3 (AMC_600).

B. Results

Fig. 14 shows the ROC curves for the test dataset of Scenarios 1, 2, and 3, when t was 600. From the figure, when the false positive rate was over 0.025, the true positive rate was over 0.91, which indicated that our filters could classify legitimate and malicious DNS queries correctly with high probability. By contrast, in Fig. 15, which shows the ROC curves for the test dataset of only Scenario 3, when t was 600, the classification performance deteriorated, compared to Fig. 14. This is because, as mentioned in Section IV-B, the larger the tunneling transmission interval, the harder it was to detect the DNS tunneling attack. From these figures, the rule-based filter outperformed the LSTM-based one. In checking the positives classified as negatives (i.e., legitimate queries) by the filters, we found that the LSTM filter sometimes identified several positives as legitimate queries because it evaluated the queries on the basis of a threshold computed from the prediction error, and the prediction error could become low at some point, even during the tunneling attack. Meanwhile, the rule-based filter evaluated the queries with a threshold computed directly from the AMC. When the prediction error obtained by the LSTM filter fell below the predetermined threshold and the AMC exceeded the threshold for the rule-based filter, the rule-based filter still classified the queries as anomalous.
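The percentile-based threshold selection of Section V-A and the true and false positive rates used for the ROC curves can be sketched as follows. This is an illustrative reading rather than the authors' code: the function names and toy values are hypothetical, and the rule "minimum value among the top (100 - p)%" is implemented here with a simple nearest-rank choice.

```python
import math


def percentile_threshold(training_values, percentile):
    """Smallest training value inside the top (100 - percentile)%."""
    ordered = sorted(training_values)
    top_count = max(1, math.ceil(len(ordered) * (100 - percentile) / 100))
    return ordered[-top_count]


def rates(test_values, is_positive, threshold):
    """Return (true positive rate, false positive rate) for one threshold."""
    tp = fp = fn = tn = 0
    for value, positive in zip(test_values, is_positive):
        alarm = value > threshold  # anomalous when the value exceeds it
        if positive and alarm:
            tp += 1
        elif positive:
            fn += 1
        elif alarm:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), fp / (fp + tn)


# Toy data: training AMC values 0..99 and five labeled test queries.
training_amc = list(range(100))
threshold_90 = percentile_threshold(training_amc, 90)
tpr, fpr = rates(
    [0, 95, 50, 80, 92],
    [False, True, False, True, False],
    threshold_90,
)
```

Repeating the `rates` call for every percentile from 0.01 to 100 in 0.01 steps would give the points of one ROC curve. The same two functions apply to the LSTM filter if the AMC values are replaced by squared prediction errors.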
TABLE VII
AUC SCORES FOR THE TEST DATASET OF SCENARIOS 1, 2, AND 3

TABLE VIII
AUC SCORES FOR THE TEST DATASET OF SCENARIO 3

Fig. 15. ROC curves for the test dataset of Scenario 3 (AMC_600).

As shown by the results reported in Table VII, almost all of the AUC scores were over 0.90, with a few exceptions (marked in bold). However, from the results in Table VIII, we observed a considerably different situation. Scenario 3 was the most challenging in the context of DNS tunneling detection. The results indicate that the efficiency of the filter not only depended on the number of units in the LSTM model but was also strongly related to the time interval t, which was also the size of the sliding window. It appears that the value of t significantly influenced the quality of the detections. More interestingly, for the same value of t, we observed different results, depending on the number of units in the LSTM model. This suggests that for more challenging settings where exfiltration was performed over large periods of time, the model must be able to identify a larger number of features on varying scales. Therefore, one possible future research direction is to investigate the effectiveness of multi-scale ensemble LSTM models for detecting DNS tunneling attacks over large periods of time. Another interesting question is whether there is a correlation between the number of units in the LSTM model and the exfiltration period.

We fixed the false positive rate at 0.025 and discussed the accuracy and speed of attack detection. The accuracy for the entire test dataset for Scenarios 1, 2, and 3 is shown in Table IX. Table X shows the accuracy for the test dataset for only Scenario 3. These tables indicate that both filters have a high accuracy (ranging from 0.9697 to 0.9763).

Fig. 16. Pareto front (false positive rate = 0.025). There are 12 models per filter (each with a different value of t (100, 200, ..., 1200)) and there exist 9 non-dominated solutions. Although the rule-based filters tend to achieve higher true positive rates than LSTM filters, the LSTM filters tend to detect attacks more quickly.

Fig. 16 illustrates the Pareto front based on the true positive rate and the speed of the attack detection. As shown in Figs. 14 and 15, rule-based filters tended to achieve higher true positive rates than LSTM filters. In exchange, the LSTM filters tend to detect attacks more quickly than rule-based filters. This benefit comes from the fact that the LSTM filters predicted the time-series data and could thus detect anomalies faster. The metric of speed is important to prevent data exfiltration;
once the first alarm goes off, the firewall operator can carefully examine the anomalous client and determine whether the client is to be isolated. In this way, the total amount of leaked data can be reduced as much as possible. Table XI summarizes the values of t and the thresholds of the 9 non-dominated solutions sorted by true positive rate.

TABLE IX
ACCURACY FOR THE TEST DATASET IN SCENARIOS 1, 2, AND 3 (FALSE POSITIVE RATE = 0.025)

TABLE X
ACCURACY FOR THE TEST DATASET IN SCENARIO 3 (FALSE POSITIVE RATE = 0.025)

TABLE XI
VALUES OF t AND THRESHOLDS OF THE 9 NON-DOMINATED SOLUTIONS SORTED BY TRUE POSITIVE RATE (NOTE THAT THE THRESHOLD OF THE RULE-BASED FILTER IS AMC WHILE THAT OF THE LSTM FILTER IS THE PREDICTION ERROR OF THE CORRESPONDING AMC-FORECASTING LSTM MODEL)

VII. DISCUSSION

Based on the evaluation in Section VI, we conclude that the rule-based filter achieves a higher DNS tunneling attack detection rate than the LSTM filter, which, however, detects the attack faster, while both maintain a low misdetection rate. From this perspective, enterprise network operators can deploy our proposed monitoring and filtering system with a staged strategy. For example, first, based on the LSTM filter, the operators identify a suspicious client. After the first alarm raised by the LSTM filter, the operators still allow the client to produce queries. However, these queries, including the tunneling ones, should be resolved by only the connected DNS cache server without the iterative query process, such that the unresolvable ones do not get forwarded to the outside (i.e., data exfiltration is eliminated at this point). Then, using a rule-based filter, at a defined point, the operators can determine whether the client is infected by malware. Combining the filters in this way gives enterprise network operators a procedure for guarding against data exfiltration through DNS tunneling.

In this paper, we proposed a method for detecting DNS tunneling that focuses on DNS clients rather than domain names. This is because our goal was to prevent information leakage by detecting DNS clients infected with malware in targeted attacks. As discussed in Section II-C, some researchers have focused on domain names and proposed filters whose design guideline was to identify the domain names used to perform DNS tunneling [11], [14], [18], [20], [21]. The shortcoming of such methods is that they fail when the attackers and their malware change tactics and utilize several domain names for the attacks. Our proposed filter does not focus on the domain name; rather, it detects attacks based on whether access misses have occurred, thereby addressing the above drawback.

The proposed filters can be easily integrated with the conventional detection methods introduced in Section II-C. For example, when malware attempts to leak a file through DNS tunneling, a countermeasure can detect the attack with length-based features such as the FQDN length and the longest label length, which are used in payload analysis. RFC 1035 [53] defines the maximum total length of a domain name (dots included) and a label as 255 characters and 63 characters, respectively. In the dataset used in [23], more than 99% of the queries included fewer than 80 characters. From the malware's perspective, improving the information leakage throughput means adding more data to each query, which makes the FQDN longer. However, this type of malicious FQDN should be filtered out by a countermeasure based on the statistics of the FQDN length. Consequently, to circumvent that filter, the malware is forced to generate more malicious queries, which causes more access misses, thus increasing the likelihood of the proposed filter detecting the attack. By combining our proposal with conventional filters, we can create a more resilient firewall.

Our experiments were carried out by utilizing a DNS traffic dataset from 21 clients in our laboratory. We expect that,
even if the number of clients increases, the robustness of the cache-property-aware features will be assured. This is because the robustness is related to the queried FQDN ranking versus the number of DNS queries from all the clients (see Section IV-B). Popular FQDNs are repeatedly requested by the clients, roughly adhering to Zipf's law, and hence, access misses do not often happen under typical human behavior. Hasegawa et al. [51] observed Zipf's law even in a large DNS traffic dataset captured at a DNS cache server in a campus network. This indicates that our solution could be applied even when the number of clients increases. Evaluating our proposed system with a larger number of clients is left for future work.

However, the proposed method is limited in one regard; it cannot effectively identify low-throughput attacks, as shown in Sections IV and VI. We believe that it is still difficult to detect such attacks, even using the existing methods introduced in Section II-C. To detect low-throughput attacks, one possible approach is to employ the proposed cache-property-aware features. For instance, when the information on the access misses from a DNS client is recorded over a long period of time and the total number of access misses for one client is higher than that of other DNS clients at a certain point, the client could be suspected of being infected with malware. In other words, it is essential to propose a method for detecting low-throughput attacks by long-term monitoring rather than short-term monitoring.

VIII. CONCLUSION

Various countermeasures against DNS tunneling have been proposed; however, they are based on DNS tunneling features that can be easily obfuscated by malicious entities mimicking legitimate ones. Therefore, conventional approaches are not robust against feature obfuscation. To solve the issue, we focused on the nature of DNS tunneling. When a tunneling client sends a malicious query to the tunneling server, the query definitely causes a cache miss on the DNS cache server to which the client connects. Based on this observation, we proposed cache-property-aware features for DNS tunneling detection. Our extensive experiments revealed that the access miss count can clearly reveal DNS tunneling traffic that generates tunneling queries within a reasonable time interval, compared to general legitimate query traffic. Moreover, we exploited the cache-property-aware features to develop the rule-based and LSTM filters to counter DNS tunneling. The DNS tunneling attack detection rate of the rule-based filter was higher than that of the LSTM filter, which instead detected the attack more quickly, while both maintained a low misdetection rate.

In future work, we will tackle low-throughput information leakage DNS tunneling attacks. We believe that the proposed cache-property-aware features can be applied for detection based on long-term monitoring; therefore, we will further investigate the “trace” of DNS tunneling against such an advanced attack.

APPENDIX A
FILTER STORAGE REQUIREMENTS

The required storage for an LSTM-based filter depends on the number of units, weights, and clients. One LSTM unit typically has two stored values for the states: the “cell state” and the “hidden state”. The number of weights for 1-, 2-, 4-, 8-, and 16-unit LSTMs is 18, 43, 117, 361, and 1233, respectively. Note that the LSTM weights and the threshold can be shared (they do not change for each client) and can therefore be stored only once, whereas the states need to be maintained independently for each client. Thus, assuming that each state, weight, and threshold is expressed by 8 bytes, the required storage for a rule-based system is just 8 bytes, while the required storage for the LSTM-based filters is 8 * (2 * N * m + w + 1) bytes, where N is the number of clients, m is the number of LSTM units, and w is the number of weights. More specifically, 1-, 2-, 4-, 8-, and 16-unit LSTMs require (16N + 152), (32N + 352), (64N + 944), (128N + 2896), and (256N + 9872) bytes, respectively. If, for example, there are 1000 clients, the most memory-intensive model (16 units) requires 265872 bytes (approximately 260 KB).

APPENDIX B
DATA PREPROCESSING

Similar to the procedure described in Section VI, we performed experiments to verify the influence of two data preprocessing methods: standard scaler and power transformer. Fig. 17 shows the comparison of the actual AMC_100 and the AMC_100 predicted with the standard scaler and with the power transformer for one client for one day in the training set. From the figure, we can observe that standard scaling enables accurate prediction of the AMC (in the figure, the blue and pink lines overlap, so the overlapping part appears purple), whereas the power transformer does not. This fact was most apparent in the case where t was set to a smaller value. Thus, in this study, we adopted a standard scaler as the data preprocessing method.

Fig. 17. Comparison of actual AMC_100, predicted AMC_100 with standard scaler, and predicted AMC_100 with power transformer for one client in one day in the training set.

ACKNOWLEDGMENT

The authors thank the anonymous reviewers for their constructive comments.

REFERENCES

[1] N. Ishikura, D. Kondo, I. Iordanov, V. Vassiliades, and H. Tode, “Cache-property-aware features for DNS tunneling detection,” in Proc. 23rd Conf. Innovat. Clouds Internet Netw. Workshops (ICIN), Paris, France, 2020, pp. 216–220.
[2] IT Security Risks Survey 2014: A Business Approach to Managing Data Security Threats. Accessed: Mar. 18, 2021. [Online]. Available: https://media.kaspersky.com/en/IT_Security_Risks_Survey_2014_Global_report.pdf
[3] (2015). Understanding Targeted Attacks: The Impact of Targeted Attacks. [Online]. Available: https://www.trendmicro.com/vinfo/us/security/news/cyber-attacks/the-impact-of-targeted-attacks
[4] M. Al-Kasassbeh and T. Khairallah, “Winning tactics with DNS tunnelling,” Netw. Security, vol. 2019, no. 12, pp. 12–19, 2019.
[5] IDC 2020 Global DNS Threat Report. Accessed: Mar. 18, 2021. [Online]. Available: https://www.efficientip.com/resources/idc-dns-threat-report-2020/
[6] (2014). New FrameworkPOS Variant Exfiltrates Data via DNS Requests. [Online]. Available: https://www.gdatasoftware.com/blog/2014/10/23942-new-frameworkpos-variant-exfiltrates-data-via-dns-requests
[7] (2019). DNS Tunneling in the Wild: Overview of OilRig’s DNS Tunneling. [Online]. Available: https://unit42.paloaltonetworks.com/dns-tunneling-in-the-wild-overview-of-oilrigs-dns-tunneling/
[8] D. Naylor et al., “Multi-context TLS (McTLS): Enabling secure in-network functionality in TLS,” in Proc. ACM Conf. Spec. Interest Group Data Commun. (SIGCOMM), 2015, pp. 199–212.
[9] K. Born and D. Gustafson, “Detecting DNS tunnels using character frequency analysis,” 2010. [Online]. Available: https://arxiv.org/abs/1004.4358.
[10] W. Ellens, P. Żuraniewski, A. Sperotto, H. Schotanus, M. Mandjes, and E. Meeuwissen, “Flow-based detection of DNS tunnels,” in Proc. IFIP Int. Conf. Auton. Infrastruct. Manage. Security (AIMS), 2013, pp. 124–135.
[11] V. Paxson et al., “Practical comprehensive bounds on surreptitious communication over DNS,” in Proc. 22nd USENIX Conf. Security, Aug. 2013, pp. 17–32.
[12] C. Qi, X. Chen, C. Xu, J. Shi, and P. Liu, “A bigram based real time DNS tunnel detection approach,” Procedia Comput. Sci., vol. 17, pp. 852–860, 2013.
[13] K. Xu, P. Butler, S. Saha, and D. Yao, “DNS for massive-scale command and control,” IEEE Trans. Depend. Secure Comput., vol. 10, no. 3, pp. 143–153, May/Jun. 2013.
[14] A. M. Kara, H. Binsalleeh, M. Mannan, A. Youssef, and M. Debbabi, “Detection of malicious payload distribution channels in DNS,” in Proc. IEEE Int. Conf. Commun. (ICC), Sydney, NSW, Australia, 2014, pp. 853–858.
[15] M. Aiello, M. Mongelli, and G. Papaleo, “DNS tunneling detection through statistical fingerprints of protocol messages and machine learning,” Int. J. Commun. Syst., vol. 28, no. 14, pp. 1987–2002, 2015.
[16] M. Aiello, M. Mongelli, E. Cambiaso, and G. Papaleo, “Profiling DNS tunneling attacks with PCA and mutual information,” Logic J. IGPL, vol. 24, no. 6, pp. 957–970, Dec. 2016.
[17] A. L. Buczak, P. A. Hanke, G. J. Cancro, M. K. Toma, L. A. Watkins, and J. S. Chavis, “Detection of tunnels in PCAP data by random forests,” in Proc. 11th Annu. Cyber Inf. Security Res. Conf. (CISRC), 2016, pp. 1–4.
[18] A. Das, M.-Y. Shen, M. Shashanka, and J. Wang, “Detection of exfiltration and tunneling over DNS,” in Proc. 16th IEEE Int. Conf. Mach. Learn. Appl. (ICMLA), Cancun, Mexico, 2017, pp. 737–742.
[19] C.-M. Lai, B.-C. Huang, S.-Y. Huang, C.-H. Mao, and H.-M. Lee, “Detection of DNS tunneling by feature-free mechanism,” in Proc. IEEE Conf. Depend. Secure Comput. (DSC), Kaohsiung, Taiwan, 2018, pp. 1–2.
[20] J. Steadman and S. Scott-Hayward, “DNSxD: Detecting data exfiltration over DNS,” in Proc. IEEE Conf. Netw. Funct. Virtualization Softw. Defined Netw. (NFV-SDN), Verona, Italy, 2018, pp. 1–6.
[21] A. Nadler, A. Aminov, and A. Shabtai, “Detection of malicious and low throughput data exfiltration over the DNS protocol,” Comput. Security, vol. 80, pp. 36–53, Jan. 2019.
[22] C. Liu, L. Dai, W. Cui, and T. Lin, “A byte-level CNN method to detect DNS tunnels,” in Proc. IEEE 38th Int. Perform. Comput. Commun. Conf. (IPCCC), London, U.K., 2019, pp. 1–8.
[25] (2011). Morto Worm Sets a (DNS) Record. [Online]. Available: https://community.broadcom.com/symantecenterprise/communities/community-home/librarydocuments/viewdocument?DocumentKey=268f079a-2bb8-4775-9ef9-1b02e32ca55d&CommunityKey=1ecf5f55-9545-44d6-b0f4-4e4a7f5f5e68&tab=librarydocuments
[26] dnscat2. Accessed: Mar. 18, 2021. [Online]. Available: https://github.com/iagox86/dnscat2
[27] G. K. Zipf, Human Behavior and the Principle of Least Effort. Cambridge, MA, USA: Addison-Wesley, 1949.
[28] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, 1997.
[29] T. van Leijenhorst, K.-W. Chin, and D. Lowe, “On the viability and performance of DNS tunneling,” in Proc. 5th Int. Conf. Inf. Technol. Appl. (ICITA), 2008, pp. 560–566.
[30] L. Nussbaum, P. Neyron, and O. Richard, “On robust covert channels inside DNS,” in Proc. IFIP Int. Inf. Security Conf. (IFIP SEC), 2009, pp. 51–62.
[31] M. Aiello, A. Merlo, and G. Papaleo, “Performance assessment and analysis of DNS tunneling tools,” Logic J. IGPL, vol. 21, no. 4, pp. 592–602, Aug. 2013.
[32] iodine. Accessed: Mar. 18, 2021. [Online]. Available: https://github.com/yarrick/iodine
[33] dns2tcp. Accessed: Mar. 18, 2021. [Online]. Available: https://github.com/alex-sector/dns2tcp
[34] D. Raman et al., “DNS tunneling for network penetration,” in Proc. Int. Conf. Inf. Security Cryptol. (ICISC), 2012, pp. 65–77.
[35] Backdoor.Win32.Denis. Accessed: Mar. 18, 2021. [Online]. Available: https://otx.alienvault.com/pulse/590314fb6575a03746de87a8
[36] BernhardPOS. Accessed: Mar. 18, 2021. [Online]. Available: https://otx.alienvault.com/pulse/55a5b4eeb45ff55fb194e69e
[37] Cobalt Strike. Accessed: Mar. 18, 2021. [Online]. Available: https://www.cobaltstrike.com/
[38] DET. Accessed: Mar. 18, 2021. [Online]. Available: https://github.com/sensepost/DET
[39] DNScat. Accessed: Mar. 18, 2021. [Online]. Available: http://tadek.pietraszek.org/projects/DNScat/
[40] DNSExfiltrator. Accessed: Mar. 18, 2021. [Online]. Available: https://github.com/Arno0x/DNSExfiltrator
[41] (2017). Covert Channels and Poor Decisions: The Tale of DNSMessenger. [Online]. Available: https://blogs.cisco.com/security/talos/covert-channels-and-poor-decisions-the-tale-of-dnsmessenger
[42] (2018). DNSpionage Campaign Targets Middle East. [Online]. Available: https://blog.talosintelligence.com/2018/11/dnspionage-campaign-targets-middle-east.html
[43] (2016). Three Month FrameworkPOS Malware Campaign Nabs ~43,000 Credit Cards From Point of Sale Systems. [Online]. Available: https://www.anomali.com/blog/three-month-frameworkpos-malware-campaign-nabs-43000-credits-cards-from-poi
[44] (2009). OzymanDNS—Tunneling SSH Over DNS. [Online]. Available: https://malicious.link/post/2009/2009310ozymandns-tunneling-ssh-over-dns-html/
[45] Reverse DNS Shell. Accessed: Mar. 18, 2021. [Online]. Available: https://github.com/ahhh/Reverse_DNS_Shell
[46] TCP-over-DNS. Accessed: Mar. 18, 2021. [Online]. Available: https://analogbit.com/software/tcp-over-dns/
[47] K. Fujiwara, A. Sato, and K. Yoshida, “DNS traffic analysis—CDN and the world IPv6 launch,” J. Inf. Process., vol. 21, no. 3, pp. 517–526, 2013.
[48] A. Shaikh, R. Tewari, and M. Agrawal, “On the effectiveness of DNS-based server selection,” in Proc. IEEE INFOCOM Conf. Comput. Commun. 20th Annu. Joint Conf. IEEE Comput. Commun. Soc., vol. 3. Anchorage, AK, USA, 2001, pp. 1801–1810.
[49] G. C. M. Moura, J. Heidemann, R. D. O. Schmidt, and W. Hardaker, “Cache me if you can: Effects of DNS time-to-live,” in Proc. ACM Internet Meas. Conf. (IMC), 2019, pp. 101–115.
[50] K. Hornik, “Approximation capabilities of multilayer feedforward networks,” Neural Netw., vol. 4, no. 2, pp. 251–257, 1991.
[51] K. Hasegawa, D. Kondo, and H. Tode, “FQDN-based whitelist filter on a DNS cache server against the DNS water torture attack,” in Proc. IFIP/IEEE Int. Symp. Integr. Netw. Manage. (IM), May 2021. [Online]. Available: https://im2021.ieee-im.org/program/poster-sessions
[23] J. Ahmed, H. H. Gharakheili, Q. Raza, C. Russell, and V. Sivaraman, “Monitoring enterprise DNS queries for detecting data exfiltration from
internal hosts,” IEEE Trans. Netw. Service Manag., vol. 17, no. 1, [52] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,”
pp. 265–279, Mar. 2020. 2014. [Online]. Available: https://arxiv.org/abs/1412.6980.
[24] S. Chen, B. Lang, H. Liu, D. Li, and C. Gao, “DNS covert channel [53] “Domain names—Implementation and specification,” Internet Eng. Task
detection method using the LSTM model,” Comput. Security, vol. 104, Force, RFC 1035, 1987. [Online]. Available: https://tools.ietf.org/html/
May 2021, Art. no. 102095. rfc1035

Naotake Ishikura received the B.S. and M.S. degrees in engineering from Osaka Prefecture University, Osaka, Japan, in 2019 and 2021, respectively. His current research interest includes network security.

Daishi Kondo (Member, IEEE) received the B.S. degree in engineering from Osaka University, Osaka, Japan, in 2013, the M.A.S. degree in interdisciplinary information studies from the University of Tokyo, Tokyo, Japan, in 2015, and the Ph.D. degree in computer science from the University of Lorraine, LORIA (CNRS UMR 7503), Inria Nancy-Grand Est, Nancy, France, in 2018. He is currently an Assistant Professor with Osaka Prefecture University. His research interests include information-centric networking, network security, privacy, and peer-to-peer networking.

Vassilis Vassiliades received the B.Sc. degree in computer science from the University of Cyprus (UCY) in 2007, the M.Sc. degree in intelligent systems engineering from the University of Birmingham, U.K., in 2008, and the Ph.D. degree in computer science from UCY in 2015. He is currently a Team Leader with the CYENS Centre of Excellence (formerly known as RISE), Cyprus, and an Associate Research Fellow with UCY. He was a Postdoctoral Fellow and a Research Engineer with Inria Nancy, France, from 2015 to 2018 and a Research Associate with UCY from 2015 to 2019 and RISE in 2019. His research interests lie in the areas of artificial intelligence and robotics, with emphasis on machine learning and evolutionary computation.

Iordan Iordanov received the B.S. degree in applied mathematics and the M.S. degree in applied and computational mathematics from the University of Crete, Greece, in 2013 and 2015, respectively, and the Ph.D. degree in computer science from the University of Lorraine, LORIA (CNRS UMR 7503), Inria Nancy-Grand Est, Nancy, France, in 2019. He is currently a Chief Scientist with Corpy&Co., Inc., Tokyo, Japan. His research interests include applied and computational mathematics, computational geometry, and explainable artificial intelligence.

Hideki Tode (Member, IEEE) received the B.E., M.E., and Ph.D. degrees in communications engineering from Osaka University in 1988, 1990, and 1997, respectively. From 1991 to 2008, he was an Assistant Professor and an Associate Professor with Osaka University. He has been a Professor with the Department of Computer Science and Intelligent Systems, Graduate School of Engineering, Osaka Prefecture University since 2008. His current research interests include architectures and controls for optical networks, wireless multihop networks, future Internet, and content distribution networks. He is a Fellow of the Institute of Electronics Information and Communication Engineers, Japan.
