A Survey on Network Methodologies for Real-Time Analytics of Massive IoT Data and Open Research Issues
Abstract—With the widespread adoption of the Internet of Things (IoT), the number of connected devices is growing at an exponential rate, which is contributing to ever-increasing, massive data volumes. Real-time analytics on the massive IoT data, referred to as the "real-time IoT analytics" in this paper, is becoming the mainstream with an aim to provide immediate or non-immediate actionable insights and business intelligence. However, the analytics network of the existing IoT systems does not adequately consider the requirements of the real-time IoT analytics. In fact, most researchers overlooked an appropriate design of the IoT analytics network while focusing much on the sensing and delivery networks of the IoT system. Since the IoT analytics network has often been taken for granted, in this survey paper, we aim to review the state-of-the-art of the analytics network methodologies, which are suitable for real-time IoT analytics. In this vein, we first describe the basics of the real-time IoT analytics, use cases, and software platforms, and then explain the shortcomings of the network methodologies to support them. To address those shortcomings, we then discuss the relevant network methodologies which may support the real-time IoT analytics. Also, we present a number of prospective research problems and future research directions focusing on the network methodologies for the real-time IoT analytics.

Index Terms—The Internet-of-Things (IoT), real-time analytics, data center network, hyper-convergence, edge analytics network.

Manuscript received June 28, 2016; revised November 18, 2016, January 24, 2017, and April 7, 2017; accepted April 9, 2017. Date of publication April 14, 2017; date of current version August 21, 2017. This work was supported by the Strategic International Research Cooperative Program, Japan Science and Technology Agency. (Corresponding author: Shikhar Verma.) The authors are with the Graduate School of Information Sciences, Tohoku University, Sendai 980-8579, Japan (e-mail: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]). Digital Object Identifier 10.1109/COMST.2017.2694469

I. INTRODUCTION

THE PROLIFERATION of the Internet of Things (IoT), which connects billions of machines, devices, and things, is rapidly transforming the way businesses extract value from data [1]–[4]. If the massive IoT data generated from this huge number of devices [5]–[8] can be captured in real-time and acted upon, preventive maintenance can be developed to pre-empt the performance problems on equipment and appliances, and it can also help organizations adopt new business models, streamline operational processes, and create more innovative products and services across various industries [9]–[15]. Throughout this paper, we refer to this real-time analytics of the massive IoT data as the "real-time IoT analytics". The real-time IoT analytics can be formally defined as the process of providing optimized IoT services, which include control instructions, performance enhancement of IoT applications, or innovative IoT business services, by analyzing the massively collected IoT data using related analytics resources such as network, computation, and storage, as soon as the IoT data enters the system and within a fixed time.

Despite its importance, the real-time analytics of the massive IoT data is, indeed, in its infancy. In particular, an adequate network infrastructure to support the massive IoT data processing and analysis is required according to the work conducted by Ge et al. [16]. Indeed, a detailed review of the network architecture of the IoT, from the real-time analytics standpoint, is yet to appear in the literature. As depicted in Fig. 1, the entire IoT network may be divided into three main parts, namely the sensing, delivery, and analytics networks.
Fig. 1. The IoT network architecture comprising sensing, delivery, and analytics network. Note that existing research works are mostly on the sensing and
delivery networks while the analytics networks (i.e., our focus in this paper) have often been overlooked.
As shown in the figure, the sensing network is used to collect the information from the physically connected things. As there exist numerous surveys regarding the IoT system architectures and their enabling technologies [17]–[20], much focus has been given in the literature toward the sensing and delivery networks of the IoT. On the other hand, researchers have often taken the networks required to support the real-time IoT analytics for granted. However, for the real-time IoT analytics, which has a critical time constraint for task completion, choosing the appropriate analytics architecture is crucial. Therefore, in this paper, we focus on the network methodologies required for the real-time IoT analytics.

The contributions of our work in this paper are as follows.

1) Through a rigorous study of the existing surveys on the IoT network systems, we identify the research gap in the existing communication and networking technologies of the IoT, as summarized in Table I, which emphasizes the need to survey the state-of-the-art of the network methodologies for real-time IoT analytics.

2) Next, we present a comprehensive taxonomy of the IoT analytics and stress the importance of real-time IoT analytics. In addition, we describe the various use cases and software platforms of real-time IoT analytics, and highlight their network requirements.

3) Also, we provide an extensive survey on the network architectures that may cater to the needs of the real-time IoT analytics. In this vein, we review and rethink data center networks (DCN) and discuss which ones may be relevant to the real-time IoT analytics. Among the best suited network architectures that we identify for the real-time analytics of the massive IoT data, hyper-convergence, in particular, appears to be useful due to its unique virtualization architecture and significantly higher fault tolerance in contrast with the other networking architectures.

4) Next, we discuss the required network support for parallel mining of the massive IoT data. In addition, we introduce the edge analytics networks to demonstrate how some of the real-time IoT analytics tasks may be shifted from the data center to the network edge to decrease the service delay.

5) As the final contribution, we identify a number of open research issues, and discuss the future directions that the researchers may adopt to effectively address the network requirements for the real-time IoT analytics.

The remainder of this paper is organized as follows. Section II provides the relevant research works. Section III discusses the IoT analytics taxonomy, and discusses the network requirements of the various real-time IoT analytics use cases and software platforms. Section IV provides an extensive survey on the network methodologies for real-time IoT analytics. Then, Section V discusses a number of open research issues from the viewpoint of the network support for the real-time IoT analytics. Finally, the paper is concluded in Section VI.

II. RELATED WORK

The IoT conveys different meanings to different people. From the developers' perspective, the IoT is a huge opportunity to put together the correct combination of standards, tools, and software technologies that they were probably already doing under another name. On the other hand, to the enterprise organizations, the IoT, albeit an intricate mix of technical standards and conflicting opinions, represents a huge potential to benefit their existing users and attract even more customers. To the vendors, the IoT stands out as one of the latest marketing phenomena. From whichever perspective one looks at the IoT, as this paradigm transforms from a concept to reality, a key challenge is how the data created by billions of devices will flow through the system, where the data will end up, and what role analytics will play on the massive IoT data. These challenges must be addressed in the design phase, which is essential to ensure that the right technologies and tools are utilized from the start. Particularly, the network practitioners need to understand the unique challenges in the various "phases" of the IoT, from the massive data sensing/collection and storage to analytics, and formulate the best possible networking solution in each phase. The work in [21] discussed the application, sensing, and network layer technologies involved with the massive IoT data gathering and transferring. In the survey conducted by Meddeb [19], standardization organizations such as the Internet Engineering Task Force (IETF), the International Telecommunication Union (ITU-T), the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC), the Institute of Electrical and Electronics Engineers (IEEE), the European Telecommunications Standards Institute (ETSI), the 3rd Generation Partnership Project (3GPP), and so forth are all highlighting the importance of sensing and delivery networks. However, none of the standards considered in that survey addressed the importance of a robust computing-layer network for the IoT. The survey conducted by Al-Fuqaha et al. [17] provided five layers of the IoT, namely the object, object-abstraction, service management, application, and business layers. In their work, they defined the object layer as the sensing network layer, which collects the large-scale sensor data from physical things. In order to transfer the sensed data to the object-abstraction layer, delivery networks based on technologies such as Radio-frequency identification (RFID), 4G, WiFi, Bluetooth, and so forth are employed.
TABLE I
A SUMMARY OF FOCUSED NETWORK TOPICS OF THE IOT SYSTEM CONDUCTED BY THE EXISTING SURVEYS
The survey in [22] also discussed different IoT standard protocols involved in these layers. For instance, it considered communication protocols as infrastructure protocols, in which only sensing and delivery network protocols were discussed. However, no discussion on the computing network protocols was presented. Thus, it appears that the surveys conducted on the IoT until now have not given much research
attention to the analytics network layer. Instead, the network attributes required for the massive IoT data analytics have been, rather, taken for granted. For instance, in the cloud computing survey, Mineraud et al. [23] described only the cloud software platforms such as "ThingWorx". To illustrate how existing works have often overlooked the computing network design and deployment issues, readers may also refer to the well-known survey on IoT-oriented healthcare technologies conducted by Islam et al. [24]. For instance, the survey considered 6LoWPAN to be the base of the IoT-based healthcare network, and accordingly discussed thoroughly the different issues of the sensing network elements of the considered IoT system. On the other hand, it did not, at all, describe any computing network related issue to deal with the massive IoT data generated by the mentioned sensing networks. A similar observation may be made regarding Razzaque et al.'s survey [20] on the middleware for the IoT systems, which discussed various aspects of the IoT delivery networks, and did not provide any detail on the analytics network architecture. Furthermore, a survey of the Medium Access Control (MAC) layer issues for Machine-to-Machine (M2M) communications was presented in the research work conducted by Rajandekar and Sikdar [25]. However, this work also took into consideration how different MAC layer protocols are suited for M2M communications in an IoT system, particularly from the sensing and delivery networks perspective. On the other hand, Mehaseb et al. [26] surveyed the uplink scheduling techniques of LTE-A for M2M-based communications in the IoT by arguing that the LTE-A is the most applicable technology to support a massive number of devices. Indeed, the LTE-A is a cellular communication technology, which is adopted to deliver the massive IoT data to the data centers for performing analytics. Therefore, this work also considered the delivery network part of the IoT system while not devoting much attention toward the network support required for the IoT analytics. Additionally, the research work conducted by Luong et al. [27] surveyed a wide range of economic and pricing models for data collection and communication in an IoT system. However, their consideration was also limited to only the sensing network. Pricing models in computation networks for analysis of the collected data are necessary from the overall IoT network deployment point of view. However, such computation network-centric issues were overlooked in that work, as in other contemporary surveys [28], [29].

To view the summary of the existing surveys on sensing and delivery networks of the IoT system, please refer to Table I. Thus, on the one hand, it is evident that the existing works in the literature have not effectively surveyed the network support requirement for the computing and analytics space of the IoT system. On the other hand, the massive IoT data requiring real-time analytics may not be adequately supported by the existing networking architectures. To the best of our knowledge, no earlier work has been able to pinpoint the network attributes required by real-time analytics of the massive IoT data. In fact, the contemporary works have also not been able to identify and/or clarify the relationship between the IoT-generated massive data and the real-time IoT analytics. To address this issue, in the following section, the phases of real-time IoT analytics are formally presented, and several use cases are described to demonstrate the viability of real-time analytics for the massive IoT data.

III. OVERVIEW AND NETWORK REQUIREMENTS OF REAL-TIME IOT ANALYTICS

Recent research works have indicated that gathering the massive IoT data alone is not sufficient [41]–[43], and it is critical to analyze every segment of data generated by the IoT devices. With this aim, Ge et al. [16] introduced the concept of IoT analytics. On a surface level, the term IoT analytics may appear to be a simple and straightforward notion. From a more technical viewpoint, the IoT analytics may be defined as the analysis of every segment of the massive IoT data at the right moments in order to obtain the business value and drive intelligent decisions. It is worth noting that this is not necessarily the same as the big data analytics, since the IoT analytics does not deal with all the characteristics of big data processing. In other words, the massive IoT data processing and analysis are challenged by its sheer volume without taking into account the other unique characteristics of big data. A robust understanding of the advanced IoT analytics depends on a systematic approach to ingesting the massive IoT data [44], placing it in the context of the situation at the right moment and the history relating to the use case, and intelligently acting upon it [45]. Extracting the business value from IoT analytics can, indeed, be a tricky proposition [46]. In order to effectively leverage the right data at the right moment to create proactive and predictive business models, IoT analytics is receiving a great deal of research attention in a number of different ways. Therefore, a comprehensive categorization of IoT analytics is essential, which is depicted in Fig. 2. As shown in the taxonomy, IoT analytics can be mainly categorized into historical and proactive analytics. Historical analytics is the traditional (i.e., the simplest and the most common) form of analytics that aims to obtain visual insights from the mining of historical data. It can be further categorized into descriptive [47], [48] and diagnostic analytics [49], [50], which provide visualization or report-based statistics of the IoT system performance and overall malfunction notifications or alerts of the sensors and equipment, respectively. On the other hand, proactive analytics is emerging as a new trend [51], [52] to facilitate actionable insights using the right data-frames of the massive IoT data. As shown in the figure, the proactive analytics on the massive IoT data can be categorized into stream IoT analytics and real-time IoT analytics [53]. While the former deals with batches or streams of IoT data with no or moderately high time constraints, the latter needs to provide the analytical response or output within a strict time-bound as the IoT data arrives in micro-batches [54]. The real-time analytics is swiftly becoming the mainstream, and may be further categorized into predictive and prescriptive types. A typical example of predictive IoT analytics [42], [64], [65] is the real-time visualization of energy usage in a smart building to predict the electricity bill of the users and so forth.
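To make the distinction between stream and real-time IoT analytics more concrete, the following minimal sketch in Python processes IoT readings in micro-batches and flags any batch whose analysis misses its time-bound; the sensor readings and the 100 ms deadline are hypothetical values introduced here for illustration only, and in a stream-analytics setting the same computation would simply be applied to larger batches without the deadline check.

import time
from statistics import mean

DEADLINE_S = 0.100  # assumed per-batch deadline (100 ms) that real-time IoT analytics must respect

def analyze_batch(readings):
    """Toy analytic: summarize a micro-batch of sensor readings."""
    return {"count": len(readings), "mean": mean(readings), "max": max(readings)}

def process_micro_batches(batches):
    """Process each micro-batch and record whether its result met the deadline."""
    results = []
    for batch in batches:
        start = time.monotonic()
        summary = analyze_batch(batch)
        elapsed = time.monotonic() - start
        summary["deadline_met"] = elapsed <= DEADLINE_S
        results.append(summary)
    return results

if __name__ == "__main__":
    # Hypothetical micro-batches of temperature readings from IoT devices.
    micro_batches = [[21.0, 21.4, 22.1], [22.3, 22.9], [23.5, 23.1, 24.0, 23.8]]
    for r in process_micro_batches(micro_batches):
        print(r)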
TABLE II
USE CASES OF REAL-TIME IOT ANALYTICS AND THEIR NETWORK REQUIREMENTS
TABLE III
SOFTWARE PLATFORMS APPLICABLE TO REAL-TIME IOT ANALYTICS AND THEIR NETWORK REQUIREMENTS
As for the example of prescriptive analytics, consider the industrial IoT system which continually monitors the current status of the equipment to recommend whether/when components need to be repaired, upgraded, and so forth [66]. The applications of the real-time IoT analytics are becoming more important in various domains such as smart home/city, industrial IoT, disaster management, smart grid, healthcare, and transportation. The network scales of these applications and the characteristics of the network requirements vary for these different applications. Table II summarizes the network requirements of the real-time IoT analytics applications in the various domains.

As discussed above, the massive IoT data poses a challenge to the real-time analytics of IoT within a tight time-bound. Traditionally, the real-time IoT analytics may be anticipated to happen in the cloud [58], data centers [59], and other server-centric environments [60]. This is because cloud computing emerged as one of the prominent areas for data computation and analytics due to its ability to provide computing-based services by means of sharing networks, storage, servers, applications, and so forth [61], [62]. However, according to Ge et al. [16], the existing networking technologies, such as cloud infrastructures, at their current status, may not be adequate to support the real-time IoT data. The reason behind this is that the existing network infrastructures for analytics are mainly designed for Web analytics, which is significantly different compared to the real-time IoT analytics [55], [56]. While the Web analytics aims to analyze data originating from a single application (e.g., a specific website), the IoT data emanates from different sources and applications. However, compared to its Web analytics counterpart, the IoT analytics is still in its infancy, and researchers and practitioners of the IoT systems are working toward designing software platforms for the real-time IoT analytics [57]. Table III lists the unique features and advantages of the state-of-the-art software platforms, which are useful for real-time IoT analytics. The table also describes why adequate network support is required by these software platforms. In particular, several of the existing platforms supposedly have the capability to provide real-time analytics. However, for the massive IoT data, these platforms, in their current implementations, appear to be inadequate due to a number of performance trade-offs. For instance, distributed built-in memory and other resources
were identified to be a key requirement for carrying out the IoT analytics, which, in turn, lead to significantly large latencies. Therefore, it is evident that designing the IoT analytics only from the software and hardware points of view is not adequate. In other words, the distributed hardware resources need to be interconnected by a robust network so that the software platforms can leverage the available resources to the maximum extent so as to support real-time IoT analytics. In other words, there is a need to minimize the performance trade-offs of the existing analytics platforms by designing a robust network, which is able to efficiently control the flow of the massive IoT data. Therefore, it is important to discuss whether the existing network methodologies can facilitate the real-time IoT analytics. In the next section, we describe the state-of-the-art of network methodologies to support the real-time IoT analytics.

IV. NETWORK METHODOLOGIES FOR THE REAL-TIME ANALYTICS OF THE MASSIVE IOT DATA

As described in the earlier section, there is a serious need to investigate the existing analytics network designs, and determine which designs are robust enough to efficiently connect the distributed hardware resources and control the network flows so as to facilitate real-time IoT analytics. In this vein, in this section, we extensively survey the key network support architectures, which are suitable for real-time IoT analytics. Our survey includes the rethinking of data centers, hyper-converged networks, massively parallel mining network architectures, and edge analytics networks, in order to reduce the transmission time and speed up the data flow, which are essential for supporting real-time IoT analytics.

A. Rethinking Data Center Networks

Recently, data centers have been receiving a great deal of attention from both academia and industry since they offer a cost-effective infrastructure for storing large volumes of data and hosting large-scale service applications. Leading business organizations such as Google, Facebook, Amazon, Microsoft, and so forth have heavily invested in data centers for storage, Web search, information retrieval, and large-scale computation tasks [103]. Despite much investment, the architectures of most of the existing data centers are suitable for the traditional Web analytics, and are still far from being ideal in the context of the real-time IoT analytics. The surge of data from the IoT, along with other large-scale data sources, is contributing to a data growth which is four times faster than that anticipated by Moore's law [68]. As a consequence, the existing data centers are experiencing a tremendous pressure since they need to provide more space, fast computation, and swift response to deal with this "data tsunami", and fulfill the requirements of the real-time IoT analytics. While cloud computing is integrated with the IoT [109]–[111] to address these issues, from the analytics network viewpoint, the cloud is still not able to cope with the real-time constraints. Indeed, there have been numerous works on the data center network design for cloud computing from various perspectives, e.g., in terms of management, scalability, processing infrastructure, and so forth. However, designing the appropriate infrastructure for a data center capable of dealing with the real-time IoT analytics has not been taken into account in the existing literature. Therefore, in this section, we aim to review the data center network architectures from the perspective of the real-time IoT analytics. We broadly categorize the state-of-the-art data center networking architectures into wired and wireless networks as shown in Table IV, and describe the need to rethink the data center networks in order to cater to the needs of the real-time IoT analytics.

1) Wired Topologies: According to the work in [112], most of the state-of-the-art data centers employ Ethernet switches to connect the servers. The work also reveals that the internal networks of the data centers are predominantly constructed with high-speed Ethernet with capacities ranging from 1 to 10 Gbps. However, there exists a multitude of techniques to connect the servers on a data center network, commonly referred to as the "network topologies". The network topology of a data center may be represented as a graph if the servers and switches are regarded as the vertices while the wires/cables are considered as the edges [113]. The objective of a specific network topology adopted in a data center is to improve the performance of the data center in terms of fast access and routing, increased fault tolerance, and so forth.
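As an illustration of this graph view of a data center topology, the minimal sketch below (using the networkx library; the three-tier sizes and redundancy choices are hypothetical) builds a small core/aggregation/edge tree with servers attached to the edge switches and reports simple structural metrics, such as the vertex and link counts, the diameter, and the average shortest path length, of the kind used to compare such topologies.

import itertools
import networkx as nx

def build_three_tier(cores=2, aggs=4, edges=8, servers_per_edge=4):
    """Toy three-tier (core/aggregation/edge) data center topology as a graph:
    switches and servers are vertices, cables are edges."""
    g = nx.Graph()
    core = [f"core{i}" for i in range(cores)]
    agg = [f"agg{i}" for i in range(aggs)]
    edge = [f"edge{i}" for i in range(edges)]
    # Full mesh between core and aggregation switches.
    g.add_edges_from(itertools.product(core, agg))
    # Each edge switch connects to two aggregation switches for redundancy.
    for i, e in enumerate(edge):
        g.add_edge(e, agg[i % aggs])
        g.add_edge(e, agg[(i + 1) % aggs])
    # Servers hang off the edge switches.
    for e in edge:
        for s in range(servers_per_edge):
            g.add_edge(e, f"{e}-srv{s}")
    return g

if __name__ == "__main__":
    g = build_three_tier()
    print("vertices:", g.number_of_nodes(), "links:", g.number_of_edges())
    print("diameter:", nx.diameter(g))
    print("average shortest path length:", round(nx.average_shortest_path_length(g), 2))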
However, because real-time IoT analytics requires continuous querying, even if the data center network condition reaches its maximum capacity, the system does not have the luxury to halt incoming requests. Therefore, the real-time analytics for the massive IoT data requires a dynamic and adaptive network topology. In this vein, we first classify the wired data center topology into two types, namely static and dynamic network topology. While the static network topology cannot be modified after deployment, the latter can be modified and/or reconfigured according to the varying network traffic conditions. In Table IV, we present the suitability and unsuitability of various wired and wireless data center network topologies in the context of real-time IoT analytics, which we discuss in the following.

The state-of-the-art static architectures of the data center network comprise either two- or three-level hierarchies of switches and/or routers. Thus, they exhibit similar traits to a typical tree-based network architecture. Variations of these tree-based architectures are widely deployed in the data center networks of today [114]–[116]. Once deployed, the tree-based data center network architectures cannot be further modified. Notable examples of the static architectures include the basic tree, fat tree, Clos network, DCell, and BCube. From Table IV, it can be observed that none of the static topologies are suitable for real-time IoT analytics, mainly because of their low fault tolerance ability, inefficient resource utilization, oversubscription problem, and so forth.

Due to the shortcomings of the static architectures for data center networks, many researchers have considered employing the optical switching technology to construct dynamic data center networks [106], [117], [118]. While they have been effective for traditional analytics tasks, we need to discuss whether they may also be effectively applied to real-time IoT analytics.
TABLE IV
SUITABILITY OF DATA CENTER NETWORK TOPOLOGIES FOR REAL-TIME ANALYTICS OF MASSIVE IOT DATA
The three flexible dynamic architectures to construct the data center, shown in Table IV, consist in c-Through, Helios, and the Optical Switching Architecture (OSA). These dynamic network topologies may be partly suitable to real-time IoT analytics owing to their dynamic and flexible network traffic control by reconfiguring the optical network based on the traffic conditions on the server racks. In addition, these architectures have high throughput and low latency because of the optical medium of communications, which are desirable features for real-time IoT analytics.

Lesson Learned: The increase of data centers in the next few years will be driven by the emergence of different social networking services, sensor-based services, and a plethora of IoT applications. These applications will not only generate a massive volume of data for analytics but are also anticipated to require a response in real-time. In particular, the various application domains of the IoT need real-time analytics to obtain real-time insights so as to generate value from those data. However, existing data centers are only being leveraged for Web analytics without real-time constraints. As discussed earlier, the static topologies are unable to change their topologies after deployment, and therefore, they cannot control the network flows. However, most of the data centers, today, employ these fixed topologies. Hence, in the current data center deployment settings, the aforementioned static architectures are not able to provide the network requirements for real-time IoT analytics. Also, it is not cost-effective to deploy new data centers and design new topologies only for real-time IoT analytics. Therefore, there is a critical need to improve and rethink these topologies from the real-time IoT analytics viewpoint.

It is also worth noting that the dynamic topologies of the data center networks are flexible and reconfigurable to cater to the needs of real-time IoT analytics. Their good performance may be attributed to the redundant optical networks considered in the discussed dynamic topologies. However, the real-time IoT analytics requires an adaptive network environment which can adapt to the varying network traffic flows so that storage and computing resources may be efficiently utilized so as to meet the real-time analytics requirements in terms of desirable throughput, adaptability to congestion, network scalability, and resiliency. Furthermore, a number of researchers have focused on improving latency-aware task completion in dynamic data center network topologies [106], [119]–[122]. Hence, the dynamic network topologies with such considerations may be considered for the real-time IoT analytics.

2) Wireless Topologies: The discussion in the earlier section revealed that the tree-based, wired network topologies of the current data centers are prone to over-subscription [113], which further worsens when a few "hot" servers with significantly higher traffic than the other servers become the bottleneck. Also, the capacity provided by the wired infrastructure may be inadequate due to varying traffic demands. According to the study on data center network traffic distributions conducted by Kandula et al. [123], the traffic matrix is generally sparse because only a few servers experience a substantially high volume of traffic. In addition, the data center network traffic distributions are highly dynamic in nature. As a consequence, the available link capacities are often under-utilized.
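The skew behind this observation is easy to reproduce; the minimal sketch below (all server counts, flow sizes, and capacities are hypothetical, not taken from [123]) builds a sparse traffic matrix in which a handful of "hot" servers carry most of the traffic and then reports which rack uplinks would exceed their provisioned capacity while the remaining links stay under-utilized.

import random

def sparse_traffic_matrix(n_servers=64, hot_servers=4, seed=7):
    """Toy sparse traffic matrix: a few hot servers send heavy flows,
    everyone else sends light, infrequent flows (values in Gbps)."""
    random.seed(seed)
    hot = set(random.sample(range(n_servers), hot_servers))
    matrix = [[0.0] * n_servers for _ in range(n_servers)]
    for src in range(n_servers):
        for dst in range(n_servers):
            if src == dst:
                continue
            if src in hot and random.random() < 0.5:
                matrix[src][dst] = random.uniform(0.5, 1.0)   # heavy flow
            elif random.random() < 0.02:
                matrix[src][dst] = random.uniform(0.01, 0.05)  # light flow
    return matrix, hot

def uplink_utilization(matrix, servers_per_rack=8, uplink_gbps=10.0):
    """Aggregate per-rack egress traffic and compare it to the rack uplink capacity."""
    n = len(matrix)
    loads = [0.0] * (n // servers_per_rack)
    for src in range(n):
        loads[src // servers_per_rack] += sum(matrix[src])
    return [(rack, load / uplink_gbps) for rack, load in enumerate(loads)]

if __name__ == "__main__":
    matrix, hot = sparse_traffic_matrix()
    print("hot servers:", sorted(hot))
    for rack, util in uplink_utilization(matrix):
        status = "OVERSUBSCRIBED" if util > 1.0 else "under-utilized"
        print(f"rack {rack}: utilization {util:.2f} ({status})")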
Fig. 3. Wireless data center network topologies. a) A purely wireless data center network architecture. b) A hybrid wireless data center network architecture.
As a remedy to this problem, the recent advances in wireless technologies have set the stage for high-data-rate communications with reliability assurance. For instance, the Extremely High Frequency (EHF) band, ranging from 30 to 300 GHz, can be considered as a viable candidate for the high-speed wireless solution in the data center networks. The 60 GHz spectrum, in particular, provides a 7 GHz (57–64 GHz) waveband that is capable of supporting a data rate exceeding 1 Gbps. Also, the relatively small wavelengths of the radio signals support highly directional communications that can increase frequency reuse. Even though the transmission range is typically up to 10 meters, i.e., rather small, it is adequate for the short-distance indoor wireless communications for the server racks in a data center network.

Despite the above-mentioned advantages, a "purely wireless" network [124], as shown in Fig. 3(a), alone may experience difficulty in meeting all the demands (i.e., high capacity, scalability, and fault tolerance) of a data center dealing with large data coming from the IoT and other sources [156]. For example, the capacity of wireless links is usually limited due to the interference and high transmission overhead [107]. As a remedy, the hybrid wireless network topology has emerged to leverage the advantages of the optical and wireless technologies. The main idea of the hybrid paradigm is to introduce additional wireless links to the existing wired topology to construct a hybrid Ethernet/wireless architecture envisioned by Cui et al. [107]. In this vein, the servers are equipped with radios. However, equipping all the servers with radios may lead to increased cost and under-utilization since all radios are not allowed to simultaneously transmit. Hence, as shown in Fig. 3(b), it is reasonable to assign the same set of radios to groups of servers, referred to as Wireless Transmission Units (WTUs). Also, it is worth noting that the server racks do not obstruct line-of-sight transmissions since the radios are deployed on the top of the racks. Since the existing interconnection architectures organize servers in groups (e.g., pods in fat-trees, BCube0s in BCube, and so forth), it is reasonable to consider such groups as the WTU. In other words, a hybrid wireless data center network is constructed by adding the WTU-based wireless network to the existing Ethernet-based wired topology in such a manner that the cost of rearranging the servers is minimal. From Table IV, we can notice that, compared to the purely wireless topology, the better fault tolerance, scalability, high bandwidth, and throughput with low latency make the hybrid wireless architecture suitable for real-time IoT analytics.
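As a rough illustration of how such a hybrid fabric might be used, the sketch below assigns servers to WTUs and routes each flow over the shared wireless link when source and destination belong to the same WTU, falling back to the wired Ethernet path otherwise; the WTU size, link capacities, and flows are hypothetical, and this is only a toy routing rule, not the scheme proposed in [107].

WTU_SIZE = 8           # assumed number of servers sharing one set of radios
WIRELESS_GBPS = 1.0    # assumed capacity of the shared 60 GHz link per WTU
WIRED_GBPS = 10.0      # assumed capacity of the wired Ethernet uplink

def wtu_of(server_id, wtu_size=WTU_SIZE):
    """Servers are grouped into Wireless Transmission Units (WTUs) by index."""
    return server_id // wtu_size

def route_flows(flows):
    """Pick wireless for intra-WTU flows, wired otherwise, and track link usage."""
    wireless_load = {}   # per-WTU load on the shared radio (Gbps)
    wired_load = 0.0     # aggregate load on the wired fabric (Gbps)
    decisions = []
    for src, dst, gbps in flows:
        if wtu_of(src) == wtu_of(dst):
            wtu = wtu_of(src)
            wireless_load[wtu] = wireless_load.get(wtu, 0.0) + gbps
            decisions.append((src, dst, "wireless", wtu))
        else:
            wired_load += gbps
            decisions.append((src, dst, "wired", None))
    return decisions, wireless_load, wired_load

if __name__ == "__main__":
    # (source server, destination server, demand in Gbps) -- toy flows
    flows = [(0, 3, 0.4), (1, 2, 0.3), (5, 20, 2.0), (17, 18, 0.6), (9, 40, 1.5)]
    decisions, wireless_load, wired_load = route_flows(flows)
    for d in decisions:
        print(d)
    print("per-WTU wireless load:", wireless_load, "wired load:", wired_load)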
Lesson Learned: Due to the recent advances in the massive IoT data, real-time analytics, big data mining, and cloud computing, the data center networks need to significantly scale up, and therefore, are experiencing a tremendous pressure. As a consequence, the number of servers and wires/cables in the traditional data centers with wired network topologies is expected to dramatically increase, raising scalability and maintenance challenges. The concept of the wireless data center networks (i.e., the hybrid and pure wireless topologies) is, thus, gaining much attention since it is a viable alternative. However, the wireless data center networks need to deliver, at least, the same performance in terms of bandwidth, throughput, and delay as compared with the wired data center architectures. As a solution, the cylindrical wireless data center architecture was proposed in [124]. Still, from the real-time IoT analytics viewpoint, the wireless data center networks have additional design requirements. On the one hand, both wired and wireless data center networks are under constant pressure to quickly support new workloads such as cloud, real-time IoT analytics, and so on with the constraint of bandwidth and storage. On the other hand, network, server, and storage vendors are under pressure to design more efficient yet cost-effective solutions. This has motivated researchers to adopt software-defined environments to include networking, storage, and servers in hyper-converged nodes [125]. Next, we describe the hyper-converged networks, which are likely to emerge as a potentially viable candidate to support real-time IoT analytics.

B. Hyper-Converged Networks

The hyper-converged network is an emerging paradigm which is gaining popularity in both industry as well as academia [125], [126]. Before virtualization and Storage Area Networks (SANs) were introduced, many organizations deployed physical servers with directly attached storage.
While the hyper-convergence network architecture appears to be robust enough to support real-time IoT analytics, it may also be coupled with other network methodologies. For instance, the parallel mining architectures can be used in both conventional and hyper-converged data center networks to facilitate real-time IoT analytics. Furthermore, edge analytics networks can be easily constructed with hyper-converged nodes to locally carry out real-time IoT analytics tasks. In the remainder of the section, we discuss the parallel mining network architectures and edge analytics networks, respectively, to support real-time IoT analytics.

C. Parallel Mining Architectures of the Massive IoT Data

The massive-scale data mining is an essential component for the actualization of the smart city, a prominent IoT initiative, according to the work in [128]. Furthermore, the work conducted by Gattikar et al. [129] discussed that the large-scale data mining should provide outputs/results in an expeditious manner in response to the real-time demands. However, the traditional data mining methodologies exploiting parallel mining architectures such as MapReduce [130] and Hadoop do not fulfill the needs of real-time IoT analytics. This is because the traditional architectures usually employ a centralized management overseen by a master node, which degrades the overall system performance when the number of processing nodes increases [131]. Furthermore, such architectures are prone to a single point of failure. In order to alleviate the scalability and service availability issues, which are critical for parallel mining of large-scale data, an overlay-based network architecture was proposed in [131]. Because all the nodes in the overlay network are responsible for executing both processing and management functions, this architecture can balance the management load in an effective manner. In addition, by involving all the nodes in the management tasks, the overlay-based network architecture does not rely upon a single master node, and thus, achieves higher fault tolerance and better service availability in contrast with the traditional parallel, large data mining architectures. In the following, the overlay-based network architecture for parallel mining of the massive IoT data is briefly described.

1) Overlay-Based Network Architecture for Parallel Mining of the Massive IoT Data: According to the research works in [132]–[134], an overlay network is typically constructed on top of another network and is supported by its infrastructure. The overlay network aims to detach network services from the underlying infrastructure. In this vein, it encapsulates its packets inside other packets, and forwards the encapsulated packets to the endpoint. Upon arriving at the endpoint, the packets are decapsulated. The most prevalent example of an overlay network is the one running on top of the public Internet or the Public Switched Telephone Network (PSTN). Recently, other overlay networks have emerged, including Virtual Private Networks (VPNs), Peer-to-Peer (P2P) networks, Content Delivery Networks (CDNs), Voice over Internet Protocol (VoIP) services such as Skype, non-native software-defined networks, and so forth [135]. The network protocols of a typical overlay network consist of Virtual Extensible LAN (VXLAN), Network Virtualization using Generic Encapsulation (NVGRE) [136], Stateless Transport Tunneling (STT) [137], and Network Virtualization Overlays 3 (NVO3) [138]. The parallel data mining on an overlay-based network architecture improves the service availability by involving all the nodes to execute both processing and management functions. When a data processing request is injected into the overlay network, the node which receives the request initiates a "reception function", which carries out "mapping" by flooding messages to find randomly selected "mappers" (i.e., processing nodes). Then, a "mapper", which initially completed the mapping process executing both management and processing functions, becomes a "reducer". At this point, the "reducer" requests the other "mappers" to transmit the processed data to itself. After receiving the processed data from the "mappers", the reducer executes the reduction process and outputs the analyzed result.
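The following minimal sketch mimics this mapper/reducer flow on a toy overlay; the node selection, the word-count-style analytic, and the in-process "flooding" are simplifications introduced here purely for illustration and do not reproduce the exact protocol of [131]: the node receiving the request picks mapper nodes, each mapper processes its shard, the first mapper to finish acts as the reducer, and the reducer merges the partial results.

import random
from collections import Counter

class OverlayNode:
    """Toy overlay node that can act as a mapper and, if it finishes first, as the reducer."""
    def __init__(self, node_id):
        self.node_id = node_id

    def map_shard(self, shard):
        # Example analytic: count event types in this node's share of the IoT data.
        return Counter(event["type"] for event in shard)

def handle_request(nodes, iot_events, n_mappers=3, seed=1):
    """Reception function: select mappers from the overlay, let each process a shard,
    and reduce on the first finisher."""
    random.seed(seed)
    mappers = random.sample(nodes, n_mappers)
    shards = [iot_events[i::n_mappers] for i in range(n_mappers)]
    partials = [(m, m.map_shard(shard)) for m, shard in zip(mappers, shards)]
    reducer = partials[0][0]          # first finished mapper becomes the reducer
    merged = Counter()
    for _, partial in partials:       # other mappers transmit their partial results
        merged.update(partial)
    return reducer.node_id, merged

if __name__ == "__main__":
    overlay = [OverlayNode(i) for i in range(10)]
    events = [{"type": random.choice(["temp", "humidity", "alarm"])} for _ in range(30)]
    reducer_id, result = handle_request(overlay, events)
    print("reducer node:", reducer_id, "result:", dict(result))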
In this architecture, the connectivity of the overlay-based network architecture may dramatically impact the service availability of the large-scale IoT data mining. As a solution, a number of research works address this connectivity issue in the overlay-based network construction from various perspectives, i.e., context-aware, graph-theory-based, complex network theory-based, and so forth [139]–[141]. These works can make the overlay-based network architectures more tolerant to small-scale server breakdowns. However, for large-scale failures (i.e., physical network disruption in the entire overlay), such works may still not be effective. Therefore, in the following, a more robust parallel data mining architecture is described, which may be able to withstand physical network disruption while carrying out parallel mining of the massive IoT data.

2) Parallel Mining in Optical-Wireless Hybrid Network: Earlier in the section, it was explained that purely optical or purely wireless data centers alone may not be able to fulfill the real-time analytics requirement of the massive data generated by the IoT and other large-scale data sources. Therefore, it was indicated that the future of the data center lies in the hybrid data warehouses, which jointly consider the optical and wireless network technologies as well as hyper-converged nodes to exploit their advantages. Furthermore, the existing parallel mining methodologies (e.g., MapReduce and so forth) for large data mining are designed for the optical networks of data centers [129]. Hence, the parallel mining architectures need to be redesigned to fit the hybrid data warehouses. In this vein, the servers in each rack may be categorized into two groups. The first group consists of servers having a wireless interface alone. The other group comprises servers which have both wired and wireless interfaces. In other words, instead of employing purely wireless networks for the entire data center, the parallel mining-centric optical-wireless hybrid network employs wireless interfaces only for intra-rack communications. This is because the adopted wireless networks have a limited communication range, which is further aggravated by reduced signal strength due to walls and other obstructions. Therefore, the wireless communication within the rack does not suffer much from these performance issues. Also, the intra-rack servers employ multicast communication during the data replication to efficiently utilize the available radio resource.
On the other hand, the inter-rack communication (whereby the servers in different racks transmit data to one another) is carried out in a unicast manner over reserved optical links. Because the inter-rack communication demands the reservation of multiple optical paths, its efficiency is lower in contrast with that of the intra-rack communication. As a consequence, there should be efficient task allocation schemes, preferably with deadline-awareness, so as to avoid further performance degradation.

Lesson Learned: The two parallel mining architectures discussed, up to now, are designed with the aim to be fault tolerant and to reduce the task completion time. However, it is a common practice with these architectures to consider conventional batch processing architectures like MapReduce [142] or Hadoop, which do not fit well with the real-time IoT analytics (as discussed in Section III). To date, researchers have not considered reviewing/rethinking these architectures from the real-time IoT analytics point of view. Normal batch processing like MapReduce exploits the stored data for analytics, while the real-time analytics requires not only continuous analysis but also parallel processing on the streaming data in addition to the mining of historical data. Therefore, there is a need for a new architecture of data centers, which may consider the analytics of the massive IoT data in real-time by managing the network flows within the required task completion time.

The network methodologies discussed up to now are typically distant from the source of the IoT data. Due to network congestion and other issues, they may suffer from delay issues when dealing with real-time IoT analytics tasks. Edge analytics networks, which are much closer to the IoT data sources, may be a viable candidate to minimize the delay involved in real-time IoT analytics. In the remainder of the section, we discuss the edge analytics networks to support real-time IoT analytics.

D. Edge Analytics Network for Real-Time IoT Analytics

Real-time IoT analytics may not be limited only to the inside of the data centers. Indeed, in their efforts to design more efficient ways to store and process the massive IoT data, researchers have introduced the edge analytics networks, which bring the computing nodes closer to the data source (i.e., to the edge of the network) [143]. The work in [143] considered the analytics on critical data at the edge network to reduce the overhead of processing massive data at the centralized data centers. By employing the edge analytics network paradigm, the things that generate the data may be able to process the data themselves "locally" instead of transmitting the data to the data center [144]. This may decrease the data processing delay, which is important for the real-time IoT analytics [145]. Moreover, edge analytics research is evolving in various forms such as Fog computing, cloudlets [146], mobile edge computing [147], and so forth. The Fog computing, coined by Cisco [148], includes cloud computing capabilities through virtualization, virtual machines, and so forth. In other words, the edge analytics refers to the technology which permits analysis on data away from the centralized nodes in the data center, at the edge of the network, while the Fog computing brings cloud characteristics and functionalities to the network edge. Also, the Fog computing is more scalable in contrast with the edge analytics [149]. This is because the Fog computing allows a server on the edge to collect data from multiple sources, while edge analytics pushes the processing capabilities directly onto the devices employing Programmable Automation Controllers (PACs) [150]. Therefore, the edge analytics and Fog computing paradigms, although having the same objective, exhibit different data gathering, processing, and communication architectures. Furthermore, other organizations are also developing similar technologies which push analytics to the network edge. The overall functionalities of all these technologies aim to reduce the transmission time, analyze the mission-critical data locally (i.e., at the network edge), and provide the results within milliseconds. From the IoT perspective, the edge analytics may provide a new network architecture which can be a mix between centralized and distributed processing toward real-time IoT analytics. Furthermore, it may partly solve the bandwidth problem of the data centers by not sending all data to the data center for processing and by temporarily storing or caching the data. In this sense, these devices/nodes at the edge may be referred to as "micro-data centers". Furthermore, a proof-of-concept face recognition application was constructed by Yi et al. [151] in which the response time was reduced from 900 ms to 169 ms by migrating analytics operations from the centralized data center to the edge. On the other hand, Ha et al. [152] considered a wearable cognitive assistance application in which cloudlets were used to offload analytics tasks from the data centers, which improved the response time by 80 ms to 200 ms. Hence, these variants of edge analytics networks appear to be promising for the real-time IoT analytics.
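The kind of latency gain reported in these proof-of-concept studies can be reasoned about with a simple back-of-the-envelope model; in the sketch below, all bandwidths, round-trip times, and processing speeds are hypothetical placeholders, not measurements from [151] or [152], and the model simply estimates the end-to-end response time of an analytics task executed at an edge node versus a remote data center and picks the faster placement.

def response_time_s(payload_mb, uplink_mbps, rtt_s, work_gops, node_gops):
    """End-to-end latency = transfer time + network round trip + compute time."""
    transfer = (payload_mb * 8.0) / uplink_mbps     # seconds to ship the data
    compute = work_gops / node_gops                 # seconds to run the analytic
    return transfer + rtt_s + compute

def choose_placement(payload_mb, work_gops):
    # Hypothetical characteristics of an edge micro-data center vs. a remote cloud DC.
    edge = response_time_s(payload_mb, uplink_mbps=300.0, rtt_s=0.005,
                           work_gops=work_gops, node_gops=50.0)
    cloud = response_time_s(payload_mb, uplink_mbps=50.0, rtt_s=0.080,
                            work_gops=work_gops, node_gops=500.0)
    return ("edge" if edge <= cloud else "cloud"), edge, cloud

if __name__ == "__main__":
    # A small image frame (2 MB) with a modest analytics workload (5 giga-operations).
    placement, edge_s, cloud_s = choose_placement(payload_mb=2.0, work_gops=5.0)
    print(f"edge: {edge_s*1000:.0f} ms, cloud: {cloud_s*1000:.0f} ms -> run at {placement}")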
Recently, mobile edge computing is also gaining research attention for analytics purposes [147]. The network architecture of mobile edge computing can be constructed by a number of general-purpose computing edge nodes located between the edge device and the cloud in the data centers, which include base stations of macro/small cells, gateways, traffic aggregation points, and so forth. While the base stations leverage customized Digital Signal Processors (DSPs) to handle their workloads, the inherent design of the DSPs makes them rather unsuitable for dealing with analytical workloads. In addition, according to the work in [153], it is difficult to estimate whether these nodes are, indeed, capable of performing computations (i.e., required for real-time IoT analytics) in addition to their existing workloads. In this vein, Liu [154] exploited a small cell "Base Station-on-a-chip" based on the OCTEON Fusion Family by CAVIUM that comprises 6 to 14 cores and supports analytics tasks for 32 to 300 users. However, such base stations are exploited for analytics during the off-peak hours to exploit the computational capabilities of the multiple computing cores which may be available. Furthermore, Intel's Smart Cell Platform (SCP) [155] offers another virtualization across the cellular base stations that supports additional workloads. However, the research in [153] indicates that a significantly high investment is required to
replace the specialized DSPs on the cellular base stations with comparable general-purpose CPUs.

Lesson Learned: Real-time IoT analytics at the network edge can provide a drastic reduction in the time to obtain the analytics results or the so-called "actionable insights". This shrinks the gap between the actual delivery and analytics networks of the IoT system. If the edge analytics gains momentum in the coming days, IoT users may initially be tempted to move away from the centralized analytics networks (e.g., data center networks). However, the edge analytics networks to support real-time IoT analytics are still in their infancy. A comprehensive framework to facilitate this may still take years. Meanwhile, much research focus should be given toward deployment strategies, e.g., where exactly in the edge to place the workload. Furthermore, incorporating a high level of analytics capability and reliability at the mobile edge servers requires both substantial investment (due to their storage, power, cooling, and other operational costs) and a significant shift in corporate thinking. Therefore, it may be expected that the real-time analytics at the edge will exist in addition to, but is not likely to entirely replace, the analytics networks such as data centers. Hence, the edge analytics should be regarded as a technology which can complement the centralized architectures for real-time IoT analytics. For instance, using the real-time edge analytics, companies may be able to better understand their data and analytics needs to "thin" the massive IoT data, and locally process and analyze only the necessary data segments. On the other hand, the remaining data can be dispatched to the data centers for the real-time analytics needs of other parties. However, the mobile edge computing platform confronts a peculiar challenge of extracting the maximum performance while minimizing the impact of virtualization. Future research works need to effectively address this challenge also.

In addition to the above research challenges, a number of other challenges need to be considered for effectively designing the network systems to facilitate real-time IoT analytics. In the following section, we discuss the open research issues and provide future directions.

V. ANALYTICS NETWORK DESIGN CHALLENGE OF REAL-TIME ANALYTICS OF MASSIVE DATA: OPEN RESEARCH ISSUES

Along with other large-scale data sources (such as online video streaming and so forth), the IoT has emerged as one of the most influencing systems affecting the data flows through the existing delivery and analytics networks. In particular, the IoT systems are still evolving toward unpredictable directions. It is estimated that in the near future, billions of wireless sensor-enabled devices in the IoT systems will exchange billions of bits of information. This means that the networking infrastructure such as the data centers, upon which the IoT system hinges, will surely experience a tremendous and rapidly growing burden. As discussed in the earlier sections, many information technology and business organizations have heavily invested in the IoT already. Therefore, they expect hefty returns from the IoT, particularly in terms of intelligent decision making in real-time. However, due to deficiencies in the existing network methodologies as reviewed in Section IV, most existing data center facilities are unable to guarantee top-notch real-time analytics. Therefore, before sending massive data to the DCNs, their underlying network support must be thoroughly revisited to identify the key challenges and design robust network architectures that can sustain the massive IoT data with real-time processing requirements. In the remainder of this section, we discuss the network redesign challenges such as network scalability, network agility, network fault tolerance, spectral efficiency, and the challenges imposed by network delay. Several open issues for each challenge are also presented.

A. Network Scalability Challenge

The typical data center network architecture is based on a two-tier or three-tier hierarchical topology. In the two-tier data center network architecture, servers are arranged into racks forming the tier-one network, while the tier-two network is composed of switches providing server-to-server connectivity. Modern, larger data center networks commonly deploy a three-tier architecture, which consists of the core, aggregation, and access layers. The growth in size and complexity of the data center networks can lead to scalability challenges [157]. Furthermore, as the IoT data continues to flow into the data warehouses at a dramatically increasing rate, the data center networks are likely to experience scalability issues. This is because the existing data centers were designed for the unpredictable amount of data from Web analytics, which can sometimes be at a peak and sometimes low [158]. However, the massive data from multitudes of things will always exhibit a massive size. Note that while the connected things individually generate more frequent, smaller data chunks, they collectively lead to massive data volumes. In addition, large data generated by many other sources (e.g., social networks, online video streaming, and so forth) may flow along with this huge amount of IoT data. Hence, a key requirement of real-time IoT analytics is the data center network scalability. The scalability challenge can be addressed by exploiting modular data centers. Modular data centers are movable, and can be attached to an existing data center efficiently for scaling its infrastructure [159]. They can be made available in shipping containers or even directly attached to the existing switches. Although several modular data center approaches have emerged that are scalable enough to support real-time IoT analytics, they are not without shortcomings. For instance, the DCell and BCube topology based modular data centers, which can be directly attached to switches and made available in shipping containers, respectively, have drawbacks as listed in Table IV. The basic metrics to check the scalability efficiency of a network topology should be the number of supported servers with fewer links, a small diameter for efficient routing, and a large bisection width for better network capacity [160]. Conventional hierarchical data center topologies have poor bisection bandwidth and are also susceptible to major disruptions due to device failures at the highest levels. Rather than scale up individual network devices with more capacity and
As far as real-time IoT analytics is concerned, addressing the scalability issue of the analytics networks is not the only challenge. After scaling up an analytics network, real-time control and reconfiguration of the newly added infrastructure, including smooth and agile management, routing, addressing, and access control, are essential. Therefore, next, we discuss the network agility challenge with a focus on real-time IoT analytics.

B. Network Agility Challenge

As discussed in Section V-A, addressing the scalability issue may aggravate the agility of the IoT analytics network. Indeed, the analytics network, which receives the massive IoT data on a continuous basis, must have the ability to assign any service to any server efficiently, including the newly added servers [161]. In addition, it should also be able to control and reconfigure the topology in real time [161]. With agility, the analytics network can meet the demands of massive IoT data from sensors scattered around the globe using a large shared server pool, resulting in real-time services and higher resource utilization in a cost-effective manner. However, the agility aspect is not adequately addressed by today's DCN designs due to a number of issues. First, existing topologies and architectures do not deliver enough capacity between the interconnected servers. Conventional architectures depend on tree-like static network topologies that are usually built from expensive hardware. The capacity between different paths of the tree is typically oversubscribed by factors of 1:5 [162] or even more. This limits communication between servers to the point where it fragments the server pool. As a consequence, congestion and computation hot-spots are dominant even when spare capacity is available elsewhere [163]. For instance, according to the work in [164], approximately 60% of the core and edge links are active at any given time in a DCN. The same work also indicated that the utilization rate is much lower for the aggregation links, i.e., below 10% for 95% of the time. According to these statistics, it may be concluded that while the average loads of the DCN are not that high, it suffers from over-subscription mainly due to high traffic surges at a few "hot" servers. Second, the network does little to prevent a traffic flood in one service from affecting the other services around it. When one server experiences a traffic flood, it is usual for the other services sharing the same network to experience the same. This situation is not acceptable for real-time IoT analytics because of its stringent resource and deadline requirements. Third, to achieve scalability, the routing design in traditional networks is implemented by assigning the servers topologically significant IP addresses and dividing them among Virtual Local Area Networks (VLANs). This type of fragmentation of the address space limits the usefulness of virtual machines, as it may prevent their migration from their original VLAN while retaining the same IP address. Moreover, when services are reassigned to the servers, the operators typically resort to a manual process to re-fragment the address space and, therefore, experience a tremendous configuration burden. This type of addressing is likely to have a significant impact on real-time IoT analytics. Therefore, the above-mentioned challenges need to be taken into consideration while designing the analytics networks, and they may serve as the motivation for designing agile networks for real-time analytics on massive IoT data.
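The effect of tree-topology oversubscription can be made concrete with a small back-of-the-envelope calculation. The sketch below is our own illustration with made-up port counts and link speeds; it is not taken from [162] or [164].

# Minimal sketch with hypothetical port counts/speeds: how oversubscription at
# each tier of a conventional tree topology shrinks the bandwidth actually
# available between servers in different parts of the tree.

def oversubscription(down_ports, down_gbps, up_ports, up_gbps):
    """Ratio of aggregate downlink capacity to aggregate uplink capacity."""
    return (down_ports * down_gbps) / (up_ports * up_gbps)

# Hypothetical three-tier design: 40 servers at 1 Gbps per ToR with 2x10G
# uplinks, and 12 ToR-facing 10G ports per aggregation switch with 2x40G
# uplinks toward the core.
tor = oversubscription(down_ports=40, down_gbps=1, up_ports=2, up_gbps=10)
agg = oversubscription(down_ports=12, down_gbps=10, up_ports=2, up_gbps=40)

end_to_end = tor * agg                      # oversubscription compounds per tier
worst_case_gbps = 1 / end_to_end            # per-server share when all hosts send
print(f"ToR {tor:.1f}:1, Agg {agg:.1f}:1, end-to-end {end_to_end:.1f}:1")
print(f"worst-case inter-pod bandwidth per server: {worst_case_gbps*1000:.0f} Mbps")

Even this mild design already reduces a server's guaranteed inter-pod bandwidth to roughly a third of its access rate under full load, which is why a handful of hot servers can congest an otherwise lightly loaded fabric.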
Even though there are some existing solutions for the above-mentioned issues, they are not efficient for real-time analytics of massive IoT data. For instance, setting up more network links to increase the capacity of the hot servers can be a solution to the oversubscription problem. However, the non-deterministic nature of traffic distributions in DCNs makes it extremely difficult to simply add extra links to a certain group of servers. Also, it is worth noting that adding links to all servers is impractical due to high cost and wiring difficulties [107]. In order to address these challenges of the traditional wired topologies of the DCNs, it is, therefore, necessary to design novel approaches that can flexibly provide additional capacity for hot servers and can easily be realized with current hardware technologies. In this vein, wireless networks can be considered a viable candidate due to their unique advantages over their wired counterparts.

The first notable advantage of the wireless DCN is the convenience it offers in terms of deployment and maintenance. For instance, in a large-scale data center, a great deal of manual effort may be required to wire a large pool of servers, which is inherently difficult and error-prone. This problem is particularly severe for the aforementioned extended wired network topologies [124] for data centers, because they introduce much more wiring than the conventional wired topologies. By exploiting wireless technology for the DCN, these difficulties can be significantly reduced [107]. Another advantage of the wireless technology consists in the flexibility it provides for the DCN. Because wireless links can be dynamically set up, it is possible to carry out adaptive topology adjustments. In other words, the DCN can be reconfigured to fulfill the real-time traffic demands of the hot servers. Furthermore, because wireless connections no longer rely on switches, they do not suffer from the problems caused by these centralized devices, e.g., single-point failures and limited bisection bandwidth. However, despite these advantages, there are several challenges in introducing wireless technology to the DCN in terms of speed and stability. Furthermore, the addressing problem and the division of servers across VLANs can be alleviated by data center management that can easily assign any server to any service irrespective of the IP address. Virtual machines should be able to migrate to any server while retaining the same IP address, and the network configuration of each server should be identical to what it would be if it were connected via a LAN. However, these management processes should be automated. Therefore, operators or service providers may use Software Defined Networking for the DCNs in the future.
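The kind of automated "any service to any server" mapping alluded to above can be pictured with a toy sketch. The snippet below is purely illustrative: it does not use any real SDN controller API, and the class, addresses, and server names are hypothetical.

# Toy illustration (no real SDN controller API): a logically centralized mapping
# from a service's stable virtual address to whichever physical server currently
# hosts it, so services can be reassigned or migrated without re-addressing.

class ServicePlacement:
    def __init__(self):
        self._placement = {}               # virtual IP -> physical server

    def assign(self, virtual_ip, server):
        """Assign (or migrate) a service to any server, keeping its virtual IP."""
        self._placement[virtual_ip] = server

    def resolve(self, virtual_ip):
        """What the fabric would consult when forwarding traffic to the service."""
        return self._placement[virtual_ip]

fabric = ServicePlacement()
fabric.assign("10.0.0.10", "rack3-server17")    # initial placement
fabric.assign("10.0.0.10", "rack9-server02")    # live migration: same virtual IP
print(fabric.resolve("10.0.0.10"))              # -> rack9-server02

In practice, such a mapping would have to be pushed into the switching fabric as forwarding state by a controller, so that reassignment no longer requires manual re-fragmentation of the address space.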
C. Network Fault Tolerance Challenges

Fault tolerance is a common aspect that every new system is likely to confront. The IoT, which is still evolving toward a standardized system, is no exception. For the proper functioning of the IoT system, it needs to operate continuously in spite of the failure of some of its components. In particular, the real-time analytics of the massive IoT data requires a fault tolerant system in order to report the decisions and insights within the required time constraints. A failure of the analytics network, therefore, will heavily impact the processing nodes, since they can no longer deliver the processed results over the malfunctioning network. As a consequence, the real-time analytics tasks may not be able to meet their time requirements. In spite of designing scalable and agile networks, the data center may not be able to satisfy the QoS requirements if the physical network suffers from failure. Although fault tolerance is incorporated in the design of almost all DCN architectures, no architecture can be considered completely fault free. It can also be concluded from Table IV that some data center architectures are less fault tolerant than others. For instance, the tree topology is likely to be much more fault prone because of its single point of failure [165]. Furthermore, even though fault tolerance is a well-known challenge for every system, the methodologies for designing fault tolerant systems vary. Therefore, the issues of the different network methodologies related to designing fault tolerant analytics networks to support real-time analytics of IoT data are addressed here. We broadly classify the approaches for designing fault tolerant networks into hardware- and software-based approaches, which are described below.

The hardware-based fault tolerance method relies on standby hardware, which replaces the failed physical network components in real time. For instance, the hardware-based fault tolerance approach is implemented efficiently in the DCell network topology, as each server in a DCell node is connected to one server of another DCell node, which keeps the communication alive through the connected servers in case any server fails. However, the DCell topology can exhibit high latency for a large number of servers and degrade the packet delivery rate. Despite being designed from a fault tolerance viewpoint, it is therefore not suitable for real-time analytics of IoT data. Another example of hardware-based fault tolerance is the hybrid networks adopted in analytics networks, which integrate wired and wireless network technologies. Such a hybrid data center network typically exhibits high fault tolerance, as the servers within a rack can be connected with wireless links while the racks are connected with wired links. Each server within a rack can also be connected to the neighboring ToR switches, which improves flexibility as well as fault tolerance. For example, when a ToR switch fails, the servers under that ToR switch can connect to a neighboring switch, which can relay their services. This type of optimized hybrid communication in the analytics networks can provide a new direction by considering both wired and wireless parameters. The hyper-converged system, which was explained in Section IV, can be another example of state-of-the-art hardware-based fault tolerant networks. While hyper-converged nodes may be effective for recovering from server failure scenarios, they may not be suitable for a router or switch failure. For instance, a hyper-converged node replicates and stores its data in the virtual volume prior to a failure. Upon the failure of any node, a virtual machine is built on top of another node and the data of the failed node is replicated to this virtual machine. Through this, the hyper-converged node can provide higher fault tolerance against a failed server and recover faster than other data center architectures. The basic metric to estimate the fault tolerance of any system is its Time to Repair (TTR) in case of failure; a lower value of TTR implies a higher fault tolerance capacity of the system. Hyper-converged systems, owing to their low time to repair, emerge as a suitable candidate for leveraging high fault tolerance among the other aforementioned alternatives. However, hyper-converged systems are not fully fault tolerant, because they are unable to handle switch failures like the traditional methods. This opens a new issue, i.e., how to incorporate 100% fault tolerance into the hyper-converged system, particularly in the case of switch failures. A possible direction for this problem is to replace the 10 Gbps optical links with wireless communication, which may provide a high data rate within a small distance and offer easy maintenance and management in the event of a switch failure. For instance, if one switch fails, then the hyper-converged node attached to that switch can connect to other available, neighboring switches through wireless communication, which can relay its services. While the hyper-converged network can be useful to replace a failed node within minutes and relocate the virtual machines of that node to another node, how to limit the migration time within a pre-defined threshold so as to comply with real-time analytics is another key problem that researchers need to address.

While trustworthy and physically reliable, the hardware-based fault tolerance approaches incur additional cost to the already expensive analytics networks. As an alternative to hardware-based fault tolerant network design, software-based approaches for making networks fault tolerant should also be investigated. For instance, the overlay based parallel data mining network methodology [132]-[134], discussed in Section IV, improves service availability during server breakdowns. The overlay network is constructed by all the servers, and the architecture keeps providing the service even if some server nodes are not available or are removed from the overlay network. This type of architecture can be promising for real-time IoT analytics, as it can achieve higher service availability against small-scale failures [166]. However, in the case of a large number of server failures, the overlay network may not be capable of providing services, which may severely disrupt real-time IoT analytics. Therefore, in the future, the tradeoff between hardware- and software-based network fault tolerance approaches needs to be carefully investigated.
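To connect the TTR metric above with an observable quantity, the following worked sketch, using hypothetical MTBF and MTTR figures of our own choosing, applies the standard steady-state availability expression A = MTBF / (MTBF + MTTR). It only illustrates why a short repair time, as with hyper-converged nodes that rebuild a failed node's virtual machine from a replicated volume, translates into higher availability; the numbers are not measurements from any surveyed system.

# Worked sketch with hypothetical numbers: steady-state availability as a
# function of mean time between failures (MTBF) and mean time to repair (MTTR).

def availability(mtbf_hours, mttr_hours):
    return mtbf_hours / (mtbf_hours + mttr_hours)

scenarios = {
    "manual server replacement (MTTR ~ 8 h)":  (8760, 8.0),
    "hyper-converged rebuild (MTTR ~ 10 min)": (8760, 10 / 60),
}
for name, (mtbf, mttr) in scenarios.items():
    a = availability(mtbf, mttr)
    downtime_min_per_year = (1 - a) * 365 * 24 * 60
    print(f"{name}: availability {a:.5f}, ~{downtime_min_per_year:.0f} min/year down")

With roughly one failure per year, an eight-hour manual repair costs several hundred minutes of downtime, whereas a ten-minute automated rebuild keeps the unavailability in the order of minutes, which is the margin that real-time analytics deadlines depend on.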
D. Spectral Efficiency Issues

Addressing the issues related to scalability, agility, and fault tolerance can enable the analytics network infrastructure for real-time analytics of massive IoT data. However, massive data
can create another challenge related to efficient data delivery among the computing resources for providing real-time analytics. In that case, the analytics network should use the available bandwidth in an efficient manner. Spectral efficiency is a major issue that has not been adequately addressed from the viewpoint of massive IoT data. In particular, the wireless networks need to be capable of efficiently handling the network flows and meeting the deadlines of real-time analytics tasks. Because the servers in wireless data centers may break down or experience connectivity issues with the neighboring servers, the performance of the overall network flows may dramatically degrade. Purely wireless DCNs are not designed with fault tolerance in mind to combat such failures. In addition, numerous servers deployed in a relatively narrow space may raise spectral efficiency issues, i.e., the optimized use of spectrum or bandwidth so that the maximum amount of data can be transmitted with the minimum number of transmission errors [108]. Purely wireless data centers may not be able to entirely overcome these spectral efficiency issues. To make the wireless DCN fault tolerant and spectrally efficient, the novel spherical rack architecture based on a bimodal degree distribution proposed by Suto et al. [108] may be considered. It aims to improve the data transmission time with low path loss. In such a rack architecture, the servers of both the intra- and inter-rack networks are located in a circular arrangement so as to keep a short transmission distance while enabling the hub servers to obtain the required degree of connectivity. Moreover, such an arrangement was demonstrated to achieve the reasonable data transmission time needed for MapReduce. However, due to its batch processing architecture, MapReduce is not suitable for real-time IoT analytics. Hence, even though the wireless DCN designed in [108] can be fault tolerant and improve spectral efficiency, whether the architecture is applicable to real-time IoT analytics is worth investigating.
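The spectral efficiency concern raised above can be quantified with a back-of-the-envelope calculation. The snippet below is our own sketch using the Shannon bound C = B log2(1 + SNR); the channel width corresponds to one 60 GHz channel, but the SNR value and the numbers of servers sharing the channel are assumptions made only for illustration, not measurements from any wireless DCN.

import math

# Back-of-the-envelope sketch (assumed, not measured, link parameters): Shannon
# capacity of a 60 GHz intra-rack link and the per-server share when several
# co-located servers must share the same channel.

def shannon_capacity_gbps(bandwidth_ghz, snr_db):
    snr = 10 ** (snr_db / 10)
    return bandwidth_ghz * math.log2(1 + snr)      # Gbit/s for bandwidth in GHz

bandwidth_ghz = 2.16          # one 60 GHz channel
snr_db = 15.0                 # assumed short-range, line-of-sight SNR
link = shannon_capacity_gbps(bandwidth_ghz, snr_db)
efficiency = link / bandwidth_ghz                  # bits/s/Hz

for sharing_servers in (1, 4, 16):
    print(f"{sharing_servers:2d} servers sharing the channel: "
          f"{link / sharing_servers:.2f} Gbit/s each "
          f"({efficiency:.2f} bit/s/Hz aggregate)")

The aggregate spectral efficiency of the channel is fixed by the bandwidth and SNR, so densely packed servers can only divide it; this is why rack geometries that shorten distances (and hence raise SNR and enable spatial reuse), such as the design in [108], matter for wireless analytics networks.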
E. Network Delay Issues

Network delay issues arise from a number of scenarios in the analytics network and may be particularly detrimental to real-time IoT analytics. Therefore, the reasons behind network delay should be carefully identified while designing the analytics networks. Network delay can occur because of the data flows between servers and switches in the analytics network. It can also occur during data access from the database for the analytics. Therefore, the issues in network methodologies that introduce network flow delay are discussed in the remainder of this section.

The foremost source of network delay is the tiered architecture of data centers. The general three-tier architecture is constructed with aggregation switches, ToR switches, and servers. The data flow between the tiers introduces delays owing to reasons such as oversubscription, massive data flows, spectral efficiency problems, unreliable components, and so on. Therefore, each tier should have deadlines in order to provide real-time services under the constraint of massive IoT data flows. However, the relevant research works in this regard are scattered in the literature and need to be thoroughly surveyed. In addition, it is worth pointing out that there exist different methodologies to meet the application deadlines in packet-switched networks, such as fair share, Earliest Deadline First (EDF), rate reservation, and so forth. While the fair share allocation method aims to evenly assign bandwidth to all the current traffic flows, it does not offer a deadline-aware solution. EDF was, therefore, designed to allocate the required bandwidth to the packets having the earliest deadlines. While EDF works by switching between packet deadlines, the analytics networks need to take into account network flow deadlines. The rate reservation method reserves the rate required to meet the flow deadlines depending on the arrival of the network flows. Readers may refer to the results in [167], which demonstrate the ratio of successful packet deliveries within a specified deadline for a growing number of flows in the Web analytics network of a wired data center. Note that this ratio is proportional to the relevant application throughput. From the results, it is evident that even though EDF and other rate reservation methods initially improve application throughputs, their performance degrades after a certain increase in the number of flows. Therefore, even though the deadline-aware methodologies appear to support the applications/tasks of Web analytics in completing within their respective deadlines, they may not be able to handle massive IoT data, and it is reasonable to infer that the average throughputs for real-time IoT analytics are likely to degrade even more. Hence, an appropriate deadline-aware network flow control methodology is required to control the traffic flows without contributing significant overhead on the switches and without being affected by the topological changes in the analytics network.
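To make the distinction between packet- and flow-level deadlines concrete, the following minimal sketch (entirely our own illustration, not the scheme of [167]) serves whole flows on a single bottleneck link in earliest-deadline-first order and reports which flows complete in time. Flow names, sizes, and deadlines are arbitrary.

import heapq

# Minimal EDF-style sketch (illustrative only): flows on one bottleneck link are
# served in order of earliest deadline, and we check which flows finish in time.

def edf_schedule(flows, link_gbps):
    """flows: list of (name, size_gbit, deadline_s). Returns finish times."""
    heap = [(deadline, name, size) for name, size, deadline in flows]
    heapq.heapify(heap)                        # earliest deadline first
    now, finish = 0.0, {}
    while heap:
        deadline, name, size = heapq.heappop(heap)
        now += size / link_gbps                # transmit the whole flow
        finish[name] = (now, now <= deadline)
    return finish

flows = [("sensor-batch", 4.0, 1.0), ("video-chunk", 18.0, 2.5), ("log-sync", 2.0, 2.3)]
for name, (t, ok) in edf_schedule(flows, link_gbps=10).items():
    print(f"{name}: finished at {t:.2f}s, deadline {'met' if ok else 'missed'}")

Even in this toy setting, serving the flow with the earliest deadline first lets a later, looser deadline be missed once the link saturates, which mirrors the throughput degradation reported in [167] as the number of flows grows.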
Parallel mining methodologies may also alleviate network delays if they are carefully designed. Most research works on conventional parallel mining architectures have dealt with load-aware task allocation and scheduling [168]-[170], and network-aware task allocation methods [171]-[173]. While the server load-based task allocation methodology decreases the data processing time, the network load-based task allocation scheme contributes to a significantly lower transmission delay. Therefore, by taking both aspects into account, a more effective task allocation method should be designed for the parallel mining of large-scale data so as to minimize the completion time of real-time IoT analytics. In addition, the conventional network load-based task allocation method distributes the tasks by considering the available network bandwidth. This is particularly difficult to implement in a hybrid data center network because the wireless and optical links not only have different capacities but also employ inherently different access control and data transmission methods [174]. Therefore, it is important to estimate the number of tasks that are to be allocated to the nodes in each rack based on the expected transmission delay, by considering the characteristics of both the wireless and optical networks. Additionally, the nodes having lighter processing loads should be allocated the tasks first. Through these steps, the parallel processing of large-scale data mining can shorten the completion time of real-time IoT analytics. In this vein, it is worth mentioning the work conducted by Suto et al. [175], which pioneered a context-aware task allocation scheme, with an aim to minimize the completion time of tasks, by considering both the wireless and optical network characteristics. Since it supports hybrid data center networks and reduces the task
completion time, this type of task allocation scheme appears to be particularly suited to real-time IoT analytics.
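The following simplified sketch conveys the spirit of such context-aware allocation, but it is not the algorithm of [175]: each task is placed on the rack whose estimated completion time, i.e., transmission over its wireless or optical segment plus processing under its current load, is smallest. All rack names, link rates, and processing rates are assumptions chosen for illustration.

# Simplified illustration in the spirit of context-aware task allocation (NOT
# the algorithm of [175]): pick, per task, the rack with the smallest estimated
# transmission-plus-processing completion time. All numbers are assumptions.

racks = [
    {"name": "rack-A", "link_gbps": 40.0, "busy_s": 0.0},   # optical uplink
    {"name": "rack-B", "link_gbps": 40.0, "busy_s": 0.0},   # optical uplink
    {"name": "rack-C", "link_gbps": 6.0,  "busy_s": 0.0},   # wireless uplink
]
PROC_GBIT_PER_S = 2.0                       # assumed per-rack processing rate

def completion_time(rack, task_gbit):
    transmit = task_gbit / rack["link_gbps"]
    process = rack["busy_s"] + task_gbit / PROC_GBIT_PER_S
    return transmit + process

def allocate(tasks_gbit):
    for task in tasks_gbit:
        best = min(racks, key=lambda r: completion_time(r, task))
        est = completion_time(best, task)
        best["busy_s"] += task / PROC_GBIT_PER_S      # the chosen rack gets busier
        print(f"{task:4.1f} Gbit -> {best['name']} (est. completion {est:.2f}s)")

allocate([8.0, 8.0, 8.0, 4.0])

Because both the link characteristics and the accumulated processing load enter the estimate, the wireless rack is only chosen once the optical racks are loaded, which is the kind of trade-off a hybrid-aware allocator has to make.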
Additionally, context-aware task allocation can be carried out by intelligent methods (e.g., machine learning and deep learning based cognitive techniques) to reduce the network flow delay and meet the deadlines of real-time IoT analytics tasks. For instance, the work in [176] presented the role of cognitive radio technology for the IoT, specifically spectrum sensing, which can provide interference-free channels, efficient bandwidth utilization, and many more functionalities. Similarly, the benefits of integrating cognitive radio technologies with the IoT can be found in [35], [177]-[180]. Moreover, the cognitive IoT can also provide intelligent applications, as discussed in the works conducted in [181]-[184], by embedding perception-action, intelligent decision making, knowledge discovery, massive data analytics, and on-demand service provisioning. However, most of the above-discussed works considered the role of the cognitive network in the delivery network of the IoT only; there is no existing work that discusses the potential of the cognitive network in the analytics network of the IoT. For issues like spectral efficiency, cognitive radio technologies can be promising for efficiently using the spectrum and improving the performance of real-time analytics of massive IoT data. Thus, exploiting cognitive radio technologies in the analytics network can be a new research direction for real-time IoT analytics. Moreover, exploiting cognitive techniques to monitor the analytics network conditions and fine-tune network parameters can minimize delay. Also, the cognition of network conditions can play an important role in real-time analytics by reducing the load on the network. In summary, cognitive methods have the potential to optimize the analytics network of the IoT by studying the behavior of the network conditions in advance.
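As a toy illustration of the spectrum sensing idea mentioned above, the snippet below applies simple energy detection: a channel is marked free when the average received energy over a sensing window stays below a threshold. The signal, noise, and threshold values are synthetic and are not drawn from [176] or the other cited works.

import random, math

# Toy energy-detection sketch (not from the cited works): a cognitive node marks
# a channel as free when the average received energy over a sensing window stays
# below a threshold. Signal, noise, and threshold values are synthetic.

def sense_channel(samples, threshold):
    energy = sum(s * s for s in samples) / len(samples)
    return ("occupied" if energy > threshold else "free"), energy

random.seed(1)
noise_only = [random.gauss(0.0, 1.0) for _ in range(1000)]
with_signal = [random.gauss(0.0, 1.0) + math.sin(0.3 * i) for i in range(1000)]

for name, window in (("idle channel", noise_only), ("busy channel", with_signal)):
    state, energy = sense_channel(window, threshold=1.3)
    print(f"{name}: avg energy {energy:.2f} -> {state}")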
While parallel mining methodologies can reduce the transmission delay within an analytics network, the data still has to be transmitted from the devices to the end servers, which introduces a transmission delay between the devices and the servers. Therefore, edge computing can be a promising research area to reduce the transmission delay between the devices and servers. However, the edge analytics networks may still experience processing delay due to the increasing number of devices in their locality. Another challenge of using the edge analytics networks for real-time IoT analytics consists in placing these nodes in strategically correct locations so as to reduce the transmission time between the nodes and minimize the service latency (i.e., both communication and computational delays). Moreover, due to the inherent requirements of real-time IoT analytics tasks, there can be a huge amount of IoT data ingestion at a given edge network node, which may increase both processing and transmission delays because of the increase in the queueing time. As a remedy, the research works conducted in [185] and [186], which reduce the service delay and improve the transmission time in the edge analytics network through virtual machine migration and transmission power control, may be useful. Similar approaches to tackle the issue of service delay in the edge analytics network may be required for IoT systems in the future. In addition, it is worth noting that, in contrast with the large servers of data center networks (DCNs), the mobile edge nodes may not be able to support heavyweight software due to hardware constraints. For example, a small cell Evolved Node B (eNB) with Intel's T3K concurrent dual-mode System-on-Chip (SoC), comprising a 4-core Advanced RISC Machines (ARM)-based CPU [187] and limited memory, may not be sufficient for executing the analytics tasks of Apache Spark, which requires at least 8 CPU cores and 8 GB of memory for effective performance. As a consequence, mobile edge analytics requires lightweight algorithms that can support reasonable analytics tasks [188], [189]. On the other hand, while the lightweight library of Apache Quarks can be exploited with the edge nodes for real-time IoT analytics, the basic data processing and filtering operations of Quarks are not suitable for advanced analytical tasks such as context-aware recommendations in real time. Furthermore, machine learning libraries such as TensorFlow [190], which require less memory and storage while supporting heterogeneous distributed systems, may be effective for real-time analytics tasks exploiting the mobile edge nodes. Additionally, the work in [191] demonstrated that mobile containers, which multiplex device hardware across multiple virtual devices, can provide performance similar to native hardware. Container technologies such as hyper-converged nodes are evolving and drawing significant attention because they can be quickly deployed over heterogeneous platforms. Therefore, more research to effectively adopt hyper-converged nodes in the edge analytics network for carrying out real-time IoT analytics is required.
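The queueing-time growth at an edge node noted above can be illustrated with a simple M/M/1 model. This is an idealized assumption of our own (real IoT ingestion is burstier and edge service times are not exponential), but it shows why delay grows sharply, not linearly, as the ingestion rate approaches the node's processing capacity.

# Simple M/M/1 illustration (an idealized assumption; real IoT traffic is
# burstier): mean sojourn time W = 1 / (mu - lambda) at a single edge node,
# showing how queueing delay explodes as ingestion approaches capacity.

def mm1_delay_ms(arrival_per_s, service_per_s):
    if arrival_per_s >= service_per_s:
        return float("inf")                    # unstable: queue grows without bound
    return 1000.0 / (service_per_s - arrival_per_s)

service_per_s = 10_000                         # assume the node processes 10k records/s
for load in (0.5, 0.8, 0.95, 0.99):
    arrival = load * service_per_s
    print(f"utilization {load:.0%}: mean delay {mm1_delay_ms(arrival, service_per_s):.1f} ms")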
VI. CONCLUSION

Real-time IoT analytics is gaining momentum as the mainstream research area to provide immediate (or near-immediate) actionable insights and business intelligence. However, the analytics network of the existing IoT systems does not take into account the unique requirements of real-time IoT analytics. In this paper, we pointed out that most contemporary researchers overlooked a suitable IoT analytics network design, because they predominantly focused on the sensing and delivery networking technologies of the IoT system. On the other hand, much of the IoT analytics network structure is taken for granted. The state-of-the-art DCN and parallel data mining architectures were typically designed for Web analytics and, therefore, are not suitable for real-time IoT analytics. We identified this research gap, and described real-time IoT analytics use cases and software platforms along with their unique network requirements. We then conducted an extensive survey on the state-of-the-art wired, wireless, and hybrid data centers to assess whether they can fulfill the network requirements of real-time IoT analytics. Our survey indicated that hyper-converged networks can be particularly suitable for real-time IoT analytics due to their unique fault tolerance and scalability. Additionally, network support for parallel mining methods of the massive IoT data was thoroughly discussed. Furthermore, the relevance of edge analytics networks for expediting real-time IoT analytics toward the network edge (i.e., away from the data centers) was elucidated. Then, a detailed discussion on a number of open research issues and future directions for real-time IoT analytics was provided. Among the open research issues, the challenges of analytics network scalability, agility, fault tolerance, spectral efficiency, and network delay were thoroughly discussed with a focus on real-time IoT analytics.

APPENDIX

A list of acronyms is provided in Table V.

TABLE V
LIST OF ACRONYMS

REFERENCES
[5] J. A. Galache et al., "ClouT: Leveraging cloud computing techniques
for improving management of massive IoT data,” in Proc. IEEE 7th
Int. Conf. Service-Oriented Comput. Appl., Matsue, Japan, Nov. 2014,
pp. 324–327.
[6] Y. Ma et al., “An efficient index for massive IOT data in cloud envi-
ronment,” in Proc. 21st ACM Int. Conf. Inf. Knowl. Manag. (CIKM),
Maui, HI, USA, Nov. 2012, pp. 2129–2133.
[7] T. Li, Y. Liu, Y. Tian, S. Shen, and W. Mao, “A storage solu-
tion for massive IoT data based on NoSQL,” in Proc. IEEE Int.
Conf. Green Comput. Commun. (GREENCOM), Besançon, France,
Nov. 2012, pp. 50–57.
[8] Z. M. Ding, J. J. Xu, and Q. Yang, “SeaCloudDM: A database cluster
framework for managing and querying massive heterogeneous sen-
sor sampling data,” J. Supercomput., vol. 66, no. 3, pp. 1260–1284,
Dec. 2013.
[9] V. P. Kafle, Y. Fukushima, and H. Harai, “Internet of Things stan-
dardization in ITU and prospective networking technologies,” IEEE
Commun. Mag., vol. 54, no. 9, pp. 43–49, Sep. 2016.
[10] S. Haller, S. Karnouskos, and C. Schroth, “The Internet of Things in
an enterprise context,” in Proc. Future Internet Symp., Vienna, Austria,
Sep. 2008, pp. 14–28.
[11] R. Murray, “Driverless cars,” IET Comput. Control Autom., vol. 18,
no. 3, pp. 14–17, Jun./Jul. 2007.
[12] Q. W. Oung et al., “Wearable multimodal sensors for evaluation of
patients with Parkinson disease,” in Proc. IEEE Int. Conf. Control Syst.
Comput. Eng. (ICCSCE), Penang, Malaysia, Jun. 2016, pp. 269–274.
[13] D. Georgakopoulos, P. P. Jayaraman, M. Fazia, M. Villari, and
R. Ranjan, “Internet of Things and edge cloud computing roadmap
for manufacturing,” IEEE Cloud Comput., vol. 3, no. 4, pp. 66–73,
Jul./Aug. 2016.
[14] R. Lin, Z. Wang, and Y. Sun, “Wireless sensor networks solutions
for real time monitoring of nuclear power plant,” in Proc. 5th World
Congr. Intell. Control Autom. (IEEE Cat. No. 04EX788), Hangzhou,
China, Jun. 2004, pp. 3663–3667.
[15] A. Biem, H. Feng, A. V. Riabov, and D. S. Turaga, “Real-time analysis
and management of big time-series data,” IBM J. Res. Develop., vol. 57,
nos. 3–4, pp. 1–12, May/Jul. 2013.
[16] Y. Ge, X. Liang, Y. C. Zhou, Z. Pan, G. T. Zhao, and Y. L. Zheng,
“Adaptive analytic service for real-time Internet of Things applica-
tions,” in Proc. IEEE Int. Conf. Web Services (ICWS), San Francisco,
CA, USA, Jun. 2016, pp. 484–491.
[17] A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, and
M. Ayyash, “Internet of Things: A survey on enabling technologies,
protocols, and applications,” IEEE Commun. Surveys Tuts., vol. 17,
no. 4, pp. 2347–2376, 4th Quart., 2015.
[18] Y. Kawamoto, H. Nishiyama, N. Kato, N. Yoshimura, and
S. Yamamoto, “Internet of Things (IoT): Present state and future
prospects,” IEICE Trans. Inf. Syst., vol. E97-D, no. 10, pp. 2568–2575,
Oct. 2014.
[19] A. Meddeb, “Internet of Things standards: Who stands out from the
crowd?” IEEE Commun. Mag., vol. 54, no. 7, pp. 40–47, Jul. 2016.
[20] M. A. Razzaque, M. Milojevic-Jevric, A. Palade, and S. Clarke,
"Middleware for Internet of Things: A survey," IEEE Internet Things
J., vol. 3, no. 1, pp. 70–95, Feb. 2016.
[21] O. Vermesan et al., "Internet of Things strategic research roadmap," in
Internet of Things—Global Technological and Societal Trends, vol. 1.
Aalborg, Denmark: River, Jun. 2011, pp. 9–52.
[22] V. Gazis, “A survey of standards for machine-to-machine and the
Internet of Things," IEEE Commun. Surveys Tuts., vol. 19, no. 1,
pp. 482–511, 1st Quart., 2017.
[23] J. Mineraud, O. Mazhelis, X. Su, and S. Tarkoma, “A gap analy-
sis of Internet-of-Things platforms,” Comput. Commun., vols. 89–90,
pp. 5–16, Sep. 2016.
[24] S. M. R. Islam, D. Kwak, M. H. Kabir, M. Hossain, and K. S. Kwak,
"The Internet of Things for health care: A comprehensive survey," IEEE
Access, vol. 3, pp. 678–708, 2015.
[25] A. Rajandekar and B. Sikdar, "A survey of MAC layer issues and pro-
tocols for machine-to-machine communications,” IEEE Internet Things
[1] J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, “Internet of J., vol. 2, no. 2, pp. 175–186, Apr. 2015.
Things (IoT): A vision, architectural elements, and future directions,” [26] M. A. Mehaseb, Y. Gadallah, A. Elhamy, and H. Elhennawy,
Future Gener. Comput. Syst., vol. 29, no. 7, pp. 1645–1660, Sep. 2013. “Classification of LTE uplink scheduling techniques: An M2M per-
[2] S. Chen, H. Xu, D. Liu, B. Hu, and H. Wang, “A vision of IoT: spective,” IEEE Commun. Surveys Tuts., vol. 18, no. 2, pp. 1310–1335,
Applications, challenges, and opportunities with China perspective,” 2nd Quart., 2016.
IEEE Internet Things J., vol. 1, no. 4, pp. 349–359, Aug. 2014. [27] N. C. Luong, D. T. Hoang, P. Wang, D. Niyato, D. I. Kim, and
[3] J. Zheng, D. Simplot-Ryl, C. Bisdikian, and H. T. Mouftah, “The Z. Han, “Data collection and wireless communication in Internet of
Internet of Things [guest editorial],” IEEE Commun. Mag., vol. 49, Things (IoT) using economic analysis and pricing models: A sur-
no. 11, pp. 30–31, Nov. 2011. vey,” IEEE Commun. Surveys Tuts., vol. 18, no. 4, pp. 2546–2590,
[4] D. Singh, G. Tripathi, and A. J. Jara, “A survey of Internet-of-Things: 4th Quart., 2016.
Future vision, architecture, challenges and services,” in Proc. IEEE [28] J. Liu et al., “New perspectives on future smart FiWi networks:
World Forum Internet Things (WF-IoT), Seoul, South Korea, Apr. 2014, Scalability, reliability, and energy efficiency,” IEEE Commun. Surveys
pp. 287–292. Tuts., vol. 18, no. 2, pp. 1045–1072, 2nd Quart., 2016.
[29] E. Soltanmohammadi, K. Ghavami, and M. Naraghi-Pour, “A survey of [55] A. B. Sharma, F. Ivančić, A. Niculescu-Mizil, H. Chen, and G. Jiang,
traffic issues in machine-to-machine communications over LTE,” IEEE “Modeling and analytics for cyber-physical systems in the age of
Internet Things J., vol. 3, no. 6, pp. 865–884, Dec. 2016. big data,” ACM SIGMETRICS Perform. Eval. Rev., vol. 41, no. 4,
[30] O. Hahm, E. Baccelli, H. Petersen, and N. Tsiftes, “Operating systems pp. 74–77, Mar. 2014.
for low-end devices in the Internet of Things: A survey,” IEEE Internet [56] R. Tönjes et al., “Real-time IoT stream processing and large-scale
Things J., vol. 3, no. 5, pp. 720–734, Oct. 2016. data analytics for smart city applications,” in Proc. Eur. Conf. Netw.
[31] C. Perera, R. Ranjan, and L. Wang, “End-to-end privacy for open Commun., Bologna, Italy, Jun. 2014.
big data markets,” IEEE Cloud Comput., vol. 2, no. 4, pp. 44–53, [57] M. Arlitt et al., “IoTAbench: An Internet of Things analytics bench-
Jul./Aug. 2015. mark,” in Proc. 6th ACM/SPEC Int. Conf. Perform. Eng., Austin, TX,
[32] J. Granjal, E. Monteiro, and J. Sá Silva, “Security for the Internet USA, Jan. 2015, pp. 133–144.
of Things: A survey of existing protocols and open research [58] B. P. Rao, P. Saluia, N. Sharma, A. Mittal, and S. V. Sharma, “Cloud
issues,” IEEE Commun. Surveys Tuts., vol. 17, no. 3, pp. 1294–1312, computing for Internet of Things & sensing based applications,” in
3rd Quart., 2015. Proc. 6th Int. Conf. Sens. Technol. (ICST), Kolkata, India, pp. 374–380,
[33] K. Zhang, X. Liang, R. Lu, and X. Shen, “Sybil attacks and their Oct. 2012.
defenses in the Internet of Things,” IEEE Internet Things J., vol. 1, [59] L. Wang and R. Ranjan, “Processing distributed Internet of Things
no. 5, pp. 372–383, Oct. 2014. data in clouds,” IEEE Cloud Comput., vol. 2, no. 1, pp. 76–80,
[34] A. M. Nia and N. K. Jha, “A comprehensive study of security Jan./Feb. 2015.
of Internet-of-Things,” IEEE Trans. Emerg. Topics Comput., to be [60] V. Sarathy, P. Narayan, and R. Mikkilineni, “Next generation cloud
published, doi: 10.1109/TETC.2016.2606384. computing architecture: Enabling real-time dynamism for shared dis-
[35] Q. Wu et al., “Cognitive Internet of Things: A new paradigm beyond tributed physical infrastructure,” in Proc. 19th IEEE Int. Workshops
connection,” IEEE Internet Things J., vol. 1, no. 2, pp. 129–143, Enabling Technol. Infrastruct. Collaborative Enterprises, Larissa,
Apr. 2014. Greece, Aug. 2010, pp. 48–53.
[36] L. D. Xu, W. He, and S. Li, “Internet of Things in industries: A survey,” [61] N. M. M. K. Chowdhury and R. Boutaba, “Network virtualization:
IEEE Trans. Ind. Informat., vol. 10, no. 4, pp. 2233–2243, Nov. 2014. State of the art and research challenges,” IEEE Commun. Mag., vol. 47,
[37] A. Whitmore, A. Agarwal, and L. D. Xu, “The Internet of Things— no. 7, pp. 20–26, Jul. 2009.
A survey of topics and trends,” Inf. Syst. Front., vol. 17, no. 2, [62] R. Jain and S. Paul, “Network virtualization and software defined
pp. 261–274, Apr. 2015. networking for cloud computing: A survey,” IEEE Commun. Mag.,
[38] L. Atzori, A. Iera, and G. Morabito, “The Internet of Things: A survey,” vol. 51, no. 11, pp. 24–31, Nov. 2013.
Comput. Netw., vol. 54, no. 15, pp. 2787–2805, Oct. 2010. [63] P. Mell and T. Grance, “The NIST definition of cloud computing,” Nat.
[39] C. Perera, C. H. Liu, and S. Jayawardena, “The emerging Internet of Inst. Standards Technol., vol. 53, no. 6, p. 50, Oct. 2009.
Things marketplace from an industrial perspective: A survey,” IEEE [64] S. Nechifor, A. Petrescu, D. Damian, D. Puiu, and B. Trnauc,
Trans. Emerg. Topics Comput., vol. 3, no. 4, pp. 585–598, Dec. 2015. “Predictive analytics based on CEP for logistic of sensitive goods,”
[40] A. Zanella, N. Bui, A. Castellani, L. Vangelista, and M. Zorzi, “Internet in Proc. Int. Conf. Optim. Elect. Electron. Equipment (OPTIM), Bran,
of Things for smart cities,” IEEE Internet Things J., vol. 1, no. 1, Romania, May 2014, pp. 817–822.
pp. 22–32, Feb. 2014. [65] X. Zhu, F. Kui, and Y. Wang, “Predictive analytics by using Bayesian
[41] B. Marr, Big Data: Using SMART Big Data, Analytics and Metrics model averaging for large-scale Internet of Things,” Int. J. Distrib.
to Make Better Decisions and Improve Performance. New York, NY, Sensor Netw., vol. 9, no. 12, pp. 1–10, Dec. 2013.
USA: Wiley, Jan. 2015. [66] A. Osman, M. El-Refaey, and A. Elnaggar, “Towards real-time analytics
[42] S. Bin, L. Yuan, and W. Xiaoyi, “Research on data mining models for in the cloud,” in Proc. IEEE 9th World Congr. Services, Santa Clara,
the Internet of Things,” in Proc. Int. Conf. Image Anal. Signal Process., CA, USA, Nov. 2013, pp. 428–435.
Zhejiang, China, Apr. 2010, pp. 127–132. [67] R. Winter. (Apr. 2008). Why Are Data Warehouses Growing So Fast?
[43] A. R. Biswas and R. Giaffreda, “IoT and cloud convergence: [Online]. Available: www.b-eyenetwork.com/print/7188
Opportunities and challenges,” in Proc. IEEE World Forum Internet [68] R. R. Schaller, “Moore’s law: Past, present and future,” IEEE Spectr.,
Things (WF-IoT), Seoul, South Korea, Apr. 2014, pp. 375–376. vol. 34, no. 6, pp. 52–59, Jun. 1997.
[44] L. Lengyel, P. Ekler, T. Ujj, T. Balogh, and H. Charaf, “SensorHUB: [69] A. Vineela and L. S. Rani, “Internet of Things—Overview,” Int. J. Res.
An IoT driver framework for supporting sensor networks and data Sci. Technol., vol. 2, no. 4, pp. 8–12, Apr. 2015.
analysis,” Int. J. Distrib. Sensor Netw., vol. 11, no. 7, pp. 379–454, [70] K. Sato, Y. Kawamoto, H. Nishiyama, N. Kato, and Y. Shimizu,
Jan. 2015. “A modeling technique utilizing feedback control theory for
[45] C. C. Aggarwal, N. Ashish, and A. Sheth, “The Internet of Things: performance evaluation of IoT system in real-time,” in Proc. Int. Conf.
A survey from the data-centric perspective,” in Managing and Mining Wireless Commun. Signal Process. (WCSP), Nanjing, China, Oct. 2015,
Sensor Data. New York, NY, USA: Springer, Dec. 2012, pp. 383–428. pp. 1–5.
[46] M. Zwolenski and L. Weatherill, “The digital universe: Rich data and [71] K. Ueda, M. Tamai, and K. Yasumoto, “A method for recogniz-
the increasing value of the Internet of Things,” Aust. J. Telecommun. ing living activities in homes using positioning sensor and power
Digit. Econ., vol. 2, no. 3, pp. 1–9, Apr. 2014. meters,” in Proc. IEEE Int. Conf. Pervasive Comput. Commun.
[47] Y. Huang and G. Li, “Descriptive models for Internet of Things,” in Workshops (PerCom Workshops), St. Louis, MO, USA, Mar. 2015,
Proc. Int. Conf. Intell. Control Inf. Process., Dalian, China, Aug. 2010, pp. 354–359.
pp. 483–486. [72] H. Huo, Y. Xu, H. Yan, S. Mubeen, and H. Zhang, “An elderly health
[48] M. M. Rathore, A. Ahmad, A. Paul, and S. Rho, “Urban planning and care system using wireless sensor networks at home,” in Proc. 3rd
building smart cities based on the Internet of Things using big data Int. Conf. Sensor Technol. Appl. (SENSORCOMM), Athens, Greece,
analytics,” Comput. Netw., vol. 101, pp. 63–80, Jun. 2016. Jun. 2009, pp. 158–163.
[49] A. Merentitis et al., “WSN trends: Sensor infrastructure virtualization [73] J. Lee, H.-A. Kao, and S. Yang, “Service innovation and smart analytics
as a driver towards the evolution of the Internet of Things,” in Proc. 7th for industry 4.0 and big data environment,” in Proc. 6th CIRP Conf.
Int. Conf. UBICOMM, Porto, Portugal, Sep. 2013, pp. 113–118. Ind. Product Service Syst., vol. 16. Windsor, ON, Canada, Jan. 2014,
[50] M. Hassanalieragh et al., “Health monitoring and management pp. 3–8.
using Internet-of-Things (IoT) sensing with cloud-based processing: [74] D. Sonntag, S. Zillner, P. van der Smagt, and A. Lörincz, “Overview
Opportunities and challenges,” in Proc. IEEE Int. Conf. Services of the CPS for smart factories project: Deep learning, knowledge
Comput., New York, NY, USA, Jun./Jul. 2015, pp. 285–292. acquisition, anomaly detection and intelligent user interfaces,” in
[51] H. Hromic et al., “Real time analysis of sensor data for the Internet of Industrial Internet of Things. Cham, Switzerland: Springer, Oct. 2016,
Things by means of clustering and event processing,” in Proc. IEEE pp. 487–504.
Int. Conf. Commun. (ICC), London, U.K., Jun. 2015, pp. 685–691. [75] C. Oberwinkler and M. Stundner, “From real time data to produc-
[52] R. Barga, “Processing big data in motion,” in Proc. IEEE Int. Conf. tion optimization,” in Proc. SPE Asia Pac. Conf. Integr. Model. Asset
Cloud Eng. (IC2E), Berlin, Germany, Jun. 2016, p. 171. Manag., Kuala Lumpur, Malaysia, Jan. 2004, pp. 1–14.
[53] A. Akbar, F. Carrez, K. Moessner, and A. Zoha, “Predicting complex [76] J. Lee, B. Bagheri, and H.-A. Kao, “A cyber-physical systems archi-
events for pro-active IoT applications,” in Proc. IEEE 2nd World Forum tecture for industry 4.0-based manufacturing systems,” Manuf. Lett.,
Internet Things (WF-IoT), Milan, Italy, Jan. 2016, pp. 327–332. vol. 3, pp. 18–23, Jan. 2015.
[54] N. Mohamed and J. Al-Jaroodi, “Real-time big data analytics: [77] B. S. Sahay and J. Ranjan, “Real time business intelligence in sup-
Applications and challenges,” in Proc. Int. Conf. High Perform. ply chain analytics,” Inf. Manag. Comput. Security, vol. 16, no. 1,
Comput. Simulat. (HPCS), Bologna, Italy, Sep. 2014, pp. 305–310. pp. 28–48, Dec. 2008.
[78] C. Cauzzi et al., “Earthquake early warning and operational earth- [104] K. Bilal et al., “Quantitative comparisons of the state-of-the-art data
quake forecasting as real-time hazard information to mitigate seismic center architectures,” Concurrency Comput. Pract. Exp., vol. 25, no. 12,
risk at nuclear facilities,” Bull. Earthquake Eng., vol. 14, no. 9, pp. 1771–1783, Dec. 2013.
pp. 2495–2512, Sep. 2016. [105] C. Guo et al., “BCube: A high performance, server-centric network
[79] M. Y. S. Uddin et al., “The scale2 multi-network architecture for architecture for modular data centers,” ACM SIGCOMM Comput.
IoT-based resilient communities,” in Proc. IEEE Int. Conf. Smart Commun. Rev., vol. 39, no. 4, pp. 63–74, Oct. 2009.
Comput. (SMARTCOMP), St. Louis, MO, USA, May 2016, pp. 1–8. [106] G. Wang et al., “c-Through: Part-time optics in data centers,” ACM
[80] A. M. Ibrahim, I. Venkat, K. G. Subramanian, A. T. Khader, and SIGCOMM Comput. Commun. Rev., vol. 40, no. 4, pp. 327–338,
P. De Wilde, “Intelligent evacuation management systems: A review,” Oct. 2010.
ACM Trans. Intell. Syst. Technol., vol. 7, no. 3, Feb. 2016, Art. no. 36. [107] Y. Cui, H. Wang, X. Cheng, and B. Chen, “Wireless data center
[81] M. Erdelj, E. Natalizio, K. R. Chowdhury, and I. F. Akyildiz, “Help networking,” IEEE Wireless Commun., vol. 18, no. 6, pp. 46–53,
from the sky: Leveraging UAVs for disaster management,” IEEE Dec. 2011.
Pervasive Comput., vol. 16, no. 1, pp. 24–32, Jan./Mar. 2017. [108] K. Suto et al., “A failure-tolerant and spectrum-efficient wireless data
[82] N. Y. Soltani, S.-J. Kim, and G. B. Giannakis, “Real-time load elasticity center network design for improving performance of big data mining,”
tracking and pricing for electric vehicle charging,” IEEE Trans. Smart in Proc. IEEE 81st Veh. Technol. Conf. (VTC Spring), Glasgow, U.K.,
Grid, vol. 6, no. 3, pp. 1303–1313, May 2015. May 2015, pp. 1–5.
[83] J. H. Yoon, R. Baldick, and A. Novoselac, “Dynamic demand response [109] A. Botta, W. de Donato, V. Persico, and A. Pescapé, “On the integration
controller based on real-time retail price for residential buildings,” of cloud computing and Internet of Things,” in Proc. Int. Conf. Future
IEEE Trans. Smart Grid, vol. 5, no. 1, pp. 121–129, Jan. 2014. Internet Things Cloud, Barcelona, Spain, Aug. 2014, pp. 23–30.
[110] J. Zhou et al., “CloudThings: A common architecture for integrating
[84] P.-Y. Chen, S. Yang, and J. A. McCann, “Distributed real-time anomaly
the Internet of Things with cloud computing,” in Proc. IEEE 17th Int.
detection in networked industrial sensing systems,” IEEE Trans. Ind.
Conf. Comput. Supported Cooperat. Work Design (CSCWD), Whistler,
Electron., vol. 62, no. 6, pp. 3832–3842, Jun. 2015.
BC, Canada, Jun. 2013, pp. 651–657.
[85] S. Kazemi, R. J. Millar, and M. Lehtonen, “Criticality analysis of fail- [111] K.-D. Chang, C.-Y. Chen, J.-L. Chen, and H.-C. Chao, Internet
ure to communicate in automated fault-management schemes,” IEEE of Things and Cloud Computing for Future Internet. Heidelberg,
Trans. Power Del., vol. 29, no. 3, pp. 1083–1091, Jun. 2014. Germany: Springer, Sep. 2011, pp. 1–10.
[86] H. Liu, “Remote intelligent medical monitoring system based [112] K. Kant, “Data center evolution: A tutorial on state of the art, issues,
on Internet of Things,” in Proc. Int. Conf. Smart Grid Elect. and challenges,” Comput. Netw., vol. 53, no. 17, pp. 2939–2965,
Autom. (ICSGEA), Zhangjiajie, China, Aug. 2016, pp. 42–45. Dec. 2009.
[87] S. Sakr and A. Elgammal, “Towards a comprehensive data analyt- [113] A. Greenberg et al., “VL2: A scalable and flexible data center network,”
ics framework for smart healthcare services,” Big Data Res., vol. 4, ACM SIGCOMM Comput. Commun. Rev., vol. 39, no. 4, pp. 51–62,
pp. 44–58, Jun. 2016. Oct. 2009.
[88] L. Catarinucci et al., “An IoT-aware architecture for smart health- [114] M. Al-Fares, A. Loukissas, and A. Vahdat, “A scalable, commodity
care systems,” IEEE Internet Things J., vol. 2, no. 6, pp. 515–526, data center network architecture,” in Proc. ACM SIGCOMM Conf. Data
Dec. 2015. Commun. (SIGCOMM), Seattle, WA, USA, Aug. 2008, pp. 63–74.
[89] T. T. Thakur, A. Naik, S. Vatari, and M. Gogate, “Real time traffic man- [115] R. N. Mysore et al., “Portland: A scalable fault-tolerant layer 2
agement using Internet of Things,” in Proc. Int. Conf. Commun. Signal data center network fabric,” in Proc. ACM SIGCOMM Conf. Data
Process. (ICCSP), Melmaruvathur, India, Apr. 2016, pp. 1950–1953. Commun. (SIGCOMM), Barcelona, Spain, Aug. 2009, pp. 39–50.
[90] W. Knight, “Driverless cars,” Technol. Rev., vol. 116, no. 6, pp. 44–49, [116] M. Al-Fares, S. Radhakrishnan, B. Raghavan, N. Huang, and A. Vahdat,
Nov. 2013. “Hedera: Dynamic flow scheduling for data center networks,” in
[91] J. G. Shanahan and L. Dai, “Large scale distributed data sci- Proc. 7th USENIX Conf. Netw. Syst. Design Implement. (NSDI),
ence using apache spark,” in Proc. 21th ACM SIGKDD Int. Conf. San Jose, CA, USA, Apr. 2010, p. 19.
Knowl. Disc. Data Min. (KDD), Sydney, NSW, Australia, Aug. 2015, [117] N. Farrington et al., “Helios: A hybrid electrical/optical switch archi-
pp. 2323–2324. tecture for modular data centers,” ACM SIGCOMM Comput. Commun.
[92] H. Karau, A. Konwinski, P. Wendell, and M. Zaharia, Learning Spark: Rev., vol. 40, no. 4, pp. 339–350, Oct. 2010.
Lightning-Fast Big Data Analytics, 1st ed. Beijing, China: O’Reilly [118] K. Chen et al., “OSA: An optical switching architecture for data cen-
Media, 2015. ter networks with unprecedented flexibility,” IEEE/ACM Trans. Netw.,
[93] P. Carbone et al., “Apache flink: Stream and batch processing in a vol. 22, no. 2, pp. 498–511, Apr. 2014.
single engine,” Bulletin IEEE Comput. Soc. Tech. Committee Data Eng., [119] C. Kachris and I. Tomkos, “A survey on optical interconnects for data
vol. 38, no. 4, pp. 28–38, 2015. centers,” IEEE Commun. Surveys Tuts., vol. 14, no. 4, pp. 1021–1036,
[94] N. Spangenberg, M. Roth, and B. Franczyk, “Evaluating new 4th Quart., 2012.
approaches of big data analytics frameworks,” in Proc. Int. Conf. Bus. [120] M. J. O’Mahony, D. Simeonidou, D. K. Hunter, and A. Tzanakaki,
Inf. Syst., Poznań, Poland, Jun. 2015, pp. 28–37. “The application of optical packet switching in future communica-
[95] V. Markl, “Breaking the chains: On declarative data analysis and data tion networks,” IEEE Commun. Mag., vol. 39, no. 3, pp. 128–135,
independence in the big data era,” in Proc. VLDB Endowment, vol. 7, Mar. 2001.
no. 13, Aug. 2014, pp. 1730–1733. [121] O. Gerstel, M. Jinno, A. Lord, and S. J. B. Yoo, “Elastic optical
[96] R. Ranjan, “Streaming big data processing in datacenter clouds,” IEEE networking: A new dawn for the optical layer?” IEEE Commun. Mag.,
Cloud Comput., vol. 1, no. 1, pp. 78–83, May 2014. vol. 50, no. 2, pp. s12–s20, Feb. 2012.
[122] S. Sarkar, H.-H. Yen, S. Dixit, and B. Mukherjee, “A novel delay-aware
[97] A. Bahga and V. Madisetti, Internet of Things: A Hands-On Approach.
routing algorithm (DARA) for a hybrid wireless-optical broadband
Bothell, WA, USA: VPT, Aug. 2014.
access network (WOBAN),” IEEE Netw., vol. 22, no. 3, pp. 20–28,
[98] M. Rychly, P. Koda, and P. Mr, “Scheduling decisions in stream pro- May/Jun. 2008.
cessing on heterogeneous clusters,” in Proc. 8th Int. Conf. Complex [123] S. Kandula, J. Padhye, and P. Bahl, “Flyways to de-congest data
Intell. Softw. Intensive Syst. (CISIS), Birmingham, U.K., Jan. 2014, center networks,” Microsoft Res., Microsoft Corp., Redmond, WA,
pp. 614–619. USA, Tech. Rep. MSR-TR-2009-109, Aug. 2009. [Online]. Available:
[99] M. Yang and R. T. B. Ma, “Smooth task migration in apache storm,” https://www.microsoft.com/en-us/research/publication/flyways-to-de-
in Proc. ACM SIGMOD Int. Conf. Manag. Data, Melbourne, VIC, congest-data-center-networks/
Australia, May 2015, pp. 2067–2068. [124] J.-Y. Shin, E. G. Sirer, H. Weatherspoon, and D. Kirovski, “On the fea-
[100] T. Akidau et al., “The dataflow model: A practical approach to bal- sibility of completely wireless datacenters,” in Proc. 8th ACM/IEEE
ancing correctness, latency, and cost in massive-scale, unbounded, Symp. Architect. Netw. Commun. Syst. (ANCS), Austin, TX, USA,
out-of-order data processing,” in Proc. ACM Proc. 41st Int. Oct. 2012, pp. 3–14.
Conf. Very Large Data Bases, Kohala, HI, USA, Aug. 2015, [125] A. C. Azagury et al., “GPFS-based implementation of a hyper-
pp. 1792–1803. converged system for software defined infrastructure,” IBM J. Res.
[101] F. Yang et al., “Druid: A real-time analytical data store,” in Proc. ACM Develop., vol. 58, nos. 2–3, pp. 1–12, Mar./May 2014.
SIGMOD Int. Conf. Manag. Data, Snowbird, UT, USA, Jun. 2014, [126] A. J. Younge et al., “Analysis of virtualization technologies for high
pp. 157–168. performance computing environments,” in Proc. IEEE 4th Int. Conf.
[102] M. Barlow, Real-Time Big Data Analytics: Emerging Architecture. Cloud Comput., Washington, DC, USA, Sep. 2011, pp. 9–16.
Sebastopol, CA, USA: O’Reilly Media, Jun. 2013. [127] S. Park, B. Cha, and J. Kim, “Preparing and inter-connecting hyper-
[103] A. Greenberg, J. Hamilton, D. A. Maltz, and P. Patel, “The cost of a converged SmartX Boxes for IoT-cloud testbed,” in Proc. IEEE 29th
cloud: Research problems in data center networks,” ACM SIGCOMM Int. Conf. Adv. Inf. Netw. Appl., Gwangiu, South Korea, Apr. 2015,
Comput. Commun. Rev., vol. 39, no. 1, pp. 68–73, Jan. 2009. pp. 695–697.
[128] N. Elmqvist and P. Irani, “Ubiquitous analytics: Interacting with [156] D. Halperin, S. Kandula, J. Padhye, P. Bahl, and D. Wetherall,
big data anywhere, anytime,” Computer, vol. 46, no. 4, pp. 86–89, “Augmenting data center networks with multi-gigabit wireless links,”
Apr. 2013. ACM SIGCOMM Comput. Commun. Rev., vol. 41, no. 4, pp. 38–49,
[129] A. Gattiker, F. H. Gebara, H. P. Hofstee, J. D. Hayes, and A. Hylick, Aug. 2011.
“Big data text-oriented benchmark creation for Hadoop,” IBM J. Res. [157] X. Gao et al., “Traffic load balancing schemes for devolved controllers
Develop., vol. 57, nos. 3–4, pp. 1–6, May/Jul. 2013. in mega data centers,” IEEE Trans. Parallel Distrib. Syst., vol. 28, no. 2,
[130] J. Dean and S. Ghemawat, “Mapreduce: Simplified data processing on pp. 572–585, Feb. 2017.
large clusters,” Commun. ACM, vol. 51, no. 1, pp. 107–113, Jan. 2008. [158] X. Yi, F. Liu, J. Liu, and H. Jin, “Building a network highway for
[131] M. K. McKusick and S. Quinlan, “GFS: Evolution on fast-forward,” big data: Architecture and challenges,” IEEE Netw., vol. 28, no. 4,
ACM Mag. Queue File Syst., vol. 7, no. 7, p. 10, Aug. 2009. pp. 5–13, Jul. 2014.
[132] J. Zhang, G. Wu, X. Hu, and X. Wu, “A distributed cache for Hadoop [159] W. Slessman, S. Durand, R. Gilbert, and A. Zoll, “Modular data center,”
distributed file system in real-time cloud services,” in Proc. ACM/IEEE U.S. Patent 14/042 087, Sep. 2013.
Zubair Md. Fadlullah (M'11–SM'13) received the B.Sc. (Hons.) degree in computer science and information technology from the Islamic University of Technology, Bangladesh, in 2003, and the M.Sc. and Ph.D. degrees in applied information science from Tohoku University, Japan, in 2008 and 2011, respectively. He is currently an Associate Professor with the Graduate School of Information Sciences, Tohoku University. His research interests are in the areas of 5G, smart grid, network security, intrusion detection, game theory, and quality of security service provisioning mechanisms. He was a recipient of the Dean's Award and the President's Award from Tohoku University in 2011, the IEEE Asia Pacific Outstanding Researcher Award in 2015, the NEC Foundation Prize for research contributions in 2016, and several best paper awards at the GLOBECOM, IC-NIDC, and IWCMC conferences.

Hiroki Nishiyama (SM'13) received the M.S. and Ph.D. degrees in information science from Tohoku University, Japan, in 2007 and 2008, respectively. He is an Associate Professor with the Graduate School of Information Sciences, Tohoku University. He has published over 160 peer-reviewed papers, including many high-quality publications in prestigious IEEE journals and conferences. His research interests cover a wide range of areas, including satellite communications, unmanned aircraft system networks, wireless and mobile networks, ad hoc and sensor networks, green networking, and network security. He was a recipient of Best Paper Awards from many international conferences, including IEEE's flagship events such as the IEEE Global Communications Conference in 2014 (GLOBECOM'14), GLOBECOM'13, and GLOBECOM'10, the IEEE International Conference on Communications in 2016 (ICC'16), and the IEEE Wireless Communications and Networking Conference in 2014 (WCNC'14) and WCNC'12, as well as the Special Award of the 29th Advanced Technology Award for Creativity in 2015, the IEEE Communications Society Asia–Pacific Board Outstanding Young Researcher Award in 2013, the IEICE Communications Society Academic Encouragement Award in 2011, and the 2009 FUNAI Foundation's Research Incentive Award for Information Technology. He currently serves as an Associate Editor for the Springer Journal of Peer-to-Peer Networking and Applications and as the Secretary of the IEEE ComSoc Sendai Chapter. One of his outstanding achievements is Relay-by-Smartphone, which makes it possible to share information among many people through device-to-device direct communication. He is a Senior Member of the Institute of Electronics, Information and Communication Engineers.