Said 2020
Department of Computer Engineering, College of Engineering and Technology, Arab
Academy for Science and Technology, Alexandria, Egypt.
Abstract. Applications for the Internet of Things (IoT) have grown rapidly in number,
producing vast amounts of data that require intelligent processing. The variety of IoT
infrastructures, such as the cloud, and the limitations of IoT application-layer protocols in
transmitting and receiving messages are barriers to implementing intelligent IoT
applications. In this paper, we review the importance of big data, cloud computing and fog
computing in IoT and the challenges of using machine learning in IoT. Finally, we discuss
general statistics on the use of artificial intelligence in IoT applications.
1. Introduction:
The word "smart" fascinates us, but the tools we use today are far from being as smart as a
human being. Artificial intelligence aims to make computers reason like humans, which is why
machine learning [1] and data analysis [2] need to be combined in one system. Machine learning creates
methods that make the network's components automatic and self-sufficient, while data analysis examines
the data to identify historical trends so that the system becomes more effective and accurate in the future.
IoT [3] is the main subject of this trend and will provide a world full of intelligent devices called
“smart objects” [4], connected through the internet, Bluetooth or infrared. In this paper, we research and
evaluate the role of machine learning in promoting data analytics for IoT systems.
2. Big Data in IoT environments:
Intrinsically, IoT data is a type of big data. The widespread use of sensors for data collection, the
use of data collected over long periods of time and the need to evaluate it at scale to
support decision making mean that IoT data overlaps with many dimensions of big data [5]. In the
following section we describe the main characteristics of big data found in IoT and how these
characteristics pose a challenge to traditional machine learning techniques. Figure 1 shows the
IoT data characteristics; the left part of the figure, concerning big data characteristics, is discussed
in the following section.
Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution
of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
Published under licence by IOP Publishing Ltd
ICATAS-MJJIC 2020 IOP Publishing
IOP Conf. Series: Materials Science and Engineering 1051 (2021) 012008 doi:10.1088/1757-899X/1051/1/012008
3.1. Velocity
Velocity is one of the main properties of big data, and hence of IoT data. The velocity dimension refers
to how quickly the data is produced and how fast the application needs to process it. "Fast data" is
endemic to IoT, since the generated data is in most cases a continuous and unbounded stream from a
multitude of sensors [6]. The velocity of IoT data is affected by many factors, most directly by
the sampling rate and the number of sensors in a given environment. Very high sampling rates
often lead to redundancy if the observations change slowly. Nonetheless, a certain amount of
redundancy is useful for detecting events with a given level of
confidence [5].
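As a rough illustration of the two factors named above, the following sketch estimates the stream rate of a deployment and shows how slowly changing observations produce redundant readings. The deployment figures, the 16-byte reading size and the deadband filter are assumptions made for this example; they are not taken from the paper.

```python
def stream_rate(num_sensors, sampling_hz, bytes_per_reading):
    """Return (readings per second, bytes per second) for a sensor deployment."""
    readings_per_s = num_sensors * sampling_hz
    return readings_per_s, readings_per_s * bytes_per_reading


def deadband_filter(readings, threshold):
    """Keep a reading only if it differs from the last kept one by more than threshold."""
    kept = []
    for value in readings:
        if not kept or abs(value - kept[-1]) > threshold:
            kept.append(value)
    return kept


# Hypothetical deployment: 200 sensors sampled at 10 Hz, 16 bytes per reading.
rate, bandwidth = stream_rate(200, 10, 16)

# A slowly changing temperature trace sampled at a high rate (made-up values).
trace = [20.0, 20.0, 20.1, 20.1, 20.6, 20.6, 21.2]
reduced = deadband_filter(trace, threshold=0.3)

print(rate, "readings/s,", bandwidth, "bytes/s")
print("kept", len(reduced), "of", len(trace), "readings:", reduced)
```

The threshold controls how much redundancy is retained: a smaller value keeps more near-duplicate readings, which may be preferable when events must be detected with higher confidence.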
3.2. Volume
Volume is no less important than velocity. Big data implies enormous volumes of data. One
view of volume is the amount of data generated per sensor and the number of (distributed) sensors.
Moreover, these sensors observe constantly, so the streaming data is potentially unbounded.
3.3. Variety
The variety component of big data is representative of the various potential types of data representations
and protocols. IoT data heterogeneity is common, and can be due to different data sources, data types,
or networks. IoT applications collect data from a wide range of sensors and in a wide range of formats
due to the variety of environment variables being monitored.
3.4. Veracity
Veracity refers to the biases, noise and abnormality in data. The question here is whether the data
being stored and mined is relevant to the problem being analyzed. Figure 2 shows the characteristics
of big data [31].
related to cloud computing and IoT. Figure 3 shows different IoT components and how analytics
apply to these components [9]. Fog computing aims to form low-latency network links between devices
and the endpoints of their analyses, and to reduce the bandwidth required [10]. An additional advantage
is the advanced security features that users can implement in a fog network, such as network traffic
segmentation and virtual extension of firewalls, to provide network security.
6.2. Performance
Some machine learning algorithms, such as deep convolutional neural networks, achieve high
accuracy but have high computational and memory requirements. This makes them hard, and
sometimes infeasible, to implement on the resource-constrained devices used in IoT systems, for
example in safety-critical applications such as autonomous driving, which requires real-time image
processing [11]. Table 1 shows the frame rates of the YOLO algorithm, a state-of-the-art
object detection algorithm, measured on the Nvidia Jetson TX1 embedded
module [12], which is designed for visual computing applications [35].
Algorithm    Frame rate (fps)
O-YOLOv2     11.8
YOLOv2       5.4
Table 1. Results of testing YOLO algorithms on Nvidia Jetson TX1
As shown in Table 1, the frame rate did not exceed 17.85 frames per second, which is too slow for
applications that require real-time processing at a frame rate of 30 frames per second [11].
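The gap to real-time operation can be made concrete by converting frame rates into per-frame processing time. The sketch below uses the two frame rates listed in Table 1 and the 30 frames per second target from the text; the function and variable names are illustrative only.

```python
REAL_TIME_FPS = 30.0  # real-time target stated in the text

# Frame rates reported in Table 1 (Nvidia Jetson TX1).
frame_rates = {"O-YOLOv2": 11.8, "YOLOv2": 5.4}


def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


# Time budget per frame if the real-time target is to be met.
budget_ms = frame_time_ms(REAL_TIME_FPS)

report = {}
for name, fps in frame_rates.items():
    report[name] = {
        "latency_ms": frame_time_ms(fps),
        "meets_real_time": fps >= REAL_TIME_FPS,
        "slowdown": REAL_TIME_FPS / fps,  # how many times too slow
    }

print(f"budget per frame: {budget_ms:.1f} ms")
for name, row in report.items():
    print(f"{name}: {row['latency_ms']:.1f} ms per frame, "
          f"{row['slowdown']:.1f}x slower than the target")
```

At 30 frames per second each frame must be processed in about 33 ms, so a detector running at 11.8 frames per second (about 85 ms per frame) misses the budget by a factor of roughly 2.5.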
6.3. Reliability
Reliability is another core issue in machine learning algorithms. Algorithms such as neural
networks are not perfectly accurate, which makes them unreliable for applications that are highly
sensitive to accuracy, such as self-driving vehicles and cancer diagnosis. Moreover, deep learning is
being introduced in many industrial applications, including data mining and analytics, where industrial
standards need to be met, such as IEC 61508 [13], which sets reliability requirements for industrial
applications. Therefore, the reliability of deep learning-based systems needs to be ensured [13].
space [19]. For example, a 2 x 2 pooling operation on top of 12 feature maps will produce an output
tensor of size [16 x 16 x 12].
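The output size in this example can be checked with a short sketch. The text does not state the input size, so the code below assumes 12 feature maps of 32 x 32, a stride of 2 and max pooling; all three are assumptions chosen because they yield the stated [16 x 16 x 12] output.

```python
def pool_output_shape(height, width, channels, window=2, stride=2):
    """Output shape of pooling without padding; the channel count is unchanged."""
    out_h = (height - window) // stride + 1
    out_w = (width - window) // stride + 1
    return out_h, out_w, channels


def max_pool(feature_map, window=2, stride=2):
    """Max pooling over one 2-D feature map given as a list of rows."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - window + 1, stride):
        pooled_row = []
        for c in range(0, cols - window + 1, stride):
            # Collect the window x window patch and keep its largest value.
            patch = [feature_map[r + dr][c + dc]
                     for dr in range(window) for dc in range(window)]
            pooled_row.append(max(patch))
        pooled.append(pooled_row)
    return pooled


# Assumed input: 12 feature maps of 32 x 32 (input size not given in the text).
shape = pool_output_shape(32, 32, 12)

# Small 4 x 4 map to show the operation itself.
small = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
small_pooled = max_pool(small)

print(shape)
print(small_pooled)
```

Pooling is applied to each feature map independently, which is why the spatial dimensions are halved while the number of maps stays at 12.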
that is connected to a cloud platform depending on the actual traffic situation [27]. Over time, the
smart road system can predict the time and location of traffic and prevent it automatically.
Figure 6. Smart Parking System Overview
Figure 7. Street Lighting System Overview
Samples    Features    Classes
5893       14          5
Table 3. Specifications of the Transportation Mode detection dataset
10.1.2. Rain Prediction
The dataset was collected in Australia from weather station observations. It includes
many attributes, such as rainfall, evaporation, sunshine, temperature, direction, wind speed, pressure,
clouds and humidity. The dataset is used to predict whether it will rain the next day [31].
The specifications of this dataset are shown in Table 4.
Samples    Features    Classes
142000     24          2
Table 4. Specifications of the Rain detection dataset
ROC-AUC score    Used to discriminate between the positive and negative classes in binary
                 classification; indicates how good a model is.
Table 5. Performance Metrics
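To make the ROC-AUC score more concrete, the sketch below computes it from its pairwise interpretation: the fraction of (positive, negative) sample pairs in which the positive sample receives the higher score, with ties counted as half. The labels and scores are made-up toy values, not results from the experiments in this paper, and the function name is illustrative.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy example with made-up scores.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(y_true, y_score)
print(auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

A score of 1.0 means every positive sample is ranked above every negative one, while 0.5 corresponds to random ranking. The pairwise loop is quadratic in the number of samples, so for large datasets a sorting-based implementation would be the more practical choice.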
10.3. Results
10.3.1. Performance comparison of the algorithms on Transportation Mode Detection dataset:
Accuracy is the most reliable measure in this case because each target class in the dataset
has an equal number of labels, which means that the dataset is balanced [32], as shown in Table 6.
LR 63 62 62 64 63/65 0.68
The Rain Prediction dataset has 24 features. Three of them are categorical, each with
16 different values, so this dataset required considerably more pre-processing.
One-hot encoding was applied to convert the categorical features to
numeric ones, which roughly tripled the number of features [33], as shown in Table 7.
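The growth in feature count follows directly from how one-hot encoding works: each categorical feature is replaced by one binary column per distinct value. The sketch below illustrates this with the figures from the text (24 features, 3 categorical, 16 values each). The 16 compass labels are a hypothetical stand-in for one such feature, and the count assumes that no column is dropped during encoding.

```python
def one_hot(value, categories):
    """Encode one categorical value as a list of 0/1 indicators."""
    return [1 if value == c else 0 for c in categories]


def encoded_feature_count(total_features, categorical_features, values_per_categorical):
    """Number of columns after one-hot encoding every categorical feature."""
    numeric = total_features - categorical_features
    return numeric + categorical_features * values_per_categorical


# Hypothetical 16-valued categorical feature (16-point compass directions).
COMPASS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
           "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]

vector = one_hot("SE", COMPASS)

# Figures from the text: 24 features, 3 of them categorical with 16 values each.
n_encoded = encoded_feature_count(24, 3, 16)

print(vector)
print(n_encoded, "features after encoding,", round(n_encoded / 24, 2), "times the original")
```

Under these assumptions the 21 numeric features are kept and the 3 categorical features expand to 48 binary columns, giving 69 features in total, which is close to three times the original 24.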
LR 72 79 74 85 85/85 8.2
RF 80 71 74 99 85/85 1.42
11. Conclusion:
Nowadays, IoT applications are integrated into almost everything. This paper shows the importance of
using artificial intelligence in big data IoT applications through our proposed framework, and shows
the power of machine learning in providing benefits to consumers. The proposed framework is
evaluated on two datasets (Transportation Mode Detection and Rain Prediction). With the first
dataset, the best accuracy among the machine learning techniques tested is 85/87%, obtained with the RF
algorithm; with the second dataset, the same techniques give a best accuracy of
85/86%, obtained with the ANN algorithm. Our innovative architecture opens up new opportunities for
future work on machine-to-machine communications.
References
[1] R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, Machine Learning: An Artificial
Intelligence Approach. Springer Science & Business Media, 2013.
[2] I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques.
Morgan Kaufmann, 2016.
[3] Q. F. Hassan, A. R. Khan, and S. A. Madani, Internet of Things: Challenges, Advances, and
Applications. Chapman & Hall/CRC Computer and Information Science Series, CRC Press,
2017.
[4] G. Fortino and P. Trunfio, Internet of Things based on Smart Objects: Technology, Middleware
and Applications. Springer, 2014.
[5] Mohammad Saeid Mahdavinejad, Mohammadreza Rezvan, Mohammadamin Barekatain,
Peyman Adibi, Payam Barnaghi, Amit P. Sheth, Machine Learning for Internet of Things
Data Analysis: A Survey, 2018.
[6] Nashez Zubair, Niranjan A, Kiran Hebbar, Yogesh Simmhan, Characterizing IoT Data and its
Quality for Use, 2019.
[7] Bonomi F, Milito R, Zhu J, Addepalli S (2012) Fog computing and its role in the internet of
things. In: Proceedings of the First Edition of the MCC Workshop on Mobile Cloud
Computing, ACM, New York, NY, USA, MCC ’12, pp 13–16
[8] Joshi J, Reddy J, Reddy P, Agarwal A, Agarwal R, Bagga A, Bhargava A (2016) Cloud
computing based smart garbage monitoring system. In: 2016 3rd international conference on
electronic design (ICED), IEEE, pp 70–75
[9] Aazam M, Zeadally S, Harras KA (2018) Offloading in fog computing for IoT: review, enabling
technologies, and research opportunities. Future Generat Comput Syst 87:278–289
[10] Fog computing and the internet of things: extend the cloud to where the things are (2015).
https://www.cisco.com/c/dam/en_us/solutions/trends/iot/docs/computing-overview.pdf.
Accessed 18 Oct 2019
[11] Fatima Hussain, Rasheed Hussain, Syed Ali Hassan, and Ekram Hossain, Machine Learning in
IoT Security: Current Solutions and Future Challenges, IEEE 2020
[12] Shafiee, Mohammad Javad, et al. "Fast YOLO: A Fast You Only Look Once System for Real
time Embedded Object Detection in Video." arXiv preprint arXiv:1709.05943 (2017).
[13] IEC 61508 2016. Functional Safety and IEC 61508. (2016). Retrieved Oct. 2016 from
http://www.iec.ch/functionalsafety/
[14] A. Joakar, A Methodology for Solving Problems with DataScience for Internet of Things, Open
Gardens (blog) (July 21, 2016), http://www.opengardensblog.futuretext.com/archives/2016/07/
a-methodology-for-solvingproblems-with-datascience-for-internet-of-things.html
[15] H. Zhao and C. Huang, A data processing algorithm in epc internet of things, Cyber-enabled
Distributed Computing and Knowledge Discovery (CyberC), 2014 International Conference
on, IEEE (2014), pp. 128-131.
[16] E. Shelhamer, J. Long, and T. Darrell. Fully convolutional networks for semantic segmentation.
IEEE Trans Pattern Anal Mach Intell, 39(4):640–651, 2017.
[17] Farabet, C.; Couprie, C.; Najman, L.; LeCun, Y. Learning hierarchical features for scene labeling.
IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 1915–1929.
[18] Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image
recognition. arXiv 2014, arXiv:1409.1556.
[19] Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional
neural networks. In Proceedings of the 25th International Conference on Neural Information
Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1097–1105.
[20] N. Cristianini, J. Shawe-Taylor, An introduction to support vector machines and other kernel-
based learning methods, Cambridge university press, 2000.
[21] McCallum, K. Nigam, et al., A comparison of event models for naïve bayes text classification,
in: AAAI-98 workshop on learning for text categorization, Vol. 752, Citeseer, 1998, pp. 41-48.
[22] Neter, M. H. Kutner, C. J. Nachtsheim, W. Wasserman, Applied linear statistical models, Vol. 4,
Irwin Chicago, 1996.
[23] T. Cover, P. Hart, Nearest neighbor pattern classification, IEEE transactions on information
theory 13 (1) (1967) 21-27.
[24] Breiman, Random forests, Machine learning 45 (1) (2001) 5-32.
[25] Breiman, Bagging predictors, Machine learning 24 (2) (1996) 123-140.
[26] R. Petrolo, V. Loscri, N. Mitton, Towards a smart city based on cloud of things, a survey on the
smart city vision and paradigms, Transactions on Emerging Telecommunications
Technologies.
[27] M. Vanis and K. Urbaniec, “Employing Bayesian Networks and conditional probability functions
for determining dependences in road traffic accidents data,” 2017 Smart City Symposium
Prague (SCSP), May 2017.
[28] Prabhu Ramaswamy; IoT smart parking system for reducing green house gas emission; 2016
International Conference on Recent Trends in Information Technology (ICRTIT)
[29] N. Yoshiura, Y. Fujii, and N. Ohta. Smart street light system looking like usual street lights based
on sensor networks. In International Symposium on Communications and Information
Technologies (ISCIT), pages 633–637, Sept 2013
[30] D. Kleyko, R. Hostettler, W. Birk, and E. Osipov, "Comparison of machine learning techniques
for vehicle classification using road side sensors," in Proceedings of the IEEE 18th
International Conference on Intelligent Transportation Systems, 2015, pp. 572-577.
[31] Y. K. Ever, K. Dimililer, and B. Sekeroglu, "Comparison of Machine Learning Techniques for