2023-10-4-2-Miletic
Introduction
Teaching Associate, University of Belgrade Faculty of Organizational Sciences, Serbia.
Teaching Associate, University of Belgrade Faculty of Organizational Sciences, Serbia.
Teaching Assistant, University of Belgrade Faculty of Organizational Sciences, Serbia.
Research Associate, Institute of Economic Sciences, Serbia.
Full Professor, University of Belgrade Faculty of Organizational Sciences, Serbia.
https://doi.org/10.30958/ajte.10-4-2
Vol. 10, No. 4 Miletic et al.: A Data Streaming Architecture for Air Quality Monitoring…
Smart city services are relying on infrastructures based on IoT, edge, and
cloud computing technologies, whose interoperability requires an efficient message
distribution system. This paper proposes a methodology that enables continuous
monitoring and collection of air quality data in real-time, from sensors deployed at
different locations in smart cities. The methodology introduces a two-layer
architecture: Edge – air quality data collection, and Cloud – receiving and
processing data in a stream.
This paper is organized as follows. The next section reviews the relevant
literature on smart city IoT architectures and data processing. The methodology
and design of the proposed data streaming architecture are then presented,
followed by insights into the implementation and a discussion of the results.
Finally, the paper concludes with a discussion of the applicability of the
proposed architecture and future development.
Literature Review
Given the rapid growth and complex requirements of modern systems, it is
important to study the methods and architectures for cloud and edge computing
that have proven to be successful. This analysis allows us to better understand
best practices and address the challenges that this dynamic field of computing
brings.
Edge computing is a decentralized computing infrastructure that allows
remote devices to process data closer to the edge of the network, near the source
(Mitrović et al. 2023). Several analyses of edge computing platforms have shown
that edge computing is a good solution for cooperation with cloud, network
communication, and edge equipment (Chen et al. 2018, Martin Fernandez et al.
2018, Raza et al. 2019). This approach offers several advantages, including
reduced latency, bandwidth optimization, enhanced privacy and security, and
offline operation (Hassan et al. 2019, Shi et al. 2016, Varghese et al. 2016).
Among the most important concerns in edge computing are data privacy, a reduced
attack surface, local threats, communication security, and the trustworthiness of
edge devices. Some of these problems are addressed in the following papers (Ali
et al. 2021, Markham and Payne 2001, Xiao et al. 2019).
Athens Journal of Technology & Engineering December 2023
1. Sending messages from edge devices using both an MQTT broker and a Kafka
broker. The problem is that the device has to send data using two protocols,
and it has to ensure that the data either arrives at both brokers or at
neither of them.
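The difficulty with this option can be illustrated with a minimal sketch. It does not use real MQTT or Kafka clients: the two `send_*` functions are hypothetical stand-ins that only simulate a successful and a failed delivery, and all names are assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical stand-in for "publish this payload to one broker".
Publisher = Callable[[bytes], None]


def dual_publish(payload: bytes, publishers: Dict[str, Publisher]) -> Dict[str, bool]:
    """Try to send one payload to every broker and report which sends succeeded."""
    delivered = {}
    for name, publish in publishers.items():
        try:
            publish(payload)
            delivered[name] = True
        except ConnectionError:
            delivered[name] = False
    return delivered


def is_consistent(delivered: Dict[str, bool]) -> bool:
    """True only if the payload reached all brokers or none of them."""
    return len(set(delivered.values())) <= 1


# Simulated brokers: MQTT accepts the message, Kafka is unreachable.
mqtt_log = []


def send_mqtt(payload: bytes) -> None:
    mqtt_log.append(payload)


def send_kafka(payload: bytes) -> None:
    raise ConnectionError("Kafka broker unreachable (simulated)")


result = dual_publish(b'{"pm25": 12.4}', {"mqtt": send_mqtt, "kafka": send_kafka})
print(result, "consistent:", is_consistent(result))
```

In this simulated run the message is already stored on the MQTT side when the Kafka send fails, so the device is left in exactly the partial state described above and would need extra logic (retries or compensation) to recover.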
Most authors include mobile phones in their systems to gather data from
mobile device sensors. In the research of Kraft et al. (2020), a mobile collective
sensing system is proposed that enables the implementation of a noise level map
for tinnitus patients. After the requirements analysis and design phase, the system
is decomposed into bounded contexts to achieve a clear and shared definition of
consistency between team members. Five bounded contexts were identified,
including user identity, social aspects, measurement, incentives, and communication.
This architecture uses a cloud-native approach, which enables efficient and
scalable processing of concurrent noise measurement requests, using microservices
and containerization technologies such as Docker and Kubernetes. In addition, the
system uses in-stream processing and the Apache Kafka platform for real-time
data processing and to decouple the processing of incoming geospatial data.
To display polluted areas on the map, the authors use geospatial data
partitioning techniques. Hierarchical partitioning of geospatial data, such as the
implementation of the Discrete Global Grid System (DGGS), allows data to be
divided into different levels of detail. DGGS represents data in different
partition sizes, which enables aggregation and visualization at different
scales. The map uses the Hexagonal Hierarchical Spatial Index (H3) system. H3
places geospatial data precisely in the appropriate partitions based on their
coordinates. This technique makes it easier to analyze and visualize polluted
areas on the map.
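The idea of hierarchical partitioning can be sketched with a much simpler scheme than H3. The example below is an assumption made for illustration only: it uses a plain latitude/longitude square grid in which each level halves the cell size, not hexagonal H3 cells and not the partitioning used by Kraft et al. (2020). The sample coordinates and noise values are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Cell = Tuple[int, int, int]  # (level, row, column)


def cell_for(lat: float, lon: float, level: int) -> Cell:
    """Map a coordinate to a square grid cell; each level halves the cell size."""
    cells_per_axis = 2 ** level
    row = min(int((lat + 90.0) / 180.0 * cells_per_axis), cells_per_axis - 1)
    col = min(int((lon + 180.0) / 360.0 * cells_per_axis), cells_per_axis - 1)
    return (level, row, col)


def parent_of(cell: Cell) -> Cell:
    """Return the coarser cell (one level up) that contains the given cell."""
    level, row, col = cell
    if level == 0:
        raise ValueError("level 0 is the coarsest level")
    return (level - 1, row // 2, col // 2)


def average_per_cell(
    readings: Iterable[Tuple[float, float, float]], level: int
) -> Dict[Cell, float]:
    """Aggregate (lat, lon, value) readings into per-cell averages at one level."""
    sums = defaultdict(lambda: [0.0, 0])
    for lat, lon, value in readings:
        bucket = sums[cell_for(lat, lon, level)]
        bucket[0] += value
        bucket[1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


# Made-up sample readings: two near Belgrade, one far away.
sample = [(44.8, 20.46, 60.0), (44.8001, 20.4601, 70.0), (-33.9, 151.2, 40.0)]
averages = average_per_cell(sample, level=8)
```

Aggregating at a lower level merges neighbouring cells, which is the property that allows the same measurements to be shown at different map scales.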
The development of a mobile crowdsensing system that monitors noise
pollution to support decision-making in smart cities by collecting, storing, and
visualizing noise pollution data in real time is described by Jezdović et al.
(2021). The authors also present the results of an experiment conducted in
Belgrade, Serbia, and recommendations on how the system can be applied in other
cities. The system consists of a crowdsensing mobile application, a cloud, and a
big data infrastructure. The mobile application records noise using the
microphone of a mobile device, records the location of the detected noise using
GPS, performs spectral analysis on the audio data, and stores the transformed
data and location data in a cloud database. The web application displays the
pollution data on Google Maps as a heatmap, on which red, orange, and green
areas represent high, medium, and low levels of noise, respectively. The last two
papers show that the authors used two different approaches to represent their
data on the map.
Methodology
Figure 1. Data Streaming Architecture for Air Quality Monitoring in Smart Cities
To enable data transfer, the MQTT protocol is used together with the Mosquitto
MQTT broker, which is installed on the RPi device. Mosquitto was chosen because
it implements the MQTT protocol, which provides a lightweight method of carrying
out messaging using a publish/subscribe model. This makes it suitable for IoT
messaging, such as with low-power sensors or mobile devices such as phones,
embedded computers, or microcontrollers (What Is Mosquitto MQTT? n.d.). These
tools allow data to be sent to Apache Kafka, which resides in the Cloud layer of
the architecture. Apache Kafka is used as the central mechanism for receiving and
processing data in a stream. It was chosen because it is highly scalable for
processing and streaming data in real time. Kafka provides data replication and
keeps data available even in case of network failures or interruptions, which is
important for continuous monitoring of air quality (Korab n.d.). Kafka also
supports simultaneous sending and receiving of data, which enables real-time data
analysis and processing. This is a very important feature in smart cities because
it allows quick detection of and reaction to changes in air quality. The data
arriving at Kafka would be processed, saved in the database, and sent in real
time via the access services to end-user applications.
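The hand-over from the Edge layer to the Cloud layer can be sketched as a small translation step from an MQTT message to a Kafka record. The sketch below is not the implementation used in this paper. The MQTT topic layout `city/<location>/<sensor_id>`, the Kafka topic name `air-quality`, and the payload fields are assumptions made for illustration, and the `send` callable is a stand-in for whatever Kafka producer client is used.

```python
import json
from typing import Callable, NamedTuple


class KafkaRecord(NamedTuple):
    topic: str
    key: bytes
    value: bytes


def to_kafka_record(mqtt_topic: str, payload: bytes) -> KafkaRecord:
    """Translate one MQTT message into a Kafka record.

    Assumes MQTT topics shaped like 'city/<location>/<sensor_id>' (hypothetical).
    """
    parts = mqtt_topic.split("/")
    if len(parts) != 3:
        raise ValueError(f"unexpected MQTT topic: {mqtt_topic!r}")
    _, location, sensor_id = parts
    reading = json.loads(payload)
    reading["location"] = location
    reading["sensor_id"] = sensor_id
    return KafkaRecord(
        topic="air-quality",  # hypothetical Kafka topic name
        key=sensor_id.encode("utf-8"),
        value=json.dumps(reading, sort_keys=True).encode("utf-8"),
    )


def bridge(mqtt_topic: str, payload: bytes, send: Callable[[KafkaRecord], None]) -> KafkaRecord:
    """Convert an incoming MQTT message and pass it to a Kafka send function."""
    record = to_kafka_record(mqtt_topic, payload)
    send(record)
    return record


# Usage with an in-memory list standing in for a real Kafka producer.
sent = []
record = bridge("city/belgrade-center/rpi-01", b'{"pm25": 12.4}', sent.append)
```

Using the sensor identifier as the record key is one possible design choice: with Kafka's default key-based partitioning, readings from the same sensor land in the same partition, which preserves their order.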
Figure 2. Data Streaming Architecture for Air Quality Monitoring in Smart Cities
with Data Flow
diagrams due to the sensors' different output value ranges. These metrics
provided valuable information for assessing air quality trends and identifying
potential pollution patterns during different periods of the day in Belgrade.
Figure 3. Line Chart Based on Average Values for Every Hour in the Day
Figure 4. Bar Chart Based on Average Values for Specific Times of the Day
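Averages of the kind plotted in Figures 3 and 4 could be computed along the following lines. This is a minimal sketch under stated assumptions, not the processing code used in this paper: the readings are made-up sample values, and the hour ranges that define each time of day are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, Iterable, Tuple

Reading = Tuple[datetime, float]


def hourly_averages(readings: Iterable[Reading]) -> Dict[int, float]:
    """Average all values that fall in the same hour of the day (0-23)."""
    by_hour = defaultdict(list)
    for timestamp, value in readings:
        by_hour[timestamp.hour].append(value)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


def period_averages(
    readings: Iterable[Reading], periods: Dict[str, range]
) -> Dict[str, float]:
    """Average values per named time of day; periods without data are omitted."""
    by_period = defaultdict(list)
    for timestamp, value in readings:
        for name, hours in periods.items():
            if timestamp.hour in hours:
                by_period[name].append(value)
    return {name: mean(values) for name, values in by_period.items()}


# Made-up sample data and hypothetical period boundaries.
sample = [
    (datetime(2023, 5, 1, 8, 0), 10.0),
    (datetime(2023, 5, 1, 8, 30), 20.0),
    (datetime(2023, 5, 2, 8, 15), 30.0),
    (datetime(2023, 5, 1, 19, 0), 40.0),
]
periods = {"morning": range(6, 12), "evening": range(18, 24)}

per_hour = hourly_averages(sample)
per_period = period_averages(sample, periods)
```

Because sensors report values in different ranges, such averages would be computed separately for each sensor or metric rather than mixed in one series.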
Conclusions
The main goal of this paper is to present the design and development of a
robust architecture for crowdsensing systems in smart cities. This architecture
provides the basis for effective collection and analysis of data in real time,
opening new possibilities for measuring and monitoring the level of pollution and
other parameters in the local environment. Even though the architecture makes
data processing and analysis easy, some shortcomings have also been identified.
These concern setting the optimal number of topics, the replication factor of the
topics, and the number of brokers needed. For now, the replication factor is set
to three, because it is the golden rule (Ibryam n.d.), but further analysis and
testing may change this value depending on the number of sensors that will be
sending data to Apache Kafka.
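The effect of the replication factor on fault tolerance can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions: the `min.insync.replicas` setting and the `acks=all` producer mode are standard Kafka options that are not discussed in this paper, and the calculation assumes all replicas are in sync before a failure occurs.

```python
from typing import Dict


def tolerated_broker_failures(replication_factor: int, min_insync_replicas: int = 1) -> Dict[str, int]:
    """Estimate how many broker failures one partition can survive.

    Assumes every replica is in sync before the failures happen.
    """
    if not 1 <= min_insync_replicas <= replication_factor:
        raise ValueError("min_insync_replicas must be between 1 and replication_factor")
    return {
        # At least one copy of the data remains.
        "without_losing_data": replication_factor - 1,
        # Producers using acks=all can still write.
        "while_accepting_acks_all_writes": replication_factor - min_insync_replicas,
    }


# Replication factor of three, as used in this paper, with an assumed
# min.insync.replicas of two.
print(tolerated_broker_failures(3, 2))
```

Under these assumptions, a replication factor of three keeps the data through two broker failures, while writes that require two in-sync replicas continue through one.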
Implementation of the developed architecture has applicability in practice for
creating crowdsensing systems in smart cities, especially for measuring air quality
levels. This enables fast and efficient data collection from various sensors and
devices, providing valuable information for management and monitoring of
environmental quality. Future steps may include implementing the developed
architecture on the Docker platform, scaling the system via Kubernetes, and
displaying measurements via a map. This would enable better resource management
and scaling, as well as greater flexibility and fault tolerance in the system
environment. Through this work, it was observed that creating an efficient
architecture for data streaming in crowdsensing systems is essential for successful
real-time data collection and analysis. These results can serve as guidelines for
other researchers dealing with similar problems in the field of smart cities and air
quality meters. Future research will be focused on exploring the possibility of
applying this architecture in wider contexts of smart cities, as well as optimizing
the number of topics, replications, and brokers to achieve maximum efficiency and
scalability. Also, research into the integration of this architecture with other
relevant technologies and platforms opens the door for further improvement of
crowdsensing systems in smart cities.
References
Akbar A, Khan A, Carrez F, Moessner K (2017) Predictive analytics for complex IoT data
streams. IEEE Internet of Things Journal 4(5): 1571–1582.
Ali B, Gregory MA, Li S (2021) Multi-access edge computing architecture, data security
and privacy: a review. IEEE Access 9: 18706–18721.
Cao K, Liu Y, Meng G, Sun Q (n.d.) An overview on edge computing research. Available
at: https://doi.org/10.1109/ACCESS.2020.2991734.
Chen B, Wan J, Celesti A, Li D, Abbas H, Zhang Q (2018) Edge computing in IoT-based
manufacturing. IEEE Communications Magazine 56(9): 103–109.
Cheng B, Solmaz G, Cirillo F, Kovacs E, Terasawa K, Kitazawa A (2018) FogFlow: easy
programming of IoT services over cloud and edges for smart cities. IEEE Internet of
Things Journal 5(2): 696–707.
De Souza PRR, Matteussi KJ, Veith ADS, Zanchetta BF, Leithardt VRQ, Murciego AL,
et al. (2020) Boosting big data streaming applications in clouds with burstflow. IEEE
Access 8: 219124–219136.
Fridelin Panduman YY, Ulil Albaab MR, Anom Besari AR, Sukaridhoto S, Tjahjono A,
Nourma Budiarti RP (2019) Implementation of data abstraction layer using kafka on
SEMAR platform for air quality monitoring. International Journal on Advanced
Science, Engineering and Information Technology 9(5): 1520–1527.
Froiz-Míguez I, Fernández-Caramés TM, Fraga-Lamas P, Castedo L (2018) Design,
Implementation and Practical Evaluation of an IoT Home Automation System for
Fog Computing Applications Based on MQTT and ZigBee-WiFi Sensor Nodes.
Sensors 18(8): 2660.
Hassan N, Yau KLA, Wu C (2019) Edge computing in 5G: A review. IEEE Access 7:
127276–127289.
Heinze T, Aniello L, Querzoni L, Jerzak Z (2014) Cloud-based data stream processing. In
DEBS 2014 - Proceedings of the 8th ACM International Conference on Distributed
Event-Based Systems, 238–245.
Hugo A, Morin B, Svantorp K (2020) Bridging MQTT and Kafka to support C-ITS: a
feasibility study. In Proceedings - IEEE International Conference on Mobile Data
Management, 2020-January, 371–376.
Ibryam B (n.d.) Fine-tune Kafka performance with the Kafka optimization theorem. Red
Hat Developer. Available at:
https://developers.redhat.com/articles/2022/05/03/fine-tune-kafka-performance-kafka-optimization-theorem#the_kafka_optimization_theorem.
Javed A, Heljanko K, Buda A, Framling K (2018) CEFIoT: a fault-tolerant IoT
architecture for edge and cloud. In IEEE World Forum on Internet of Things, WF-
IoT 2018 - Proceedings, 2018-January, 813–818.
Jezdović I, Popović S, Radenković M, Labus A, Bogdanović Z (2021) A crowdsensing
platform for real-time monitoring and analysis of noise pollution in smart cities. In
Sustainable Computing: Informatics and Systems, 31.
Khriji S, Benbelgacem Y, Chéour R, Houssaini DE, Kanoun O (2022) Design and
implementation of a cloud-based event-driven architecture for real-time data
processing in wireless sensor networks. Journal of Supercomputing 78(3): 3374–
3401.
Korab J (n.d.) How to survive a Kafka outage. Available at:
https://www.confluent.io/blog/how-to-survive-a-kafka-outage/.
Koziolek H, Grüner S, Rückert J (n.d.) A comparison of MQTT brokers for distributed IoT
edge computing.
Kraft R, Birk F, Reichert M, Deshpande A, Schlee W, Langguth B, et al. (2020) Efficient
processing of geospatial mhealth data using a scalable crowdsensing platform.
Sensors (Switzerland) 20(12): 1–21.
Markham T, Payne C (2001) Security at the network edge: a distributed firewall
architecture. In Proceedings - DARPA Information Survivability Conference and
Exposition II, DISCEX 2001, 1, 279–286.
Martin Fernandez C, Diaz Rodriguez M, Rubio Munoz B (2018) An edge computing
architecture in the internet of things. In Proceedings - 2018 IEEE 21st International
Symposium on Real-Time Computing, ISORC 2018, 99–102.
Mitrović N, Đorđević M, Veljković S, Danković D (2023) View of IoT enabled software
platform for air quality measurements. Available at:
https://www.ebt.rs/journals/index.php/conf-proc/article/view/188/135.
MQTT and Kafka. How to combine two complementary… | by Techletters | Python Point |
Medium (n.d.) Available at:
https://medium.com/python-point/mqtt-and-kafka-8e470eff606b.