Industrial Edge Computing: Architecture, Optimization and Applications
Xiaobo Zhou • Shuxin Ge • Jiancheng Chi • Tie Qiu
1-D One-Dimensional
2-D Two-Dimensional
3G Third Generation cellular networks
4G Fourth Generation cellular networks
5G Fifth Generation cellular networks
AC Always Cooperate
AI Artificial Intelligence
AET Average Execution Time
AoI Age of Information
AR Augmented Reality
ATOM Adaptive Offloading with Two-Stage Hybrid Matching
BS Base Station
CCU Computing Capability Utilization
CDF Cumulative Distribution Function
CEM Cross Entropy Method
CHR Cache Hit Rate
CNN Convolutional Neural Network
CPU Central Processing Unit
CSI Channel State Information
D2D Device-to-Device
DAG Directed Acyclic Graph
DAG-ED Directed Acyclic Graph with External Dependency
DDPG Deep Deterministic Policy Gradient
DL Deep Learning
DMDP Dynamic Markov Decision Process
DNN Deep Neural Network
DP Differential Privacy
DQN Deep Q-Network
DRL Deep Reinforcement Learning
DVFS Dynamic Voltage and Frequency Scaling
ED External Dependency
1.1 Concepts
equipment, whether from a single location or globally. The collected data can
be stored and analyzed for production planning or for innovating and optimizing
processes. Cloud computing offers a comprehensive, easy-to-maintain, flexible, and
cost-efficient data pool.
However, as services become increasingly digitized and automated, the volume
of data grows. Real-time application requirements, bandwidth limitations, and
security concerns reveal the shortcomings of cloud computing. For example, safety-
critical decisions in automated driving (such as “Does the car have to brake
immediately to avoid an accident?”) or industrial settings (such as “Does the
machine need to stop now to prevent injury?”) require instant action. However,
relaying data to the cloud for processing and decision-making in these situations
is inefficient. Delays due to high latency or poor connections can lead to serious
consequences. This is where edge computing becomes crucial.
Edge computing is a new computing paradigm that uses computing, storage, and
network resources distributed between data sources and cloud computing centers for
data analysis and processing [21]. This model employs edge devices with significant
computing power for local data preprocessing, immediate decision-making, and
then sends the results or preprocessed data to the cloud center. The rise of edge
computing is driven by the growing need for real-time data processing, especially
where bandwidth is limited and low latency is crucial. The advancement of IIoT has
led to smarter, more interconnected machinery in factories and on production lines,
increasing the need for advanced data processing technologies. In IIoT settings, edge
computing enables quick, local data processing for industrial equipment, essential
for efficient real-time monitoring and decision-making.
In this context, industrial edge computing has become a key technology. It
combines edge computing’s rapid data processing with IIoT’s intelligent device
management and optimization. Therefore, industrial edge computing enables more
efficient, reliable, and secure data processing in industrial environments. For
example, in smart manufacturing, it can process sensor data in real time for
better production control and maintenance. In automated workshops and intelli-
gent logistics systems, it plays a crucial role in enhancing production efficiency,
reducing operational costs, and improving safety and stability. As a blend of edge
computing and IIoT, industrial edge computing is opening new opportunities for
industrial automation and intelligence. With technological progress, its importance
in various industrial applications is expected to grow increasingly significant and
transformative [22, 23].
ments. Furthermore, critical trends, like the varying computing power in traditional
edge computing layers, are often overlooked. Recognizing and addressing these
differences are vital for a more accurate and functional reference architecture for
industrial edge computing.
[Figure: Reference architecture of industrial edge computing. The Device Layer (equipment, sensors, machines) exchanges sensing data and control instructions with the Edge Layer (including the Mid-Edge and Far-Edge sub-layers) over wired and wireless networks; the Edge Layer provides management, computing, storage, networking, development, controlling, and collaboration functions.]
The Edge Layer is the central component of the reference architecture for industrial
edge computing. Its main function is to receive, process, and transmit data from
the Device Layer. This layer provides critical services like edge security, privacy
protection, data analysis, intelligent computing, process optimization, and real-time
control. Given the significant variation in computing power among devices in the
Edge Layer, a practical approach is to divide this layer into three sub-layers: the
Near-Edge Layer, the Mid-Edge Layer, and the Far-Edge Layer. These divisions are
based on the varying data processing capabilities of the devices. Segmenting the
Edge Layer in this way allows for more customized data processing, ensuring that
each sub-layer’s specific capabilities and needs are effectively addressed within the
wider scope of industrial edge computing:
• Far-edge layer: The Far-Edge Layer, key to the Edge Layer in industrial edge
computing architecture, includes edge controllers that interface with the Device
Layer. These controllers handle initial tasks like threshold judgment or data
filtering and send control flows to the Device Layer. This can be directed from
the Edge Layer or through the Cloud Application Layer. Due to the variety of
sensors and devices in the Device Layer, the Far-Edge Layer’s edge controllers
must support various protocols for real-time data collection over IIoT's delay-sensitive networks. After collection, the data undergoes initial processing for
threshold judgment or filtering. The edge controllers must integrate and update
an algorithm library specific to their environmental setup, improving strategic
effectiveness. Additionally, they send control instructions to the Device Layer
using Programmable Logic Controller (PLC) control or action control modules,
based on decisions made at the Far-Edge Layer or above. Collaboration among
edge controllers is sometimes necessary for certain tasks.
A vital feature of the Far-Edge Layer is its millisecond-level latency in judgment
and feedback, crucial in emergencies. For instance, an unmanned vehicle needs
different production lines and equipment, or a smart grid’s edge server aggregat-
ing and optimizing electricity consumption statistics for diverse communities.
The Cloud Application Layer plays a pivotal role in industrial edge computing
architecture, primarily focusing on extracting potential value from vast amounts of
data and optimizing resource allocation across an enterprise, a region, or on a nation-
wide scale. This layer, operating through the public network, retrieves data from the
Edge Layer and supports upper-layer applications. These applications span a wide
array of functions, including product or process design, comprehensive enterprise
management, sales, and after-sales services. Additionally, the Cloud Application
Layer feeds back models and microservices to the Edge Layer, enhancing its
operational efficiency and decision-making capabilities.
Another key function of the Cloud Application Layer is its ability to facilitate
cloud collaboration. This feature enables the sharing of data among various groups
with different attributes, such as managers, cooperative enterprises, designers, and
customers. Such collaboration not only broadens the scope of data utilization but
also deepens the mining of data value, leading to more nuanced and multifaceted
insights. Decision-making processes within this layer typically span a longer time-
frame, often measured in days. This extended timescale is reflective of the complex
and comprehensive nature of the tasks handled by the Cloud Application Layer,
where strategic decisions impact broader organizational or regional objectives.
• Enhance Security and Privacy: Industrial edge computing can deploy security solutions locally, minimizing the risk of data leakage during transmission and reducing the volume of data stored in the cloud, thereby significantly lowering security and privacy risks.
• Reduce Operational Costs: Transferring data directly to the cloud platform
can incur substantial operational costs due to data migration, bandwidth require-
ments, and latency issues. Industrial edge computing reduces the volume of data
that needs to be uploaded, thereby decreasing the amount of data migration,
bandwidth consumption, and latency, which in turn reduces operational costs.
In this section, we explore the challenges faced by industrial edge computing. While
it offers significant benefits in system performance, data security, and cost reduction,
industrial edge computing faces various practical challenges. These include issues
with 5G foundational communications, data offloading and load balancing, edge
artificial intelligence (AI), and data sharing security. For example, integrating 5G
with industrial edge computing presents challenges in Quality of Service (QoS),
node management, and network slicing. As device numbers increase and computing
resources disperse, designing efficient data offloading and load balancing schemes
becomes crucial. Edge AI, while offering new data processing opportunities, also
raises concerns about computational power and model complexity. Furthermore,
ensuring the security and privacy of data in edge computing environments is a
pressing issue in industrial edge computing. We will now examine these challenges
in more detail to better understand the current and future prospects of industrial edge
computing.
In industrial edge computing systems, data offloading and load balancing are
major challenges. These arise from the large number of devices and the distributed
nature of computing resources. To tackle these issues, specialized schemes for data
offloading and load balancing are needed, considering the specific requirements of
each case. Data offloading in edge networks typically falls into two categories: full
and partial. Full data offloading means transferring all data from one device or
edge server to another. Partial data offloading involves dividing the task data and distributing it among different devices; in the extreme case, everything is offloaded to other devices. The main aim of load balancing methods is to distribute the load evenly,
addressing the varied storage and computing capacities of edge devices and the
differences in offloading strategies. Effective load balancing requires a customized
approach, tailored to the unique characteristics and scenarios of the edge computing
environment. Integrating new technologies into these strategies is also essential,
aiming to optimize load distribution across the network of devices in industrial edge
computing systems.
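As a concrete illustration, the following is a minimal greedy sketch (not a scheme from the literature) that assigns incoming tasks to heterogeneous edge servers by lowest normalized load; the task sizes and server capacities are arbitrary assumptions.

```python
import heapq

def least_loaded_assign(task_sizes, capacities):
    """Greedy weighted load balancing: send each task to the server whose
    normalized load (load / capacity) is currently lowest, approximating
    an even distribution across heterogeneous edge servers.
    """
    # Heap entries are (normalized_load, server_id, absolute_load).
    heap = [(0.0, sid, 0.0) for sid in range(len(capacities))]
    heapq.heapify(heap)
    assignment = []
    for size in task_sizes:
        norm, sid, load = heapq.heappop(heap)
        load += size
        assignment.append(sid)
        heapq.heappush(heap, (load / capacities[sid], sid, load))
    return assignment

# Illustrative: 8 tasks over three servers with capacities 1x, 2x, 4x.
print(least_loaded_assign([5, 3, 8, 2, 7, 4, 6, 1], [1.0, 2.0, 4.0]))
```

Normalizing by capacity is what lets faster servers absorb proportionally more work, which is the essence of balancing across devices with varied storage and computing capacities.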
Edge AI brings both new opportunities and significant challenges for data pro-
cessing in industrial edge computing. The challenges mainly focus on two areas.
First, the limited computing power of edge devices makes it difficult to quickly
complete extensive computational tasks. Second, the complexity of models used
in edge AI requires substantial computational resources for training and inference.
Acknowledging these limitations, a promising approach in industrial edge comput-
ing is the more effective integration of AI with edge computing. This integration
aims to combine the strengths of both AI and edge computing, using the real-time
data processing capabilities at the edge and efficiently managing the computational
demands of AI models. This synergy could significantly improve the efficiency and
effectiveness of data processing in industrial edge computing, making it a key area
of ongoing research and development.
The integration of edge computing in IIoT allows real-time data processing at the
edge. However, given the limited resources and the large number of edge devices,
many tasks require collaboration between multiple devices. This necessitates secure
data sharing among edge devices. IIoT demands high levels of security, and
blockchain technology can provide some measure of security for edge data sharing.
Nonetheless, the limited computing resources of edge devices pose a challenge in
designing and optimizing edge IIoT architecture based on blockchain. Challenges
include access control and secure storage using blockchain. Therefore, more focus
is needed on developing edge IIoT solutions that incorporate blockchain technology.
References
1. Ming Yang, Yanhui Wang, Cheng Wang, Yan Liang, Shaoqiong Yang, Lidong Wang, and
Shuxin Wang. Digital twin-driven industrialization development of underwater gliders. IEEE
Trans. Ind. Informatics, 19(9):9680–9690, 2023.
2. Veronica Brizzi, Giulia Baccarin, Andreas Bordonetti, and Michele Comperini. Implementa-
tion and industrialization of a deep-learning model for flood wave prediction based on grid
weather forecast for hourly hydroelectric plant optimization: case study on three alpine basins.
In Proceedings of the Italia Intelligenza Artificiale—Thematic Workshops co-located with the
3rd CINI National Lab AIIS Conference on Artificial Intelligence (Ital IA 2023), Pisa, Italy,
May 29–30, 2023, volume 3486 of CEUR Workshop Proceedings, pages 590–594, 2023.
3. Samaneh Zolfaghari, Sumaiya Suravee, Daniele Riboni, and Kristina Yordanova. Sensor-based
locomotion data mining for supporting the diagnosis of neurodegenerative disorders: A survey.
ACM Comput. Surv., 56(1):10:1–10:36, 2024.
4. Shuhui Fan, Shaojing Fu, Yuchuan Luo, Haoran Xu, Xuyun Zhang, and Ming Xu. Smart
contract scams detection with topological data analysis on account interaction. In Proceedings
of the 31st ACM International Conference on Information & Knowledge Management, Atlanta,
GA, USA, October 17–21, 2022, pages 468–477, 2022.
5. Abhishek Hazra, Mainak Adhikari, Tarachand Amgoth, and Satish Narayana Srirama. A
comprehensive survey on interoperability for IIoT: Taxonomy, standards, and future directions.
ACM Comput. Surv., 55(2):9:1–9:35, 2023.
6. Tarik Taleb, Konstantinos Samdanis, Badr Mada, Hannu Flinck, Sunny Dutta, and Dario
Sabella. On multi-access edge computing: A survey of the emerging 5g network edge cloud
architecture and orchestration. IEEE Commun. Surv. Tutorials, 19(3):1657–1681, 2017.
7. Yushan Siriwardhana, Pawani Porambage, Madhusanka Liyanage, and Mika Ylianttila. A sur-
vey on mobile augmented reality with 5g mobile edge computing: Architectures, applications,
and technical aspects. IEEE Commun. Surv. Tutorials, 23(2):1160–1192 , 2021.
8. Chi-Wei Lien and Sudip Vhaduri. Challenges and opportunities of biometric user authentica-
tion in the age of IoT: A survey. ACM Comput. Surv., 56(1):14:1–14:37, 2024.
9. François De Keersmaeker, Yinan Cao, Gorby Kabasele Ndonda, and Ramin Sadre. A survey of
public IoT datasets for network security research. IEEE Commun. Surv. Tutorials, 25(3):1808–
1840, 2023.
10. Rodrigo Marotti Togneri, Ronaldo C. Prati, Hitoshi Nagano, and Carlos Kamienski. Data-
driven water need estimation for IoT-based smart irrigation: A survey. Expert Syst. Appl.,
225:120194, 2023.
11. A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, and M. Ayyash. Internet of things: A
survey on enabling technologies, protocols, and applications. IEEE Communications Surveys
& Tutorials, 17(4):2347–2376, 2015.
12. Emiliano Sisinni, Abusayeed Saifullah, Song Han, Ulf Jennehag, and Mikael Gidlund.
Industrial internet of things: Challenges, opportunities, and directions. IEEE Transactions on
Industrial Informatics, 14(11):4724–4734, 2018.
13. T. Qiu, B. Li, X. Zhou, H. Song, I. Lee, and J. Lloret. A novel shortcut addition algorithm with
particle swarm for multi-sink internet of things. IEEE Transactions on Industrial Informatics,
pages 1–12, 2019.
14. Prasanna Kumar Illa and Nikhil Padhi. Practical guide to smart factory transition using IoT,
big data and edge analytics. IEEE Access, 6:55162–55170, 2018.
15. A. Thakur and R. Malekian. Fog computing for detecting vehicular congestion, an internet
of vehicles based approach: A review. IEEE Intelligent Transportation Systems Magazine,
11(2):8–16, 2019.
16. H. Wang, Q. Wang, Y. Li, G. Chen, and Y. Tang. Application of fog architecture based on
multi-agent mechanism in CPPS. In 2018 2nd IEEE Conference on Energy Internet and Energy
System Integration (EI2), pages 1–6, 2018.
17. N. Yoshikane et al. First demonstration of geographically unconstrained control of an industrial
robot by jointly employing SDN-based optical transport networks and edge compute. In
2016 21st OptoElectronics and Communications Conference (OECC) held jointly with 2016
International Conference on Photonics in Switching (PS), pages 1–3, 2016.
18. I. A. Tsokalo, H. Wu, G. T. Nguyen, H. Salah, and F. H. P. Fitzek. Mobile edge cloud for robot
control services in industry automation. In 2019 16th IEEE Annual Consumer Communications
& Networking Conference (CCNC), pages 1–2, 2019.
19. T. M. Jose. A novel sensor based approach to predictive maintenance of machines by leveraging
heterogeneous computing. In 2018 IEEE SENSORS, pages 1–4, 2018.
20. L. Li, K. Ota, and M. Dong. Deep learning for smart industry: Efficient manufacture inspection
system with fog computing. IEEE Transactions on Industrial Informatics, 14(10):4665–4673,
2018.
21. W. Shi, J. Cao, Q. Zhang, Y. Li, and L. Xu. Edge computing: Vision and challenges. IEEE
Internet of Things Journal, 3(5):637–646, 2016.
22. Rosario Giuseppe Garroppo and Maria Grazia Scutellà. Design model of an IEEE 802.11ad
infrastructure for TSN-based industrial applications. Comput. Networks, 230:109771, 2023.
23. Abhishek Hazra, Praveen Kumar Donta, Tarachand Amgoth, and Schahram Dustdar. Cooper-
ative transmission scheduling and computation offloading with collaboration of fog and cloud
for industrial IoT applications. IEEE Internet of Things Journal, 10(5):3944–3953, 2023.
24. C. Mouradian, D. Naboulsi, S. Yangui, R. H. Glitho, M. J. Morrow, and P. A. Polakos.
A comprehensive survey on fog computing: State-of-the-art and research challenges. IEEE
Communications Surveys & Tutorials, 20(1):416–464, 2018.
25. M. Mukherjee, L. Shu, and D. Wang. Survey of fog computing: Fundamental, network applica-
tions, and research challenges. IEEE Communications Surveys & Tutorials, 20(3):1826–1857,
2018.
26. H. Xu, W. Yu, D. Griffith, and N. Golmie. A survey on industrial internet of things: A cyber-
physical systems perspective. IEEE Access, 6:78238–78259, 2018.
27. Jesus Martin Talavera et al. Review of IoT applications in agro-industrial and environ-
mental fields. Computers and Electronics in Agriculture, 142:283–297, 2017.
28. Christian Weber, Jan Koenigsberger, Laura Kassner, and Bernhard Mitschang. M2DDM: A
maturity model for data-driven manufacturing. Manufacturing Systems 4.0, 63:173–178, 2017.
29. M. Aazam, S. Zeadally, and K. A. Harras. Deploying fog computing in industrial internet
of things and industry 4.0. IEEE Transactions on Industrial Informatics, 14(10):4674–4682,
2018.
30. Ines Sitton-Candanedo, Ricardo S. Alonso, Sara Rodriguez-Gonzalez, Jose Alberto Gar-
cia Coria, and Fernando De La Prieta. Edge computing architectures in industry 4.0: A
general survey and comparison. In 14th International Conference on Soft Computing Models
in Industrial and Environmental Applications, volume 950 of Advances in Intelligent Systems
and Computing, pages 121–131, 2020.
Chapter 2
Preliminaries
It is worth noting that a naughty "baby" moves around the map, highlighting the necessity of real-time monitoring. Accordingly, these devices integrate various communication modules (Wi-Fi, Bluetooth, RFID, NB-IoT, LoRa, 5G, etc.) to seek help from more capable units. Just as a baby sends a message by crying, messages encoded with different communication protocols can be captured. The louder the crying, i.e., the stronger the signal strength, the more rapid the response, i.e., the lower the communication latency.
Base Station (BS)/Edge Server The BS equipped with the edge server takes on
the responsibility of being an “adult.” It is usually densely distributed close to end
devices and responsible for multiple devices through wireless or wired links. The
abundant computing and storage resources of the edge server support handling problems that the devices cannot solve on their own [3].
Using the computing resources in the edge server to handle devices' requests is referred to as offloading [4]. Offloading focuses on making optimal decisions from a higher perspective to coordinate the overall process, similar to mediating conflicts between babies. Benefiting from AI, the algorithm at the edge behaves more like a logical adult. It handles each device's request with a priority determined by the "sound of crying," i.e., the features of the request such as latency awareness and computation intensity. Everyone wants the crying to stop quickly; in an industrial edge computing system, this corresponds to the objective of minimizing latency, which will be introduced in Sect. 2.1.1.
Storage resources in industrial edge computing systems are utilized to deploy
applications and store data collected from end devices. These applications, like
scheduling customized maintenance plans, form the foundation for the edge to offer
services to devices. This can be likened to an adult’s life skills, such as knowing
which medicines are needed for a baby with a fever. Here, the collected data assists
in diagnosis, similar to how an adult uses their knowledge and experience to care
for a baby.
Cloud The cloud has more abundant resources than the edge, but incurs longer response latency. It also hosts a broader range of applications (skills), making it akin to a baby store that covers what the adults lack. Services that the edge cannot provide are scheduled to the cloud [5]. To achieve this function, the objective of the cloud is to enrich its commodities, i.e., to provide AI models with high inference speed and accuracy. This will be detailed in Sect. 2.1.3.
In summary, with edge computing, the task can be processed without extra
jitter during transmission in the core network. By facilitating real-time decisions,
industrial edge computing improves the accuracy of control with minimum cost.
However, to fully enjoy the benefits of industrial edge computing systems, the
following metrics must be taken into account when designing specific solutions for
different applications.
In the existing research on industrial edge computing, some schemes are designed
to minimize the latency [6, 7]. Generally, latency comprises computing latency,
transmission latency, and some extra latency, e.g., queuing latency. Nevertheless, latency must be calculated differently in different cases, depending on the offloading scheme. In this section, we introduce a general latency estimation model.
For a task, we denote its computation requirement by $O$, and the size of the request when offloaded to the edge or cloud by $K$. Let $l_d$, $l_e$, $l_c$ represent the computation capacity of the device, edge, and cloud, respectively, and let $R_d$ and $R_c$ denote the transmission rate between device and edge and between edge and cloud. There are two offloading modes in industrial edge computing: full offloading and partial offloading. These modes determine whether the computational requirement is divided for parallel processing across different locations, i.e., the device, edge, and cloud. Two variables $\lambda, \beta \in [0, 1]$, representing offloading decisions, denote the proportion of the computational requirement allocated to the edge and the cloud, respectively. Their sum must not exceed 1, i.e., $\lambda + \beta \leq 1$. Full offloading is a special case of partial offloading in which either $\lambda = 1$ or $\beta = 1$.
When all the offloaded computation requirements are finished and the results are returned to the device, the task is considered completed. Thus, we calculate the
latency for three parts of the computation requirements. For the local computation
requirement, its computing latency is
\[ L_c^D = \frac{(1-\lambda-\beta)O}{l_d}, \qquad (2.1) \]
while the transmission latency $L_t^D$ is zero. When offloading the task to the edge, the request is transmitted to the edge, and the result, whose size is $w\lambda O$ (with $w$ denoting the ratio of the result size to the offloaded computation), is sent back to the device. In this case, the computing latency is
\[ L_c^E = \frac{\lambda O}{l_e}, \qquad (2.2) \]
and the transmission latency is
\[ L_t^E = \frac{K}{R_d} + \frac{w\lambda O}{R_d}. \qquad (2.3) \]
Similarly, for the cloud, the request is first transmitted to the cloud via the edge
and then back to the device. Its computing latency is
\[ L_c^C = \frac{\beta O}{l_c}, \qquad (2.4) \]
and the transmission latency is
\[ L_t^C = \frac{K}{R_d} + \frac{K}{R_c} + \frac{w\beta O}{R_d} + \frac{w\beta O}{R_c}. \qquad (2.5) \]
The task completes only when the slowest of the three parts finishes, so the total latency is
\[ L = \max\left\{ L_c^D + \epsilon^D,\; L_c^E + L_t^E + \epsilon^E,\; L_c^C + L_t^C + \epsilon^C \right\}, \qquad (2.6) \]
where $\epsilon^D$, $\epsilon^E$, $\epsilon^C$ are the extra latency at the device, edge, and cloud, respectively. For example, if the service used to process this task is not cached at the edge, an extra caching latency is incurred. This is discussed further in the detailed solutions for different applications.
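To make the model concrete, the following is a minimal Python sketch of Eqs. (2.1)–(2.6) together with a brute-force grid search over the offloading proportions $(\lambda, \beta)$; all parameter values are illustrative assumptions, and branches with a zero share are skipped as a simplification (Eq. (2.3) as stated would charge $K/R_d$ even when $\lambda = 0$).

```python
import itertools

def total_latency(O, K, w, l_d, l_e, l_c, R_d, R_c, lam, beta,
                  eps_d=0.0, eps_e=0.0, eps_c=0.0):
    """Total task latency under partial offloading, after Eqs. (2.1)-(2.6).

    O: computation requirement; K: offloaded request size; w: ratio of
    result size to offloaded computation; l_*: computation capacities;
    R_*: transmission rates; lam/beta: shares for edge/cloud (lam+beta<=1).
    """
    assert 0.0 <= lam and 0.0 <= beta and lam + beta <= 1.0
    branches = [(1 - lam - beta) * O / l_d + eps_d]           # Eq. (2.1)
    if lam > 0:                                                # Eqs. (2.2)-(2.3)
        branches.append(lam * O / l_e + (K + w * lam * O) / R_d + eps_e)
    if beta > 0:                                               # Eqs. (2.4)-(2.5)
        branches.append(beta * O / l_c
                        + (K + w * beta * O) * (1 / R_d + 1 / R_c) + eps_c)
    return max(branches)                                       # Eq. (2.6)

def best_split(O, K, w, caps, rates, step=0.05):
    """Brute-force grid search for the latency-minimizing (lambda, beta)."""
    grid = [round(i * step, 10) for i in range(int(round(1 / step)) + 1)]
    pairs = [(la, be) for la, be in itertools.product(grid, grid)
             if la + be <= 1.0]
    return min(pairs, key=lambda p: total_latency(O, K, w, *caps, *rates, *p))

# Illustrative numbers only: a 1e9-cycle task, 1 MB request, 10% result ratio.
lam, beta = best_split(O=1e9, K=8e6, w=0.1,
                       caps=(1e8, 1e9, 1e10), rates=(2e7, 1e8))
print(f"best split: lambda={lam:.2f}, beta={beta:.2f}")
```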
To minimize the latency, we can further introduce more devices to execute tasks
in a parallel manner to reduce the maximum of each item in Eq. (2.6). It involves
the problem of where to offload and how to partition the task. Meanwhile, when
calculating the total latency for multiple tasks, the time-varying arrival densities should be taken into account. Here, we outline three directions that may inspire readers to design latency minimization schemes:
• Device-to-Device (D2D) Offloading: From the perspective of computing, the
main idea of offloading is to utilize the computation resources of edge and cloud.
Here, offloading the task to idle devices via D2D communication is an often-overlooked option. In this case, whether an idle device is willing to use its resources to help others, and how to incentivize such sharing, are open questions.
• Offloading Partition: As mentioned before, partial offloading cannot always be applied in practice, as a task often cannot be partitioned into arbitrary parallel computation requirements. In fact, partitioning is largely determined by the task segments, i.e., sequential sub-tasks with fixed computation requirements. How to offload these sequential sub-tasks to minimize latency must be addressed (see the chain-partition sketch after this list).
• Time-Varying Arrival Density: Normally, to simulate task arrivals over time, existing work divides the period into several uniform time slots. In practice, the tasks arriving in each time slot show different densities, incurring extra execution latency. To address this, a global buffer can be built, in which tasks are treated as sparse arrivals until the buffer is full, after which an online matching stage schedules the tasks.
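To make the Offloading Partition direction concrete, here is a minimal sketch that brute-forces the single split point of a sequential sub-task chain between device and edge; the chain costs, cut data sizes, and one-split structure are simplifying assumptions rather than a published algorithm.

```python
def best_chain_split(ops, data, l_d, l_e, R_d):
    """Pick the split point k for a chain of sequential sub-tasks: run
    sub-tasks [0, k) on the device, upload the intermediate data data[k],
    and run [k, n) on the edge. Returns (k, latency). k = 0 offloads
    everything; k = n keeps everything local.
    """
    n = len(ops)
    best = None
    for k in range(n + 1):
        local = sum(ops[:k]) / l_d          # device computing time
        upload = data[k] / R_d              # data crossing cut k
        remote = sum(ops[k:]) / l_e         # edge computing time
        latency = local + upload + remote
        if best is None or latency < best[1]:
            best = (k, latency)
    return best

# Illustrative chain: 4 sub-tasks; data[i] is the bytes crossing cut i.
ops = [2e8, 5e8, 8e8, 1e8]
data = [8e6, 4e6, 1e6, 2e5, 1e5]
print(best_chain_split(ops, data, l_d=1e8, l_e=1e9, R_d=2e7))
```

Because intermediate data sizes often shrink deeper in the chain, the best cut tends to sit where the remaining computation is still large but the data crossing the cut has become small.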
Increasing the CPU frequency and transmission power can decrease computing and transmission latency,
but this inevitably leads to higher energy consumption. Therefore, the trade-off
between energy and latency is an important factor to consider in the design and
optimization of industrial edge computing systems.
There is also a difference between energy and latency: energy consumption is shared by all the participating units, i.e., the device, edge, and cloud [8, 9]. These units have different concerns about energy consumption, so estimating a single total energy consumption for a task is not meaningful. Therefore, energy consumption is typically evaluated from the perspective of the device and the edge, respectively. The energy consumption of the cloud is not taken into account, as it has a sufficient power supply. Next, we extend the earlier model and introduce a general estimation model for the energy consumption of the device and edge.
Energy Consumption of Device Most simple IoT devices, such as soil monitoring sensors and patrol robots, rely on a battery to power their own functions. Thus, a reasonably long lifetime must be ensured by very low power consumption. The computing energy consumption and the transmission energy consumption of the device can be obtained by
\[ E_c^D = \kappa l_d^3 L_c^D, \qquad (2.7) \]
and
\[ E_t^D = P^D \left( L_t^E + \frac{K}{R_d} + \frac{w\beta O}{R_d} \right), \qquad (2.8) \]
respectively. Here, $P^D$ is the transmission power of the device during sending and
receiving.
Energy Consumption of Edge In addition to device energy consumption, energy
usage at the edge is becoming increasingly crucial, especially with the advent of
5G technology. It has been reported by MTN Consulting and Huawei [10, 11] that a 5G BS consumes at least double the energy of a 4G BS. It is projected that
power consumption will account for 18% of operational expenses in Europe and
32% in India [12]. Furthermore, the rapid advancement in hardware is leading to a
trend where edge devices are becoming smaller, such as Unmanned Aerial Vehicles
(UAVs), making energy efficiency at the edge more critical than ever. The computing
energy consumption and the transmission energy consumption of the edge can be
obtained by
\[ E_c^E = \kappa l_e^3 L_c^E, \qquad (2.9) \]
and
\[ E_t^E = P^E \left( L_t^E + L_t^C \right), \qquad (2.10) \]
respectively. Here, $P^E$ is the transmission power of the edge during sending and receiving. Using energy consumption as a supplementary indicator enables an efficient trade-off between minimizing latency and minimizing energy consumption. It helps devices achieve a longer runtime while satisfying the latency requirements of different tasks.
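Complementing the latency sketch above, here is a minimal Python rendering of Eqs. (2.7)–(2.10); the coefficient kappa and all numeric values are placeholder assumptions for illustration only.

```python
def device_energy(O, K, w, l_d, R_d, lam, beta, kappa, P_d):
    """Device-side energy, Eqs. (2.7)-(2.8)."""
    L_c_local = (1 - lam - beta) * O / l_d        # local computing time, Eq. (2.1)
    E_c = kappa * l_d**3 * L_c_local              # computing energy, Eq. (2.7)
    L_t_edge = (K + w * lam * O) / R_d            # device radio time for the edge part
    # Eq. (2.8): edge part plus the cloud part relayed through the edge.
    E_t = P_d * (L_t_edge + (K + w * beta * O) / R_d)
    return E_c + E_t

def edge_energy(O, K, w, l_e, R_d, R_c, lam, beta, kappa, P_e):
    """Edge-side energy, Eqs. (2.9)-(2.10)."""
    L_c_edge = lam * O / l_e                      # edge computing time, Eq. (2.2)
    E_c = kappa * l_e**3 * L_c_edge               # Eq. (2.9)
    L_t_edge = (K + w * lam * O) / R_d            # Eq. (2.3)
    L_t_cloud = (K + w * beta * O) * (1 / R_d + 1 / R_c)  # Eq. (2.5)
    E_t = P_e * (L_t_edge + L_t_cloud)            # Eq. (2.10)
    return E_c + E_t

# Illustrative call with placeholder values.
print(device_energy(O=1e9, K=8e6, w=0.1, l_d=1e8, R_d=2e7,
                    lam=0.5, beta=0.3, kappa=1e-26, P_d=0.5))
print(edge_energy(O=1e9, K=8e6, w=0.1, l_e=1e9, R_d=2e7, R_c=1e8,
                  lam=0.5, beta=0.3, kappa=1e-26, P_e=5.0))
```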
It is important to recognize that significantly reducing computing and transmis-
sion energy consumption is generally challenging. However, there is potential for
a substantial reduction in other types of energy consumption. For example, the
energy used for periodically caching data on the edge server presents a promising
opportunity for reduction. In this context, we identify two types of extra energy
consumption that can be significantly reduced or avoided:
• Data Caching: Normally, when the data required by a device's service is not cached on the edge, the edge must download the data from the cloud. The cached data is usually large, and downloading it consumes a lot of energy. Note that due to the
limited storage of the edge, the downloaded data may replace the existing data. To
save the energy consumption in downloading data, an intelligent caching strategy
with a high hit ratio is desired.
• Service Migration: This phenomenon arises as a secondary effect of data
caching and acts as an adjunct to caching strategies meant to accommodate
user mobility. When mobile devices, like robots, move between different edge
locations, their connectivity configurations change. This necessitates updating
the cache at the new edge location, leading to increased energy consumption,
which can sometimes be excessively high. A possible solution is to migrate data
between edge servers or to reroute the service from the original edge to the
new location. However, this method requires careful evaluation of the trade-off
between the energy consumed in migration and routing to determine the most
energy-efficient solution.
transmission in the congested backhaul network. Hence, the Deep Learning (DL)
models that achieve various necessary functions for IoT devices, including computer
vision, natural language processing, machine translation, and so on, are deployed on
the edge. Note that, here, accelerating inference means reducing the value of O (i.e., transforming a heavyweight model into a lightweight one), while latency minimization focuses on reducing the latency for a given O. Combining the two to reduce the total latency is also an interesting topic, which will be introduced in Sect. 6.1.
Another concern is model accuracy, which refers to how well these models
can predict or classify data from sensors, devices, or systems within industrial
environments for detecting anomalies, optimizing processes, etc. [4]. Industrial
edge computing systems identify abnormal patterns and defects or deviations from
quality standards to prevent failures or security breaches. Accurate models ensure
that maintenance activities can be scheduled effectively, minimizing downtime and
maximizing operational efficiency, reducing waste, and improving product quality.
Here, we also outline three directions that may be useful for accelerating inference while improving model precision:
• Edge Cooperation: The inference model is allowed to be divided into multiple
parts, one executing locally, and the remaining parts offloading to multiple
cooperative edge servers. In this way, there is no doubt that the inference will
be accelerated. However, the differences between models, e.g., computational
workload, input data volume, and output data volume of each layer, limit the
spread of a cooperation algorithm as it should be designed for a specific inference
model.
• Knowledge Distillation: Fluctuations in wireless bandwidth can lead to pro-
longed communication latency for both methods, particularly when dealing
with a significant volume of raw video data or intermediate features that need
to be transmitted. A recent development in this context is the adoption of
teacher–student learning, which has shown promise as a framework for real-time
video inference on resource-constrained mobile devices within multi-access edge
computing (MEC) networks [14]. In this approach, robust teacher models are
stationed on edge servers, while lightweight student models, distilled from these
teacher models, are deployed on mobile devices. This setup aims to expedite the inference process, improving inference speed at a tolerable accuracy cost (a minimal distillation-loss sketch follows this list).
• Device Cooperation: After deploying a lightweight model in devices, there is
still a challenge that should be addressed. When inference relies on data from a single device, accuracy is limited by the short sensing range and blind zones of its sensors. Cooperative sensing is an efficient way for devices to infer beyond their local sensing capabilities by exchanging sensing information with neighboring units, thus improving inference accuracy.
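To illustrate the teacher–student idea, below is a minimal sketch of a standard soft-target distillation loss in PyTorch; the temperature, loss weighting, and tensor shapes are illustrative assumptions and not the specific scheme of [14].

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.7):
    """Blend soft-target KL against the teacher with hard-label cross-entropy."""
    # Softened distributions; the T^2 factor restores gradient magnitude.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Illustrative use: the edge server's teacher scores a batch, and the
# lightweight on-device student trains against those soft targets.
teacher_logits = torch.randn(8, 10)               # e.g., cached from the edge teacher
student_logits = torch.randn(8, 10, requires_grad=True)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```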
Existing studies in the offloading field can be categorized from the perspective of
decision-making manner (i.e., centralized or decentralized).
Decentralized Offloading Some studies focus on making offloading decisions independently at each edge based on local information, in a decentralized manner that requires no global information.
Josilo et al. [15] addressed the issue of task offloading among devices to
neighbors or cloud services. They devised a game-theoretical model to tackle the
problem of minimizing completion time. Yu et al. [16] proposed a hybrid task
offloading method based on multi-cast for industrial edge computing, catering to
numerous mobile edge devices. The framework exploits network-assisted collabo-
ration to facilitate wireless distributed computing sharing. Within this approach, the
authors employ the Monte Carlo Tree Search (MCTS) algorithm to optimize the
task assignment problem. Mohammed et al. [17] partitioned a Deep Neural Network
(DNN) into multiple segments, enabling processing either locally on end devices
or offloading to potent nodes, such as those found in fog computing. Wang et
al. [18] crafted a multiuser decentralized epoch-based framework. It facilitates
decentralized user-initiated offloading without global system information. Qian et
al. [19] paid their attention to the diverse channel realizations in dynamic industrial
edge computing systems and proposed a time-varying online algorithm based on
Deep Reinforcement Learning (DRL).
The common feature of these works is that each offloading decision is made to maximize an individual reward, such as minimum latency, based on local observation. In this case, individual decisions often conflict with maximizing the global reward. These methodologies demonstrate commendable scalability and adaptability across complicated environments. They ease the deployment of new devices and improve robustness, avoiding a complete breakdown upon a single point of failure. However, industrial edge computing systems require globally superior decision-making, which local optimum decisions cannot satisfy.
Once the computation capacity of the devices and edge servers cannot sustain the task processing, it is necessary to split the task into multiple sub-tasks and execute them in a hybrid manner. As studies of single offloading modes mature, researchers have begun to consider hybrid offloading. In [23], the authors integrated D2D communications with MEC to further improve the computation capacity of cellular networks. However, current hybrid offloading generally considers the cooperative utilization of MEC and D2D resources with D2D communication limited to a single device, so resource utilization is low.
A potential research direction is that a mobile device can act as a relay and
help other devices communicate with MEC servers. The relay ensures the high data
rates required by D2D communication in Next-Generation Networks (NGNs) [24].
D2D-assisted task offloading allows mobile devices to offload tasks not only to edge
servers but also to neighboring nodes through D2D links [23].
The multiple-hop transmissions between multiple users through D2D links,
which are referred to as cooperative relays, produce extra latency [25]. The
first framework supporting cooperative relay is proposed in [26], which also jointly
combines the D2D links with cellular links for communication. Meanwhile, in [27],
the authors allow a mobile device to offload tasks through relay devices. They put
forth various algorithms aimed at minimizing both latency and energy consumption
in this context.
The task offloading process at edge servers and end devices is a critical issue due to the diverse task dependencies of each application, modeled as external dependencies (EDs); addressing it is essential to improving the application utility [35].
Cooperation among applications proves advantageous in enhancing overall
application performance, manifesting in improvements such as heightened detection
accuracy in autonomous driving [36]. In scenarios involving DAG-based offloading
with cooperation, the cooperation entails the operator’s decision to identify task
pairs from distinct users that should share their intermediate data. This determina-
tion takes into account factors such as the size of shared data and the dynamically
changing network conditions. Yan et al. [37] formulated offloading as an application
utility maximization problem. It used Gibbs sampling to find the best cooperation
between devices, whose DAGs are connected with each other via fixed external
dependencies (EDs). Liu et al. [38] used a Quantized Soft Actor–Critic (QSAC)
algorithm to find the optimal EDs under dynamic network conditions. Meanwhile,
to explore more actions, the output of the actor module is further quantized in an
order-preserving way.
In industrial edge computing systems, data (services) is deployed on the edge servers
rather than the remote cloud. The existing works mainly focus on data caching
and service migration. The efficient management of the services decides when and
where to cache or migrate services to optimize the system performance, e.g., latency
and energy consumption [39].
Similar to virtual machine (VM) caching, data (service) caching in industrial edge
computing is designed to optimize the hit rate. This involves caching data in edge
servers with limited resources while taking into account user request statistics [40].
Numerous ongoing efforts are dedicated to addressing data caching challenges,
which mainly focus on data popularity and cooperative caching.
Data Popularity The popularity of data within the coverage area of edge servers
usually adheres to a specific distribution, e.g., Zipf distribution [41]. It is essential
to note that, within industrial edge computing systems, data popularity serves as an
indicator of the frequency with which data is requested.
Wang et al. [42] indicated that the data requested at different edge nodes is highly variable and each edge has its own specific features. Since the performance of data caching
strategies heavily depends on the accuracy of popularity prediction [43], there are
lots of research works that adopt Machine Learning (ML) technologies to predict
popularity for adapting to the time-varying popularity of user’s requests [44, 45].
In [46], the authors predicted data popularity based on historical device requests
through the Long Short-Term Memory (LSTM) algorithm. However, the frequent
mobility of devices often leads to dynamic changes in data popularity, posing
challenges for accurate predictions.
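As a toy illustration of popularity-driven caching, the sketch below draws requests from a Zipf-like distribution [41] and measures the hit rate of caching the top-C most popular items; the catalog size, exponent, and cache size are arbitrary assumptions.

```python
import random

def zipf_weights(n_items, s=0.8):
    """Unnormalized Zipf popularity: the item ranked k has weight 1 / k^s."""
    return [1.0 / (k ** s) for k in range(1, n_items + 1)]

def simulate_hit_rate(n_items=1000, cache_size=50, n_requests=100_000, s=0.8):
    weights = zipf_weights(n_items, s)
    cached = set(range(cache_size))       # cache the top-C ranked items
    requests = random.choices(range(n_items), weights=weights, k=n_requests)
    hits = sum(1 for r in requests if r in cached)
    return hits / n_requests

# With s = 0.8, caching just the top 5% of a 1000-item catalog already
# serves a large fraction of requests, which is why accurate popularity
# prediction pays off so strongly in edge caching.
print(f"hit rate: {simulate_hit_rate():.3f}")
```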
With the predicted popularity of the data, the authors in [47] adopted a statistical
model to make bitrate selection and edge caching decisions for video streaming
transmission in industrial edge computing. These Quality of Experience (QoE)-driven strategies neglect the impact of video quality and rebuffering on different categories of 360-degree videos, which greatly reduces the QoE. However, research
has indicated that diverse video categories place different emphasis on factors
like video quality and rebuffer [48]. There exists considerable potential for further
enhancement in users’ average QoE.
Fig. 2.1 An illustration of a caching system including two small BSs (SBSs) equipped with edge servers and four users: (a) Since the blue data is the most popular, the two SBSs cache and recommend it to users, while users 2 and 4 expect other data; (b) With cooperation, the two SBSs cache red and blue data, respectively, in which case the request from user 4 for yellow data cannot be served; (c) With cooperation and recommendation, the request of user 4 is changed to the red one, and thus all requests can be served when the two SBSs cache red and blue data, respectively. Note that the user request patterns differ from the corresponding user preference patterns under the impact of the recommendation system
of the existing data caching methods, i.e., user mobility, leading to a reduction
in the service hit rate as users move among the coverage area of different edge
servers. Although this can be alleviated by frequently updating the data caching
policy, significant edge–cloud transmission overheads would be incurred causing
deterioration in the latency.
To deal with the problem caused by mobility, service migration has been proposed to adaptively migrate services among BSs [62].
Many solutions based on Software-Defined Networking (SDN) have been developed to improve mobility management performance in 5G dense networks
according to the radio signal strength. Oliva et al. [63] presented a resilient
SDN framework by building a virtual registration area to avoid updating users’
locations. In this framework, the local mobility anchor establishes a bidirectional
tunnel between mobility access gateways located in edge servers and thus achieves
seamless handoff. Tartarini et al. [64] proposed a quality of service handoff rerouting
component, which proactively generates a route path before users reach their new
locations based on trace prediction. However, due to the substantial data size
involved in transmission (e.g., application and history data), surpassing session-
based and resilient signal messages, this method introduces additional latency.
Moreover, the services may not be migrated to the other BS after the connected
BS of user changes, which further complicates the service migration problem.
Therefore, it is crucial for service migration strategies to incorporate consid-
erations of migration energy consumption and latency. In a one-dimensional (1-D) mobility scenario, i.e., where users can only move forward or backward, Ksentini et al.
[65] proposed a method that predicts user trajectories using the Markov Decision
Process (MDP) model, guiding the process of service migration. Extending the
scope to a more realistic two-dimensional (2-D) mobility scenario, Wang et al.
[66] based their migration decisions on trajectory predictions derived from a 2-
D MDP model with a significantly larger state space than the 1-D model. Sun
et al. [67] devised migration decisions through a user-centric scheme based on
MAB theories to meet long-term energy budgets. These strategies, independently
crafted by different users, overlooked interference and service sharing among users, limiting latency reduction under constrained resources. Furthermore, sustaining trajectory prediction accuracy proved challenging as the number of users increased significantly.
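To ground the MDP view of migration, the following is a minimal value-iteration sketch for a 1-D mobility scenario in the spirit of (but much simpler than) the cited schemes; the hop-distance state space, cost weights, and drift probability are invented for illustration.

```python
def value_iteration(n_states=6, p_move=0.6, latency_w=1.0,
                    mig_cost=5.0, gamma=0.9, tol=1e-6):
    """Minimal migrate-or-stay MDP over service-to-user distance (in hops).

    State s: hop distance between the user and the BS hosting its service.
    'stay' pays latency_w * s per step; 'migrate' pays mig_cost and resets
    s to 0. The user drifts one hop away with probability p_move each step.
    """
    V = [0.0] * n_states
    while True:
        V_new, policy = [], []
        for s in range(n_states):
            drift = min(s + 1, n_states - 1)       # user moves away one hop
            stay = latency_w * s + gamma * (
                p_move * V[drift] + (1 - p_move) * V[s])
            migrate = mig_cost + gamma * (
                p_move * V[1] + (1 - p_move) * V[0])
            V_new.append(min(stay, migrate))
            policy.append("stay" if stay <= migrate else "migrate")
        if max(abs(a - b) for a, b in zip(V, V_new)) < tol:
            return V_new, policy
        V = V_new

values, policy = value_iteration()
print(policy)   # stay while the service is close, migrate beyond a threshold
```

The threshold structure of the resulting policy mirrors the intuition above: migration is only worth its one-time cost once the accumulated per-step latency of staying exceeds it.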
emergence of Industry 4.0 has ushered in a new era, where edge computing, in
conjunction with industrial clouds, offers holistic solutions for pioneering business
models. Within these models, concepts like extensive customization and service-
based production take center stage. In this section, we list some related research for
improving the processing speed and accuracy.
The accuracy of inference for a single device is constrained by the limited sensing
range and blind spots of its sensors, which requires sharing data among devices. Depending on the type of sensing data shared between devices, cooperative perception operates at the raw, feature, or object level, as shown in Fig. 2.2.
Raw-Level The approach involves sharing and gathering raw data to create a
comprehensive view. Cooper [77] is a pioneering method, aiming to enhance the
sensing area and improve inference accuracy by facilitating the exchange of raw
data between two devices. EMP [36] leveraged infrastructure support to share
Fig. 2.2 An example of cooperation in the perception of connected and autonomous vehicles,
where the raw, feature, and object data can be shared with others, respectively
References
1. Jiong Jin, Kan Yu, Ning Zhang, and Zhibo Pang. Guest editorial: Special section on real-
time edge computing over new generation automation networks for industrial cyber-physical
systems. IEEE Trans. Ind. Informatics, 18(12):9268–9270, 2022.
2. Akanksha Dixit, Arjun Singh, Yogachandran Rahulamathavan, and Muttukrishnan Rajarajan.
FAST DATA: A fair, secure, and trusted decentralized IIoT data marketplace enabled by
blockchain. IEEE Internet Things J., 10(4):2934–2944, 2023.
3. Mingkai Chen, Lindong Zhao, Jianxin Chen, Xin Wei, and Mohsen Guizani. Modal-aware
resource allocation for cross-modal collaborative communication in IIoT. IEEE Internet Things
J., 10(17):14952–14964, 2023.
4. Wenhao Fan, Shenmeng Li, Jie Liu, Yi Su, Fan Wu, and Yuanan Liu. Joint task offloading
and resource allocation for accuracy-aware machine-learning-based IIoT applications. IEEE
Internet Things J., 10(4):3305–3321, 2023.
5. Hui Yin, Wei Zhang, Hua Deng, Zheng Qin, and Keqin Li. An attribute-based searchable
encryption scheme for cloud-assisted IIoT. IEEE Internet Things J., 10(12):11014–11023,
2023.
6. Yuhuai Peng, Alireza Jolfaei, Qiaozhi Hua, Wen-Long Shang, and Keping Yu. Real-time
transmission optimization for edge computing in industrial cyber-physical systems. IEEE
Trans. Ind. Informatics, 18(12):9292–9301, 2022.
7. Peiying Zhang, Yi Zhang, Neeraj Kumar, and Ching-Hsien Hsu. Deep reinforcement learning
algorithm for latency-oriented IIoT resource orchestration. IEEE Internet Things J., 10(8, April
15):7153–7163, 2023.
8. Guowen Wu, Zhiqi Xu, Hong Zhang, Shigen Shen, and Shui Yu. Multi-agent DRL for joint
completion delay and energy consumption with queuing theory in MEC-based IIoT. J. Parallel
Distributed Comput., 176:80–94, 2023.
9. M. S. Syam, Sheng Luo, Yue Ling Che, Kaishun Wu, and Victor C. M. Leung. Energy-efficient
intelligent reflecting surface aided wireless-powered IIoT networks. IEEE Syst. J., 17(2):2534–
2545, 2023.
10. Matt Walker. Operators facing power cost crunch. https://www.mtnconsulting.biz/product.
Accessed Nov 7, 2020.
11. D. Chen and W. Ye. 5G power: Creating a green grid that slashes costs, emissions & energy
use. https://www.huawei.com/en/publications/communicate/89/5g-power-green-grid-slashes-
costs-emissions-energy-use. Accessed Nov 7, 2020.
12. Valentin Poirot, Mårten Ericson, Mats Nordberg, and Karl Andersson. Energy efficient multi-
connectivity algorithms for ultra-dense 5G networks. IEEE Wireless Networks, 26(3):2207–
2222, Jun. 2020.
13. Mikolaj Jankowski, Deniz Gündüz, and Krystian Mikolajczyk. Joint device-edge inference
over wireless links with pruning. In 21st IEEE International Workshop on Signal Processing
Advances in Wireless Communications, SPAWC 2020, Atlanta, GA, USA, May 26–29, 2020,
pages 1–5. IEEE, 2020.
14. Lin Wang and Kuk-Jin Yoon. Knowledge distillation and student-teacher learning for visual
intelligence: A review and new outlooks. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 44(6):3048–3068, 2022.
15. Mingchuan Zhang, Yangfan Zhou, Quanbo Ge, Ruijuan Zheng, and Qingtao Wu. Decentralized
randomized block-coordinate Frank-Wolfe algorithms for submodular maximization over
networks. IEEE Trans. Syst. Man Cybern. Syst., 52(8):5081–5091, 2022.
16. Zheng Yao, Huaiyu Wu, and Yang Chen. Multi-objective cooperative computation offloading
for MEC in UAVs hybrid networks via integrated optimization framework. Comput. Commun.,
202:124–134, 2023.
17. Thaha Mohammed, Carlee Joe-Wong, Rohit Babbar, and Mario Di Francesco. Distributed
inference acceleration with adaptive DNN partitioning and offloading. In 39th IEEE Confer-
ence on Computer Communications, INFOCOM 2020, Toronto, ON, Canada, July 6–9, 2020,
pages 854–863. IEEE, 2020.
18. Xiong Wang, Jiancheng Ye, and John C. S. Lui. Decentralized task offloading in edge
computing: A multi-user multi-armed bandit approach. In IEEE INFOCOM 2022—IEEE
Conference on Computer Communications, London, United Kingdom, May 2–5, 2022, pages
1199–1208. IEEE, 2022.
19. Liping Qian, Yuan Wu, Fuli Jiang, Ningning Yu, Weidang Lu, and Bin Lin. NOMA assisted
multi-task multi-access mobile edge computing via deep reinforcement learning for industrial
internet of things. IEEE Trans. Ind. Informatics, 17(8):5688–5698, 2021.
20. Sladana Josilo and György Dán. Computation offloading scheduling for periodic tasks in
mobile edge computing. IEEE/ACM Trans. Netw., 28(2):667–680, 2020.
21. Sarhad Arisdakessian, Omar Abdel Wahab, Azzam Mourad, Hadi Otrok, and Nadjia Kara.
FoGMatch: An intelligent multi-criteria IoT-FoG scheduling approach using game theory.
IEEE/ACM Trans. Netw., 28(4):1779–1789, 2020.
22. Gongming Zhao, Hongli Xu, Yangming Zhao, Chunming Qiao, and Liusheng Huang. Offload-
ing dependent tasks in mobile edge computing with service caching. In 39th IEEE Conference
on Computer Communications, INFOCOM 2020, Toronto, ON, Canada, July 6–9, 2020, pages
1997–2006. IEEE, 2020.
23. Yinghui He, Jinke Ren, Guanding Yu, and Yunlong Cai. D2D Communications Meet Mobile
Edge Computing for Enhanced Computation Capacity in Cellular Networks. IEEE Transac-
tions on Wireless Communications, 18(3):1750–1763, 2019.
24. Molin Li, Xiaobo Zhou, Tie Qiu, Qinglin Zhao, and Keqiu Li. Multi-relay assisted computation
offloading for multi-access edge computing systems with energy harvesting. IEEE Trans. Veh.
Technol., 70(10):10941–10956, 2021.
25. Pimmy Gandotra and Rakesh Kumar Jha. Device-to-device communication in cellular net-
works: A survey. J. Netw. Comput. Appl., 71:99–117, 2016.
26. J. Nicholas Laneman, David N. C. Tse, and Gregory W. Wornell. Cooperative diversity
in wireless networks: Efficient protocols and outage behavior. IEEE Trans. Inf. Theory,
50(12):3062–3080, 2004.
27. Yang Li, Gaochao Xu, Kun Yang, Jiaqi Ge, Peng Liu, and Zhenjun Jin. Energy efficient relay
selection and resource allocation in d2d-enabled mobile edge computing. IEEE Trans. Veh.
Technol., 69(12):15800–15814, 2020.
28. Chang Shu, Zhiwei Zhao, Yunpeng Han, Geyong Min, and Hancong Duan. Multi-user
offloading for edge computing networks: A dependency-aware and latency-optimal approach.
IEEE Internet of Things Journal, 7(3):1678–1689, 2020.
29. Jeffrey D. Ullman. NP-complete scheduling problems. Journal of Computer and System
Sciences, 10(3):384–393, 1975.
30. Yujiong Liu, Shangguang Wang, Qinglin Zhao, Shiyu Du, Ao Zhou, Xiao Ma, and Fangchun
Yang. Dependency-aware task scheduling in vehicular edge computing. IEEE Internet of
Things Journal, 7(6):4961–4971, 2020.
31. Hanlong Liao, Xinyi Li, Deke Guo, Wenjie Kang, and Jiangfan Li. Dependency-aware
application assigning and scheduling in edge computing. IEEE Internet of Things Journal,
9(6):4451–4463, 2022.
32. Zhiqing Tang, Jiong Lou, Fuming Zhang, and Weijia Jia. Dependent task offloading for
multiple jobs in edge computing. In International Conference on Computer Communications
and Networks, ICCCN 2020, Honolulu, HI, USA, August 3–6, 2020, 2020.
33. Jia Yan, Suzhi Bi, and Ying Jun Angela Zhang. Offloading and resource allocation with
general task graph in mobile edge computing: A deep reinforcement learning approach. IEEE
Transactions on Wireless Communications, 19(8):5404–5419, 2020.
34. Shumei Liu, Yao Yu, Xiao Lian, Yuze Feng, Changyang She, Phee Lep Yeoh, Lei Guo,
Branka Vucetic, and Yonghui Li. Dependent task scheduling and offloading for minimizing
deadline violation ratio in mobile edge computing networks. IEEE Journal on Selected Areas
in Communications, 41(2):538–554, 2023.
35. Xuming An, Rongfei Fan, Han Hu, Ning Zhang, Saman Atapattu, and Theodoros A. Tsiftsis.
Joint task offloading and resource allocation for IoT edge computing with sequential task
dependency. IEEE Internet of Things Journal, 9(17):16546–16561, 2022.
36. Xumiao Zhang, Anlan Zhang, Jiachen Sun, Xiao Zhu, Yihua Ethan Guo, Feng Qian, and
Z. Morley Mao. EMP: edge-assisted multi-vehicle perception. In ACM MobiCom ’21: The
27th Annual International Conference on Mobile Computing and Networking, New Orleans,
Louisiana, USA, October 25–29, 2021, 2021.
37. Jia Yan, Suzhi Bi, Ying Jun Zhang, and Meixia Tao. Optimal task offloading and resource
allocation in mobile-edge computing with inter-user task dependency. IEEE Transaction on
Wireless Communication, 19(1):235–250, 2020.
38. Pengbo Liu, Shuxin Ge, Xiaobo Zhou, Chaokun Zhang, and Keqiu Li. Soft actor-critic-
based DAG tasks offloading in multi-access edge computing with inter-user cooperation. In
Algorithms and Architectures for Parallel Processing—21st International Conference, ICA3PP
2021, Virtual Event, December 3–5, 2021, Proceedings, Part III, volume 13157, pages 313–
327, 2021.
39. Steven Davy, Jeroen Famaey, Joan Serrat, Juan Luis Gorricho, Avi Miron, Manos Dramitinos,
Pedro Miguel Neves, Steven Latré, and Ezer Gochen. Challenges to support edge-as-a-service.
IEEE Communications Magazine, 52(1):132–139, Jul. 2014.
40. X. Zhang and Q. Zhu. Hierarchical caching for statistical QoS guaranteed multimedia trans-
missions over 5G edge computing mobile wireless networks. IEEE Wireless Communications,
25(3):12–20, Jun. 2018.
41. Lee Breslau, Pei Cao, Li Fan, Graham Phillips, and Scott Shenker. Web caching and Zipf-
like distributions: Evidence and implications. In Proceedings IEEE INFOCOM ’99, The
Conference on Computer Communications, Eighteenth Annual Joint Conference of the IEEE
Computer and Communications Societies, The Future Is Now, New York, NY, USA, March 21–
25, 1999, pages 126–134, 1999.
42. Fangxin Wang, Feng Wang, Jiangchuan Liu, Ryan Shea, and Lifeng Sun. Intelligent video
caching at network edge: A multi-agent deep reinforcement learning approach. In 39th IEEE
Conference on Computer Communications, INFOCOM 2020, Toronto, ON, Canada, July 6–9,
2020, pages 2499–2508. IEEE, 2020.
43. Liang Li, Dian Shi, Ronghui Hou, Rui Chen, Bin Lin, and Miao Pan. Energy-efficient proactive
caching for adaptive video streaming via data-driven optimization. IEEE Internet Things J.,
7(6):5549–5561, 2020.
34 2 Preliminaries
44. Hao Zhu, Yang Cao, Xiao Wei, Wei Wang, Tao Jiang, and Shi Jin. Caching transient data
for internet of things: A deep reinforcement learning approach. IEEE Internet Things J.,
6(2):2074–2083, 2019.
45. Jingjing Yao and Nirwan Ansari. Caching in dynamic IoT networks by deep reinforcement
learning. IEEE Internet Things J., 8(5):3268–3275, 2021.
46. Ruyan Wang, Zunwei Kan, Yaping Cui, Dapeng Wu, and Yan Zhen. Cooperative caching
strategy with content request prediction in internet of vehicles. IEEE Internet Things J.,
8(11):8964–8975, 2021.
47. Georgios Papaioannou and Lordanis Koutsopolulos. Tile-based caching optimization for 360◦
videos. In Proceedings of the Twentieth ACM International Symposium on Mobile Ad Hoc
Networking and Computing, 2019.
48. Ivan Sliver, Mirko Suznjevic, and Skorin Kapov Lea. Game categorization for deriving
QoE-driven video encoding configuration strategies for cloud gaming. ACM Transactions on
Multimedia Computing, Communications, and Applications, 2017.
49. Wei Jiang, Gang Feng, Shuang Qin, and Ying-Chang Liang. Learning-based cooperative
content caching policy for mobile edge computing. In ICC 2019–2019 IEEE International
Conference on Communications (ICC), pages 1–6. IEEE, 2019.
50. Wei Jiang, Gang Feng, Shuang Qin, Tak Shing Peter Yum, and Guohong Cao. Multi-
agent reinforcement learning for efficient content caching in mobile d2d networks. IEEE
Transactions on Wireless Communications, 18(3):1610–1622, 2019.
51. Xianzhe Xu and Meixia Tao. Decentralized multi-agent multi-armed bandit learning with
calibration for multi-cell caching. IEEE Transactions on Communications, 2020.
52. K. Poularakis, J. Llorca, A. M. Tulino, I. Taylor, and L. Tassiulas. Joint service placement
and request routing in multi-cell mobile edge computing networks. In IEEE Conference on
Computer Communications, INFOCOM, pages 10–18, Paris, France, Apr. 2019.
53. Jie Xu, Lixing Chen, and Pan Zhou. Joint service caching and task offloading for mobile edge
computing in dense networks. In IEEE Conference on Computer Communications, INFOCOM,
pages 207–215, Honolulu, HI, USA, Apr. 2018.
54. Lingjun Pu, Jiao Lei, Chen Xu, Wang Lin, and Jingdong Xu. Online resource allocation,
content placement and request routing for cost-efficient edge caching in cloud radio access
networks. IEEE Journal on Selected Areas in Communications, 36(8):1751–1767, Dec. 2018.
55. Xianzhe Xu, Meixia Tao, and Cong Shen. Collaborative multi-agent multi-armed bandit
learning for small-cell caching. IEEE Transactions on Wireless Communications, 19(4):2570–
2585, 2020.
56. François Baccelli and Anastasios Giovanidis. A stochastic geometry framework for analyzing
pairwise-cooperative cellular networks. IEEE Transactions on Wireless Communications,
14(2):794–808, 2014.
57. S. Müller, O. Atan, M. van der Schaar, and A. Klein. Context-aware proactive content
caching with service differentiation in wireless networks. IEEE Transactions on Wireless
Communications, 16(2):1024–1036, 2017.
58. Stratis Ioannidis and Edmund Yeh. Adaptive caching networks with optimality guarantees.
IEEE/ACM Transactions on Networking, 26(2):737–750, 2018.
59. Pavlos Sermpezis, Theodoros Giannakas, Thrasyvoulos Spyropoulos, and Luigi Vigneri. Soft
cache hits: Improving performance through recommendation and delivery of related content.
IEEE Journal on Selected Areas in Communications, 36(6):1300–1313, 2018.
60. Livia Elena Chatzieleftheriou, Merkouris Karaliopoulos, and Iordanis Koutsopoulos. Jointly
optimizing content caching and recommendations in small cell networks. IEEE Transactions
on Mobile Computing, 18(1):125–138, 2018.
61. Kaiyang Guo and Chenyang Yang. Temporal-spatial recommendation for caching at base
stations via deep reinforcement learning. IEEE Access, 7:58519–58532, 2019.
62. T. Ouyang, Z. Zhou, and X. Chen. Follow me at the edge: Mobility-aware dynamic service
placement for mobile edge computing. IEEE Journal on Selected Areas in Communications,
36(10):2333–2345, Oct. 2018.
References 35
63. Antonio de la Oliva, Xi Li, Xavier Pérez Costa, Carlos Jesus Bernardos, Philippe Bertin,
Paola Iovanna, Thomas Deiß, Josep Mangues, Alain Mourad, Claudio Casetti, Jose Enrique
Gonzalez, and Arturo Azcorra. 5G-TRANSFORMER: Slicing and orchestrating transport
networks for industry verticals. IEEE Communication Magazine, 56(8):78–84, Aug. 2018.
64. Luca Tartarini, Marcelo Antonio Marotta, Eduardo Cerqueira, Juergen Rochol, Cristiano Bon-
ato Both, Mario Gerla, and Paolo Bellavista. Software-defined handover decision engine for
heterogeneous cloud radio access networks. Computing Communication, 115:21–34, Mar.
2018.
65. Adlen Ksentini, Tarik Taleb, and Min Chen. A Markov decision process-based service migra-
tion procedure for follow me cloud. In IEEE International Conference on Communications,
ICC, pages 1350–1354, Sydney, Australia„ Oct. 2014.
66. S. Wang, R. Urgaonkar, M. Zafer, T. He, K. Chan, and K. K. Leung. Dynamic service migration
in mobile edge computing based on Markov decision process. IEEE/ACM Transactions on
Networking, 27(3):1272–1288, Jun. 2019.
67. Y. Sun, S. Zhou, and J. Xu. EMM: Energy-aware mobility management for mobile edge
computing in ultra dense networks. IEEE Journal on Selected Areas in Communications,
35(11):2637–2646, Nov. 2017.
68. Surat Teerapittayanon, Bradley McDanel, and H. T. Kung. Distributed deep neural networks
over the cloud, the edge and end devices. In 37th IEEE International Conference on Distributed
Computing Systems, ICDCS 2017, Atlanta, GA, USA, June 5–8, 2017, pages 328–339. IEEE
Computer Society, 2017.
69. Zhuoran Zhao, Kamyar Mirzazad Barijough, and Andreas Gerstlauer. Deepthings: Distributed
adaptive deep learning inference on resource-constrained IoT edge clusters. IEEE Trans.
Comput. Aided Des. Integr. Circuits Syst., 37(11):2348–2359, 2018.
70. Sai Qian Zhang, Jieyu Lin, and Qi Zhang. Adaptive distributed convolutional neural network
inference at the network edge with ADCNN. In ICPP 2020: 49th International Conference on
Parallel Processing, Edmonton, AB, Canada, August 17–20, 2020, pages 10:1–10:11. ACM,
2020.
71. Li Zhou, Mohammad Hossein Samavatian, Anys Bacha, Saikat Majumdar, and Radu Teodor-
escu. Adaptive parallel execution of deep neural networks on heterogeneous edge devices.
In Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, SEC 2019, Arlington,
Virginia, USA, November 7–9, 2019, pages 195–208. ACM, 2019.
72. Thaha Mohammed, Carlee Joe-Wong, Rohit Babbar, and Mario Di Francesco. Distributed
inference acceleration with adaptive DNN partitioning and offloading. In 39th IEEE Confer-
ence on Computer Communications, INFOCOM 2020, Toronto, ON, Canada, July 6–9, 2020,
pages 854–863. IEEE, 2020.
73. Ran Xu, Rakesh Kumar, Pengcheng Wang, Peter Bai, Ganga Meghanath, Somali Chaterji,
Subrata Mitra, and Saurabh Bagchi. ApproxNet: Content and contention-aware video object
classification system for embedded clients. ACM Trans. Sens. Networks, 18(1):11:1–11:27,
2022.
74. Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, and Pritish Narayanan. Deep learning
with limited numerical precision. In Proceedings of the 32nd International Conference on
Machine Learning, ICML 2015, Lille, France, 6–11 July, volume 37, pages 1737–1746.
JMLR.org, 2015.
75. Daniel Kang, John Emmons, Firas Abuzaid, Peter Bailis, and Matei Zaharia. NoScope:
Optimizing deep CNN-based queries over video streams at scale. Proc. VLDB Endow.,
10(11):1586–1597, 2017.
76. Mehrdad Khani Shirkoohi, Pouya Hamadanian, Arash Nasr-Esfahany, and Mohammad
Alizadeh. Real-time video inference on edge devices via adaptive model streaming. In
IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada,
October 10–17, pages 4552–4562, 2021.
77. Qi Chen, Sihai Tang, Qing Yang, and Song Fu. Cooper: Cooperative perception for connected
autonomous vehicles based on 3d point clouds. In 39th IEEE International Conference on
Distributed Computing Systems, Dallas, TX, USA, pages 514–524, 2019.
36 2 Preliminaries
78. Qi Chen, Xu Ma, Sihai Tang, Jingda Guo, Qing Yang, and Song Fu. F-Cooper: feature based
cooperative perception for autonomous vehicle edge computing system using 3d point clouds.
In Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, Arlington, Virginia,
USA, pages 88–100, 2019.
79. Tsun-Hsuan Wang, Sivabalan Manivasagam, Ming Liang, Bin Yang, Wenyuan Zeng, and
Raquel Urtasun. V2VNet: vehicle-to-vehicle communication for joint perception and predic-
tion. In Proceedings of the 16th European Conference on Computer Vision, Glasgow, UK,
pages 605–621, 2020.
80. Moreno Ambrosin, Ignacio J. Alvarez, Cornelius Bürkle, Lily L. Yang, Fabian Oboril,
Manoj R. Sastry, and Kathiravetpillai Sivanesan. Object-level perception sharing among
connected vehicles. In IEEE Intelligent Transportation Systems Conference, Auckland, New
Zealand, pages 1566–1573, 2019.
81. Andreas Rauch, Felix Klanner, Ralph H. Rasshofer, and Klaus Dietmayer. Car2X-based
perception in a high-level fusion architecture for cooperative perception systems. In 2012 IEEE
Intelligent Vehicles Symposium, Alcal de Henares, Madrid, Spain, pages 270–275, 2012.
Chapter 3
Computation Offloading in Industrial
Edge Computing
3.1 Introduction
(Figure: an industrial edge computing scenario with edge devices ED1–ED9 and MEC servers MS1–MS3.)
We first give a comprehensive system overview and then detail the communication
among devices and computing of the tasks to formulate the offloading as an
optimization problem.
In this industrial edge computing system model, a network of $M$ MEC servers (MSs) and $N$ edge servers is considered. The edge servers, each assigned an identifier from the set $\mathcal{N} = \{1, 2, \cdots, N\}$, are responsible for generating various tasks. Each task $T_j$ from edge server $j$ is defined by its data size $d_j$, required CPU cycles $\tau_j$, and the maximum latency $t_j^{max}$ it can tolerate.
The system's architecture permits two primary execution methods for tasks:
• Local Execution: Tasks are processed directly on the edge servers. This method is generally faster because it avoids communication, but it is limited by the edge servers' computational capacity. Hence, a task is executed locally only if doing so satisfies its latency requirement.
• Offloading to MSs: When a task cannot be processed within its tolerable latency by an edge server, it is offloaded to an MS. The MSs are strategically deployed so that each edge server falls under the coverage of multiple MSs, and it may offload its task to any one of them.
For every task $T_j$ generated by edge server $j$, the offloading decision to a particular MS $i$ is represented by a binary vector $V_j$. This vector indicates whether task $T_j$ will be offloaded to MS $i$ or executed locally:

$$V_j^i = \begin{cases} 1, & \text{ED } j \text{ offloads } T_j \text{ to MS } i, \ \forall i \in \mathcal{M}, \\ 0, & \text{otherwise.} \end{cases} \qquad (3.1)$$
In the industrial edge computing system, communication between edge servers and
MSs is facilitated using Orthogonal Frequency Division Multiple Access (OFDMA)
[17]. This setup is essential for task transmission from edge servers to MSs,
integrating specific location attributes and wireless communication features.
The model considers the horizontal coordinates of MSs, denoted as $p_i$, and edge servers, denoted as $q_j$. These coordinates determine the physical distance between each edge server and MS, which is given by

$$D_{j,i} = \sqrt{\| p_i - q_j \|^2 + h_i^2}, \qquad (3.3)$$

where $h_i$ denotes the antenna height of MS $i$, which captures the three-dimensional geometry.
The transmission rate between MS $i$ and edge server $j$, when edge server $j$ transmits task $T_j$, is determined as follows:

$$R_{j,i} = B \log_2 \left( 1 + \frac{P_j g_{j,i}}{\sigma^2} \right). \qquad (3.4)$$
Here, $B$ denotes the channel bandwidth, $\sigma^2$ is the noise power, and $P_j$ is the transmission power of edge server $j$. $g_{j,i}$ is the channel power gain between MS $i$ and edge server $j$, which is calculated from the distance [18], i.e.,

$$g_{j,i} = \frac{\beta_0}{D_{j,i}^2} = \frac{\beta_0}{\| p_i - q_j \|^2 + h_i^2}. \qquad (3.5)$$
The transmission latency for uploading task $T_j$ from edge server $j$ to MS $i$ is then

$$t_{j,i}^{tr} = \frac{d_j}{R_{j,i}}. \qquad (3.7)$$
After MS $i$ fully executes task $T_j$, the results are sent back to edge server $j$. The transmission latency for returning the results can normally be neglected due to their ultra-small size [19].
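To make the communication model concrete, here is a minimal sketch that evaluates (3.3)–(3.5) and (3.7) for one edge-server/MS pair; all numeric values are illustrative placeholders, not parameters from this book.

```python
import math

def upload_latency(p_i, q_j, h_i, beta0, bandwidth, p_tx, noise_power, d_j):
    """Upload latency t^tr_{j,i} of task T_j from edge server j to MS i,
    following Eqs. (3.3)-(3.5) and (3.7)."""
    # Eq. (3.3): 3-D distance between MS i (at p_i, antenna height h_i) and edge server j (at q_j)
    dist_sq = (p_i[0] - q_j[0]) ** 2 + (p_i[1] - q_j[1]) ** 2 + h_i ** 2
    # Eq. (3.5): distance-dependent channel power gain
    gain = beta0 / dist_sq
    # Eq. (3.4): achievable transmission rate (bit/s)
    rate = bandwidth * math.log2(1.0 + p_tx * gain / noise_power)
    # Eq. (3.7): upload latency for d_j bits of task data
    return d_j / rate

# Illustrative values only (not the book's simulation settings):
t_tr = upload_latency(p_i=(0.0, 0.0), q_j=(200.0, 150.0), h_i=30.0,
                      beta0=1e-5, bandwidth=1e6, p_tx=0.1,
                      noise_power=1e-10, d_j=2e6)
print(f"upload latency: {t_tr:.3f} s")
```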
The computation model for the industrial edge computing system is based on the computation capabilities of the edge servers and involves two scenarios: local computing and remote computing on the MSs.

The local computing latency of task $T_j$ on edge server $j$, whose CPU frequency is $f_j$, is

$$t_j^{lc} = \frac{\tau_j}{f_j}. \qquad (3.8)$$

Since local execution involves no transmission, the local execution latency equals the local computing latency:

$$t_j^{le} = t_j^{lc}. \qquad (3.9)$$
If $t_j^{le}$ is less than the tolerable latency $t_j^{max}$, edge server $j$ has sufficient computation capacity to execute $T_j$ within the required time frame. When the local computation latency exceeds the tolerable latency ($t_j^{le} > t_j^{max}$), edge server $j$ lacks the necessary computing power. The task $T_j$ is then offloaded to the queue of MEC server $i$ to await resource allocation. The waiting latency $t_{j,i}^{rw}$ for task $T_j$ is the duration it stays in the queue:

$$t_{j,i}^{rw} = t_{j,i}^{g} - t_{j,i}^{l}. \qquad (3.10)$$
After determining $\tau_j$ and $F_{i,j}$, the computation latency $t_{j,i}^{rc}$ for task $T_j$ on MS $i$ can be calculated by

$$t_{j,i}^{rc} = \frac{\tau_j}{F_{i,j}}. \qquad (3.12)$$
The total execution latency $t_{j,i}^{re}$ for a remotely executed task includes the upload, computation, and waiting latency:

$$t_{j,i}^{re} = t_{j,i}^{tr} + t_{j,i}^{rc} + t_{j,i}^{rw}. \qquad (3.13)$$
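As a small worked illustration of Eqs. (3.8)–(3.13), the following sketch (variable names and numbers hypothetical) checks whether a task can meet its deadline locally and otherwise evaluates its remote execution latency.

```python
def local_latency(tau_j, f_j):
    # Eqs. (3.8)-(3.9): local execution latency equals local computing latency
    return tau_j / f_j

def remote_latency(t_tr, tau_j, F_ij, t_rw):
    # Eq. (3.12): remote computation latency; Eq. (3.13): total remote latency
    return t_tr + tau_j / F_ij + t_rw

# A task with 5e8 required CPU cycles and a 0.5 s deadline (illustrative numbers)
tau_j, t_max = 5e8, 0.5
t_le = local_latency(tau_j, f_j=2e9)               # 0.25 s
if t_le <= t_max:
    print(f"local execution, t_le = {t_le:.2f} s")
else:
    t_re = remote_latency(t_tr=0.05, tau_j=tau_j, F_ij=8e9, t_rw=0.02)
    print(f"offloaded, t_re = {t_re:.3f} s")
```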
In this MEC-enabled IIoT system, the total execution latency of tasks is calculated based on the offloading decisions $V_j^i$ and $a_j$ for each $i \in \mathcal{M}$ and $j \in \mathcal{N}$. The total latency, denoted as $t_{total}$, is formulated as follows:

$$t_{total} = \sum_{j \in \mathcal{N}} \left[ \sum_{i \in \mathcal{M}} V_j^i \cdot t_{j,i}^{re} + (1 - a_j) \cdot t_j^{le} \right]. \qquad (3.14)$$
The optimization problem, aiming to minimize this total task execution latency, referred to as $\mathbf{P}$, is outlined as

$$\begin{aligned} \mathbf{P}: \ & \min_{\{V_j^i, F_{i,j}\}} \ t_{total} \\ \text{s.t.}\ C2 &: a_j \in \{0, 1\}, \ \forall j \in \mathcal{N}, \\ C3 &: (1 - V_j^i) \cdot t_j^{le} + V_j^i \cdot t_{j,i}^{re} \le t_j^{max}, \\ C4 &: \frac{\tau_j}{t_j^{max} - t_{j,i}^{tr} - t_{j,i}^{rw}} \le F_{i,j} \le F_i, \\ C5 &: \sum_{j=1}^{N} F_{i,j} \le F_i. \end{aligned} \qquad (3.15)$$
In $\mathbf{P}$, the variables to be optimized are $V_j^i$ and $F_{i,j}$. The total execution latency of each task, whether executed locally ($t_j^{le}$) or on server $i$ ($t_{j,i}^{re}$), is considered. Generally, the randomness of devices prevents the operator from knowing the density and size of the received tasks beforehand. Hence, we designed the ATOM framework, depicted in Fig. 3.2, whose main component is the global buffer. ATOM executes offloading in two different stages, i.e., offline matching and online matching.
The global buffer has virtual attributes that change over time. When tasks are generated and necessitate assignment to the MSs, pertinent information such as data size and processing demands is first deposited into this buffer. This storage process allows the task arrival rate to be monitored, as the buffer's content, such as the number of tasks and their cumulative data size, steadily increases with each new task.
To effectively utilize this information on task arrival density, a threshold parameter, denoted as $\delta$, is introduced. This parameter plays a critical role in the ATOM framework, acting as a switch between its two operational stages: the offline matching and the online matching stages. The value of $\delta$ is set to dynamically determine the point of transition between these stages, enabling the system to adjust its strategy according to the observed arriving tasks. Such a mechanism ensures that the system can respond aptly to varying task loads, optimizing the offloading process in accordance with real-time demands and capacities.
The online matching stage is initiated when the global buffer’s predetermined
threshold has not been reached yet. In this phase, while the specifics about the
MSs are already known, information about the arriving edge servers is unveiled
incrementally as each task arrives [20, 21]. The primary goal here is to establish a
stable matching between MSs and edge servers.
Online matching offers several specific advantages. It significantly improves response speed, enabling the system to cope with highly dynamic networks and the continuous arrival of new tasks. The strategy also supports flexible decision-making, relying on local information rather than requiring comprehensive global information or extensive coordination. Additionally, it improves resource utilization by adaptively allocating tasks in response to the current availability of resources. These properties make online matching a particularly effective method for task offloading in the fast-paced and variable context of industrial edge computing.
Once the global buffer reaches its threshold, the ATOM framework shifts to the offline matching stage. At this point, comprehensive information about both the
MSs and the edge servers is available prior to making task offloading decisions.
The objective in this stage is to minimize the total latency, striving for an optimal
match between tasks and servers, a problem addressed through offline matching
theory [22]. This approach involves solving a well-defined problem by allocating
edge server tasks to the most suitable MSs.
Offline matching offers distinct advantages. First, it utilizes global information
and historical data, which often results in superior task offloading solutions. This
approach is beneficial for identifying more efficient and effective task allocations.
Second, by taking into account a range of factors and constraints and applying
accurate models, offline matching can achieve enhanced offloading outcomes. This
method is adept at handling complex scenarios, incorporating various parameters
into the decision-making process. Third, offline matching is particularly effective in
environments where conditions are relatively stable, and the influence of dynamic
changes on task offloading decisions is minimal. In essence, offline matching excels
in optimization, accuracy, and stability. It is especially relevant for scenarios that
demand a thorough understanding of global information, as well as long-term
strategic planning. This stage is critical in environments where consistency and
predictability are key, allowing for well-informed, data-driven decision-making.
The threshold $\delta$ of the global buffer is influenced by the ratio of heavy tasks to light tasks, denoted as $\varepsilon$, and the task arrival rate $\lambda_t$. When the proportion of light tasks is high ($\varepsilon$ tends to 0), $\delta$ is better set to a higher value, ideally approaching infinity. This is because light tasks are more suited for online offloading, and a higher $\delta$ ensures more of these tasks are processed in the online matching stage. Conversely, when the proportion of heavy tasks increases ($\varepsilon$ tends to 1), it is preferable to engage in offline matching more frequently. In this case, for $\varepsilon = 1$, we minimize the threshold $\delta$ to $\delta_{min}$:

$$\delta = \left\lceil (1 - a \ln \varepsilon)\, \delta_{min} \right\rceil. \qquad (3.16)$$

Here, $a$ and $\epsilon$ are the weighting factors for the proportion of heavy tasks and the departure rate, respectively. $\delta_{min}$ depends on the time-varying arrival rate $\lambda_t$. Within the simulation duration $t_{sim}$, the number of tasks that cannot be offloaded is

$$\theta_t = \int_0^{t_{sim}} (\lambda_t - \epsilon)\, dt. \qquad (3.17)$$
There are two scenarios for .λt and .ϵ: If they can be estimated by the historical
data, the exact value of the threshold can be obtained by integrating the function.
If they are too complex for such estimation, numerical integration methods are
required for an approximate threshold calculation.
To accurately determine $\delta$, which is crucial for the transition between the online and offline matching stages in the ATOM framework, we employ numerical integration to handle the complexity of the task arrival rate $\lambda_t(t)$ and departure rate $\epsilon(t)$. The function $f(t) = \lambda_t(t) - \epsilon(t)$ represents the net rate of task accumulation over time.
For the purpose of numerical integration, the total simulation time $t_{sim}$ is divided into $n$ equal intervals, each represented as $[t_k, t_{k+1}]$, where $k$ ranges from 0 to $n-1$. The step size for each interval is $h = \frac{t_{sim}}{n}$. To minimize integration error, the composite Simpson's rule is applied, a method known for its accuracy in approximating definite integrals. The integral of $f(t)$ from 0 to $t_{sim}$ is approximated as follows:

$$\int_0^{t_{sim}} f(t)\, dt \approx \sum_{k=0}^{n-1} \frac{h}{6} \left[ f(t_k) + 4 f(t_{k+\frac{1}{2}}) + f(t_{k+1}) \right] = \frac{h}{6} \left[ f(0) + 2 \sum_{k=1}^{n-1} f(t_k) + 4 \sum_{k=0}^{n-1} f(t_{k+\frac{1}{2}}) + f(t_{sim}) \right]. \qquad (3.20)$$
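A minimal numerical sketch of this composite Simpson integration, assuming $\lambda_t(t)$ and $\epsilon(t)$ are available as Python callables; the step-function shape below is a placeholder loosely modeled on the simulation settings described later, not the exact configuration.

```python
def simpson_integral(f, t_sim, n):
    """Composite Simpson approximation of the integral of f over [0, t_sim], cf. Eq. (3.20)."""
    h = t_sim / n
    total = 0.0
    for k in range(n):
        t0, t1 = k * h, (k + 1) * h
        total += h / 6.0 * (f(t0) + 4.0 * f((t0 + t1) / 2.0) + f(t1))
    return total

# Net task-accumulation rate f(t) = lambda(t) - epsilon(t), cf. Eq. (3.17)
lam = lambda t: 0.001 + 0.007 * ((t % 8000) // 2000) / 3.0   # periodic 4-step arrivals (/ms)
eps = lambda t: 0.005                                         # constant departure rate (/ms)
theta = simpson_integral(lambda t: lam(t) - eps(t), t_sim=8000.0, n=400)
print(f"net task accumulation over one period: {theta:.1f}")  # negative means the buffer drains
```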
The online matching stage operates while the number of tasks in the global buffer remains below the threshold. MSs and edge servers are represented as disjoint sets $\mathcal{M} = \{1, 2, \cdots, i, \cdots, M\}$ and $\mathcal{N} = \{1, 2, \cdots, j, \cdots, N\}$. Tasks in $\mathcal{N}$ should be sent to $\mathcal{M}$ in real time, necessitating an online match.
Definition 3.1 For a matching case $\mu^*$ between $\mathcal{M}$ and $\mathcal{N}$, let $G(\mathcal{M}, \mathcal{N}, \mu^*)$ represent a bipartite graph with $\mu^*$ for matching.
In $G(\mathcal{M}, \mathcal{N}, \mu^*)$, we define $F_i$ as the total and $F_i^{oc}$ as the occupied computation capacity of MS $i$. Upon the creation of task $T_j$ by edge server $j$, we first identify the available MSs: the distance $D_{i,j}$ between edge server $j$ and MS $i$ is calculated using (3.3), and MS $i$ joins the candidate list $CL_j$ if $D_{i,j} < r_i$. After determining the distances, $CL_j$ is complete. The selection of an MS for $T_j$ aims to minimize execution latency. The execution latency of a task includes the upload and computation latency. The computation capacity $F_{i,j}$ assigned by MS $i$ to task $j$ is variable, constrained by C4 to keep execution latency within tolerable limits. The sum of the upload and waiting latency is

$$t_{j,i}^{s} = t_{j,i}^{tr} + t_{j,i}^{rw}. \qquad (3.21)$$
The weight $\alpha_{i,j}$ is then obtained by normalizing $t_{j,i}^{s}$ over the candidate list:

$$\alpha_{i,j} = 1 + \frac{t_{j,i}^{s} - \min_{s_i \in CL_j} t_{j,i}^{s}}{\max_{s_i \in CL_j} t_{j,i}^{s} - \min_{s_i \in CL_j} t_{j,i}^{s}} \, (\alpha_{max} - 1). \qquad (3.23)$$
The competitive ratio (CR) of the online algorithm ALG relative to the offline optimum OPT on instance $I$ is

$$CR = \min_{G(\mathcal{M}, \mathcal{N}, \mu)} \frac{ALG(I)}{OPT(I)}. \qquad (3.24)$$
The capacity loss of ALG with respect to OPT is

$$Loss = \sum_{e_i \in E'} \left( F_{i,j}^{OPT} - F_{i,j}^{ALG} \right), \qquad (3.25)$$
where

$$Loss_{s_i} = \sum_{e_i \in E_s'} \left( F_{i,j}^{OPT} - F_{i,j}^{ALG} \right) \le F_i - \sum_{e_i \in E_s'} F_{i,j}^{ALG}. \qquad (3.27)$$
For $E_s' \neq \emptyset$ and edge server $e_j \in E_s'$, when $T_j$ is generated, it is offloaded to $s_i'$ satisfying $F_{i,j}^{OPT} > F_{i,j}^{ALG}$, implying $F_i^{oc} \ge F_i - F_{i,j}^{ALG}$. Incorporating this into (3.27),

$$Loss_{s_i} \le F_i^{oc} + F_{i,j^*}^{ALG} - \sum_{e_i \in E_s'} F_{i,j}^{ALG}, \quad \forall e_{j^*} \in E_s'. \qquad (3.28)$$
For task $T_j$ of edge server $j$, calculate $D_{i,j}$, $\psi(\eta_i)$, and $F_{i,j}$ for each MS $i$. If $D_{i,j} < r_i$, MS $i$ joins $CL_j$. The task is offloaded to the MS $i$ maximizing the product $\psi(\eta_i) \times F_{i,j}$ in $CL_j$, after which the task is removed from the global buffer.
with

$$\sum_{l=1}^{k} \rho_l = N. \qquad (3.33)$$
Given $\hat{l} \le l$ and the monotonic decrease of $\psi(\eta_i)$, $\psi(\frac{l}{k}) \le \psi(\frac{\hat{l}}{k})$. Thus, (3.34) becomes

$$F_{i^*,j}\, \psi\!\left(\frac{l}{k}\right) \le F_{i',j}\, \psi\!\left(\frac{r}{k}\right). \qquad (3.35)$$
$$\sum_{l=1}^{k} \psi\!\left(\frac{l}{k}\right) \rho_l \le \sum_{l=1}^{k} \psi\!\left(\frac{l}{k}\right) \sigma_l. \qquad (3.36)$$
$$\sum_{l=1}^{k} \left(\frac{l}{k}\right) \rho_l \ge N \left( 1 - \frac{1}{e} \right). \qquad (3.37)$$
The left side represents ALG’s occupied computation capacity, completing the
proof.
Once the global buffer is filled, the offline matching stage commences. We have previously defined MSs and edge servers as disjoint sets $\mathcal{M} = \{1, 2, \cdots, i, \cdots, M\}$ and $\mathcal{N} = \{1, 2, \cdots, j, \cdots, N\}$. Let $\mathcal{N}' \subseteq \mathcal{N}$ represent the edge servers in the full global buffer. The goal is to establish an optimal matching relationship $\mu$ between $\mathcal{M}$ and $\mathcal{N}'$.
Each edge server $j$ holds a preference list $P_j(i)$ over MSs, and each MS $i$ holds a preference list $P_i(j)$ over edge servers, both ordered by decreasing preference.
Definition 3.4 Within $P_j(i)$, $i \succ_j i'$ indicates that edge server $j$ prefers MS $i$ over MS $i'$. Similarly, in $P_i(j)$, $j \succ_i j'$ signifies that MS $i$ prefers edge server $j$ over edge server $j'$.
A stable matching $\mu$ between $\mathcal{M}$ and $\mathcal{N}'$ is sought [22], where each edge server matches with at most one MS, and each MS does not exceed its maximum computation capacity. Additionally, $\mu$ should not have any blocking pairs.

Definition 3.5 A pair $(i, j)$ of edge server $j$ and MS $i$ blocks $\mu$ if $j \succ_i \mu(i)$ and $i \succ_j \mu(j)$. Such a pair is called a blocking pair of $\mu$.
To enhance stability, we build the preference lists for all edge servers and MSs. Intuitively, edge server $j$ prefers an MS with more resources and lower execution latency, while MS $i$ favors an edge server that utilizes more of its resources, enhancing its efficiency. The preference lists, $P_i(j)$ for MS $i$ and $P_j(i)$ for edge server $j$, are complete, transitive, and strict. The preference relations are defined as

$$\begin{cases} i \succ_j i' \Leftrightarrow P_j(i) > P_j(i'), \\ j \succ_i j' \Leftrightarrow P_i(j) > P_i(j'). \end{cases} \qquad (3.38)$$
The offline stage aims to match $\mathcal{M}$ and $\mathcal{N}'$ with strong stability. Initially, preference lists are generated (Line 1). Each unmatched edge server selects the most preferred MS from its list (Line 3). MSs with available capacity review their preferences, accepting or rejecting edge servers based on the aggregate computation capacity requirement (Lines 4–7). Unmatched edge servers then update their preferences (Lines 8–11) to enhance matching stability. A compact sketch of this procedure is given below.
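The sketch follows the deferred-acceptance pattern under stated assumptions: the preference scores stand in for $P_j(i)$ and $P_i(j)$, and capacity is enforced by evicting the least-preferred tentatively accepted edge servers.

```python
def offline_match(pref_es, pref_ms, demand, capacity):
    """Capacity-aware deferred acceptance between edge servers (ES) and MSs.
    pref_es[j]: MS indices ordered by decreasing P_j(i); pref_ms[i][j]: score P_i(j);
    demand[j]: computation demand of T_j; capacity[i]: capacity F_i of MS i."""
    next_choice = [0] * len(pref_es)            # next MS each unmatched ES proposes to
    accepted = {i: [] for i in range(len(capacity))}
    unmatched = list(range(len(pref_es)))
    while unmatched:
        j = unmatched.pop(0)
        if next_choice[j] >= len(pref_es[j]):
            continue                            # j has exhausted its preference list
        i = pref_es[j][next_choice[j]]          # most preferred remaining MS (Line 3)
        next_choice[j] += 1
        accepted[i].append(j)                   # tentatively accept (Lines 4-7)
        # Evict least-preferred ESs while the aggregate demand exceeds capacity F_i
        while sum(demand[k] for k in accepted[i]) > capacity[i]:
            worst = min(accepted[i], key=lambda k: pref_ms[i][k])
            accepted[i].remove(worst)
            unmatched.append(worst)             # rejected ES proposes again (Lines 8-11)
    return accepted

match = offline_match(pref_es=[[0, 1], [0, 1], [1, 0]],
                      pref_ms=[[3, 2, 1], [1, 3, 2]],
                      demand=[4, 3, 5], capacity=[6, 6])
print(match)
```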
in Fig. 3.3), each with a computation capacity of 80 GHz and a 450 m coverage range, acknowledging industrial areas' complexity compared to residential ones. Details are in S2 of Table 3.1.
signal reception conditions in IIoT contexts [26, 27]. The channel gain per meter is $-50$ dB, accounting for the complex outdoor environments in IIoT [28, 29]. The MSs' signal antennas have a height of 30 m.
Threshold parameters are also configured to demonstrate the scheme's effectiveness. Task arrival rates follow a periodic step function with an 8000 ms period and four steps per period, ranging from 0.001/ms to 0.008/ms. The arrival rate also increases with the number of devices. The system's task departure rate is set at 0.005/ms. The threshold calculation is based on (3.19). All scenario-specific and common parameters are summarized in Table 3.1.
In both scenarios, tasks are categorized as light or heavy, based on computational intensity and resource requirements. A typical heavy task is computation-intensive Virtual Reality (VR)/Augmented Reality (AR), while a typical light task is natural language processing. The ratio of heavy tasks $\varepsilon$ is randomly sampled from $\{0.2, 0.5, 0.8\}$ and $\{0.4\}$ for the two scenarios, aligning with the distribution in IIoT
• Timeout Rate (TR): This measures the proportion of tasks not completed within the tolerable latency, computed as the number of unfinished tasks over the total number of tasks. It is defined as

$$TR = \frac{\text{Number of Unfinished Tasks}}{\text{Total Number of Tasks}}. \qquad (3.42)$$
• Min-Min: In this approach, tasks arriving in batches are offloaded. Each task is
assigned to the MS guaranteeing the shortest execution latency, determined by
the sequential order of task arrivals [31, 32].
• Max-Min: Similar to Min-Min, this scheme offloads tasks in batches within a
fixed time slot. Each task is assigned to the MS that has the longest execution
latency, following the order of task arrival [31, 32].
For all schemes, including FoGMatch, Min-Min, and Max-Min, we utilize the same computing power distribution. This uniformity is crucial for a fair and meaningful comparison of each scheme's offloading decision effectiveness. In FoGMatch, Min-Min, and Max-Min, the time slot is set to 800 ms.
Fig. 3.4 Analyzing the variation in heavy tasks' average execution latency and timeout rate as device numbers change under distinct $\varepsilon$ levels

Fig. 3.5 The average execution latency and timeout rate of the light tasks with varying device numbers under different $\varepsilon$
Figure 3.5 exhibits the average execution latency and timeout ratio of light tasks with different $\varepsilon$. As shown in Fig. 3.5a, b, and c, the average execution latency for light tasks remains stable in Max-Min, Min-Min, FoGMatch, and ATOM as $N$ increases. Max-Min consistently shows the highest average execution
latency, while Min-Min and FoGMatch exhibit similar times. ATOM outperforms
other schemes with the lowest average execution latency for light tasks. Conversely,
OnlineMatch sees a notable increase in execution latency as N grows, attributable
to accumulating waiting latency.
Figure 3.5d, e, and f shows the timeout rate performance of light tasks in different
schemes with varying N . Across these figures, Max-Min, Min-Min, FoGMatch, and
ATOM show minimal changes in timeout rates with increasing N. Notably, ATOM
always records the lowest timeout rate, outperforming Max-Min, Min-Min, and
FoGMatch by 40.6%, 15.8%, and 14.2%, respectively. In contrast, OnlineMatch
exhibits a marked increase in timeout rate for light tasks, surpassing 95% when
$N \ge 400$, indicating its instability in high-task scenarios.
Given the results from the first scenario, where Max-Min and OnlineMatch were
less effective, the second scenario focuses on comparing ATOM with Min-Min and
FoGMatch.
In Fig. 3.6, the average execution latency of heavy and light tasks in these three
schemes is compared. ATOM achieves lower average execution latency for both task
types and shows higher stability, as evidenced by the clustering of data in the boxplot
analysis. Figure 3.7a and b depicts the timeout rates for heavy and light tasks. The
timeout rate for heavy tasks is generally lower compared to light tasks, attributable
to their longer execution latency. ATOM again demonstrates lower timeout rates
for both task types compared to Min-Min and FoGMatch. An upward trend in all
six curves is observed, likely due to the sparser server distribution in the second
scenario, which limits server options for task assignment. As device numbers and
tasks increase, some tasks face latency beyond their time limits, leading to elevated
timeout rates.
Figure 3.8 presents the variation in MSs’ computation capacity utilization with
changing device numbers across different schemes in two scenarios.
In the first scenario with 100 MSs, Fig. 3.8a shows an increase in computing demand with more devices due to a higher task count. Notably, after $N > 300$, the OnlineMatch scheme faces congestion in processing tasks, leading to delayed task processing and a sharp rise in computing utilization, indicating a system breakdown. In contrast, ATOM, FoGMatch, Min-Min, and Max-Min maintain an average computation capacity utilization around 0.3 with 100 devices, increasing to approximately 0.6 with 500 devices. ATOM notably exhibits higher utilization due to its reduced waiting times and smaller allocated computation capacity per task, resulting in a longer execution latency and higher average utilization during task execution.
In scenario 2, depicted in Fig. 3.8b, with a reduced number of 49 MSs, the overall computation capacity utilization is higher compared to the first scenario. Similar to scenario 1, the utilization increases with the number of devices. ATOM again showcases higher average utilization than FoGMatch and Min-Min, credited to its adept management of tasks and computing resources.
Fig. 3.6 The average execution latency of the tasks with varying device numbers
Fig. 3.7 The timeout rate of the tasks with varying device numbers
Fig. 3.8 The average computation capacity utilization of the MECs with varying device numbers
In the system model illustrated in Fig. 3.9, we focus on a scenario where a single BS, equipped with an edge server, is interconnected with multiple users. This connection is facilitated through dynamic wireless channels that operate within the coverage area of the BS. The edge server in this setup is characterized by $k_b$ cores. Similarly, each user's end device is equipped with $k_l$ cores, with each core designed to execute one task at a time. A key feature of both the end devices and the edge server is their ability to dynamically adjust CPU frequencies. This adjustment is crucial for optimizing energy consumption and is made possible through the implementation of Dynamic Voltage and Frequency Scaling (DVFS) technology [35, 36].
In our system model, time is segmented into $T$ slots, each with a duration of $\tau$ seconds and indexed as $t \in \{0, 1, \cdots, T-1\}$. Within each time slot, users execute applications that are structured as DAGs. These DAGs are comprised of nodes, each
Fig. 3.9 In the industrial edge computing system model for DAG-based task offloading, three
primary processes are defined: “offloading”, “task precedent”, and “cooperation”. “Offloading”
refers to transferring tasks from the user’s device to the edge server for execution. “Task precedent”
indicates that when a task is executed locally on the user’s device, it requires intermediate data
produced by its preceding task, which may have been executed on the edge server [33, 34].
Finally, “cooperation” suggests that two tasks from different DAGs can collaborate by sharing
their intermediate data, enhancing the overall task processing efficiency
representing a computational task, and edges that denote the dependencies between
these tasks.
Task offloading decisions for all users are determined based on the structure of
these DAGs. These decisions involve determining whether to execute tasks on the
end devices or to offload them to the edge server. Factors such as CPU frequencies
and network conditions are taken into account in this process. Additionally,
decisions regarding cooperation among applications are made to improve overall
application performance [37]. This cooperative process is restricted to applications
within the same time slot to prevent any adverse effects on performance. It entails
selecting an External Dependency (ED) from the pool of available devices for data
transmission and establishing the ratio of data to be transmitted through this ED. The
primary objective of these strategies is to maximize the utility of the applications by
judiciously optimizing the offloading, cooperation, and computation decisions for
each user.
Let N represent the total number of tasks in an application, with task 1 and task
N serving as virtual tasks that require no computation. These tasks facilitate the
initiation and termination of the application on the end device. With EDs, we define
a DAG-ED for U users’ applications, which is an extension of the standard DAG.
Fig. 3.10 An illustration of ED-DAGs for two applications. The circle node in the figure represents a task $v_{u,n}$. In a DAG, the black edge $(v_{u,n}, v_{u,n'})$ is a general dependency, while the red dashed edge $(v_{u,n}, v_{u',n'})$ between two DAGs is an ED. Tasks and edges are described by the computation requirement $c_n$ and the size of transmitted data $o_e$, respectively. Note that there are two grey virtual nodes without computation requirements as the beginning and end of an application execution
• Offloading Decision: Let $\alpha^t$ represent the offloading decisions for $N$ tasks and $U$ users at time slot $t$. Here, $\alpha_{u,n}^t = 1$ implies that user $u$'s task $n$ is executed locally, while $\alpha_{u,n}^t = 0$ indicates the task is offloaded to the edge.
• Cooperation Decision: The vector $\beta^t$, sized $|E|$, indicates the data transmission ratio through EDs, with $\beta_e^t \in [0, 1]$ for each ED edge $e$.
• Computation Decision: A $U$-dimensional vector, $\lambda^t$, represents the local CPU frequency ratio for users in time slot $t$, where $\lambda_u^t \in [0.1, 1]$ and 0.1 is the minimum ratio.
Define $R_{u,n}(t)$ as the transmission rate between user $u$ and the BS, corresponding to the execution location of task $v_{u,n}$. By Shannon's theorem, we have

$$R_{u,n}(t) = W \log \left( 1 + \frac{p_{u,n}(t)\, h_u(t)}{\sigma^2} \right), \qquad (3.44)$$
where $W$ and $\sigma^2$ denote the bandwidth of the orthogonal channels and the channel noise, respectively. The channel gain $h_u(t)$ between user $u$ and the BS in time slot $t$ is given by

$$h_u(t) = A_d \left( \frac{3 \times 10^8}{4 \pi f_c l_u^t} \right)^p, \qquad (3.45)$$

with $A_d$, $f_c$, $l_u^t$, and $p$ representing the channel gain, communication frequency, the distance between user $u$'s end device and the edge server, and the path loss exponent, respectively. Note that $R_{u,n}(t)$ varies due to the mobility of end devices.
Meanwhile, the transmitting power $p_{u,n}(t)$ from the sender is defined as

$$p_{u,n}(t) = \alpha_{u,n}^t p_b + \left( 1 - \alpha_{u,n}^t \right) p_l, \qquad (3.46)$$

where $p_b$ and $p_l$ are the transmitting powers of the BS and users, respectively.
Consequently, the transmission latency from task $v_{u',n'}$ to task $v_{u,n}$ in time slot $t$ can be calculated based on these parameters:

$$L_{u',n'}^{u,n}(t) = \begin{cases} |\alpha_{u,n}^t - \alpha_{u',n'}^t| \dfrac{o_e}{R_{u,n}(t)}, & e \in E, \\[2mm] \dfrac{\alpha_{u,n}^t \beta_e^t o_e}{R_{u,n}(t)} + \dfrac{\alpha_{u',n'}^t \beta_e^t o_e}{R_{u',n'}(t)}, & e \text{ is an ED edge.} \end{cases} \qquad (3.47)$$
For an intra-application dependency edge $e = (v_{u',n'}, v_{u,n})$ in $E$, the transmitted data size is $o_e$. If tasks $v_{u',n'}$ and $v_{u,n}$ are executed at the same location, the transmission latency is zero, indicated by $|\alpha_{u,n}^t - \alpha_{u',n'}^t| = 0$. Otherwise, the latency is $\frac{o_e}{R_{u,n}(t)}$.
For an ED edge $e$, the transmitted data size is $\beta_e^t o_e$. The latency is set to zero when both tasks are executed at the same position on the edge server, i.e., $\alpha_{u,n}^t = \alpha_{u',n'}^t = 0$. If both tasks are executed on their respective local devices, the latency includes the sum of transmissions to and from the edge server, calculated as $\frac{\beta_e^t o_e}{R_{u,n}(t)} + \frac{\beta_e^t o_e}{R_{u',n'}(t)}$. If one task is local and the other is on the edge server, the latency is either $\frac{\beta_e^t o_e}{R_{u,n}(t)}$ or $\frac{\beta_e^t o_e}{R_{u',n'}(t)}$, depending on the task's location.
When tasks .vu' ,n' and .vu,n are connected by a task dependency edge and executed
in different locations, the transmission involves sending data first to the BS and then
to the other user.
Application Latency For task $v_{u,n}$ in time slot $t$, let $L_{u,n}^B(t)$, $L_{u,n}^C(t)$, and $L_{u,n}(t)$ represent the initiation time, computational latency, and completion time, respectively. The completion time is given by

$$L_{u,n}(t) = L_{u,n}^B(t) + L_{u,n}^C(t). \qquad (3.48)$$
The application latency for user $u$ in time slot $t$ is $L_{u,N}(t)$. Due to the potential asynchrony of task execution among different users, we adjust the initiation time as follows: the initiation time of $v_{u,n}$ is the latest time by which all output data from tasks in $\Omega_{u,n}$ has been transmitted:

$$L_{u,n}^B(t) = \max_{v_{u',n'} \in \Omega_{u,n}} \left\{ L_{u',n'}(t) + L_{u',n'}^{u,n}(t) \right\}. \qquad (3.51)$$
The computational latency depends on where the task is executed:

$$L_{u,n}^C(t) = \alpha_{u,n}^t \frac{c_n}{f_b} + \left( 1 - \alpha_{u,n}^t \right) \frac{c_n}{\lambda_u^t f_l}, \qquad (3.52)$$

where $f_b$ and $f_l$ are the CPU frequencies of the edge server and the end device, respectively. This method is also applicable to CPU-GPU heterogeneous computing platforms.
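Equations (3.48), (3.51), and (3.52) can be evaluated in one pass over a topological ordering of the DAG. The sketch below assumes precomputed per-task computation latencies and per-edge transmission latencies (all names hypothetical).

```python
def completion_times(preds, comp_latency, trans_latency):
    """Completion time L_{u,n}(t) for each task, cf. Eqs. (3.48) and (3.51).
    preds[n]: predecessors of task n (nodes 0..N-1 in topological order);
    comp_latency[n]: L^C per Eq. (3.52); trans_latency[(m, n)]: edge latency per Eq. (3.47)."""
    L = {}
    for n in range(len(preds)):
        # Eq. (3.51): initiation time = latest arrival of all predecessors' output data
        L_B = max((L[m] + trans_latency.get((m, n), 0.0) for m in preds[n]), default=0.0)
        L[n] = L_B + comp_latency[n]            # Eq. (3.48)
    return L

# A toy 4-task DAG: 0 -> {1, 2} -> 3 (tasks 0 and 3 virtual, zero computation)
L = completion_times(preds=[[], [0], [0], [1, 2]],
                     comp_latency=[0.0, 0.3, 0.5, 0.0],
                     trans_latency={(0, 1): 0.1, (0, 2): 0.1, (1, 3): 0.2, (2, 3): 0.05})
print(f"application latency: {L[3]:.2f} s")     # max(0.4 + 0.2, 0.6 + 0.05) = 0.65
```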
Cooperation Gain Calculating the exact cooperation gain from EDs through partial data sharing is challenging. To address this, we correlate the shared data amount with application utility. Recognizing that the marginal benefit of increased shared data diminishes as more data is shared, we use a $\log_{10}(\cdot)$ function to evaluate the cooperation gain:

$$G(t) = \sum_{e \in E} \log_{10} \beta_e^t o_e. \qquad (3.53)$$
The total energy consumption of user $u$ in time slot $t$ sums the computation and transmission energy of its tasks:

$$E_u(t) = \sum_{n=1}^{N} E_{u,n}^C(t) + E_{u,n}^T(t). \qquad (3.56)$$
The application utility in time slot $t$, which also serves as the reward, weights the cooperation gain against latency and energy consumption:

$$r_t = \omega_1 G(t) - \sum_{u=1}^{U} \left[ \omega_2 L_{u,N}(t) + \omega_3 E_u(t) \right]. \qquad (3.57)$$
The long-term utility maximization problem is then formulated as

$$\begin{aligned} \mathbf{P1}: \ \max_{\alpha, \beta, \lambda} \ & \sum_{t=0}^{T-1} r_t \\ \text{s.t.} \quad & \alpha_{u,1}^t = \alpha_{u,N}^t = 0, \ \forall u, t \\ & \alpha_{u,n}^t \in \{0, 1\}, \ \forall u, t, n \\ & \beta_e^t \in [0, 1], \ \forall e \in E \\ & \lambda_u^t \in [0.1, 1], \ \forall u, t. \end{aligned} \qquad (3.58)$$
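Putting (3.53) and (3.57) together, the per-slot reward that $\mathbf{P1}$ accumulates can be computed as follows (a sketch; the sharing ratios, data sizes, latencies, and energies are placeholder values):

```python
import math

def reward(beta, o, latency, energy, w1=0.5, w2=0.5, w3=5.0):
    """Per-slot reward r_t of Eq. (3.57) with cooperation gain G(t) of Eq. (3.53).
    beta[e], o[e]: sharing ratio and data size per ED e; latency[u], energy[u]: per user."""
    G = sum(math.log10(beta[e] * o[e]) for e in beta if beta[e] > 0.0)
    penalty = sum(w2 * latency[u] + w3 * energy[u] for u in latency)
    return w1 * G - penalty

r = reward(beta={"e1": 0.6, "e2": 0.2}, o={"e1": 1300.0, "e2": 800.0},
           latency={0: 1.2, 1: 0.9}, energy={0: 0.05, 1: 0.08})
print(f"r_t = {r:.3f}")
```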
In this section, we address $\mathbf{P1}$ by modeling it as an MDP, enabling the use of actor-critic methods to address the complex interplay between offloading, computation, and cooperation decisions. We then introduce a soft policy function to expand the action space and avoid local optima. Finally, we propose a branch-based actor, designed to synchronize the individual offloading decisions while preserving cooperation gain.
Classical DRL methods include Deep Q-Network (DQN) [39] and DDPG [40]. However, SAC's single-agent nature does not account for inter-agent interference. To mitigate this, we adopt the Multi-Agent Soft Actor-Critic (MASAC) approach, which effectively manages the nonstationarity of multi-agent environments, facilitating efficient and privacy-aware service migration decisions amidst resource competition.

Facing the challenge of large action spaces in classical DRL methods, we adopt the SAC algorithm for the DAG-ED-based offloading problem. SAC adds an entropy term to the reward, boosting exploration. This entropy term represents policy randomness, encouraging the policy to choose high-reward actions more diversely.
Consider a policy $\pi(a_t|s_t)$, which selects action $a_t$ in a given environment state $s_t$. The entropy under this policy is defined as

$$\mathcal{H}(\pi(\cdot|s_t)) = \mathbb{E}_{a_t \sim \pi} \left[ -\log \pi(a_t|s_t) \right].$$

The soft V-value in the BSAC framework evaluates a state under policy $\pi$ using a modified Bellman backup, where $\mu$ is the temperature coefficient weighting the entropy:

$$V(s_t) = \mathbb{E}_{a_t \sim \pi} \left[ Q(s_t, a_t) + \mu \mathcal{H}(\pi) \right].$$
We introduce BSAC which includes two main modules: the critic and the actor, as
illustrated in Fig. 3.11. The critic module evaluates feedback from the environment
using soft Q-value and V-value functions. The branches in the actor output the
offloading, cooperation, and computation decisions, respectively.
Critic Module This module employs two DNNs, $Q_\theta$ for the soft Q-value and $V_\rho$ for the soft V-value, enhancing training stability. An experience replay buffer $\mathcal{B}$ stores and updates experiences as shown in

$$\mathcal{B} \leftarrow (s_t, a_t, r_t, s_{t+1}) \cup \mathcal{B}. \qquad (3.65)$$
Network parameters $\theta$ and $\rho$ are updated using random samples from $\mathcal{B}$, reducing the correlation between samples and the network parameters.
Let $(s_i, a_i, r_i, s_{i+1})$ denote the $i$th sampled tuple. For the soft V-value network $V_\rho$, we train it by minimizing the mean squared error (MSE):

$$\mathcal{L}(\rho) = \mathbb{E}_{s_i} \left\{ \frac{1}{2} \left[ V_\rho(s_i) - \mathbb{E}_{a_i} \{ Q_\theta(s_i, a_i) + \mu \mathcal{H}(\pi) \} \right]^2 \right\}, \qquad (3.66)$$
and the soft V-value network parameter $\rho$ is updated by gradient descent:

$$\rho \leftarrow \rho - l_\rho \nabla \mathcal{L}(\rho), \qquad (3.67)$$

where $l_\rho$ is the learning rate of the soft V-value network. Here, $\nabla_\rho \mathcal{L}(\rho)$ is the gradient, which is

$$\nabla_\rho \mathcal{L}(\rho) = \nabla_\rho V_\rho(s_i) \left[ V_\rho(s_i) - Q_\theta(s_i, a_i) - \mu \mathcal{H}(\pi) \right]. \qquad (3.68)$$
We train the soft Q-value network toward minimum soft Bellman residuals between $Q_\theta(s_i, a_i)$ and $\hat{Q}_\theta(s_i, a_i)$, i.e.,

$$\mathcal{L}(\theta) = \mathbb{E}_{\{s_i, a_i\}} \left\{ \frac{1}{2} \left[ Q_\theta(s_i, a_i) - \hat{Q}_\theta(s_i, a_i) \right]^2 \right\}, \qquad (3.71)$$

$$\theta \leftarrow \theta - l_\theta \hat{\nabla} \mathcal{L}(\theta). \qquad (3.72)$$
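A minimal PyTorch-flavored sketch of the critic updates (3.66)–(3.68) and (3.71)–(3.72); the network sizes, learning rates, and sampled batch are illustrative assumptions rather than the book's implementation.

```python
import torch
import torch.nn as nn

state_dim, action_dim, mu = 16, 8, 0.2
V = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1))
Q = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, 1))
opt_V = torch.optim.Adam(V.parameters(), lr=3e-4)   # learning rate l_rho
opt_Q = torch.optim.Adam(Q.parameters(), lr=3e-4)   # learning rate l_theta

# A random mini-batch standing in for samples from the replay buffer B, cf. Eq. (3.65)
s = torch.randn(32, state_dim)
a = torch.rand(32, action_dim)
log_pi = torch.randn(32, 1)          # log-probabilities from the current policy
q_target = torch.randn(32, 1)        # soft Bellman target Q_hat, cf. Eq. (3.71)

# Eq. (3.66): MSE between V(s) and Q(s, a) + mu * H(pi), with H(pi) = -log pi(a|s)
v_loss = 0.5 * (V(s) - (Q(torch.cat([s, a], 1)).detach() - mu * log_pi)).pow(2).mean()
opt_V.zero_grad(); v_loss.backward(); opt_V.step()   # Eq. (3.67)

# Eq. (3.71): soft Bellman residual for the Q-network, then gradient step (3.72)
q_loss = 0.5 * (Q(torch.cat([s, a], 1)) - q_target).pow(2).mean()
opt_Q.zero_grad(); q_loss.backward(); opt_Q.step()
```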
Actor Module with Multiple Branches The actor module features a multi-branch structure, each branch composed of a DNN and governed by a distinct network parameter $\phi_j$, $j \in \{1, 2, \cdots, U+2\}$. These branches are dedicated to different decision-making processes: offloading decisions for individual users, cooperation decisions for EDs, and computation decisions for all users, aligned with the policy $\pi_{\phi_j}$. The first $U$ networks, each corresponding to a user $u$, focus on offloading decisions, represented by $a_i^j$, $j \in \{1, 2, \cdots, U\}$. The subsequent network handles cooperation decisions $a_i^{U+1}$, while the final one is tasked with computation decisions $a_i^{U+2}$. This distributed training approach allows the system to adapt to the specific requirements of different applications, thereby enhancing overall application utility. A possible network layout is sketched below.
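The sketch uses one shared trunk with $U + 2$ heads; the trunk width, head shapes, and squashing functions are assumptions chosen only to mirror the decision ranges $\alpha \in \{0,1\}$ (relaxed to probabilities), $\beta \in [0, 1]$, and $\lambda \in [0.1, 1]$.

```python
import torch
import torch.nn as nn

class BranchActor(nn.Module):
    """Shared trunk with U + 2 branches: U offloading heads (one per user),
    one cooperation head over the EDs, and one computation (CPU-ratio) head."""
    def __init__(self, state_dim, num_users, num_tasks, num_eds):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU())
        self.offload = nn.ModuleList(
            nn.Linear(128, num_tasks) for _ in range(num_users))    # a^j, j <= U
        self.coop = nn.Linear(128, num_eds)                         # a^{U+1}
        self.comp = nn.Linear(128, num_users)                       # a^{U+2}

    def forward(self, s):
        h = self.trunk(s)
        # Per-task offloading probabilities (a continuous relaxation of alpha in {0, 1})
        offload = [torch.sigmoid(head(h)) for head in self.offload]
        beta = torch.sigmoid(self.coop(h))                          # beta in [0, 1]
        lam = 0.1 + 0.9 * torch.sigmoid(self.comp(h))               # lambda in [0.1, 1]
        return offload, beta, lam

actor = BranchActor(state_dim=16, num_users=4, num_tasks=6, num_eds=4)
offload, beta, lam = actor(torch.randn(1, 16))
print(beta.shape, lam.shape)
```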
The optimal policy is determined by minimizing the Kullback-Leibler (KL) divergence expectation, formulated as

$$\mathcal{L}(\phi_j) = \mathbb{E}_{s_i} \left\{ D_{KL} \left( \pi_{\phi_j}(\cdot|s_i) \, \Big\| \, \frac{\exp(Q_\theta(s_i, \cdot))}{Z_\theta(s_i)} \right) \right\}, \qquad (3.74)$$

where actions are reparameterized as

$$a_i^j = \phi(\delta_i; s_i), \qquad (3.75)$$
whose gradient is

$$\nabla_{\phi_j} \mathcal{L}(\phi_j) = \nabla_{\phi_j} \mathcal{H} \left( \pi_{\phi_j}(a_i^j | s_i) \right) + \left[ \nabla_{a_i^j} \mathcal{H} \left( \pi_{\phi_j}(a_i^j | s_i) \right) + \nabla_{a_i^j} Q_\theta(s_i, a_i^j) \right] \nabla_{\phi_j} a_i^j. \qquad (3.77)$$
In this book, our simulation is established in a typical cellular network over $T = 500$ time slots, where four users are uniformly distributed. The BS is equipped with a CPU of frequency $f_b = 2.5 \times 10^9$ cycles/s, $k_b = 8$ cores, and transmitting power $p_b = 1$ W. Each user's device features a CPU with frequency $f_l = 8 \times 10^7$ cycles/s, $k_l = 2$ cores, and transmitting power $p_l = 0.1$ W. The wireless channel $h_u(t)$ between users and the BS follows the free space path loss model, with channel noise $\sigma^2$ set at $10^{-10}$.
Users run an application comprising six tasks, as illustrated in Fig. 3.12. The data size of intermediate results is indicated by the values on the directed edges of the DAG. Four EDs enable cooperation among users, shown as red dashed lines. The computation workloads of the six tasks are set to $[0, 60, 80, 150, 100, 0]$ (M cycles), and the energy consumption constant $\kappa$ is $10^{-27}$. Weight parameters for application utility, $\omega_1$, $\omega_2$, and $\omega_3$, are set to 0.5, 0.5, and 5, respectively. These simulation parameters are consistent with existing works on DAG task offloading and are detailed in Table 3.2.
We compare the performance of BSAC with the following four benchmarks:
• Never Cooperate (NC): It uses a greedy algorithm to offload the tasks toward
maximum application utility, without cooperation.
Fig. 3.12 DAG-ED model including 6 EDs, i.e., 4 red ones and 2 green ones, where the 4 red dashed edges are the basic EDs and the 2 green dashed edges are used to verify the impact of the number of EDs
Table 3.2 (excerpt) Key simulation parameters: $k_b = 8$, $k_l = 2$, $p_b = 1$ W, $p_l = 0.1$ W, $\kappa = 10^{-27}$
In Fig. 3.13a, we observe the application utility of five distinct methods across a
bandwidth range of 8–16 MHz. Notably, the AC method initially shows the lowest
utility at bandwidths below 9 MHz, primarily due to the significant execution latency
caused by EDs. However, as the bandwidth exceeds 10 MHz, AC’s utility escalates
Fig. 3.13 Performance under different bandwidths: (a) application utility, (b) average execution latency, (c) average cooperation gain, and (d) average energy consumption
altogether, records the lowest latency, offering a more streamlined processing route. BSAC, focusing on energy efficiency, exhibits the third-lowest average application latency among the methods.
In terms of average cooperation gain, illustrated in Fig. 3.13c, the bandwidth variation from 8 MHz to 16 MHz brings interesting dynamics. NC, which does not participate in data sharing, maintains a consistent zero gain across the bandwidth spectrum. AC, with its policy of consistent cooperation, achieves the highest average cooperation gain, irrespective of network conditions. RC, with its random approach to cooperation, realizes a gain that is roughly half that of AC. Interestingly, QSAC's cooperation gain increases with bandwidth and eventually surpasses that of RC. BSAC, by adaptively changing the data transmission ratio in EDs, secures the second-highest cooperation gain, benefiting from its strategy of partial data sharing.
Finally, Fig. 3.13d shows the average energy consumption for the five methods, with the bandwidth spanning 8 to 16 MHz. RC emerges as the method with the highest energy consumption due to its random approach to cooperation, which does not consider the interplay between task offloading and cooperation under dynamic network conditions. AC, on the other hand, manages to reduce energy consumption by maintaining user cooperation, which leads to lower transmission latency between interdependent tasks during offloading, especially as bandwidth increases. NC and QSAC, while efficient in some respects, exhibit higher energy consumption than BSAC. BSAC stands out by balancing energy consumption with cooperation gains and application latency, achieving the lowest energy consumption overall. This efficiency is largely attributed to BSAC's flexibility in reducing energy consumption by adaptively adjusting the CPU frequency of the end device.
In our research, three DAG-ED configurations are defined for analysis. DAG-ED-1
refers to the original model, featuring four red EDs. Expanding on this, DAG-ED-2
includes the same four red EDs, complemented by an additional green ED valued
at 1300. This setup explores the influence of an extra ED with significant value.
Finally, DAG-ED-3 encompasses the full spectrum, incorporating all six EDs to
examine the system’s capacity in a more complex setup.
In the analysis presented in Fig. 3.14a, the total application utility of five different
methods is evaluated under three distinct DAG-ED configurations with a fixed
bandwidth of 12 MHz. NC, not participating in data sharing, shows a consistent
application utility across all DAG-ED variations. In contrast, the application utility
for other methods exhibits an upward trend as the number of EDs increases from
DAG-ED-1 to DAG-ED-3. This increase is attributed to the growing number of
EDs providing a wider range of cooperative options. Among these methods, BSAC
consistently achieves the highest application utility in each DAG-ED setup. It
adeptly balances energy consumption, cooperation gain, and application latency,
effectively utilizing the available EDs to enhance overall utility.
Fig. 3.14 (a) Application utility with different DAG-EDs and (b) latency, energy consumption, and cooperation gain of BSAC with different DAG-EDs
Further insights are offered in Fig. 3.14b, which displays changes in application
utility concerning latency, energy consumption, and cooperation gain for BSAC at
a bandwidth of 12 MHz. Following the principle outlined in Eq. (3.57), application
utility is inversely related to energy consumption and latency. Therefore, energy
consumption and latency are represented as negative values to illustrate their impact
on application utility. As the number of EDs rises from DAG-ED-1 to DAG-ED-
3, there is a notable increase in cooperation gain, albeit accompanied by extended
waiting times and elevated transmission energy consumption. The addition of
new EDs introduces more flexible cooperation options, resulting in an upsurge in
cooperation gain. This comprehensive analysis underscores the influence of EDs on
the application utility, highlighting the trade-offs between cooperation gain, energy
expenditure, and latency.
In Fig. 3.15a, the impact of varying the number of cores in the edge server on the
application utility of five different methods is presented. As the number of cores in
the edge server increases, a noticeable growth trend in application utility is observed
across all methods. This improvement is primarily due to the enhanced capacity of
the edge servers to execute tasks in parallel, effectively reducing application latency.
Similarly, Fig. 3.15b focuses on the application utility as influenced by the
number of cores in each end device. Unlike the edge server scenario, the increase in
utility here is relatively marginal. This limited improvement can be attributed to the
lower CPU frequency of the end devices compared to the edge servers. Despite the
increase in cores allowing for more parallel task execution at the device level, the
lower frequency of these devices restrains the overall gain in application utility.
Fig. 3.15 Comparison of application utility with variations in core numbers (a) in the edge server
and (b) in each end device
(Fig. 3.16: cooperation gain under varying weight parameters; y-axis: Cooperation Gain.)
Figures 3.16, 3.17, 3.18, 3.19, 3.20, and 3.21 demonstrate the effects of varying the weight parameters $\omega_1$, $\omega_2$, and $\omega_3$, which, respectively, correspond to cooperation gain, application latency, and energy consumption. The values for $\omega_1$ and $\omega_2$ are set at $[0.1, 0.5, 1, 1.5]$, while $\omega_3$ ranges over $[1, 5, 10, 15]$.
With an increase in $\omega_1$, cooperation gain assumes greater significance in the reward function. As a result, there is a marked increase in cooperation gain, accompanied by a slight decrease in application latency and energy consumption. In contrast, a higher value of $\omega_2$, which acts as a penalty in the reward function, tends to favor local task execution. This approach leads to increased cooperation costs and energy consumption, offsetting the benefits of the saved cooperation gain. Increasing the value of $\omega_3$ in the reward function leads to a noticeable decrease in energy consumption, while simultaneously causing an increase in application latency and cooperation gain.
(Figs. 3.17–3.21: cooperation gain, application latency (s), and energy consumption (J) under varying $\omega_1$, $\omega_2$, and $\omega_3$.)
This trend arises because prioritizing energy savings in the reward function encourages task execution on edge servers, even if it means incurring longer waiting times.
References
1. Tie Qiu, Jiancheng Chi, Xiaobo Zhou, Zhaolong Ning, Mohammed Atiquzzaman, and Dapeng Oliver Wu. Edge computing in industrial internet of things: Architecture, advances and challenges. IEEE Communications Surveys & Tutorials, 22(4):2462–2488, 2020.
2. Min Chen and Yixue Hao. Task offloading for mobile edge computing in software defined ultra-
dense network. IEEE Journal on Selected Areas in Communications, 36(3):587–597, 2018.
3. Pavel Mach and Zdenek Becvar. Mobile edge computing: A survey on architecture and computation offloading. IEEE Communications Surveys & Tutorials, 19(3):1628–1656, 2017.
4. Hai Lin, Sherali Zeadally, Zhihong Chen, Houda Labiod, and Lusheng Wang. A survey on
computation offloading modeling for edge computing. Journal of Network and Computer
Applications, 169:102781, 2020.
5. Bin Cao, Long Zhang, Yun Li, Daquan Feng, and Wei Cao. Intelligent offloading in multi-
access edge computing: A state-of-the-art review and framework. IEEE Communications
Magazine, 57(3):56–62, 2019.
6. Xianfu Chen, Jinsong Wu, Yueming Cai, Honggang Zhang, and Tao Chen. Energy-efficiency
oriented traffic offloading in wireless networks: A brief survey and a learning approach
for heterogeneous cellular networks. IEEE Journal on Selected Areas in Communications,
33(4):627–640, 2015.
7. Li Lin, Xiaofei Liao, Hai Jin, and Peng Li. Computation offloading toward edge computing.
Proceedings of the IEEE, 107(8):1584–1607, 2019.
8. Yuxuan Sun, Xueying Guo, Jinhui Song, Sheng Zhou, Zhiyuan Jiang, Xin Liu, and Zhisheng
Niu. Adaptive learning-based task offloading for vehicular edge computing systems. IEEE
Transactions on Vehicular Technology, 68(4):3061–3074, 2019.
9. Jia Yan, Suzhi Bi, Ying Jun Zhang, and Meixia Tao. Optimal task offloading and resource allocation in mobile-edge computing with inter-user task dependency. IEEE Transactions on Wireless Communications, 19(1):235–250, 2020.
10. Ke Zhang, Yongxu Zhu, Supeng Leng, Yejun He, Sabita Maharjan, and Yan Zhang. Deep
learning empowered task offloading for mobile edge computing in urban informatics. IEEE
Internet of Things Journal, 6(5):7635–7647, 2019.
11. Jiancheng Chi, Chao Xu, Tie Qiu, Di Jin, Zhaolong Ning, and Mahmoud Daneshmand. How
matching theory enables multi-access edge computing adaptive task scheduling in IIoT. IEEE
Network, pages 1–7, 2022.
12. Jiancheng Chi, Tie Qiu, Fu Xiao, and Xiaobo Zhou. ATOM: Adaptive task offloading with two-stage hybrid matching in MEC-enabled industrial IoT. IEEE Transactions on Mobile Computing, pages 1–17, 2023.
13. Ming Tang and Vincent WS Wong. Deep reinforcement learning for task offloading in mobile
edge computing systems. IEEE Transactions on Mobile Computing, 21(6):1985–1997, 2020.
14. Xinchen Lyu, Hui Tian, Cigdem Sengul, and Ping Zhang. Multiuser joint task offloading
and resource optimization in proximate clouds. IEEE Transactions on Vehicular Technology,
66(4):3435–3447, 2016.
15. Ying Ju, Yuchao Chen, Zhiwei Cao, Lei Liu, Qingqi Pei, Ming Xiao, Kaoru Ota, Mianxiong
Dong, and Victor CM Leung. Joint secure offloading and resource allocation for vehicular edge
computing network: A multi-agent deep reinforcement learning approach. IEEE Transactions
on Intelligent Transportation Systems, 2023.
16. Xiaobo Zhou, Shuxin Ge, Pengbo Liu, and Tie Qiu. Dag-based dependent tasks offloading in
MEC-enabled IoT with soft cooperation. IEEE Transactions on Mobile Computing, 2023.
17. Zhaolong Ning, Peiran Dong, Miaowen Wen, Xiaojie Wang, Lei Guo, Ricky Y. K. Kwok, and
H. Vincent Poor. 5G-enabled UAV-to-community offloading: Joint trajectory design and task
scheduling. IEEE Journal on Selected Areas in Communications, 39(11):3306–3320, 2021.
18. Yu Liu, Yong Li, Yong Niu, and Depeng Jin. Joint optimization of path planning and resource
allocation in mobile edge computing. IEEE Transactions on Mobile Computing, 19(9):2129–
2144, 2020.
19. Bo Yang, Xuelin Cao, Joshua Bassey, Xiangfang Li, and Lijun Qian. Computation offloading
in multi-access edge computing: A multi-task learning approach. IEEE Transactions on Mobile
Computing, 20(9):2745–2762, 2021.
20. Matthew Fahrbach, Zhiyi Huang, Runzhou Tao, and Morteza Zadimoghaddam. Edge-weighted
online bipartite matching. In 2020 IEEE 61st Annual Symposium on Foundations of Computer
Science (FOCS), pages 412–423, 2020.
21. Zhiyi Huang, Zhihao Gavin Tang, Xiaowei Wu, and Yuhao Zhang. Fully online matching II:
Beating ranking and water-filling. In 2020 IEEE 61st Annual Symposium on Foundations of
Computer Science (FOCS), pages 1380–1391, 2020.
22. Yunan Gu, Walid Saad, Mehdi Bennis, Merouane Debbah, and Zhu Han. Matching theory for
future wireless networks: fundamentals and applications. IEEE Communications Magazine,
53(5):52–59, 2015.
References 77
23. Aranyak Mehta and Debmalya Panigrahi. Online matching with stochastic rewards. In 2012
IEEE 53rd Annual Symposium on Foundations of Computer Science, pages 728–737. IEEE,
2012.
24. Hyame Assem Alameddine, Sanaa Sharafeddine, Samir Sebbah, Sara Ayoubi, and Chadi Assi.
Dynamic task offloading and scheduling for low-latency IoT services in multi-access edge
computing. IEEE Journal on Selected Areas in Communications, 37(3):668–682, 2019.
25. Sarhad Arisdakessian, Omar Abdel Wahab, Azzam Mourad, Hadi Otrok, and Nadjia Kara.
FoGMatch: an intelligent multi-criteria IoT -Fog scheduling approach using game theory.
IEEE/ACM Transactions on Networking, 28(4):1779–1789, 2020.
26. Lichao Yang, Heli Zhang, Xi Li, Hong Ji, and Victor CM Leung. A distributed computation
offloading strategy in small-cell networks integrated with mobile edge computing. IEEE/ACM
Transactions on Networking, 26(6):2762–2773, 2018.
27. Sladana Jošilo and György Dán. Decentralized algorithm for randomized task allocation in fog
computing systems. IEEE/ACM Transactions on Networking, 27(1):85–97, 2019.
28. Xiong Wang, Jiancheng Ye, and John C.S. Lui. Decentralized task offloading in edge
computing: A multi-user multi-armed bandit approach. In IEEE INFOCOM 2022—IEEE
Conference on Computer Communications, pages 1199–1208, 2022.
29. Liping Qian, Yuan Wu, Fuli Jiang, Ningning Yu, Weidang Lu, and Bin Lin. NOMA assisted
multi-task multi-access mobile edge computing via deep reinforcement learning for industrial
internet of things. IEEE Transactions on Industrial Informatics, 17(8):5688–5698, 2021.
30. A. Mehta, A. Saberi, U. Vazirani, and V. Vazirani. AdWords and generalized on-line matching.
In 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS’05), pages 264–
273, 2005.
31. Sameer Singh Chauhan and R. C. Joshi. A weighted mean time min-min max-min selective
scheduling strategy for independent tasks on grid. In 2010 IEEE 2nd International Advance
Computing Conference (IACC), pages 4–9, 2010.
32. Ismael Salih Aref, Juliet Kadum, and Amaal Kadum. Optimization of max-min and min-
min task scheduling algorithms using G.A in cloud computing. In 2022 5th International
Conference on Engineering Technology and its Applications (IICETA), pages 238–242, 2022.
33. Slad̄ana Jošilo and György Dán. Wireless and computing resource allocation for selfish
computation offloading in edge computing. In IEEE INFOCOM 2019-IEEE Conference on
Computer Communications, pages 2467–2475. IEEE, 2019.
34. Colin Funai, Cristiano Tapparello, and Wendi Heinzelman. Computational offloading for
energy constrained devices in multi-hop cooperative networks. IEEE Transactions on Mobile
Computing, 19(1):60–73, 2019.
35. Etienne Le Sueur and Gernot Heiser. Dynamic voltage and frequency scaling: The laws of
diminishing returns. In Proceedings of the 2010 international conference on Power aware
computing and systems, pages 1–8, 2010.
36. Greg Semeraro, Grigorios Magklis, Rajeev Balasubramonian, David H Albonesi, Sandhya
Dwarkadas, and Michael L Scott. Energy-efficient processor design using multiple clock
domains with dynamic voltage and frequency scaling. In Proceedings Eighth International
Symposium on High Performance Computer Architecture, pages 29–40. IEEE, 2002.
37. Zhaolong Ning, Peiran Dong, Xiangjie Kong, and Feng Xia. A cooperative partial computation
offloading scheme for mobile edge computing enabled internet of things. IEEE Internet of
Things Journal, 6(3):4804–4814, 2018.
38. Xiongwei Wu, Xiuhua Li, Jun Li, P.C. Ching, C.M. Leung, Victor, and Vincent Poor, H.
Caching transient content for IoT sensing: Multi-agent soft actor-critic. IEEE Transactions
on Communications, 69(9):5886–5901, 2021.
39. Quan Yuan, Jinglin Li, Haibo Zhou, Tao Lin, Guiyang Luo, and Xuemin Shen. A joint
service migration and mobility optimization approach for vehicular edge computing. IEEE
Transactions on Vehicular Technology, 69(8):9041–9052, 2020.
40. Haixia Peng and Xuemin Shen. Multi-agent reinforcement learning based resource manage-
ment in MEC- and UAV-assisted vehicular networks. IEEE Journal on Selected Areas in
Communications, 39(1):131–141, 2021.
78 3 Computation Offloading in Industrial Edge Computing
41. Pengbo Liu, Shuxin Ge, Xiaobo Zhou, Chaokun Zhang, and Keqiu Li. Soft actor-critic-
based DAG tasks offloading in multi-access edge computing with inter-user cooperation. In
Algorithms and Architectures for Parallel Processing—21st International Conference, ICA3PP
2021, Virtual Event, December, 2021, Proceedings, Part III, volume 13157, pages 313–327,
2021.
Chapter 4
Data Caching in Industrial Edge
Computing
Edge caching is a prominent research area and practical field, especially benefiting
from the emergence of data mining for IIoT operation control. Typically, before
processing an offloaded task, it is necessary to access relevant sensing data from
servers caching the needed information. This chapter initially introduces data
caching optimization in industrial edge computing systems, which is crucial when
applications rely on inferences from sensing data over specific historical periods.
Given the diverse sources and heterogeneous nature of data collected from numerous
sensors, this chapter then presents two caching solutions tailored for two common
types of data: latency-sensitive data and video streaming data.
4.1 Introduction
Caching the related databases and AI applications within the storage capacities of
edges is a promising way to enable the decision-making of devices in industrial edge
computing systems [1, 2]. The data to cache should be determined by the service requirements that direct device activity, such as a map that controls the movement of robots. The cached data can greatly reduce the latency of end devices and
alleviate traffic loads in backhaul links [3].
In fact, the QoE improvement caused by data caching significantly relies on
the prediction accuracy of devices’ requirements, which is reflected by the data
popularity, i.e., how frequently the data is requested. Therefore, current research devotes attention to obtaining accurate data popularity to improve the Cache Hit Rate (CHR). ML techniques, including DL, DRL, transfer learning [4], and so on, are widely adopted to intelligently cache data by predicting the underlying data popularity from given historical requests. Meanwhile, some studies reveal that data diversity, when
leveraged through cooperation among edges, can help reduce service redundancy
and enhance the CHR [5].
Specifically, in industrial edge computing systems, when receiving a request
from an end user, the edge checks whether the required data has been cached. The
request is served instantly by an edge that caches the required data and has sufficient computation resources. Otherwise, this edge forwards the service request to a neighboring edge that satisfies the above conditions, the so-called edge cooperation [6–8]. In extreme circumstances, i.e., when no edge can respond to the request, it is forwarded to the centralized cloud at the cost of high latency.
In industrial settings, the freshness of data, or its age of information, significantly
impacts decision-making in production processes. Outdated data delay the decision-
making and adversely affect production. Additionally, with the increasing adoption
of AR and VR technologies in industrial production, the edge often needs to cache
large volumes of videos. Therefore, designing effective video caching strategies
that minimize video streaming latency while maintaining high video quality is
a substantial challenge. Balancing these requirements is crucial for the smooth
operation and efficiency of industrial processes that rely on real-time data and high-
quality video content.
This chapter introduces caching methods in two different scenarios. Section 4.2
pays attention to caching the freshness-aware data, e.g., the High-Definition (HD)
map, incorporating download latency and freshness. We leverage a distributed Multi-Agent Multi-Armed Bandit (MAMAB) algorithm in the decision-making process. Simulation results demonstrate that our algorithm outperforms existing algorithms [9].
Section 4.3 focuses on caching QoE-aware data, e.g., video, which further leads
to a video’s bitrate selection problem. We formulate this problem as a multi-
agent cooperative MDP and solve it by Field-of-View (FoV)-aware multi-agent soft
actor–critic (FA-MASAC). The extensive simulation results show the superiority of
FA-MASAC in terms of average QoE and so on [10].
4.2 Freshness-Aware Caching with Distributed MAMAB

Our goal is to make a trade-off between download latency and data freshness, i.e., to minimize the total cost.
4.2.2.1 Overview
As shown in Fig. 4.1, in the Internet of Vehicles (IoV) scenario, there are usually
multiple RSUs and devices distributed in the geographical map and one remote
cloud that stores the entire HD map. We assume the geographical map is composed
of M blocks, each of which has an RSU and its individual HD map. Thus, we use $\mathcal{M} = \{1, \cdots, M\}$ to denote the set of RSUs, each with maximum storage S. The transmission power and channel gain when communicating with RSU m are denoted by $P_m$ and $g_m$, respectively.
HD map comprises the basic and advanced layers, containing static and dynamic
information for different driving control functions [18]. For example, the basic layer
guides the coarse-level planning of paths, while the fine-grained paths are planned
with the support of the advanced layer. Based on this, as shown in Table 4.1, we
divide the HD map into four sub-maps [19], i.e., the basic layer with static information $f_{bs}$ and dynamic information $f_{bd}$, and the advanced layer with static information $f_{as}$ and dynamic information $f_{ad}$. Let $\mathcal{F} = \{f_{m,bs}, f_{m,bd}, f_{m,as}, f_{m,ad}\}_{m=1}^{M}$ denote the file set of the sub-maps. We assume that the data size of each sub-map f equals $s_f$.
Each vehicle v can communicate with the RSU of its block and its neighboring vehicles to obtain the HD map. Its transmission power and channel gain are $P_v$ and $g_v$, respectively. Note that vehicle v also caches sub-maps locally in time slot t, denoted by $\mathcal{U}_v^t$.
In time slot t, RSUs adaptively cache the sub-maps from the entire HD map in
the cloud. There are three ways to obtain the required sub-maps for vehicles, i.e.,
request to the cloud via V2I, RSU via V2I, and neighboring vehicles via the V2V.
To prevent interference, we assign distinct communication spectrums for V2V and
V2I. We use $\mathcal{I}_v^t = \{1, \cdots, K\}$ to represent the candidate vehicles that are able to communicate with vehicle v in time slot t. The operator can make the caching
decision for RSUs and vehicles per time slot:
• Caching decision for RSUs $\beta^t \in \{0, 1\}^{M \times F}$: $\beta^t_{m,f} = 1$ indicates that sub-map f has been cached in RSU m, while $\beta^t_{m,f} = 0$ otherwise.
During driving, each vehicle makes driving control decisions according to the planned path. Path planning is a coarse-level control requiring the basic layers $f_{bs}$ and $f_{bd}$ of the blocks along all possible paths, denoted by $\mathcal{F}_v^{t,l}$. Driving is a fine-grained control based on both the basic and advanced layers, used to deal with the dynamic environment and to find safe and efficient target blocks $\mathcal{F}_v^{t,h}$.
Thus, we use $\mathcal{N}_v^t = \mathcal{F}_v^{t,l} \cup \mathcal{F}_v^{t,h}$ to represent the required sub-maps of vehicle v in time slot t, which helps to determine the set of sub-maps $\mathcal{Q}_v^t$ that should be cached, where $\varepsilon_v^t$ and $(\mathcal{N}_v^{t-1} - \mathcal{N}_v^t)$ are the sets of stale sub-maps and needless sub-maps of vehicle v in time slot t, respectively.
The specific cost function, defined with driving safety in mind, is a weighted sum of download latency and freshness loss. First, we use $X^t \in \{0,1\}^{V \times (M+V+1) \times F}$ to denote where vehicles request the sub-maps. More specifically, $x^t_{v,k,f} = 1$ indicates the request is sent to vehicle k, $x^t_{v,m,f} = 1$ indicates it is sent to RSU m, and $x^t_{v,0,f} = 1$ indicates it is sent to the cloud.
Once sub-map f is fetched from vehicle k with bandwidth $B_v$ via a V2V link, the download latency is

$$d^t_{v,k,f} = \frac{s_f}{B_v \log_2\left(1 + \frac{P_v \cdot g_v}{N}\right)}. \tag{4.4}$$

Similarly, for caching from the cloud, the download latency is

$$d^t_{v,0,f} = \frac{s_f}{B_0 \log_2\left(1 + \frac{P_0 \cdot g_0}{N}\right)}. \tag{4.5}$$

Here, $B_0$, $P_0$, and $g_0$ are the corresponding bandwidth, transmission power, and channel gain, respectively.
Next, we define the freshness loss to indicate whether the sub-map f on vehicle v needs updating in time slot t:

$$l^t_{v,k,f} = \begin{cases} \dfrac{a^t_{k,f}}{10\tau}, & \text{if } k \in \mathcal{I}^t_v \text{ and } a^t_{k,f} \le \tau, \\ 0, & \text{otherwise.} \end{cases} \tag{4.6}$$

Specifically, we set $l^t_{v,0,f}$ and $l^t_{v,m,f}$ to 0 to capture the constant updating of the sub-maps cached in the cloud server and the RSUs. Combining the download latency with the freshness loss, the cost $C^t_{v,i,f}$ for the request decision of a certain vehicle is

$$C^t_{v,i,f} = \omega \cdot d^t_{v,i,f} + (1 - \omega) \cdot l^t_{v,i,f}, \quad i \in \{0\} \cup \mathcal{M} \cup \mathcal{V}. \tag{4.7}$$
Note that we use a weighting factor .ω ∈ [0, 1] to make a trade-off between latency
and freshness.
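To make the cost model concrete, the following Python sketch evaluates Eqs. (4.4)–(4.7) for a single request. It is a minimal illustration: the function names and the parameter values are ours, not the chapter's, and the freshness rule simply transcribes Eq. (4.6).

```python
import math

def download_latency(s_f, bandwidth, power, gain, noise):
    # Eqs. (4.4)-(4.5): Shannon-capacity-based latency,
    # d = s_f / (B * log2(1 + P * g / N)).
    rate = bandwidth * math.log2(1.0 + power * gain / noise)
    return s_f / rate

def freshness_loss(age, tau, from_vehicle):
    # Eq. (4.6): only V2V sources incur a freshness loss; cloud and RSU
    # copies are treated as constantly updated (loss = 0).
    if from_vehicle and age <= tau:
        return age / (10.0 * tau)
    return 0.0

def request_cost(s_f, bandwidth, power, gain, noise,
                 age, tau, from_vehicle, omega=0.5):
    # Eq. (4.7): weighted sum of download latency and freshness loss.
    d = download_latency(s_f, bandwidth, power, gain, noise)
    l = freshness_loss(age, tau, from_vehicle)
    return omega * d + (1.0 - omega) * l

# Illustrative call: a 100-Mbit sub-map over a 50-MHz V2V link,
# with a 5-dB (about 3.16x linear) channel gain.
print(request_cost(s_f=100.0, bandwidth=50.0, power=300e-3, gain=3.16,
                   noise=1e-6, age=2, tau=5, from_vehicle=True))
```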
Finally, the cost of vehicle v in block m caching sub-map f in time slot t is

$$D^t_{v,m,f} = x^t_{v,m,f}\left[\beta^t_{m,f}\, C^t_{v,m,f} + \left(1 - \beta^t_{m,f}\right) C^t_{v,0,f}\right] + \sum_{k\in\mathcal{I}^t_v} x^t_{v,k,f}\,\alpha^t_{k,f}\, C^t_{v,k,f}. \tag{4.8}$$
Based on Eq. (4.8), we formulate the caching problem, which makes the caching decision $\beta^t$ and the vehicle request decision $x^t$ to minimize the total cost, as follows:

$$\mathbf{P1}: \min_{x^t,\,\beta^t} \sum_{m=1}^{M}\sum_{v\in\mathcal{V}^t_m}\sum_{f\in\mathcal{Q}^t_v} D^t_{v,m,f} \tag{4.9}$$

$$\text{s.t.}\quad x^t_{v,k,f} \le \alpha^t_{k,f}, \tag{4.9a}$$

$$\sum_{k\in\mathcal{I}^t_v} x^t_{v,k,f} + x^t_{v,m,f} = 1, \tag{4.9b}$$

$$\sum_{f\in\mathcal{F}} \beta^t_{m,f}\, s_f \le S, \tag{4.9c}$$

$$\beta^t_{m,f}\in\{0,1\},\quad x^t_{v,k,f},\, x^t_{v,m,f}\in\{0,1\}. \tag{4.9d}$$
Constraint (4.9a) specifies that the vehicle must request a sub-map from the
vehicle that has cached it. Constraint (4.9b) stipulates that vehicle v is limited to
acquiring only one instance of sub-map f . Constraint (4.9c) ensures that the storage requirement of the cached sub-maps does not exceed the RSU's capacity. Directly solving P1 is infeasible given the vast solution space, especially with the intricate interplay between caching and request decisions.
Therefore, in the subsequent discussion, we make the caching and request decisions by sequentially solving two decomposed subproblems, i.e., the vehicle request problem and the caching placement problem. Initially, we randomly cache the contents based on $\beta^t$ to find a potentially optimal request decision $x_o^t$. This decision is further used as the
basis of the caching placement problem, which is solved by a distributed MAMAB
algorithm.
Since the RSU is assumed to have cached the requested sub-maps when making request decisions in each time slot, i.e., $\beta^t = 1$, we transform Eq. (4.8) into the cost

$$D'^t_{v,m,f} = x^t_{v,m,f}\, C^t_{v,m,f} + \sum_{k\in\mathcal{I}^t_v} x^t_{v,k,f}\,\alpha^t_{k,f}\, C^t_{v,k,f}. \tag{4.10}$$

The vehicle request subproblem is then formulated as

$$\mathbf{P2}: \min_{x^t} \sum_{m=1}^{M}\sum_{v\in\mathcal{V}^t_m}\sum_{f\in\mathcal{Q}^t_v} D'^t_{v,m,f}. \tag{4.11}$$
To deal with the high complexity, we propose a freshness-aware request method based on matching. First, for each vehicle v covered by RSU m, we compute the cost $C^t_{v,m,f}$ of receiving sub-map f from RSU m and the cost $C^t_{v,k,f}$ of receiving it from vehicle k once $\alpha^t_{k,f} = 1$. Subsequently, for vehicle v, we derive the cost difference $\Delta L^t_k = C^t_{v,m,f} - C^t_{v,k,f},\ k \in \mathcal{I}^t_v$:

• If $\Delta L^t_k > 0$, vehicle k is regarded as a candidate. All candidates are sorted by this value in descending order to form a candidate set $\psi^t_v$.
• Otherwise, vehicle v will cache the sub-map from the RSU.
We iteratively find the maximum cost discrepancy in $\psi^t_v$ and decide whether or not to cache sub-maps by receiving contents from other vehicles via V2V links. Here, the convergence of the iteration is determined by the average cost.
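A compact sketch of the candidate-set construction just described, for one vehicle and one sub-map, is given below; it compresses the iterative procedure into a single greedy pick and omits the average-cost convergence test, and all names are illustrative.

```python
def freshness_aware_request(cost_rsu, cost_v2v, cached):
    """Pick a source for one sub-map of one vehicle (sketch).

    cost_rsu : C^t_{v,m,f}, cost of fetching the sub-map from the RSU
    cost_v2v : dict {k: C^t_{v,k,f}} for candidate vehicles k in I_v^t
    cached   : dict {k: bool}, alpha^t_{k,f} = 1 if vehicle k caches it
    """
    # Candidate set psi_v^t: vehicles with positive cost difference
    # Delta L^t_k = C_{v,m,f} - C_{v,k,f}, sorted in descending order.
    psi = sorted(
        (k for k, c in cost_v2v.items() if cached.get(k) and cost_rsu - c > 0),
        key=lambda k: cost_rsu - cost_v2v[k],
        reverse=True,
    )
    # Serve from the vehicle with the maximum cost discrepancy if any;
    # otherwise fall back to the RSU via V2I.
    return psi[0] if psi else "rsu"

# Example: vehicle 7 offers the largest saving, so it is selected.
print(freshness_aware_request(
    cost_rsu=1.2,
    cost_v2v={3: 1.5, 7: 0.4, 9: 0.9},
    cached={3: True, 7: True, 9: True},
))
```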
For sub-map f , a high number of vehicles in block m caching it via V2I indicates large room for reducing the cost by caching it at RSU m. Thus, the corresponding reward for a caching decision can be defined as

$$r^t_{m,f} = \sum_{v\in\mathcal{V}^t_m} \mathbb{I}\left(f, G^t_{v,m}\right), \tag{4.12}$$

where $G^t_{v,m}$ is the set of sub-maps that vehicle v requests from RSU m in time slot t, $\mathbb{I}(f, G^t_{v,m}) = 1$ if $f \in G^t_{v,m}$, and $\mathbb{I}(f, G^t_{v,m}) = 0$ otherwise. Hence, we transform the caching placement problem into

$$\mathbf{P3}: \max_{\beta^t} \sum_{t=1}^{T}\sum_{m=1}^{M}\sum_{f\in\mathcal{F}} \beta^t_{m,f}\, r^t_{m,f} \tag{4.13}$$

$$\text{s.t. } (4.9c),\quad \beta^t_{m,f}\in\{0,1\}.$$
The increasing number of RSUs also greatly expands the action space, leading to additional latency under a traditional centralized approach. Furthermore, to limit the additional communication overhead, we implement a distributed MAMAB method, where each RSU is regarded as an agent and each sub-map as an arm.
Meanwhile, we employ the Upper Confidence Bound (UCB) to maintain a balance
between exploration and exploitation.
Initially, RSU m randomly caches sub-maps and updates the caching frequency $J^t_{m,f}$ of each sub-map f to calculate its average reward $\bar{R}^t_{m,f}$. The UCB index of sub-map f is

$$\hat{R}^t_{m,f} = \bar{R}^{t-1}_{m,f} + \sqrt{\frac{3\log\left(\gamma_f^2\, t\right)}{2\, J^{t-1}_{m,f}}}, \tag{4.14}$$

where $\gamma_f^2$ is the maximum reward of RSU m caching sub-map f . Since the exploration count of sub-map f is positively correlated with $J^t_{m,f}$, a large value of the term $\sqrt{3\log(\gamma_f^2 t)/(2J^{t-1}_{m,f})}$ leads to a high probability of selecting sub-map f . As sub-map f is explored, $J^t_{m,f}$ grows and the index becomes dominated by $\bar{R}^t_{m,f}$, which signifies the exploitation phase for sub-map f .
Each RSU m optimizes its caching strategy by solving $\max \sum_{f=1}^{F} \beta^t_{m,f}\, \hat{R}^t_{m,f}$ under the storage limitation. Then, the average reward is updated based on the caching decisions: if $\beta^t_{m,f} = 1$,

$$\bar{R}^t_{m,f} = \frac{\bar{R}^{t-1}_{m,f}\, J^{t-1}_{m,f} + r^t_{m,f}}{J^{t-1}_{m,f} + 1} \quad \text{and} \quad J^t_{m,f} = J^{t-1}_{m,f} + 1.$$

Otherwise, $\bar{R}^t_{m,f} = \bar{R}^{t-1}_{m,f}$ and $J^t_{m,f} = J^{t-1}_{m,f}$.
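The per-arm bookkeeping of Eq. (4.14) and the update rules above can be sketched per RSU as follows. The greedy fill of the storage budget is a simplifying assumption (the chapter only states that each RSU maximizes the summed UCB indices under its storage limit, which is a knapsack problem), and initializing every count to one stands in for the initial round of random caching.

```python
import math
from collections import defaultdict

class MAMABCachingAgent:
    """One RSU as a UCB agent; each sub-map f is an arm (sketch)."""

    def __init__(self, submap_sizes, storage, gamma_sq=1.0):
        self.sizes = submap_sizes            # {f: s_f}
        self.storage = storage               # S
        self.gamma_sq = gamma_sq             # gamma_f^2, assumed known
        self.mean = defaultdict(float)       # average reward R_bar_{m,f}
        self.count = defaultdict(lambda: 1)  # caching frequency J_{m,f}

    def ucb(self, f, t):
        # Eq. (4.14): exploitation term plus exploration bonus.
        bonus = math.sqrt(3 * math.log(self.gamma_sq * t) / (2 * self.count[f]))
        return self.mean[f] + bonus

    def decide(self, t):
        # Greedily cache the sub-maps with the highest UCB indices
        # until the storage budget S is exhausted.
        cached, used = [], 0
        for f in sorted(self.sizes, key=lambda f: self.ucb(f, t), reverse=True):
            if used + self.sizes[f] <= self.storage:
                cached.append(f)
                used += self.sizes[f]
        return cached

    def update(self, cached, rewards):
        # Running-mean update for cached arms only, as described above.
        for f in cached:
            j = self.count[f]
            self.mean[f] = (self.mean[f] * j + rewards.get(f, 0.0)) / (j + 1)
            self.count[f] = j + 1

# One slot of an illustrative run.
agent = MAMABCachingAgent({"f1": 2, "f2": 1, "f3": 3}, storage=4)
chosen = agent.decide(t=1)
agent.update(chosen, {"f1": 3.0, "f2": 1.0, "f3": 2.0})
```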
The bandwidths allocated for vehicles and RSUs are configured at 50 and 150 MHz, respectively. A vehicle's communication range spans 100 m. Additionally, the transmission powers of vehicles and RSUs are set to 300 mW and 2 W, respectively. The Gaussian channel noise and channel gain are set to $10^{-6}$ mW and 5 dB, respectively [21]. The simulation spans $T = 20000$ slots, with the threshold $\tau$ set to 5 time slots and $\omega$ set to 0.5. The following benchmarks are employed for comparative analysis:
• Caching without V2V [17]: RSUs independently cache sub-maps to optimize
the average caching reward without leveraging V2V collaboration. The dis-
tributed MAMAB method is employed for caching decisions.
• Latency-Aware Caching [15]: Vehicle request decisions prioritize minimizing
download latency, disregarding the freshness of dynamic sub-maps. Both V2I
and V2V communications are taken into account.
• Location-Aware Caching: RSUs cache sub-maps based on proximity, starting
from nearby blocks and progressing outward until their cache capacity is reached.
Figures 4.2, 4.3, and 4.4 depict the total cost, average freshness loss, and average
download latency across all methods, where the cache size of each RSU varies from
1 to 5 Gbits. Caching without V2V exhibits the lowest average freshness loss, albeit
with the highest download latency, as it relies solely on RSUs or the cloud platform
for sub-map caching. Conversely, latency-aware caching yields the highest average
freshness loss but boasts the lowest download latency, prioritizing this metric
in vehicle request decisions. Location-aware caching, leveraging the MAMAB
algorithm, achieves an average freshness loss comparable to caching without
V2V, showcasing the efficacy of the MAMAB algorithm. However, its download
latency ranks second-highest due to suboptimal caching decisions. The proposed
freshness-aware caching method strikes a balance between average freshness loss
and download latency, resulting in the lowest total cost. Additionally, an observed
trend indicates a decrease in total cost across all methods as the RSU cache size increases, attributed to the larger number of sub-maps that can be cached at the RSUs.
Figures 4.5, 4.6, and 4.7 further illustrate the performance of the four methods, in
terms of cost, freshness loss, and download latency under varying RSU bandwidths.
As previously discussed, the proposed method achieves the lowest total cost.
Meanwhile, as the RSU bandwidth increases, the download latency for obtaining sub-maps decreases, and so does the total cost.
The simulation outcomes demonstrate the superiority of the proposed freshness-aware caching method over the other caching strategies. However, in practical scenarios, it is imperative to recognize that not all vehicles may be inclined to share their cached sub-maps via V2V links.

4.3 Multicategory Video Caching

The rapid growth of 360-degree video traffic places a heavy burden on the wireless access links and the core network. For 360-degree video streaming, it is difficult to enhance the QoE for users, with video quality and rebuffering being two pivotal metrics.
Previous studies focus on tile-based streaming to augment video quality and decrease rebuffering, ultimately enhancing QoE [23, 24]. The fundamental idea is to spatially divide the video into multiple tiles, each of which can flexibly choose a bitrate based on the available bandwidth [25]. Furthermore, in 360-degree video, users often focus on a specific part of the video within their FoV [26, 27]. Consequently, a series of efficient methods for FoV prediction has sprung up, e.g., Linear Regression (LR) [28, 29], DL-based algorithms [25, 30], and so on. These methods transmit the tiles in the FoV at a high bitrate, while tiles outside the FoV are transmitted at a low bitrate or omitted entirely, contributing to an improvement in video quality.
These methods rely on temporal correlation between frames, which makes them difficult to apply to mobile VR scenarios, as the freedom of movement in VR diminishes this temporal correlation. Numerous studies have introduced saliency-driven approaches to enhance the QoE of 360-degree video. These solutions capitalize on the substantial correlation between the historical view trajectory and pixel saliency within the video [31, 32]. Moreover, certain works have delved into a more refined assessment of the significance of the FoV in bitrate selection, which plays an essential role in maintaining video quality with less bandwidth [33–35].
However, although the above approaches try to improve overall performance with the given resources, they cannot break the bottleneck when resources are heavily limited; here, edge caching is regarded as a potential solution [36, 37]. Several studies make caching and bitrate selection decisions jointly to further improve QoE [38, 39]. Given a constant bitrate selection, as shown in Fig. 4.8, tiles in the FoV can be cached in advance at an edge, allowing the user to access the requested content promptly and thus decreasing latency. Otherwise, i.e., when the bitrate selection changes adaptively according to the bandwidth, tiles in the FoV are cached at the edge in a high-bitrate format. It is evident that decisions regarding edge caching and bitrate selection directly affect the final video quality and latency.
However, the limitation of existing QoE-driven strategies lies in their uniform QoE function; in fact, distinct applications exhibit different preferences among the factors used to estimate QoE. Research has demonstrated that different video genres place different importance on aspects such as video quality and rebuffering [40–42]. Taking games as an example, a higher weight tends to be assigned to rebuffering, as users prefer smoothness of the gaming experience, while for a landscape video, a higher weight for video quality is warranted, given the preference for distortion-free landscape images.
Figure 4.9 illustrates that the QoE estimation from one perspective is inadequate
for accommodating the various requirements of different categories (e.g., game and
landscape). The three strategies are detailed below:
• The quality-first (QF) strategy aims to cache a few portions of tiles with high
quality at the edge. Note that this approach may fall into the local optimum for
users engaging in games.
[Fig. 4.8 Tile-based 360-degree video with multiple bitrates: the cloud server stores the tiles, and bitrate selection and edge caching decisions are made for tiles inside and outside the FoV. Fig. 4.9 Edge caching and QoE for game and landscape videos under tiles with high, low, and adaptive bitrates]
• The rebuffer-first (RF) strategy tends to store as many tiles as possible at low quality. It can offer a smoother experience for users engaging in games but may not satisfy the requirements of landscape videos.
• The optimal strategy selects different qualities for different categories to satisfy the corresponding requirements. By doing so, there is considerable potential for enhancing the average QoE across the board.
Addressing multicategory 360-degree video streaming involves partitioning
storage space for various video categories and applying a dedicated QoE-driven
strategy to each category using its specific QoE function. However, rigidly dividing the cache space is infeasible under random request states. Moreover, the decisions
regarding caching and bitrate selection suffer from a huge decision space as they
are greatly coupled, which poses a formidable challenge in maximizing the average
QoE for users.
We formulate the edge caching and bitrate selection for 360-degree video streaming
problem as a multicategory optimization problem in an industrial edge computing
system.
As shown in Fig. 4.10, the multicategory 360-degree video streaming system is composed of a remote cloud, an edge server, and U users, denoted by $\mathcal{U} = \{1, \cdots, u, \cdots, U\}$. Specifically, let C denote the caching capacity of the edge server. The link between the edge server and the users is wireless, while that between the edge server and the cloud is a high-capacity backhaul link.
The remote cloud caches all video categories, denoted by $\mathcal{O} = \{1, \cdots, o, \cdots, O\}$. Each video category contains multiple videos, denoted by $\mathcal{V}_o = \{1, \cdots, z, \cdots, Z\}$. The video duration is divided into multiple segments $\mathcal{T} = \{1, \cdots, t, \cdots, T\}$, each of which has a tile set $\mathcal{M} = \{1, \cdots, m, \cdots, M\}$. Each tile has $K \in \mathcal{K}$ bitrates for selection, where the k-th bitrate requires caching capacity $c_k$. The transmission latencies between the edge and the users and between the edge and the cloud are denoted by $d^E$ and $d^C$, respectively.
Fig. 4.10 Multicategory 360-degree video streaming system: the edge node makes bitrate selection and caching decisions for multicategory 360-degree videos serving game users (User1) and landscape users (User2)
We use a lightweight two-layer LSTM model to predict the FoV [24]. The output of the LSTM model for the t-th segment of user u is $\mathcal{I}_{u,t} = \{I_{u,t,1}, \cdots, I_{u,t,A}\}$, where $a \in \mathcal{A} = \{1, \cdots, A\}$ is the index of tiles.
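A minimal PyTorch sketch of such a two-layer LSTM viewport predictor follows. The input features (per-tile viewing probabilities over a history window), the hidden size, and the top-A selection are our assumptions for illustration; the exact architecture of the model in [24] may differ.

```python
import torch
import torch.nn as nn

class FoVPredictor(nn.Module):
    """Two-layer LSTM that scores the M tiles of the next segment."""

    def __init__(self, num_tiles=24, hidden=128, fov_tiles=9):
        super().__init__()
        self.lstm = nn.LSTM(num_tiles, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, num_tiles)
        self.fov_tiles = fov_tiles  # A

    def forward(self, history):
        # history: (batch, seq_len, num_tiles) past viewing probabilities.
        out, _ = self.lstm(history)
        logits = self.head(out[:, -1, :])   # use the last time step
        return torch.sigmoid(logits)        # per-tile viewing probabilities

    def predict_fov(self, history):
        # I_{u,t}: indices of the A most likely tiles of the next segment.
        probs = self.forward(history)
        return torch.topk(probs, self.fov_tiles, dim=-1).indices

# Example: predict the FoV from a 10-segment history for one user.
model = FoVPredictor()
print(model.predict_fov(torch.rand(1, 10, 24)))
```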
The operator decides whether each tile is cached at the edge and its corresponding bitrate level. The bitrate selection decisions for category o are represented by $x^o = \{x^{o,u} \mid u \in \mathcal{U}_o, o \in \mathcal{O}\}$. From the perspective of users, we further denote $x^{o,u} = \{x_{u,t,a} \mid t \in \mathcal{T}, a \in \mathcal{A}\}$. Here, $x_{u,t,a} \in \mathcal{K}$ is an integer variable denoting the bitrate selected for the a-th tile of segment t's FoV in user u's viewpoint. Moreover, let $y^o = \{y^{o,u} \mid u \in \mathcal{U}_o, o \in \mathcal{O}\}$ denote the edge caching decisions of all categories. Here $y^{o,u} = \{y^x_{u,t,a} \mid t \in \mathcal{T}, a \in \mathcal{A}, x \in \mathcal{K}\}$ is a set of binary variables, where $y^x_{u,t,a} = 1$ denotes that the a-th tile of segment t viewed by u is cached at the x-th bitrate, and $y^x_{u,t,a} = 0$ otherwise. Based on the above concepts, we decompose QoE into the following aspects:
Average FoV Quality Given that the user mainly perceives content in the FoV, which significantly impacts the QoE, and utilizing the FoV prediction, the average FoV quality of segment t for user u is computed as follows:

$$QoE^1_{u,t} = \frac{\sum_{a=1}^{A} q(x_{u,t,a})}{A}. \tag{4.15}$$

Here, $q(\cdot)$ maps the bitrate selection to the video quality experienced by users, i.e.,

$$q(x) = R_x / R_K. \tag{4.16}$$
$$QoE^3_{u,t} = \frac{1}{A} \sum_{a=1}^{A} \sum_{j \in G(a)} \left| q(x_{u,t,a}) - q(x_{u,t,j}) \right|, \tag{4.18}$$
where $D_{u,t}$ and $B_{u,t}$ are the download latency and buffer length of user u when downloading segment t. Note that the tiles outside the FoV are cached at the lowest bitrate; let $D'_{u,t} = (M - A) \cdot l_1 \cdot d^E$ denote their transmission latency. The download latency is then

$$D_{u,t} = \sum_{a=1}^{A} \left[ y^x_{u,t,a} \cdot l_x \cdot d^E + \left(1 - y^x_{u,t,a}\right) \cdot l_x \cdot d^C \right] + D'_{u,t}, \tag{4.20}$$

where $l_x$ is the size of a tile at the x-th bitrate. Monitoring the buffer length of a user is essential for calculating rebuffer events, and thus we track the variation of the buffer length across segments.
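The QoE terms and the latency model above can be pulled together in a few lines of Python. Eqs. (4.15), (4.16), (4.18), and (4.20) are transcribed directly; the tile sizes (bitrate times the 1-s segment duration) and the chain-shaped neighborhood G(a) are illustrative assumptions.

```python
def q(x, rates):
    # Eq. (4.16): normalized quality of bitrate level x, q(x) = R_x / R_K.
    return rates[x] / rates[-1]

def fov_quality(x_sel, rates):
    # Eq. (4.15): average quality over the A tiles of the predicted FoV.
    return sum(q(x, rates) for x in x_sel) / len(x_sel)

def spatial_variation(x_sel, neighbors, rates):
    # Eq. (4.18): mean absolute quality difference between each FoV tile
    # and its neighboring tiles G(a).
    return sum(
        abs(q(x_sel[a], rates) - q(x_sel[j], rates))
        for a in range(len(x_sel)) for j in neighbors[a]
    ) / len(x_sel)

def download_latency(x_sel, cached, sizes, d_edge, d_cloud, m_total, a_fov):
    # Eq. (4.20): cached FoV tiles are served by the edge, the rest by the
    # cloud; non-FoV tiles come from the edge at the lowest bitrate.
    fov = sum(sizes[x] * (d_edge if cached[a] else d_cloud)
              for a, x in enumerate(x_sel))
    return fov + (m_total - a_fov) * sizes[0] * d_edge  # second term: D'_{u,t}

# Numbers mirroring the simulation setup: 4 bitrate levels, M = 24 tiles,
# A = 9 FoV tiles, d^E = 1/14 and d^C = 1/2.9 s/Mbit (tile sizes assumed).
rates = [2, 5, 8, 16]                      # Mbps
sizes = [r * 1.0 for r in rates]           # Mbit per 1-s tile
x_sel = [3, 3, 2, 2, 3, 2, 1, 1, 2]
cached = [True, True, False, True, False, True, True, False, True]
neighbors = {a: [j for j in (a - 1, a + 1) if 0 <= j < 9] for a in range(9)}
print(fov_quality(x_sel, rates), spatial_variation(x_sel, neighbors, rates))
print(download_latency(x_sel, cached, sizes, 1 / 14, 1 / 2.9, 24, 9))
```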
Based on Eq. (4.22), we make edge caching and bitrate selection decisions according to the predicted FoV so as to maximize the average QoE, which can be formulated as

$$\mathbf{P1}: \max_{x,\,y} \frac{1}{T} \sum_{t=1}^{T} \frac{1}{O} \sum_{o=1}^{O} \frac{1}{|\mathcal{U}_o|} \sum_{u\in\mathcal{U}_o} QoE^t_u, \tag{4.23}$$

$$\text{s.t. } \sum_{k=1}^{K}\sum_{o=1}^{O} f_{o,k,t} \cdot c_k \le C, \quad \forall t \in \mathcal{T}. \tag{4.24}$$
It can be seen that P1 exhibits typical Markov features, as the video experience is strongly correlated across consecutive epochs. Hence, we transform P1 into a multi-agent MDP, where each video category is regarded as an agent so as to deal with the different QoE requirements among categories. The agents make decisions collaboratively to maximize the long-term QoE.

State S In the given system, the state s comprises the local observations of all agents, represented as $s = \{s_1, \cdots, s_o, \cdots, s_O\}$. Each observation $s_o$ comprises the weight of the category $W_o$, the request state $R_{u,t}$, the buffer length $B_{u,t}$, and the user set of agent o after FoV prediction, where $R_{u,t}$ is composed of the video id, the segment id, and the predicted viewing probabilities of all tiles.
Action A The actions $a = \{a_1, \cdots, a_o, \cdots, a_O\}$ of the agents comprise the bitrate selection and edge caching decisions, i.e., $a_o = \{x_{u,t}, y_{u,t} \mid u \in \mathcal{U}_o\}$, where $x_{u,t} = \{x_{u,t,1}, \cdots, x_{u,t,A}\}$ represents the bitrate selection for the tiles in the FoV of user u, and $y_{u,t} = \{y_{u,t,1}, \cdots, y_{u,t,A}\}$ is the corresponding edge caching decision.
Reward Each agent seeks to optimize the average QoE within a given state $s^t_o$. The overall reward obtained by agent o is calculated as

$$QoE^{ave}_{o,t} = \begin{cases} \dfrac{1}{|\mathcal{U}_o|} \displaystyle\sum_{u\in\mathcal{U}_o} \left( QoE^t_u + QoE^b_u \right), & 0 < C' \le C, \\ -5, & C' > C, \end{cases} \tag{4.28}$$

where

$$QoE^b_u = \begin{cases} 0, & 0 \le B^t_u \le B^{max}_u, \\ -2, & \text{otherwise}, \end{cases}$$

is a penalty term applied when the buffer length of user u violates its limit, and $C'$ is the cache capacity used by the joint action. Note that when $C' > C$, i.e., the caching capacity constraint is violated, the average QoE of agent o is set to $QoE^{ave}_{o,t} = -5$.
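Eq. (4.28) translates directly into a per-agent reward function; a minimal sketch, with illustrative variable names, is given below.

```python
def agent_reward(qoe_per_user, buffer_per_user, b_max, used_cache, cache_cap):
    """Reward of one category agent, following Eq. (4.28).

    qoe_per_user   : {u: QoE_u^t} for the users of this category
    buffer_per_user: {u: B_u^t}
    used_cache     : C', cache capacity consumed by the joint action
    """
    if used_cache > cache_cap:
        return -5.0  # hard penalty for violating the cache capacity C
    total = 0.0
    for u, qoe in qoe_per_user.items():
        # QoE^b_u: buffer penalty term from Eq. (4.28).
        penalty = 0.0 if 0 <= buffer_per_user[u] <= b_max else -2.0
        total += qoe + penalty
    return total / len(qoe_per_user)

# Example: two users, one of them with an overflowing buffer.
print(agent_reward({"u1": 0.9, "u2": 0.7}, {"u1": 3.0, "u2": 6.5},
                   b_max=5.0, used_cache=80, cache_cap=100))
```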
Nevertheless, estimating the state-transition probability function of P1 is a significant issue, especially before the action is taken, because the equilibrium is further complicated by the dynamic state as well as the greatly coupled caching and bitrate selection decisions.
[Figure: the FA-MASAC framework, with per-category actor networks (observing buffer size and cache size), double Q-value critic networks, and batch sampling from a shared replay buffer]
In the simulations, the transmission time from the edge to a user is $d^E = 1/14$ s/Mbit, while the corresponding time from the remote cloud to a user is $d^C = 1/2.9$ s/Mbit. The edge is equipped with a cache capacity capable of storing 20% of the 360-degree videos in the generated video library. To streamline the analysis, we assume that all users commence video playback simultaneously, with a uniform duration (e.g., 30 s) of video consumption. We categorize videos into three distinct groups (Category 1, Category 2, and Category 3) based on their varied QoE requirements for 360-degree videos. The weight parameters are set differently to guide optimal edge caching and bitrate selection decisions for service operators and users. For instance, C1 places equal emphasis on video quality and rebuffering, reflected in the weight parameters $\alpha_1 : \beta_1 = 30 : 30$. C2 prioritizes minimizing rebuffering, with a weight ratio of $\alpha_2 : \beta_2 = 1 : 30$. Conversely, C3 prioritizes video quality, featuring a larger weight for video quality with $\alpha_3 : \beta_3 = 30 : 1$. Table 4.2 lists the QoE metric weights for each video category. Each 360-degree video is divided into 30 segments, each lasting 1 s. Each segment is further divided into $M = 24$ tiles. Employing FFmpeg [46] with H.264, each tile is encoded into $K = 4$ bitrate levels: 360p (2 Mbps), 720p (5 Mbps), 1080p (8 Mbps), and 2K (16 Mbps). Additionally, we assume that each Field of View (FoV) consists of $A = 9$ tiles.
The benchmarks used for comparison with FA-MASAC are as follows:
• RF [38] generates a virtual viewport according to the overlap of requests. It makes decisions with a DQN model to maximize video quality under a limited rebuffer budget (0.2 s).
• QF [28] continuously updates a common FoV from previous users' historical trajectories. The common FoV guides the caching decision-making process toward maximum video quality. Upon receiving a request, it selects a high or low bitrate for the tiles to serve the user, choosing the same quality for all tiles in the FoV.
• Quality–Rebuffer–Balance (QRB) uses the SAC algorithm [47] to decide which tile is cached at the edge server and at which bitrate. The objective of QRB is to maximize the average QoE. Given the FoV prediction, QRB treats all video categories as Category 1, assigning equal weight to video quality and rebuffering. Note that RF, QF, and QRB ignore the distinct QoE requirements of different categories.
• Multi-Video Category Based on A3C (MVC-A3C) utilizes three A3C networks [48, 49], one per video category, to make the same decisions as QRB. In this method, the cache resources of the edge are evenly divided into three parts, and each video category employs its respective A3C network to independently make optimal bitrate selection and edge caching decisions based on its own QoE function.
Furthermore, the hidden layer and the output layer consist of $2 \times A \times \bar{U}_o$ nodes, aligning with the $2 \times A \times \bar{U}_o$ actions. In C2, QF, which emphasizes video quality, yields an unsatisfactory QoE, while
RF excels, delivering excellent QoE. This success is attributed to RF prioritizing
minimal rebuffer in its decision-making, allowing for the storage of more tiles with
relatively low-quality levels in the edge, meeting the QoE requirements of C2.
Conversely, prioritizing video quality, QF greatly benefits C3, while RF struggles
to deliver good QoE in this category. The MVC-A3C, which adopts an independent
strategy for each video category, performs reasonably well in each category but fails
to achieve optimal performance due to the lack of consideration for the interplay
between video categories. Additionally, the consistent performance comparison
results across various datasets, as depicted in Figs. 4.12 and 4.13, highlight the
robustness of FA-MASAC in real-world scenarios.
Figures 4.14 and 4.15 show video quality selected by all methods. As can be
seen from Fig. 4.14, FA-MASAC and MVC-A3C dynamically select appropriate
video quality for different video categories, successfully increasing the average
QoE. These methods show better performance, i.e., offering higher video quality, in
C3, and vice versa in C2. Since MVC-A3C makes bitrate selection and edge caching
decisions independently for each video category, it lacks the adaptability to optimize
decisions based on dynamic user requests, failing to deliver a higher average QoE.
In contrast, other baseline methods consistently allocate the same video quality to all
three video categories, regardless of their dynamic QoE requirements. For instance, RF assigns low video quality to all video categories, whereas QF assigns high video quality to all of them. The mismatch between video quality and QoE requirements
is bound to adversely impact users’ average QoE. The results demonstrate that
adaptively assigning video quality to different video categories greatly improves
average QoE.
Figure 4.15 illustrates that FA-MASAC effectively reduces rebuffering for all
video categories by judiciously utilizing the limited cache resources of the edge
to store more tiles. It strategically allocates more cache resources to store high-
quality tiles, resulting in a lower rebuffer value for Category 3. QF, with its strategy of caching a small number of high-quality tiles at the edge, leads to high rebuffer values for all video categories. Notably, this method is particularly unfavorable to
Category 2, which prioritizes rebuffering. MVC-A3C makes decisions based on the
specific QoE function of each video category, yielding relatively low rebuffering
for Category 1, lower rebuffering for Category 2, and relatively high rebuffering for
Category 3. Conversely, RF and QRB provide relatively low rebuffering. However, a distinct drawback is that their inability to adaptively allocate differentiated video quality to different video categories hampers the average QoE, especially for Category 3.
These results highlight that FA-MASAC not only collaboratively leverages the
edge’s cache resources but also employs different strategies to mitigate the mismatch
between a single unified QoE function and the diverse QoE requirements of various
video categories, thereby enhancing the average QoE for users.
Figure 4.16 depicts the average QoE of all five methods over time. FA-MASAC stands out by achieving the highest average QoE, as it considers the diverse QoE requirements of the categories. Its normalized average QoE is equal to 0.84, a 3.7% and 5% improvement over QRB and MVC-A3C, respectively. Also, the normalized average QoE of RF is 0.63 across all video categories. Since QF emphasizes video quality, it has the lowest performance, i.e., 0.42, for multicategory 360-degree videos with diverse QoE requirements. FA-
MASAC exhibits a notably higher proportion of normalized average QoE exceeding
0.8, reaching as high as 0.68, while the baseline methods are distributed at 0.6,
0.5, 0.4, and 0.3, respectively. This observation reveals the limitations of methods
employing a single unified QoE function in delivering satisfactory average QoE
among categories.
We explore the impact of proportions of user requests for different video categories,
as shown in Figs. 4.18 and 4.19, where the proportion for C2 varies from 10% to
50%. Here, the remaining requests are equally divided into C1 and C3. The results
demonstrate that FA-MASAC always outperforms other baseline methods across
all datasets, showcasing its adaptability to different proportions of user requests.
This superiority is attributed to FA-MASAC’s utilization of distinct edge caching
and bitrate selection strategies for each video category, effectively mitigating the
mismatch between a single unified QoE function and the specific QoE requirements
of each video category. QRB consistently achieves a higher normalized average
QoE compared to other methods across different proportions of user requests.
Notably, the normalized average QoE of MVC-A3C exhibits an increasing trend
as the proportion of user requests for C2 rises, but it declines when this proportion
exceeds 30%. This behavior is a consequence of MVC-A3C dividing the edge’s
cache resource into three parts, allocating each to a specific video category. If the
actual user requests for these three video categories are unbalanced, the performance
of MVC-A3C deteriorates.
With an increase in the proportion of user requests for Category 2, the normalized
average QoE of RF rises, while that of QF declines. This behavior arises because
both QF and RF solely consider video quality or rebuffer in their edge caching and
bitrate selection decisions, rendering them inflexible to changes in the proportion
of user requests for different video categories. Specifically, when the proportion of
user requests for Category 2 is below 30%, QF outperforms RF. This is because the
proportion of user requests for Category 2 is lower than that for Category 3, and QF
prioritizes video quality, favoring Category 3. Consequently, the average QoE of QF
surpasses that of RF. However, when the proportion of user requests for Category 2
exceeds 30%, RF may outperform QF.
Figure 4.20 shows the impact of cache size on average QoE, where the cache
capacity C varies within the range of 5% to 25%. FA-MASAC achieves a better
performance than the other methods. This superiority arises from FA-MASAC's ability to adaptively select bitrates for different video categories, ensuring a more rational
allocation of cache resources and avoiding wastage. Since FA-MASAC allows
tiles in FoV to select different qualities, it exhibits flexibility in selecting the
bitrate quality of cached tiles, thus improving the video quality and reducing
latency. Additionally, with the increasing cache capacity, more and more popular
video content can be stored at the edge, resulting in a gradual stabilization of the
performance of all methods, especially with large cache capacity (e.g., 25%).
The comprehensive simulation results consistently demonstrate the superiority of
FA-MASAC over other baseline methods in terms of average QoE. It is important
to note that this work does not consider the collaboration of multiple edge networks.
Future research could explore more complex and realistic scenarios to further
enhance the understanding of collaborative edge computing environments.
Fig. 4.20 Average QoE by different methods over different cache sizes
References
1. Konstantinos Poularakis, Jaime Llorca, Antonia Maria Tulino, Ian J. Taylor, and Leandros
Tassiulas. Joint service placement and request routing in multi-cell mobile edge computing
networks. In 2019 IEEE Conference on Computer Communications, INFOCOM 2019, Paris,
France, April 29–May 2, 2019, pages 10–18. IEEE, 2019.
2. Pawani Porambage, Jude Okwuibe, Madhusanka Liyanage, Mika Ylianttila, and Tarik Taleb.
Survey on multi-access edge computing for internet of things realization. IEEE Communica-
tions Surveys Tutorials, 20(4):2961–2991, 2018.
3. Prithwish Basu, Theodoros Salonidis, Brent Kraczek, Sayed M. Saghaian N. E., Ali Sydney,
Bongjun Ko, Tom La Porta, and Kevin S. Chan. Decentralized placement of data and analytics
in wireless networks for energy-efficient execution. In 39th IEEE Conference on Computer
Communications, INFOCOM 2020, Toronto, ON, Canada, July 6-9, 2020, pages 486–495.
IEEE, 2020.
4. B. N. Bharath, Kyatsandra G. Nagananda, and H. Vincent Poor. A learning-based approach to
caching in heterogeneous small cell networks. IEEE Trans. Commun., 64(4):1674–1686, 2016.
5. Yu-Jia Chen, Kai-Min Liao, Meng-Lin Ku, and Fung Po Tso. Mobility-aware probabilistic
caching in UAV-assisted wireless D2D networks. In 2019 IEEE Global Communications
Conference, GLOBECOM 2019, Waikoloa, HI, USA, December 9-13, 2019, pages 1–6. IEEE,
2019.
6. Yuris Mulya Saputra, Dinh Thai Hoang, Diep N. Nguyen, and Eryk Dutkiewicz. A novel
mobile edge network architecture with joint caching-delivering and horizontal cooperation.
IEEE Trans. Mob. Comput., 20(1):19–31, 2021.
7. Tuyen X. Tran and Dario Pompili. Adaptive bitrate video caching and processing in mobile-
edge computing networks. IEEE Trans. Mob. Comput., 18(9):1965–1978, 2019.
8. Yong Xiao and Marwan Krunz. QoE and Power Efficiency Tradeoff for Fog Computing
Networks with Fog Node Cooperation. In IEEE Conference on Computer Communications,
INFOCOM, Atlanta, GA, USA, May 1-4, pages 1–9, 2017.
9. Qixia Hao, Jiaxin Zeng, Xiaobo Zhou, and Tie Qiu. Freshness-aware high definition map
caching with distributed MAMAB in internet of vehicles. In Lei Wang, Michael Segal,
Jenhui Chen, and Tie Qiu, editors, Wireless Algorithms, Systems, and Applications—17th
International Conference, WASA 2022, Dalian, China, November 24-26, 2022, Proceedings,
Part III, volume 13473 of Lecture Notes in Computer Science, pages 273–284. Springer, 2022.
10. Jiaxin Zeng, Xiaobo Zhou, and Keqiu Li. MADRL-based joint edge caching and bitrate
selection for multicategory 360-degree video streaming. IEEE Internet of Things Journal,
pages 1–1, 2023.
11. Jeffrey Minoru Adachi. Accuracy of global navigation satellite system based positioning using
high definition map based localization, 2021.
12. Zhou Su, Yilong Hui, Qichao Xu, Tingting Yang, Jianyi Liu, and Yunjian Jia. An edge caching
scheme to distribute content in vehicular networks. IEEE Trans. Veh. Technol., 67(6):5346–
5356, 2018.
13. Georgios S. Paschos, George Iosifidis, Meixia Tao, Don Towsley, and Giuseppe Caire. The
role of caching in future communication systems and networks. IEEE J. Sel. Areas Commun.,
36(6):1111–1125, 2018.
14. Lei Yang, Lingling Zhang, Zongjian He, Jiannong Cao, and Weigang Wu. Efficient hybrid
data dissemination for edge-assisted automated driving. IEEE Internet Things J., 7(1):148–
159, 2020.
15. Xiaoge Huang, Ke Xu, Qianbin Chen, and Jie Zhang. Delay-aware caching in internet-of-
vehicles networks. IEEE Internet Things J., 8(13):10911–10921, 2021.
16. Shiyu Tang, Ali Alnoman, Alagan Anpalagan, and Isaac Woungang. A user-centric cooperative
edge caching scheme for minimizing delay in 5g content delivery networks. Trans. Emerg.
Telecommun. Technol., 29(8), 2018.
17. Yunzhu Wu, Yan Shi, Zixuan Li, and Shanzhi Chen. A cluster-based data offloading strategy for
high definition map application. In 91st IEEE Vehicular Technology Conference, VTC Spring
2020, Antwerp, Belgium, May 25-28, 2020, pages 1–5, 2020.
18. Xianzhe Xu, Shuai Gao, and Meixia Tao. Distributed online caching for high-definition maps
in autonomous driving systems. IEEE Wirel. Commun. Lett., 10(7):1390–1394, 2021.
19. Rong Liu, Jinling Wang, and Bingqi Zhang. High definition map for automated driving:
Overview and analysis. Journal of Navigation, 73(2):1–18, 2019.
20. Sanjit Krishnan Kaul, Roy D. Yates, and Marco Gruteser. Real-time status: How often should
one update? In Albert G. Greenberg and Kazem Sohraby, editors, Proceedings of the IEEE
INFOCOM 2012, Orlando, FL, USA, March 25-30, 2012, pages 2731–2735, 2012.
21. Penglin Dai, Kaiwen Hu, Xiao Wu, Huanlai Xing, and Zhaofei Yu. Asynchronous deep
reinforcement learning for data-driven task offloading in MEC-empowered vehicular networks.
In 40th IEEE Conference on Computer Communications, INFOCOM 2021, Vancouver, BC,
Canada, May 10-13, 2021, pages 1–10, 2021.
22. H. Bellini. The real deal with virtual and augmented reality. Available: http://www.
goldmansachs.com/our-thinking/pages/virtual-and-augmented-reality.html.
23. Yuanxing Zhang, Pengyu Zhao, Kaigui Bian, Yunxin Liu, Lingyang Song, and Xiaoming
Li. DRL360: 360-degree video streaming with deep reinforcement learning. In 2019 IEEE
Conference on Computer Communications, INFOCOM 2019, Paris, France, April 29–May 2,
2019, pages 1252–1260, 2019.
24. Xiaosong Gao, Jiaxin Zeng, Xiaobo Zhou, Tie Qiu, and Keqiu Li. Soft actor-critic algorithm for
360-degree video streaming with long-term viewport prediction. In International Conference
on Mobility, Sensing and Networking, 2021.
25. Xueshi Hou, Sujit Dey, Jianzhong Zhang, and Madhukar Budagavi. Predictive adaptive stream-
ing to enable mobile 360-degree and VR experiences. IEEE Transactions on Multimedia,
23:716–731, 2021.
26. Yuanxing Zhang, Yushuo Guan, Kaigui Bian, Yunxin Liu, Hu Tuo, Lingyang Song, and
Xiaoming Li. EPASS360: QoE-aware 360-degree video streaming over mobile devices. IEEE
Trans. Mob. Comput., 20(7):2338–2353, 2021.
27. Xuekai Wei, Mingliang Zhou, Sam Kwong, Hui Yuan, and Weijia Jia. A hybrid control scheme
for 360-degree dynamic adaptive video streaming over mobile devices. IEEE Transactions on
Mobile Computing, 2021.
28. Anahita Mahzari, Afshin Taghavi Nasrabadi, Aliehsan Samiei, and Ravi Prakash. FoV-aware edge caching for adaptive 360-degree video streaming. In Proceedings of the 26th ACM International Conference on Multimedia, 2018.
29. Liyang Sun, Fanyi Duanmu, Yong Liu, Yao Wang, Yinghua Ye, Hang Shi, and David Dai. A
two-tier system for on-demand streaming of 360 degree video over dynamic networks. IEEE
Journal on Emerging and Selected Topics in Circuits and Systems, 9:43–57, 2019.
30. Zhiqian Jiang, Xu Zhang, Yiling Xu, Zhan Ma, Jun Sun, and Yunfei Zhang. Reinforcement
learning based rate adaptation for 360-degree video streaming. IEEE Transactions on Broad-
casting, 67(2):409–423, 2021.
31. Shibo Wang, Shusen Yang, Hailiang Li, Xiaodan Zhang, Chen Zhou, Chenren Xu, Feng Qian,
Nanbi Wang, and Zongben Xu. SalientVR: saliency-driven mobile 360-degree video streaming
with gaze information. In Proceedings of the Annual International Conference on Mobile Computing and Networking, 2022.
32. Shibo Wang, Shusen Yang, Hairong Su, Cong Zhao, Chenren Xu, Feng Qian, Nanbin Wang,
and Zongben Xu. Robust saliency-driven quality adaptation for mobile 360-degree video
streaming. IEEE Transactions on Mobile Computing, 2023.
33. Ming Hu, Lifeng Wang, and Shi Jin. Two-tier 360-degree video delivery control in multiuser
immersive communications systems. IEEE Transactions on Vehicular Technology, 2022.
34. Xuekai Wei, Mingliang Zhou, and Weijia Jia. Towards low-latency and high-quality adaptive
360-degree streaming. IEEE Transactions on Industrial Informatics, 2022.
35. Xianda Chen, Tianaxiang Tan, and Guohong Cao. Macrotile: Toward QoE-aware and energy-
efficient 360-degree video streaming. IEEE Transactions on Mobile Computing, 2022.
36. Pantelis Maniotis, Eirina Bourtsoulatze, and Nikolaos Thomos. Tile-based joint caching
and delivery of 360◦ videos in heterogeneous networks. IEEE Transactions on Multimedia,
22(9):2382–2395, 2020.
37. Yanwei Liu, Jinxia Liu, Antonious Argyriou, Liming Wang, and Zhen Xu. Rendering-aware
VR video caching over multi-cell MEC networks. IEEE Transactions on Vehicular Technology,
70(3):2728–2742, 2021.
38. Pantelis Maniotis and Nikolaos Thomos. Viewport-aware deep reinforcement learning
approach for 360◦ video caching. IEEE Transactions on Multimedia, 2021.
39. Qi Cheng, Hangguan Shan, Weihua Zhuang, Lu Yu, Zhaoyang Zhang, and Tony Q. S. Quek.
Design and analysis of MEC- and proactive caching-based 360 mobile VR video streaming.
IEEE Transactions on Multimedia, 2021.
40. Ivan Slivar, Mirko Suznjevic, and Lea Skorin-Kapov. Game categorization for deriving
QoE-driven video encoding configuration strategies for cloud gaming. ACM Transactions on
Multimedia Computing, Communications, and Applications, 2017.
41. Chunyu Qiao, Jiliang Wang, and Yunhao Liu. Beyond QoE: Diversity adaption in video
streaming at the edge. In Proceedings of 39th International Conference on Distributed
Computing Systems, 2019.
42. Guanghui Zhang, Jie Zhang, Yan Liu, Haibo Hu, Jack Lee, and Vaneet Aggarwal. Adaptive
video streaming with automatic quality-of-experience optimization. IEEE Transactions on
Mobile Computing, 2022.
43. Xiaoqi Yin, Abhishek Jindal, Vyas Sekar, and Bruno Sinopoli. A control-theoretic approach
for dynamic adaptive video streaming over HTTP. In ACM SIGCOMM, 2015.
44. Chenglei Wu, Zhihao Tan, Zhi Wang, and Shiqiang Yang. A dataset for exploring user
behaviors in VR spherical video streaming. In Proceedings of the 8th ACM on Multimedia
Systems Conference, 2017.
45. Afshin Taghavi Nasrabadi, Aliehsan Samiei, Mylene C.Q. Farias, and Marcelo M. Carvalho. A taxonomy and dataset for 360◦ videos. In Proceedings of the 10th ACM Multimedia Systems Conference, 2019.
46. FFmpeg. About FFmpeg. Available: https://ffmpeg.org. [Online].
47. Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. Soft actor-critic: Off-policy
maximum entropy deep reinforcement learning with a stochastic actor. In Proceedings of the
35th International Conference on Machine Learning, 2018.
48. Nuowen Kan, Junni Zou, Chenglin Li, Wenrui Dai, and Hongkai Xiong. Rapt360: Reinforce-
ment learning-based rate adaptation for 360-degree video streaming with adaptive prediction
and tiling. IEEE Transactions on Circuits and Systems for Video Technology, 32(3):1607–1623,
2022.
49. Yongkai Huo and Hongye Kuang. TS360: A two-stage deep reinforcement learning system for 360-degree video streaming. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), 2022.
50. Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan,
and et al. PyTorch: An imperative style, high-performance deep learning library. In Proceed-
ings of 33rd Conference on Neural Information Processing System, 2019.
Chapter 5
Service Migration in Industrial Edge
Computing
5.1 Introduction
As mentioned before, in a VM, each service preserves an instance for its serving users, encompassing intermediate results or historical data. This enables all users to enjoy extremely low latency [4–6]. In multiuser dense industrial edge computing systems, however, the efficiency of caching may diminish under user mobility. As users move among the coverage areas of BSs, they usually connect to the BS from which they receive the highest signal strength indication, i.e., signal handoff. This results in extra energy consumption, since requests must be routed among BSs, as well as extra communication latency, which may even reach unacceptable levels. Despite the option to update service caching policies frequently, resource limitations in core networks introduce extra latency and energy consumption for transmission between edge servers and the cloud via backhaul links [7].
Driven by user mobility, services should seamlessly follow users, which is referred to as service migration, thereby minimizing request routing latency [8–10]. However, this process consumes additional energy. Existing research focuses on designing single-user migration strategies, which independently select the optimal edge via an MDP over the user's predicted trajectory, without taking other users' actions into account [11].
However, in multiuser industrial edge computing systems, this approach presents two primary challenges. First, uncertain interference among users, denoted as resource conflict, may result in migration failures: when multiple users independently make service migration decisions, their services may be migrated to one BS that cannot bear the storage requirement. Additionally, shared migration strategies for services among multiple users are inevitably ignored by existing single-user strategies, resulting in the shortage and misuse of resources [12]. Second, migration effectiveness heavily depends on accurate trajectory prediction. Existing strategies overlook how long users stay connected to their target BSs, which is efficiently reflected by their future trajectories. Moreover, the state space expands drastically with an increasing number of users, which further complicates precise trajectory prediction [13].
This chapter mainly introduces migration toward low energy consumption and high privacy within a latency constraint. The Energy-efficient service miGration for multiuser heterOgeneous dense cellular networks (EGO) algorithm is introduced in Sect. 5.2. Its objective is to minimize the overall energy consumption under service latency requirements and limited resources. Meanwhile, EGO also takes the interference among users into consideration to improve the system performance [14]. Section 5.3 introduces a location privacy-aware migration algorithm that protects against inference attacks. To estimate the risk of location privacy leakage, we propose a specific entropy-based location privacy metric [15].
5.2 Energy-Efficient Migration Based on 3-Layer VM Architecture
Figure 5.1 shows a typical multiuser industrial edge computing system, including several BSs and users distributed over a geographical map. The BSs, each equipped with an edge server, are deployed with high spatial density, so their coverage regions overlap considerably. This also means that mobile users are always located within the serving coverage of multiple BSs. Generally, a user receives different signal strengths from different BSs, influenced by distance, bandwidth, and so on, and connects to the BS with the best signal [19]. The BSs are interconnected via wireless links, allowing users to access services across the network by request routing. Note that a change of the user's connected BS does not necessarily mean that a migration will occur.
Fig. 5.1 Service migration illustration in multiuser industrial edge computing systems
The services are hosted by VMs on edge servers with limited resources; the migration process incurs more energy consumption than request routing but yields less latency. A service can be abstracted into a 3-layer VM model [20] as follows:
• Base layer: An operating system that provides basic support and is always deployed in the BSs to build the VM.
• Application layer: The essential data to provide the service for users, which is shared by multiple users.
• Instance layer: Individual state, e.g., historical patterns, privacy, etc.
When a user generates a service request, the corresponding BS must possess both the service's application layer and the user's instance layer to respond. If the target edge server already hosts the required application layer, the application data does not need to be transmitted again; in this case, only the instance layer, with its small data size, should be migrated from the previous serving node. Otherwise, both layers need to be migrated. In the special case of stateless services, we set the data size of the instance layer to zero. The migration process is exemplified in the simplified scenario depicted in Fig. 5.1.
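To make this concrete, the following minimal Python sketch computes the data volume that must move when a service follows its user; the class and field names are ours, not the book's implementation, and the sizes are purely illustrative.

from dataclasses import dataclass

@dataclass
class Service:
    name: str
    app_layer_size: float       # theta^A: application layer shared by users (MB)
    instance_layer_size: float  # theta^I: per-user instance layer (MB); 0 if stateless

def migration_payload(service: Service, target_hosts_app_layer: bool) -> float:
    """Data volume (MB) transferred to the target BS on migration."""
    payload = service.instance_layer_size      # instance layer always follows the user
    if not target_hosts_app_layer:             # application layer only if absent at target
        payload += service.app_layer_size
    return payload

steam = Service("Steam", app_layer_size=800.0, instance_layer_size=5.0)
print(migration_payload(steam, target_hosts_app_layer=True))   # Emma's case: 5.0
print(migration_payload(steam, target_hosts_app_layer=False))  # full migration: 805.0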
The system involves three BSs and two users, named Emma and Steve. The services deployed on BS1, BS2, and BS3 are {Steam}, {Steam, Facebook}, and {AR}, respectively. Initially, Emma is covered by BS1, and Steve is covered by BS2. Emma generates requests for the Steam service, while Steve requests Steam and Facebook. As they move, Emma's connectivity switches to BS2, and Steve's to BS3. Since BS2 already hosts the service Steam, only the instance layer data needs to be migrated for Emma, which greatly reduces the response latency. Meanwhile, Steve continues to request Steam from BS2 through request routing. Alternatively, migrating the application and instance layers of Facebook to BS3 allows Steve to be served locally.
In industrial edge computing systems, the following questions must be addressed when minimizing the average energy consumption:
• Which service should be migrated to which BS, given user mobility, BS heterogeneity, service latency deadlines, and interference among users?
• Which layers, i.e., the application and/or instance layer, should be migrated?
In a multiuser industrial edge computing system, there are N BSs and U users. In time slot $t \in \{0, 1, \cdots, T\}$, each user requests a certain service $m \in \{1, 2, \cdots, M\}$ based on its historical request statistics, where one time slot lasts for $\tau$.
Let $x_t^c(u, n) = 1$ indicate that user u connects to BS n in time slot t; each user connects to exactly one BS, i.e.,
\[ \sum_{n=1}^{N} x_t^c(u, n) = 1, \quad \forall t, u. \tag{5.1} \]
Generally, the controller determines the BS n from which user u will request service m in the next time slot $t+1$. Let a U-by-N matrix $x_{m,t}$ denote the service placement, where $x_{m,t}(u, n) = 1$ means that service m is provided by BS n to user u in time slot t. Also, let a U-by-N matrix $x'_{m,t}$ represent the migration decision (i.e., the target BS) in time slot t, which satisfies
\[ x_{m,t+1}(u, n) = x'_{m,t}(u, n). \tag{5.2} \]
The physical meaning of $x_{m,t}(u, n) = 1$ is that BS n has cached the instance layer of service m for the user in time slot t. Since the user requests the service from one certain server or does not request it at all, the migration decision should satisfy
\[ \sum_{n=1}^{N} x'_{m,t}(u, n) \le 1. \tag{5.3} \]
Whether BS n must host the application layer of service m (i.e., whether at least one user's service m is migrated there) is indicated by
\[ P'_t(m, n) = \min\left\{ 1, \sum_{u=1}^{U} x'_{m,t}(u, n) \right\}. \tag{5.5} \]
Normally, services that realize different functions have different requirements in CPU cycles, energy consumption, and so on. In this book, we abstract service m into an 8-tuple $\langle \lambda_m, \gamma_m, D_m, f_{m,u}, \theta_m^A, \theta_{m,u}^I, W_m, \omega_m \rangle$, i.e.:
• $\theta_m^A / \theta_{m,u}^I$: the application/instance layer data size for user u.
The computation, storage, and bandwidth resources consumed at each BS n must not exceed its capacities $F_n$, $S_n$, and $W_n$, i.e.,
\[ \sum_{m=1}^{M} \sum_{u=1}^{U} f_{m,u}\, x'_{m,t}(u, n) \le F_n, \quad \forall t, n, \tag{5.6} \]
\[ \sum_{m=1}^{M} \left[ \sum_{u=1}^{U} \theta_{m,u}^{I}\, x'_{m,t}(u, n) + \theta_m^{A}\, P'_t(m, n) \right] \le S_n, \quad \forall t, n, \tag{5.7} \]
\[ \sum_{m=1}^{M} \sum_{u=1}^{U} W_{m,u}\, x'_{m,t}(u, n) \le W_n, \quad \forall t, n, \tag{5.8} \]
respectively.
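As a quick illustration of constraints (5.6)–(5.8), the sketch below checks whether a candidate decision tensor respects each BS's capacities; the array shapes and names are our assumptions, not the book's code.

import numpy as np

def is_feasible(x, f, theta_I, theta_A, w, F, S, W):
    """x[m, u, n]: binary migration decision x'_{m,t}(u, n)."""
    P = np.minimum(1, x.sum(axis=1))                            # P'_t(m, n), Eq. (5.5)
    cpu = np.einsum('mu,mun->n', f, x)                          # left side of Eq. (5.6)
    storage = np.einsum('mu,mun->n', theta_I, x) + theta_A @ P  # left side of Eq. (5.7)
    bandwidth = np.einsum('mu,mun->n', w, x)                    # left side of Eq. (5.8)
    return bool((cpu <= F).all() and (storage <= S).all() and (bandwidth <= W).all())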
The transmission latency for user u is composed of the latency for sending the request and for routing it from the user's connected BS to its serving BS, which is
\[ l_{m,t}^{tra}(u) = \frac{\lambda_m}{R\left( \pi_{t,u}^{c}, \pi_{m,t,u} \right)} + C, \tag{5.10} \]
where
\[ \pi_{t,u}^{c} = \arg\max_{n}\, x_t^{c}(u, n), \tag{5.11} \]
and $\pi_{m,t,u} = \arg\max_{n}\, x_{m,t}(u, n)$ denotes the serving BS. Note that it is difficult to obtain an exact measure of the transmission latency from the user to the BS given users' mobile nature, and thus we approximate this latency by a constant value C.
Also, we can calculate the computation latency as
\[ l_{m,t}^{com}(u) = \frac{\lambda_m \gamma_m}{f_{m,u}}. \tag{5.13} \]
We use $\pi'_{m,t,u} = \arg\max_{n}\, x'_{m,t}(u, n)$ to represent the migration target BS. Benefiting from the 3-layer VM, a service's application layer that is not yet in BS $\pi'_{m,t,u}$ can be downloaded from a nearby BS while the ongoing service continues without interruption. Accordingly, the service migration latency only needs to account for the instance layer transmission, as the application layer can be transmitted beforehand, i.e.,
\[ l_{m,t}^{ins}(u) = \frac{\theta_{m,u}^{I}}{R\left( \pi_{m,t,u}, \pi'_{m,t,u} \right)}. \tag{5.15} \]
Here, $\pi_{m,t,u} = \pi'_{m,t,u}$ indicates that service m is not migrated, and its migration latency equals zero. As each time slot may involve multiple requests, determined by the service frequency $\omega_m$, the total service latency $L_{m,t}(u)$ can be expressed as
\[ L_{m,t}(u) = \omega_m \left[ l_{m,t}^{tra}(u) + l_{m,t}^{com}(u) \right] + l_{m,t}^{ins}(u). \tag{5.16} \]
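For concreteness, the per-user latency terms (5.10), (5.13), and (5.15) combine into (5.16) as in the sketch below; the rate arguments stand in for $R(\cdot,\cdot)$ between the corresponding BSs, and all names are illustrative assumptions.

def total_service_latency(lam, gamma, f, theta_I, omega,
                          rate_to_serving, rate_to_target, migrated, C=0.01):
    l_tra = lam / rate_to_serving + C                      # Eq. (5.10)
    l_com = lam * gamma / f                                # Eq. (5.13)
    l_ins = theta_I / rate_to_target if migrated else 0.0  # Eq. (5.15)
    return omega * (l_tra + l_com) + l_ins                 # Eq. (5.16)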
As the user must receive the results before the service deadline, the latency must be less than the threshold $D_m$, i.e.,
\[ L_{m,t}(u) \le D_m, \quad \forall t, u. \tag{5.17} \]
Next, we detail the calculation of the critical energy consumption, which includes the transmission energy
\[ E_{m,t}^{tra} = \sum_{u=1}^{U} p_{\pi_{m,t,u}}\, \omega_m\, l_{m,t}^{tra}(u), \tag{5.18} \]
and the computation energy
\[ E_{m,t}^{com} = \sum_{u=1}^{U} \kappa\, \omega_m\, f_{m,u}^{3}\, l_{m,t}^{com}(u). \tag{5.19} \]
Here, $\kappa$ is the unit energy consumption for one CPU cycle processed in BS n.
The migration process typically involves transmitting a larger amount of data than regular service requests, resulting in increased energy consumption as well as latency. However, after migration, the energy consumption for routing requests, along with the service latency, is significantly reduced in subsequent time slots. Consequently, with optimal migration decisions, the cost associated with migration is offset, leading to overall performance improvements. The total energy consumption can be calculated by
\[ E_{m,t} = E_{m,t}^{tra} + E_{m,t}^{com} + E_{m,t}^{mig}. \tag{5.21} \]
\[ \mathbf{P1}: \; \min_{x'_{m,t}} \; \frac{1}{T} \sum_{m=1}^{M} \sum_{t=0}^{T-1} E_{m,t} \tag{5.22} \]
\[ \text{s.t.}\quad l_{m,t}^{app}(n) \le \tau, \tag{5.23} \]
\[ \qquad (5.1),\,(5.3),\,(5.6),\,(5.7),\,(5.8),\,(5.17). \]
Constraint (5.23) guarantees that the ongoing service will not be disrupted by application layer migration. It is noteworthy that the absence of future information poses a challenge in deriving the optimal solution. In other words, solving P1 optimally demands comprehensive offline information, e.g., users' historical trajectories and service request preferences, which is challenging to acquire. Furthermore, even with known offline information, P1 remains an NP-hard Mixed-Integer Nonlinear Programming (MINLP) problem.
To solve P1 with low complexity, we decouple it into M subproblems, each of which makes migration decisions for a different service under a certain resource allocation. Let $\alpha_m(t, n)$, $\beta_m(t, n)$, and $\delta_m(t, n)$ represent the percentages of computation, storage, and bandwidth resources allocated to service m, respectively. They are calculated as follows:
\[ \alpha_m(t, n) = \frac{\omega_m\, E\{ f_{m,u}^{2} \mid x'_{m,t-1}(u, n) = 1, \forall u \}}{\sum_{m'=1}^{M} \omega_{m'}\, E\{ f_{m',u}^{2} \mid x'_{m',t-1}(u, n) = 1, \forall u \}}, \tag{5.24} \]
\[ \beta_m(t, n) = \frac{\omega_m \left[ \theta_m^{A} + E\{ \theta_{m,u}^{I} \mid x'_{m,t-1}(u, n) = 1, \forall u \} \right]}{\sum_{m'=1}^{M} \omega_{m'} \left[ \theta_{m'}^{A} + E\{ \theta_{m',u}^{I} \mid x'_{m',t-1}(u, n) = 1, \forall u \} \right]}, \tag{5.25} \]
\[ \delta_m(t, n) = \frac{\omega_m\, E\{ W_{m,u}^{2} \mid x'_{m,t-1}(u, n) = 1, \forall u \}}{\sum_{m'=1}^{M} \omega_{m'}\, E\{ W_{m',u}^{2} \mid x'_{m',t-1}(u, n) = 1, \forall u \}}. \tag{5.26} \]
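A small sketch of how the shares (5.24)–(5.26) might be computed from the previous slot's placement; the conditional expectations are taken as empirical means over the users whose service was hosted on the given BS at t−1 (an empty set contributes zero), and the shapes and names are our assumptions.

import numpy as np

def resource_shares(omega, f, theta_I, theta_A, w, served):
    """served[m]: indices of users u with x'_{m,t-1}(u, n) = 1 on this BS n."""
    M = len(omega)
    def mean(v, m):                       # empirical conditional expectation
        return v[m, served[m]].mean() if len(served[m]) else 0.0
    alpha = np.array([omega[m] * mean(f ** 2, m) for m in range(M)])                  # Eq. (5.24)
    beta = np.array([omega[m] * (theta_A[m] + mean(theta_I, m)) for m in range(M)])   # Eq. (5.25)
    delta = np.array([omega[m] * mean(w ** 2, m) for m in range(M)])                  # Eq. (5.26)
    normalize = lambda v: v / v.sum() if v.sum() > 0 else v
    return normalize(alpha), normalize(beta), normalize(delta)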
Then, for each service m, we formulate the subproblem
\[ \mathbf{P2}: \; \min_{x'_{m,t}} \; \frac{1}{T} \sum_{t=0}^{T-1} E_{m,t} \tag{5.27} \]
\[ \text{s.t.}\quad (5.1),\,(5.3),\,(5.17),\,(5.23) \quad \forall m, \]
\[ \sum_{u=1}^{U} f_{m,u}\, x'_{m,t}(u, n) = y_{n,t}^{f}(m) + \alpha_m(t, n) F_n, \tag{5.28} \]
\[ \sum_{u=1}^{U} \theta_{m,u}^{I}\, x'_{m,t}(u, n) + \theta_m^{A} P'_t(m, n) = y_{n,t}^{s}(m) + \beta_m(t, n) S_n, \tag{5.29} \]
\[ \sum_{u=1}^{U} W_{m,u}\, x'_{m,t}(u, n) = y_{n,t}^{w}(m) + \delta_m(t, n) W_n, \tag{5.30} \]
where $y_{n,t}^{f}(m)$, $y_{n,t}^{s}(m)$, and $y_{n,t}^{w}(m)$ measure the deviations between the resources actually used by service m on BS n and its allocated shares.
The objective is to harmonize the resource utilization per service with the collective utilization across all services. It minimizes energy consumption by optimizing resource allocation in a way that controls the resources allocated to any single service. A metric widely used to estimate the congestion level for resource allocation is the quadratic Lyapunov function, i.e., $L(x) \triangleq \frac{1}{2} x^2$:
\[ L(Q(m)) \triangleq \frac{1}{2T} \sum_{t=0}^{T-1} \sum_{n=1}^{N} \sum_{* \in \mathcal{F}} \left[ q_{n,t}^{*}(m) \right]^2, \tag{5.32} \]
where $*$ ranges over the resource types $\mathcal{F} = \{f, s, w\}$.
Equation (5.32) builds virtual resource queues for services, where a small backlog $L(Q(m))$ reflects plentiful resources as well as high stability. The corresponding one-slot conditional drift that shifts the quadratic Lyapunov function toward low congestion is
\[ \Delta_1^{R}(Q(m)) \triangleq L(Q(m+1)) - L(Q(m)) \tag{5.33} \]
\[ \le \frac{1}{2T} \sum_{t=0}^{T-1} \sum_{n=1}^{N} \sum_{* \in \mathcal{F}} \left[ y_{n,t}^{*}(m) \right]^2 + \frac{1}{T} \sum_{t=0}^{T-1} \sum_{n=1}^{N} \sum_{* \in \mathcal{F}} q_{n,t}^{*}(m)\, y_{n,t}^{*}(m) \]
\[ \le B + \frac{1}{T} \sum_{t=0}^{T-1} \sum_{n=1}^{N} \sum_{* \in \mathcal{F}} q_{n,t}^{*}(m)\, y_{n,t}^{*}(m), \]
where $B = \sum_{* \in \mathcal{F}} B_*$ and
\[ B_* = \sup \left\{ \frac{1}{2T} \sum_{t=0}^{T-1} \sum_{n=1}^{N} \left[ y_{n,t}^{*}(m) \right]^2 \right\}. \tag{5.34} \]
Minimizing Eq. (5.32) ensures that the total resource utilization does not exceed the resource bound, i.e., minimizing the drift-plus-penalty function per time slot:
\[ \Delta_1^{R}(Q(m)) + \frac{V}{T} \sum_{t=0}^{T-1} E_{m,t} \le \frac{V}{T} \sum_{t=0}^{T-1} E_{m,t} + B + \frac{1}{T} \sum_{t=0}^{T-1} \sum_{n=1}^{N} \sum_{* \in \mathcal{F}} q_{n,t}^{*}(m)\, y_{n,t}^{*}(m). \tag{5.35} \]
Here, $V > 0$ serves as a parameter for balancing energy consumption and resource utilization. With the above queues, we can decompose P2 into multiple service-oriented subproblems P3, where the elastic resource allocation is taken into account. The goal is to find a resource allocation state that mitigates the collision between energy consumption and resource utilization. Hence, we obtain the migration decisions for each service by solving P3, i.e.,
\[ \mathbf{P3}: \; \min_{x'_{m,t}} \; \frac{1}{T} \sum_{t=0}^{T-1} \left[ V E_{m,t} + \sum_{n=1}^{N} Q_{n,t}(m) \right] \]
where
\[ Q_{n,t}(m) = q_{n,t}^{f}(m) \sum_{u=1}^{U} f_{m,u}\, x'_{m,t}(u, n) + q_{n,t}^{s}(m) \left[ \sum_{u=1}^{U} \theta_{m,u}^{I}\, x'_{m,t}(u, n) + \theta_m^{A} P'_t(m, n) \right] + q_{n,t}^{w}(m) \sum_{u=1}^{U} W_{m,u}\, x'_{m,t}(u, n). \tag{5.37} \]
Note that P3 is an offline problem: it inherently considers long-term energy consumption and cannot be solved in an online manner, while, in practice, service migration decisions must be made instantly, without access to future information. To tackle this challenge, we further utilize Lyapunov optimization to transform P3 into T per-slot subproblems.
Let $E_{m,t}^{mig}$ and $\hat{e}_m(t)$ represent the actual and expected migration energy consumption, respectively, where
\[ \hat{e}_m(t) = \frac{1}{t} \sum_{t'=0}^{t-1} E_{m,t'}^{mig}. \tag{5.38} \]
We then build a virtual energy queue that evolves as
\[ e_m(t+1) = \max\left\{ e_m(t) + y_m^{e}(t),\, 0 \right\}, \tag{5.40} \]
where $e_m(0) = 0$. When $e_m(t) \gg 0$, less energy should be consumed for migration in the future. In this case, the migration probability is restricted, i.e., the value of $E_{m,t}^{mig}$ is driven toward 0, to compensate for the energy deficiency and thus stabilize the energy queue, and vice versa. The Lyapunov function of the migration energy consumption is expressed as
\[ L(e_m(t)) = \frac{1}{2} \left[ e_m(t) \right]^2. \tag{5.41} \]
Its one-slot drift is bounded by
\[ \Delta_1^{E}(e_m(t)) \triangleq L(e_m(t+1)) - L(e_m(t)) \le \frac{1}{2} \left[ y_m^{e}(t) \right]^2 + e_m(t)\, y_m^{e}(t) \le B_e + e_m(t)\, y_m^{e}(t). \]
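The virtual energy queue can be maintained with a few lines of state, as in the sketch below; we read the deviation $y_m^e(t)$ in Eq. (5.40), whose definition is not reproduced above, as actual minus expected migration energy, which is an assumption on our part.

class EnergyQueue:
    def __init__(self):
        self.e = 0.0          # e_m(0) = 0
        self.history = []     # past migration energies E^mig_{m,t'}

    def expected(self):       # running average e-hat_m(t), Eq. (5.38)
        return sum(self.history) / len(self.history) if self.history else 0.0

    def update(self, e_mig):
        y = e_mig - self.expected()    # assumed deviation y_m^e(t)
        self.e = max(self.e + y, 0.0)  # queue evolution, Eq. (5.40)
        self.history.append(e_mig)
        return 0.5 * self.e ** 2       # Lyapunov value L(e_m(t)), Eq. (5.41)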
To ensure the stability of the energy queue and the optimality of the solution to P3, we minimize a supremum bound of Eq. (5.42):
\[ \Delta_1^{E}(e_m(t)) + V' \left[ V E_{m,t} + \sum_{n=1}^{N} Q_{n,t}(m) \right] \le B_e + e_m(t)\, y_m^{e}(t) + V' \left[ V E_{m,t} + \sum_{n=1}^{N} Q_{n,t}(m) \right] \]
\[ = B_e + e_m(t)\, E_{m,t}^{mig} - e_m(t)\, \hat{e}_m(t) + V' \left[ V E_{m,t} + \sum_{n=1}^{N} Q_{n,t}(m) \right]. \tag{5.42} \]
Dropping the constant terms in Eq. (5.42), we obtain the per-slot problem
\[ \mathbf{P4}: \; \min_{x'_{m,t}} \; V' \left[ V E_{m,t} + \sum_{n=1}^{N} Q_{n,t}(m) \right] + e_m(t)\, E_{m,t}^{mig}. \]
$V'$ is a positive weighting parameter that trades off the objective value against the energy queue stability influenced by migration. Hence, the energy queue enables the BSs to save total energy without future information and thus approximate the optimal decision-making. Note that, to simplify the expressions in the following section, we let $\bar{z}_{m,t}(x'_{m,t})$ denote the objective function of P4; a sketch of its evaluation follows.
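The sketch below evaluates $\bar{z}_{m,t}(x')$ for a candidate U-by-N decision matrix, combining the queue-weighted usage (5.37) with the scaled energy terms of P4; the energy values are assumed to be precomputed, and all names are ours.

import numpy as np

def z_bar(x, q_f, q_s, q_w, f, theta_I, theta_A, w,
          E_total, E_mig, e_queue, V, V_prime):
    P = np.minimum(1, x.sum(axis=0))          # P'_t(m, n) for this service
    Q = (q_f * (f @ x)                        # computation term of Eq. (5.37)
         + q_s * (theta_I @ x + theta_A * P)  # storage term
         + q_w * (w @ x))                     # bandwidth term
    return V_prime * (V * E_total + Q.sum()) + e_queue * E_mig  # objective of P4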
EGO involves several steps. In time slot t, the controller updates the service placement $P_t$, and the resource queues of the services are updated based on Eq. (5.31). After this step, P4 is solved using a modified Particle Swarm Optimization (PSO) algorithm, as detailed in Sect. 5.2.4. Subsequently, the energy queue is updated, and finally, the migration decisions from P4 are used to update the next system environment, as sketched below.
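The round just described can be wired together as follows; every callable is a stand-in for the corresponding module (our naming, not the authors' code), and the trailing comment shows a trivially runnable invocation.

def ego_round(services, update_placement, update_queues, solve_p4,
              update_energy_queue, apply_decisions, t):
    update_placement(t)                                # refresh service placement P_t
    update_queues(t)                                   # resource queues, Eq. (5.31)
    decisions = {m: solve_p4(m, t) for m in services}  # modified PSO per service (Sect. 5.2.4)
    for m, d in decisions.items():
        update_energy_queue(m, d)                      # energy queue, Eq. (5.40)
    apply_decisions(decisions, t)                      # drive the next environment state

# e.g. ego_round(["steam"], lambda t: None, lambda t: None,
#                lambda m, t: {}, lambda m, d: None, lambda d, t: None, t=0)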
Given the NP-hard nature of P4, we propose a modified PSO method. The traditional PSO algorithm operates by constructing a population, known as a swarm, consisting of candidate solutions referred to as particles. These particles, denoted by $x_k$ for $k = 1, \cdots, K$, navigate the search space following the formulas below. The velocity $v_k$ of particle k is influenced by the best particle position $p_k$ and the best group position g:
\[ v_k^{i+1} = \omega v_k^{i} + c_1 r_1 \left( p_k^{i} - x_k^{i} \right) + c_2 r_2 \left( g^{i} - x_k^{i} \right), \tag{5.44} \]
where i is the iteration index, $\omega$ is the inertia weight, $c_1$ and $c_2$ are the learning factors, and $r_1, r_2 \in [0, 1]$ are random numbers. In Eq. (5.44), $v_k^{i+1}$ comprises three distinct components. First, the inertia component reflects particle k's inclination to preserve its current velocity. Second, the cognition component indicates that particle k leans toward its local optimum. Last, the society component suggests that particle k gravitates toward the global optimum. The movement of particle $x_k^{i}$ is then determined by these components:
\[ x_k^{i+1} = x_k^{i} + v_k^{i}. \tag{5.45} \]
The above steps are repeated to find the solution with the best fitness indicator, i.e., $\bar{z}_{m,t}(x_k^{i})$; the value of $\bar{z}_{m,t}(x_k^{i})$ is negatively correlated with the fitness. Based on the fitness, the individual best positions $p_k^{i}$ are updated accordingly, and the group best position is updated as
\[ g^{i} = \begin{cases} g^{i-1}, & \bar{z}_{m,t}(\hat{p}_k^{i}) > \bar{z}_{m,t}(g^{i-1}), \\ \hat{p}_k^{i}, & \text{otherwise}, \end{cases} \tag{5.47} \]
where
\[ \hat{p}_k^{i} = \arg\min_{p_k^{i}} \bar{z}_{m,t}\left( p_k^{i} \right), \quad k = 1, \cdots, K. \tag{5.48} \]
Generally, the particles constantly move around the space according to their best individual positions and the collective population position $g^{i}$. This process continues until convergence, specifically when $\max_k \bar{z}_{m,t}(p_k^{i}) - \bar{z}_{m,t}(g^{i}) < \varepsilon$, or when a predefined number of iterations is reached. A minimal sketch of this standard procedure is given below.
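A minimal, self-contained continuous PSO following Eqs. (5.44) and (5.45), demonstrated on a toy quadratic fitness; the hyperparameter values are illustrative only.

import numpy as np

def pso(fitness, dim=2, K=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, (K, dim))
    v = np.zeros((K, dim))
    p = x.copy()                                    # personal best positions
    p_val = np.array([fitness(xi) for xi in x])
    g = p[p_val.argmin()].copy()                    # group best position
    for _ in range(iters):
        r1, r2 = rng.random((K, 1)), rng.random((K, 1))
        v = w * v + c1 * r1 * (p - x) + c2 * r2 * (g - x)  # Eq. (5.44)
        x = x + v                                          # Eq. (5.45)
        vals = np.array([fitness(xi) for xi in x])
        better = vals < p_val
        p[better], p_val[better] = x[better], vals[better]
        g = p[p_val.argmin()].copy()
    return g

print(pso(lambda z: float(np.sum(z ** 2))))  # converges toward the origin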
In our problem, the particle $x_{k,m,t}^{i}$ is a candidate solution $x_{k,m,t}^{i} \in \{0, 1\}^{U \times N}$. Its binary nature prevents updating $x_{k,m,t}^{i}$ through the traditional PSO update rules, i.e., Eqs. (5.44) and (5.45). To deal with this issue, two modifications of PSO are made:
(1) A novel updating rule is designed for particles to improve the exploration capacity. The velocity of $x_{k,m,t}^{i}$ is normalized such that
\[ \sum_{n=1}^{N} v_{k,m,t}^{i}(u, n) = 1, \quad \forall u, \tag{5.50} \]
since otherwise $n^{*,i+1}$ may always be equal to $n^{*,i}$, leading the solution to fall into a local optimum.
Hence, we define $F_{k,m,t}^{i}(u)$, a discrete probability distribution, to explore more potentially optimal solutions:
\[ F_{k,m,t}^{i}(u) \sim \begin{pmatrix} 1 & 2 & \cdots & N \\ v_{k,m,t}^{i}(u, 1) & v_{k,m,t}^{i}(u, 2) & \cdots & v_{k,m,t}^{i}(u, N) \end{pmatrix}, \]
where $v_{k,m,t}^{i}(u, n)$ is the probability of user u migrating the service to BS n. We randomly sample $r_3$ according to $F_{k,m,t}^{i}(u)$ as the target BS. Thus, particle $x_{k,m,t}^{i+1}$ can be updated as follows:
\[ x_{k,m,t}^{i+1}(u, n) = \begin{cases} 1, & \text{if } n = r_3, \\ 0, & \text{otherwise}. \end{cases} \tag{5.52} \]
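The modified update (5.52) amounts to treating each user's velocity row as a categorical distribution over the N BSs, as in this sketch (names illustrative):

import numpy as np

def sample_particle(v, rng):
    """v: U-by-N matrix of nonnegative rows summing to 1, cf. Eq. (5.50)."""
    U, N = v.shape
    x_next = np.zeros((U, N), dtype=int)
    for u in range(U):
        r3 = rng.choice(N, p=v[u])  # draw target BS from F^i_{k,m,t}(u)
        x_next[u, r3] = 1           # one-hot row, Eq. (5.52)
    return x_next

rng = np.random.default_rng(0)
print(sample_particle(np.array([[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]), rng))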
(2) A distance constraint is imposed, where $D(k, m, t)$ denotes the Euclidean distance between the position of particle k in the current time slot and that in the previous time slot, $x'_{m,t-1}$. To converge rapidly, we narrow the search space by ensuring each particle satisfies $D(k, m, t) \le T_0$, where $T_0$ is an empirical threshold. When a particle satisfies $D(k, m, t) > T_0$, we let
\[ x_{k,m,t}^{i+1}(u, n) = \begin{cases} 1, & n = n(u), \\ 0, & \text{otherwise}, \end{cases} \tag{5.54} \]
where
\[ n(u) = \arg\max_{n} \left[ x_{k,m,t}^{i}(u, n) + x'_{m,t-1}(u, n) + g_{m,t}^{i}(u, n) \right]. \]
Nevertheless, the modified PSO might still converge to a local optimum by consistently selecting superior decisions. To circumvent this potential issue, we extend it to a multi-population version with J populations. Increasing the number of populations J enhances the likelihood of converging to the globally optimal solution over a greater number of iterations, as outlined in Theorem 5.1.
Theorem 5.1 As $J > 0$ increases, the probability of the modified PSO algorithm reaching the global optimum of P4 increases correspondingly. In particular, when $J \to \infty$, the probability approaches 1.
Proof Let $\phi_k^{i} = (x_{k,m,t}^{i}, v_k^{i}, p_k^{i}, g^{i})$ represent particle k's state in the i-th iteration. $\{\phi_k^{i}, i \ge 1\}$ is a Markov chain, whose transition probability is [22]
\[ \Pr\left( \phi_k^{l} \mid \phi_k^{i} \right) = \Pr\left( x_k^{l} \mid x_k^{i} \right) \cdot \Pr\left( v_k^{l} \mid v_k^{i} \right) \Pr\left( p_k^{l} \mid p_k^{i} \right) \Pr\left( g^{l} \mid g^{i} \right), \tag{5.55} \]
where
\[ \Pr\left( p_k^{i+1} \mid p_k^{i} \right) = \begin{cases} 1, & \bar{z}_{m,t}(p_k^{i+1}) \le \bar{z}_{m,t}(p_k^{i}), \\ 0, & \text{otherwise}, \end{cases} \tag{5.56} \]
\[ \Pr\left( g^{i+1} \mid g^{i} \right) = \begin{cases} 1, & \bar{z}_{m,t}(g^{i+1}) \le \bar{z}_{m,t}(g^{i}), \\ 0, & \text{otherwise}. \end{cases} \tag{5.57} \]
The swarm state $\xi^{i} = (\phi_1^{i}, \cdots, \phi_K^{i})$ then satisfies
\[ \Pr\left( \xi^{l} \mid \xi^{i} \right) = \prod_{k=1}^{K} \Pr\left( \phi_k^{l} \mid \phi_k^{i} \right). \tag{5.58} \]
Let $p_k^{*}$ and $g^{*}$ denote the best particle and group positions; the optimal particle states can be expressed as
\[ \mathcal{K} = \left\{ \phi_k^{*} = (x_k^{i}, v_k^{i}, p_k^{*}, g^{*}),\; i \ge 1 \right\}. \tag{5.59} \]
Note that both the set $\mathcal{K}$ and the set $\mathcal{G}$ are closed sets. We further build two closed sets $\mathcal{B}$ and $\mathcal{H}$, where $\mathcal{H} = \mathcal{G} \cup \mathcal{B}$. The probability of $\xi^{i+1} \notin \mathcal{H}$ is
\[ \Pr\left( \xi^{l+1} \notin \mathcal{H} \mid \xi^{l} \notin \mathcal{H} \right) = \prod_{k=1}^{K} \Pr\left( \phi_k^{l+1} \notin \mathcal{H} \mid \phi_k^{l} \notin \mathcal{H} \right) \tag{5.63} \]
\[ = \prod_{k=1}^{K} \left[ 1 - \Pr\left( x_k^{i} \mid x_k^{i-1} \right) \Pr\left( v_k^{i} \mid v_k^{i-1} \right) \cdot \Pr\left( p_k^{i} \mid p_k^{i-1} \right) \Pr\left( g^{i} \mid g^{i-1} \right) \right]. \]
Summing over iterations yields
\[ \sum_{i=1}^{\infty} \Pr\left( \xi^{i} \notin \mathcal{H} \right) = \sum_{i=1}^{\infty} \Pr\left( \xi^{l} \notin \mathcal{H} \right) \prod_{l=1}^{i-1} \prod_{k=1}^{K} \left\{ 1 - \Pr\left( g^{i} \mid g^{i-1} \right) \cdot \Pr\left( x_k^{i} \mid x_k^{i-1} \right) \Pr\left( v_k^{i} \mid v_k^{i-1} \right) \Pr\left( p_k^{i} \mid p_k^{i-1} \right) \right\}. \tag{5.64} \]
As $\sum_{i=1}^{\infty} \Pr(\xi^{i}) < \infty$ and $\sum_{i=1}^{\infty} \Pr(\xi^{i} \notin \mathcal{H}) < \infty$, we obtain
\[ \lim_{i \to \infty} \left\{ 1 - \Pr\left( x_k^{i} \mid x_k^{i-1} \right) \Pr\left( v_k^{i} \mid v_k^{i-1} \right) \cdot \Pr\left( p_k^{i} \mid p_k^{i-1} \right) \Pr\left( g^{i} \mid g^{i-1} \right) \right\} = 0. \tag{5.65} \]
Therefore, substituting Eq. (5.67) into Eqs. (5.56) and (5.57), when $i \to \infty$ we have $p_k^{i} = p_k^{i+1}$ and $g^{i} = g^{i+1}$. Thus, we obtain $\Pr\left( \lim_{i \to \infty} \xi^{i} \in \mathcal{H} \right) = 1$, which proves that $\{\xi^{i}, i \ge 1\}$ converges to $\mathcal{H}$. When $g^{i} = g^{i+1} = g^{*}$, $\{\xi^{i}, i \ge 1\}$ converges to $\mathcal{G}$, which indicates that the modified PSO reaches the global optimum within a single population.
The transmission power and noise power $N_0$ are set to [0.1, 0.4] W and $2 \times 10^{-10}$ W, respectively. Moreover, h is modeled as a circularly symmetric complex Gaussian variable [24]. The computation resource and bandwidth, i.e., $f_{m,u}$ and $W_{m,u}$, range over [500, 1000] cycles/bit and [1, 10] MHz, respectively. For each BS, the maximum resources $F_n$, $S_n$, and $W_n$ are randomly sampled from [50, 100] GHz, [1, 5] GB, and [0.5, 1] GHz.
Figure 5.3 illustrates the average energy consumption for the four methods, with
varying user numbers from 100 to 3000. Notably, SOP exhibits the highest average
energy consumption among the methods. This is attributed to the absence of service
migration as users move, resulting in increased transmission energy consumption.
Additionally, SOP’s average energy consistently rises with an increasing number
of users, reflecting higher computing energy consumption to meet the latency
requirements of all users.
Fig. 5.3 Average energy consumption, where the user numbers vary from 100 to 3000
Fig. 5.4 Average service latency, where the user numbers vary from 100 to 3000
Fig. 5.5 Average deadline guarantee rate, where the user numbers vary from 100 to 3000
EGO achieves the highest deadline guarantee rate, especially with a large user number. With 3000 users, EGO achieves a rate of 93.2%, surpassing the rates of the other three methods, which remain below 87%.
Fig. 5.6 Average energy consumption, where the BS number varies from 35 to 77
In Fig. 5.6, the average energy consumption is presented with 3000 users, varying
the number of BSs from 35 to 77. Notably, the average energy consumption
decreases for SOP, PDOA, DMDP, and EGO as the number of BSs increases. This
reduction is attributed to the increased resources and diminished interference among
users facilitated by a higher number of BSs. Comparatively, PDOA and DMDP
exhibit steeper declines in their average energy consumption curves in response
to the growing number of BSs, emphasizing their heightened sensitivity to user
interference.
In Fig. 5.7, the average energy consumption is illustrated for varying time slot durations (τ) ranging from 1 to 32. Notably, the value of τ exerts no influence on SOP, as it remains inactive in each time slot. For PDOA, DMDP, and EGO, an increase in τ results in less frequent service migration, thereby reducing the migration energy consumption and, consequently, the average energy consumption. However, once τ surpasses a specific threshold, the transmission energy consumption in request routing becomes dominant. Consequently, the average energy consumption of PDOA begins to rise.
Fig. 5.7 Average energy consumption, where the time slot duration τ varies from 1 to 32 s
Fig. 5.8 Average service latency, where the time slot duration τ varies from 1 to 32 s
In Fig. 5.8, the average service latency is presented with varying time slot durations (τ) ranging from 1 to 32. Once again, it is observed that SOP remains unaffected by the value of τ due to its inactivity in each time slot. For PDOA, DMDP, and EGO, an increase in τ initially leads to a decline in average service latency, attributed to the decline in migration latency. However, after τ exceeds a specific threshold, the average service latency begins to increase.
Figure 5.9 illustrates the average service latency under different mobility with 3000
users. It is apparent that the velocity greatly influences the performance of SOP
as it has no mechanism to deal with mobility. However, as the average velocity increases, the average service latency of all the other methods also inevitably grows. This is attributed to the heightened interference among users caused by frequent service migration. Figure 5.10 further demonstrates that the deadline guarantee rate of EGO decreases as the average velocity of users increases.
Fig. 5.9 Average service latency with different user mobility
Fig. 5.10 Average deadline guarantee rate with different user mobility
Figure 5.11 illustrates the average energy consumption and average service latency of the EGO algorithm with varying values of V, ranging from $10^{-1}$ to $10^{3}$. As V increases from $10^{-1}$ to $10^{3}$, EGO places more emphasis on average energy consumption than on resource utilization. The results indicate that as average energy consumption decreases, there is a corresponding increase in average service latency. By setting a suitable value of V, EGO ensures a balance between average energy consumption and average service latency.
Figure 5.12 illustrates the impact of $V'$ on energy consumption and service latency. With the growth of $V'$, the migration energy consumption is taken into account preferentially, leading to frequent migration as well as high service latency. This means that EGO should use a suitable empirical value of $V'$ ($V' = 20$) to make a trade-off between energy consumption and service latency.
Additionally, Fig. 5.13 provides a breakdown of the average service latency for each
of the seven services with 3000 users. The red lines represent the deadlines for
each service. SOP exhibits the poorest performance, as the average service latency
exceeds the response deadline for all services due to its lack of service migration.
DMDP and PDOA experience average service latency beyond the response deadline
for five services, highlighting the impact of interference among users. In contrast,
EGO successfully meets the deadline requirements for all services, even for low-
priority and data-intensive services like “platoon.”
Fig. 5.11 Average energy consumption of the EGO algorithm, where V varies from $10^{-1}$ to $10^{3}$
Fig. 5.12 Average energy consumption of the EGO algorithm, where $V'$ varies from $10^{-1}$ to $10^{3}$
Fig. 5.13 Average service latency of each service with 3000 users
5.3 Location Privacy-Aware Service Migration

While service migration works well in guaranteeing service continuity, it may also incur severe location privacy issues. The correlation between service migration trajectories and user trajectories introduces potential privacy concerns. As illustrated in Fig. 5.14, the service migration trajectory closely follows the user's movement from u(t) to u(t+3). This correlation raises the risk of privacy breaches, where malicious entities such as untrusted or compromised service providers could exploit service migration records to stealthily infer user locations. The implications of such privacy violations include stalking, blackmail, and even kidnapping, as highlighted in previous studies [26].
To safeguard user location privacy in the context of service migration, various
LPPMs have been explored. Common approaches include cloaking-based algo-
rithms [27], dummy-based algorithms [28], and differential privacy (DP)-based
algorithms [29], which have primarily been developed for Location-Based Service
(LBS). Cloaking-based algorithms and differential privacy-based algorithms focus
on introducing ambiguity in users’ locations, either by creating cloaking areas or by
adding location noise, to conceal precise location details.
In practice, the continuous nature of service migration poses challenges, as adversaries can leverage historical migration trajectories to infer users' true locations, mitigating the impact of noise and cloaking areas. Dummy-based algorithms attempt to conceal real migration trajectories [26, 30]; however, maintaining these decoy migration services entails additional computation and storage resources. Hence, an effective location privacy-aware service migration method is desired.
Fig. 5.14 Correlation between the user trajectory and the service migration trajectory
There are two key challenges that should be overcome. Besides the interfer-
ence among users, accurate measurement of location privacy leakage risk under
adversaries’ location inference attacks is difficult. Traditional metrics, such as the
communication distance between the user and the edge where the requested service
is deployed [31, 32], rely on the assumption that longer distances correspond to
lower location privacy leakage risk and vice versa.
However, this distance-based metric falls short of accurately evaluating the privacy leakage risk under adversary location inference attacks. Adversaries, armed with users' historical movement and service migration trajectories, leverage Bayesian attacks [33] to infer potential user locations. In such scenarios, even when the service is migrated to a remote edge, the location privacy leakage risk remains high.
In industrial edge computing systems, M users and N BSs equipped with edge servers are distributed over the map. Let $\mathcal{N} = \{1, 2, \cdots, N\}$ and $\mathcal{M} = \{1, 2, \cdots, M\}$ denote the sets of BSs and users, respectively. For user $m \in \mathcal{M}$, its mobility determines its trajectory across the coverage of different BSs, i.e., its connected BS $c_m^t$ constantly changes, and thus we update the location $u_m^t$ in each time slot. Note that, since the storage of a BS limits its deployed services, the connected BS of user m is not always the serving BS $s_m^t$ that provides the required service for the user. Both the connected and serving BSs of the user greatly influence the migration decision made in time slot t, i.e., whether or not to migrate services, and to which target BS $a_m^t \in \mathcal{N}$.
Here, Fig. 5.15 depicts the service migration process in MEC systems. In time slot t, users $u_1$ and $u_2$ request services $s_1$ and $s_2$ from BS 1, while $u_3$ requests service $s_3$ from BS 2. By time slot $t+1$, as the users move, services $s_1$, $s_2$, and $s_3$ migrate to BS 2, BS 3, and BS 4, respectively, thus reducing the service latency. Next, in time slot $t+2$, the connected BS of all three users changes to BS 4, which encourages $u_1$ to migrate service $s_1$ from BS 2 to BS 4 due to the long communication distance between $u_1$ and its serving BS 2. Simultaneously, $u_2$ continues to request service $s_2$ from BS 3 instead of BS 4, strategically alleviating the competition pressure on resources at BS 4. In this scenario, an effective strategy should account for user mobility and resource competition, minimizing communication latency and mitigating interference for a seamless user experience.
Fig. 5.15 Service migration process in multiuser MEC systems
Most existing malicious adversaries attempt to infer a user's location based on observed historical migration trajectories, which raises two urgent problems to be solved:
• How can the risk of location privacy leakage be accurately estimated when migration occurs?
• How should the target BS for migrating each service be selected under unknown user mobility and uncertain resource competition, while accounting for the location privacy leakage risk?
We assume that the adversary is a service provider who is honest but curious. It tends to infer a user's location based on the user's current migration trajectories, obtained by covert monitoring. Normally, only the migration trajectories are collected by the adversary, in which case the adversary treats the location of the user as the nearby BS that has deployed the corresponding service [26], i.e., $u_m(t) = s_m(t)$. Location inference in this manner, without any extra background knowledge, is referred to as a knowledge-free attack (KFA), whereas a knowledge attack (KA) further exploits background knowledge such as the mobility pattern $P_m^u$ and the service pattern $P_m^{u2s}$. We represent the probabilities that user m moves from $u_m^t$ to $u_m^{t+1}$ and that BS $s_m^t$ can provide the service to the user at location $u_m^t$ by $P_m^{u}(u_m^{t+1} \mid u_m^t)$ and $P_m^{u2s}(s_m^t \mid u_m^t)$, respectively.
Equation (5.69) shows that the user's locations can be tracked based on the collected service locations. In contrast to KFAs, KAs can eliminate locations that do not align with the background knowledge, thereby enhancing the accuracy of inferring the user's true location.
Next, we introduce the proposed entropy-based location privacy metric. A user who suffers from a KA has a high probability of leaking privacy even if it migrates the service to a remote BS. To strike a balance between protecting location privacy and improving the user experience, inspired by information theory, we propose the concept of privacy entropy to enhance the efficiency of risk estimation, i.e.,
\[ H_m(t) = - \sum_{u_m^t \in \mathcal{N}} P_m^{s2u}\left( u_m^t \mid s_m^t \right) \log\left( P_m^{s2u}\left( u_m^t \mid s_m^t \right) \right). \tag{5.70} \]
Equation (5.70) means that the location entropy value is negatively correlated with the inference accuracy of the adversary: high entropy corresponds to a relatively low location privacy leakage risk. The user's location privacy leakage risk $R_m(t)$ can thus be estimated by
\[ R_m(t) = -H_m\left( s_m^t \right). \tag{5.71} \]
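The metric in Eqs. (5.70) and (5.71) is straightforward to compute given the adversary's posterior over candidate locations, as in this sketch; a flat posterior yields maximal entropy and hence the lowest (most negative) risk. Names are illustrative.

import numpy as np

def privacy_risk(posterior):
    """posterior: P^{s2u}_m(u | s) over the candidate BSs in N."""
    p = np.asarray(posterior, dtype=float)
    p = p[p > 0]                  # convention: 0 * log 0 = 0
    H = -(p * np.log(p)).sum()    # privacy entropy H_m(t), Eq. (5.70)
    return -H                     # leakage risk R_m(t), Eq. (5.71)

print(privacy_risk([0.25, 0.25, 0.25, 0.25]))  # uncertain adversary: low risk
print(privacy_risk([0.97, 0.01, 0.01, 0.01]))  # confident adversary: risk near 0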
Generally, the service latency is the interval from request initiation to receiving the response, comprising the communication, computation, and migration latency. Let a 4-tuple $\langle p_m, \lambda_m^{req}, \lambda_m^{ser}, \delta_m \rangle$ represent the service information requested by mobile user m, where $p_m$, $\lambda_m^{req}$, $\lambda_m^{ser}$, and $\delta_m$ are the transmission power, the data sizes of the request and the service, and the computation intensity (i.e., CPU cycles/bit), respectively. The latency can be calculated according to the equations in Sect. 5.2.2.1.
Determining the optimal migration decisions requires access to all users' information (such as locations and requested services) and all BSs' states, which is impractical. Additionally, owing to interference, migration decisions among users mutually influence each other, leading to strong coupling between service latency and location privacy leakage risks. Consequently, we convert P1 into a partially observable Markov decision process (POMDP) problem and address it using an MADRL algorithm.
Here, the environment transitions are driven by the joint action $a^t$. $\mathcal{O}$ is the observation set of the local environment state from the perspective of a single user. The observation distribution is denoted by U, where $U(o^t \mid a^{t-1}, e^t)$ is the probability of a user observing state $o^t$ given the action $a^{t-1}$ and the environment state $e^t$. $r(o^t, a^t)$ and $\gamma$ signify the corresponding instantaneous reward and the long-term discount factor.
Normally, it is difficult to accurately predict the transition P and the observation U without knowledge of users' movements, which exhibit significant uncertainty. To capture this uncertainty, we introduce the DRL technique, which uses DNNs to learn the corresponding probability distributions [35]. More specifically, we propose a MASAC algorithm based on SAC and the POMDP to find the optimal migration decisions:
1. Environment State: It includes the users' information (e.g., user location, service location, and requested service) and the BSs' configurations (e.g., computation capacity) in time slot t.
2. Observation: Each user only observes the partial environment state, without information exchange among multiple users. We use $o_m^t$ to represent the observation of user m.
3. Migration Action: The candidate action set for users is denoted by $\mathcal{A}^t$, including any BS near the users. The action of user m in time slot t, $a_m^t \in \mathcal{A}^t$, indicates the target BS of the migration.
4. Reward: Let $r_m^t$ denote the instantaneous reward under action $a_m^t$ and observation $o_m^t$. The negative total cost in time slot t serves as the instantaneous reward, that is, $r_m^t = -C_m(t)$.
The MASAC algorithm's architecture, as illustrated in Fig. 5.16, treats each user as an SAC agent responsible for independently determining service migration decisions. Further details on this algorithm are provided in Sect. 3.3.3. As mentioned earlier, we adhere to the centralized training and decentralized execution paradigm, where other agents' observation states and actions are observable during training but unobservable during execution.
During the training stage, all agents gather historical environment states from the experience replay buffer, including observations, actions, and rewards. Samples from this buffer are then used to centrally train the actor–critic models for each agent. Each agent takes the interference from other agents' actions into consideration to make migration decisions that maximize the rewards of all agents.
We first initialize each agent's policy $\mu_m$, soft state-value function $\varrho_m$, soft Q-value function $\theta_m$, and the memory of the experience replay buffer $\mathcal{D}$. In time slot t, each agent determines the migration action $a_m^t$ under observation $o_m^t$ based on the policy $\pi_m^{\mu}$, transitioning the environment into a new state $e'^t$. After that, the immediate reward $r_m^t$ of each agent is obtained. Then, we record the 4-tuple $(e^t, a^t, r_m^t, e'^t)$ into the experience replay buffer $\mathcal{D}_m$. Finally, using mini-batch training, we update the actor–critic networks by learning the soft state value $V^{\varrho_m}(e^t)$ and the soft Q-value $Q^{\theta_m}(e^t, a^t)$.
With the benefits of centralized training, agents can collaborate without direct information exchange. During execution, the trained policy network guides the agents to make migration decisions independently, toward low service latency and low location privacy leakage risk, which efficiently reduces the interference among agents.
Fig. 5.16 Multi-agent soft actor–critic framework for the location privacy-aware service migration algorithm
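A schematic of the centralized-training loop just described; env.step, agent.act, and agent.update stand in for the simulator and the SAC networks (our naming, not the book's code).

import random
from collections import deque

def train_masac(env, agents, episodes=100, batch_size=64):
    buffers = {m: deque(maxlen=100_000) for m in agents}  # replay buffers D_m
    for _ in range(episodes):
        obs = env.reset()
        done = False
        while not done:
            actions = {m: agents[m].act(obs[m]) for m in agents}  # decentralized acting
            next_obs, rewards, done = env.step(actions)           # r_m^t = -C_m(t)
            for m in agents:                                      # store (e, a, r, e')
                buffers[m].append((obs[m], actions[m], rewards[m], next_obs[m]))
                if len(buffers[m]) >= batch_size:                 # centralized update
                    agents[m].update(random.sample(buffers[m], batch_size))
            obs = next_obs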
In this section, we conduct experiments to verify the efficiency of the proposed method. We deploy 13 distributed BSs in a $1000 \times 1000$ m area; the communication radius and computation capacity of each BS are set to 200 m and [5, 20] GHz, respectively. For the users in this area, we simulate trajectories according to real-world user movement in the GeoLife dataset [36]. The requested services are regarded as latency-sensitive services with the following parameters: the request data size $\lambda_m^{req}$, the service data size $\lambda_m^{ser}$, and the computation intensity $\delta_m$ are uniformly selected from [1, 5] MB, [10, 50] MB, and [100, 500] CPU cycles/bit, respectively.
Each user sends requests to its connected BS via a wireless channel with a transmission power $p_m \in [0.5, 1]$ W. The wireless bandwidth W and noise power $N_0$ are set to [5, 25] MHz and $10^{-8}$ W, respectively. Meanwhile, we use a circularly symmetric complex Gaussian random variable [37] to simulate the fading vector h. The connections among BSs are achieved by wired communication with a transmission rate $r^b$, where the interruption latency $\xi$ is 0.05 s/hop. Details of the variables are listed in Table 5.2.
Five methods are used as benchmarks for the proposed method in terms of service latency and location privacy leakage risk:
• DMDP [38]: The details of DMDP have been introduced in Sect. 5.2.5.
• MASAC: MASAC optimizes the service latency while easing the resource competition among users, without taking the location privacy leakage risk into account.
• DMDP with distance-based location privacy (DMDP-distance) [32]: This algorithm is a variation of DMDP that further introduces location privacy to minimize both the service latency and the location privacy leakage risk. It uses a distance-based location privacy metric to estimate the location privacy leakage risk.
• MASAC with distance-based location privacy (MASAC-distance): The location privacy leakage risk estimation in MASAC-distance is the same as that in DMDP-distance.
Figure 5.17 illustrates the privacy entropy with different network bandwidths. Figures 5.18 and 5.19 show the location accuracy under knowledge-free attacks and under KAs, respectively. Under KFAs, the DMDP-distance, MASAC-distance, MASAC-dp, and our proposed algorithms reduce the location accuracy by approximately 14% to 27%, and the performance of the proposed method is significantly superior to that of the DMDP and MASAC algorithms. When confronted with KAs, the location accuracy under the DMDP-distance and MASAC-distance algorithms increases by around 52% to 65%. This is attributed to adversaries gathering auxiliary knowledge to enhance the accuracy of user location inference. In comparison, MASAC-dp and our proposed algorithms can restrict the location accuracy to below 30%. This is achieved by weakening the aforementioned correlation through increased migration decision randomness.
Figure 5.20 illustrates the impact of wireless bandwidth on service latency.
The results show a decrease in service latency with all six algorithms as wireless
bandwidth increases. The MASAC algorithm stands out, achieving the lowest
service latency due to its focus on optimizing service latency for multiple users.
Fig. 5.18 Location accuracy under adversary’s KFA with different network bandwidths
Fig. 5.19 Location accuracy under adversary’s KA with different network bandwidths
Figure 5.24 exhibits the location privacy protection ability of the six algorithms, where the request size $\lambda_m^{req}$ varies from 1 to 5 MB and the service data size $\lambda_m^{ser}$ ranges over [10, 50] MB. Even with dramatic variation of $\lambda_m^{req}$, the proposed algorithm maintains stable privacy protection.
Fig. 5.24 Performance with different request sizes: (a) the privacy entropy, (b) the location accuracy under the adversary's KFA, and (c) the location accuracy under the adversary's KA
The cost of migration grows with $\lambda_m^{ser}$ increasing, leading to an increase in communication distance with user movement.
Figure 5.27 presents the migration latency results. The migration latency of MASAC-dp and our proposed algorithm is high due to frequent migration for enhancing location privacy protection. On the contrary, the MASAC-distance algorithm has the lowest migration latency. Figure 5.28 displays the computation latency results, where the computation latencies of DMDP and DMDP-distance significantly exceed those of the MASAC algorithms.
Figure 5.29 illustrates the location privacy protection capabilities of the six algorithms with varying user numbers, ranging over [16, 80]. As the number of users increases, the privacy entropy of the DMDP algorithms remains relatively stable, while the privacy entropy of the MASAC-based algorithms increases. This is attributed to the MASAC algorithm migrating services to different BSs to alleviate resource competition, thereby enhancing the migration randomness. Consequently, even under different location inference attacks, the location accuracy of the MASAC-based algorithms gradually decreases as the number of users increases. Our proposed algorithm consistently demonstrates effective location privacy protection with varying user numbers.
Figures 5.30, 5.31, 5.32, and 5.33 display the variations in latency performance, including response latency, communication latency, migration latency, and computation latency, with different numbers of users (ranging from 16 to 80). As the number of users increases, the service latencies of MASAC, MASAC-distance, MASAC-dp, and our proposed algorithm grow smoothly, while those of the DMDP-based algorithms grow more steeply.
Fig. 5.29 Performance with different numbers of users: (a) the privacy entropy, (b) the location accuracy under the adversary's KFA, and (c) the location accuracy under the adversary's KA
Extensive simulations with varying network bandwidths, service request data sizes, and numbers of users show the superior performance of the proposed algorithm. It ensures that services achieve seamless migration with low latency and high QoS. In the future, efforts will be directed toward enhancing the scalability of the proposed method to suit flexible industrial edge computing systems.
References
1. Yuyi Mao, Changsheng You, Jun Zhang, Kaibin Huang, and Khaled Ben Letaief. A survey on
mobile edge computing: The communication perspective. IEEE Communications Surveys and
Tutorials, 19(4):2322–2358, Dec. 2017.
2. X. Ge, S. Tu, G. Mao, C. Wang, and T. Han. 5G ultra-dense cellular networks. IEEE Wireless
Communications, 23(1):72–79, Feb. 2016.
3. Adyson Magalhães Maia, Yacine Ghamri-Doudane, Dario Vieira, and Miguel Franklin
de Castro. Optimized placement of scalable IoT services in edge computing. In IFIP/IEEE
International Symposium on Integrated Network Management, IM, pages 189–197, Washing-
ton DC USA, Apr. 2019.
4. Jie Xu, Lixing Chen, and Pan Zhou. Joint service caching and task offloading for mobile edge
computing in dense networks. In IEEE Conference on Computer Communications, INFOCOM,
pages 207–215, Honolulu, HI, USA, Apr. 2018.
5. T. Ouyang, Z. Zhou, and X. Chen. Follow me at the edge: Mobility-aware dynamic service
placement for mobile edge computing. IEEE Journal on Selected Areas in Communications,
36(10):2333–2345, Oct. 2018.
6. Tie Qiu, Aoyang Zhao, Feng Xia, Weisheng Si, and Dapeng Oliver Wu. ROSE: robustness
strategy for scale-free wireless sensor networks. IEEE/ACM Transactions on Networking,
25(5):2944–2959, Sep. 2017.
7. X. Zhang and Q. Zhu. Hierarchical caching for statistical QoS guaranteed multimedia trans-
missions over 5G edge computing mobile wireless networks. IEEE Wireless Communications,
25(3):12–20, Jun. 2018.
8. Wahida Nasrin and Jiang Xie. SharedMEC: Sharing clouds to support user mobility in mobile
edge computing. In IEEE International Conference on Communications, ICC, pages 1–6,
Kansas City, MO, USA, May 2018.
9. Y. Sun, S. Zhou, and J. Xu. EMM: Energy-aware mobility management for mobile edge
computing in ultra dense networks. IEEE Journal on Selected Areas in Communications,
35(11):2637–2646, Nov. 2017.
10. T. Taleb, A. Ksentini, and P. A. Frangoudis. Follow-me cloud: When cloud services follow
mobile users. IEEE Transactions on Cloud Computing, 7(2):369–382, Apr. 2019.
11. S. Wang, R. Urgaonkar, M. Zafer, T. He, K. Chan, and K. K. Leung. Dynamic service migration
in mobile edge computing based on Markov decision process. IEEE/ACM Transactions on
Networking, 27(3):1272–1288, Jun. 2019.
12. Andrew Machen, Shiqiang Wang, Kin K. Leung, Bongjun Ko, and Theodoros Salonidis.
Migrating running applications across mobile edge clouds: poster. In International Conference
on Mobile Computing and Networking, MobiCom, pages 435–436, New York City, NY, USA,
Oct. 2016.
13. Adam Sadilek and John Krumm. Far out: Predicting long-term human mobility. In Interna-
tional Conference on Artificial Intelligence, AAAI, pages 814–820, Toronto, Ontario, Canada,
Jul. 2012.
14. Xiaobo Zhou, Shuxin Ge, Tie Qiu, Keqiu Li, and Mohammed Atiquzzaman. Energy-efficient
service migration for multi-user heterogeneous dense cellular networks. IEEE Transactions on
Mobile Computing, 22(2):890–905, 2023.
15. Weixu Wang, Xiaobo Zhou, Tie Qiu, Xin He, and Shuxin Ge. Location privacy-aware service
migration against inference attacks in multi-user MEC systems. IEEE Internet of Things
Journal, pages 1–1, 2023.
16. Matt Walker. Operators facing power cost crunch. https://www.mtnconsulting.biz/product.
Accessed Nov 7, 2020.
17. D. Chen and W. Ye. 5G power: Creating a green grid that slashes costs, emissions & energy
use. https://www.huawei.com/en/publications/communicate/89/5g-power-green-grid-slashes-
costs-emissions-energy-use. Accessed Nov 7, 2020.
18. Valentin Poirot, Mårten Ericson, Mats Nordberg, and Karl Andersson. Energy efficient multi-
connectivity algorithms for ultra-dense 5G networks. IEEE Wireless Networks, 26(3):2207–
2222, Jun. 2020.
19. Li Ping Qian, Yuan Wu, Bo Ji, Liang Huang, and Danny H. K. Tsang. HybridIoT: Integration
of hierarchical multiple access and computation offloading for IoT-based smart cities. IEEE
Network, 33(2):6–13, 2019.
20. Andrew Machen, Shiqiang Wang, Kin K. Leung, Bongjun Ko, and Theodoros Salonidis. Live
service migration in mobile edge clouds. IEEE Wireless Communication, 25(1):140–147, Mar.
2018.
21. Qi Zhang, Lin Gui, Fen Hou, Jiacheng Chen, Shichao Zhu, and Feng Tian. Dynamic task
offloading and resource allocation for mobile-edge computing in dense cloud RAN. IEEE
Internet Things Journal, 7(4):3282–3299, Jun. 2020.
22. Ning Lai and Fei Han. A hybrid particle swarm optimization algorithm based on migration mechanism. In Intelligence Science and Big Data Engineering—7th International Conference, IScIDE, pages 88–100, Dalian, China, Sep. 2017.
23. Esri. ArcGIS. https://developers.arcgis.com/.
24. Y. Wang, M. Sheng, X. Wang, L. Wang, and J. Li. Mobile-edge computing: Partial com-
putation offloading using dynamic voltage scaling. IEEE Transactions on Communications,
64(10):4268–4282, 2016.
25. X. Yu, M. Guan, M. Liao, and X. Fan. Pre-migration of vehicle to network services based on
priority in mobile edge computing. IEEE Access, 7:3722–3730, Jan. 2019.
26. Ting He, Ertugrul Necdet Ciftcioglu, Shiqiang Wang, and Kevin S. Chan. Location privacy in
mobile edge clouds: A chaff-based approach. IEEE Journal on Selected Areas in Communica-
tions, 35(11):2625–2636, 2017.
27. F. Fei, S. Li, H. Dai, C. Hu, W. Dou, and Q. Ni. A k-anonymity based schema for location
privacy preservation. IEEE Transactions on Sustainable Computing, 4(2):156–167, April 2019.
28. Pasika Ranaweera, Anca Delia Jurcut, and Madhusanka Liyanage. Survey on multi-access edge computing security and privacy. IEEE Communications Surveys & Tutorials, 23(2):1078–1124, 2021.
29. Weiqi Zhang, Guisheng Yin, Yuhai Sha, and Jishen Yang. Protecting the moving user's locations by combining differential privacy and k-anonymity under temporal correlations in wireless networks. Wireless Communications and Mobile Computing, 2021:6691975:1–6691975:12, 2021.
30. Jian Kang, Doug Steiert, Dan Lin, and Yanjie Fu. MoveWithMe: Location privacy preservation
for smartphone users. IEEE Transactions on Information Forensics and Security, 15:711–724,
2020.
31. Xiaofan He, Juan Liu, Richeng Jin, and Huaiyu Dai. Privacy-aware offloading in mobile-edge
computing. In GLOBECOM 2017—2017 IEEE Global Communications Conference, pages 1–
6, 2017.
32. Weixu Wang, Shuxin Ge, and Xiaobo Zhou. Location-privacy-aware service migration in
mobile edge computing. In 2020 IEEE Wireless Communications and Networking Conference
(WCNC), pages 1–6, 2020.
33. Rinku Dewri. Local differential perturbations: Location privacy under approximate knowledge
attackers. IEEE Transactions on Mobile Computing, 12(12):2360–2372, 2013.
34. Reza Shokri, George Theodorakopoulos, Jean-Yves Le Boudec, and Jean-Pierre Hubaux.
Quantifying location privacy. In 2011 IEEE symposium on security and privacy, pages 247–
262. IEEE, 2011.
35. Fang Fu, Yunpeng Kang, Zhicai Zhang, F. Richard Yu, and Tuan Wu. Soft actor–critic DRL
for live transcoding and streaming in vehicular fog-computing-enabled IoV. IEEE Internet of
Things Journal, 8(3):1308–1321, 2021.
36. Yu Zheng, Hao Fu, Xing Xie, Wei-Ying Ma, and Quannan Li. GeoLife GPS trajectory dataset—User Guide, GeoLife GPS trajectories 1.1 edition, July 2011.
37. Yanting Wang, Min Sheng, Xijun Wang, Liang Wang, and Jiandong Li. Mobile-edge com-
puting: Partial computation offloading using dynamic voltage scaling. IEEE Transactions on
Communications, 64(10):4268–4282, 2016.
38. Shiqiang Wang, Rahul Urgaonkar, Murtaza Zafer, Ting He, Kevin Chan, and Kin K. Leung.
Dynamic service migration in mobile edge computing based on Markov decision process.
IEEE/ACM Transactions on Networking, 27(3):1272–1288, 2019.
39. Xuewen Dong, Tao Zhang, Di Lu, Guangxia Li, Yulong Shen, and Jianfeng Ma. Preserving
geo-indistinguishability of the primary user in dynamic spectrum sharing. IEEE Transactions
on Vehicular Technology, 68(9):8881–8892, 2019.
Chapter 6
Application-Oriented Industrial Edge Computing
Industrial edge computing applications cover nearly every possible scenario in our daily life. In the current era of AI, the most promising scenario is edge-assisted model inference, a typical application of which is object detection. Object detection is the basis for making any other control decisions and also plays an irreplaceable role in preventive maintenance and quality control. Therefore, this chapter introduces edge-assisted object detection for two typical data types, i.e., images and point clouds. Meanwhile, fluctuations in wireless bandwidth may incur long communication latency for both edge-assisted methods, and thus we propose a teacher–student learning framework to further accelerate the inference.
6.1 Image-Oriented Object Detection

Real-time object detection with high accuracy greatly supports the development of mobile vision applications in industrial edge computing systems, such as autonomous driving [1]. Generally, there is a mismatch between the limited resources of mobile end devices, i.e., robots, and the significant resource requirements of DNN-based, computation-intensive object detection. To deal with this problem, mobile vision applications should find a way to reduce the resource requirements on devices while maintaining accuracy.
Various studies have been devoted to breaking this resource bottleneck, and they can be classified into two categories. (i) One strategy involves executing object detection tasks directly on mobile devices and employing model compression techniques such as weight sharing [2] and knowledge distillation [3] to transform computation-intensive CNN models into more lightweight versions [3]. However, these lightweight models suffer from significant degradation in detection accuracy.
The devices connect to the edge server via wireless links, while the edge servers communicate with each other via wired links. Each mobile device is equipped with a high-resolution camera used for image collection. The collected data are sent to the edge server to execute the object detection algorithm. Figure 6.1 shows the architecture of MASS, which achieves rapid detection with only a slight accuracy decline in industrial edge computing systems [17] and is composed of the following four modules:
• Entry Point Selection Module: Upon receiving an image frame, the operator partitions the CNN into two parts based on a selected parallel entry point. The sub-tasks that should be processed locally are referred to as the head part. The remaining sub-tasks, named the tail part, are sent to multiple edge servers and executed in parallel.
• Estimation Module: It is used to evaluate the computation cost of sub-tasks.
• Adaptive Sub-task Generation and Offloading Module: This is the core module of MASS. It acts on the tail part and decides how to adaptively divide the sub-tasks, taking the resources of the edge servers into account. Additionally, we design a uniformly sampled zero-padding scheme to minimize the communication cost among edge servers while preserving detection accuracy.
• Result Merging Module: It gathers and merges the outcomes of each sub-task, and the ultimate results are sent back to the mobile device.
We use $H_j^{in}$, $W_j^{in}$, and $C_j^{in}$ to represent the height, width, and number of channels of the input of CNN layer j, while the number of output channels of layer j is represented by $C_j^{out}$. Here, each layer of the CNN model [8, 18] can serve as a parallel entry point. Nevertheless, different entry points result in distinct gains and costs of parallel offloading, ultimately impacting the detection latency. $K_j$ represents the kernel size of CNN layer j, $j \in \{1, 2, \cdots, M\}$, where M is the number of layers in the CNN model. Note that MASS can be applied to most CNN-based object detection models, allowing for the substitution of the CNN model [19–21].
The operator can select a certain layer $i, i \in \{1, 2, \cdots, M\}$ as the parallel entry point. In this case, the revenue $RPO_i$ and cost $CPO_i$ of parallel offloading can be calculated, with
\[ CPO_i = Amt_i^{trans1} + Amt_i^{trans2}. \tag{6.2} \]
Here, $Amt_i^{com}$, $Amt_i^{mem}$, and $Amt_i^{trans1}$ are the computation cost, memory cost, and communication cost when offloading sub-tasks [22, 23], where
\[ Amt_i^{com} = \sum_{j=i}^{M} H_j^{in} \times W_j^{in} \times \left( C_j^{in} \times K_j^{2} + 1 \right) \times C_j^{out}, \tag{6.3} \]
\[ Amt_i^{mem} = \sum_{j=i}^{M} \left( H_j^{in} \times W_j^{in} \times C_j^{in} + C_j^{out} + C_j^{in} \times C_j^{out} \right). \tag{6.4} \]
and

$$Amt_i^{trans1} = H_i^{in} \times W_i^{in} \times C_i^{in}. \quad (6.5)$$
It is important to recognize the presence of a data halo when dividing the CNN
layer into slices, as illustrated in Figs. 6.2 and 6.3. To uphold detection accuracy, the
sub-tasks need to exchange information within the data halo [24]. Consequently, the
communication cost between two edge servers is
$$Amt_i^{trans2} = \sum_{j=i}^{M} \left( H_j^{in} + W_j^{in} \right) \times C_j^{in}. \quad (6.6)$$
Next, we use parallel efficiency to measure the influence of selecting layer $i$, i.e.,

$$PE_i = \frac{RPO_i}{CPO_i}. \quad (6.7)$$
The $PE$ values for various parallel entry points across different CNN models are depicted in Fig. 6.4. Essentially, a higher $PE_i$ value signifies greater offloading revenue and lower offloading cost. Consequently, we choose the optimal entry point as $i = \arg\max_i PE_i$. During this process, the optimal entry point strikes a balance between the revenue and cost caused by offloading.
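To make the selection concrete, the following sketch evaluates Eqs. (6.3)–(6.7) for a toy network and picks the entry point with the maximum parallel efficiency. It is a minimal illustration only: the layer metadata and the revenue form $RPO_i = Amt_i^{com} + Amt_i^{mem}$ are assumptions, not values profiled from a real detector.

```python
import numpy as np

# Hypothetical per-layer CNN metadata: (H_in, W_in, C_in, C_out, K).
# Real values come from the model graph; layers are 0-indexed here.
layers = [
    (224, 224, 3, 64, 3),
    (112, 112, 64, 128, 3),
    (56, 56, 128, 256, 3),
    (28, 28, 256, 512, 3),
]

def amt_com(i):
    # Computation cost of the tail part starting at layer i (Eq. 6.3).
    return sum(H * W * Cin * (K ** 2 + 1) * Cout
               for (H, W, Cin, Cout, K) in layers[i:])

def amt_mem(i):
    # Memory cost of the tail part (Eq. 6.4): activations plus weights.
    return sum(H * W * (Cin + Cout) + Cin * Cout
               for (H, W, Cin, Cout, K) in layers[i:])

def amt_trans1(i):
    # Device-to-edge transmission: the feature map fed into layer i (Eq. 6.5).
    H, W, Cin, _, _ = layers[i]
    return H * W * Cin

def amt_trans2(i):
    # Edge-to-edge data-halo exchange for the tail part (Eq. 6.6).
    return sum((H + W) * Cin for (H, W, Cin, _, _) in layers[i:])

# Parallel efficiency (Eq. 6.7) and the optimal entry point.
pe = [(amt_com(i) + amt_mem(i)) / (amt_trans1(i) + amt_trans2(i))
      for i in range(len(layers))]
best = int(np.argmax(pe))
print(f"optimal entry point: layer {best}, PE = {pe[best]:.1f}")
```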
Fig. 6.4 The P E value versus different entry points using Faster R-CNN, SSD, and YOLO,
respectively
Fig. 6.5 An example of zero-padding, where the data halo is filled with zeros
Normally, it is necessary to exchange data for sub-tasks in the tail part within
the data halo of each layer, which also produces an extra communication latency,
denoted by $Amt_i^{trans2}$. A possible solution is to eliminate this data exchange by
employing zero-padding along the edges of CNN slices, which is depicted in
Fig. 6.5. However, it results in the accumulation of errors in the data halo over layers,
leading to a significant degradation in detection accuracy.
To fill this gap, we further introduce uniform sampling. Initially, we uniformly sample some layers from the tail part. The sampled layers use zero-padding to avoid the data exchange, while the remaining layers still exchange data. That is, we permit data exchange at layers $i, i + l, i + 2l, \cdots$ to periodically correct the accumulated error in the data halo. The value of l is a nonnegative empirical integer, which greatly reduces communication costs among edge servers without
accuracy degradation. The utilization of the uniform sample helps strike a balance
in the communication cost among edge servers.
The communication cost among edge servers with uniformly sampled zero-padding is then

$$Amt_i^{trans3} = \sum_{j=0}^{\lfloor (M-i)/l \rfloor} \left( H_{i+jl}^{in} + W_{i+jl}^{in} \right) \times C_{i+jl}^{in}. \quad (6.8)$$
The theoretical FLOPs of the entire CNN model can be calculated as

$$Amt_1^{theory\_com} = \sum_{j=1}^{M} H_j^{in} \times W_j^{in} \times C_j^{in} \times \left(K_j^2 + 1\right) \times C_j^{out}. \quad (6.9)$$
However, the theoretical FLOPs $Amt_1^{theory\_com}$ are unable to measure the completion latency of the sub-task on a specific hardware platform. The completion latency of a sub-task is intricately tied to the hardware platform's configuration, encompassing factors such as CPU frequency, memory size, GPU pipeline, and cache [25]. Therefore, mapping the theoretical FLOPs to experimental FLOPs can be abstracted as a regression problem. We build a cubic polynomial for the regression, i.e.,

$$Amt_1^{real\_com} = \beta_1 \times \left(Amt_1^{theory\_com}\right)^3 + \beta_2 \times \left(Amt_1^{theory\_com}\right)^2 + \beta_3 \times Amt_1^{theory\_com} + \beta_4. \quad (6.10)$$
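The regression of Eq. (6.10) can be reproduced with an ordinary polynomial fit. The sketch below assumes a handful of profiled (theoretical GFLOPs, measured latency) pairs; the numbers are placeholders standing in for real measurements on the target hardware.

```python
import numpy as np

# Placeholder profiling data: theoretical workload vs. measured latency.
theory_gflops = np.array([0.5, 1.0, 2.0, 4.0, 8.0])
measured_sec = np.array([0.021, 0.038, 0.081, 0.170, 0.360])

# Fit the cubic mapping of Eq. (6.10); polyfit returns [beta1, ..., beta4].
beta = np.polyfit(theory_gflops, measured_sec, deg=3)

def real_com(amt_theory):
    # Predicted experimental cost for a given theoretical workload.
    return np.polyval(beta, amt_theory)

print(f"predicted latency for 3 GFLOPs: {real_com(3.0):.3f} s")
```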
The traditional methods split CNNs into multiple slices with the same size, and each
slice is executed by an edge server, as shown in Fig. 6.6. However, for heterogeneous
edges, we must adaptively offload according to the capacity differences among edge
servers. This adjustment takes into account factors such as computation resources,
memory resources, and communication resources, aiming to minimize detection
latency, as depicted in Fig. 6.7.
The set of sub-tasks in the tail part is denoted by $S = \{S_p \mid p \in \{1, 2, \cdots, P\}\}$, where P is also the number of available edges, i.e., $S_p$ is offloaded to edge server $E_p$. The transmission latency of this offloading is

$$T_p^{trans} = \frac{\alpha_p \times \left( Amt_i^{trans1} + Amt_i^{trans3} \right)}{b_p}, \quad (6.11)$$

where $\alpha_p$ and $b_p$ are the task partitioning ratio and the allocated network bandwidth for transmission, respectively.
Based on Eq. (6.10), the computation cost of the tail part is

$$Amt_i^{real\_com} = \beta_1 \times \left(Amt_i^{com}\right)^3 + \beta_2 \times \left(Amt_i^{com}\right)^2 + \beta_3 \times Amt_i^{com} + \beta_4. \quad (6.12)$$

Thus, we calculate the execution latency of sub-task $S_p$ on edge server $E_p$ as follows:

$$T_p^{exec} = \frac{\alpha_p \times Amt_i^{real\_com}}{x_p}, \quad (6.13)$$
where $x_p$ is the allocated computation resource of $E_p$. Similarly, the corresponding memory consumption is

$$y_p = \alpha_p \times Amt_i^{mem}. \quad (6.14)$$
Overall, the completion latency of sub-task $S_p$ and the average completion latency are

$$T_p = T_p^{trans} + T_p^{exec} \quad (6.15)$$

and

$$\bar{T} = \frac{\sum_{p=1}^{P} T_p}{P}. \quad (6.16)$$
In industrial edge computing systems, we should find the partition ratio $\alpha_p$, $p \in \{1, 2, \cdots, P\}$, network bandwidth $b_p$, computation resource $x_p$, and memory resource $y_p$ for each sub-task, to minimize the completion latency. We formulate the adaptive sub-task generation and offloading problem as follows:

$$\min_{\alpha, x, y, b} \frac{\sum_{p=1}^{P} T_p + \sum_{p=1}^{P} \left( T_p - \bar{T} \right)^2}{P} \quad (6.17a)$$

$$\text{s.t.} \quad \sum_{p=1}^{P} \alpha_p = 1, \quad (6.17b)$$

$$x_p \le X_{E_p}, \quad p \in \{1, 2, \cdots, P\}, \quad (6.17c)$$

$$y_p \le Y_{E_p}, \quad p \in \{1, 2, \cdots, P\}, \quad (6.17d)$$

$$b_p \le B_{E_p}, \quad p \in \{1, 2, \cdots, P\}, \quad (6.17e)$$

where $X_{E_p}$, $Y_{E_p}$, and $B_{E_p}$ are the available computation, memory, and communication resources of $E_p$, respectively. Constraint (6.17b) ensures the complete partitioning and offloading of the tail part of the CNN model. Constraints (6.17c), (6.17d), and (6.17e) ensure that the resources used by sub-tasks do not exceed the capacity of the edges. The problem can be effectively solved using Sequential Least Squares Programming (SLSQP).
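Problem (6.17) can be handed to an off-the-shelf SLSQP solver. The sketch below does so with scipy, treating the partition ratios, compute allocations, and bandwidth allocations as decision variables; the memory variable is eliminated through $y_p = \alpha_p \times Amt_i^{mem}$ (Eq. 6.14). All capacities and workload constants are assumed values for illustration.

```python
import numpy as np
from scipy.optimize import minimize

P = 4                      # number of edge servers
amt_trans = 2.0e6          # Amt_i^trans1 + Amt_i^trans3 (assumed)
amt_real_com = 1.5e9       # Amt_i^real_com from Eq. (6.12) (assumed)
amt_mem = 4.0e8            # Amt_i^mem (assumed)
X = np.array([1.0e9, 0.8e9, 0.5e9, 0.5e9])  # X_Ep: compute capacities
Y = np.array([6.0e8, 6.0e8, 3.0e8, 3.0e8])  # Y_Ep: memory capacities
B = np.array([1.0e9, 1.0e9, 0.5e9, 0.5e9])  # B_Ep: bandwidth capacities

def objective(z):
    alpha, x, b = z[:P], z[P:2 * P], z[2 * P:]
    T = alpha * amt_trans / b + alpha * amt_real_com / x  # Eqs. (6.11), (6.13)
    return (T.sum() + ((T - T.mean()) ** 2).sum()) / P    # Eq. (6.17a)

cons = [
    {"type": "eq", "fun": lambda z: z[:P].sum() - 1.0},      # (6.17b)
    {"type": "ineq", "fun": lambda z: Y - z[:P] * amt_mem},  # (6.17d)
]
bounds = ([(1e-3, 1.0)] * P                 # alpha_p
          + [(1e6, Xp) for Xp in X]         # x_p <= X_Ep  (6.17c)
          + [(1e6, Bp) for Bp in B])        # b_p <= B_Ep  (6.17e)

z0 = np.concatenate([np.full(P, 1.0 / P), 0.9 * X, 0.9 * B])
res = minimize(objective, z0, method="SLSQP", bounds=bounds, constraints=cons)
print("partition ratios:", np.round(res.x[:P], 3))
```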
Figure 6.8 exhibits the testbed used in this chapter. We employ a mobile phone as the mobile device, and two NVIDIA Jetson AGX Xavier and two NVIDIA Jetson TX2 development boards as edge servers to capture the heterogeneity. The edge servers communicate with each other via 1 Gbps Ethernet cables, and with the mobile device via 5 GHz WiFi.
We implement the mobile side functions on the mobile phone. It constantly captures video frames and offloads them to the edge servers for
processing. We use three popular object detection models: Faster R-CNN, SSD,
and YOLO. To ensure repeatability and consistency, we utilize the COCO 2017
dataset [26] for validation.
The object detection accuracy of MASS, implemented with Faster R-CNN, SSD,
and YOLO, is examined, where the numbers of edge servers vary from 1 to 4. The
parameter l is set to 50, 35, and 53 for Faster R-CNN, SSD, and YOLO, respectively.
MASS achieves nearly identical detection accuracy compared to the original models
with one edge server. Figure 6.9 exhibits the result for the same frame with the
original Faster R-CNN model and MASS based on Faster R-CNN, respectively, each
of which can detect all vehicles. Note that, in Fig. 6.10, there are multiple bounding
boxes for one object achieved by MASS. This is caused by the feature pyramid
network (FPN) in CNN, which is executed among multiple servers. Consequently,
Fig. 6.11 The object detection accuracy of MASS based on Faster R-CNN, SSD, and YOLO
the result merging module is designed for fusing these boxes by the non-maximum
suppression (NMS) algorithm.
Figure 6.11 depicts the average object detection accuracy of MASS, where
the numbers of edge servers vary from 1 to 4. When comparing MASS with the
original object detection models, a minimal degradation in object detection accuracy
is observed for MASS based on Faster R-CNN and YOLO, staying below .1.4%
with two edge servers. Notably, MASS based on SSD experiences an accuracy
degradation of less than .0.1%. The detection accuracy of MASS based on Faster
R-CNN, SSD, and YOLO decreases by .2.9%, .0.2%, and .1.9%, respectively, with
four edge servers. Also, in this case, the accuracy of MASS under SSD exceeds that
under YOLO. This highlights the varying sensitivity in partitioning for different
object detection models. The setting of l has an impact on detection accuracy, but
the accuracy degradation can be ignored when l is lower than 36.
Figure 6.12 shows the object detection latency of MASS for the above three CNN models, where the numbers of edge servers vary from 1 to 4. Compared to the original model, MASS based on Faster R-CNN reduces object detection latency by 40.98%, 56.54%, and 64.83% when the number of edge servers is 2, 3, and 4, respectively. Furthermore, when utilizing four edge servers, MASS based on SSD and YOLO reduce detection latency by 60.97% and 46.4%, respectively. This demonstrates that MASS effectively reduces the detection latency for both two-stage (e.g., Faster R-CNN) and one-stage models (e.g., SSD and YOLO). Additionally, the acceleration ratio for MASS based on Faster R-CNN and SSD remains consistent. When the number of edge servers is limited, MASS based on SSD outperforms that based on Faster R-CNN. Nevertheless, MASS based on Faster R-CNN will gradually exceed that based on SSD and YOLO with an increasing number of edge servers. This is because of the heavier computation requirements of the two-stage Faster R-CNN model, which allow for more significant gains with additional edge servers. Conversely, one-stage object detection models generally have shorter completion times than two-stage models; for instance, Faster R-CNN requires 1.429 s to detect an image, whereas SSD requires 0.238 s. For these faster models, additional edge servers greatly reduce the execution latency but leave a relatively more substantial transmission latency.
Fig. 6.12 The object detection latency of MASS based on Faster R-CNN, SSD, and YOLO, where
the numbers of edge servers vary from 1 to 4 and the l is set to 50, 35, and 53
Fig. 6.13 The completion latency of the sub-tasks on four edge servers
Fig. 6.14 The STD of sub-tasks completion latency of four edge servers
Fig. 6.15 The object detection accuracy based on Faster R-CNN with different l, entry points i on
four edge servers
Figure 6.15 displays the object detection accuracy of MASS based on Faster R-CNN with different values of l and different entry points. For a given entry point i, increasing l leads to a decline in detection accuracy due to the inevitable cumulative error in the uniform sample for sub-task execution: a larger l allows more error to accumulate in the data halo, causing a greater degradation in accuracy. Additionally, the detection accuracy is also negatively correlated with the position of the entry point i. This is because the closer to the last layer of the CNN, the more global knowledge is required to extract comprehensive object features.
In general, when $l < 36$, MASS has a relatively high detection accuracy, proving the effectiveness of the uniformly sampled zero-padding. It is noteworthy that, for cases where the entry point $i \ge 24$, Fig. 6.15 does not display the detection accuracy for certain l values. This omission is because the total number of CNN layers in the Faster R-CNN model is less than $i + l$ in these instances.
Experimental results indicate that MASS achieves up to a 64.83% reduction in detection latency with a low (i.e., around 3%) accuracy decline. In future studies, we intend to leverage machine learning techniques to further enhance sub-task offloading decisions.
In industrial edge computing systems, the perception ability of a single device (e.g.,
vehicle) is greatly restrained by the sensors’ capacity, such as its coverage. Hence,
Fig. 6.16 An illustration of cooperative perception helps autonomous vehicles extend sensing
range and improve detection precision
With the Pointpillars model, the raw-, feature-, and object-level detection achieve 60.9%, 55.2%, and 52.9% precision, respectively, all outperforming single-vehicle detection (46.4%). The average precision of raw-level perception is the highest, while that of object-level perception is the lowest.
Here, we give a detailed analysis of the bandwidth requirements of the above strategies. We use

$$t_{e2e} = t_{transmit} + t_{process}$$

to denote the end-to-end latency with cooperation, where $t_{transmit}$ and $t_{process}$ are the transmission latency via V2V and the processing latency of the object detection model, respectively. Taking the urgent real-time requirement into account, $t_{e2e}$ is set to 100 ms, i.e., the LiDARs scan at 10 fps. $t_{process}$ for the SECOND, Pointpillars, PartA2-Net, and PV-RCNN models is 50.61 ms, 16.46 ms, 80.11 ms, and 79.66 ms, respectively, as tested on a desktop (Intel i7 CPU with NVIDIA 1080 Ti GPU). The bandwidth requirements of raw-, feature-, and object-level cooperative perception are depicted in Fig. 6.18b, considering a Wi-Fi 2.4G V2V link. The observation is that each scheme's bandwidth requirement remains static, resulting in frequent occurrences of either bandwidth saturation or underutilization under dynamic Channel State Information (CSI).
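The deadline translates into a simple per-frame data budget for each model. A minimal sketch follows, using the processing latencies quoted above; the link rate in the usage line is an assumed value.

```python
# Per-frame V2V data budget under the 100 ms end-to-end deadline.
T_E2E = 0.100  # one LiDAR scan period at 10 fps (s)

t_process = {"SECOND": 0.05061, "Pointpillars": 0.01646,
             "PartA2-Net": 0.08011, "PV-RCNN": 0.07966}  # seconds

def data_budget_bits(model, bandwidth_bps):
    # t_e2e = t_transmit + t_process, so each frame may carry at most
    # bandwidth * (T_E2E - t_process) bits over the V2V link.
    return bandwidth_bps * (T_E2E - t_process[model])

# Example: a 50 Mbps link with Pointpillars leaves about 0.52 MB per frame.
print(data_budget_bits("Pointpillars", 50e6) / 8 / 1e6, "MB")
```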
In Fig. 6.18a, it is evident that different levels lead to distinct average precision.
Generally, the more data exchanged among vehicles, the higher the average
precision is achieved. However, these existing cooperative schemes deviate from the optimum, as they suffer from long transmission latency or from a lack of information caused by bandwidth saturation or underutilization.
Hence, we introduce a novel cooperative perception scheme, named ML-Cooper, for more flexible and adaptive perception within limited bandwidth. Specifically, ML-Cooper adapts the transmitted data to the dynamic CSI of the V2V link.
Figure 6.19 [35] exhibits the architecture of ML-Cooper. Two vehicles, distin-
guished as sender and receiver, collect point clouds from their own LiDAR sensors
and connect to each other via a V2V link. The point clouds frame is processed to
be feature and object data through 3D object detection model. ML-Cooper allows
vehicles for hybrid data sharing, i.e., the data in each level can be sent to other
vehicles. For the sender, its processed point cloud data is divided into three parts,
each of which contains partial raw, feature, and object data. After this data is sent to the receiver, the receiver executes an alignment based on the positions and angles and fuses the result with its local data. Finally, the fused data is fed to the 3D detection model.
Additionally, ML-Cooper can be applied to several SOTA 3D object detection models, e.g., SECOND, Pointpillars, and so on. Taking into account the different influences of the data levels on precision, we balance the ratio of the three parts under limited bandwidth to improve the average precision.
The most important module of ML-Cooper is point cloud partitioning. Different
from the 2D image, the point cloud frame is sparse, irregular, orderless, and
continuous. Thus, as depicted in Fig. 6.20, we design two specific point cloud
partitioning methods, i.e., angle-based and density-based partitioning:
• Angle-based partitioning pays attention to the view directly ahead of the vehicle, and thus the raw data in this sector is transmitted at the raw level to avoid any loss of information.
• Density-based partitioning focuses on the point clouds far away from the vehicle, where the density of points is ultra-low. Similar to angle-based partitioning, the point clouds in the far range are sent to the other vehicle at the raw level.
Both the angle-based and density-based partition methods reach a high detection
precision with low complexity, which is detailed in Sect. 6.2.6. Indeed, more sophis-
ticated partition methods may enhance perception performance. However, such
enhancements come at the cost of increased complexity in cooperative perception
systems, a topic that warrants further exploration in future studies. It is essential
to recognize that once the boundaries of the three parts are established, extracting the corresponding data becomes straightforward, as both raw and feature data are point-wise. However, object data, being non-point-wise, introduces a challenge, particularly when a certain object straddles the borderline between two parts. In such a case, the object is regarded as the object data of the third part.
Normally, each vehicle owns a different view of the environment, depending on many factors, e.g., its location. The data received by the receiver should be aligned with the sender's view, which greatly influences the fusing efficiency. It means that, in practice, extra information about the sender should also be transmitted, e.g., LiDAR configuration, GPS/IMU data, and so on. The IMU helps the receiver to obtain a transformation matrix as follows:

$$T = R_z(\theta_{yaw}) \times R_y(\theta_{pitch}) \times R_x(\theta_{roll}),$$

where $R_z(\theta_{yaw})$, $R_y(\theta_{pitch})$, and $R_x(\theta_{roll})$ are three basic $3 \times 3$ rotation matrices, and $\theta_{yaw}$, $\theta_{pitch}$, and $\theta_{roll}$ represent the differences in yaw, pitch, and roll angles, respectively. The rotated points are further translated by the vector $(\Delta d_x, \Delta d_y, \Delta d_z)$, where $(\Delta d_x, \Delta d_y, \Delta d_z)$ is the GPS gap between sender and receiver. We regard
the GPS data to be accurate as the advanced localization technologies have reached
centimeter-level accuracy [36, 37]. Once the LiDAR sensor is properly calibrated,
the feature and object data can be aligned in the same manner. This is because the
features retain the location information of the objects.
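As a concrete illustration of this alignment, the sketch below composes the three basic rotations and applies the GPS translation to bring the sender's points into the receiver's coordinate frame. It is a minimal sketch under the assumptions above; the angle and offset values in the usage example are made up.

```python
import numpy as np

def rot_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def align(points, yaw, pitch, roll, d):
    # Rotate the sender's (N, 3) points by the IMU angle differences and
    # shift them by the GPS gap d = (dx, dy, dz) into the receiver's frame.
    R = rot_z(yaw) @ rot_y(pitch) @ rot_x(roll)
    return points @ R.T + np.asarray(d)

pts = np.random.rand(1000, 3) * 50.0
aligned = align(pts, yaw=0.05, pitch=0.0, roll=0.0, d=(12.0, -3.5, 0.1))
```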
• Raw Data Fusion: After alignment, the sender's points are added to the raw data point set of the receiver.
• Feature Data Fusion: We use the voxel feature fusion method for encoded feature map fusion [33]. The non-empty voxels of the sender and receiver are transformed into two 128-dimension vectors $V_r = \{V_r^i \mid i = 1, 2, \cdots, 128\}$ and $V_s = \{V_s^i \mid i = 1, 2, \cdots, 128\}$ for fusion with an element-wise maxout. This can efficiently estimate the importance of features for cooperation, and thus the fused features $V_f$ can be obtained by

$$V_f^i = \max\left( V_r^i, V_s^i \right), \quad i = 1, \ldots, 128. \quad (6.22)$$
• Object Data Fusion: The receiver adds newly detected objects to the results, while repeated objects take the maximum confidence score from the local detection and the sender's detection (see the sketch after this list).
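The snippet below sketches both rules: the element-wise maxout of Eq. (6.22) and a simplified object-level merge. The `iou` helper works on axis-aligned 2D boxes for brevity, a simplification of the rotated 3D boxes produced by the actual detectors.

```python
import numpy as np

def fuse_voxel_features(v_recv, v_send):
    # Element-wise maxout over the 128-dimension voxel features (Eq. 6.22).
    return np.maximum(v_recv, v_send)

def iou(a, b):
    # Axis-aligned 2D IoU for boxes (x1, y1, x2, y2); a simplification.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def fuse_objects(local, remote, thresh=0.5):
    # Add newly detected remote objects; for repeated detections keep the
    # maximum confidence score, mirroring the object-level fusion rule.
    fused = list(local)
    for box_r, score_r in remote:
        hit = next((k for k, (b, _) in enumerate(fused)
                    if iou(b, box_r) > thresh), None)
        if hit is None:
            fused.append((box_r, score_r))
        else:
            b, s = fused[hit]
            fused[hit] = (b, max(s, score_r))
    return fused
```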
At the beginning of time slot t, the sender and receiver obtain their point cloud frames. The sender is regarded as an agent engaging with an observed state $s_t$ within discrete decision time slots. It then takes an action $a_t$ based on its policy $\pi_\theta$, parameterized by neural networks with parameters $\theta$, while an immediate reward $r_t$ and the next state $s_{t+1}$ are returned according to the dynamic network conditions. The objective is to find an optimal policy $\pi_\theta^*$ in each time slot with maximum discounted cumulative reward $R_0 = \sum_{t=0}^{\infty} \delta^t r_t$. Here, $\delta \in [0, 1)$ is the long-term discounting factor. Next, we define the immediate reward as the average precision of detection:
$$r_t = \sum_{n=0}^{M-1} \left( \phi_{n+1} - \phi_n \right) p_{interp}(\phi_{n+1}), \quad (6.23)$$

where

$$p_{interp}(\phi) = \max_{\tilde{\phi} \ge \phi} p(\tilde{\phi}), \quad (6.24)$$

where $M$ is the number of estimated bounding boxes, $p(\tilde{\phi})$ is the measured precision at recall $\tilde{\phi}$, and $p_{interp}(\phi)$ is a smoothed version of the precision curve $p(\phi)$ [40]. The recall value $\phi_i \in \{\phi_1, \ldots, \phi_M\}$ is obtained by setting the confidence threshold equal to the confidence score of the i-th bounding box within the estimated bounding box set when sorted by confidence score in descending order.
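The reward of Eqs. (6.23) and (6.24) is the standard interpolated average precision. A minimal sketch of its computation from a list of scored detections follows; the toy detections and ground-truth count are illustrative only.

```python
import numpy as np

def average_precision(scores, is_tp, num_gt):
    # Eqs. (6.23)-(6.24): sort detections by confidence, build the
    # precision/recall curve, smooth precision from the right, then sum
    # the areas of the recall steps.
    order = np.argsort(-np.asarray(scores))
    tp = np.cumsum(np.asarray(is_tp)[order])
    fp = np.cumsum(~np.asarray(is_tp)[order])
    recall = tp / num_gt
    precision = tp / (tp + fp)
    # p_interp(phi) = max over precisions at recall >= phi (Eq. 6.24).
    p_interp = np.maximum.accumulate(precision[::-1])[::-1]
    phi = np.concatenate([[0.0], recall])
    return float(np.sum((phi[1:] - phi[:-1]) * p_interp))

# Usage: five detections against four ground-truth objects.
ap = average_precision([0.9, 0.8, 0.7, 0.6, 0.5],
                       [True, True, False, True, False], num_gt=4)
```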
The K-SAC agent aims to find the policy $\pi(a|s)$ that also maximizes an entropy term $-\log \pi(a_t|s_t)$, which encourages exploration of the policy, i.e.,

$$L(\pi) = \mathbb{E}\left\{ \sum_{t=0}^{\infty} \delta^t \left[ r_t - \lambda (1 - K_t) \log \pi(a_t|s_t) \right] \Big| \pi \right\}. \quad (6.25)$$

Here, $K_t$ indicates whether or not the frame is a key frame, distinguishing key and non-key frames. $\lambda$ is a temperature parameter used to balance the importance of the entropy against the system reward. Since key frames always receive a larger weight, the impact of the entropy term is smoothed.
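A minimal sketch of this key-frame-weighted entropy objective is given below, assuming a SAC-style actor update in PyTorch. The tensor names, batch, and temperature are illustrative; a full implementation would obtain the Q-values from twin soft critics.

```python
import torch

def ksac_actor_loss(log_probs, q_values, key_mask, lam=0.2):
    # Eq. (6.25): the entropy bonus -lambda * log pi(a|s) applies only to
    # non-key frames (K_t = 0), so key frames keep their full reward weight.
    entropy_term = lam * (1.0 - key_mask) * log_probs
    return (entropy_term - q_values).mean()

log_probs = torch.randn(32)                # log pi(a_t | s_t) for a batch
q_values = torch.randn(32)                 # soft Q estimates
key_mask = (torch.rand(32) < 0.3).float()  # K_t: 1 for key frames
loss = ksac_actor_loss(log_probs, q_values, key_mask)
```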
In this section, the KITTI dataset [41] and the dataset collected from two real
vehicles are used for the evaluation of ML-Cooper. Also, we compare ML-Cooper
with the following benchmarks:
• Cooper [28] is a raw-level cooperative perception method, which shares and
fuses the raw point clouds collected from different vehicles.
• F-Cooper [33] takes the processed feature maps for fusion, which greatly
reduces the bandwidth requirement.
• L3 [42] is a typical object-level cooperative perception method. The resource-
limited vehicle broadcasts its local sensing results to other vehicles.
• AFS-COD [43] is a feature-level cooperative perception method. It adaptively
transmits and aggregates feature maps with different sizes, where the dynamic
bandwidth is taken into account.
We introduce an adaptive bandwidth mechanism to Cooper and F-Cooper,
enabling them to selectively share only a portion of the raw or feature data. The
performance of these modified schemes, denoted as Cooper-BA and F-Cooper-BA,
respectively, is then compared with that of ML-Cooper. To ensure the fairness of the
comparison, the 3D object detection models are consistently set to be the same.
Experimental Setting We select 1004 frames of consecutive point clouds from the
KITTI dataset, captured by a vehicle equipped with a Velodyne 64-beam LiDAR
sensor. The 3D object detection models are executed on a desktop system featuring
an Intel i7-8700 CPU, 48 GB memory, 240 GB+1 TB hard disk, NVIDIA 1080
Ti GPU, running Ubuntu 18.04 with a Linux 5.4.0 kernel. Since the KITTI data
originates from a single vehicle, we utilize two point cloud frames from different
time segments to simulate data generated from two vehicles [28, 33]. The feasibility
of this approach is demonstrated in Figs. 6.22 and 6.23.
As the vehicle moves, two subsequent point cloud frames are collected in time slots $t_1$ (Fig. 6.22a) and $t_2$ (Fig. 6.22b), respectively. This vehicle uses SECOND to detect objects, where the ground truth and detected results are bounded by green and red boxes, respectively. In Fig. 6.22a, three objects whose point clouds are in the far range, and thus sparse, are not detected. In our experiment, these two frames are regarded as two independent point cloud frames collected by two different vehicles, i.e., the sender and receiver. It can be seen from Fig. 6.23 that the detection accuracy after sharing raw, feature, and object data greatly improves compared with the detection results from the perspective of a single vehicle.
Experimental Results The average precision results for Cooper, F-Cooper, L3,
Cooper-BA, F-Cooper-BA, and ML-Cooper are depicted in Fig. 6.24 using four
different 3D detection models and considering three distinct V2V link scenarios:
Fig. 6.22 Two frames of point clouds from KITTI dataset, and their detection results
Cellular 4G, WiFi 2.4G, and WiFi 5G, all under the angle-based point clouds
partition scheme. Notably, L3 consistently achieves the same performance across
the three V2V link channels, indicating that the small size of object data ensures
successful transmission from the sender to the receiver vehicle. Conversely, as
bandwidth increases, both Cooper and F-Cooper show improved average preci-
sion. For instance, Cooper with Pointpillars exhibits average precision values of
53.3%, 54.6%, and 60.8% in the Cellular 4G, WiFi 2.4G, and WiFi 5G channels,
respectively. Similarly, F-Cooper with SECOND achieves average precision values
of 57.1%, 57.4%, and 58.5% in the three channels, respectively. This improvement
is attributed to the higher bandwidth facilitating more data transmission, thereby
enhancing perception performance. However, Cooper’s average precision with
PartA2-Net and PV-RCNN remains the same across the channels due to these mod-
els requiring more processing time, leaving limited time for data transmission and
leading to consistent bandwidth saturation. Interestingly, in some cases, Cooper’s
Fig. 6.24 The average precision of Cooper, F-Cooper, L3, AFS-COD, Cooper-BA, F-Cooper-BA,
and ML-Cooper with four different 3D detection models
Fig. 6.25 Cooperative detection results, where the bandwidth is 150 Mbps. The green and red
boxes represent the ground truth and detected cars, respectively. The yellow and red shadows
indicate the raw point clouds data and feature data received from the sender, respectively
average precision is even lower than that of F-Cooper and L3 due to bandwidth
saturation.
In Fig. 6.24, although Cooper-BA and F-Cooper-BA show slight improvements
in average precision compared to Cooper and F-Cooper, respectively, they still fall
short of ML-Cooper’s performance. The incremental improvement is attributed to
the fact that a certain portion of information is missing in Cooper-BA and F-Cooper-
BA, resulting in completely undetected objects in the scene. Figure 6.25 provides
visualizations of cooperative perception results for Cooper-BA, F-Cooper-BA, and
ML-Cooper, where the bandwidth is set to 150 Mbps, and the angle-based partition
scheme is applied. It is evident that 5 and 4 objects are not detected in Fig. 6.25a
and b, respectively, because part of the sender's sensing data is discarded rather than shared. Consequently, the receiver must rely solely on its own sensing data for
object detection. In contrast, Fig. 6.25c illustrates that ML-Cooper can detect more
objects by supplementing the missing raw data with feature data or object data.
This approach enables the sender to provide maximum assistance to the receiver,
resulting in superior perception performance.
Comparing ML-Cooper with AFS-COD, a feature-level cooperative perception method, it is evident in Fig. 6.24 that AFS-COD outperforms F-Cooper by reducing data discarding. However, despite AFS-COD's ability to adjust the size of the transmitted feature maps, it still cannot reach the detection precision achieved by ML-Cooper. Moreover, since the data size per channel remains essentially fixed, AFS-COD struggles to accurately adapt to continuous bandwidth variations.
The results in Fig. 6.24 further demonstrate that ML-Cooper consistently
achieves the highest average precision across all cases. As mentioned earlier,
ML-Cooper optimally utilizes the available bandwidth of the V2V link in each time
slot by dynamically adjusting the values of α, β, and γ . This approach effectively
eliminates the impact of bandwidth saturation and underutilization, leading to
superior performance.
Figure 6.26 depicts the average precision of angle-based and density-based ML-
Cooper, where the bandwidths vary from 10 to 1000 Mbps. It is observed that the
performances of the two methods are similar to each other.
Fig. 6.26 The performance comparison of angle-based and density-based point clouds partition
methods
Evaluating ML-Cooper on a dataset collected from two real vehicles adds realism
to the assessment. This approach considers factors such as sensor measurements at
different timestamps and the obstruction of the view of the vehicle behind by the
one in front. It addresses some of the limitations associated with simulating vehicle-
to-vehicle cooperation using frames from the KITTI dataset.
Experimental Setting We use two Great Wall WEY VV7 vehicles (Fig. 6.27) to
collect images and point clouds used for this experiment. The configurations of the
vehicles are listed in Table 6.2. The common scenarios where we collect the data are as follows:
• Multilane roads. This urban scene is quite common, characterized by numerous
dynamic vehicles driving at high speeds, with car following being a frequent
occurrence. Such complex traffic scenarios are ideal for testing the performance
of our system.
• Road intersections. Another typical scenario is a busy road intersection, where
vehicles congregate in large numbers and congestion easily occurs. Due to the
diverse behaviors of traffic participants and the complexity of traffic conditions
at intersections, real-time cooperative perception plays a crucial role in ensuring
driving safety. Therefore, we include this scenario as one of our test cases.
• Parking lots. Parking lots are crowded environments with numerous obstacles. As busy aboveground parking lots are representative of congested areas, we include this scenario as one of our test cases.
Fig. 6.28 The average precision of Cooper, F-Cooper, L3, AFS-COD, Cooper-BA, F-Cooper-BA, and ML-Cooper with four different 3D object detection models and DSRC V2V link
Fig. 6.29 The performance comparison of angle-based and density-based point clouds partition
methods
It is shown in [48] that a student model can achieve a frame rate of 30 FPS
on a Samsung Galaxy S10+ smartphone for semantic segmentation. Unfortunately,
student models suffer from a decline in accuracy, as their finite parameters cannot maintain accuracy similar to that of the teacher model across the different visual scenes in video streams [49]; this decline is caused by data drift. To deal with data drift and mitigate the accuracy decrease, the student model has to be updated periodically with
the help of the teacher model through a training process on the edge server. Note
that different training configurations, such as training epochs and frozen layers,
lead to different accuracy improvements with different resource requirements (i.e.,
training cost). It is pointed out in [49] that training costs vary by up to 200-fold
depending on different training configurations, and higher resource usage does not
always translate into higher accuracy.
In multi-device heterogeneous MEC networks, the network operator has to make
optimal updating decisions for each device to achieve high inference accuracy. The
updating decision includes the offloading decision, i.e., which edge to offload, and
the configuration selection decision, i.e., which training configuration to select.
However, it is quite challenging due to resource heterogeneity and limited com-
puting resources of edge servers. First, updating the student model with expensive
training configuration under frequent update requests from all devices poses a great
challenge to the limited resources of edge servers. Moreover, resource heterogeneity
further complicates the problem. Second, the offloading decisions and configuration
selection decisions are strongly coupled with each other, resulting in an extremely
huge solution space, which makes it difficult to find the optimal decision.
To solve these problems, we propose an adaptive teacher–student framework for
real-time video inference in industrial edge computing systems in the following part.
The system model considered in this chapter is shown in Fig. 6.30, where there are
multiple devices and multiple BSs randomly distributed in the MEC networks. Each
BS is equipped with an edge server, and thus devices can offload the updating tasks
to its connected BS through wireless channels.
Let $\mathcal{M} = \{1, 2, \cdots, M\}$ denote the set of BSs, indexed by m. The GPU processing capabilities of the BSs for executing a task are represented by an M-dimension vector $W$, whose m-th element is $\omega_m$. When a BS has multiple tasks to execute, the "First Come, First Served" rule is applied, i.e., the task received earlier will be executed sooner, and at most one task is processed at a time.
Let $\mathcal{N} = \{1, 2, \cdots, N\}$ denote the set of devices, indexed by n. Each device has
a continuous video stream to be inferred, which requires the device to maintain
a lightweight student model for video inference. According to a given training
configuration, the student model is updated with the help of the teacher model by
sending part of the sampled video frames to BS. After the training process at the
BS, the new model is sent back to the devices.
Fig. 6.30 The system model of video inference with teacher–student learning in multi-device
heterogeneous MEC networks
In time slot $t \in \{1, 2, \cdots, T\}$, device n can offload the training task to edge server m to update its model, or continue to use the old model with an accuracy $\delta_t^-$. Here, each time slot lasts $\tau$ seconds. We use two M-dimension vectors $B^u_{n,t}$ and $B^d_{n,t}$ to denote the upload and download transmission rates between device n and the BSs, where $B^u_{n,t}(m)$ and $B^d_{n,t}(m)$ are the upload and download transmission rates between device n and BS m, respectively. To maximize the average inference accuracy, the system should make updating decisions for each device, including the configuration selection and offloading decisions. The configuration selection decision determines the accuracy improvement and resource usage. The offloading decision indicates which BS to offload to, which is greatly influenced by the available resources of the edge servers and the network conditions. The definitions are shown in Table 4.2.
Let a 3-tuple $\alpha_{n,t} = \langle c^u, c^d, c^e \rangle$ denote the configuration selection decision for device n in time slot t, i.e., the hyperparameters of the training configuration, where $\alpha_{n,t}(x)$, $x \in \{1, 2, 3\}$, indicates the three elements of the tuple in sequence. The total candidate configuration set is denoted by $\mathcal{C}$. $c^u$ is the ratio of the sampled video frames sent to the edge server. $c^d$ is the ratio of unfrozen layers in the DNN model; the parameters of these unfrozen layers will be updated during the training process. $c^e$ denotes the number of training epochs. Intuitively, an expensive training configuration with large $c^u$, $c^d$, and $c^e$ values results in a higher accuracy improvement of the student model. Here we establish the relationship between the inference accuracy and the training configuration.
Let $c^g(\alpha_{n,t})$ denote the GPU seconds³ required by configuration $\alpha_{n,t} \in \mathcal{C}$. Let $\delta^*_{n,t}$ denote the maximum inference accuracy that device n can reach in time slot t. The accuracy improvement ratio $\eta_{n,t}$ is a function of $c^g(\alpha_{n,t})$, i.e., $\eta_{n,t} = g(c^g(\alpha_{n,t}))$. Then the estimated inference accuracy of device n in time slot t can be obtained as

$$\delta_{n,t} = \delta^*_{n,t} \cdot \eta_{n,t}. \quad (6.26)$$
However, it is quite difficult to derive a closed-form expression of $g(\cdot)$. Therefore, we use a measurement-based method to approximate $g(\cdot)$.
We take the deeplabv3+ model [50] on the cityscapes [51] and A2D2 [52] datasets as an example. The training process was conducted on an NVIDIA RTX 2080Ti GPU. First, we randomly select several training configurations from $\mathcal{C}$ and conduct the training process. Then, we measure the accuracy improvement ratios with these configurations and plot the results in Fig. 6.31. Finally, curve fitting is used to obtain an approximation of $g(\cdot)$, which is

$$g(c^g(\alpha_{n,t})) = 0.1946 \times \ln\left( c^g(\alpha_{n,t}) \right) + 0.3615. \quad (6.27)$$
Note that this approach is general and can be applied to other models and GPU
hardware.
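The fit of Eq. (6.27) can be reproduced with a standard nonlinear least-squares routine. The sketch below assumes a handful of profiled (GPU seconds, improvement ratio) pairs; the values are placeholders standing in for the measurements in Fig. 6.31.

```python
import numpy as np
from scipy.optimize import curve_fit

def g(cg, a, b):
    # Logarithmic accuracy-improvement model of Eq. (6.27).
    return a * np.log(cg) + b

# Placeholder profiling data: GPU seconds vs. accuracy improvement ratio.
cg_samples = np.array([2.0, 5.0, 10.0, 20.0, 25.0])
eta_samples = np.array([0.49, 0.67, 0.81, 0.94, 0.99])

(a, b), _ = curve_fit(g, cg_samples, eta_samples)
print(f"eta ~= {a:.4f} * ln(cg) + {b:.4f}")
```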
Let an M-dimension binary vector $\beta_{n,t}$ denote the offloading decision of device n in time slot t, where $\beta_{n,t}(m) = 1$ indicates that device n offloads the task to BS m in time slot t. Note that a device can offload the task to at most one BS, so we have

$$\sum_{m=1}^{M} \beta_{n,t}(m) \le 1. \quad (6.28)$$
3 GPU seconds refer to the time taken for training with 100% GPU processing capabilities.
The total latency of a model update process of device n in time slot t consists of four parts: the training data upload latency, the computation latency, the queuing latency at the edge server, and the model download latency.
We assume the frame rate f of each video and the resolution s of the sampled images are fixed; the candidate training data thus contains every frame from the previous time slot, whose total size is $S_{n,t} = f \cdot s \cdot \tau$. Hence, the training data upload latency for device n in time slot t is

$$l^u_{n,t} = \sum_{m=1}^{M} \beta_{n,t}(m) \frac{\alpha_{n,t}(1) S_{n,t}}{B^u_{n,t}(m)}. \quad (6.29)$$
Similarly, the model download latency for receiving the updated model for device n in time slot t is

$$l^d_{n,t} = \sum_{m=1}^{M} \beta_{n,t}(m) \frac{\alpha_{n,t}(2) S^d}{B^d_{n,t}(m)}, \quad (6.30)$$
where $S^d$ is the size of the parameters of the student model. Then, the computation latency of device n in time slot t is

$$l^c_{n,t} = c^g_{n,t} \frac{w_m}{\hat{w}}, \quad (6.31)$$

where $\hat{w}$ denotes 100% of the GPU processing capability of an edge server. Let $l^q_{n,t}$ denote the queuing time of device n in time slot t; then the total update latency is

$$L_{n,t} = l^u_{n,t} + l^q_{n,t} + l^c_{n,t} + l^d_{n,t}. \quad (6.32)$$
Note that the updating process should be finished within each time slot, so we have

$$L_{n,t} \le \tau. \quad (6.33)$$
With the predicted inference accuracy and model update latency, we obtain the estimated average inference accuracy of all the devices in time slot t as

$$R(t) = \frac{1}{\tau N} \sum_{n=1}^{N} \left\{ \delta_t^{-} L_{n,t} + \delta_{n,t} \left( \tau - L_{n,t} \right) \right\}. \quad (6.34)$$

The objective is then to maximize the long-term average inference accuracy:

$$\max_{\alpha_{n,t},\, \beta_{n,t}} \frac{1}{T} \sum_{t=1}^{T} R(t). \quad (6.35)$$
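To make the latency and reward model concrete, the sketch below evaluates Eqs. (6.29)-(6.34) for a single device and a single BS. Every numeric value is an assumption chosen only to satisfy the deadline constraint of Eq. (6.33).

```python
# Illustrative single-device, single-BS evaluation; all values assumed.
f, s, tau = 30, 0.5e6, 10.0          # fps, bits per frame, slot length (s)
alpha = (0.02, 0.25, 20)             # (c_u, c_d, c_e) configuration
S_d = 5.6e7                          # student model size (bits)
B_u, B_d = 8e6, 16e6                 # up/down link rates (bps)
cg, w_m, w_hat = 12.0, 0.5, 1.0      # GPU seconds, server/reference capability
l_q = 1.2                            # queuing delay at the edge server (s)

S_nt = f * s * tau                   # video data generated in a slot
l_u = alpha[0] * S_nt / B_u          # Eq. (6.29)
l_d = alpha[1] * S_d / B_d           # Eq. (6.30)
l_c = cg * w_m / w_hat               # Eq. (6.31)
L = l_u + l_q + l_c + l_d            # Eq. (6.32)
assert L <= tau                      # Eq. (6.33)

delta_old, delta_new = 0.62, 0.71    # stale vs. updated accuracy
R = (delta_old * L + delta_new * (tau - L)) / tau  # Eq. (6.34) with N = 1
print(f"update latency {L:.2f} s, slot reward {R:.3f}")
```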
In this section, we present the CEM-MASAC algorithm for solving the optimization problem; its architecture is shown in Fig. 6.32. In CEM-MASAC, users, i.e., agents, take their own actions with a soft value function to interact with the environment, based on the SAC introduced in Sect. 3.3.3. CEM-MASAC also leverages a cross-entropy method to further explore the optimal action through population evolution, which aims at avoiding local optima and improving exploration efficiency.
We formulate the problem as a POMDP $\langle E, S, A, R \rangle$:
• Environment $E$: In this chapter, let $e_t \in E$ denote the environment, including the network conditions, the computation resources of all edge servers, and the maximum accuracy improvements of all devices:

$$e_t = \left( B^u_{1,t}, \cdots, B^u_{N,t}, B^d_{1,t}, \cdots, B^d_{N,t}, W, \delta^*_t \right). \quad (6.36)$$

• Action $A$: The action $a_{n,t} \in A$ of device n in time slot t contains the configuration selection decision and the offloading decision, i.e., $a_{n,t} = (\alpha_{n,t}, \beta_{n,t})$.
• Reward $R$: We define the reward as the average inference accuracy of the devices according to Eq. (6.34). After all the agents take actions in each time slot, the environment returns an immediate reward $r_t = R(t)$.
$$\mu_{new} = \sum_{i=1}^{K/2} \lambda_i \phi_i \quad (6.40)$$

and

$$\sigma^2_{new} = \sum_{i=1}^{K/2} \lambda_i \left( \phi_i - \mu \right)^2 + \epsilon. \quad (6.41)$$

Here, $\lambda_i$ is the weight given to the i-th elite individual, and $\epsilon$ is the noise added to the usual covariance update to prevent premature convergence.
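The population update of Eqs. (6.40) and (6.41) follows the usual cross-entropy method. A minimal sketch is given below; the rank-based elite weights and the use of the new mean in the variance update are common choices assumed here.

```python
import numpy as np

def cem_update(population, fitness, eps=1e-3):
    # Keep the top half of the population, weight the elites, and refit
    # the sampling distribution (Eqs. 6.40-6.41).
    K = len(population)
    elite = population[np.argsort(fitness)[::-1][:K // 2]]
    lam = np.log(K / 2 + 0.5) - np.log(np.arange(1, K // 2 + 1))
    lam /= lam.sum()                  # rank-based elite weights
    mu_new = np.sum(lam[:, None] * elite, axis=0)
    var_new = np.sum(lam[:, None] * (elite - mu_new) ** 2, axis=0) + eps
    return mu_new, var_new

# Usage: a population of 10 candidate parameter vectors of dimension 4.
pop = np.random.randn(10, 4)
fit = np.random.rand(10)
mu, var = cem_update(pop, fit)
```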
Datasets We used two datasets for the evaluation of our method: cityscapes [51] (driving in Frankfurt, 46 minutes long) and A2D2 [52] (2 videos, 25 minutes in total), which cover a variety of scenes captured with fixed cameras and with cameras moving at walking and driving speeds. We split each video into several 10-second segments. The upload and download transmission rates were set based on two sets of 1200 traces from the real-world FCC communication traces [53], which range from 1 Mbps to 10 Mbps and from 1 Mbps to 20 Mbps, respectively.
Inference Models We considered the semantic segmentation tasks in our system.
We used deeplabv3+ [50] with Xception65 and mobilenetv2 as the backbone to
simulate the teacher model and student model, respectively. We used deeplabv3+
with Xception65 to label the video frames as the ground truth, which were then
used to supervise the student model training.
Evolutionary Deep Reinforcement Learning Hyperparameters We set the learning rate of the critic part to $10^{-3}$ and the future reward discount $\gamma$ to 0.99. The population size was set to 10, and each population selected the top-half fittest individuals as the elite individuals.
Retraining Configurations We used the set $C^u = \{0.01, 0.02, 0.05, 0.1, 0.2\}$ for the ratio of sampled frames sent to the edge server, the set $C^d = \{0.05, 0.1, 0.25, 1\}$ for the ratio of unfrozen layers of the student model, and the set $C^e = \{10, 20, 30, 40, 50\}$ for the number of retraining epochs. The experiment setups are shown in Table 6.3.
Baseline We compared the performance of our method with the following meth-
ods:
• No Update: Each mobile device performs the inference task without offloading
the model updating task to the edge server.
• R-F(α1 ): Each mobile device offloads the model updating task to a random
edge server with a fixed retraining configuration, i.e., 2% training frames, 25%
unfrozen layers, and 20 epochs, to simulate a low-cost retraining configuration.
Note that AMS considers a single BS MEC system and executes the model
updating tasks in a polling manner. Since our environment is a heterogeneous
MEC network, we add random offloading to AMS for fairness.
• R-F(α2 ): Each mobile device offloads the model updating task to a random
edge server with a fixed retraining configuration, i.e., 5% training frames, 100%
unfrozen layers, and 40 epochs, to simulate a high-cost retraining configuration.
• S-O: Each mobile device makes the optimal decision independently without
considering the resource contention by traversing the potential solution space.
Fig. 6.33 The average inference accuracy of 12 devices with 3 BSs on cityscapes dataset
Fig. 6.34 The average inference accuracy of 12 devices with 3 BSs on A2D2 dataset
No Update. With R-F, devices randomly offload the model updating tasks to edge
servers and get random average inference accuracy improvement. It is also found
that the average inference accuracy of S-O improves by up to 4.42% compared to No Update, which is similar to R-F. This is because the devices suffer from resource contention, and some devices cannot update their models in time. Our method outperforms all the other baseline methods and shows an accuracy improvement of up to 9.24% compared to R-F, because it jointly considers the available resources and network conditions to adaptively make updating decisions, while the other methods do not take both into account.
Since the edge servers are heterogeneous in our scenario, we study three com-
binations of edge servers with a fixed number of devices, i.e., 12 devices. The
combinations are shown in Table 6.4. Different combinations present different
numbers and different resources of edge servers. As shown in Figs. 6.35 and 6.36, it
can be found that with more powerful edge servers, the average inference accuracy of our method is higher. This is because more edge servers mean more resources for each device in a time slot, leading to greater accuracy improvements. It can also be found that S-O does not yield as many accuracy improvements as our method, because most devices offload to the same edge server without considering resource contention, and their updates cannot be finished in time. Our method makes the optimal decision as the edge resources change and achieves the best accuracy improvement compared to the other methods.
Fig. 6.35 The rewards with different combinations of edge servers on cityscapes dataset
References
1. Zhengxia Zou, Zhenwei Shi, Yuhong Guo, and Jieping Ye. Object detection in 20 years: A
survey. CoRR, abs/1905.05055, 2019.
2. Tejalal Choudhary, Vipul Mishra, Anurag Goswami, and Jagannathan Sarangapani. A compre-
hensive survey on model compression and acceleration. Artif. Intell. Rev., 53(7):5113–5155,
2020.
3. Lei Deng, Guoqi Li, Song Han, Luping Shi, and Yuan Xie. Model compression and hardware
acceleration for neural networks: A comprehensive survey. Proc. IEEE, 108(4):485–532, 2020.
4. Jangwon Lee, Jingya Wang, David J. Crandall, Selma Sabanovic, and Geoffrey C. Fox. Real-
time, cloud-based object detection for unmanned aerial vehicles. In First IEEE International
Conference on Robotic Computing, IRC 2017, Taichung, Taiwan, April 10–12, 2017, pages
36–43. IEEE Computer Society, 2017.
5. Yiwen Han, Xiaofei Wang, Victor C. M. Leung, Dusit Niyato, Xueqiang Yan, and Xu Chen.
Convergence of edge computing and deep learning: A comprehensive survey. CoRR,
abs/1907.08349, 2019.
6. Zhi Zhou, Xu Chen, En Li, Liekang Zeng, Ke Luo, and Junshan Zhang. Edge intelligence:
Paving the last mile of artificial intelligence with edge computing. Proceedings of the IEEE,
107(8):1738–1762, 2019.
7. Surat Teerapittayanon, Bradley McDanel, and H. T. Kung. Distributed deep neural networks
over the cloud, the edge and end devices. In 37th IEEE International Conference on Distributed
Computing Systems, ICDCS 2017, Atlanta, GA, USA, June 5–8, 2017, pages 328–339. IEEE
Computer Society, 2017.
8. Chuang Hu, Wei Bao, Dan Wang, and Fengming Liu. Dynamic adaptive DNN surgery for
inference acceleration on the edge. In 2019 IEEE Conference on Computer Communications,
INFOCOM 2019, Paris, France, April 29–May 2, 2019, pages 1423–1431. IEEE, 2019.
9. Shigeng Zhang, Yinggang Li, Xuan Liu, Song Guo, Weiping Wang, Jianxin Wang, Bo Ding,
and Di Wu. Towards real-time cooperative deep inference over the cloud and edge end devices.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 4(2):69:1–69:24, 2020.
10. Mikolaj Jankowski, Deniz Gündüz, and Krystian Mikolajczyk. Joint device-edge inference
over wireless links with pruning. In 21st IEEE International Workshop on Signal Processing
Advances in Wireless Communications, SPAWC 2020, Atlanta, GA, USA, May 26–29, 2020,
pages 1–5. IEEE, 2020.
11. Wuyang Zhang, Zhezhi He, Luyang Liu, Zhenhua Jia, Yunxin Liu, Marco Gruteser, Dipankar
Raychaudhuri, and Yanyong Zhang. Elf: accelerate high-resolution mobile deep vision with
content-aware parallel offloading. In Proceedings of the 27th Annual International Conference
on Mobile Computing and Networking, pages 201–214, 2021.
12. Rafael Stahl, Zhuoran Zhao, Daniel Mueller-Gritschneder, Andreas Gerstlauer, and Ulf
Schlichtmann. Fully distributed deep learning inference on resource-constrained edge devices.
In Embedded Computer Systems: Architectures, Modeling, and Simulation—19th International
Conference, SAMOS 2019, Samos, Greece, July 7–11, 2019, Proceedings, volume 11733 of
Lecture Notes in Computer Science, pages 77–90. Springer, 2019.
13. Li Zhou, Mohammad Hossein Samavatian, Anys Bacha, Saikat Majumdar, and Radu Teodor-
escu. Adaptive parallel execution of deep neural networks on heterogeneous edge devices.
In Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, SEC 2019, Arlington,
Virginia, USA, November 7–9, 2019, pages 195–208. ACM, 2019.
14. Thaha Mohammed, Carlee Joe-Wong, Rohit Babbar, and Mario Di Francesco. Distributed
inference acceleration with adaptive DNN partitioning and offloading. In 39th IEEE Confer-
ence on Computer Communications, INFOCOM 2020, Toronto, ON, Canada, July 6–9, 2020,
pages 854–863. IEEE, 2020.
15. Zhuoran Zhao, Kamyar Mirzazad Barijough, and Andreas Gerstlauer. DeepThings: Distributed
adaptive deep learning inference on resource-constrained IoT edge clusters. IEEE Trans.
Comput. Aided Des. Integr. Circuits Syst., 37(11):2348–2359, 2018.
16. Sai Qian Zhang, Jieyu Lin, and Qi Zhang. Adaptive distributed convolutional neural network
inference at the network edge with ADCNN. In ICPP 2020: 49th International Conference on
Parallel Processing, Edmonton, AB, Canada, August 17–20, 2020, pages 10:1–10:11. ACM,
2020.
17. Duanyang Li, Zhihui Ke, and Xiaobo Zhou. MASS: multi-edge assisted fast object detection
for autonomous mobile vision in heterogeneous edge networks. In Periklis Chatzimisios,
Rodolfo W. L. Coutinho, and Mirela Notare, editors, Q2SWinet 2021: Proceedings of the 17th
ACM Symposium on QoS and Security for Wireless and Mobile Networks, Alicante, Spain,
November 22–26, 2021, pages 61–68. ACM, 2021.
18. En Li, Liekang Zeng, Zhi Zhou, and Xu Chen. Edge AI: on-demand accelerating deep neural
network inference via edge computing. IEEE Trans. Wireless Communications, 19(1):447–457,
2020.
19. Shaoqing Ren, Kaiming He, Ross B. Girshick, and Jian Sun. Faster R-CNN: towards real-
time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell.,
39(6):1137–1149, 2017.
20. Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott E. Reed, Cheng-Yang
Fu, and Alexander C. Berg. SSD: single shot multibox detector. In Computer Vision—
ECCV 2016—14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016,
Proceedings, Part I, volume 9905 of Lecture Notes in Computer Science, pages 21–37.
Springer, 2016.
21. Joseph Redmon, Santosh Kumar Divvala, Ross B. Girshick, and Ali Farhadi. You only look
once: Unified, real-time object detection. In 2016 IEEE Conference on Computer Vision and
Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27–30, 2016, pages 779–788.
IEEE Computer Society, 2016.
22. Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, and Jian Sun. Shufflenet V2: practical
guidelines for efficient CNN architecture design. In Computer Vision—ECCV 2018—15th
European Conference, Munich, Germany, September 8–14, 2018, Proceedings, Part XIV,
volume 11218 of Lecture Notes in Computer Science, pages 122–138. Springer, 2018.
23. Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, and Jan Kautz. Pruning convo-
lutional neural networks for resource efficient inference. In 5th International Conference on
Learning Representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track
Proceedings. OpenReview.net, 2017.
24. Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkate-
san, Brucek Khailany, Joel S. Emer, Stephen W. Keckler, and William J. Dally. SCNN: an
accelerator for compressed-sparse convolutional neural networks. In Proceedings of the 44th
Annual International Symposium on Computer Architecture, ISCA 2017, Toronto, ON, Canada,
June 24–28, 2017, pages 27–40. ACM, 2017.
25. Norman P. Jouppi, Cliff Young, Nishant Patil, et al. In-datacenter performance analysis
of a tensor processing unit. In Proceedings of the 44th Annual International Symposium on
Computer Architecture, ISCA 2017, Toronto, ON, Canada, June 24–28, 2017, pages 1–12.
ACM, 2017.
26. Tsung-Yi Lin, Michael Maire, Serge J. Belongie, James Hays, Pietro Perona, Deva Ramanan,
Piotr Dollár, and C. Lawrence Zitnick. Microsoft COCO: common objects in context. In
Computer Vision—ECCV 2014—13th European Conference, Zurich, Switzerland, September
6–12, 2014, Proceedings, Part V, volume 8693 of Lecture Notes in Computer Science, pages
740–755. Springer, 2014.
27. Florian A. Schiegg, Ignacio Llatser, Daniel Bischoff, and Georg Volk. Collective perception:
A safety perspective. Sensors, 21(1):159, 2021.
28. Qi Chen, Sihai Tang, Qing Yang, and Song Fu. Cooper: Cooperative perception for connected
autonomous vehicles based on 3d point clouds. In 39th IEEE International Conference on
Distributed Computing Systems, Dallas, TX, USA, pages 514–524, 2019.
29. Velodyne lidar hdl-64e. https://www.velodynelidar.com/hdl-64e.html.
30. Jingda Guo, Dominic Carrillo, Sihai Tang, Qi Chen, Qing Yang, Song Fu, Xi Wang, Nannan
Wang, and Paparao Palacharla. CoFF: cooperative spatial feature fusion for 3D object detection
on autonomous vehicles. IEEE Internet of Things Journal, 8(14):11078–11087, 2021.
31. Moreno Ambrosin, Ignacio J. Alvarez, Cornelius Bürkle, Lily L. Yang, Fabian Oboril,
Manoj R. Sastry, and Kathiravetpillai Sivanesan. Object-level perception sharing among
connected vehicles. In IEEE Intelligent Transportation Systems Conference, Auckland, New
Zealand, pages 1566–1573, 2019.
32. Zijian Zhang, Shuai Wang, Yuncong Hong, Liangkai Zhou, and Qi Hao. Distributed dynamic
map fusion via federated learning for intelligent networked vehicles. In IEEE International
Conference on Robotics and Automation, Xi’an, China, pages 953–959, 2021.
33. Qi Chen, Xu Ma, Sihai Tang, Jingda Guo, Qing Yang, and Song Fu. F-cooper: feature based
cooperative perception for autonomous vehicle edge computing system using 3d point clouds.
In Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, Arlington, Virginia,
USA, pages 88–100, 2019.
34. Ehsan Emad Marvasti, Arash Raftari, Amir Emad Marvasti, Yaser P. Fallah, Rui Guo, and
Hongsheng Lu. Cooperative LIDAR object detection via feature sharing in deep networks. In
92nd IEEE Vehicular Technology Conference, Victoria, BC, Canada, pages 1–7, 2020.
35. Qi Xie, Xiaobo Zhou, Tie Qiu, Qingyu Zhang, and Wenyu Qu. Soft actor-critic-based
multilevel cooperative perception for connected autonomous vehicles. IEEE Internet of Things
Journal, 9(21):21370–21381, 2022.
36. High performance INS for ADAS and autonomous vehicle testing. https://www.oxts.com/
products/rt3000-v3/.
37. Verizon hyper precise location. https://thingspace.verizon.com/services/hyper-precise-
location/.
38. Yu Feng, Shaoshan Liu, and Yuhao Zhu. Real-time spatio-temporal LiDAR point cloud
compression. In IEEE/RSJ International Conference on Intelligent Robots and Systems, Las
Vegas, NV, USA, pages 10766–10773, 2020.
39. Hansong Wang, Xi Li, Hong Ji, and Heli Zhang. Federated offloading scheme to minimize
latency in MEC-enabled vehicular networks. In IEEE Globecom Workshops, Abu Dhabi,
United Arab Emirates, pages 1–6, 2018.
40. Mark Everingham, Luc Van Gool, Christopher K. I. Williams, John M. Winn, and Andrew
Zisserman. The Pascal Visual Object Classes (VOC) challenge. Int. J. Comput. Vis., 88(2):303–
338, 2010.
41. Andreas Geiger, Philip Lenz, and Raquel Urtasun. Are we ready for autonomous driving?
the KITTI vision benchmark suite. In IEEE Conference on Computer Vision and Pattern
Recognition, Providence, RI, USA, pages 3354–3361, 2012.
42. Qi Chen, Sihai Tang, Jacob Hochstetler, Jingda Guo, Yuan Li, Jinbo Xiong, Qing Yang,
and Song Fu. Low-latency high-level data sharing for connected and autonomous vehicular
networks. In IEEE International Conference on Industrial Internet, Orlando, FL, USA, pages
287–296, 2019.
43. Ehsan Emad Marvasti, Arash Raftari, Amir Emad Marvasti, and Yaser P. Fallah. Bandwidth-
adaptive feature sharing for cooperative LIDAR object detection. In 3rd IEEE Connected and
Automated Vehicles Symposium, Victoria, BC, Canada, pages 1–7, 2020.
44. Bin Dai, Fanglin Xu, Yuanyuan Cao, and Yang Xu. Hybrid sensing data fusion of cooperative
perception for autonomous driving with augmented vehicular reality. IEEE Systems Journal,
15(1):1413–1422, 2021.
45. Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. Realtime multi-person 2d pose
estimation using part affinity fields. In IEEE Conference on Computer Vision and Pattern
Recognition, CVPR 2017, Honolulu, HI, USA, July 21–26, pages 1302–1310, 2017.
46. Weisong Shi, Jie Cao, Quan Zhang, Youhuizi Li, and Lanyu Xu. Edge computing: Vision and
challenges. IEEE Internet of Things Journal, 3(5):637–646, 2016.
47. Lin Wang and Kuk-Jin Yoon. Knowledge distillation and student-teacher learning for visual
intelligence: A review and new outlooks. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 44(6):3048–3068, 2022.
48. Mehrdad Khani Shirkoohi, Pouya Hamadanian, Arash Nasr-Esfahany, and Mohammad
Alizadeh. Real-time video inference on edge devices via adaptive model streaming. In
IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada,
October 10–17, pages 4552–4562, 2021.
49. Romil Bhardwaj, Zhengxu Xia, Ganesh Ananthanarayanan, Junchen Jiang, Yuanchao Shu,
Nikolaos Karianakis, Kevin Hsieh, Paramvir Bahl, and Ion Stoica. Ekya: Continuous learning
of video analytics models on edge compute servers. In 19th USENIX Symposium on Networked
Systems Design and Implementation, NSDI 2022, Renton, WA, USA, April 4–6, pages 119–135,
2022.
50. Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam.
Encoder-decoder with atrous separable convolution for semantic image segmentation. In
European Conference on Computer Vision, ECCV 2018, Munich, Germany, September 8–14,
volume 11211, pages 833–851, 2018.
51. Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler,
Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele. The cityscapes dataset for
semantic urban scene understanding. In IEEE Conference on Computer Vision and Pattern
Recognition, CVPR 2016, Las Vegas, NV, USA, June 27–30, pages 3213–3223, 2016.
52. Jakob Geyer, Yohannes Kassahun, Mentar Mahmudi, Xavier Ricou, Rupesh Durgesh,
Andrew S. Chung, Lorenz Hauswald, Viet Hoang Pham, Maximilian Mühlegg, Sebastian
Dorn, Tiffany Fernandez, Martin Jänicke, Sudesh Mirashi, Chiragkumar Savani, Martin Sturm,
Oleksandr Vorobiov, Martin Oelker, Sebastian Garreis, and Peter Schuberth. A2D2: Audi
Autonomous Driving Dataset. CoRR, abs/2004.06320, 2020.
53. Federal Communications Commission. Raw Data - Measuring Broadband America, 2016. https://www.fcc.gov/reports/research/reports/measuring-broadband-america/raw-data-measuring-broadband-america-20.
Chapter 7
Future Research Directions
As we delve into the future of industrial edge computing, the convergence of the-
oretical research and practical applications becomes a pivotal focus for innovation.
This field, at the forefront of technological progress, requires a seamless integration
of theoretical concepts and real-world implementations. Our exploration of future
research directions highlights the incorporation of emerging technologies such as
digital twins and collaborative cloud–edge data analysis, applied in industrial
contexts such as smart manufacturing and predictive maintenance. This approach
not only advances theoretical knowledge but also anchors these developments in
practical, industry-specific use cases. Through this lens, we navigate the evolving
realm of industrial edge computing to uncover innovative solutions and strategies
that will define the future of industrial technology.
7.1 Theory Exploration for Future Directions
The integration of digital twin technology into industrial edge computing systems
marks a significant shift in our approach [1]. Digital twins, as detailed virtual models
of physical entities, greatly enhance real-time monitoring and control capabilities
at the edge [2]. This section examines the theoretical foundations of this integration
and its practical implications for real-time industrial monitoring and control.
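To make the pattern concrete, the following minimal Python sketch shows the core
idea of an edge-hosted twin: a virtual replica mirrors sensor readings and answers
low-latency health queries locally, without a cloud round trip. The machine fields,
thresholds, and names (MachineTwin, ingest, health_alerts) are illustrative
assumptions, not an implementation prescribed by this chapter.

# A minimal sketch of an edge-hosted digital twin; fields and thresholds
# are hypothetical placeholders.
from dataclasses import dataclass, field
import time

@dataclass
class MachineTwin:
    """Virtual replica of one physical machine, kept at the edge node."""
    machine_id: str
    temperature_c: float = 0.0
    vibration_mm_s: float = 0.0
    last_sync: float = field(default_factory=time.time)

    def ingest(self, temperature_c: float, vibration_mm_s: float) -> None:
        # Mirror the latest sensor reading into the twin's state.
        self.temperature_c = temperature_c
        self.vibration_mm_s = vibration_mm_s
        self.last_sync = time.time()

    def health_alerts(self) -> list[str]:
        # Local, low-latency checks run against the twin instead of the
        # physical asset; all thresholds here are illustrative.
        alerts = []
        if self.temperature_c > 85.0:
            alerts.append("overheating")
        if self.vibration_mm_s > 7.1:
            alerts.append("excessive vibration")
        if time.time() - self.last_sync > 5.0:
            alerts.append("stale twin state")
        return alerts

twin = MachineTwin("press-01")
twin.ingest(temperature_c=92.3, vibration_mm_s=3.4)
print(twin.health_alerts())  # ['overheating']

Because the checks run against the replica, control loops can react within the edge
node's latency budget even when the cloud link is slow or unavailable.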
The synergistic relationship between cloud and edge computing creates a fertile
environment for collaborative data analysis, an area with significant implications
for system efficiency and responsiveness [5–8]. Investigating the intersection of
cloud–edge collaboration in data analysis is therefore crucial. This section delves
into the theoretical complexities of cross-platform data processing and analysis,
exploring how the two paradigms can work together to optimize data processing,
storage, and analytics and thereby harness the full potential of both cloud and
edge computing in industrial applications. Key research directions include:
• Energy Efficiency and Sustainability: Optimizing the energy efficiency of
cloud–edge collaborative data analysis, particularly on edge devices, is a critical
area for future research. This includes exploring the use of renewable energy
sources and energy-saving technologies to reduce the overall energy consump-
tion of cloud–edge systems. Pursuing sustainable data analysis practices is
paramount, not only to minimize environmental impact but also to ensure the
long-term viability and cost-effectiveness of industrial edge computing solutions.
Each of these research directions holds the potential to significantly advance the
field of cloud–edge collaborative data analysis, addressing current limitations and
unlocking new possibilities for industrial edge computing.
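As a concrete illustration of this division of labor, the minimal Python sketch
below reduces raw sensor windows to compact summaries at the edge and uploads them
in batches, trading upload energy against freshness, while the cloud would consume
the summaries for heavyweight analytics. The function names (summarize,
upload_to_cloud, edge_loop) and the window and batch sizes are assumptions for
illustration only.

# A hedged sketch of cloud-edge task splitting with energy-aware batching.
import statistics

def summarize(window: list[float]) -> dict:
    # Edge-side reduction: raw window -> small feature vector (~100x smaller).
    return {
        "n": len(window),
        "mean": statistics.fmean(window),
        "max": max(window),
        "stdev": statistics.pstdev(window),
    }

def upload_to_cloud(batch: list[dict]) -> None:
    # Stand-in for a real transport (MQTT, HTTP, ...); printing keeps the
    # sketch self-contained.
    print(f"uploading {len(batch)} summaries")

def edge_loop(stream, window_size=100, batch_size=10):
    window, batch = [], []
    for sample in stream:
        window.append(sample)
        if len(window) == window_size:
            batch.append(summarize(window))
            window = []
        if len(batch) == batch_size:   # batch to amortize radio wake-ups
            upload_to_cloud(batch)
            batch = []
    # A production loop would also flush any leftover partial batch here.

edge_loop(float(i % 50) for i in range(2000))

Batching fewer, larger uploads is one of the simplest levers for reducing the
energy consumed by edge radios, directly serving the sustainability goal above.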
The third dimension of this exploration focuses on the crucial aspect of real-time
communication optimization in edge computing, with an emphasis on minimizing
data transmission. This involves theoretical considerations aimed at developing
advanced communication protocols and algorithms [9, 10]. The goal is to reduce
latency and bandwidth usage, ensuring quick and accurate information exchange
with minimal data payload. This section will not only delve into the theoretical
underpinnings but also suggest potential methodologies for achieving efficient real-
time communication in industrial edge computing environments.
In today’s world, where timely information is paramount, the need for real-
time communication, coupled with the necessity to minimize data transmission,
has become a key area of focus [11]. “Real-time Communication with Less
Data” represents a significant shift in approach, recognizing the importance of
instant information exchange while optimizing network resource utilization. The
theoretical exploration of this concept includes the following aspects:
• Low Latency Protocols: The theoretical underpinnings revolve around the
development of low latency communication protocols. These protocols aim
to minimize the time it takes for data to traverse the network, ensuring that
information reaches its destination with the utmost promptness.
• Bandwidth Optimization Strategies: The exploration includes strategies for
optimizing bandwidth usage. This involves the development of compression
algorithms, data aggregation techniques, and other methodologies to reduce the
amount of data transmitted without compromising the integrity and accuracy of
the information.
• Edge Computing for Localized Communication: Real-time communication
is enhanced by leveraging edge computing capabilities. Edge devices facilitate
localized communication, allowing critical information to be exchanged without
the need for extensive data transfer to centralized servers, thereby reducing
latency.
• Prioritized Data Transmission: The theoretical framework includes mechanisms
for prioritizing data transmission. By discerning between critical and
noncritical data, the communication system can ensure that essential information
is transmitted swiftly, while less time-sensitive data can follow, optimizing the
use of network resources (a minimal sketch combining this idea with the dynamic
QoS adaptation below follows this list).
• Dynamic QoS Adaptation: The exploration involves the theoretical development
of dynamic quality-of-service (QoS) adaptation mechanisms. These mechanisms
enable the communication system to adjust its parameters based on current
network conditions, ensuring optimal performance in varying situations.
• Security Measures: The theoretical exploration also encompasses security
measures. Efficient real-time communication requires robust security protocols
to safeguard the transmitted data from unauthorized access and potential threats.
This involves encryption, authentication, and other security measures integrated
into the communication framework.
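To ground two of these ideas, the minimal Python sketch below gives each message
a priority class and lets a crude QoS rule defer bulk traffic whenever measured
round-trip time exceeds a threshold. The priority classes, the 50 ms threshold,
and the function names are assumptions for illustration, not a standardized
protocol.

# Priority-aware sending with a simple dynamic QoS knob.
import heapq
import itertools

CRITICAL, NORMAL, BULK = 0, 1, 2
_seq = itertools.count()          # tie-breaker keeps FIFO order per class

queue: list[tuple[int, int, bytes]] = []

def enqueue(priority: int, payload: bytes) -> None:
    heapq.heappush(queue, (priority, next(_seq), payload))

def drain(measured_rtt_ms: float, budget: int) -> list[bytes]:
    # Dynamic QoS adaptation: under congestion, bulk data is deferred.
    allow_bulk = measured_rtt_ms < 50.0
    sent, deferred = [], []
    while queue and len(sent) < budget:
        prio, seq, payload = heapq.heappop(queue)
        if prio == BULK and not allow_bulk:
            deferred.append((prio, seq, payload))
            continue
        sent.append(payload)
    for item in deferred:          # keep deferred bulk for the next round
        heapq.heappush(queue, item)
    return sent

enqueue(BULK, b"firmware-chunk")
enqueue(CRITICAL, b"emergency-stop")
print(drain(measured_rtt_ms=80.0, budget=2))  # [b'emergency-stop']

In the demo, the congested link (80 ms round-trip time) lets the critical message
through while the firmware chunk stays queued until conditions improve.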
In conclusion, the pursuit of real-time communication with minimal data is a
crucial theoretical response to the growing need for instant information exchange
in today’s highly connected world. This exploration goes beyond addressing the
technical complexities of reducing latency and optimizing bandwidth. It sets the
groundwork for a communication framework that is not only adaptive and secure but
also precisely calibrated to meet the requirements of contemporary interconnected
systems. This endeavor is pivotal in ensuring that industrial edge computing systems
can operate efficiently, responsively, and securely in an environment where rapid
data exchange is increasingly vital.
7.2 Application Scenarios
Industrial edge computing underpins a broad range of scenarios, from prognostics
and health management and smart grids to manufacturing, intelligent connected
vehicles, and logistics and supply chain management, ensuring efficient resource
utilization and route planning. These
application scenarios are illustrated in Fig. 7.1, highlighting the wide-ranging and
transformative impact of edge computing across different industrial sectors.
In prognostics and health management (PHM), for example, edge computing supports
feature extraction [17] and anomaly detection in railway tracks. This facilitates
monitoring the real-time performance of railways and trains, predicting potential
failures to prevent downtime and support optimization decisions. Additionally,
drones are being employed as a source of information for railway track detection,
enhancing the scope of edge computing in PHM [18].
The smart grid stands as a prime example of industrial edge computing in action.
Its primary goal is to facilitate node monitoring and information exchange for the
transmission of electrical energy from power plants to end users [19, 20]. The
smart grid offers significant advancements over traditional power grids through
the integration of various aspects of power production, transmission, distribution,
and security protection using advanced information technology. This integration
allows both power grid companies and users to access real-time information about
the status of the grid and electricity consumption, thereby enhancing the overall
efficiency of the power system.
In modern smart grids, a vast array of smart meters and various types of sensing
devices are deployed. This results in a complex overall structure with heterogeneous
data types and substantial instantaneous data volumes. To address these challenges,
edge servers can be deployed near smart meters and sensing devices. These servers
perform data analysis and make partial decisions, facilitating regional equipment
management and energy efficiency optimization. This approach not only improves
management efficiency but also meets real-time operational requirements. The edge
servers collect data essential for equipment maintenance and structural optimization
and then upload it to the cloud center for centralized processing, analysis, and
training [21].
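The aggregation pattern just described can be made concrete with a short Python
sketch: an edge server near a group of smart meters sums raw readings per
interval and forwards only the regional aggregate plus flagged anomalies to the
cloud. The reading format, the spike_factor heuristic, and the function name are
assumptions for illustration, not part of any metering standard.

# Edge-side aggregation and partial decision-making for one meter region.
from collections import defaultdict

def aggregate_region(readings: list[dict], spike_factor: float = 2.0) -> dict:
    """readings: [{'meter': 'm1', 'kwh': 1.2}, ...] for one interval."""
    by_meter = defaultdict(float)
    for r in readings:
        by_meter[r["meter"]] += r["kwh"]
    total = sum(by_meter.values())
    mean = total / len(by_meter)
    # Partial decision at the edge: flag meters far above the regional mean.
    anomalies = [m for m, kwh in by_meter.items() if kwh > spike_factor * mean]
    return {"total_kwh": total, "meters": len(by_meter), "anomalies": anomalies}

interval = [{"meter": "m1", "kwh": 1.1}, {"meter": "m2", "kwh": 0.9},
            {"meter": "m3", "kwh": 9.7}]
print(aggregate_region(interval))
# -> {'total_kwh': 11.7, 'meters': 3, 'anomalies': ['m3']}

Only the compact aggregate crosses the backhaul, which keeps instantaneous data
volumes manageable while still letting the cloud perform centralized training.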
A smart grid system based on industrial edge computing can intelligently detect
grid structures [22], distribute computing, storage, and control services to the edge
network, and effectively allocate intelligent resources of the entire power system
closer to end users [23]. Such a system can support high-demand functions like
intelligent low-voltage area management, user power management, and monitoring
of external force damage risks [24], showcasing the transformative potential of edge
computing in enhancing smart grid capabilities.
7.2.3 Intelligent Connected Vehicles
The emergence of the 5G era heralds significant advancements in the field of
Intelligent Connected Vehicles (ICVs), which are poised to become a crucial
scenario in industrial
edge computing [30]. A core solution for ICV in this new era is the collaboration
between edge and cloud computing. Cloud computing acts as a super brain for
vehicles, handling complex processes such as area-wide traffic forecasting [31, 32],
while edge computing functions like the vehicles’ nerve endings, performing more
immediate and “subconscious” reactions such as gathering driving information
about nearby cars or initiating automatic emergency braking.
A key focus within industrial edge computing applied to ICV is autonomous
driving. Regional autonomous driving, which is relatively straightforward, involves
pre-planning the path and speed of a vehicle based on the environmental information
of the entire running area. This enables automatic vehicle operation within a
designated area, such as a small amusement park. In situations like the sudden entry of
pedestrians or other vehicles, edge computing swiftly processes image information
from cameras against onboard road data to facilitate immediate responses.
However, adaptive autonomous driving in varying environments is considerably
more complex, encompassing diverse scenarios such as cruising, lane-changing
assistance, navigating intersections, automatic parking, speed control, and path
planning. In these situations, the detection of surrounding vehicles, identification
of traffic signals, and response to emergency obstacles require prompt processing,
which cannot afford the delays of cloud data uploading. Thus, industrial edge com-
puting becomes the processing hub in these scenarios, determining the prioritization
of events for processing either on the edge server or the onboard system.
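A toy Python sketch of this prioritization follows: each perception event carries
a latency budget, and the placement rule picks the fastest tier (onboard, edge
server, or cloud) that can meet the deadline. The budgets, tier speedups, and the
place function are illustrative assumptions, not the scheduling policy of a real
ICV stack.

# Deadline-driven placement of ICV events across onboard/edge/cloud tiers.
from typing import NamedTuple

class Event(NamedTuple):
    name: str
    latency_budget_ms: float   # deadline for a useful reaction
    compute_ms_onboard: float  # estimated local processing time

def place(event: Event, uplink_rtt_ms: float = 15.0,
          cloud_rtt_ms: float = 80.0) -> str:
    # Estimated end-to-end latency at each tier; the 4x edge and 20x cloud
    # compute speedups are assumed for illustration.
    options = [
        (event.compute_ms_onboard, "onboard"),
        (uplink_rtt_ms + event.compute_ms_onboard / 4, "edge server"),
        (cloud_rtt_ms + event.compute_ms_onboard / 20, "cloud"),
    ]
    feasible = [(lat, site) for lat, site in options
                if lat <= event.latency_budget_ms]
    # Prefer the fastest feasible tier; fall back to onboard best effort.
    return min(feasible)[1] if feasible else "onboard (best effort)"

for ev in [Event("emergency braking", 20, 5),
           Event("intersection negotiation", 100, 120),
           Event("area-wide traffic forecast", 5000, 2000)]:
    print(ev.name, "->", place(ev))
# emergency braking -> onboard; intersection negotiation -> edge server;
# area-wide traffic forecast -> cloud

Under these assumptions, safety-critical reactions stay onboard, moderately heavy
perception shifts to the roadside edge, and area-wide analytics go to the cloud,
mirroring the "super brain versus nerve endings" division described above.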
Beyond driving functionalities, onboard entertainment and services are integral
to the ICV experience. As vehicles travel at high speeds, roadside fixed edge servers
support real-time communication with the vehicles, akin to mobile phone services
but with different movement ranges and speeds. Consequently, the application
scenario for ICV closely aligns with research in multi-access edge computing (MEC) [33]. The frequent interactions
among vehicles, edge servers, and the cloud platform necessitate efficient data
routing, caching, and offloading strategies to fulfill the needs of ICVs in the 5G
era.
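As a small illustration of the caching side of these strategies, the sketch below
keeps recently requested content at a roadside edge server with least-recently-used
eviction, so passing vehicles avoid a round trip to the cloud on repeat requests.
The cache capacity, class name, and fetch callback are placeholders, not a
specific caching scheme from the literature.

# A toy roadside edge cache with LRU eviction.
from collections import OrderedDict

class RoadsideCache:
    def __init__(self, capacity: int = 2):
        self.capacity = capacity
        self._items: OrderedDict[str, bytes] = OrderedDict()

    def get(self, content_id: str, fetch_from_cloud) -> bytes:
        if content_id in self._items:
            self._items.move_to_end(content_id)   # refresh recency on a hit
            return self._items[content_id]
        data = fetch_from_cloud(content_id)       # miss: take the long path
        self._items[content_id] = data
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)       # evict least recently used
        return data

cache = RoadsideCache()
fetch = lambda cid: f"<{cid}>".encode()
for cid in ["map-tile-7", "news", "map-tile-7", "music", "news"]:
    cache.get(cid, fetch)
print(list(cache._items))   # ['music', 'news'] remain after evictions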
References
1. Sawsan AbdulRahman, Safa Otoum, Ouns Bouachir, and Azzam Mourad. Management of
digital twin-driven IoT using federated learning. IEEE J. Sel. Areas Commun., 41(11):3636–
3649, 2023.
2. Xiangyi Chen, Guangjie Han, Yuanguo Bi, Zimeng Yuan, Mahesh K. Marina, Yufei Liu,
and Hai Zhao. Traffic prediction-assisted federated deep reinforcement learning for service
migration in digital twins-enabled MEC networks. IEEE J. Sel. Areas Commun., 41(10):3212–
3229, 2023.
3. Felipe Arraño-Vargas and Georgios Konstantinou. Modular design and real-time simulators
toward power system digital twins implementation. IEEE Trans. Ind. Informatics, 19(1):52–
61, 2023.
4. Sangeen Khan, Sehat Ullah, Habib Ullah Khan, and Inam Ur Rehman. Digital-twins-based
internet of robotic things for remote health monitoring of COVID-19 patients. IEEE Internet
Things J., 10(18):16087–16098, 2023.
5. Peiyin Xing, Yaowei Wang, Peixi Peng, Yonghong Tian, and Tiejun Huang. End-edge-cloud
collaborative system: A video big data processing and analysis architecture. In 3rd IEEE
Conference on Multimedia Information Processing and Retrieval, MIPR 2020, Shenzhen,
China, August 6–8, 2020, pages 233–236, 2020.
6. Zhichen Ni, Honglong Chen, Zhe Li, Xiaomeng Wang, Na Yan, Weifeng Liu, and Feng Xia.
MSCET: A multi-scenario offloading schedule for biomedical data processing and analysis
in cloud-edge-terminal collaborative vehicular networks. IEEE ACM Trans. Comput. Biol.
Bioinform., 20(4):2376–2386, 2023.
7. Qing Han, Xuebin Ren, Peng Zhao, Yimeng Wang, Luhui Wang, Cong Zhao, and Xinyu
Yang. Eccvideo: A scalable edge cloud collaborative video analysis system. IEEE Intell. Syst.,
38(1):34–44, 2023.
8. Xilai Liu, Zhihui Ke, Xiaobo Zhou, Tie Qiu, and Keqiu Li. QoE-oriented adaptive video
streaming with edge-client collaborative super-resolution. In IEEE Global Communications
Conference, GLOBECOM 2022, Rio de Janeiro, Brazil, December 4–8, 2022, pages 6158–
6163, 2022.
9. Baoquan Yu, Yueming Cai, Xianbang Diao, and Yong Chen. AoI minimization scheme
for short-packet communications in energy-constrained IIoT. IEEE Internet Things J.,
10(22):20188–20200, 2023.
10. Chenlu Zhuansun, Kedong Yan, Gongxuan Zhang, Chanying Huang, and Shan Xiao.
Hypergraph-based joint channel and power resource allocation for cross-cell M2M commu-
nication in IIoT. IEEE Internet Things J., 10(17):15350–15361, 2023.
11. Shaoling Hu and Wei Chen. Joint lossy compression and power allocation in low latency wire-
less communications for IIoT: A cross-layer approach. IEEE Trans. Commun., 69(8):5106–
5120, 2021.
12. Zakaria Abou El Houda, Bouziane Brik, Adlen Ksentini, Lyes Khoukhi, and Mohsen Guizani.
When federated learning meets game theory: A cooperative framework to secure IIoT
applications on edge computing. IEEE Trans. Ind. Informatics, 18(11):7988–7997, 2022.
13. Wenhao Fan, Shenmeng Li, Jie Liu, Yi Su, Fan Wu, and Yuanan Liu. Joint task offloading
and resource allocation for accuracy-aware machine-learning-based IIoT applications. IEEE
Internet Things J., 10(4):3305–3321, 2023.
14. X. Yi, Y. Chen, P. Hou, and Q. Wang. A survey on prognostic and health management for
special vehicles. In 2018 Prognostics and System Health Management Conference (PHM-
Chongqing), pages 201–208, 2018.
15. Carlos Pedroso, Yan Uehara de Moraes, Michele Nogueira, and Aldri Santos. Relational
consensus-based cooperative task allocation management for IIoT-health networks. In 17th
IFIP/IEEE International Symposium on Integrated Network Management, IM 2021, Bordeaux,
France, May 17–21, 2021, pages 579–585, 2021.
16. A. L. Ellefsen, V. Æsøy, S. Ushakov, and H. Zhang. A comprehensive survey of prognostics
and health management based on deep learning for autonomous ships. IEEE Transactions on
Reliability, 68(2):720–740, 2019.
17. Z. Liu et al. Industrial AI enabled prognostics for high-speed railway systems. In 2018
IEEE International Conference on Prognostics and Health Management (ICPHM), pages 1–8,
2018.
18. J. Yang, X. Cheng, Y. Wu, Y. Qin, and L. Jia. Railway comprehensive monitoring and warning
system framework based on space-air-vehicle-ground integration network. In 2018 Prognostics
and System Health Management Conference (PHM-Chongqing), pages 1314–1319, 2018.
19. Lulu Wen, Kaile Zhou, Wei Feng, and Shanlin Yang. Demand side management in smart
grid: A dynamic-price-based demand response model. IEEE Trans. Engineering Management,
71:1439–1451, 2024.
20. H. Farhangi. The path of the smart grid. IEEE Power and Energy Magazine, 8(1):18–28, 2010.
21. James Cunningham, Alexander J. Aved, David Ferris, Philip Morrone, and Conrad S. Tucker.
A deep learning game theoretic model for defending against large scale smart grid attacks.
IEEE Trans. Smart Grid, 14(2):1188–1197, 2023.
22. G. Lin et al. Community detection in power grids based on Louvain heuristic algorithm.
In 2017 IEEE Conference on Energy Internet and Energy System Integration (EI2), pages 1–4,
2017.
23. H. Wang, Q. Wang, Y. Li, G. Chen, and Y. Tang. Application of fog architecture based on
multi-agent mechanism in CPPS. In 2018 2nd IEEE Conference on Energy Internet and Energy
System Integration (EI2), pages 1–6, 2018.
24. C. Jinming, J. Wei, J. Hao, G. Yajuan, N. Guoji, and C. Wu. Application prospect of
edge computing in smart distribution. In 2018 China International Conference on Electricity
Distribution (CICED), pages 1370–1375, 2018.
25. F. Shrouf, J. Ordieres, and G. Miragliotta. Smart factories in industry 4.0: A review of the
concept and of energy management approached in production based on the internet of things
paradigm. In 2014 IEEE International Conference on Industrial Engineering and Engineering
Management, pages 697–701, 2014.
26. L. Li, K. Ota, and M. Dong. Deep learning for smart industry: Efficient manufacture inspection
system with fog computing. IEEE Transactions on Industrial Informatics, 14(10):4665–4673,
2018.
27. H. Kanzaki, K. Schubert, and N. Bambos. Video streaming schemes for industrial IoT. In 2017
26th International Conference on Computer Communication and Networks (ICCCN), pages
1–7, 2017.
28. A. Sabu and K. Sreekumar. Literature review of image features and classifiers used in leaf
based plant recognition through image analysis approach. In 2017 International Conference on
Inventive Communication and Computational Technologies (ICICCT), pages 145–149, 2017.
29. Y. Wang and M. Weyrich. An adaptive image processing system based on incremental learning
for industrial applications. In Proceedings of the 2014 IEEE Emerging Technology and Factory
Automation (ETFA), pages 1–4, 2014.
30. C. Chen, J. Hu, T. Qiu, M. Atiquzzaman, and Z. Ren. CVCG: Cooperative V2V-aided
transmission scheme based on coalitional game for popular content distribution in vehicular
ad-hoc networks. IEEE Transactions on Mobile Computing, pages 1–18, 2018.
31. A. Thakur and R. Malekian. Fog computing for detecting vehicular congestion, an internet
of vehicles based approach: A review. IEEE Intelligent Transportation Systems Magazine,
11(2):8–16, 2019.
32. S. Yang, Y. Su, Y. Chang, and H. Hung. Short-term traffic prediction for edge computing-
enhanced autonomous and connected cars. IEEE Transactions on Vehicular Technology,
68(4):3140–3153, 2019.
33. F. Giust, V. Sciancalepore, D. Sabella, M. C. Filippou, S. Mangiante, W. Featherstone, and
D. Munaretto. Multi-access edge computing: The driver behind the wheel of 5G-connected
cars. IEEE Communications Standards Magazine, 2(3):66–73, 2018.
34. Pirmin Fontaine, Stefan Minner, and Maximilian Schiffer. Smart and sustainable city logistics:
Design, consolidation, and regulation. Eur. J. Oper. Res., 307(3):1071–1084, 2023.
35. Hiren Dutta, Saurabh Nagesh, Jawahar Talluri, and Parama Bhaumik. A solution to blockchain
smart contract based parametric transport and logistics insurance. IEEE Transactions on
Services Computing, 16(5):3155–3167, 2023.
36. C. Lin and J. Yang. Cost-efficient deployment of fog computing systems at logistics centers in
industry 4.0. IEEE Transactions on Industrial Informatics, 14(10):4603–4611, 2018.