
Xiaobo Zhou • Shuxin Ge • Jiancheng Chi • Tie Qiu

Industrial Edge Computing
Architecture, Optimization and Applications

Xiaobo Zhou
College of Intelligence and Computing
Tianjin University
Tianjin, China

Shuxin Ge
College of Intelligence and Computing
Tianjin University
Tianjin, China

Jiancheng Chi
College of Intelligence and Computing
Tianjin University
Tianjin, China

Tie Qiu
College of Intelligence and Computing
Tianjin University
Tianjin, China

ISBN 978-981-97-4751-1
ISBN 978-981-97-4752-8 (eBook)
https://doi.org/10.1007/978-981-97-4752-8

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore
Pte Ltd. 2024
This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721,
Singapore

If disposing of this product, please recycle the paper.


Preface

Industrial edge computing refers to the practice of managing data-handling activities using individual sources of data capture or storage, such as smart edge devices or equipment. This allows data to be captured, analyzed, or accessed without traversing a centralized network segment or the cloud. Industrial edge computing provides a more efficient and effective way of processing data in real time, leading to faster decision-making, improved operational efficiency, and reduced costs in various industrial sectors. The adoption of industrial edge computing is accelerating, especially in industrial applications like autonomous driving, smart manufacturing, and predictive maintenance. These applications usually have stringent requirements in terms of quality of service (QoS), reliability, scalability, and deployability. To address these complex industrial demands and challenges, a comprehensive theory of industrial edge computing, including network architecture, protocols, and algorithms, has to be established and tailored to industrial environments.
In this book, we explore and design specific solutions for different applications based on latency minimization, energy consumption trimming, and inference acceleration with improved model accuracy. It covers the following topics:
• Computation Offloading in Industrial Edge Computing: Emphasizing the need for adaptive matching of computational tasks to the most appropriate processing units, computation offloading involves a careful assessment of task requirements, task dependencies, and resource availability. The goal is to orchestrate computational tasks in a way that is not only highly effective but also efficient in terms of resource utilization.
• Data Caching in Industrial Edge Computing: Utilizing the Age of Information
(AoI) to assess the freshness of cached data, and Field-of-View (FoV) to
estimate the Quality of Experience (QoE), we have developed multi-agent-based
algorithms for decision-making. These algorithms aim to minimize download
latency and enhance QoE. This approach ensures that the data is both current and
optimized for user experience, leveraging the collective intelligence of multiple
agents for more efficient and effective decision-making.


• Service Migration in Industrial Edge Computing: Considering the interference among devices and privacy concerns arising from user mobility, we have developed service migration strategies based on Lyapunov optimization and reinforcement learning. These strategies are aimed at minimizing the total energy consumption, response latency, and risk of privacy leakage.
In addition, we explore the application of industrial edge computing in image-oriented object detection, point cloud-oriented object detection, and video inference with knowledge distillation to demonstrate its advantages. Overall, this
book offers a detailed and structured insight into industrial edge computing. It
thoroughly examines various research achievements in this area, including its
architecture, developments, challenges, and applications. The goal is to clarify the
relationship between Industrial Internet of Things (IIoT) and edge computing and
to encourage their ongoing growth and integration. The book is a valuable resource
for researchers, designers, and newcomers in industrial edge computing. It intends
to significantly enhance the understanding and progress of this emerging field.
In writing this book, we have received substantial support from the SmartIoT
laboratory team, which includes Jiaxin Zeng, Zhihui Ke, Qi Xie, Weixu Wang,
Duanyang Li, Pengbo Liu, Qixia Hao, Yuan Lu, Chuanan Wang, Ning Chen, and
Songwei Zhang. Additionally, international scholars Prof. Dapeng Oliver Wu and
Prof. Mohammed Atiquzzaman provided valuable comments and suggestions.
Furthermore, this work was partially supported by the National Science Fund
for Distinguished Young Scholars (No. 62325208), the Joint Funds of the National
Natural Science Foundation of China (No. U2001204), and the National Natural
Science Foundation of China (No. 62072330, 62272339).
Finally, we extend our heartfelt appreciation to everyone who participated in idea
or scheme discussions, and to the organizations that provided funding and platform
support.

Tianjin, China
December 2023

Xiaobo Zhou
Shuxin Ge
Jiancheng Chi
Tie Qiu
Contents

1 Introduction to Industrial Edge Computing
  1.1 Concepts
    1.1.1 What Is IIoT?
    1.1.2 What Is Cloud Computing?
    1.1.3 What Is Edge Computing?
  1.2 Reference Architecture
    1.2.1 Existing Reference Architectures
    1.2.2 Proposed Reference Architectures
  1.3 Benefits and Challenges
    1.3.1 Benefits of Industrial Edge Computing
    1.3.2 Challenges of Industrial Edge Computing
  1.4 Organization of This Book
  References
2 Preliminaries
  2.1 Performance Metrics of Industrial Edge Computing
    2.1.1 Latency Minimization Scheme
    2.1.2 Energy Consumption Trimming
    2.1.3 Accelerating Inference with Improved Model Accuracy
  2.2 Related Work
    2.2.1 Offloading for Mass End Devices
    2.2.2 Multisource Data Caching and Migration
    2.2.3 Intelligent Application in Industrial Edge Computing
  References
3 Computation Offloading in Industrial Edge Computing
  3.1 Introduction
  3.2 Adaptive Offloading with Two-Stage Hybrid Matching
    3.2.1 Statement of Problem
    3.2.2 Scheme Overview
    3.2.3 Global Buffer
    3.2.4 Online Matching Stage
    3.2.5 Offline Matching Stage
    3.2.6 Performance Evaluation
  3.3 Dependent Offloading with DAG-Based Cooperation Gain
    3.3.1 Statement of Problem
    3.3.2 Cooperation Gain Estimation Based on DAG
    3.3.3 Branch Soft Actor–Critic Offloading Algorithm
    3.3.4 Performance Evaluation
  References
4 Data Caching in Industrial Edge Computing
  4.1 Introduction
  4.2 Freshness-Aware Caching with Distributed MAMAB
    4.2.1 Statement of Problem
    4.2.2 HD Map Caching Model
    4.2.3 Distributed Caching and Requesting Algorithm
    4.2.4 Performance Evaluation
  4.3 Multicategory Video Caching
    4.3.1 Statement of Problem
    4.3.2 FoV-Based QoE of Users
    4.3.3 Multi-agent Soft Actor–Critic Caching
    4.3.4 Performance Evaluation
  References
5 Service Migration in Industrial Edge Computing
  5.1 Introduction
  5.2 Energy-Efficient Migration Based on 3-Layer VM Architecture
    5.2.1 Statement of Problem
    5.2.2 Energy-Efficient Service Migration Model Under 3-Layer VM Architecture
    5.2.3 Lyapunov Optimization
    5.2.4 Probabilistic Particle Swarm Optimization Algorithm
    5.2.5 Performance Evaluation
  5.3 Location Privacy-Aware Service Migration
    5.3.1 Statement of Problem
    5.3.2 Adversary’s Location Inference Attack
    5.3.3 Location Privacy-Aware Multiuser Service Migration Algorithm
    5.3.4 Performance Evaluation
  References
6 Application-Oriented Industrial Edge Computing
  6.1 Image-Oriented Object Detection
    6.1.1 Statement of Problem
    6.1.2 Entry Point Selection
    6.1.3 Computation Cost Estimation
    6.1.4 Adaptive Offloading
    6.1.5 Performance Evaluation
  6.2 Point Cloud Oriented Object Detection
    6.2.1 Statement of Problem
    6.2.2 Point Cloud Partition
    6.2.3 Data Alignment
    6.2.4 Multilevel Data Fusion
    6.2.5 K-Soft Actor–Critic Algorithm
    6.2.6 Performance Evaluation
  6.3 Video Inference with Knowledge Distillation
    6.3.1 Statement of Problem
    6.3.2 Inference Accuracy Estimation
    6.3.3 Cross Entropy Method (CEM)
    6.3.4 Performance Evaluation
  References
7 Future Research Directions
  7.1 Theory Exploration for Future Directions
    7.1.1 Digital Twin for Industrial Edge Computing System
    7.1.2 Cloud–Edge Collaborative Data Analysis
    7.1.3 Real-Time Communication with Less Data
  7.2 Application Scenarios
    7.2.1 Prognostics and Health Management
    7.2.2 Smart Grid
    7.2.3 Manufacturing
    7.2.4 Intelligent Connected Vehicles
    7.2.5 Smart Logistic
  References
Acronyms

1-D One-Dimensional
2-D Two-Dimensional
3G Third Generation cellular networks
4G Fourth Generation cellular networks
5G Fifth Generation cellular networks
AC Always Cooperate
AI Artificial Intelligence
AET Average Execution Time
AoI Age of Information
AR Augmented Reality
ATOM Adaptive Offloading with Two-Stage Hybrid Matching
BS Base Station
CCU Computing Capability Utilization
CDF Cumulative Distribution Function
CEM Cross Entropy Method
CHR Cache Hit Rate
CNN Convolutional Neural Network
CPU Central Processing Unit
CSI Channel State Information
D2D Device-to-Device
DAG Directed Acyclic Graph
DAG-ED Directed Acyclic Graph with External Dependency
DDPG Deep Deterministic Policy Gradient
DL Deep Learning
DMDP Dynamic Markov Decision Process
DNN Deep Neural Network
DP Differential Privacy
DQN Deep Q-Network
DRL Deep Reinforcement Learning
DVFS Dynamic Voltage and Frequency Scaling
ED External Dependency

EGO Energy-efficient service miGration for multi-user heterOgeneous dense cellular networks algorithm
FA-MASAC FoV-Aware Multi-Agent Soft Actor-Critic
FLOPS Floating-point Operations Per Second
FoV Field-of-View
FPN Feature Pyramid Network
GNN Graph Neural Network
GPU Graphics Processing Unit
HD High Definition
HMD Head Mounted Device
ICV Intelligent Connected Vehicle
ILP Integer Linear Programming
IIoT Industrial Internet of Things
IoT Internet of Things
IoV Internet of Vehicles
KL Kullback-Leibler
LBS Location-Based Service
LiDAR Light Detection and Ranging
LoRa Long Range Radio
LR Linear Regression
LTE Long Term Evolution
LPPM Location Privacy Protection Mechanisms
LSTM Long Short-Term Memory
MAMAB Multi-Agent Multi-Armed Bandit
MARL Multi-Agent Reinforcement Learning
MASAC Multi-Agent Soft Actor-Critic
MCTS Monte Carlo Tree Search
MDP Markov Decision Process
MEC Multi-Access (Mobile) Edge Computing
MINP Mixed-Integer Nonlinear Programming
ML Machine Learning
MS MEC Server
MSE Mean Squared Error
MVC-A3C Multi-Video Category based on A3C (Asynchronous Advantage Actor-Critic)
NB-IoT Narrow Band Internet of Things
NC Never Cooperate
NFV Network Function Virtualization
NGN Next-Generation Network
NMS Non-Maximum Suppression
OFDMA Orthogonal Frequency Division Multiple Access
PHM Prognostics and Health Management
PLC Programmable Logic Controller
POMDP Partially Observable Markov Decision Process
PSO Particle Swarm Optimization
QoE Quality of Experience
QoS Quality of Service
QRB Quality-Rebuffer-Balance
QSAC Quantized Soft Actor-Critic
RC Random Cooperate
RFID Radio Frequency Identification
RL Reinforcement Learning
RSU Roadside Unit
RPN Region Proposal Networks
SAC Soft Actor-Critic
SDN Software-Defined Networking
SLSQP Sequential Least Squares Programming
SOP Static Optimization Placement
SOTA State-of-The-Art
STD Standard Deviation
SUMO Simulation of Urban Mobility
TR Timeout Rate
UAV Unmanned Aerial Vehicle
UCB Upper Confidence Bound
V2I Vehicle-to-Infrastructure
V2V Vehicle-to-Vehicle
VM Virtual Machine
VR Virtual Reality
Chapter 1
Introduction to Industrial Edge Computing

Industrial edge computing involves deploying data processing capabilities close to the data source in industrial settings. This chapter first introduces industrial edge computing and its contemporary reference architecture. Then, we explore the technology’s benefits and challenges, highlighting key concerns. Lastly, we outline the structure of this book.

1.1 Concepts

Industrial edge computing refers to the deployment of data processing capabilities near the source of data generation within industrial environments, such as manufacturing plants, oil rigs, or power stations. This approach minimizes latency by processing data close to where it is generated, rather than sending it to distant data centers or cloud infrastructures for analysis [1, 2]. Industrial edge computing enables real-time data analysis, improves operational efficiency, supports predictive maintenance, and enhances decision-making processes by providing immediate insights and actions directly at the site of industrial operations [3, 4].
Industrial edge computing combines several advanced technologies, including IIoT [5], cloud computing [6], and edge computing [7], to optimize production processes, increase safety, and reduce operational costs. IIoT specifically integrates devices with sensing, communication, and computational capabilities, bringing intelligence to the fourth industrial revolution. In turn, edge computing provides a method to manage these devices cohesively, offering benefits such as reduced decision-making latency and lower energy consumption. This section introduces these related concepts and their interconnections.


1.1.1 What Is IIoT?

The Internet of Things (IoT) is a dynamic, evolving global network infrastructure known for its self-configuring capabilities based on standardized, interoperable communication protocols [8–10]. It enables universal connectivity among objects,
facilitating information exchange and collaborative decision-making [11]. In IoT,
diverse entities with unique identities are interconnected via various networks,
enabling dynamic information interactions. The integration of IoT in industrial sec-
tors has led to a distinct field known as IIoT. IIoT represents a service-centric indus-
trial ecosystem that utilizes networked resources, data interconnectivity, and system
interoperability. This ecosystem enables efficient resource distribution, responsive
process execution, optimization, and adaptation to environmental changes [12].
IIoT transforms industrial processes and devices into data entities and nodes,
respectively. It involves collecting foundational data that, combined with cloud
computing’s storage and computational power, allows for deeper data analysis and
operational optimization. IIoT is revolutionizing industry production, management,
and operational methods, enhancing supply chain resource allocation and produc-
tion efficiencies.
The IIoT framework consists of a wide range of node devices connected
through both wired and wireless networks, including sensor networks, Wi-Fi,
mobile communications (3G/4G/LTE/5G), and specialized industrial buses [13].
These devices form an edge network responsible for real-time industrial data
collection and transmission to cloud servers for processing. As IIoT grows, the
complexity and scale of these networks challenge traditional cloud data centers,
especially in handling massive data transmission and processing. For example, in
smart factories, production devices collect huge amounts of sensory data, reaching
gigabytes per second [14]. Uploading all this data to the cloud can be bandwidth-
intensive, costly, and cause processing delays. In intelligent connected vehicles,
delay sensitivity is crucial due to high operating speeds. Relying solely on cloud
platforms for processing can miss crucial information phases [15]. In time-sensitive
situations, like emergency shutdowns in factories or emergency braking in vehicles,
delays in cloud processing can be dangerous. To overcome these challenges,
shifting some data processing tasks from IIoT cloud centers to edge networks,
a method known as edge computing, has become a key research solution. Edge
computing is increasingly used in various IIoT scenarios, such as remote equipment
monitoring [16–18], predictive maintenance [19], and quality control [20].

1.1.2 What Is Cloud Computing?

Industrial edge computing can be seen as an extension of cloud computing. Cloud computing services, managed by users or external providers, are hosted on Internet-connected servers. This setup allows companies to easily gather data from their equipment, whether from a single location or globally. The collected data can
be stored and analyzed for production planning or for innovating and optimizing
processes. Cloud computing offers a comprehensive, easy-to-maintain, flexible, and
cost-efficient data pool.
However, as services become increasingly digitized and automated, the volume
of data grows. Real-time application requirements, bandwidth limitations, and
security concerns reveal the shortcomings of cloud computing. For example, safety-
critical decisions in automated driving (such as “Does the car have to brake
immediately to avoid an accident?”) or industrial settings (such as “Does the
machine need to stop now to prevent injury?”) require instant action. However,
relaying data to the cloud for processing and decision-making in these situations
is inefficient. Delays caused by high latency or poor connections can lead to serious
consequences. This is where edge computing becomes crucial.

1.1.3 What Is Edge Computing?

Edge computing is a new computing paradigm that uses computing, storage, and
network resources distributed between data sources and cloud computing centers for
data analysis and processing [21]. This model employs edge devices with significant
computing power for local data preprocessing, immediate decision-making, and
then sends the results or preprocessed data to the cloud center. The rise of edge
computing is driven by the growing need for real-time data processing, especially
where bandwidth is limited and low latency is crucial. The advancement of IIoT has
led to smarter, more interconnected machinery in factories and on production lines,
increasing the need for advanced data processing technologies. In IIoT settings, edge
computing enables quick, local data processing for industrial equipment, essential
for efficient real-time monitoring and decision-making.
In this context, industrial edge computing has become a key technology. It
combines edge computing’s rapid data processing with IIoT’s intelligent device
management and optimization. Therefore, industrial edge computing enables more
efficient, reliable, and secure data processing in industrial environments. For
example, in smart manufacturing, it can process sensor data in real time for
better production control and maintenance. In automated workshops and intelli-
gent logistics systems, it plays a crucial role in enhancing production efficiency,
reducing operational costs, and improving safety and stability. As a blend of edge
computing and IIoT, industrial edge computing is opening new opportunities for
industrial automation and intelligence. With technological progress, its importance
in various industrial applications is expected to grow increasingly significant and
transformative [22, 23].

1.2 Reference Architecture

The concept of a reference architecture is a high-level abstract model designed for a specific technological domain. It acts as a guiding framework for developing
software systems and is adaptable to various application scenarios. In this section,
we explore the reference architecture of industrial edge computing, shedding light
on its structure and functions.
Industrial edge computing focuses on integrating edge computing into different
IIoT scenarios to reduce network traffic and decision-making latency. Therefore, the
reference architecture for industrial edge computing requires further enhancement
and refinement beyond current edge computing frameworks. This process should
also fully consider the unique characteristics of both edge computing and IIoT,
ensuring a comprehensive approach tailored to the specific requirements and details
of these technologies.

1.2.1 Existing Reference Architectures

Architectural frameworks are usually developed based on technical, business, and service requirements specific to a scenario, guiding the number of layers and their
functions. Constructing a reference architecture is key for understanding and apply-
ing industrial edge computing. This starts with examining existing architectures
in edge computing and IIoT, adapting them to meet the unique challenges and
opportunities of industrial edge computing.
In edge computing, various reference architectures have been proposed, featuring
three layers: the first, often called the equipment, IoT, or things/end devices layer,
includes devices with sensing capabilities for data collection; the second, the edge
layer, transmits information to the cloud and processes data partially; the third,
the cloud layer, involves cloud computing resources for major data processing
and decision-making [24, 25]. Despite naming differences, the functional roles of
these layers are similar across models, structuring the architecture of industrial edge
computing.
Understanding IIoT’s architecture is crucial for grasping industrial edge com-
puting. A typical IIoT reference architecture has three layers: the physical layer
with tangible components; the communication layer for data transmission; and
the application layer, where data is used for specific purposes [26]. Some models
add a service layer between communication and application, creating a four-tier
architecture [27]. In specialized IIoT sectors, like Industry 4.0, these architectures
often need more customization to meet industry-specific needs [28].
While edge computing and IIoT architectures are well-established, research on
industrial edge computing’s reference architecture is still emerging. Some studies,
like Aazam et al. [29] and Candanedo et al. [30], have outlined basic structures but
do not fully address industrial edge computing’s specific characteristics and requirements. Furthermore, critical trends, like the varying computing power in traditional
edge computing layers, are often overlooked. Recognizing and addressing these
differences are vital for a more accurate and functional reference architecture for
industrial edge computing.

1.2.2 Proposed Reference Architectures

This subsection introduces a proposed reference architecture for industrial edge computing, considering earlier discussions. As shown in Fig. 1.1, the architecture
consists of three layers: the Device Layer, the Edge Layer, and the Cloud Applica-
tion Layer. These layers are based on existing edge computing architectures but
with key differences. Our proposed architecture focuses on the unique aspects
of industrial edge computing. It specifically defines the functions of each layer
and details their interactions, providing a comprehensive understanding tailored to
industrial edge computing needs.

[Figure: layered diagram showing the Device Layer (equipment, sensors, machines); the Far-Edge, Mid-Edge, and Near-Edge sub-layers of the Edge Layer, each with functions such as sensing, networking, storage, computing, data processing, management, and cross-layer collaboration; and the Cloud Application Layer (big data platform, models and services), connected by wired and wireless networks carrying data upward and control instructions downward.]

Fig. 1.1 Proposed reference architecture of industrial edge computing



1.2.2.1 Device Layer

The Device Layer in the industrial edge computing architecture encompasses a diverse range of components such as sensors, handheld terminals, instruments
and meters, smart machines, vehicles, robots, and other similar devices or equip-
ment. These elements are interconnected through various types of wired networks
(like Fieldbus, Industrial Ethernet, Industrial Optical Fiber) or wireless networks
(including Wi-Fi, Bluetooth, RFID, NB-IoT, LoRa, 5G, etc.). They are responsible
for collecting extensive parameter data using a variety of sensors. This data is
then transmitted to the Edge Layer, where the devices await control instructions.
This arrangement facilitates the seamless flow of both data and control commands
between the Device Layer and the Edge Layer, ensuring efficient connectivity and
interaction within the industrial edge computing framework.

1.2.2.2 Edge Layer

The Edge Layer is the central component of the reference architecture for industrial
edge computing. Its main function is to receive, process, and transmit data from
the Device Layer. This layer provides critical services like edge security, privacy
protection, data analysis, intelligent computing, process optimization, and real-time
control. Given the significant variation in computing power among devices in the
Edge Layer, a practical approach is to divide this layer into three sub-layers: the
Near-Edge Layer, the Mid-Edge Layer, and the Far-Edge Layer. These divisions are
based on the varying data processing capabilities of the devices. Segmenting the
Edge Layer in this way allows for more customized data processing, ensuring that
each sub-layer’s specific capabilities and needs are effectively addressed within the
wider scope of industrial edge computing:
• Far-edge layer: The Far-Edge Layer, key to the Edge Layer in industrial edge
computing architecture, includes edge controllers that interface with the Device
Layer. These controllers handle initial tasks like threshold judgment or data
filtering and send control flows to the Device Layer. This can be directed from
the Edge Layer or through the Cloud Application Layer. Due to the variety of
sensors and devices in the Device Layer, the Far-Edge Layer’s edge controllers
must support various protocols for real-time data collection from IIoT’s time-
delay sensitive networks. After data collection, it undergoes initial processing for
threshold judgment or filtering. The edge controllers must integrate and update
an algorithm library specific to their environmental setup, improving strategic
effectiveness. Additionally, they send control instructions to the Device Layer
using Programmable Logic Controller (PLC) control or action control modules,
based on decisions made at the Far-Edge Layer or above. Collaboration among
edge controllers is sometimes necessary for certain tasks.
A vital feature of the Far-Edge Layer is its millisecond-level latency in judgment and feedback, crucial in emergencies. For instance, an unmanned vehicle needs to rapidly respond to a pedestrian suddenly entering its path, or machinery like a lathe spindle must immediately shut down when a safety hazard, such as hair entanglement, is detected. Such urgent situations require processing within the Far-Edge Layer to minimize delays and ensure safety (a minimal sketch of such a threshold filter appears after this list).
• Mid-Edge Layer: The Mid-Edge Layer, a crucial component of industrial edge
computing architecture, primarily consists of edge gateways. These gateways are
responsible for collecting data from the Far-Edge Layer via both wired (like
Fieldbus, Industrial Ethernet, Industrial Optical Fiber) and wireless networks
(such as Wi-Fi, Bluetooth, RFID, NB-IoT, LoRa, 5G). They cache this data and
perform heterogeneous computing. Additionally, these gateways relay control
flows from upper layers to the Far-Edge Layer and manage equipment in both
the Mid- and Far-Edge Layers. Unlike the Far-Edge Layer, which focuses on
simple tasks like threshold judging or data filtering, the Mid-Edge Layer has
more substantial storage and computing resources to process IIoT data. This
layer preprocesses, fuses, and caches the diverse data collected by the Far-Edge
Layer. The edge gateways then process this data using embedded systems or
lightweight containers, applying big data analytics or advanced edge intelligence
technologies. They also store data analysis logs for future use. Furthermore,
the Mid-Edge Layer includes a management module for various functions (like
device, access, and communication management) and a collaboration module for
edge gateways, enabling multi-layer and multi-device collaboration. This layer
effectively manages events that can tolerate delays of a few seconds or minutes.
Typically, the Mid-Edge Layer has latency ranging from seconds to minutes,
allowing for more comprehensive judgments by integrating information from
multiple devices. For example, a Roadside Unit (RSU) might analyze and
predict vehicle traffic based on nearby vehicles’ location data. Similarly, a
smart gateway could combine data from multiple cameras for product quality
assessment. The Mid-Edge Layer adeptly handles scenarios where short delays
are acceptable.
• Near-Edge Layer: The Near-Edge Layer, featuring high-powered edge servers,
is responsible for complex and critical data processing and providing directional
decision guidance, based on data from the Mid-Edge Layer. These edge servers
handle both business application and platform management. With significantly
greater storage and computational resources than the Far- and Mid-Edge Layers,
the Near-Edge Layer is ideal for bulk processing and handling heterogeneous
data. It plays a key role in developing accurate models for improved production
scheduling within the edge network. Additionally, the Near-Edge Layer manages
resources across the entire Edge Layer, requiring capabilities in operation, virtu-
alization, and the deployment and scheduling of edge-side business applications.
This ensures rational resource allocation and efficient task completion and
delivery.
The Near-Edge Layer, analyzing a wide range of data from various equipment, is
crucial for process optimization and strategy formulation across larger areas and
longer timeframes. Typically characterized by hour-level latency, it is vital in
scenarios like a smart factory’s edge server optimizing product parameters from different production lines and equipment, or a smart grid’s edge server aggregating and optimizing electricity consumption statistics for diverse communities.
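To ground the Far-Edge Layer’s role, the following is a minimal sketch of the threshold-judgment-and-filtering pattern described above. It is illustrative only: the spindle-temperature sensor, the threshold value, and the three callback hooks are hypothetical assumptions, not an interface defined in this book.

```python
import time

SPINDLE_TEMP_LIMIT_C = 95.0  # hypothetical safety threshold
REPORT_INTERVAL_S = 1.0      # forward filtered data upstream once per second

def far_edge_loop(read_sensor, shut_down, send_upstream):
    """Threshold judgment at the far edge: act locally, forward selectively.

    read_sensor()          -> float: latest spindle temperature reading
    shut_down()            : local actuation, e.g., via a PLC control module
    send_upstream(summary) : transmit filtered readings to a mid-edge gateway
    """
    batch, last_report = [], time.monotonic()
    while True:
        reading = read_sensor()
        # Millisecond-level local judgment: no round trip to gateway or cloud.
        if reading > SPINDLE_TEMP_LIMIT_C:
            shut_down()
            return
        batch.append(reading)
        now = time.monotonic()
        if now - last_report >= REPORT_INTERVAL_S:
            # Data filtering: only summary statistics leave the controller.
            send_upstream({"max": max(batch), "avg": sum(batch) / len(batch)})
            batch, last_report = [], now
```

The design point is that the safety decision never leaves the controller, while upstream layers receive only a compact summary, matching the latency and bandwidth arguments made earlier in this chapter.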

1.2.2.3 Application Layer

The Cloud Application Layer plays a pivotal role in industrial edge computing
architecture, primarily focusing on extracting potential value from vast amounts of
data and optimizing resource allocation across an enterprise, a region, or on a nation-
wide scale. This layer, operating through the public network, retrieves data from the
Edge Layer and supports upper-layer applications. These applications span a wide
array of functions, including product or process design, comprehensive enterprise
management, sales, and after-sales services. Additionally, the Cloud Application
Layer feeds back models and microservices to the Edge Layer, enhancing its
operational efficiency and decision-making capabilities.
Another key function of the Cloud Application Layer is its ability to facilitate
cloud collaboration. This feature enables the sharing of data among various groups
with different attributes, such as managers, cooperative enterprises, designers, and
customers. Such collaboration not only broadens the scope of data utilization but
also deepens the mining of data value, leading to more nuanced and multifaceted
insights. Decision-making processes within this layer typically span a longer time-
frame, often measured in days. This extended timescale is reflective of the complex
and comprehensive nature of the tasks handled by the Cloud Application Layer,
where strategic decisions impact broader organizational or regional objectives.
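The four decision timescales described in Sect. 1.2.2 (milliseconds at the far edge, seconds to minutes at the mid edge, hours at the near edge, and days in the cloud) suggest a simple dispatch rule: send each task to the most resourceful tier whose typical response time still meets the task’s deadline. The sketch below encodes this rule; the numeric latency values are illustrative assumptions distilled from the qualitative descriptions above, not measurements or a scheduler from this book.

```python
# Typical response latencies per tier, ordered from most to least
# resourceful (illustrative values based on Sect. 1.2.2).
TIER_LATENCY_S = [
    ("cloud", 86400.0),      # day-level strategic analysis
    ("near-edge", 3600.0),   # hour-level optimization
    ("mid-edge", 60.0),      # seconds-to-minutes aggregation
    ("far-edge", 0.001),     # millisecond judgment and feedback
]

def dispatch(deadline_s: float) -> str:
    """Pick the most resourceful tier that can still meet the deadline."""
    for tier, typical_latency in TIER_LATENCY_S:
        if typical_latency <= deadline_s:
            return tier
    return "far-edge"  # the tightest deadlines stay at the far edge

print(dispatch(0.005))     # far-edge : emergency shutdown
print(dispatch(120.0))     # mid-edge : traffic prediction at an RSU
print(dispatch(7200.0))    # near-edge: production parameter tuning
print(dispatch(172800.0))  # cloud    : enterprise-wide planning
```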

1.3 Benefits and Challenges

1.3.1 Benefits of Industrial Edge Computing

Industrial edge computing represents a transformative approach in handling and processing data within the IIoT environment. By leveraging edge computing-related
technologies, industrial edge computing brings significant improvements in system
performance, data security, and operational cost efficiency:
• Improve System Performance: In addition to collecting and transmitting data to
the cloud platform, the most important contribution of industrial edge computing
is achieving millisecond-level data processing. This efficiency significantly
reduces the overall system delay, decreases the demand for communication
bandwidth, and enhances the overall system performance.
• Protect Data Security and Privacy: While cloud platform service providers
offer comprehensive centralized data security protection solutions, the centraliza-
tion of stored data poses risks if it gets leaked, leading to serious consequences.
Industrial edge computing enables enterprises to deploy the most suitable security solutions locally, minimizing the risk of data leakage during transmission and
reducing the volume of data stored in the cloud, thereby significantly lowering
security and privacy risks.
• Reduce Operational Costs: Transferring data directly to the cloud platform
can incur substantial operational costs due to data migration, bandwidth require-
ments, and latency issues. Industrial edge computing reduces the volume of data
that needs to be uploaded, thereby decreasing the amount of data migration,
bandwidth consumption, and latency, which in turn reduces operational costs.

1.3.2 Challenges of Industrial Edge Computing

In this section, we explore the challenges faced by industrial edge computing. While
it offers significant benefits in system performance, data security, and cost reduction,
industrial edge computing faces various practical challenges. These include issues
with 5G foundational communications, data offloading and load balancing, edge
artificial intelligence (AI), and data sharing security. For example, integrating 5G
with industrial edge computing presents challenges in Quality of Service (QoS),
node management, and network slicing. As device numbers increase and computing
resources disperse, designing efficient data offloading and load balancing schemes
becomes crucial. Edge AI, while offering new data processing opportunities, also
raises concerns about computational power and model complexity. Furthermore,
ensuring the security and privacy of data in edge computing environments is a
pressing issue in industrial edge computing. We will now examine these challenges
in more detail to better understand the current and future prospects of industrial edge
computing.

1.3.2.1 5G-Based Edge Communication

Integrating 5G into industrial edge computing aims to enhance communication modes and performance, providing greater flexibility in industrial processes by
reducing reliance on fiber optic cables. Although 5G’s broad bandwidth and
extensive base station network improve the management of edge devices and data
transmission, several challenges remain. These include aligning QoS standards
between 5G and edge computing, developing edge node management strategies
suited to local conditions amidst numerous 5G base stations, and customizing
network slicing architectures for the parallel operation of multiple services. Fur-
thermore, updating traditional equipment for 5G compatibility and creating remote
maintenance plans for 5G infrastructure are crucial for successfully applying 5G in
industrial edge computing.

1.3.2.2 Data Offloading and Load Balancing

In industrial edge computing systems, data offloading and load balancing are
major challenges. These arise from the large number of devices and the distributed
nature of computing resources. To tackle these issues, specialized schemes for data
offloading and load balancing are needed, considering the specific requirements of
each case. Data offloading in edge networks typically falls into two categories: full
and partial. Full data offloading means transferring all data from one device or
edge server to another. Partial data offloading involves dividing the task data and distributing it among different devices; in the extreme case, the entire task may be offloaded. The main aim of load balancing methods is to distribute the load evenly,
addressing the varied storage and computing capacities of edge devices and the
differences in offloading strategies. Effective load balancing requires a customized
approach, tailored to the unique characteristics and scenarios of the edge computing
environment. Integrating new technologies into these strategies is also essential,
aiming to optimize load distribution across the network of devices in industrial edge
computing systems.
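To make the full-versus-partial distinction concrete, the sketch below estimates completion time under a simple latency model that is common in the offloading literature (uplink transmission time plus computation time, with local and offloaded portions running in parallel). Both the model and the parameter values are assumptions for illustration; they are not the offloading schemes developed in Chap. 3.

```python
def completion_time(data_bits, cpu_cycles, offload_fraction,
                    f_local_hz, f_edge_hz, uplink_bps):
    """Estimated task completion time for a given offload fraction.

    offload_fraction = 0.0 is fully local execution, 1.0 is full
    offloading, and values in between are partial offloading. Local
    and offloaded parts are assumed to run in parallel.
    """
    local_part = (1 - offload_fraction) * cpu_cycles / f_local_hz
    offload_part = (offload_fraction * data_bits / uplink_bps      # transmission
                    + offload_fraction * cpu_cycles / f_edge_hz)   # edge execution
    return max(local_part, offload_part)

# Hypothetical task: 4 Mbit of input data, 2 Gcycles of work.
args = dict(data_bits=4e6, cpu_cycles=2e9,
            f_local_hz=1e9, f_edge_hz=8e9, uplink_bps=50e6)
for frac in (0.0, 0.5, 1.0):
    print(f"offload {frac:.0%}: {completion_time(offload_fraction=frac, **args):.3f} s")
```

With these assumed numbers full offloading wins, but shrinking uplink_bps quickly reverses the outcome, which is exactly why offloading decisions must be adapted to the specific requirements of each case, as argued above.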

1.3.2.3 Edge Artificial Intelligence

Edge AI brings both new opportunities and significant challenges for data pro-
cessing in industrial edge computing. The challenges mainly focus on two areas.
First, the limited computing power of edge devices makes it difficult to quickly
complete extensive computational tasks. Second, the complexity of models used
in edge AI requires substantial computational resources for training and inference.
Acknowledging these limitations, a promising approach in industrial edge comput-
ing is the more effective integration of AI with edge computing. This integration
aims to combine the strengths of both AI and edge computing, using the real-time
data processing capabilities at the edge and efficiently managing the computational
demands of AI models. This synergy could significantly improve the efficiency and
effectiveness of data processing in industrial edge computing, making it a key area
of ongoing research and development.

1.3.2.4 Data Sharing Security

The integration of edge computing in IIoT allows real-time data processing at the
edge. However, given the limited resources and the large number of edge devices,
many tasks require collaboration between multiple devices. This necessitates secure
data sharing among edge devices. IIoT demands high levels of security, and
blockchain technology can provide some measure of security for edge data sharing.
Nonetheless, the limited computing resources of edge devices pose a challenge in
designing and optimizing edge IIoT architecture based on blockchain. Challenges
include access control and secure storage using blockchain. Therefore, more focus
is needed on developing edge IIoT solutions that incorporate blockchain technology.

1.4 Organization of This Book

This book offers a comprehensive and systematic examination of industrial edge computing systems, covering theories, solutions, and applications. It presents a
three-layer architecture for these systems, including the device layer, edge layer,
and application layer, catering to students, practitioners, industry professionals,
and researchers. This structure considers the specific features of edge computing
and IIoT, tackling challenges like varying computing power. It thoroughly explores
solutions for computing in devices, caching, and migration in edge servers, linking
concepts such as offloading for resource-limited devices, data caching for edge
decision-making, and migration due to mobile device interactions. The book also
emphasizes edge-assisted model inference, showcasing a common application in
industrial edge computing systems.
The book’s structure is as follows: Chap. 1 provides an overview of industrial
edge computing systems, discussing their advantages, challenges, and a reference
architecture. Chapter 2 examines optimization metrics like energy consumption
and reviews related literature. Chapters 3, 4, and 5 detail optimization schemes for
computing offloading, data caching, and service migration. Chapter 6 is dedicated
to application-oriented solutions. Chapter 7 looks at future research directions in the
field, identifying areas of interest for researchers.
This book is a valuable resource for those seeking to understand industrial
edge computing systems and their future research trajectories. It effectively blends
theoretical concepts with practical applications, providing insight into the value
and potential of industrial edge computing. The discussion on future research
illuminates unresolved issues and emerging topics, offering guidance for researchers
in this evolving field.

References

1. Ming Yang, Yanhui Wang, Cheng Wang, Yan Liang, Shaoqiong Yang, Lidong Wang, and
Shuxin Wang. Digital twin-driven industrialization development of underwater gliders. IEEE
Trans. Ind. Informatics, 19(9):9680–9690, 2023.
2. Veronica Brizzi, Giulia Baccarin, Andreas Bordonetti, and Michele Comperini. Implementa-
tion and industrialization of a deep-learning model for flood wave prediction based on grid
weather forecast for hourly hydroelectric plant optimization: case study on three alpine basins.
In Proceedings of the Italia Intelligenza Artificiale—Thematic Workshops co-located with the
3rd CINI National Lab AIIS Conference on Artificial Intelligence (Ital IA 2023), Pisa, Italy,
May 29–30, 2023, volume 3486 of CEUR Workshop Proceedings, pages 590–594, 2023.
3. Samaneh Zolfaghari, Sumaiya Suravee, Daniele Riboni, and Kristina Yordanova. Sensor-based
locomotion data mining for supporting the diagnosis of neurodegenerative disorders: A survey.
ACM Comput. Surv., 56(1):10:1–10:36, 2024.
4. Shuhui Fan, Shaojing Fu, Yuchuan Luo, Haoran Xu, Xuyun Zhang, and Ming Xu. Smart
contract scams detection with topological data analysis on account interaction. In Proceedings
of the 31st ACM International Conference on Information & Knowledge Management, Atlanta,
GA, USA, October 17–21, 2022, pages 468–477, 2022.
5. Abhishek Hazra, Mainak Adhikari, Tarachand Amgoth, and Satish Narayana Srirama. A
comprehensive survey on interoperability for IIoT: Taxonomy, standards, and future directions.
ACM Comput. Surv., 55(2):9:1–9:35, 2023.
6. Tarik Taleb, Konstantinos Samdanis, Badr Mada, Hannu Flinck, Sunny Dutta, and Dario
Sabella. On multi-access edge computing: A survey of the emerging 5g network edge cloud
architecture and orchestration. IEEE Commun. Surv. Tutorials, 19(3):1657–1681, 2017.
7. Yushan Siriwardhana, Pawani Porambage, Madhusanka Liyanage, and Mika Ylianttila. A sur-
vey on mobile augmented reality with 5g mobile edge computing: Architectures, applications,
and technical aspects. IEEE Commun. Surv. Tutorials, 23(2):1160–1192, 2021.
8. Chi-Wei Lien and Sudip Vhaduri. Challenges and opportunities of biometric user authentica-
tion in the age of IoT: A survey. ACM Comput. Surv., 56(1):14:1–14:37, 2024.
9. François De Keersmaeker, Yinan Cao, Gorby Kabasele Ndonda, and Ramin Sadre. A survey of
public IoT datasets for network security research. IEEE Commun. Surv. Tutorials, 25(3):1808–
1840, 2023.
10. Rodrigo Marotti Togneri, Ronaldo C. Prati, Hitoshi Nagano, and Carlos Kamienski. Data-
driven water need estimation for IoT-based smart irrigation: A survey. Expert Syst. Appl.,
225:120194, 2023.
11. A. Al-Fuqaha, M. Guizani, M. Mohammadi, M. Aledhari, and M. Ayyash. Internet of things: A
survey on enabling technologies, protocols, and applications. IEEE Communications Surveys
& Tutorials, 17(4):2347–2376, 2015.
12. Emiliano Sisinni, Abusayeed Saifullah, Song Han, Ulf Jennehag, and Mikael Gidlund.
Industrial internet of things: Challenges, opportunities, and directions. IEEE Transactions on
Industrial Informatics, 14(11):4724–4734, 2018.
13. T. Qiu, B. Li, X. Zhou, H. Song, I. Lee, and J. Lloret. A novel shortcut addition algorithm with
particle swarm for multi-sink internet of things. IEEE Transactions on Industrial Informatics,
pages 1–12, 2019.
14. Prasanna Kumar Illa and Nikhil Padhi. Practical guide to smart factory transition using IoT,
big data and edge analytics. IEEE Access, 6:55162–55170, 2018.
15. A. Thakur and R. Malekian. Fog computing for detecting vehicular congestion, an internet
of vehicles based approach: A review. IEEE Intelligent Transportation Systems Magazine,
11(2):8–16, 2019.
16. H. Wang, Q. Wang, Y. Li, G. Chen, and Y. Tang. Application of fog architecture based on
multi-agent mechanism in CPPS. In 2018 2nd IEEE Conference on Energy Internet and Energy
System Integration (EI2), pages 1–6, 2018.
17. N. Yoshikane et al. First demonstration of geographically unconstrained control of an industrial
robot by jointly employing SDN-based optical transport networks and edge compute. In
2016 21st OptoElectronics and Communications Conference (OECC) held jointly with 2016
International Conference on Photonics in Switching (PS), pages 1–3, 2016.
18. I. A. Tsokalo, H. Wu, G. T. Nguyen, H. Salah, and F. H. P. Fitzek. Mobile edge cloud for robot
control services in industry automation. In 2019 16th IEEE Annual Consumer Communications
& Networking Conference (CCNC), pages 1–2, 2019.
19. T. M. Jose. A novel sensor based approach to predictive maintenance of machines by leveraging
heterogeneous computing. In 2018 IEEE SENSORS, pages 1–4, 2018.
20. L. Li, K. Ota, and M. Dong. Deep learning for smart industry: Efficient manufacture inspection
system with fog computing. IEEE Transactions on Industrial Informatics, 14(10):4665–4673,
2018.
21. W. Shi, J. Cao, Q. Zhang, Y. Li, and L. Xu. Edge computing: Vision and challenges. IEEE
Internet of Things Journal, 3(5):637–646, 2016.
22. Rosario Giuseppe Garroppo and Maria Grazia Scutellà. Design model of an IEEE 802.11ad
infrastructure for TSN-based industrial applications. Comput. Networks, 230:109771, 2023.
23. Abhishek Hazra, Praveen Kumar Donta, Tarachand Amgoth, and Schahram Dustdar. Cooper-
ative transmission scheduling and computation offloading with collaboration of fog and cloud
for industrial IoT applications. IEEE Internet of Things Journal, 10(5):3944–3953, 2023.
24. C. Mouradian, D. Naboulsi, S. Yangui, R. H. Glitho, M. J. Morrow, and P. A. Polakos.
A comprehensive survey on fog computing: State-of-the-art and research challenges. IEEE
Communications Surveys & Tutorials, 20(1):416–464, 2018.
25. M. Mukherjee, L. Shu, and D. Wang. Survey of fog computing: Fundamental, network applica-
tions, and research challenges. IEEE Communications Surveys & Tutorials, 20(3):1826–1857,
2018.
26. H. Xu, W. Yu, D. Griffith, and N. Golmie. A survey on industrial internet of things: A cyber-
physical systems perspective. IEEE Access, 6:78238–78259, 2018.
27. Jesus Martin Talavera and Others. Review of IoT applications in agro-industrial and environ-
mental fields. Computers and Electronics in Agriculture, 142:283–297, 2017.
28. Christian Weber, Jan Koenigsberger, Laura Kassner, and Bernhard Mitschang. M2ddm-a
maturity model for data-driven manufacturing. Manufacturing Systems 4.0, 63:173–178, 2017.
29. M. Aazam, S. Zeadally, and K. A. Harras. Deploying fog computing in industrial internet
of things and industry 4.0. IEEE Transactions on Industrial Informatics, 14(10):4674–4682,
2018.
30. Ines Sitton-Candanedo, Ricardo S. Alonso, Sara Rodriguez-Gonzalez, Jose Alberto Gar-
cia Coria, and Fernando De La Prieta. Edge computing architectures in industry 4.0: A
general survey and comparison. In 14th International Conference on Soft Computing Models
in Industrial and Environmental Applications, volume 950 of Advances in Intelligent Systems
and Computing, pages 121–131, 2020.
Chapter 2
Preliminaries

This chapter begins with a focused introduction to industrial edge computing systems, emphasizing optimization metrics. The design of appropriate optimization algorithms is crucial for IIoT applications to ensure their QoS, reliability, scalability, and deployability. Subsequently, the chapter presents several related works and discusses their technical coexistence issues. Understanding the limitations of existing work enables readers to comprehend the interactions and interference among components (such as devices and edge servers) in industrial edge computing systems. The chapter also discusses potential changes that could be incorporated into future versions of industrial edge computing systems. These changes are considered in the context of offloading at end devices, caching and migration at edge servers, and applications. This approach helps to understand how improvements can be made at different levels of the system.

2.1 Performance Metrics of Industrial Edge Computing

In this section, we first give a high-level overview of the workflow in industrial edge computing systems [1]. The main components of an industrial edge computing system are multiple devices, BSs equipped with edge servers, and the cloud.

Device These devices, such as sensors, terminals, and robots, play different roles in the operation of the system, e.g., sensing the operation status and production. Due to volume limitations, these devices share some common characteristics, i.e., limited computation capacity, limited storage, and short battery life, which make them behave like a "baby." They mostly cannot decide how many intermediate products should be made, nor can they store enough data on their own, and they can easily go down due to a low battery or failure [2]. Thus, the energy consumption of each device must be addressed to extend the interval between battery charges or replacements. This issue is detailed in Sect. 2.1.2.


It is worth noting that a naughty "baby" moves around the map, highlighting the necessity of real-time monitoring. To this end, these devices integrate various communication modules (Wi-Fi, Bluetooth, RFID, NB-IoT, LoRa, 5G, etc.) to seek help from more capable units. Just as a baby sends a message by crying, the messages encoded by different communication protocols can be captured. The louder the crying, i.e., the stronger the signal strength, the quicker the response, i.e., the lower the communication latency.
Base Station (BS)/Edge Server The BS equipped with an edge server takes on the responsibility of being an "adult." BSs are usually densely distributed close to end devices, and each is responsible for multiple devices through wireless or wired links. The abundant computing and storage resources of the edge server support dealing with problems that the devices cannot solve on their own [3].
Using the computing resources of the edge server to handle devices' requests is referred to as offloading [4]. Offloading focuses on making optimal decisions from a higher perspective to coordinate the overall process, similar to mediating conflicts between babies. Benefiting from AI, the algorithm at the edge behaves like a rational adult. It takes care of each device's request with a priority determined by the "sound of crying," i.e., the features of the request such as latency awareness and computation intensity. Everyone wants the crying to stop quickly; in an industrial edge computing system, this corresponds to the objective of minimizing the latency, which will be introduced in Sect. 2.1.1.
Storage resources in industrial edge computing systems are utilized to deploy
applications and store data collected from end devices. These applications, like
scheduling customized maintenance plans, form the foundation for the edge to offer
services to devices. This can be likened to an adult’s life skills, such as knowing
which medicines are needed for a baby with a fever. Here, the collected data assists
in diagnosis, similar to how an adult uses their knowledge and experience to care
for a baby.
Cloud The cloud owns more abundant resources than the edge, but it incurs a longer response latency. It also hosts a broader range of applications (skills), making it look like a baby store that covers the shortage of adults. Services that the edge cannot provide are scheduled to the cloud [5]. To achieve this function, the objective of the cloud is to enrich its commodities, i.e., to provide AI models with high inference speed and accuracy. This will be detailed in Sect. 2.1.3.
In summary, with edge computing, the task can be processed without extra
jitter during transmission in the core network. By facilitating real-time decisions,
industrial edge computing improves the accuracy of control with minimum cost.
However, to fully enjoy the benefits of industrial edge computing systems, the
following metrics must be taken into account when designing specific solutions for
different applications.

2.1.1 Latency Minimization Scheme

In the existing research on industrial edge computing, some schemes are designed
to minimize the latency [6, 7]. Generally, latency comprises computing latency,
transmission latency, and some extra latency, e.g., queuing latency. Nevertheless, it must be calculated differently depending on the case and the offloading scheme. In this section, we introduce a general latency estimation model.
For a task, we denote its computation requirement by $O$ and the size of the request when offloaded to the edge or cloud by $K$. Meanwhile, let $l_d$, $l_e$, and $l_c$ represent the computation capacities of the device, edge, and cloud, respectively. The transmission rates between the device and the edge and between the edge and the cloud are denoted by $R_d$ and $R_c$, respectively. There are two offloading modes in industrial edge computing: full offloading and partial offloading. These modes determine whether the computational requirement is divided for parallel processing across different locations, i.e., the device, edge, and cloud. Two variables, $\lambda$ and $\beta$, both ranging over $[0, 1]$ and representing offloading decisions, denote the proportions of the computational requirement allocated to the edge and the cloud, respectively. It is important to note that the sum of $\lambda$ and $\beta$ must not exceed 1, i.e., $\lambda + \beta \le 1$. Full offloading is a special case of partial offloading, where either $\lambda = 1$ or $\beta = 1$.
When all the offloaded computation requirements are finished and the results are returned to the device, the task is considered completed. Thus, we calculate the latency for the three parts of the computation requirement. For the local computation requirement, the computing latency is

$$L_c^D = \frac{(1-\lambda-\beta)O}{l_d}, \tag{2.1}$$

while the transmission latency $L_t^D$ is zero. When offloading the task to the edge, the request is transmitted to the edge, and the result, whose size is $w\lambda O$, is then returned to the device. In this case, the computing latency is

$$L_c^E = \frac{\lambda O}{l_e}, \tag{2.2}$$

and the transmission latency is

$$L_t^E = \frac{K}{R_d} + \frac{w\lambda O}{R_d}. \tag{2.3}$$

Similarly, for the cloud, the request is first transmitted to the cloud via the edge, and the result is then returned to the device. Its computing latency is

$$L_c^C = \frac{\beta O}{l_c}, \tag{2.4}$$

and the transmission latency is

$$L_t^C = \frac{K}{R_d} + \frac{K}{R_c} + \frac{w\beta O}{R_d} + \frac{w\beta O}{R_c}. \tag{2.5}$$

Hence, the total latency can be calculated by

$$L = \max\left\{ L_c^D + \epsilon^D,\; L_c^E + L_t^E + \epsilon^E,\; L_c^C + L_t^C + \epsilon^C \right\}, \tag{2.6}$$

where $\epsilon^D$, $\epsilon^E$, and $\epsilon^C$ are the extra latencies at the device, edge, and cloud, respectively. For example, if the service used to process this task is not cached at the edge, an extra caching latency is incurred. This is discussed further in the detailed solutions for different applications.
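
To make the model concrete, the following Python sketch evaluates Eqs. (2.1)–(2.6) for a given offloading decision. It is a minimal illustration, not an implementation from the literature; all parameter values in the example are assumptions chosen only to exercise the formulas.

```python
# A minimal sketch of the latency model in Eqs. (2.1)-(2.6).
# All numeric values below are illustrative assumptions, not measurements.

def total_latency(O, K, w, lam, beta, l_d, l_e, l_c, R_d, R_c,
                  eps_d=0.0, eps_e=0.0, eps_c=0.0):
    """Task latency L of Eq. (2.6) under a partial-offloading decision (lam, beta)."""
    assert 0.0 <= lam <= 1.0 and 0.0 <= beta <= 1.0 and lam + beta <= 1.0
    # Eq. (2.1): local computing latency; the local transmission latency is zero.
    L_D = (1.0 - lam - beta) * O / l_d + eps_d
    # Eqs. (2.2)-(2.3): edge computing plus request upload and result download.
    L_E = lam * O / l_e + K / R_d + w * lam * O / R_d + eps_e
    # Eqs. (2.4)-(2.5): the cloud part travels device -> edge -> cloud and back.
    L_C = (beta * O / l_c + K / R_d + K / R_c
           + w * beta * O / R_d + w * beta * O / R_c + eps_c)
    # Eq. (2.6): the task completes when the slowest branch returns.
    return max(L_D, L_E, L_C)

# Example: offload 50% of the workload to the edge and 30% to the cloud.
print(total_latency(O=1e9, K=1e6, w=0.1, lam=0.5, beta=0.3,
                    l_d=1e8, l_e=1e9, l_c=1e10, R_d=1e7, R_c=1e8))
```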
To minimize the latency, we can further introduce more devices to execute tasks in a parallel manner, reducing the maximum over the items in Eq. (2.6). This involves the problems of where to offload and how to partition the task. Meanwhile, when calculating the total latency for a large number of tasks, their time-varying arrival densities should be taken into account. Here, we outline three directions for latency minimization that we hope will inspire readers:
• Device-to-Device (D2D) Offloading: From the perspective of computing, the main idea of offloading is to utilize the computation resources of the edge and cloud. Offloading tasks to idle devices via D2D communication is an often-overlooked alternative. In this case, whether an idle device is willing to use its resources to help others, and how to incentivize devices to share their resources, are open problems.
• Offloading Partition: As mentioned before, partial offloading with an arbitrary split cannot be applied directly in practice, as a task usually cannot be partitioned into arbitrary parallel computation requirements. In fact, the partition is largely determined by the task segments, i.e., sequential sub-tasks with fixed computation requirements. How to offload these sequential sub-tasks to minimize the latency should be addressed.
• Time-Varying Arrival Density: Normally, to simulate task arrivals over time, existing works divide the period into several uniform time slots. In practice, the tasks arriving in each time slot exhibit different densities, incurring extra execution latency. To solve this problem, a global buffer can be built, in which tasks are regarded as arriving sparsely until the buffer is full, after which an online matching stage schedules the buffered tasks.

2.1.2 Energy Consumption Trimming

Similar to latency, energy consumption in industrial edge computing systems is composed of computing energy consumption, transmission energy consumption, and additional energy consumption. Energy consumption occurs in each step that involves latency and is typically inversely related to it. Increasing the computing frequency and transmission power can decrease computing and transmission latency, but this inevitably leads to higher energy consumption. Therefore, the trade-off between energy and latency is an important factor to consider in the design and optimization of industrial edge computing systems.
There is also a difference between energy and latency: energy consumption is shared by all participating units, i.e., the device, edge, and cloud [8, 9]. These units have different concerns about energy consumption, so estimating the total energy consumption of a task across all units is not meaningful. Therefore, energy consumption is typically evaluated from the perspective of the device and the edge, respectively. The energy consumption of the cloud is not taken into account, as it has a sufficient power supply. Next, we extend the previous model and introduce a general estimation model for the energy consumption of the device and edge.
Energy Consumption of Device Most simple IoT devices, such as soil monitoring sensors and patrol robots, rely on a battery to power their functions. Thus, a reasonably long lifetime must be ensured by very low power consumption. The computing energy consumption and the transmission energy consumption of the device can be obtained by

$$E_c^D = \kappa l_d^3 L_c^D, \tag{2.7}$$

and

$$E_t^D = P^D \left( L_t^E + \frac{K}{R_d} + \frac{w\beta O}{R_d} \right), \tag{2.8}$$

respectively. Here, $P^D$ is the transmission power of the device during sending and receiving.
Energy Consumption of Edge In addition to device energy consumption, energy usage at the edge is becoming increasingly crucial, especially with the advent of 5G technology. It has been reported by MTN Consulting and Huawei [10, 11] that a 5G BS consumes at least twice the energy of a 4G BS. It is projected that power consumption will account for 18% of operational expenses in Europe and 32% in India [12]. Furthermore, the rapid advancement in hardware is leading to a trend where edge devices are becoming smaller, such as Unmanned Aerial Vehicles (UAVs), making energy efficiency at the edge more critical than ever. The computing energy consumption and the transmission energy consumption of the edge can be obtained by

$$E_c^E = \kappa l_e^3 L_c^E, \tag{2.9}$$

and

$$E_t^E = P^E \left( L_t^E + L_t^C \right), \tag{2.10}$$

respectively. Here, $P^E$ is the transmission power of the edge during sending and receiving. Using energy consumption as a supplementary indicator can efficiently balance latency minimization against energy minimization. It greatly helps devices achieve a longer runtime while satisfying the latency requirements of different tasks.
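
The energy terms can be sketched in the same style. The following toy functions mirror Eqs. (2.7)–(2.10); the effective switched-capacitance coefficient $\kappa$ and the power levels used in the example are illustrative assumptions.

```python
# A minimal sketch of the energy model in Eqs. (2.7)-(2.10); the latency terms
# are those computed by Eqs. (2.1)-(2.5). All numbers are assumptions.

def device_energy(kappa, l_d, L_c_D, P_D, L_t_E, K, w, beta, O, R_d):
    E_comp = kappa * l_d ** 3 * L_c_D                    # Eq. (2.7)
    E_tx = P_D * (L_t_E + K / R_d + w * beta * O / R_d)  # Eq. (2.8)
    return E_comp + E_tx

def edge_energy(kappa, l_e, L_c_E, P_E, L_t_E, L_t_C):
    E_comp = kappa * l_e ** 3 * L_c_E                    # Eq. (2.9)
    E_tx = P_E * (L_t_E + L_t_C)                         # Eq. (2.10)
    return E_comp + E_tx

print(device_energy(kappa=1e-27, l_d=1e8, L_c_D=2.0, P_D=0.5,
                    L_t_E=0.6, K=1e6, w=0.1, beta=0.3, O=1e9, R_d=1e7))
```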
It is important to recognize that significantly reducing computing and transmission energy consumption is generally challenging. However, there is potential for a substantial reduction in other types of energy consumption. For example, the energy used for periodically caching data on the edge server presents a promising opportunity for reduction. In this context, we identify two types of extra energy consumption that can be significantly reduced or avoided:
• Data Caching: Normally, when the data required by a device's service is not present at the edge, the edge must download the data from the cloud. The cached data is usually large and consumes a lot of energy to download. Note that, due to the limited storage of the edge, the downloaded data may replace existing data. To save the energy consumed in downloading data, an intelligent caching strategy with a high hit ratio is desired.
• Service Migration: This need arises as a side effect of data caching and acts as an adjunct to caching strategies meant to accommodate user mobility. When mobile devices, like robots, move between different edge locations, their connectivity configurations change. This necessitates updating the cache at the new edge location, leading to increased energy consumption, which can sometimes be excessively high. A possible solution is to migrate data between edge servers or to reroute the service from the original edge to the new location. However, this method requires careful evaluation of the trade-off between the energy consumed in migration and in routing to determine the most energy-efficient choice (a toy decision rule is sketched after this list).
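
As a toy illustration of the migration-versus-routing trade-off above, the rule below migrates the service only when the one-time migration energy is smaller than the routing energy expected over the user's residence time at the new edge; the linear cost model is an assumption made purely for illustration.

```python
# A toy migrate-vs-reroute decision rule; the linear energy model is assumed.

def should_migrate(data_size_bits, e_mig_per_bit, e_route_per_req, expected_requests):
    """Migrate if one-off migration energy beats the expected routing energy."""
    migration_energy = data_size_bits * e_mig_per_bit
    routing_energy = e_route_per_req * expected_requests
    return migration_energy < routing_energy

# Example: a 10 MB service and many expected requests -> migration pays off.
print(should_migrate(8e7, 1e-8, 0.05, expected_requests=100))
```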

2.1.3 Accelerating Inference with Improved Model Accuracy

AI has emerged as a pivotal technique across diverse domains, showcasing its significance in numerous applications. It has demonstrated its prowess in the edge computing area, offering various approaches, such as object detection, applied in the context of industrial edge computing [13].
Not only high processing speed but also high accuracy is required for industrial edge computing systems. An IoT device cannot provide real-time, high-precision results with a heavyweight model, and cloud inference suffers from transmission through the congested backhaul network. Hence, the Deep Learning (DL) models that provide various necessary functions for IoT devices, including computer vision, natural language processing, machine translation, and so on, are deployed on the edge. Note that, here, accelerating inference indicates reducing the value of O (i.e., transforming the heavyweight model into a lightweight one), while latency minimization focuses on reducing the latency for a given O. The coexistence of the two is also an interesting topic for reducing the total latency, which will be introduced in Sect. 6.1.
Another concern is model accuracy, which refers to how well these models can predict or classify data from sensors, devices, or systems within industrial environments for detecting anomalies, optimizing processes, etc. [4]. Industrial edge computing systems identify abnormal patterns, defects, or deviations from quality standards to prevent failures or security breaches. Accurate models ensure that maintenance activities can be scheduled effectively, minimizing downtime, maximizing operational efficiency, reducing waste, and improving product quality. Here, we also provide three directions that may be useful for accelerating inference with improved model precision:
• Edge Cooperation: The inference model can be divided into multiple parts, one executed locally and the remaining parts offloaded to multiple cooperative edge servers. In this way, there is no doubt that the inference will be accelerated. However, the differences between models, e.g., the computational workload, input data volume, and output data volume of each layer, limit the generality of a cooperation algorithm, as it must be designed for a specific inference model.
• Knowledge Distillation: Fluctuations in wireless bandwidth can lead to prolonged communication latency, particularly when a significant volume of raw video data or intermediate features needs to be transmitted. A recent development in this context is the adoption of teacher–student learning, which has shown promise as a framework for real-time video inference on resource-constrained mobile devices within multi-access edge computing (MEC) networks [14]. In this approach, robust teacher models are stationed on edge servers, while lightweight student models, distilled from these teacher models, are deployed on mobile devices. This setup expedites the inference process, enhancing the inference speed with a tolerable accuracy loss.
• Device Cooperation: After deploying a lightweight model on devices, one challenge remains. When inferring from the data of a single device, the inference accuracy is limited by the short sensing range and blind zones of its sensors. Cooperative sensing is an efficient way for devices to infer beyond their local sensing capabilities by exchanging sensing information with neighboring units, thereby improving the inference accuracy.

2.2 Related Work

2.2.1 Offloading for Mass End Devices

An industrial edge computing system is a multisource data-driven distributed system, where the distributed edges process the data (the so-called offloading) to help resource-limited devices provide intelligent services. Given the different service requirements of end devices, proper offloading schemes should be designed accordingly.

2.2.1.1 From Decentralized to Centralized in Industrial Edge Computing

Existing studies in the offloading field can be categorized from the perspective of
decision-making manner (i.e., centralized or decentralized).
Decentralized Offloading Some studies focus on making offloading decisions at each edge independently, in a decentralized manner based on locally available information and without requiring global information.
Josilo et al. [15] addressed the issue of task offloading among devices to
neighbors or cloud services. They devised a game-theoretical model to tackle the
problem of minimizing completion time. Yu et al. [16] proposed a hybrid task
offloading method based on multi-cast for industrial edge computing, catering to
numerous mobile edge devices. The framework exploits network-assisted collabo-
ration to facilitate wireless distributed computing sharing. Within this approach, the
authors employ the Monte Carlo Tree Search (MCTS) algorithm to optimize the
task assignment problem. Mohammed et al. [17] partitioned a Deep Neural Network (DNN) into multiple segments, enabling processing either locally on end devices or offloading to more potent nodes, such as those found in fog computing. Wang et al. [18] crafted a multiuser decentralized epoch-based framework that facilitates decentralized user-initiated offloading without global system information. Qian et al. [19] paid attention to the diverse channel realizations in dynamic industrial edge computing systems and proposed a time-varying online algorithm based on Deep Reinforcement Learning (DRL).
The common feature of these works is that each offloading decision is made to maximize an individual reward, such as minimum latency, based on local observations. In this case, individual decisions often conflict with maximizing the global reward. These methodologies demonstrate commendable scalability and adaptability in complicated environments. They ease the deployment of new devices and improve robustness by avoiding a complete breakdown from a single point of failure. However, industrial edge computing systems require overall superiority of the decision-making process, which is not satisfied by locally optimal decisions.

Centralized Offloading Compared with decentralized offloading, a centralized one may sacrifice some individual rewards to achieve global optimization. These strategies hinge on global operators, e.g., macro BSs and cloud servers, that gather extensive information about devices and edges to optimize offloading decision-making. The determined offloading decisions include executing locally on the devices or transmitting to specific edges for processing. In this case, the global operators have a high-level perspective from which to reasonably schedule the resources and thus improve the system performance.
A substantial body of research has been dedicated to centralized offloading
schemes. Josilo et al. [20] addressed the coordination of offloading decisions
for periodic computation-intensive tasks guided by game theory and solved
it by finding the equilibrium state of the system in polynomial time. Similarly,
Arisdakessian et al. [21] also used game theory to guide the offloading process in
IoT fog computing systems. They introduced multiple criteria to find suitable fog
nodes for each IoT service toward minimizing latency for IoT services and fog nodes
simultaneously. Zhao et al. [22] designed an effective convex programming-based
algorithm to ensure the efficiency of offloading for dependent tasks with limited
data caching in the edge.
Centralized schemes obviate the need for information exchange among devices, thereby minimizing the risk of privacy exposure. Additionally, they prove more efficient due to the comprehensive global information at their disposal. Nevertheless, current centralized schemes exhibit various areas that warrant improvement, including aspects like offloading targets and partitioning.

2.2.1.2 Hybrid Offloading

Once the computation capacity of the devices and edge servers cannot sustain the task processing, it is necessary to split the task into multiple sub-tasks and execute them in a hybrid manner. As studies of single offloading modes mature, researchers have begun to consider hybrid offloading. In [23], the authors integrated D2D communications with MEC to further improve the computation capacity of cellular networks. However, current hybrid offloading generally considers the cooperative utilization of MEC and D2D resources, and D2D communication is limited to one device, so resource utilization is low.
A potential researcher’s direction is that a mobile device can act as a relay and
help other devices communicate with MEC servers. The relay ensures the high data
rates required by Next-Generation Networks (NGNs) D2D communication [24].
D2D-assisted task offloading allows mobile devices to offload tasks not only to edge
servers but also to neighboring nodes through D2D links [23].
The multiple-hop transmissions between multiple users through D2D links, referred to as cooperative relays, produce extra latency [25]. The first framework supporting cooperative relays was proposed in [26], which also jointly combines D2D links with cellular links for communication. Meanwhile, in [27], the authors allow a mobile device to offload tasks through relay devices and put forth various algorithms aimed at minimizing both latency and energy consumption in this context.
in this context.

2.2.1.3 Offloading with Direct Acyclic Graph (DAG)-Based Partition

Parallel programming is a methodology that breaks down applications into multiple tasks with dependencies to enable parallel execution. Shu et al. [28] used a DAG to represent the application, where the task dependencies are taken into account when deciding how to offload tasks to the edges. The DAG-based offloading problem has been proven NP-hard, i.e., it cannot be solved in polynomial time [29].
Typically, Liu et al. [30] aimed to minimize application latency within limited computation resources for a task described by a specific DAG, solving it with a simple greedy algorithm. Liao et al. [31] enhanced the algorithm by iteratively updating task priorities and scheduling tasks in order.
Unfortunately, the concept of DAG in offloading also brings increased complexity, i.e., a large solution space. Therefore, the solutions obtained by heuristic algorithms often suffer from local optima. As a result, DRL-based methods are regarded as a promising way to find the optimal solution. Tang et al. [32] leveraged the edge's computation resources and network conditions as the environment state and used the trade-off between latency and energy consumption as the reward. Yan et al. [33] enhanced the performance of the DRL-based method by employing a one-climb pruning method to eliminate impractical offloading decisions. Liu et al. [34] developed a multi-priority task offloading algorithm grounded in Deep Deterministic Policy Gradient (DDPG) learning. Notably, none of these methods take into account the potential benefits of sharing intermediate results among devices. Taking the object detection task as an example, the intermediate features greatly help other devices to recognize objects, which significantly enhances the overall application utility.
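
The following sketch shows how a single offloading decision for a DAG-shaped application can be evaluated: earliest finish times are propagated in topological order, charging a transfer delay whenever dependent tasks are placed on different nodes. The tiny example DAG, task workloads, and speeds are assumptions, and resource contention between co-located tasks is deliberately ignored.

```python
# A minimal evaluator for DAG-based offloading decisions (no contention model).

def dag_completion_time(tasks, edges, placement, speed, bandwidth):
    """tasks: {id: workload in cycles}; edges: (u, v, data_bits) dependencies;
    placement: {id: node name}; speed: {node: cycles/s}."""
    preds = {t: [] for t in tasks}
    for u, v, d in edges:
        preds[v].append((u, d))
    finish = {}
    for t in sorted(tasks):                  # ids assumed topologically sorted
        ready = 0.0
        for u, d in preds[t]:                # wait for inputs (+ transfer if remote)
            xfer = 0.0 if placement[u] == placement[t] else d / bandwidth
            ready = max(ready, finish[u] + xfer)
        finish[t] = ready + tasks[t] / speed[placement[t]]
    return max(finish.values())

tasks = {0: 1e8, 1: 4e8, 2: 4e8, 3: 1e8}    # diamond-shaped DAG
edges = [(0, 1, 1e6), (0, 2, 1e6), (1, 3, 5e5), (2, 3, 5e5)]
placement = {0: "device", 1: "edge", 2: "edge", 3: "device"}
print(dag_completion_time(tasks, edges, placement,
                          speed={"device": 1e8, "edge": 1e9}, bandwidth=1e7))
```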

2.2.1.4 Offloading with Cooperation

The task offloading process at edge servers and end devices is a critical issue due to the diverse task dependencies of each application and its external dependencies (EDs), which is essential to improving the application utility [35].
Cooperation among applications proves advantageous in enhancing overall
application performance, manifesting in improvements such as heightened detection
accuracy in autonomous driving [36]. In scenarios involving DAG-based offloading
with cooperation, the cooperation entails the operator’s decision to identify task
pairs from distinct users that should share their intermediate data. This determina-
tion takes into account factors such as the size of shared data and the dynamically
changing network conditions. Yan et al. [37] formulated offloading as an application utility maximization problem and used Gibbs sampling to find the best cooperation between devices, whose DAGs are connected with each other via fixed external dependencies (EDs). Liu et al. [38] used a Quantized Soft Actor–Critic (QSAC) algorithm to find the optimal EDs under dynamic network conditions. Meanwhile, to explore more actions, the output of the actor module is further quantized in an order-preserving way.

2.2.2 Multisource Data Caching and Migration

In industrial edge computing systems, data (services) are deployed on the edge servers rather than the remote cloud. Existing works mainly focus on data caching and service migration. Efficient management of the services decides when and where to cache or migrate services to optimize system performance, e.g., latency and energy consumption [39].

2.2.2.1 Data Caching

Similar to virtual machine (VM) caching, data (service) caching in industrial edge
computing is designed to optimize the hit rate. This involves caching data in edge
servers with limited resources while taking into account user request statistics [40].
Numerous ongoing efforts are dedicated to addressing data caching challenges,
which mainly focus on data popularity and cooperative caching.
Data Popularity The popularity of data within the coverage area of edge servers
usually adheres to a specific distribution, e.g., Zipf distribution [41]. It is essential
to note that, within industrial edge computing systems, data popularity serves as an
indicator of the frequency with which data is requested.
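
A quick numerical illustration of the Zipf assumption: with a moderately skewed catalog, caching only the few most popular items already captures a large fraction of requests. The catalog size, cache size, and skew below are illustrative assumptions.

```python
import numpy as np

# Zipf-distributed popularity and the hit ratio of caching the top-C items.
N, C, s = 1000, 50, 0.8                        # catalog, cache size, Zipf skew
ranks = np.arange(1, N + 1, dtype=float)
popularity = ranks ** (-s)
popularity /= popularity.sum()                 # request probability per item

hit_ratio = popularity[:C].sum()               # cache the C most popular items
print(f"top-{C} caching hit ratio: {hit_ratio:.3f}")
```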
Wang et al. [42] indicated that the requested data of different edge nodes is highly
variant and each edge has its specific features. Since the performance of data caching
strategies heavily depends on the accuracy of popularity prediction [43], many research works adopt Machine Learning (ML) technologies to predict popularity, adapting to the time-varying popularity of users' requests [44, 45].
In [46], the authors predicted data popularity based on historical device requests
through the Long Short-Term Memory (LSTM) algorithm. However, the frequent
mobility of devices often leads to dynamic changes in data popularity, posing
challenges for accurate predictions.
With the predicted popularity of the data, the authors in [47] adopted a statistical model to make bitrate selection and edge caching decisions for video streaming transmission in industrial edge computing. These Quality of Experience (QoE)-driven strategies neglect the impact of video quality and rebuffering on different categories of 360-degree videos, which greatly reduces the QoE. Research has indicated that diverse video categories place different emphasis on factors like video quality and rebuffering [48]. There remains considerable potential for further enhancement of users' average QoE.

Cooperative Caching Cooperative caching has been widely adopted to improve caching efficiency by coordinating the caching decisions among multiple edges. Jiang et al. [49] have delved into decentralized cooperative caching strategies, in which every edge makes its own caching decision with a small caching-information exchange cost.
In addressing the D2D caching problem, Jiang et al. [50] approached it as
a Multi-Agent Multiarmed Bandit (MAMAB) problem, introducing decentralized
Multi-Agent Reinforcement Learning (MARL) algorithms to maximize individual
caching efficiency. Xu et al. [51] employed the $\epsilon$-calibration algorithm to estimate
caching decisions of other edges, subsequently proposing a decentralized MAMAB
algorithm for each edge. This algorithm guides each edge in making caching
decisions based on local historical actions and predicted caching decisions of other
edges.
Poularakis et al. [52] formulated caching as a multi-criterion response-rate optimization problem via placement and proposed a randomized rounding technique to solve it. Xu et al. [53] introduced a data caching algorithm based on Lyapunov optimization, aiming to decrease average latency under limited energy consumption and storage resources in dynamic networks. Also, in [54], the problem is further regularized, rounded, and decomposed for solving, while time-varying requests are considered.
Xu et al. [55] introduced collaborative MAMAB-based algorithms. These algo-
rithms provide solutions for optimizing data caching when user preferences are
not explicitly known. Additionally, Baccelli et al. [56] employed two cooperative
edges for data delivery to users. Contextual MAB techniques were applied in [57]
for context-specific data popularity prediction. Subsequently, each edge optimized
caching decisions based on the estimated data popularity. In [58], Ioannidis et al.
considered request routing costs within cooperative networks when designing a data
caching strategy. This approach takes into account the intricacies of cooperative
networks to optimize the overall caching strategy.
Recommendation-Driven Caching
Over the past decade, recommendation-driven data caching has been widely studied
to improve the caching hit rate by jointly taking the caching and recommendation
into account. The role of the recommendation is depicted in Fig. 2.1.
The soft cache hits scheme proposed by Sermpezis et al. [59] involves users accepting recommended similar, already-cached data when the requested data is not available in the cache. The authors then developed the optimal caching policy under the soft cache hits scheme. In [60], an efficient heuristic joint caching and recommendation algorithm was devised to maximize the average cache hit ratio. This algorithm enables a recommender system to suggest cached items that users may like but are not their absolute favorites. Some DRL-based temporal–spatial recommendation and caching algorithms have also emerged. However, it is worth noting that the works discussed in [61] do not account for the potential of multiple BSs sharing their cached data in multi-cell cooperative networks, i.e., they primarily focus on scenarios involving a single edge.
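
A toy rendition of the soft cache hits idea helps fix intuition: when the requested item is missing, the system recommends the most similar cached item and counts a soft hit if the similarity clears the user's acceptance threshold. The random similarity matrix and the threshold are assumptions for illustration, not the model of [59].

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, threshold = 20, 5, 0.8
sim = rng.random((N, N)); np.fill_diagonal(sim, 1.0)   # toy item-item similarity
cache = set(range(C))                                   # cache the first C items

def serve(request):
    if request in cache:
        return "hard hit"
    best = max(cache, key=lambda c: sim[request, c])    # most similar cached item
    return "soft hit" if sim[request, best] >= threshold else "miss"

print([serve(int(r)) for r in rng.integers(0, N, size=10)])
```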

Fig. 2.1 An illustration of a caching system including two small BSs (SBSs) equipped with edge servers and four users: (a) with recommendation, without cooperation: since the blue data is the most popular, the two SBSs cache and recommend it to users, while users 2 and 4 expect other data; (b) without recommendation, with cooperation: the two SBSs cache red and blue data, respectively, in which case the request from user 4 for yellow data cannot be served; (c) with recommendation and cooperation: the request of user 4 is changed to the red one, and thus all requests can be served when the two SBSs cache red and blue data, respectively. Note that the user request patterns differ from the corresponding user preference patterns due to the impact of the recommendation system

This collaborative approach offers an effective solution to optimize the utilization of limited edge storage resources.
In cooperative networks, the authors in [61] observed that the requests at each BS frequently change as devices move around different areas. This reveals a drawback of existing data caching methods: user mobility leads to a reduction in the service hit rate as users move among the coverage areas of different edge servers. Although this can be alleviated by frequently updating the data caching policy, significant edge–cloud transmission overheads would be incurred, causing deterioration in latency.

2.2.2.2 Service Migration

To deal with the problems caused by mobility, service migration has been proposed to migrate services among BSs adaptively [62].
Many solutions based on Software-Defined Networking (SDN) have been developed to improve mobility management performance in 5G dense networks according to the radio signal strength. Oliva et al. [63] presented a resilient SDN framework that builds a virtual registration area to avoid updating users' locations. In this framework, the local mobility anchor establishes a bidirectional tunnel between mobility access gateways located in edge servers and thus achieves seamless handoff. Tartarini et al. [64] proposed a quality-of-service handoff rerouting component, which proactively generates a route path before users reach their new locations based on trace prediction. However, due to the substantial data size involved in transmission (e.g., application and history data), far exceeding session-based and resilient signaling messages, this method introduces additional latency.
Moreover, the services may not be migrated to the other BS after the user's connected BS changes, which further complicates the service migration problem.
Therefore, it is crucial for service migration strategies to incorporate considerations of migration energy consumption and latency. In a one-dimensional (1-D) mobility scenario, i.e., users can only move forward or backward, Ksentini et al. [65] proposed a method that predicts user trajectories using a Markov Decision Process (MDP) model to guide service migration. Extending the scope to a more realistic two-dimensional (2-D) mobility scenario, Wang et al. [66] based their migration decisions on trajectory predictions derived from a 2-D MDP model with a significantly larger state space than the 1-D model. Sun et al. [67] devised migration decisions through a user-centric scheme based on MAB theories to meet long-term energy budgets. These strategies, independently crafted by different users, overlook interference and service sharing among users, limiting the achievable latency reduction under constrained resources. Furthermore, sustaining trajectory prediction accuracy proves challenging as the number of users grows significantly.
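
The flavor of MDP-guided migration can be captured by a tiny 1-D model: the state is the hop distance between the user and the edge hosting the service, "stay" pays a distance-proportional communication cost, and "migrate" pays a one-off cost and resets the distance. All costs, the mobility probability, and the discount factor below are assumptions; real schemes such as [65, 66] use far richer state spaces.

```python
import numpy as np

D, c_comm, c_mig, p_away, gamma = 10, 1.0, 5.0, 0.6, 0.9

def q_values(V, d):
    """Expected discounted costs of staying vs. migrating at distance d."""
    up, down = min(d + 1, D), max(d - 1, 0)
    stay = c_comm * d + gamma * (p_away * V[up] + (1 - p_away) * V[down])
    migrate = c_mig + gamma * (p_away * V[1] + (1 - p_away) * V[0])
    return stay, migrate

V = np.zeros(D + 1)
for _ in range(500):                      # value iteration to (near) convergence
    V = np.array([min(q_values(V, d)) for d in range(D + 1)])

policy = ["migrate" if q_values(V, d)[1] < q_values(V, d)[0] else "stay"
          for d in range(D + 1)]
print(policy)                             # typically a distance-threshold policy
```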

2.2.3 Intelligent Application in Industrial Edge Computing

Industrial edge computing applications span a wide range of scenarios in our daily lives, encompassing everything from manufacturing to the IoT domain. The emergence of Industry 4.0 has ushered in a new era, where edge computing, in conjunction with industrial clouds, offers holistic solutions for pioneering business models. Within these models, concepts like extensive customization and service-based production take center stage. In this section, we list some related research on improving processing speed and accuracy.

2.2.3.1 Inference Acceleration

Edge-assisted inference is an effective method to achieve accurate and real-time inference for applications in industrial edge computing systems, e.g., object detection, by offloading a part of the workload-heavy Convolutional Neural Network (CNN) model to edge servers.
Parallel Inference For computation-intensive heavyweight models, existing works focus on offloading them to edge servers and executing them in a parallel manner without the limitation of end-device resources [13].
The authors in [68] selected a proper partitioning point for the CNN model and thus divided it into two parts: one executed locally and the other executed on the edge server. Another perspective was presented in [13], where the authors chose a partitioning point to minimize channel coding costs. However, these methodologies presuppose that the edge server possesses ample computational resources to fulfill the accuracy and real-time requirements of inference tasks. In reality, edge servers are often resource-limited and incapable of meeting the stringent accuracy and real-time demands of autonomous mobile vision applications. Therefore, recent studies [69, 70] have proposed multi-edge-assisted inference in edge networks, in which multiple edge servers collaborate with each other.
There are two methods of DNN model inference parallelism: model parallelism and data parallelism. Model parallelism is based on the fact that the first several layers of DNN models can be executed on mobile devices to reduce the size of the transmitted data, as well as the transmission latency. In [71], multiple layers are fused into a single layer, which is divided into multiple tasks based on the available computation and communication resources. The authors in [72] offloaded slices, partitioned from multiple convolution layers, to different edge servers to improve inference speed. Data parallelism splits the video frame into several regions and executes each region on a different edge server to reduce computation latency.
These parallel sub-task partitioning and offloading strategies introduce high communication costs among edge servers to maintain inference accuracy, which makes the inference latency vulnerable to network fluctuations. To this end, the authors in [70] introduced a fully decomposable spatial task partitioning approach, where sub-tasks are handled by multiple homogeneous edge servers without requiring data exchange among them. However, applying this method to the tail part of the CNN model greatly reduces the inference accuracy.
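
The partition-point selection underlying such model-parallel schemes can be sketched in a few lines: run the head of the network locally, ship the intermediate feature at the cut, and run the tail on the edge. The per-layer workloads and feature sizes below are illustrative assumptions.

```python
# A minimal sketch of choosing a CNN partition point to minimize latency.

def best_partition(flops, feat_bits, f_dev, f_edge, rate):
    """Split index k: layers [0, k) run locally, layers [k, n) run on the edge.
    feat_bits[k] is the data crossing the cut before layer k (feat_bits[0] = input)."""
    n = len(flops)
    best_k, best_lat = 0, float("inf")
    for k in range(n + 1):
        local = sum(flops[:k]) / f_dev
        tx = feat_bits[k] / rate if k < n else 0.0   # fully local: nothing to send
        edge = sum(flops[k:]) / f_edge
        if local + tx + edge < best_lat:
            best_k, best_lat = k, local + tx + edge
    return best_k, best_lat

flops = [2e8, 4e8, 4e8, 1e8]                # four layers (illustrative)
feat_bits = [8e6, 4e6, 2e6, 1e6, 1e4]       # sizes at each possible cut
print(best_partition(flops, feat_bits, f_dev=1e9, f_edge=1e10, rate=1e7))
```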

Knowledge Distillation With the continuous expansion of deep neural networks, executing these heavyweight DNN models on resource-limited end devices has become infeasible. Prior works aimed to reduce the computation resource requirements of inference models, i.e., to obtain lightweight models, by model pruning [73], model distillation [14], and model quantization [74]. Compared to heavyweight models, lightweight models cannot maintain high accuracy across video scenes due to their limited generalization capabilities.
To solve the above problem, existing works have also studied video analytics systems that balance inference accuracy and latency by constructing a collaborative system with a teacher–student learning framework. The main research direction is adjusting the inference process to achieve optimal inference accuracy. For example, NoScope [75] proposed a "detect and track" method to reduce the computation cost by only inferring on key frames and tracking the other frames with a simple tracking model. AMS [76] proposed a collaborative system to optimize the retraining process, which updates student models in a polling manner with different retraining costs. However, in heterogeneous MEC networks, AMS forgoes a potentially large improvement in inference accuracy, as the computation resources of neighboring low-load edge servers are wasted.
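
The core of the teacher–student objective can be written compactly: the student matches the teacher's temperature-softened output distribution in addition to the usual hard-label loss. This is a generic distillation sketch (the temperature and weighting are assumptions), not the exact objective of the systems above.

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z -= z.max()                                   # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label, T=4.0, alpha=0.7):
    p_t = softmax(teacher_logits, T)               # softened teacher distribution
    p_s = softmax(student_logits, T)               # softened student distribution
    soft = -np.sum(p_t * np.log(p_s + 1e-12))      # cross-entropy to the teacher
    hard = -np.log(softmax(student_logits)[label] + 1e-12)  # hard-label CE
    return alpha * T ** 2 * soft + (1 - alpha) * hard

print(distillation_loss([2.0, 0.5, -1.0], [3.0, 0.2, -2.0], label=0))
```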

2.2.3.2 Cooperative Inference for Improved Accuracy

The accuracy of inference for a single device is constrained by the limited sensing range and blind spots of its sensors, which motivates sharing data among devices. Depending on the type of sensing data shared between devices, cooperative perception operates at the raw, feature, or object level, as shown in Fig. 2.2.
Raw-Level The approach involves sharing and gathering raw data to create a
comprehensive view. Cooper [77] is a pioneering method, aiming to enhance the
sensing area and improve inference accuracy by facilitating the exchange of raw
data between two devices. EMP [36] leveraged infrastructure support to share

Fig. 2.2 An example of cooperation in the perception of connected and autonomous vehicles,
where the raw, feature, and object data can be shared with others, respectively
References 31

nonoverlapping segments of Light Detection and Ranging (LiDAR) point clouds


using adaptive spatial partitioning, providing scalability, robustness, and efficiency
to multi-device perception.
Feature-Level F-Cooper [78] designed a cooperative perception framework based
on features extracted from point cloud data, which fuses voxel features and spatial
features using a maxout function. V2VNet [79] employed a spatial-aware Graph
Neural Network (GNN) to intelligently aggregate information from different points
in time and viewpoints within the scene.
Object-Level Object-level cooperative perception, where only detection results
(e.g., 3D bounding box position) are exchanged, is exemplified in [80]. This work
introduced a two-layer architecture that handles object tracking and fusion from
dynamic remote sources of information, optimizing both the quality of information
and computation latency. In a similar vein, Rauch et al. [81] explored the sharing
of locally perceived object data and investigated temporal and spatial alignment for
the exchanged data.

References

1. Jiong Jin, Kan Yu, Ning Zhang, and Zhibo Pang. Guest editorial: Special section on real-
time edge computing over new generation automation networks for industrial cyber-physical
systems. IEEE Trans. Ind. Informatics, 18(12):9268–9270, 2022.
2. Akanksha Dixit, Arjun Singh, Yogachandran Rahulamathavan, and Muttukrishnan Rajarajan.
FAST DATA: A fair, secure, and trusted decentralized IIoT data marketplace enabled by
blockchain. IEEE Internet Things J., 10(4):2934–2944, 2023.
3. Mingkai Chen, Lindong Zhao, Jianxin Chen, Xin Wei, and Mohsen Guizani. Modal-aware
resource allocation for cross-modal collaborative communication in IIoT. IEEE Internet Things
J., 10(17):14952–14964, 2023.
4. Wenhao Fan, Shenmeng Li, Jie Liu, Yi Su, Fan Wu, and Yuanan Liu. Joint task offloading
and resource allocation for accuracy-aware machine-learning-based IIoT applications. IEEE
Internet Things J., 10(4):3305–3321, 2023.
5. Hui Yin, Wei Zhang, Hua Deng, Zheng Qin, and Keqin Li. An attribute-based searchable
encryption scheme for cloud-assisted IIoT. IEEE Internet Things J., 10(12):11014–11023,
2023.
6. Yuhuai Peng, Alireza Jolfaei, Qiaozhi Hua, Wen-Long Shang, and Keping Yu. Real-time
transmission optimization for edge computing in industrial cyber-physical systems. IEEE
Trans. Ind. Informatics, 18(12):9292–9301, 2022.
7. Peiying Zhang, Yi Zhang, Neeraj Kumar, and Ching-Hsien Hsu. Deep reinforcement learning
algorithm for latency-oriented IIoT resource orchestration. IEEE Internet Things J., 10(8):7153–7163, 2023.
8. Guowen Wu, Zhiqi Xu, Hong Zhang, Shigen Shen, and Shui Yu. Multi-agent DRL for joint
completion delay and energy consumption with queuing theory in MEC-based IIoT. J. Parallel
Distributed Comput., 176:80–94, 2023.
9. M. S. Syam, Sheng Luo, Yue Ling Che, Kaishun Wu, and Victor C. M. Leung. Energy-efficient
intelligent reflecting surface aided wireless-powered IIoT networks. IEEE Syst. J., 17(2):2534–
2545, 2023.
10. Matt Walker. Operators facing power cost crunch. https://www.mtnconsulting.biz/product. Accessed Nov 7, 2020.

11. D. Chen and W. Ye. 5G power: Creating a green grid that slashes costs, emissions & energy use. https://www.huawei.com/en/publications/communicate/89/5g-power-green-grid-slashes-costs-emissions-energy-use. Accessed Nov 7, 2020.
12. Valentin Poirot, Mårten Ericson, Mats Nordberg, and Karl Andersson. Energy efficient multi-
connectivity algorithms for ultra-dense 5G networks. IEEE Wireless Networks, 26(3):2207–
2222, Jun. 2020.
13. Mikolaj Jankowski, Deniz Gündüz, and Krystian Mikolajczyk. Joint device-edge inference
over wireless links with pruning. In 21st IEEE International Workshop on Signal Processing
Advances in Wireless Communications, SPAWC 2020, Atlanta, GA, USA, May 26–29, 2020,
pages 1–5. IEEE, 2020.
14. Lin Wang and Kuk-Jin Yoon. Knowledge distillation and student-teacher learning for visual
intelligence: A review and new outlooks. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 44(6):3048–3068, 2022.
15. Mingchuan Zhang, Yangfan Zhou, Quanbo Ge, Ruijuan Zheng, and Qingtao Wu. Decentralized
randomized block-coordinate Frank-Wolfe algorithms for submodular maximization over
networks. IEEE Trans. Syst. Man Cybern. Syst., 52(8):5081–5091, 2022.
16. Zheng Yao, Huaiyu Wu, and Yang Chen. Multi-objective cooperative computation offloading
for MEC in UAVs hybrid networks via integrated optimization framework. Comput. Commun.,
202:124–134, 2023.
17. Thaha Mohammed, Carlee Joe-Wong, Rohit Babbar, and Mario Di Francesco. Distributed
inference acceleration with adaptive DNN partitioning and offloading. In 39th IEEE Confer-
ence on Computer Communications, INFOCOM 2020, Toronto, ON, Canada, July 6–9, 2020,
pages 854–863. IEEE, 2020.
18. Xiong Wang, Jiancheng Ye, and John C. S. Lui. Decentralized task offloading in edge
computing: A multi-user multi-armed bandit approach. In IEEE INFOCOM 2022—IEEE
Conference on Computer Communications, London, United Kingdom, May 2–5, 2022, pages
1199–1208. IEEE, 2022.
19. Liping Qian, Yuan Wu, Fuli Jiang, Ningning Yu, Weidang Lu, and Bin Lin. NOMA assisted
multi-task multi-access mobile edge computing via deep reinforcement learning for industrial
internet of things. IEEE Trans. Ind. Informatics, 17(8):5688–5698, 2021.
20. Sladana Josilo and György Dán. Computation offloading scheduling for periodic tasks in
mobile edge computing. IEEE/ACM Trans. Netw., 28(2):667–680, 2020.
21. Sarhad Arisdakessian, Omar Abdel Wahab, Azzam Mourad, Hadi Otrok, and Nadjia Kara.
FoGMatch: An intelligent multi-criteria IoT-FoG scheduling approach using game theory.
IEEE/ACM Trans. Netw., 28(4):1779–1789, 2020.
22. Gongming Zhao, Hongli Xu, Yangming Zhao, Chunming Qiao, and Liusheng Huang. Offload-
ing dependent tasks in mobile edge computing with service caching. In 39th IEEE Conference
on Computer Communications, INFOCOM 2020, Toronto, ON, Canada, July 6–9, 2020, pages
1997–2006. IEEE, 2020.
23. Yinghui He, Jinke Ren, Guanding Yu, and Yunlong Cai. D2D Communications Meet Mobile
Edge Computing for Enhanced Computation Capacity in Cellular Networks. IEEE Transac-
tions on Wireless Communications, 18(3):1750–1763, 2019.
24. Molin Li, Xiaobo Zhou, Tie Qiu, Qinglin Zhao, and Keqiu Li. Multi-relay assisted computation
offloading for multi-access edge computing systems with energy harvesting. IEEE Trans. Veh.
Technol., 70(10):10941–10956, 2021.
25. Pimmy Gandotra and Rakesh Kumar Jha. Device-to-device communication in cellular net-
works: A survey. J. Netw. Comput. Appl., 71:99–117, 2016.
26. J. Nicholas Laneman, David N. C. Tse, and Gregory W. Wornell. Cooperative diversity
in wireless networks: Efficient protocols and outage behavior. IEEE Trans. Inf. Theory,
50(12):3062–3080, 2004.
27. Yang Li, Gaochao Xu, Kun Yang, Jiaqi Ge, Peng Liu, and Zhenjun Jin. Energy efficient relay
selection and resource allocation in d2d-enabled mobile edge computing. IEEE Trans. Veh.
Technol., 69(12):15800–15814, 2020.

28. Chang Shu, Zhiwei Zhao, Yunpeng Han, Geyong Min, and Hancong Duan. Multi-user
offloading for edge computing networks: A dependency-aware and latency-optimal approach.
IEEE Internet of Things Journal, 7(3):1678–1689, 2020.
29. Jeffrey D. Ullman. NP-complete scheduling problems. Journal of Computer and System
Sciences, 10(3):384–393, 1975.
30. Yujiong Liu, Shangguang Wang, Qinglin Zhao, Shiyu Du, Ao Zhou, Xiao Ma, and Fangchun
Yang. Dependency-aware task scheduling in vehicular edge computing. IEEE Internet of
Things Journal, 7(6):4961–4971, 2020.
31. Hanlong Liao, Xinyi Li, Deke Guo, Wenjie Kang, and Jiangfan Li. Dependency-aware
application assigning and scheduling in edge computing. IEEE Internet of Things Journal,
9(6):4451–4463, 2022.
32. Zhiqing Tang, Jiong Lou, Fuming Zhang, and Weijia Jia. Dependent task offloading for
multiple jobs in edge computing. In International Conference on Computer Communications
and Networks, ICCCN 2020, Honolulu, HI, USA, August 3–6, 2020, 2020.
33. Jia Yan, Suzhi Bi, and Ying Jun Angela Zhang. Offloading and resource allocation with
general task graph in mobile edge computing: A deep reinforcement learning approach. IEEE
Transactions on Wireless Communications, 19(8):5404–5419, 2020.
34. Shumei Liu, Yao Yu, Xiao Lian, Yuze Feng, Changyang She, Phee Lep Yeoh, Lei Guo,
Branka Vucetic, and Yonghui Li. Dependent task scheduling and offloading for minimizing
deadline violation ratio in mobile edge computing networks. IEEE Journal on Selected Areas
in Communications, 41(2):538–554, 2023.
35. Xuming An, Rongfei Fan, Han Hu, Ning Zhang, Saman Atapattu, and Theodoros A. Tsiftsis.
Joint task offloading and resource allocation for IoT edge computing with sequential task
dependency. IEEE Internet of Things Journal, 9(17):16546–16561, 2022.
36. Xumiao Zhang, Anlan Zhang, Jiachen Sun, Xiao Zhu, Yihua Ethan Guo, Feng Qian, and
Z. Morley Mao. EMP: edge-assisted multi-vehicle perception. In ACM MobiCom ’21: The
27th Annual International Conference on Mobile Computing and Networking, New Orleans,
Louisiana, USA, October 25–29, 2021, 2021.
37. Jia Yan, Suzhi Bi, Ying Jun Zhang, and Meixia Tao. Optimal task offloading and resource
allocation in mobile-edge computing with inter-user task dependency. IEEE Transaction on
Wireless Communication, 19(1):235–250, 2020.
38. Pengbo Liu, Shuxin Ge, Xiaobo Zhou, Chaokun Zhang, and Keqiu Li. Soft actor-critic-
based DAG tasks offloading in multi-access edge computing with inter-user cooperation. In
Algorithms and Architectures for Parallel Processing—21st International Conference, ICA3PP
2021, Virtual Event, December 3–5, 2021, Proceedings, Part III, volume 13157, pages 313–
327, 2021.
39. Steven Davy, Jeroen Famaey, Joan Serrat, Juan Luis Gorricho, Avi Miron, Manos Dramitinos,
Pedro Miguel Neves, Steven Latré, and Ezer Gochen. Challenges to support edge-as-a-service.
IEEE Communications Magazine, 52(1):132–139, Jul. 2014.
40. X. Zhang and Q. Zhu. Hierarchical caching for statistical QoS guaranteed multimedia trans-
missions over 5G edge computing mobile wireless networks. IEEE Wireless Communications,
25(3):12–20, Jun. 2018.
41. Lee Breslau, Pei Cao, Li Fan, Graham Phillips, and Scott Shenker. Web caching and Zipf-
like distributions: Evidence and implications. In Proceedings IEEE INFOCOM ’99, The
Conference on Computer Communications, Eighteenth Annual Joint Conference of the IEEE
Computer and Communications Societies, The Future Is Now, New York, NY, USA, March 21–
25, 1999, pages 126–134, 1999.
42. Fangxin Wang, Feng Wang, Jiangchuan Liu, Ryan Shea, and Lifeng Sun. Intelligent video
caching at network edge: A multi-agent deep reinforcement learning approach. In 39th IEEE
Conference on Computer Communications, INFOCOM 2020, Toronto, ON, Canada, July 6–9,
2020, pages 2499–2508. IEEE, 2020.
43. Liang Li, Dian Shi, Ronghui Hou, Rui Chen, Bin Lin, and Miao Pan. Energy-efficient proactive
caching for adaptive video streaming via data-driven optimization. IEEE Internet Things J.,
7(6):5549–5561, 2020.

44. Hao Zhu, Yang Cao, Xiao Wei, Wei Wang, Tao Jiang, and Shi Jin. Caching transient data
for internet of things: A deep reinforcement learning approach. IEEE Internet Things J.,
6(2):2074–2083, 2019.
45. Jingjing Yao and Nirwan Ansari. Caching in dynamic IoT networks by deep reinforcement
learning. IEEE Internet Things J., 8(5):3268–3275, 2021.
46. Ruyan Wang, Zunwei Kan, Yaping Cui, Dapeng Wu, and Yan Zhen. Cooperative caching
strategy with content request prediction in internet of vehicles. IEEE Internet Things J.,
8(11):8964–8975, 2021.
47. Georgios Papaioannou and Iordanis Koutsopoulos. Tile-based caching optimization for 360° videos. In Proceedings of the Twentieth ACM International Symposium on Mobile Ad Hoc Networking and Computing, 2019.
48. Ivan Slivar, Mirko Suznjevic, and Lea Skorin-Kapov. Game categorization for deriving QoE-driven video encoding configuration strategies for cloud gaming. ACM Transactions on Multimedia Computing, Communications, and Applications, 2017.
49. Wei Jiang, Gang Feng, Shuang Qin, and Ying-Chang Liang. Learning-based cooperative
content caching policy for mobile edge computing. In ICC 2019–2019 IEEE International
Conference on Communications (ICC), pages 1–6. IEEE, 2019.
50. Wei Jiang, Gang Feng, Shuang Qin, Tak Shing Peter Yum, and Guohong Cao. Multi-
agent reinforcement learning for efficient content caching in mobile d2d networks. IEEE
Transactions on Wireless Communications, 18(3):1610–1622, 2019.
51. Xianzhe Xu and Meixia Tao. Decentralized multi-agent multi-armed bandit learning with
calibration for multi-cell caching. IEEE Transactions on Communications, 2020.
52. K. Poularakis, J. Llorca, A. M. Tulino, I. Taylor, and L. Tassiulas. Joint service placement
and request routing in multi-cell mobile edge computing networks. In IEEE Conference on
Computer Communications, INFOCOM, pages 10–18, Paris, France, Apr. 2019.
53. Jie Xu, Lixing Chen, and Pan Zhou. Joint service caching and task offloading for mobile edge
computing in dense networks. In IEEE Conference on Computer Communications, INFOCOM,
pages 207–215, Honolulu, HI, USA, Apr. 2018.
54. Lingjun Pu, Jiao Lei, Chen Xu, Wang Lin, and Jingdong Xu. Online resource allocation,
content placement and request routing for cost-efficient edge caching in cloud radio access
networks. IEEE Journal on Selected Areas in Communications, 36(8):1751–1767, Dec. 2018.
55. Xianzhe Xu, Meixia Tao, and Cong Shen. Collaborative multi-agent multi-armed bandit
learning for small-cell caching. IEEE Transactions on Wireless Communications, 19(4):2570–
2585, 2020.
56. François Baccelli and Anastasios Giovanidis. A stochastic geometry framework for analyzing
pairwise-cooperative cellular networks. IEEE Transactions on Wireless Communications,
14(2):794–808, 2014.
57. S. Müller, O. Atan, M. van der Schaar, and A. Klein. Context-aware proactive content
caching with service differentiation in wireless networks. IEEE Transactions on Wireless
Communications, 16(2):1024–1036, 2017.
58. Stratis Ioannidis and Edmund Yeh. Adaptive caching networks with optimality guarantees.
IEEE/ACM Transactions on Networking, 26(2):737–750, 2018.
59. Pavlos Sermpezis, Theodoros Giannakas, Thrasyvoulos Spyropoulos, and Luigi Vigneri. Soft
cache hits: Improving performance through recommendation and delivery of related content.
IEEE Journal on Selected Areas in Communications, 36(6):1300–1313, 2018.
60. Livia Elena Chatzieleftheriou, Merkouris Karaliopoulos, and Iordanis Koutsopoulos. Jointly
optimizing content caching and recommendations in small cell networks. IEEE Transactions
on Mobile Computing, 18(1):125–138, 2018.
61. Kaiyang Guo and Chenyang Yang. Temporal-spatial recommendation for caching at base
stations via deep reinforcement learning. IEEE Access, 7:58519–58532, 2019.
62. T. Ouyang, Z. Zhou, and X. Chen. Follow me at the edge: Mobility-aware dynamic service
placement for mobile edge computing. IEEE Journal on Selected Areas in Communications,
36(10):2333–2345, Oct. 2018.
63. Antonio de la Oliva, Xi Li, Xavier Pérez Costa, Carlos Jesus Bernardos, Philippe Bertin,
Paola Iovanna, Thomas Deiß, Josep Mangues, Alain Mourad, Claudio Casetti, Jose Enrique
Gonzalez, and Arturo Azcorra. 5G-TRANSFORMER: Slicing and orchestrating transport
networks for industry verticals. IEEE Communications Magazine, 56(8):78–84, Aug. 2018.
64. Luca Tartarini, Marcelo Antonio Marotta, Eduardo Cerqueira, Juergen Rochol, Cristiano Bon-
ato Both, Mario Gerla, and Paolo Bellavista. Software-defined handover decision engine for
heterogeneous cloud radio access networks. Computer Communications, 115:21–34, Mar. 2018.
65. Adlen Ksentini, Tarik Taleb, and Min Chen. A Markov decision process-based service migra-
tion procedure for follow me cloud. In IEEE International Conference on Communications,
ICC, pages 1350–1354, Sydney, Australia, Jun. 2014.
66. S. Wang, R. Urgaonkar, M. Zafer, T. He, K. Chan, and K. K. Leung. Dynamic service migration
in mobile edge computing based on Markov decision process. IEEE/ACM Transactions on
Networking, 27(3):1272–1288, Jun. 2019.
67. Y. Sun, S. Zhou, and J. Xu. EMM: Energy-aware mobility management for mobile edge
computing in ultra dense networks. IEEE Journal on Selected Areas in Communications,
35(11):2637–2646, Nov. 2017.
68. Surat Teerapittayanon, Bradley McDanel, and H. T. Kung. Distributed deep neural networks
over the cloud, the edge and end devices. In 37th IEEE International Conference on Distributed
Computing Systems, ICDCS 2017, Atlanta, GA, USA, June 5–8, 2017, pages 328–339. IEEE
Computer Society, 2017.
69. Zhuoran Zhao, Kamyar Mirzazad Barijough, and Andreas Gerstlauer. DeepThings: Distributed
adaptive deep learning inference on resource-constrained IoT edge clusters. IEEE Transactions
on Computer-Aided Design of Integrated Circuits and Systems, 37(11):2348–2359, 2018.
70. Sai Qian Zhang, Jieyu Lin, and Qi Zhang. Adaptive distributed convolutional neural network
inference at the network edge with ADCNN. In ICPP 2020: 49th International Conference on
Parallel Processing, Edmonton, AB, Canada, August 17–20, 2020, pages 10:1–10:11. ACM,
2020.
71. Li Zhou, Mohammad Hossein Samavatian, Anys Bacha, Saikat Majumdar, and Radu Teodor-
escu. Adaptive parallel execution of deep neural networks on heterogeneous edge devices.
In Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, SEC 2019, Arlington,
Virginia, USA, November 7–9, 2019, pages 195–208. ACM, 2019.
72. Thaha Mohammed, Carlee Joe-Wong, Rohit Babbar, and Mario Di Francesco. Distributed
inference acceleration with adaptive DNN partitioning and offloading. In 39th IEEE Confer-
ence on Computer Communications, INFOCOM 2020, Toronto, ON, Canada, July 6–9, 2020,
pages 854–863. IEEE, 2020.
73. Ran Xu, Rakesh Kumar, Pengcheng Wang, Peter Bai, Ganga Meghanath, Somali Chaterji,
Subrata Mitra, and Saurabh Bagchi. ApproxNet: Content and contention-aware video object
classification system for embedded clients. ACM Transactions on Sensor Networks,
18(1):11:1–11:27, 2022.
74. Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, and Pritish Narayanan. Deep learning
with limited numerical precision. In Proceedings of the 32nd International Conference on
Machine Learning, ICML 2015, Lille, France, 6–11 July, volume 37, pages 1737–1746.
JMLR.org, 2015.
75. Daniel Kang, John Emmons, Firas Abuzaid, Peter Bailis, and Matei Zaharia. NoScope:
Optimizing deep CNN-based queries over video streams at scale. Proceedings of the VLDB
Endowment, 10(11):1586–1597, 2017.
76. Mehrdad Khani Shirkoohi, Pouya Hamadanian, Arash Nasr-Esfahany, and Mohammad
Alizadeh. Real-time video inference on edge devices via adaptive model streaming. In
IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada,
October 10–17, pages 4552–4562, 2021.
77. Qi Chen, Sihai Tang, Qing Yang, and Song Fu. Cooper: Cooperative perception for connected
autonomous vehicles based on 3d point clouds. In 39th IEEE International Conference on
Distributed Computing Systems, Dallas, TX, USA, pages 514–524, 2019.
78. Qi Chen, Xu Ma, Sihai Tang, Jingda Guo, Qing Yang, and Song Fu. F-Cooper: feature based
cooperative perception for autonomous vehicle edge computing system using 3d point clouds.
In Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, Arlington, Virginia,
USA, pages 88–100, 2019.
79. Tsun-Hsuan Wang, Sivabalan Manivasagam, Ming Liang, Bin Yang, Wenyuan Zeng, and
Raquel Urtasun. V2VNet: vehicle-to-vehicle communication for joint perception and predic-
tion. In Proceedings of the 16th European Conference on Computer Vision, Glasgow, UK,
pages 605–621, 2020.
80. Moreno Ambrosin, Ignacio J. Alvarez, Cornelius Bürkle, Lily L. Yang, Fabian Oboril,
Manoj R. Sastry, and Kathiravetpillai Sivanesan. Object-level perception sharing among
connected vehicles. In IEEE Intelligent Transportation Systems Conference, Auckland, New
Zealand, pages 1566–1573, 2019.
81. Andreas Rauch, Felix Klanner, Ralph H. Rasshofer, and Klaus Dietmayer. Car2X-based
perception in a high-level fusion architecture for cooperative perception systems. In 2012 IEEE
Intelligent Vehicles Symposium, Alcalá de Henares, Madrid, Spain, pages 270–275, 2012.
Chapter 3
Computation Offloading in Industrial Edge Computing
Computation offloading is a key strategy in industrial edge computing, playing a
significant role in enhancing efficiency, reducing latency, and optimizing resource
utilization. As industries increasingly adopt industrial edge computing to leverage
its transformative potential, the understanding and implementation of effective com-
putation offloading strategies become essential. These strategies are crucial to fully
enjoy the benefits that industrial edge computing offers for industrial operations.
This chapter delves into various computation offloading schemes, highlighting
innovative approaches like Adaptive Offloading with Two-Stage Hybrid Matching
(ATOM) and Dependent Offloading with DAG-Based Cooperation Gain. These
schemes represent advanced methodologies in computation offloading, designed to
adapt to the dynamic and complex environments typical in industrial settings.

3.1 Introduction

Industrial edge computing task offloading is a complex yet vital technology
designed to optimize the distribution and execution of computing tasks in industrial
systems [1–3], as shown in Fig. 3.1. This optimization aims to meet specific
performance, latency, and resource utilization requirements. This technology has
developed in response to the burgeoning industrial Internet, which has led to a surge
in data and computing demands from factories and equipment. Traditional cloud
computing models can introduce unnecessary latency in handling these demands. In
contrast, industrial edge computing task offloading strategically allocates computing
tasks to edge servers near the data source, potentially reducing latency and
improving response times significantly [4, 5].


However, this form of task offloading encounters various optimization challenges, such as balancing application latency, energy consumption, and
cooperation gains [6, 7]. It involves making informed decisions about offloading,
computation, and cooperation. Determining which tasks should be executed on edge servers versus MEC or cloud servers requires intelligent algorithms and strategies [8–10]. This decision-making process must consider the nature of the
task, latency requirements, and availability of device resources. Security and privacy
are also crucial considerations to prevent data compromise during transmission and
processing.
Industrial edge computing task offloading utilizes machine learning, optimization
algorithms, and collaborative computing for intelligent task allocation [2, 11–
15]. These methods enable the smart selection of task execution locations based
on differing requirements and environmental conditions, thus maximizing the
efficiency and performance of the industrial system.
This chapter introduces two advanced schemes: ATOM and Dependent Offload-
ing with DAG-Based Cooperation Gain [16]. ATOM focuses on adaptively matching
computational tasks to the most appropriate processing units, considering task
requirements and resource availability. On the other hand, Dependent Offloading
with DAG-Based Cooperation Gain considers task dependencies, using a DAG
framework to enhance offloading efficiency and cooperation. Both strategies aim
to optimize the performance of Industrial Edge Computing systems, ensuring that
computational tasks are managed in the most effective and resource-efficient way.

Fig. 3.1 An illustration of industrial edge computing



3.2 Adaptive Offloading with Two-Stage Hybrid Matching

3.2.1 Statement of Problem
3.2.1.1 Overview

We first give a comprehensive system overview and then detail the communication among devices and the computation of tasks, formulating offloading as an optimization problem.
In this industrial edge computing system model, a network of M MEC servers (MSs) and N edge servers is conceptualized. The edge servers, each assigned an identifier from the set $\mathcal{N} = \{1, 2, \cdots, N\}$, are responsible for generating various tasks. Each task $T_j$ from edge server j is defined by its data size $d_j$, required CPU cycles $\tau_j$, and the maximum latency $t_j^{max}$ it can tolerate.
The system’s architecture permits two primary execution methods for tasks:
• Local Execution: Tasks are processed directly on the edge servers. This method
is generally faster due to reduced communication needs but is limited by the
edge servers’ computational capacity. Hence, the tasks are allowed to be executed
locally once they satisfy the latency requirement.
• Offloading to MSs: When a task cannot be processed within the tolerable latency
by an edge server, it is offloaded to an MS. The MSs are strategically deployed
to ensure each edge server falls under the coverage of multiple MSs, and it has
the option to offload its task to any one of these MSs.
For every task $T_j$ generated by edge server j, the offloading decision to a particular MS i is represented by a binary vector $V_j$. This vector indicates whether task $T_j$ will be offloaded to MS i or executed locally:

$$V_j^i = \begin{cases} 1, & \text{ED } j \text{ offloads } T_j \text{ to MS } i, \ \forall i \in \mathcal{M}, \\ 0, & \text{otherwise}. \end{cases} \tag{3.1}$$

The decision whether a task $T_j$ is executed on an MS is indicated by $a_j$, which is determined by the sum of the elements in $V_j$:

$$a_j = \sum_{i=1}^{M} V_j^i = \begin{cases} 1, & T_j \text{ is offloaded to an MS}, \ \forall j \in \mathcal{N}, \\ 0, & \text{otherwise}. \end{cases} \tag{3.2}$$

3.2.1.2 Communication Model

In the industrial edge computing system, communication between edge servers and
MSs is facilitated using Orthogonal Frequency Division Multiple Access (OFDMA)
[17]. This setup is essential for task transmission from edge servers to MSs,
integrating specific location attributes and wireless communication features.
The model considers the horizontal coordinates of MSs, denoted as $p_i$, and edge servers, denoted as $q_j$. These factors are crucial in calculating the physical distance between each edge server and MS, which is given by the equation:

$$D_{j,i} = \sqrt{\| p_i - q_j \|^2 + h_i^2}, \tag{3.3}$$

where $h_i$ denotes the antenna height of MS i, which captures the three-dimensional geometry.
The transmission rate between MS i and edge server j, when edge server j transmits task $T_j$, is determined as follows:

$$R_{j,i} = B \log_2 \left( 1 + \frac{P_j g_{j,i}}{\sigma^2} \right). \tag{3.4}$$

Here, B denotes the channel bandwidth, $\sigma^2$ is the noise power, and $P_j$ is the transmission power of edge server j. $g_{j,i}$ is the channel power gain between MS i and edge server j, which is calculated considering the distance [18], i.e.,

$$g_{j,i} = \frac{\beta_0}{D_{j,i}^2} = \frac{\beta_0}{\| p_i - q_j \|^2 + h_i^2}. \tag{3.5}$$

The transmission rate, denoted as $R_{j,i}$, is finally expressed by

$$R_{j,i} = B \log_2 \left( 1 + \frac{P_j \beta_0}{\sigma^2 (\| p_i - q_j \|^2 + h_i^2)} \right). \tag{3.6}$$

Thus, the upload latency of task $T_j$ can be calculated by

$$t_{j,i}^{tr} = \frac{d_j}{R_{j,i}}. \tag{3.7}$$

After MS i fully executes task $T_j$, the results are sent back to edge server j. Normally, the transmission latency of returning the results can be neglected due to their very small size [19].

3.2.1.3 Computation Model

The computation model for the industrial edge computing system is based on the computation capabilities of the edge servers and involves two scenarios: local computing and remote computing.

(1) Local Computing

Each edge server, denoted as j , has a specific computation capacity expressed in
CPU cycles per second, represented as $f_j$. The computation capabilities vary across different edge servers. For task $T_j = \{d_j, \tau_j, t_j^{max}\}$ executed locally, its computing latency $t_j^{lc}$ is determined by the following equation:

$$t_j^{lc} = \frac{\tau_j}{f_j}. \tag{3.8}$$

In local execution, there is no transmission latency. Therefore, the total execution latency $t_j^{le}$ of task $T_j$ is solely the computation latency:

$$t_j^{le} = t_j^{lc}. \tag{3.9}$$

If $t_j^{le}$ is less than the tolerable latency $t_j^{max}$, it indicates that edge server j has sufficient computation capacity to execute $T_j$ within the required time frame.

(2) Remote Computing

When the local computation latency exceeds the tolerable latency ($t_j^{le} > t_j^{max}$), it signifies that edge server j lacks the necessary computing power. The task $T_j$ is then offloaded to the queue of MEC server i to await resource allocation. The waiting latency $t_{j,i}^{rw}$ for task $T_j$ is the duration it stays in the queue:

$$t_{j,i}^{rw} = t_{j,i}^{l} - t_{j,i}^{g}, \tag{3.10}$$

where $t_{j,i}^{g}$ and $t_{j,i}^{l}$ denote the times at which $T_j$ joins and leaves the queue, respectively.

Let $F_i$ denote the computation capacity of MS i. The computation capacity allocation decision for i can be denoted by an N-dimensional vector, $\mathbf{F}_i = \{F_{i,1}, F_{i,2}, \cdots, F_{i,N}\}$, which cannot exceed the capacity of i, i.e.,

$$F_{i,j} \ge \frac{\tau_j}{t_j^{max} - t_{j,i}^{tr} - t_{j,i}^{rw}}. \tag{3.11}$$

After determining $\tau_j$ and $F_{i,j}$, the computation latency $t_{j,i}^{rc}$ for task $T_j$ on MS i can be calculated by

$$t_{j,i}^{rc} = \frac{\tau_j}{F_{i,j}}. \tag{3.12}$$
The total execution latency $t_{j,i}^{re}$ of a remotely executed task includes the upload, computation, and waiting latency:

$$t_{j,i}^{re} = t_{j,i}^{tr} + t_{j,i}^{rc} + t_{j,i}^{rw}. \tag{3.13}$$

3.2.1.4 Problem Formulation

In this MEC-enabled IIoT system, the total execution latency of tasks is calculated based on the offloading decisions $V_j^i$ and $a_j$ for each $i \in \mathcal{M}$ and $j \in \mathcal{N}$. The total latency, denoted as $t_{total}$, is formulated as follows:

$$t_{total} = \sum_{j \in \mathcal{N}} \left[ \sum_{i \in \mathcal{M}} V_j^i \cdot t_{j,i}^{re} + (1 - a_j) \cdot t_j^{le} \right], \tag{3.14}$$

where $t_j^{le}$ and $t_{j,i}^{re}$ are as defined earlier.

The optimization problem, aiming to minimize this total task execution latency, referred to as P, is outlined as

$$\begin{aligned} \mathbf{P}: \ & \min_{\{V_j^i, F_{i,j}\}} \ t_{total} \\ \text{s.t.} \ & C1: V_j^i \in \{0, 1\}, \ \forall i \in \mathcal{M}, \ \forall j \in \mathcal{N}, \\ & C2: a_j \in \{0, 1\}, \ \forall j \in \mathcal{N}, \\ & C3: (1 - V_j^i) \cdot t_j^{le} + V_j^i \cdot t_{j,i}^{re} \le t_j^{max}, \\ & C4: \frac{\tau_j}{t_j^{max} - t_{j,i}^{tr} - t_{j,i}^{rw}} \le F_{i,j} \le F_i, \\ & C5: \sum_{j=1}^{N} F_{i,j} \le F_i. \end{aligned} \tag{3.15}$$

In P, the variables to be optimized are $V_j^i$ and $F_{i,j}$. The total execution latency for each task, whether executed locally ($t_j^{le}$) or on server i ($t_{j,i}^{re}$), is considered.

The constraints of problem P are as follows:

• C2 ensures that each task is either executed locally or on a single server.
• C3 guarantees that execution latency does not exceed the maximum tolerable
limits.
• C4 and C5 ensure that the allocated computation capacities can serve the task before its deadline and do not exceed the server's capacity.
This formulation and its constraints jointly define the solution space for effectively managing task execution in the MEC-enabled IIoT system, balancing offloading decisions with computational resources while adhering to time and capacity constraints.
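As an illustration of how the objective in (3.14) is evaluated once the decision variables are fixed, consider the minimal sketch below; all names are ours, and the latency inputs are assumed to have been precomputed via (3.9) and (3.13).

```python
def total_latency(V, t_le, t_re):
    """Objective t_total of P in (3.14)-(3.15) for given offloading decisions.

    V    : V[j][i] in {0,1}, 1 iff task T_j is offloaded to MS i
    t_le : t_le[j], local execution latency of T_j, per (3.9)
    t_re : t_re[j][i], remote execution latency of T_j on MS i, per (3.13)
    """
    total = 0.0
    for j, row in enumerate(V):
        a_j = sum(row)                       # (3.2): 1 if offloaded, 0 if local
        assert a_j in (0, 1), "C2: at most one MS per task"
        total += sum(v * t_re[j][i] for i, v in enumerate(row))
        total += (1 - a_j) * t_le[j]         # local branch of (3.14)
    return total
```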

3.2.2 Scheme Overview

Generally, the randomness of devices prevents the operator from obtaining the
density and size of the received tasks beforehand. Hence, we designed the ATOM
framework, depicted in Fig. 3.2, whose main component is the global buffer. It enables offloading to be executed in two different stages, i.e., offline matching and online matching.

3.2.2.1 Global Buffer

We build a global buffer to direct the offloading decision-making by capturing
the time-varying feature of the number of arriving tasks. The buffer functions
as a mirror of the task arrival density, embodying a physical storage area with

Fig. 3.2 Proposed task offloading scheme

virtual attributes that change over time. When tasks are generated and necessitate
assignment to the MSs, pertinent information such as data size and processing
demands is initially deposited into this overarching buffer. This storage process
allows for the monitoring of task arrival rates, as the buffer’s content, such as the
number of tasks and their cumulative data size, steadily increases with each new
task.
To effectively utilize this information on task arrival density, a threshold parameter, denoted as $\delta$, is introduced. This parameter plays a critical role in the ATOM framework, acting as a switch between its two operational stages: the offline matching and the online matching stages. The value of $\delta$ is set to dynamically
determine the point of transition between these stages, enabling the system to adjust
its strategy according to the observed arriving tasks. Such a mechanism ensures
that the system can respond aptly to varying task loads, optimizing the offloading
process in accordance with real-time demands and capacities.

3.2.2.2 Online Matching Stage

The online matching stage is initiated when the global buffer’s predetermined
threshold has not been reached yet. In this phase, while the specifics about the
MSs are already known, information about the arriving edge servers is unveiled
incrementally as each task arrives [20, 21]. The primary goal here is to establish a
stable matching between MSs and edge servers.
There are some specific advantages of online matching: It significantly improves the response speed, enabling the system to cope with highly dynamic networks and the continuous arrival of new tasks. The strategy also supports flexible decision-
making, relying on local information rather than requiring comprehensive global
information or extensive coordination. Additionally, it improves resource utility by
adaptively allocating tasks in response to the current availability of resources. These
aspects of online matching make it a particularly effective method for task offloading
in the fast-paced and variable context of industrial edge computing.

3.2.2.3 Offline Matching Stage

Once reaching the threshold of the global buffer, the ATOM framework shifts to
the offline matching stage. At this point, comprehensive information about both the
MSs and the edge servers is available prior to making task offloading decisions.
The objective in this stage is to minimize the total latency, striving for an optimal
match between tasks and servers, a problem addressed through offline matching
theory [22]. This approach involves solving a well-defined problem by allocating
edge server tasks to the most suitable MSs.
Offline matching offers distinct advantages. First, it utilizes global information
and historical data, which often results in superior task offloading solutions. This
approach is beneficial for identifying more efficient and effective task allocations.

Second, by taking into account a range of factors and constraints and applying
accurate models, offline matching can achieve enhanced offloading outcomes. This
method is adept at handling complex scenarios, incorporating various parameters
into the decision-making process. Third, offline matching is particularly effective in
environments where conditions are relatively stable, and the influence of dynamic
changes on task offloading decisions is minimal. In essence, offline matching excels
in optimization, accuracy, and stability. It is especially relevant for scenarios that
demand a thorough understanding of global information, as well as long-term
strategic planning. This stage is critical in environments where consistency and
predictability are key, allowing for well-informed, data-driven decision-making.

3.2.3 Global Buffer

The threshold $\delta$ of the global buffer is influenced by the ratio of heavy tasks to light tasks, denoted as $\varepsilon$, and the task arrival rate $\lambda_t$. When the proportion of light tasks is high ($\varepsilon$ tends to 0), $\delta$ is better set to a higher value, ideally approaching infinity. This is because light tasks are more suited for online offloading, and a higher $\delta$ ensures more of these tasks are processed in the online matching stage. Conversely, when the proportion of heavy tasks increases ($\varepsilon$ tends to 1), it is preferable to engage in offline matching more frequently. In this case, for $\varepsilon = 1$, we minimize the threshold $\delta$ to $\delta_{min}$.

To find a suitable threshold, we build the following threshold function:

$$\delta = \lceil (1 - a \ln \varepsilon)\, \delta_{min} \rceil. \tag{3.16}$$

Here, a is the weighting factor for the proportion of heavy tasks, and $\epsilon$ is the task departure rate. $\delta_{min}$ depends on the time-varying arrival rate $\lambda_t$. Within the simulation duration $t_{sim}$, the number of tasks that cannot be offloaded is

$$\theta_t = \int_0^{t_{sim}} (\lambda_t - \epsilon)\, dt. \tag{3.17}$$

For executing the offline matching stage at most b times, $\delta_{min}$ is calculated by

$$\delta_{min} = \frac{\theta_t}{b} = \frac{\int_0^{t_{sim}} (\lambda_t - \epsilon)\, dt}{b}. \tag{3.18}$$
Incorporating this into the threshold function, the final form becomes

$$\delta = \left\lceil (1 - a \ln \varepsilon)\, \frac{\int_0^{t_{sim}} (\lambda_t - \epsilon)\, dt}{b} \right\rceil. \tag{3.19}$$

There are two scenarios for $\lambda_t$ and $\epsilon$: If they can be estimated from historical data, the exact value of the threshold can be obtained by integrating the function. If they are too complex for such estimation, numerical integration methods are required for an approximate threshold calculation.
To accurately determine $\delta$, which is crucial for the transition between online and offline matching stages in the ATOM framework, we employ numerical integration to handle the complexity of the task arrival rate $\lambda_t(t)$ and departure rate $\epsilon(t)$. The function $f(t) = \lambda_t(t) - \epsilon(t)$ represents the net rate of task accumulation over time.

For the purpose of numerical integration, the total simulation time $t_{sim}$ is divided into n equal intervals, each represented as $[t_k, t_{k+1}]$, where k ranges from 0 to $n-1$. The step size for each interval is $h = t_{sim}/n$. To minimize integration error, composite Simpson's rule is applied, a method known for its accuracy in approximating definite integrals. The integral of $f(t)$ from 0 to $t_{sim}$ is approximated as follows:

$$\int_0^{t_{sim}} f(t)\, dt \approx \sum_{k=0}^{n-1} \frac{h}{6} \left[ f(t_k) + 4 f(t_{k+\frac{1}{2}}) + f(t_{k+1}) \right] = \frac{h}{6} \left[ f(0) + 2 \sum_{k=1}^{n-1} f(t_k) + 4 \sum_{k=0}^{n-1} f(t_{k+\frac{1}{2}}) + f(t_{sim}) \right]. \tag{3.20}$$

In this approximation, $t_{k+\frac{1}{2}}$ is the midpoint of each interval $[t_k, t_{k+1}]$. By calculating the appropriately weighted sum of the function values at these points, a precise approximation of the integral $\int_0^{t_{sim}} f(t)\, dt$ is obtained. This result is then used in the threshold formula for the global buffer, allowing for an accurate calculation of $\delta$ based on the dynamics of task arrival and departure rates. This method ensures that the threshold $\delta$ is set in a way that optimally balances the switch between online and offline matching, depending on the real-time conditions of the task environment.
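A compact sketch of this procedure is given below, combining (3.16)–(3.20). The step heights of the arrival-rate function and the factors a and b are illustrative placeholders, not values prescribed by the scheme.

```python
import math

def simpson_threshold(lam, eps, t_sim, n, a, heavy_ratio, b):
    """Approximate the buffer threshold delta of (3.19) using composite
    Simpson's rule (3.20) over f(t) = lambda(t) - eps(t)."""
    f = lambda t: lam(t) - eps(t)
    h = t_sim / n
    integral = 0.0
    for k in range(n):                       # per-interval Simpson term
        t0, t1 = k * h, (k + 1) * h
        integral += h / 6 * (f(t0) + 4 * f((t0 + t1) / 2) + f(t1))
    delta_min = integral / b                 # (3.18)
    return math.ceil((1 - a * math.log(heavy_ratio)) * delta_min)  # (3.16)

# Illustrative step-function arrival rate: 8000 ms period, four steps per
# period (heights chosen for the example); constant departure rate 0.005/ms.
lam = lambda t: [0.004, 0.008, 0.005, 0.007][int(t // 2000) % 4]
delta = simpson_threshold(lam, lambda t: 0.005, t_sim=200_000, n=1000,
                          a=0.5, heavy_ratio=0.4, b=10)
print("threshold delta:", delta)
```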

3.2.4 Online Matching Stage

The online matching stage occurs while the number of tasks in the global buffer remains below the threshold. MSs and edge servers are represented as disjoint sets $\mathcal{M} = \{1, 2, \cdots, i, \cdots, M\}$ and $\mathcal{N} = \{1, 2, \cdots, j, \cdots, N\}$. Tasks in $\mathcal{N}$ should be sent to $\mathcal{M}$ in real time, necessitating an online match.

Definition 3.1 For a matching case $\mu^*$ between $\mathcal{M}$ and $\mathcal{N}$, let G($\mathcal{M}, \mathcal{N}, \mu^*$) represent a bipartite graph with $\mu^*$ for matching.

In G($\mathcal{M}, \mathcal{N}, \mu^*$), we define $F_i$ as the total and $F_i^{oc}$ as the occupied computation capacity of MS i. Upon creation of task $T_j$ by edge server j, we first identify the available MSs: we calculate the distance $D_{i,j}$ between edge server j and MS i using (3.3), and MS i joins the candidate list $CL_j$ if $D_{i,j} < r_i$. After determining all distances, we complete $CL_j$. The selection of an MS for $T_j$ aims to minimize execution latency, which includes upload and computation latency. The computation capacity $F_{i,j}$ assigned by MS i to task j is variable, constrained by C4 to keep execution latency within tolerable limits. The sum of upload and waiting latency is

$$t_{j,i}^{s} = t_{j,i}^{tr} + t_{j,i}^{rw}, \tag{3.21}$$

and the computation capacity allocation of i to j is

$$F_{i,j} = \min \left\{ \frac{\tau_j}{t_j^{max} - \alpha_{i,j}\, t_{j,i}^{s}},\; F_i - F_i^{oc} \right\}, \tag{3.22}$$

where $t_{j,i}^{tr}$ follows (3.7). The coefficient $\alpha_{i,j}$, adjusted based on latency, varies within $[1, \alpha_{max}]$. For the single-candidate scenario, $\alpha_{i,j} = \alpha_{max}$. The coefficient is defined by

$$\alpha_{i,j} = 1 + \frac{t_{j,i}^{s} - \min_{s_i \in CL_j} t_{j,i}^{s}}{\max_{s_i \in CL_j} t_{j,i}^{s} - \min_{s_i \in CL_j} t_{j,i}^{s}}\, (\alpha_{max} - 1). \tag{3.23}$$

After computing $F_{i,j}$, we select an optimal MS from $CL_j$. A greedy algorithm, though simple, is suboptimal in efficiency. We measure efficiency through the Competitive Ratio (CR) [23].
Definition 3.2 For bipartite graph G($\mathcal{M}, \mathcal{N}, \mu$), with OPT and ALG representing the optimal offline and online algorithms for problem I, the CR is

$$CR = \min_{G(\mathcal{M}, \mathcal{N}, \mu)} \frac{ALG(I)}{OPT(I)}. \tag{3.24}$$

Theorem 3.1 The CR of the greedy algorithm is 1/2.

Proof Assume the optimal algorithm OPT allocates $F_{i,j}^{OPT}$ to $T_j$ offloaded to i, and the greedy algorithm ALG allocates $F_{i,j}^{ALG}$, with $E'$ being the set of edge servers where $F_{i,j}^{ALG} < F_{i,j}^{OPT}$. The loss is

$$Loss = \sum_{e_i \in E'} \left( F_{i,j}^{OPT} - F_{i,j}^{ALG} \right). \tag{3.25}$$

Denote $E_s'$ as the edge servers in $E'$ offloaded to i in OPT:

$$Loss = \sum_{s_i \in S} Loss_{s_i}, \tag{3.26}$$

where

$$Loss_{s_i} = \sum_{e_i \in E_s'} \left( F_{i,j}^{OPT} - F_{i,j}^{ALG} \right) \le F_i - \sum_{e_i \in E_s'} F_{i,j}^{ALG}. \tag{3.27}$$

For $E_s' \ne \emptyset$ and edge server $e_j \in E_s'$, when $T_j$ is generated, it is offloaded to $s_{i'}$ satisfying $F_{i,j}^{OPT} > F_{i,j}^{ALG}$, implying $F_i^{oc} \ge F_i - F_{i,j}^{ALG}$. Incorporating this into (3.27),

$$Loss_{s_i} \le F_i^{oc} + F_{i,j^*}^{ALG} - \sum_{e_i \in E_s'} F_{i,j}^{ALG}, \quad \forall e_{j^*} \in E_s'. \tag{3.28}$$

Thus, $Loss_{s_i} \le F_i^{oc}$, leading to

$$OPT(I) - ALG(I) = \sum_{s_i \in S} Loss_{s_i} \le \sum_{s_i \in S} F_i^{oc} = ALG(I). \tag{3.29}$$

This demonstrates the CR of the greedy algorithm is 1/2.
To improve algorithm efficiency, we consider the occupied computation capacity fraction as an offloading decision factor, introducing a correction function $\psi(\eta_i)$. Define $\eta_i$ as the fraction of occupied computation capacity of i:

$$\eta_i = \frac{F_i^{oc}}{F_i}, \tag{3.30}$$

and $\psi(\eta_i)$ as

$$\psi(\eta_i) = 1 - e^{(\eta_i - 1)}. \tag{3.31}$$

For task $T_j$ of edge server j, we calculate $D_{i,j}$, $\psi(\eta_i)$, and $F_{i,j}$ for each MS i. If $D_{i,j} < r_i$, MS i joins $CL_j$. The task is offloaded to the MS i maximizing the product $\psi(\eta_i) \times F_{i,j}$ in $CL_j$, and is then removed from the global buffer.
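This selection rule can be summarized in a short sketch. The helper `allocated_capacity` is a simplified stand-in for (3.22) with a fixed coefficient in place of (3.23), and all names here are ours; it assumes the deadline slack $t_j^{max} - \alpha\, t_{j,i}^{s}$ is positive for candidate MSs.

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def allocated_capacity(ms, task, alpha=1.5):
    """Capacity MS i would grant task j, a simplified form of (3.22):
    just enough to finish before the deadline, capped by the free capacity."""
    need = task['tau'] / (task['t_max'] - alpha * task['t_s'])  # cycles per second
    return min(need, ms['F'] - ms['F_oc'])

def select_ms(task, ms_list):
    """Online rule: offload to the in-range MS maximizing psi(eta_i) * F_{i,j},
    with psi(eta) = 1 - e^(eta - 1) from (3.31)."""
    best, best_score = None, -math.inf
    for ms in ms_list:
        if dist(ms['p'], task['q']) >= ms['r']:     # outside coverage: not in CL_j
            continue
        eta = ms['F_oc'] / ms['F']                  # occupied fraction, (3.30)
        score = (1 - math.exp(eta - 1)) * allocated_capacity(ms, task)
        if score > best_score:
            best, best_score = ms, score
    return best
```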

To analyze the CR, divide the MS computation capacity into k slabs. An MS is of type l if its occupied fraction falls within $(\frac{l-1}{k}, \frac{l}{k}]$. Define $\sigma_l$ as the total occupied fractions and $\rho_l$ as the count of type l MSs:

$$\sigma_l = \begin{cases} \frac{N}{k}, & l = 1, \\ \frac{N - \sum_{m=1}^{l-1} \rho_m}{k}, & 2 \le l \le k, \end{cases} \tag{3.32}$$

with

$$\sum_{l=1}^{k} \rho_l = N. \tag{3.33}$$

Theorem 3.2 The CR of the algorithm with $\psi(\eta_i)$ is $1 - 1/e$.

Proof Consider a task $T_j$ generated by edge server j. Let the optimal algorithm OPT offload $T_j$ to $s_{i^*}$, a type l MS with an $\frac{\hat{l}}{k}$ fraction of its computation capacity occupied ($\hat{l} \le l$). ALG offloads $T_j$ to $s_{i'}$ with an $\frac{r}{k}$ occupied fraction ($r \in [0, k]$). This leads to the inequality:

$$F_{i^*,j}\, \psi\!\left(\frac{\hat{l}}{k}\right) \le F_{i',j}\, \psi\!\left(\frac{r}{k}\right). \tag{3.34}$$

ˆ
Given .lˆ ≤ l and the monotonic decrease of .ψ(ηi ), .ψ( kl ) ≤ ψ( kl ). Thus, (3.34)
becomes
⎛ ⎞ ⎛r ⎞
l
.Fi ∗ ,j ψ ≤ Fi ' ,j ψ . (3.35)
k k

Summing the inequalities for all tasks gives

$$\sum_{l=1}^{k} \psi\!\left(\frac{l}{k}\right) \rho_l \le \sum_{l=1}^{k} \psi\!\left(\frac{l}{k}\right) \sigma_l. \tag{3.36}$$

Inserting (3.31) and (3.32) into (3.36), and letting $k \to \infty$, yields

$$\sum_{l=1}^{k} \frac{l}{k}\, \rho_l \ge N \left( 1 - \frac{1}{e} \right). \tag{3.37}$$

The left side represents ALG’s occupied computation capacity, completing the
proof.

3.2.5 Offline Matching Stage

Once the global buffer is filled, the offline matching stage commences. We have previously defined MSs and edge servers as disjoint sets $\mathcal{M} = \{1, 2, \cdots, i, \cdots, M\}$ and $\mathcal{N} = \{1, 2, \cdots, j, \cdots, N\}$. Let $\mathcal{N}' \subseteq \mathcal{N}$ represent the edge servers in the full global buffer. The goal is to establish an optimal matching relationship $\mu$ between $\mathcal{M}$ and $\mathcal{N}'$.

Definition 3.3 Matching $\mu$ between $\mathcal{M}$ and $\mathcal{N}'$ is a function where:

• For each $i \in \mathcal{M}$, $\mu(i) \subseteq \mathcal{N}'$ and $\sum_{j \in \mu(i)} F_{i,j} \le F_i$.
• For each $j \in \mathcal{N}'$, $\mu(j) \subseteq \mathcal{M}$ and either $F_{\mu(j),j} \le F_{\mu(j)}$ or $F_{\mu(j),j} = 0$, with the latter indicating no available i for j.
• $j \in \mu(i)$ if and only if $\mu(j) = i$, for all $i \in \mathcal{M}$ and $j \in \mathcal{N}'$.
The matching is based on the preferences of each $i \in \mathcal{M}$ and $j \in \mathcal{N}'$. Define $P_i(j)$ and $P_j(i)$ as the preference lists of i and j, respectively, with elements ordered by decreasing preference.

Definition 3.4 Within $P_j(i)$, $i \succ_j i'$ indicates edge server j prefers MS i over MS $i'$. Similarly, in $P_i(j)$, $j \succ_i j'$ signifies MS i prefers edge server j over edge server $j'$.

A stable matching $\mu$ between $\mathcal{M}$ and $\mathcal{N}'$ is sought [22], where each edge server matches with at most one MS, and each MS does not exceed its maximum computation capacity. Additionally, $\mu$ should not have any blocking pairs.

Definition 3.5 A pair $(i, j)$ of edge server j and MS i blocks $\mu$ if $j \succ_i \mu(i)$ and $i \succ_j \mu(j)$. Such a pair is called a blocking pair of $\mu$.

To enhance stability, we build the preference lists for all edge servers and MSs. Intuitively, edge server j prefers an MS with more resources and lower execution latency, while MS i favors an edge server that utilizes more resources, enhancing its efficiency. The preference lists, $P_i(j)$ for MS i and $P_j(i)$ for edge server j, are complete, transitive, and strict. The preference functions are defined as

$$i \succ_j i' \Leftrightarrow P_j(i) > P_j(i'), \qquad j \succ_i j' \Leftrightarrow P_i(j) > P_i(j'). \tag{3.38}$$

The ordering of $P_i(j)$ is as follows:

$$P_i(j) = \begin{cases} +\infty, & \text{if } j \text{ is within } i\text{'s range, with larger } \tau_j \text{ and smaller } t_{i,j}^{tr}, \\ -\infty, & \text{if } j \text{ is out of range, and further from } i. \end{cases} \tag{3.39}$$

Similarly, $P_j(i)$ is ordered:

$$P_j(i) = \begin{cases} +\infty, & \text{if } j \text{ incurs less latency from } i, \\ -\infty, & \text{otherwise}. \end{cases} \tag{3.40}$$

The offline stage aims to match $\mathcal{M}$ and $\mathcal{N}'$ with strong stability. Initially, preference lists are generated (Line 1). Each unmatched edge server selects the most preferred MS from its list (Line 3). MSs with available capacity review their preferences, accepting or rejecting edge servers based on the aggregate

Algorithm 1 Offline matching stage

Require: $\mathcal{M}$, $\mathcal{N}'$
Ensure: $\mu$
1: Create preference lists for each $i \in \mathcal{M}$ and $j \in \mathcal{N}'$ using (3.39) and (3.40);
2: for each unmatched j do
3:   Find the top MS from $P_j(i)$;
4:   for each unsaturated $i \in \mathcal{M}$ do
5:     Accept the first $l_p$ edge servers choosing i per $P_i(j)$, ensuring $\sum_{k=1}^{l_p} F_{i,k} \le (F_i - F_i^{oc})$;
6:     Reject the other edge servers choosing i;
7:   end for
8:   for each unmatched edge server do
9:     Remove rejected MSs;
10:  end for
11:  Update the unmatched edge server list;
12: end for
13: Obtain $\mu$.

computation capacity requirement (Lines 4–7). Unmatched edge servers then update
their preferences (Lines 8–11) to enhance matching stability.
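For concreteness, a deferred-acceptance sketch in the spirit of Algorithm 1 follows. It simplifies Line 5 by assuming each task demands a fixed capacity and that all preference lists are complete; all names are ours, and this is an illustration rather than the exact book implementation.

```python
def offline_match(tasks, capacity, pref_t, pref_m, demand):
    """Deferred-acceptance sketch of the offline matching stage.

    capacity : remaining capacity (F_i - F_i^oc) of each MS i
    pref_t[j]: edge server j's MS ranking (best first), built from (3.40)
    pref_m[i]: MS i's edge-server ranking (best first), built from (3.39)
    demand[j]: computation capacity task j requires
    Returns mu: task -> MS (None if no MS can host the task).
    """
    mu = {j: None for j in tasks}
    free = dict(capacity)
    nxt = {j: 0 for j in tasks}              # next MS each task proposes to
    queue = list(tasks)
    while queue:
        j = queue.pop()
        if nxt[j] >= len(pref_t[j]):
            continue                         # j exhausted its preference list
        i = pref_t[j][nxt[j]]
        nxt[j] += 1
        if free[i] >= demand[j]:
            free[i] -= demand[j]             # i accepts j (cf. Line 5)
            mu[j] = i
            continue
        # i is saturated: evict its least-preferred holder if i prefers j
        held = [k for k in tasks if mu[k] == i]
        worst = max(held, key=pref_m[i].index) if held else None
        if (worst is not None and pref_m[i].index(j) < pref_m[i].index(worst)
                and free[i] + demand[worst] >= demand[j]):
            mu[worst] = None
            free[i] += demand[worst] - demand[j]
            mu[j] = i
            queue.append(worst)              # evicted server proposes again
        else:
            queue.append(j)                  # j rejected; tries its next MS
    return mu
```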

3.2.6 Performance Evaluation

We demonstrate the superior performance of ATOM through extensive experiments, in which several advanced methods are used for comparison.

3.2.6.1 Experiment Setup

Our experiments utilize a MEC-enabled IIoT framework in the following two scenarios:
• Scenario 1 (S1): We simulated diverse conditions by generating random environment parameters in a $3000 \times 3000$ m area with 100 MSs, each having a 600 m communication range. The MSs' computation capabilities varied between 50 GHz and 100 GHz. Edge servers generated tasks with time-varying densities, following a random arrival pattern within 200 s.
• Scenario 2 (S2): For a more specific industrial context, we selected a $2.01 \times 1.89$ km area in Xiqing District, Tianjin (geographical coordinates: 38°59′30.5″N, 117°15′16.6″E). This scenario deployed 49 MSs (as shown in Fig. 3.3), each with a computation capacity of 80 GHz and a 450 m coverage range, acknowledging industrial areas' complexity compared to residential ones. Details are in S2 of Table 3.1.

Fig. 3.3 The location distribution of MSs in the second experimental scenario

Table 3.1 Experimental parameter setup

S1       Number of MSs            100
         Experiment area          3000 × 3000 m
         Communication range      600 m
         Computing capacity       [50, 100] GHz
         Ratio of heavy tasks     {0.2, 0.5, 0.8}
S2       Number of MSs            49
         Experimental area        2.01 × 1.89 km
         Communication range      450 m
         Computing capacity       80 GHz
         Ratio of heavy tasks     {0.4}
Common   λ_t^min                  0.001/ms
params   λ_t^max                  0.008/ms
         ϵ                        0.005/ms
         B                        40 MHz
         P_j                      50 mW
         σ²                       −96 dBm
         β_0                      −50 dB
Our experiments share several common parameters across different scenarios.
The wireless channel bandwidth is set to 40 MHz, enhancing transmission speed and service performance, which is crucial for IIoT's real-time applications. The transmission power is set to 50 mW, balancing signal strength and communication coverage at medium range in IIoT environments [24, 25]; this power level also helps minimize battery consumption. The noise level is set at −96 dBm, which falls within the typical range for wireless systems (−90 to −120 dBm) and ensures good signal reception conditions in IIoT contexts [26, 27]. The channel gain per meter is −50 dB, accounting for the complex outdoor environments in IIoT [28, 29]. The MSs' signal antennas have a height of 30 m.
The threshold parameters are also configured to demonstrate the scheme's effectiveness. Task arrival rates follow a periodic step function with an 8000 ms period and four steps per period, ranging from 0.001/ms to 0.008/ms. The arrival rate also increases with the number of devices. The system's task departure rate is set at 0.005/ms. The threshold is calculated based on (3.19). All scenario-specific and common parameters are summarized in Table 3.1.
In both scenarios, tasks are categorized as light or heavy, based on computational intensity and resource requirements. A typical heavy task is computation-intensive Virtual Reality (VR)/Augmented Reality (AR), while a typical light task is natural language processing. The ratio of heavy tasks $\varepsilon$ is randomly sampled from $\{0.2, 0.5, 0.8\}$ and $\{0.4\}$ for the two scenarios, aligning with the distribution in the IIoT dataset. The metrics used in the evaluation are given below:
• Average Execution Latency (AET): Central to ATOM's objective of reducing task execution latency, AET is calculated as the total execution latency divided by the total number of tasks. It is represented by

$$AET = \frac{Total\ Execution\ Time}{Total\ Number\ of\ Tasks}. \tag{3.41}$$

• Timeout Rate (TR): This measures the proportion of tasks not completed within the tolerable latency, computed as the number of unfinished tasks over the total number of tasks. It is defined as

$$TR = \frac{Number\ of\ Unfinished\ Tasks}{Total\ Number\ of\ Tasks}. \tag{3.42}$$

• Computing Capacity Utilization (CCU): Reflecting the average utilization of the MSs' computing resources, CCU is determined by the sum of the computation capacity utilization at each sampling point across all MSs, averaged over the number of sampling times (SN) and MSs (M). It is given by

$$CCU = \frac{\sum_{i=1}^{M} \sum_{s=1}^{SN} CCU_{i,s}}{SN \cdot M}. \tag{3.43}$$
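These three metrics can be computed directly from raw simulation logs; a minimal sketch with our own variable names follows (the exact logging format is an assumption).

```python
def evaluate(latencies, n_unfinished, util_samples):
    """AET (3.41), TR (3.42) and CCU (3.43) from raw simulation logs.

    latencies    : execution latency of every task finished in time
    n_unfinished : number of tasks that missed their deadline
    util_samples : util_samples[i][s], utilization of MS i at sampling point s
    """
    n_total = len(latencies) + n_unfinished
    aet = sum(latencies) / n_total                      # (3.41)
    tr = n_unfinished / n_total                         # (3.42)
    ccu = (sum(map(sum, util_samples))
           / (len(util_samples) * len(util_samples[0])))  # (3.43)
    return aet, tr, ccu
```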

The compared schemes are illustrated as follows:
• OnlineMatch: This scheme offloads arriving tasks individually to the MS with the highest $\psi(\eta_i) \cdot F_{i,j}$, as per the online matching stage [30].
• FoGMatch: Within a fixed time slot, it orchestrates offloading decisions for
all incoming tasks. FoGMatch creates preference lists pairing MSs with edge
server tasks and then matches them sequentially based on these lists until all MS
resources are allocated or all tasks are matched [25].

• Min-Min: In this approach, tasks arriving in batches are offloaded. Each task is
assigned to the MS guaranteeing the shortest execution latency, determined by
the sequential order of task arrivals [31, 32].
• Max-Min: Similar to Min-Min, this scheme offloads tasks in batches within a
fixed time slot. Each task is assigned to the MS that has the longest execution
latency, following the order of task arrival [31, 32].
For all schemes, including FoGMatch, Min-Min, and Max-Min, we utilize the same computing power distribution. This uniformity is crucial for a fair and meaningful comparison of the offloading decision effectiveness of each scheme. In FoGMatch, Min-Min, and Max-Min, the time slot is set to 800 ms.

3.2.6.2 Numerical Results

Traditional schemes often use average execution latency as a sole performance
metric, neglecting the considerable variations in data volume and execution latency across different task types. In this book, we bridge this gap by experimentally categorizing tasks as heavy or light. This classification enables a more nuanced evaluation of ATOM, especially considering the significant disparities in data volume and execution latency, which can vary by orders of magnitude.
The results reflect the fluctuations in average execution latency and timeout rate
as device numbers vary across two distinct scenarios. Additionally, we assess the
computation capacity utilization of the MSs. The results indicate that ATOM outperforms the other schemes in reducing execution latency for all tasks. Moreover, ATOM
achieves the lowest timeout rate, ensuring tasks are completed within predetermined
time limits. Also, ATOM exhibits the most effective utilization of MSs’ computation
capabilities, extending its superiority beyond task execution.
In our experiments, the dynamics of average execution latency and timeout rate for heavy tasks across different $\varepsilon$ values are presented in Fig. 3.4, considering varying device numbers in five different schemes. Specifically, Fig. 3.4a, b, and c shows that irrespective of the $\varepsilon$ value, Max-Min has the highest average execution latency. Moreover, OnlineMatch shows a decrease in execution latency as the device number (N) surpasses 300, due to its computation capacity allocation strategy. This decrease is notably observed in Fig. 3.4b for $\varepsilon = 0.5$, where the latency drops from 800 s to 400 s as N increases from 300 to 500. The sharp decrease in OnlineMatch's execution latency signifies extended waiting latency for heavy tasks, impacting timely execution.
Figure 3.4d, e, and f demonstrates the timeout rate performance of heavy tasks. Max-Min and OnlineMatch show increasing timeout rates with growing N, whereas Min-Min, FoGMatch, and ATOM maintain stable rates. For example, at $\varepsilon = 0.5$ (Fig. 3.4e), Max-Min's timeout rate climbs from 4.5% to 8.4% as N increases from 100 to 500. OnlineMatch initially parallels Min-Min, FoGMatch, and ATOM in timeout rate but escalates sharply after $N > 300$. Min-Min, FoGMatch, and ATOM exhibit stable timeout rates under 2.0% for heavy tasks.

(a) ε = 0.2 (b) ε = 0.5 (c) ε = 0.8 (d) ε = 0.2 (e) ε = 0.5 (f) ε = 0.8

Fig. 3.4 Analyzing the variation in heavy tasks' average execution latency and timeout rate as device numbers change under distinct $\varepsilon$ levels

(a) ε = 0.2 (b) ε = 0.5 (c) ε = 0.8 (d) ε = 0.2 (e) ε = 0.5 (f) ε = 0.8

Fig. 3.5 The average execution latency and timeout rate of the light tasks with varying device numbers under different $\varepsilon$

Figure 3.5 exhibits the average execution latency and timeout rate of light tasks under different $\varepsilon$. As shown in Fig. 3.5a, b, and c, the average execution latency for light tasks remains stable in Max-Min, Min-Min, FoGMatch, and ATOM as N increases. Max-Min consistently shows the highest average execution

latency, while Min-Min and FoGMatch exhibit similar times. ATOM outperforms
other schemes with the lowest average execution latency for light tasks. Conversely,
OnlineMatch sees a notable increase in execution latency as N grows, attributable
to accumulating waiting latency.
Figure 3.5d, e, and f shows the timeout rate performance of light tasks in different schemes with varying N. Across these figures, Max-Min, Min-Min, FoGMatch, and ATOM show minimal changes in timeout rates with increasing N. Notably, ATOM always records the lowest timeout rate, outperforming Max-Min, Min-Min, and FoGMatch by 40.6%, 15.8%, and 14.2%, respectively. In contrast, OnlineMatch exhibits a marked increase in timeout rate for light tasks, surpassing 95% when $N \ge 400$, indicating its instability in high-load scenarios.

Given the results from the first scenario, where Max-Min and OnlineMatch were
less effective, the second scenario focuses on comparing ATOM with Min-Min and
FoGMatch.
In Fig. 3.6, the average execution latency of heavy and light tasks in these three
schemes is compared. ATOM achieves lower average execution latency for both task
types and shows higher stability, as evidenced by the clustering of data in the boxplot
analysis. Figure 3.7a and b depicts the timeout rates for heavy and light tasks. The
timeout rate for heavy tasks is generally lower compared to light tasks, attributable
to their longer execution latency. ATOM again demonstrates lower timeout rates
for both task types compared to Min-Min and FoGMatch. An upward trend in all
six curves is observed, likely due to the sparser server distribution in the second
scenario, which limits server options for task assignment. As device numbers and
tasks increase, some tasks face latency beyond their time limits, leading to elevated
timeout rates.
Figure 3.8 presents the variation in MSs’ computation capacity utilization with
changing device numbers across different schemes in two scenarios.
In the first scenario with 100 MSs, Fig. 3.8a shows an increase in computing demand with more devices due to a higher task count. Notably, after $N > 300$, the OnlineMatch scheme faces congestion in processing tasks, leading to delayed task processing and a sharp rise in computing utilization, indicating a system breakdown.
average computation capacity utilization around 0.3 with 100 devices, increasing
to approximately 0.6 with 500 devices. ATOM notably exhibits higher utilization
due to its reduced waiting times and less allocated computation capacity per task,
resulting in a longer execution latency and higher average utilization during task
execution.
In scenario 2, depicted in Fig. 3.8b, with a reduced number of 49 MSs, the overall computation capacity utilization is higher compared to the first scenario. Similar to scenario 1, the utilization increases with the number of devices. ATOM again showcases higher average utilization than FoGMatch and Min-Min, credited to its adept management of tasks and computing resources.

(a) Heavy tasks (b) Light tasks

Fig. 3.6 The average execution latency of the tasks with varying device numbers

(a) Heavy tasks (b) Light tasks

Fig. 3.7 The timeout rate of the tasks with varying device numbers

(a) Scenario 1 (b) Scenario 2

Fig. 3.8 The average computation capacity utilization of the MECs with varying device numbers

3.3 Dependent Offloading with DAG-Based Cooperation Gain

3.3.1 Statement of Problem

In the system model illustrated in Fig. 3.9, we focus on a scenario where a single BS,
equipped with an edge server, is interconnected with multiple users. This connection
is facilitated through dynamic wireless channels that operate within the coverage
area of the BS. The edge server in this setup is characterized by $k_b$ cores. Similarly, each user's end device is equipped with $k_l$ cores, with each core designed to execute one task at a time. A key feature of both the end devices and the edge server is
their ability to dynamically adjust CPU frequencies. This adjustment is crucial for
optimizing energy consumption and is made possible through the implementation
of Dynamic Voltage and Frequency Scaling (DVFS) technology [35, 36].
In our system model, time is segmented into T slots, each with a duration of $\tau$ seconds and indexed as $t \in \{0, 1, \cdots, T-1\}$. Within each time slot, users execute
applications that are structured as DAGs. These DAGs are comprised of nodes, each

Fig. 3.9 In the industrial edge computing system model for DAG-based task offloading, three
primary processes are defined: “offloading”, “task precedent”, and “cooperation”. “Offloading”
refers to transferring tasks from the user’s device to the edge server for execution. “Task precedent”
indicates that when a task is executed locally on the user’s device, it requires intermediate data
produced by its preceding task, which may have been executed on the edge server [33, 34].
Finally, “cooperation” suggests that two tasks from different DAGs can collaborate by sharing
their intermediate data, enhancing the overall task processing efficiency

representing a computational task, and edges that denote the dependencies between
these tasks.
Task offloading decisions for all users are determined based on the structure of
these DAGs. These decisions involve determining whether to execute tasks on the
end devices or to offload them to the edge server. Factors such as CPU frequencies
and network conditions are taken into account in this process. Additionally,
decisions regarding cooperation among applications are made to improve overall
application performance [37]. This cooperative process is restricted to applications
within the same time slot to prevent any adverse effects on performance. It entails
selecting an External Dependency (ED) from the pool of available devices for data
transmission and establishing the ratio of data to be transmitted through this ED. The
primary objective of these strategies is to maximize the utility of the applications by
judiciously optimizing the offloading, cooperation, and computation decisions for
each user.

3.3.2 Cooperation Gain Estimation Based on DAG

Let N represent the total number of tasks in an application, with task 1 and task
N serving as virtual tasks that require no computation. These tasks facilitate the
initiation and termination of the application on the end device. With EDs, we define
a DAG-ED for U users’ applications, which is an extension of the standard DAG.

Definition 3.6 (Directed Acyclic Graph with External Dependency (DAG-ED)) Figure 3.10 illustrates a five-tuple $\langle V, C, E, \mathcal{E}, O \rangle$, representing the DAG-ED for U users' applications, defined as follows:
• $V$: The set of tasks across all applications, totaling $U \times N$ tasks. Each task $v_{u,n}$ represents the nth task of user u.
• $C$: The computational requirements of the tasks. The requirement of task $v_{u,n}$ is denoted by $c_n$, where $c_1 = c_N = 0$.
• $E$: The set of dependencies within each application, with each element $e = (v_{u,n}, v_{u,n'})$ indicating a directed edge from task $v_{u,n}$ to $v_{u,n'}$.
• $\mathcal{E}$: The set of EDs across applications. Here, an element $e = (v_{u,n}, v_{u',n'})$, $u \ne u'$, represents a directed dashed edge between tasks from different users. The data output size of $e \in \mathcal{E}$ varies based on cooperation decisions.
• $O$: The dataset representing the output sizes for all edges. For an edge $e = (v_{u,n}, v_{u',n'})$, $o_e$ denotes the output data size from $v_{u,n}$ to $v_{u',n'}$.
It is important to note that each application’s first and last tasks must be executed on
the respective user’s end device.
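A possible in-memory representation of this five-tuple, sketched with our own names and simplifications (tasks are keyed by (u, n) pairs), is given below.

```python
from dataclasses import dataclass, field

@dataclass
class DagEd:
    """Container mirroring the five-tuple <V, C, E, ED, O> of Definition 3.6."""
    U: int                                   # number of users
    N: int                                   # tasks per application (1 and N are virtual)
    C: dict = field(default_factory=dict)    # C[n]: CPU cycles of task n (c_1 = c_N = 0)
    E: set = field(default_factory=set)      # ((u,n),(u,n')) intra-application dependencies
    ED: set = field(default_factory=set)     # ((u,n),(u',n')) external dependencies
    O: dict = field(default_factory=dict)    # O[edge]: output data size o_e

    def predecessors(self, u, n):
        """Precedent task set Omega_{u,n} of (3.50): tails of all incoming edges."""
        return [src for (src, dst) in self.E | self.ED if dst == (u, n)]
```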
Using the DAG-ED framework, we calculate task latency and energy consump-
tion based on offloading, computation, and cooperation decisions.

Fig. 3.10 An illustration of DAG-EDs for two applications. A circle node in the figure represents a task $v_{u,n}$. In a DAG, a black edge $(v_{u,n}, v_{u,n'})$ is a general dependency, while a red dashed edge $(v_{u,n}, v_{u',n'})$ between two DAGs is an ED. The task and edge are described by the computation requirement $c_n$ and the size of transmitted data $o_e$, respectively. Note that there are two grey virtual nodes without computation requirements as the beginning and end of an application execution

• Offloading Decision: Let $\alpha^t$ represent the offloading decisions for N tasks and U users at time slot t. Here, $\alpha_{u,n}^t = 1$ implies that user u's task n is executed locally, while $\alpha_{u,n}^t = 0$ indicates the task is offloaded to the edge.
• Cooperation Decision: The vector $\beta^t$, sized $|\mathcal{E}|$, indicates the data transmission ratio through EDs, with $\beta_e^t \in [0, 1]$ for each $e \in \mathcal{E}$.
• Computation Decision: A U-dimensional vector, $\lambda^t$, represents the local CPU frequency ratio for users in time slot t, where $\lambda_u^t \in [0.1, 1]$ and 0.1 is the minimum ratio.
Define $R_{u,n}(t)$ as the transmission rate between user u and the BS, corresponding to the execution location of task $v_{u,n}$. By Shannon's theorem, we have

$$R_{u,n}(t) = W \log \left( 1 + \frac{p_{u,n}(t)\, h_u(t)}{\sigma^2} \right), \tag{3.44}$$

where W and $\sigma^2$ denote the bandwidth of the orthogonal channels and the channel noise, respectively. The channel gain $h_u(t)$ between user u and the BS in time slot t is given by

$$h_u(t) = A_d \left( \frac{3 \times 10^8}{4 \pi f_c l_u^t} \right)^p, \tag{3.45}$$

with $A_d$, $f_c$, $l_u^t$, and p representing the channel gain, communication frequency, the distance between user u's end device and the edge server, and the path loss exponent, respectively. Note that $R_{u,n}(t)$ varies due to the mobility of end devices.

Meanwhile, the transmitting power $p_{u,n}(t)$ of the sender is defined as

$$p_{u,n}(t) = \alpha_{u,n}^t p_b + \left( 1 - \alpha_{u,n}^t \right) p_l, \tag{3.46}$$

where $p_b$ and $p_l$ are the transmitting powers of the BS and the users, respectively. Consequently, the transmission latency from $v_{u',n'}$ to task $v_{u,n}$ in time slot t can be calculated based on these parameters:

$$L_{u',n'}^{u,n}(t) = \begin{cases} |\alpha_{u,n}^t - \alpha_{u',n'}^t| \, \frac{o_e}{R_{u,n}(t)}, & e \in E, \\ \frac{\alpha_{u,n}^t \beta_e^t o_e}{R_{u,n}(t)} + \frac{\alpha_{u',n'}^t \beta_e^t o_e}{R_{u',n'}(t)}, & e \in \mathcal{E}. \end{cases} \tag{3.47}$$

For an intra-application dependency edge $e = (v_{u',n'}, v_{u,n})$ in E, the transmitted data size is $o_e$. If tasks $v_{u',n'}$ and $v_{u,n}$ are executed at the same location, the transmission latency is zero, indicated by $|\alpha_{u,n}^t - \alpha_{u',n'}^t| = 0$. Otherwise, the latency is $\frac{o_e}{R_{u,n}(t)}$.

For an ED edge $e \in \mathcal{E}$, the transmitted data size is $\beta_e^t o_e$. The latency is zero when both tasks are executed at the same position, i.e., $\alpha_{u,n}^t = \alpha_{u',n'}^t = 0$. If both tasks are executed on their respective local devices, the latency includes the sum of the transmissions to and from the edge server, calculated as $\frac{\beta_e^t o_e}{R_{u,n}(t)} + \frac{\beta_e^t o_e}{R_{u',n'}(t)}$. If one task is local and the other is on the edge server, the latency is either $\frac{\beta_e^t o_e}{R_{u,n}(t)}$ or $\frac{\beta_e^t o_e}{R_{u',n'}(t)}$, depending on the task's location.
When tasks .vu' ,n' and .vu,n are connected by a task dependency edge and executed
in different locations, the transmission involves sending data first to the BS and then
to the other user.
Application Latency For task $v_{u,n}$ in time slot t, let $L_{u,n}^B(t)$, $L_{u,n}^C(t)$, and $L_{u,n}(t)$ represent the initiation time, computation latency, and completion time, respectively. The completion time is given by

$$L_{u,n}(t) = L_{u,n}^B(t) + L_{u,n}^C(t). \tag{3.48}$$

The application latency for user u in time slot t is $L_{u,N}(t)$. Due to the potential asynchrony of task execution among different users, we adjust the initiation time as follows:

$$L_{u,1}^B(t) = L_{u,1}^B(t) - \min_{u \in \{1, 2, \cdots, U\}} L_{u,1}^B(t). \tag{3.49}$$

This adjustment ensures accurate latency calculation even in asynchronous scenarios.

Define $\Omega_{u,n}$ as the set of precedent tasks of $v_{u,n}$:

$$\Omega_{u,n} = \left\{ v_{u',n'} \mid (v_{u',n'}, v_{u,n}) \in E \cup \mathcal{E} \right\}. \tag{3.50}$$

The initiation time of $v_{u,n}$ is the latest time by which all output data from tasks in $\Omega_{u,n}$ has been transmitted:

$$L_{u,n}^B(t) = \max_{v_{u',n'} \in \Omega_{u,n}} \left\{ L_{u',n'}(t) + L_{u',n'}^{u,n}(t) \right\}. \tag{3.51}$$

The computation latency of task $v_{u,n}$ is

$$L_{u,n}^C(t) = \alpha_{u,n}^t \frac{c_n}{f_b} + \left( 1 - \alpha_{u,n}^t \right) \frac{c_n}{\lambda_u^t f_l}, \tag{3.52}$$

where $f_b$ and $f_l$ are the CPU frequencies of the edge server and the end device, respectively. This method is also applicable to CPU-GPU heterogeneous computing platforms.
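Assuming task indices respect the dependency order (a topological numbering, which these DAG models typically guarantee), the recursion (3.48)–(3.51) can be evaluated with a single forward pass. The sketch below uses our own names and takes the per-edge and per-task latencies as precomputed inputs from (3.47) and (3.52).

```python
def completion_times(dag, L_tr, L_c):
    """Forward pass computing completion times L_{u,n}(t) via (3.48)-(3.51).

    dag  : DagEd instance from the earlier sketch
    L_tr : L_tr[(src, dst)], transmission latency of each edge, per (3.47)
    L_c  : L_c[(u, n)], computation latency of each task, per (3.52)
    """
    L = {}
    for n in range(1, dag.N + 1):            # assumed topological task order
        for u in range(1, dag.U + 1):
            preds = dag.predecessors(u, n)   # Omega_{u,n} of (3.50)
            # initiation time (3.51): all precedent outputs must have arrived
            begin = max((L[p] + L_tr[(p, (u, n))] for p in preds), default=0.0)
            L[(u, n)] = begin + L_c[(u, n)]  # completion time (3.48)
    return L
```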
Cooperation Gain Calculating the exact cooperation gain from EDs through partial data sharing is challenging. To address this, we correlate the shared data amount with application utility. Recognizing that the marginal benefit of increased shared data diminishes, we use a $\log_{10}(\cdot)$ function to evaluate the cooperation gain:

$$G(t) = \sum_{e \in \mathcal{E}} \log_{10} \beta_e^t o_e. \tag{3.53}$$

Energy Consumption The energy consumption of a device comprises computation and transmission energy. The computation energy of task $v_{u,n}$ is

$$E_{u,n}^C(t) = \alpha_{u,n}^t \kappa \lambda_u^t f_l^3 L_{u,n}^C(t), \tag{3.54}$$

where $\kappa$ represents the unit energy consumption. The transmission energy consumption is given by

$$E_{u,n}^T(t) = \sum_{v_{u',n'} \in \Omega_{u,n}} \alpha_{u,n}^t \alpha_{u',n'}^t p_l L_{u',n'}^{u,n}(t). \tag{3.55}$$

Consequently, the total energy consumption of an application in time slot t is

$$E_u(t) = \sum_{n=1}^{N} E_{u,n}^C(t) + E_{u,n}^T(t). \tag{3.56}$$

Application utility in time slot t is defined as a linear weighted function of latency, energy consumption, and cooperation gain:

$$r_t = \omega_1 G(t) - \sum_{u=1}^{U} \left[ \omega_2 L_{u,N}(t) + \omega_3 E_u(t) \right], \tag{3.57}$$

where $\omega_1, \omega_2, \omega_3$ are weight parameters.
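A direct evaluation of this utility might look as follows; the guard against zero-valued shared-data terms is our own addition, and all names are illustrative.

```python
import math

def utility(beta, o, L_app, E_app, w1, w2, w3):
    """Per-slot application utility r_t of (3.57).

    beta  : beta[e], data ratio sent over each external dependency e
    o     : o[e], output data size of edge e
    L_app : L_app[u], application latency L_{u,N}(t) of each user
    E_app : E_app[u], energy consumption E_u(t) of each user
    """
    gain = sum(math.log10(beta[e] * o[e]) for e in beta if beta[e] > 0)  # (3.53)
    cost = sum(w2 * L_app[u] + w3 * E_app[u] for u in L_app)
    return w1 * gain - cost
```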
The objective in a DAG-ED-based offloading scenario over T time slots is to maximize this utility:

$$\begin{aligned} \mathbf{P1}: \ & \max_{\alpha, \beta, \lambda} \ \sum_{t=0}^{T-1} r_t \\ \text{s.t.} \ & \alpha_{u,1}^t = \alpha_{u,N}^t = 0, \ \forall u, t, \\ & \alpha_{u,n}^t \in \{0, 1\}, \ \forall u, t, n, \\ & \beta_e^t \in [0, 1], \ \forall e \in \mathcal{E}, \\ & \lambda_u^t \in [0.1, 1], \ \forall u, t. \end{aligned} \tag{3.58}$$

This DAG-ED-based offloading problem is an NP-hard integer nonlinear optimization problem that cannot be solved by direct methods. Given the dynamic nature
of wireless channels and the need for extensive environmental state estimations
for optimal decision-making, traditional heuristic and DRL algorithms, such as
actor–critic, struggle due to their hyperparameter sensitivity and difficulty handling
high-dimensional problems.

3.3.3 Branch Soft Actor–Critic Offloading Algorithm

In this section, we address P1 by modeling it as an MDP, enabling the use of actor–critic methods to address the complex interplay between offloading, computation, and cooperation decisions. We then introduce a soft policy function to expand the action space and avoid local optima. Finally, we propose a branch-based actor, designed to synchronize the individual offloading decisions while preserving the cooperation gain.

3.3.3.1 Problem Transformation

P1 is a memoryless sequential decision-making problem adhering to the Markov property and is formulated as an MDP with a 3-tuple $\langle \mathcal{S}, \mathcal{A}, \mathcal{R} \rangle$:
• Environment State: The state $s_t \in \mathcal{S}$ at time slot t encapsulates the complete environment information, including network conditions and computational resources:

$$s_t = \{h_1(t), \cdots, h_U(t), k_b, k_l, f_b, f_l\}. \tag{3.59}$$

• Action: The action space $\mathcal{A}$ comprises all possible decisions, with $a_t \in \mathcal{A}$ representing the offloading and cooperation decisions:

$$a_t = \left\{ \alpha^t_{u,n}, \beta^t_e, \lambda^t_u \mid \forall u, n; \forall e \in \mathcal{E} \right\}. \tag{3.60}$$

Note that the length of $a_t$ is $U(N+1) + |\mathcal{E}|$ (see the sketch after this list).
• Reward: The instantaneous reward $r_t \in \mathcal{R}$ is the application utility received after action $a_t$ at time t.
Upon receiving state $s_t$, the agent performs action $a_t$, and the environment responds with a new state $s_{t+1}$ and reward $r_t$.
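To make the action layout concrete, here is a small sketch that flattens the per-user offloading bits, ED sharing ratios, and CPU scaling factors into one vector of length $U(N+1)+|\mathcal{E}|$; the sizes are illustrative assumptions.

```python
import numpy as np

# Sketch: pack the decisions of Eq. (3.60) into one flat action vector.
U, N, E = 4, 6, 4                        # users, tasks per user, EDs (toy sizes)

def pack_action(alpha, beta, lam):
    """alpha: (U, N) binary offloading; beta: (E,) sharing ratios in [0, 1];
    lam: (U,) CPU scaling factors in [0.1, 1]."""
    return np.concatenate([alpha.ravel(), beta, lam])

alpha = np.random.randint(0, 2, size=(U, N)).astype(float)
alpha[:, 0] = alpha[:, -1] = 0           # entry/exit tasks stay local, Eq. (3.58)
beta = np.random.uniform(0.0, 1.0, size=E)
lam = np.random.uniform(0.1, 1.0, size=U)

a_t = pack_action(alpha, beta, lam)
assert a_t.size == U * (N + 1) + E       # U*N + U + E, the stated action length
```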
To address the slow convergence issue common in DRL algorithms [38],
we employ the Soft Actor–Critic (SAC) algorithm, renowned for its enhanced
exploration capabilities through entropy regularization. This approach, which adds
randomness to policy decisions, contrasts with traditional DRL methods like Deep

Q-Network (DQN) [39] and DDPG [40]. However, SAC's single-agent nature does not account for inter-agent interference. To mitigate this, we adopt the Multi-Agent Soft Actor–Critic (MASAC) approach, which effectively manages the nonstationarity of multi-agent environments, facilitating efficient and privacy-aware offloading decisions amidst resource competition.

3.3.3.2 Soft Policy Function

Facing the challenge of large action spaces in classical DRL methods, we adopt
SAC algorithm for DAG-ED-based offloading problem. SAC adds an entropy to the
reward, boosting the exploration capabilities. This entropy term represents policy
randomness, encouraging the policy to choose actions with higher rewards more
diversely.
Consider a policy $\pi(a_t|s_t)$, which selects action $a_t$ in a given environment state $s_t$. The entropy under this policy is defined as

$$H(\pi(\cdot|s_t)) = \mathbb{E}_{a_t \in \mathcal{A}} \left\{ -\log \pi(a_t|s_t) \right\}. \tag{3.61}$$

Simplifying $\pi(a_t|s_t)$ as $\pi$, SAC aims to find the optimal policy $\pi^*$ that maximizes the expected reward and its entropy:

$$\pi^* = \arg\max_{\pi} \mathbb{E}_{\{s_t, a_t\}} \left\{ \sum_{t=0}^{T} \gamma^t \left[ r_t + \mu H(\pi(\cdot|s_t)) \right] \right\}, \tag{3.62}$$

where $\gamma$ is a discount factor for future rewards, and $\mu$ is a temperature parameter balancing reward and exploration.
The soft Q-value in BSAC measures the entropy-augmented accumulated reward of a state–action pair under policy $\pi$:

$$Q_\pi(s_t, a_t) = \mathbb{E}_{\{s_t, a_t\}} \left\{ \sum_{t=0}^{T} \gamma^t \left[ r_t + \mu H(\pi) \right] \right\}. \tag{3.63}$$

The soft V-value in the BSAC framework evaluates a state under policy $\pi$ using a modified Bellman backup:

$$V_\pi(s_t) = \mathbb{E}_{a_t \sim \pi} \left[ Q_\pi(s_t, a_t) + H(\pi) \right]. \tag{3.64}$$
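The following sketch evaluates the entropy term of Eq. (3.61) and the soft V-value of Eq. (3.64) for a discrete toy policy, with the temperature weighting as used in Eq. (3.66); the distribution and Q-values are made-up inputs.

```python
import numpy as np

# Sketch of the entropy term and soft V-value, Eqs. (3.61) and (3.64).
pi = np.array([0.6, 0.3, 0.1])           # toy policy pi(a|s) over 3 actions
q = np.array([1.0, 0.5, -0.2])           # toy soft Q-values Q(s, a)
mu = 0.2                                 # temperature parameter

entropy = -(pi * np.log(pi)).sum()       # H(pi(.|s)) = E[-log pi], Eq. (3.61)
soft_v = (pi * q).sum() + mu * entropy   # E_a[Q(s,a)] + mu * H(pi), cf. Eq. (3.64)
print(entropy, soft_v)
```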

3.3.3.3 Branch Soft Actor–Critic

We introduce BSAC which includes two main modules: the critic and the actor, as
illustrated in Fig. 3.11. The critic module evaluates feedback from the environment

Fig. 3.11 The schematic of the BSAC framework

using soft Q-value and V-value functions. The branches in the actor output the
offloading, cooperation, and computation decisions, respectively.
Critic Module This module employs two DNNs, $Q_\theta$ for the soft Q-value and $V_\rho$ for the soft V-value, enhancing training stability. An experience replay buffer $\mathcal{B}$ stores and updates experiences as shown in

$$\mathcal{B} \leftarrow (s_t, a_t, r_t, s_{t+1}) \cup \mathcal{B}. \tag{3.65}$$

Network parameters $\theta$ and $\rho$ are updated using random samples from $\mathcal{B}$, reducing the correlation between consecutive samples.
Let $(s_i, a_i, r_i, s_{i+1})$ denote the ith sampled tuple. For the soft V-value network $V_\rho$, we train it by minimizing the mean squared error (MSE):

$$L(\rho) = \mathbb{E}_{s_i} \left\{ \frac{1}{2} \left[ V_\rho(s_i) - \mathbb{E}_{a_i} \left\{ Q_\theta(s_i, a_i) + \mu H(\pi) \right\} \right]^2 \right\}, \tag{3.66}$$

and the soft V-value network parameter $\rho$ is updated based on gradient descent:

$$\rho \leftarrow \rho - l_\rho \nabla L(\rho), \tag{3.67}$$

where $l_\rho$ is the learning rate of the soft V-value network. Here, $\nabla_\rho L(\rho)$ is the gradient, which is

$$\nabla_\rho L(\rho) = \nabla_\rho V_\rho(s_i) \left[ V_\rho(s_i) - Q_\theta(s_i, a_i) - \mu H(\pi) \right]. \tag{3.68}$$

In this case, the target soft V-value is

$$\hat{V}_\rho(s_{i+1}) = \rho V_\rho(s_{i+1}) + (1 - \rho) \hat{V}_\rho(s_i), \tag{3.69}$$

where $\rho$ is a smoothing factor based on an exponential moving average technique.
We define the target Q-value of the soft Q-value network $\hat{Q}_\theta$ as

$$\hat{Q}_\theta(s_i, a_i) = r_i + \gamma_i \mathbb{E}_{s_{i+1}} \left\{ \hat{V}_\rho(s_{i+1}) \right\}. \tag{3.70}$$

We train it toward minimum soft Bellman residuals between $Q_\theta(s_i, a_i)$ and $\hat{Q}_\theta(s_i, a_i)$, i.e.,

$$L(\theta) = \mathbb{E}_{\{s_i, a_i\}} \left\{ \frac{1}{2} \left[ Q_\theta(s_i, a_i) - \hat{Q}_\theta(s_i, a_i) \right]^2 \right\}. \tag{3.71}$$

Similar to Eq. (3.68), we update $\theta$ by

$$\theta \leftarrow \theta - l_\theta \nabla L(\theta). \tag{3.72}$$

Here, $l_\theta$ is the learning rate of the soft Q-value network, and

$$\nabla_\theta L(\theta) = \nabla_\theta Q_\theta(s_i, a_i) \left[ Q_\theta(s_i, a_i) - r_i - \gamma_i \hat{V}_\rho(s_{i+1}) \right]. \tag{3.73}$$
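A compact PyTorch sketch of the critic updates in Eqs. (3.66)–(3.73), under our own simplifications (one sampled action per state, no moving-average target as in Eq. (3.69)); the network sizes and learning rates are arbitrary assumptions, not the book's configuration.

```python
import torch
import torch.nn as nn

state_dim, action_dim, mu = 16, 8, 0.2   # toy sizes and temperature
V = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1))
Q = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(),
                  nn.Linear(64, 1))
opt_v = torch.optim.Adam(V.parameters(), lr=3e-4)
opt_q = torch.optim.Adam(Q.parameters(), lr=3e-4)

def critic_step(s, a, r, s_next, log_pi, gamma=0.99):
    # V-loss, Eq. (3.66): regress V(s) toward Q(s, a) + mu * H(pi),
    # using -log_pi of the sampled action as a one-sample entropy estimate.
    v_target = (Q(torch.cat([s, a], dim=-1)) - mu * log_pi).detach()
    v_loss = 0.5 * ((V(s) - v_target) ** 2).mean()
    opt_v.zero_grad(); v_loss.backward(); opt_v.step()       # Eq. (3.67)

    # Q-loss, Eq. (3.71): regress Q(s, a) toward r + gamma * V(s'), Eq. (3.70)
    q_target = (r + gamma * V(s_next)).detach()
    q_loss = 0.5 * ((Q(torch.cat([s, a], dim=-1)) - q_target) ** 2).mean()
    opt_q.zero_grad(); q_loss.backward(); opt_q.step()       # Eq. (3.72)

# One update on a toy minibatch of 32 transitions.
s, a = torch.randn(32, state_dim), torch.randn(32, action_dim)
critic_step(s, a, torch.randn(32, 1), torch.randn(32, state_dim),
            log_pi=torch.randn(32, 1))
```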

Actor Module with Multiple Branches The actor module features a multi-branch structure, each branch composed of a DNN and governed by a distinct network parameter $\phi_j$, $j \in \{1, 2, \cdots, U+2\}$. These branches are dedicated to different decision-making processes: offloading decisions for individual users, cooperation decisions of EDs, and computation decisions for all users, aligned with the policy $\pi_{\phi_j}$. The first U networks, each corresponding to a user u, focus on offloading decisions, represented by $a_i^j$, $j \in \{1, 2, \cdots, U\}$. The subsequent network handles cooperation decisions $a_i^{U+1}$, while the final one is tasked with computation decisions $a_i^{U+2}$. This distributed training approach allows the system to adapt to the specific requirements of different applications, thereby enhancing overall application utility.
The optimal policy is determined by minimizing the Kullback–Leibler (KL) divergence expectation, formulated as

$$L(\phi_j) = \mathbb{E}_{s_i} \left\{ D_{KL} \left( \pi_{\phi_j}(\cdot|s_i) \,\Big\|\, \frac{\exp(Q_\theta(s_i, \cdot))}{Z_\theta(s_i)} \right) \right\}, \tag{3.74}$$

where $D_{KL}(p\|q)$ quantifies the divergence between distributions p and q, and $Z_\theta(\cdot)$ is a normalization function.
.

The policy is reparametrized by incorporating Gaussian noise $\delta_i$ into each network, forming a stochastic neural network:

$$a_i^j = \phi(\delta_i; s_i). \tag{3.75}$$

In this case, we rewrite Eq. (3.74) as

$$L(\phi_j) = -\mathbb{E}_{s_i, \delta_i} \left\{ H \left( \pi_{\phi_j} \left( a_i^j \mid s_i \right) \right) + Q_\theta \left( s_i, a_i^j \right) \right\}, \tag{3.76}$$

whose gradient is

$$\nabla_{\phi_j} L(\phi_j) = \nabla_{\phi_j} H \left( \pi_{\phi_j}(a_i^j|s_i) \right) + \left[ \nabla_{a_i^j} H \left( \pi_{\phi_j}(a_i^j|s_i) \right) + \nabla_{a_i^j} Q_\theta \left( s_i, a_i^j \right) \right] \nabla_{\phi_j} a_i^j. \tag{3.77}$$
i

To optimize the loss, $\phi_j$ is updated via gradient descent:

$$\phi_j \leftarrow \phi_j - l_{\phi_j} \nabla_{\phi_j} L(\phi_j), \tag{3.78}$$

where the learning rate is denoted by $l_{\phi_j}$.
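The multi-branch actor can be sketched as follows: one head per user for offloading, one head for ED sharing ratios, and one for CPU scaling, each of which would be trained with its own optimizer per Eq. (3.78). Layer widths and the squashing choices are our assumptions.

```python
import torch
import torch.nn as nn

U, N, E, state_dim = 4, 6, 4, 16          # toy sizes

class BranchActor(nn.Module):
    """U offloading heads, one cooperation head, one computation head."""
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU())
        self.offload = nn.ModuleList(nn.Linear(64, N) for _ in range(U))
        self.coop = nn.Linear(64, E)       # branch U+1: beta in [0, 1]
        self.comp = nn.Linear(64, U)       # branch U+2: lambda in [0.1, 1]

    def forward(self, s, noise_scale=0.1):
        h = self.trunk(s)
        # Reparametrization, Eq. (3.75): Gaussian noise before squashing
        alpha = [torch.sigmoid(head(h) + noise_scale * torch.randn(N))
                 for head in self.offload]   # per-task offloading probabilities
        beta = torch.sigmoid(self.coop(h))   # ED sharing ratios
        lam = 0.1 + 0.9 * torch.sigmoid(self.comp(h))  # map to [0.1, 1]
        return alpha, beta, lam

actor = BranchActor()
alpha, beta, lam = actor(torch.randn(state_dim))
```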


For effective learning, with appropriate task representation or initialization, the maximum complexity for reaching a desired state is $O(|s_t|^3)$ for Q-learning and $O(|s_t|^2)$ for value iteration in terms of action executions.

3.3.4 Performance Evaluation

Our simulation is established in a typical cellular network over $T = 500$ time slots, where four users are uniformly distributed. The BS is equipped with a CPU of frequency $f_b = 2.5 \times 10^9$ cycles/s, $k_b = 8$ cores, and transmitting power $p_b = 1$ W. Each user's device features a CPU with frequency $f_l = 8 \times 10^7$ cycles/s, $k_l = 2$ cores, and transmitting power $p_l = 0.1$ W. The wireless channel $h_u(t)$ between users and BS follows the free space path loss model, with channel noise $\sigma^2$ set at $10^{-10}$.
Users run an application comprising six tasks, as illustrated in Fig. 3.12. The data size of intermediate results is indicated by the values on the directed edges of the DAG. Four EDs enable cooperation among users, shown as red dashed lines. The computation workloads of the six tasks are set to $[0, 60, 80, 150, 100, 0]$ M cycles, and the energy consumption constant $\kappa$ is $10^{-27}$. The weight parameters for application utility, $\omega_1$, $\omega_2$, and $\omega_3$, are set to 0.5, 0.5, and 5, respectively. These simulation parameters are consistent with existing works on DAG task offloading and are detailed in Table 3.2.
We compare the performance of BSAC with the following four benchmarks:
• Never Cooperate (NC): It uses a greedy algorithm to offload the tasks toward
maximum application utility, without cooperation.

Fig. 3.12 DAG-ED model including 6 EDs, i.e., 4 red ones and 2 green ones, where the 4 red dashed edges are the basic EDs and the 2 green dashed edges are used to verify the impact of the number of EDs

Table 3.2 Experiment parameters

Parameter   Value
$f_b$       $2.5 \times 10^9$ cycles/s
$f_l$       $8 \times 10^7$ cycles/s
$k_b$       8
$k_l$       2
$p_b$       1 W
$p_l$       0.1 W
$\kappa$    $10^{-27}$

• Always Cooperate (AC): It uses Gibbs sampling for offloading decision-making toward maximum application utility. Note that each task must cooperate with the tasks connected to it through a certain ED.
• Random Cooperate (RC): It uses BSAC to maximize the application utility, where the ED is randomly selected.
• QSAC [41]: It allows the tasks to decide whether or not to cooperate with others. Once a task cooperates with others, it transmits all the data to the target, known as hard cooperation. The objective of QSAC is also to optimize the application utility.

3.3.4.1 Impact of Bandwidth

In Fig. 3.13a, we observe the application utility of five distinct methods across a
bandwidth range of 8–16 MHz. Notably, the AC method initially shows the lowest
utility at bandwidths below 9 MHz, primarily due to the significant execution latency
caused by EDs. However, as the bandwidth exceeds 10 MHz, AC’s utility escalates

Fig. 3.13 The different performance with different bandwidths: (a) application utility, (b) average
execution latency, (c) average cooperation gain, and (d) average energy consumption

rapidly and outperforms NC and RC, thanks to the substantial contribution of


cooperation gain to the overall utility. In contrast, NC, which lacks the additional
execution latency of AC, initially surpasses both AC and RC in utility when
the bandwidth is below 9 MHz. But as the bandwidth increases beyond 12 MHz,
NC’s utility diminishes and falls behind AC and RC due to its inability to
capitalize on the benefits of cooperation gain. QSAC, with its strategy of selecting
EDs based on dynamic network conditions, achieves commendable utility but
is somewhat constrained under varying network conditions due to its fixed ED
approach. Remarkably, BSAC consistently attains the highest application utility by
strategically balancing offloading decisions and cooperation choices.
The average application latency for each user in a single time slot, as presented
in Fig. 3.13b, varies with the bandwidth ranging from 8 to 16 MHz. Here, the AC
method struggles with the highest application latency, mainly due to the longer waiting times induced by subsequent tasks in EDs. RC, choosing cooperation randomly
without considering network conditions, incurs the second-highest latency. QSAC,
which adapts its cooperation strategy based on current network conditions, manages
to achieve the second-lowest application latency. NC, eschewing the use of EDs

altogether, records the lowest latency, offering a more streamlined processing route.
BSAC, focusing on energy efficiency, exhibits the third-lowest average application
latency among the methods.
In terms of average cooperation gain, illustrated in Fig. 3.13c, the bandwidth
variation from 8 MHz to 16 MHz brings interesting dynamics. NC, which does not
participate in data sharing, maintains a consistent zero gain across the bandwidth
spectrum. AC, with its policy of consistent cooperation, achieves the highest average
cooperation gain, irrespective of network conditions. RC, with its random approach
to cooperation, realizes a gain that is roughly half that of AC. Interestingly, QSAC’s
cooperation gain increases with bandwidth and eventually surpasses that of RC.
BSAC, by adaptively changing the data transmission ratio in EDs, secures the
second-highest cooperation gain, benefiting from its strategy of partial data sharing.
Finally, Fig. 3.13d shows the average energy consumption for the five methods,
with the bandwidth spanning 8 to 16 MHz. RC emerges as the method with the
highest energy consumption due to its random approach to cooperation, which does
not consider the interplay between task offloading and cooperation under dynamic
network conditions. AC, on the other hand, manages to reduce energy consumption
by maintaining user cooperation, which leads to lower transmission latency between
interdependent tasks during offloading, especially as bandwidth increases. NC and
QSAC, while efficient in some respects, exhibit higher energy consumption than
BSAC. BSAC stands out by balancing energy consumption with cooperation gains
and application latency, achieving the lowest energy consumption overall. This
efficiency is largely attributed to BSAC's flexibility in reducing energy consumption
by adaptively adjusting the CPU frequency of the end device.

3.3.4.2 Impact of Number of EDs

In our research, three DAG-ED configurations are defined for analysis. DAG-ED-1
refers to the original model, featuring four red EDs. Expanding on this, DAG-ED-2
includes the same four red EDs, complemented by an additional green ED valued
at 1300. This setup explores the influence of an extra ED with significant value.
Finally, DAG-ED-3 encompasses the full spectrum, incorporating all six EDs to
examine the system’s capacity in a more complex setup.
In the analysis presented in Fig. 3.14a, the total application utility of five different
methods is evaluated under three distinct DAG-ED configurations with a fixed
bandwidth of 12 MHz. NC, not participating in data sharing, shows a consistent
application utility across all DAG-ED variations. In contrast, the application utility
for other methods exhibits an upward trend as the number of EDs increases from
DAG-ED-1 to DAG-ED-3. This increase is attributed to the growing number of
EDs providing a wider range of cooperative options. Among these methods, BSAC
consistently achieves the highest application utility in each DAG-ED setup. It
adeptly balances energy consumption, cooperation gain, and application latency,
effectively utilizing the available EDs to enhance overall utility.

Fig. 3.14 (a) Application utility with different DAG-EDs and (b) latency, energy consumption, and cooperation gain of BSAC with different DAG-EDs

Further insights are offered in Fig. 3.14b, which displays changes in application
utility concerning latency, energy consumption, and cooperation gain for BSAC at
a bandwidth of 12 MHz. Following the principle outlined in Eq. (3.57), application
utility is inversely related to energy consumption and latency. Therefore, energy
consumption and latency are represented as negative values to illustrate their impact
on application utility. As the number of EDs rises from DAG-ED-1 to DAG-ED-
3, there is a notable increase in cooperation gain, albeit accompanied by extended
waiting times and elevated transmission energy consumption. The addition of
new EDs introduces more flexible cooperation options, resulting in an upsurge in
cooperation gain. This comprehensive analysis underscores the influence of EDs on
the application utility, highlighting the trade-offs between cooperation gain, energy
expenditure, and latency.

3.3.4.3 Impact of Cores

In Fig. 3.15a, the impact of varying the number of cores in the edge server on the
application utility of five different methods is presented. As the number of cores in
the edge server increases, a noticeable growth trend in application utility is observed
across all methods. This improvement is primarily due to the enhanced capacity of
the edge servers to execute tasks in parallel, effectively reducing application latency.
Similarly, Fig. 3.15b focuses on the application utility as influenced by the
number of cores in each end device. Unlike the edge server scenario, the increase in
utility here is relatively marginal. This limited improvement can be attributed to the
lower CPU frequency of the end devices compared to the edge servers. Despite the
increase in cores allowing for more parallel task execution at the device level, the
lower frequency of these devices restrains the overall gain in application utility.

Fig. 3.15 Comparison of application utility with variations in core numbers (a) in the edge server
and (b) in each end device

Fig. 3.16 The cooperation gain with different ω1 and ω2

3.3.4.4 Impact of Weight Parameters

Figures 3.16, 3.17, 3.18, 3.19, 3.20, and 3.21 demonstrate the effects of varying the weight parameters $\omega_1$, $\omega_2$, and $\omega_3$, which, respectively, correspond to cooperation gain, application latency, and energy consumption. The values of $\omega_1$ and $\omega_2$ are taken from $[0.1, 0.5, 1, 1.5]$, while $\omega_3$ is taken from $[1, 5, 10, 15]$.
With an increase in .ω1 , cooperation gain assumes greater significance in the
reward function. As a result, there is a marked increase in cooperation gain,
accompanied by a slight decrease in application latency and energy consumption. In
contrast, a higher value of .ω2 , which acts as a penalty in the reward function, tends
to favor local task execution. This approach leads to increased cooperation costs and energy consumption, offsetting the benefits of the forgone cooperation gain.
Increasing the value of .ω3 in the reward function leads to a noticeable decrease
in energy consumption, while simultaneously causing an increase in application
latency and cooperation gain. This trend arises because prioritizing energy savings


Fig. 3.17 The cooperation gain with different ω1 and ω3


Fig. 3.18 The application latency with different ω1 and ω2


Fig. 3.19 The application latency with different ω1 and ω3




Fig. 3.20 The energy consumption with different ω1 and ω2


Fig. 3.21 The energy consumption with different ω1 and ω3

in the reward function encourages task execution on edge servers, even if it means
incurring longer waiting times.

References

1. Tie Qiu, Jiancheng Chi, Xiaobo Zhou, Zhaolong Ning, Mohammed Atiquzzaman, and
Dapeng Oliver Wu. Edge computing in industrial internet of things: Architecture, advances
and challenges. IEEE Communications Surveys Tutorials, 22(4):2462–2488, 2020.
2. Min Chen and Yixue Hao. Task offloading for mobile edge computing in software defined ultra-
dense network. IEEE Journal on Selected Areas in Communications, 36(3):587–597, 2018.
3. Pavel Mach and Zdenek Becvar. Mobile edge computing: A survey on architecture and
computation offloading. IEEE communications surveys & tutorials, 19(3):1628–1656, 2017.
4. Hai Lin, Sherali Zeadally, Zhihong Chen, Houda Labiod, and Lusheng Wang. A survey on
computation offloading modeling for edge computing. Journal of Network and Computer
Applications, 169:102781, 2020.

5. Bin Cao, Long Zhang, Yun Li, Daquan Feng, and Wei Cao. Intelligent offloading in multi-
access edge computing: A state-of-the-art review and framework. IEEE Communications
Magazine, 57(3):56–62, 2019.
6. Xianfu Chen, Jinsong Wu, Yueming Cai, Honggang Zhang, and Tao Chen. Energy-efficiency
oriented traffic offloading in wireless networks: A brief survey and a learning approach
for heterogeneous cellular networks. IEEE Journal on Selected Areas in Communications,
33(4):627–640, 2015.
7. Li Lin, Xiaofei Liao, Hai Jin, and Peng Li. Computation offloading toward edge computing.
Proceedings of the IEEE, 107(8):1584–1607, 2019.
8. Yuxuan Sun, Xueying Guo, Jinhui Song, Sheng Zhou, Zhiyuan Jiang, Xin Liu, and Zhisheng
Niu. Adaptive learning-based task offloading for vehicular edge computing systems. IEEE
Transactions on Vehicular Technology, 68(4):3061–3074, 2019.
9. Jia Yan, Suzhi Bi, Ying Jun Zhang, and Meixia Tao. Optimal task offloading and resource
allocation in mobile-edge computing with inter-user task dependency. IEEE Transactions on
Wireless Communications, 19(1):235–250, 2019.
10. Ke Zhang, Yongxu Zhu, Supeng Leng, Yejun He, Sabita Maharjan, and Yan Zhang. Deep
learning empowered task offloading for mobile edge computing in urban informatics. IEEE
Internet of Things Journal, 6(5):7635–7647, 2019.
11. Jiancheng Chi, Chao Xu, Tie Qiu, Di Jin, Zhaolong Ning, and Mahmoud Daneshmand. How
matching theory enables multi-access edge computing adaptive task scheduling in IIoT. IEEE
Network, pages 1–7, 2022.
12. Jiancheng Chi, Tie Qiu, Fu Xiao, and Xiaobo Zhou. Atom: Adaptive task offloading with
two-stage hybrid matching in MEC-enabled industrial IoT. IEEE Transactions on Mobile
Computing, pages 1–17, 2023.
13. Ming Tang and Vincent WS Wong. Deep reinforcement learning for task offloading in mobile
edge computing systems. IEEE Transactions on Mobile Computing, 21(6):1985–1997, 2020.
14. Xinchen Lyu, Hui Tian, Cigdem Sengul, and Ping Zhang. Multiuser joint task offloading
and resource optimization in proximate clouds. IEEE Transactions on Vehicular Technology,
66(4):3435–3447, 2016.
15. Ying Ju, Yuchao Chen, Zhiwei Cao, Lei Liu, Qingqi Pei, Ming Xiao, Kaoru Ota, Mianxiong
Dong, and Victor CM Leung. Joint secure offloading and resource allocation for vehicular edge
computing network: A multi-agent deep reinforcement learning approach. IEEE Transactions
on Intelligent Transportation Systems, 2023.
16. Xiaobo Zhou, Shuxin Ge, Pengbo Liu, and Tie Qiu. Dag-based dependent tasks offloading in
MEC-enabled IoT with soft cooperation. IEEE Transactions on Mobile Computing, 2023.
17. Zhaolong Ning, Peiran Dong, Miaowen Wen, Xiaojie Wang, Lei Guo, Ricky Y. K. Kwok, and
H. Vincent Poor. 5G-enabled UAV-to-community offloading: Joint trajectory design and task
scheduling. IEEE Journal on Selected Areas in Communications, 39(11):3306–3320, 2021.
18. Yu Liu, Yong Li, Yong Niu, and Depeng Jin. Joint optimization of path planning and resource
allocation in mobile edge computing. IEEE Transactions on Mobile Computing, 19(9):2129–
2144, 2020.
19. Bo Yang, Xuelin Cao, Joshua Bassey, Xiangfang Li, and Lijun Qian. Computation offloading
in multi-access edge computing: A multi-task learning approach. IEEE Transactions on Mobile
Computing, 20(9):2745–2762, 2021.
20. Matthew Fahrbach, Zhiyi Huang, Runzhou Tao, and Morteza Zadimoghaddam. Edge-weighted
online bipartite matching. In 2020 IEEE 61st Annual Symposium on Foundations of Computer
Science (FOCS), pages 412–423, 2020.
21. Zhiyi Huang, Zhihao Gavin Tang, Xiaowei Wu, and Yuhao Zhang. Fully online matching II:
Beating ranking and water-filling. In 2020 IEEE 61st Annual Symposium on Foundations of
Computer Science (FOCS), pages 1380–1391, 2020.
22. Yunan Gu, Walid Saad, Mehdi Bennis, Merouane Debbah, and Zhu Han. Matching theory for
future wireless networks: fundamentals and applications. IEEE Communications Magazine,
53(5):52–59, 2015.

23. Aranyak Mehta and Debmalya Panigrahi. Online matching with stochastic rewards. In 2012
IEEE 53rd Annual Symposium on Foundations of Computer Science, pages 728–737. IEEE,
2012.
24. Hyame Assem Alameddine, Sanaa Sharafeddine, Samir Sebbah, Sara Ayoubi, and Chadi Assi.
Dynamic task offloading and scheduling for low-latency IoT services in multi-access edge
computing. IEEE Journal on Selected Areas in Communications, 37(3):668–682, 2019.
25. Sarhad Arisdakessian, Omar Abdel Wahab, Azzam Mourad, Hadi Otrok, and Nadjia Kara.
FoGMatch: an intelligent multi-criteria IoT-Fog scheduling approach using game theory.
IEEE/ACM Transactions on Networking, 28(4):1779–1789, 2020.
26. Lichao Yang, Heli Zhang, Xi Li, Hong Ji, and Victor CM Leung. A distributed computation
offloading strategy in small-cell networks integrated with mobile edge computing. IEEE/ACM
Transactions on Networking, 26(6):2762–2773, 2018.
27. Sladana Jošilo and György Dán. Decentralized algorithm for randomized task allocation in fog
computing systems. IEEE/ACM Transactions on Networking, 27(1):85–97, 2019.
28. Xiong Wang, Jiancheng Ye, and John C.S. Lui. Decentralized task offloading in edge
computing: A multi-user multi-armed bandit approach. In IEEE INFOCOM 2022—IEEE
Conference on Computer Communications, pages 1199–1208, 2022.
29. Liping Qian, Yuan Wu, Fuli Jiang, Ningning Yu, Weidang Lu, and Bin Lin. NOMA assisted
multi-task multi-access mobile edge computing via deep reinforcement learning for industrial
internet of things. IEEE Transactions on Industrial Informatics, 17(8):5688–5698, 2021.
30. A. Mehta, A. Saberi, U. Vazirani, and V. Vazirani. AdWords and generalized on-line matching.
In 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS’05), pages 264–
273, 2005.
31. Sameer Singh Chauhan and R. C. Joshi. A weighted mean time min-min max-min selective
scheduling strategy for independent tasks on grid. In 2010 IEEE 2nd International Advance
Computing Conference (IACC), pages 4–9, 2010.
32. Ismael Salih Aref, Juliet Kadum, and Amaal Kadum. Optimization of max-min and min-
min task scheduling algorithms using G.A in cloud computing. In 2022 5th International
Conference on Engineering Technology and its Applications (IICETA), pages 238–242, 2022.
33. Slađana Jošilo and György Dán. Wireless and computing resource allocation for selfish
computation offloading in edge computing. In IEEE INFOCOM 2019-IEEE Conference on
Computer Communications, pages 2467–2475. IEEE, 2019.
34. Colin Funai, Cristiano Tapparello, and Wendi Heinzelman. Computational offloading for
energy constrained devices in multi-hop cooperative networks. IEEE Transactions on Mobile
Computing, 19(1):60–73, 2019.
35. Etienne Le Sueur and Gernot Heiser. Dynamic voltage and frequency scaling: The laws of
diminishing returns. In Proceedings of the 2010 international conference on Power aware
computing and systems, pages 1–8, 2010.
36. Greg Semeraro, Grigorios Magklis, Rajeev Balasubramonian, David H Albonesi, Sandhya
Dwarkadas, and Michael L Scott. Energy-efficient processor design using multiple clock
domains with dynamic voltage and frequency scaling. In Proceedings Eighth International
Symposium on High Performance Computer Architecture, pages 29–40. IEEE, 2002.
37. Zhaolong Ning, Peiran Dong, Xiangjie Kong, and Feng Xia. A cooperative partial computation
offloading scheme for mobile edge computing enabled internet of things. IEEE Internet of
Things Journal, 6(3):4804–4814, 2018.
38. Xiongwei Wu, Xiuhua Li, Jun Li, P. C. Ching, Victor C. M. Leung, and H. Vincent Poor. Caching transient content for IoT sensing: Multi-agent soft actor-critic. IEEE Transactions on Communications, 69(9):5886–5901, 2021.
39. Quan Yuan, Jinglin Li, Haibo Zhou, Tao Lin, Guiyang Luo, and Xuemin Shen. A joint
service migration and mobility optimization approach for vehicular edge computing. IEEE
Transactions on Vehicular Technology, 69(8):9041–9052, 2020.
40. Haixia Peng and Xuemin Shen. Multi-agent reinforcement learning based resource manage-
ment in MEC- and UAV-assisted vehicular networks. IEEE Journal on Selected Areas in
Communications, 39(1):131–141, 2021.

41. Pengbo Liu, Shuxin Ge, Xiaobo Zhou, Chaokun Zhang, and Keqiu Li. Soft actor-critic-
based DAG tasks offloading in multi-access edge computing with inter-user cooperation. In
Algorithms and Architectures for Parallel Processing—21st International Conference, ICA3PP
2021, Virtual Event, December, 2021, Proceedings, Part III, volume 13157, pages 313–327,
2021.
Chapter 4
Data Caching in Industrial Edge
Computing

Edge caching is a prominent research area and practical field, especially benefiting
from the emergence of data mining for IIoT operation control. Typically, before
processing an offloaded task, it is necessary to access relevant sensing data from
servers caching the needed information. This chapter initially introduces data
caching optimization in industrial edge computing systems, which is crucial when
applications rely on inferences from sensing data over specific historical periods.
Given the diverse sources and heterogeneous nature of data collected from numerous
sensors, this chapter then presents two caching solutions tailored for two common
types of data: latency-sensitive data and video streaming data.

4.1 Introduction

Caching the related databases and AI applications within the storage capacities of
edges is a promising way to enable the decision-making of devices in industrial edge
computing systems [1, 2]. The caching data should be determined by the service
requirement of data that directs the activity of devices, such as a map to control the
moving of robots. The cached data can greatly reduce the latency of end devices and
alleviate traffic loads in backhaul links [3].
In fact, the QoE improvement caused by data caching significantly relies on
the prediction accuracy of devices’ requirements, which is reflected by the data
popularity.1 Therefore, current research works pay attention to devoting to obtaining
accurate data popularity for improving Cache Hit Rate (CHR). ML techniques,
including DL, DRL, transfer learning [4], and so on, are widely adopted to
intelligently cache the data by predicting the underlying data popularity within
given historical requests. Meanwhile, some studies reveal that data diversity, when

1 Here, the data popularity indicates how frequently the data is requested.


leveraged through cooperation among edges, can help reduce service redundancy
and enhance the CHR [5].
Specifically, in industrial edge computing systems, when receiving a request
from an end user, the edge checks whether the required data has been cached. The
request will be responded to instantly on the edge which caches the required data
and enough computation resources. Otherwise, this edge will forward this service
request to its neighboring edge that satisfies the above conditions, the so-called
edge cooperation [6–8]. In extreme circumstances, i.e., when no edge can respond to this request, it will be forwarded to the centralized cloud with high latency.
In industrial settings, the freshness of data, or its age of information, significantly
impacts decision-making in production processes. Outdated data delay the decision-
making and adversely affect production. Additionally, with the increasing adoption
of AR and VR technologies in industrial production, the edge often needs to cache
large volumes of videos. Therefore, designing effective video caching strategies
that minimize video streaming latency while maintaining high video quality is
a substantial challenge. Balancing these requirements is crucial for the smooth
operation and efficiency of industrial processes that rely on real-time data and high-
quality video content.
This chapter introduces caching methods in two different scenarios. Section 4.2 pays attention to caching freshness-aware data, e.g., the High-Definition (HD) map, incorporating download latency and freshness. We leverage a distributed Multi-Agent Multiarmed Bandit (MAMAB) algorithm in the decision-making process. Simulation results demonstrate that our algorithm outperforms other existing algorithms [9].
Section 4.3 focuses on caching QoE-aware data, e.g., video, which further leads
to a video’s bitrate selection problem. We formulate this problem as a multi-
agent cooperative MDP and solve it by Field-of-View (FoV)-aware multi-agent soft
actor–critic (FA-MASAC). The extensive simulation results show the superiority of
FA-MASAC in terms of average QoE and so on [10].

4.2 Freshness-Aware Caching with Distributed MAMAB

4.2.1 Statement of Problem

In smart factories, autonomous vehicles are extensively utilized across production


lines, assembly lines, finished goods warehouses, e-commerce warehouses, and
logistics systems. HD maps, which provide precise semantic information of road
networks, are a typical and frequently requested type of data by these autonomous
vehicles. They play a crucial role in accurate path planning and autonomous driving
decision-making.
HD maps are characterized by two critical features: large data volume and the
need for timeliness, as the maps change dynamically. For example, the amount of
Google HD map data is approximately 1 GB per mile [11]. These characteristics

make it impractical to cache all HD maps on autonomous vehicles in advance.


Additionally, as the number of autonomous vehicles increases, substantial data
requests exert significant pressure on the backhaul link, potentially leading to
undesired latency. This presents a challenge in ensuring that these vehicles can
access the most current map data efficiently and promptly, which is essential for
safe and accurate navigation in smart factory environments.
Therefore, to address this problem, HD maps are suggested to be pre-cached
on the edge, such as the RSU, according to a particular distribution, and thus
maximize the cache hit rate and reduce download latency [12, 13]. Unfortunately,
the limited storage resources prevent the RSU from caching all the contents. In
this case, it is necessary to make full use of the computation and storage resources in the neighboring environment, even the few resources in the devices, i.e., caching HD
maps in RSUs and devices cooperatively. This approach enables devices to retrieve
HD maps through Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I)
communication and thus reduces the download latency and releases the traffic
pressure on the core network [14].
The authors in [15] made request and cache decisions through a matching method
toward minimum download latency. Liu et al. [16] formed a platoon for devices that
request the same HD maps. Also, Wu et al. [17] clustered the devices that allow V2V
or V2I communication, and the devices in the same cluster cooperatively cache the
contents.
However, the time-varying HD map caused by some events, such as the acceleration of vehicles and new construction regions, leads to significant interference with
real-time driving decision-making. Moreover, the freshness of HD maps greatly
influences decision-making as the dynamic HD maps cached in devices lead to
freshness loss compared with caching in RSUs and cloud. This aspect has been
overlooked by previous methods, potentially compromising driving safety.

4.2.2 HD Map Caching Model

Our goal is to make a trade-off between download latency and data freshness, i.e.,
minimizing the total cost.

4.2.2.1 Overview

As shown in Fig. 4.1, in the Internet of Vehicles (IoV) scenario, there are usually
multiple RSUs and devices distributed in the geographical map and one remote
cloud that stores the entire HD map. We assume the geographical map is composed
of M blocks, each of which has an RSU and its individual HD map. Thus, we use
$\mathcal{M} = \{1, \cdots, M\}$ to denote the set of RSUs, each with maximum storage S. The transmission power and channel gain when communicating with RSU m are denoted by $P_m$ and $g_m$, respectively.

Fig. 4.1 Vehicle network model

HD map comprises the basic and advanced layers, containing static and dynamic information for different driving control functions [18]. For example, the basic layer guides the coarse-level planning of paths, while the fine-grained paths are planned with the support of the advanced layer. Based on this, as shown in Table 4.1, we divide the HD map into four sub-maps [19], i.e., the basic layer with static information $f_{bs}$ and dynamic information $f_{bd}$, and the advanced layer with static information $f_{as}$ and dynamic information $f_{ad}$. Let $F = \{f_{m,bs}, f_{m,bd}, f_{m,as}, f_{m,ad}\}_{m=1}^{M}$ denote the set of sub-map files f. We assume that the data size of each sub-map is equal to $s_f$.

Let $\mathcal{V} = \{1, \cdots, V\}$ denote the set of devices. In a finite time horizon $\mathcal{T}$ containing T time slots, for RSU m, its covered vehicles in time slot t are denoted by $\mathcal{V}^t_m$. The vehicle can only communicate with the corresponding RSU of its located block and its neighboring vehicles to obtain the HD map. Its transmission power and channel gain are $P_v$ and $g_v$, respectively. Note that the vehicle v also caches sub-maps locally in time slot t, denoted by $U^t_v$.
In time slot t, RSUs adaptively cache the sub-maps from the entire HD map in the cloud. There are three ways for vehicles to obtain the required sub-maps, i.e., requesting from the cloud via V2I, from an RSU via V2I, or from neighboring vehicles via V2V. To prevent interference, we assign distinct communication spectrums to V2V and V2I. We use $I^t_v = \{1, \cdots, K\}$ to represent the candidate vehicles that are able

Table 4.1 HD map elements

Layer           Data style                        Data content                                                Update frequency
Basic layer     Landmark (static)                 Road facilities, trees, etc.                                Months
                Traffic (dynamic)                 Congestion, temporary, etc.                                 Secs/mins
Advanced layer  Road segment (static)             Road and lane details                                       Days/months
                Real-time environment (dynamic)   Speed, position, and direction of pedestrians and devices   Secs

to communicate with vehicle v in time slot t. The operator can make the caching decisions for RSUs and vehicles per time slot:
• Caching decision for RSUs $\beta^t \in \{0,1\}^{M \times F}$: $\beta^t_{m,f} = 1$ indicates that sub-map f has been cached in RSU m, while $\beta^t_{m,f} = 0$ otherwise.
• Caching decision for vehicles $\alpha^t \in \{0,1\}^{V \times F}$: $\alpha^t_{v,f} = 1$ indicates that sub-map f has been cached in vehicle v, while $\alpha^t_{v,f} = 0$ otherwise.

4.2.2.2 Vehicle Request Model

During driving, each vehicle should make driving control decisions according to the planned path. Path planning is a coarse-level control requiring the basic layers $f_{bs}$ and $f_{bd}$ for blocks among all possible paths, denoted by $F^t_{v,l}$. Driving is a fine-grained control made based on both basic and advanced layers to deal with the dynamic environment and find the safe and efficient target blocks $F^t_{v,h}$.
Thus, we use $N^t_v = F^t_{v,l} + F^t_{v,h}$ to represent the required sub-maps of vehicle v in time slot t, which helps to find the sub-maps $Q^t_v$ that should be cached, i.e.,

$$Q^t_v = N^t_v - U^t_v. \tag{4.1}$$

To estimate the freshness of the set of sub-maps, we introduce the concept of Age of Information (AoI) [20]. The AoI of a sub-map is the number of time slots during which the sub-map f has not been updated, denoted by $a^t_{v,f}$. By setting a suitable threshold $\tau$ for $a^t_{v,f}$, we can remove and update a certain sub-map in the local cache once its AoI exceeds the threshold, and thus ensure the freshness of the sub-maps. Therefore, $U^t_v$ is updated as

$$U^t_v = U^{t-1}_v \cup \left( Q^{t-1}_v - \varepsilon^t_v \right) - \left( N^{t-1}_v - N^t_v \right), \tag{4.2}$$

where $\varepsilon^t_v$ and $(N^{t-1}_v - N^t_v)$ are the set of stale sub-maps and needless sub-maps of vehicle v in time slot t, respectively.
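A small sketch of the AoI-driven cache refresh around Eq. (4.2), using Python sets; the sub-map identifiers and threshold are illustrative, and as a slight extension beyond the equation we also evict stale entries already held in the local cache.

```python
# Sketch of the local cache update around Eq. (4.2); toy sub-map ids.
TAU = 5                                    # AoI threshold (time slots)

def update_local_cache(U_prev, Q_prev, N_prev, N_now, aoi):
    """U_prev: cached sub-maps; Q_prev: sub-maps fetched last slot;
    N_prev/N_now: required sub-maps last/this slot; aoi: AoI per sub-map."""
    stale = {f for f in U_prev | Q_prev if aoi.get(f, 0) > TAU}   # eps_v^t
    needless = N_prev - N_now              # (N_v^{t-1} - N_v^t)
    return (U_prev | (Q_prev - stale)) - needless - stale

U = update_local_cache(U_prev={"f1", "f2"}, Q_prev={"f3"},
                       N_prev={"f1", "f2", "f3"}, N_now={"f2", "f3"},
                       aoi={"f1": 7, "f2": 1, "f3": 0})
print(U)   # {'f2', 'f3'}: f1 is evicted as both stale and needless
```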

4.2.2.3 Specific Cost

The specific cost function on account of driving safety is defined as a weighted sum of download latency and freshness loss. First, we use $X^t \in \{0,1\}^{V \times (M+V+1) \times F}$ to denote where vehicles request the sub-maps. More specifically, $x^t_{v,k,f} = 1$ indicates the request is sent to vehicle k; $x^t_{v,m,f} = 1$ indicates the request is sent to either the cloud server or RSU m.
Then, to obtain the download latency, we assume the V2I bandwidth of each RSU is evenly distributed to its connected vehicles (the allocated wireless bandwidth of the vehicle is denoted by $B_{m,v}$). Based on this, we calculate the download latency for caching sub-map f from RSU m to vehicle v as

$$d^t_{v,m,f} = \frac{s_f}{B_{m,v} \log_2 \left( 1 + \frac{P_m g_m}{N} \right)}. \tag{4.3}$$
Bm,v log2 1 + N

Once the sub-map f is cached from vehicle k with bandwidth $B_v$ via a V2V link, the download latency is

$$d^t_{v,k,f} = \frac{s_f}{B_v \log_2 \left( 1 + \frac{P_v g_v}{N} \right)}. \tag{4.4}$$

Similarly, for caching from the cloud, the download latency can be calculated by

$$d^t_{v,0,f} = \frac{s_f}{B_0 \log_2 \left( 1 + \frac{P_0 g_0}{N} \right)}. \tag{4.5}$$

Here, $B_0$, $P_0$, and $g_0$ are the corresponding bandwidth, transmission power, and channel gain, respectively.
Next, we define the freshness loss to indicate whether or not to update the sub-map f on vehicle v in time slot t, which is

$$l^t_{v,k,f} = \begin{cases} \dfrac{a^t_{k,f}}{10\tau}, & \text{if } k \in I^t_v \text{ and } a^t_{k,f} \le \tau, \\ 0, & \text{otherwise.} \end{cases} \tag{4.6}$$

Specifically, we set $l^t_{v,0,f}$ and $l^t_{v,m,f}$ to 0 to capture the constant update of cached sub-maps in the cloud server and RSUs. By combining the download latency with the freshness loss, we obtain the cost $C^t_{v,i,f}$ for the request decision of a certain vehicle, i.e.,

$$C^t_{v,i,f} = \omega \cdot d^t_{v,i,f} + (1 - \omega) \cdot l^t_{v,i,f}, \quad i \in \{0\} \cup \mathcal{M} \cup \mathcal{V}. \tag{4.7}$$

Note that we use a weighting factor $\omega \in [0, 1]$ to make a trade-off between latency and freshness.
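The per-source cost of Eqs. (4.3)–(4.7) can be sketched as follows; the bandwidths, powers, and gains are placeholder numbers, and NOISE plays the role of the channel noise N in the text.

```python
import math

# Sketch of download latency and cost, Eqs. (4.3)-(4.7); toy parameters.
NOISE, TAU, OMEGA = 1e-6, 5, 0.5           # channel noise N, AoI threshold, weight

def latency(s_f, bandwidth, power, gain):  # Shannon-rate form, Eqs. (4.3)-(4.5)
    return s_f / (bandwidth * math.log2(1 + power * gain / NOISE))

def freshness_loss(aoi, from_vehicle):     # Eq. (4.6); RSU/cloud copies are fresh
    return aoi / (10 * TAU) if from_vehicle and aoi <= TAU else 0.0

def cost(s_f, bandwidth, power, gain, aoi=0, from_vehicle=False):
    return (OMEGA * latency(s_f, bandwidth, power, gain)
            + (1 - OMEGA) * freshness_loss(aoi, from_vehicle))   # Eq. (4.7)

# Fetch a 100-Mbit sub-map from an RSU vs. from a neighboring vehicle.
print(cost(100e6, 150e6, 2.0, 3.16))                             # RSU via V2I
print(cost(100e6, 50e6, 0.3, 3.16, aoi=2, from_vehicle=True))    # V2V
```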
Finally, the cost of vehicle v in block m caches sub-map f in time slot t is
⎾ ⎛ ⎞ ⏋ ⎲
.
D t
v,m,f =x t
v,m,f β t
m,f C t
v,m,f + 1 − β t
m,f C t
v,0,f + t
xv,k,f t
αk,f t
Cv,k,f .
k∈Itv
(4.8)

Based on Eq. (4.8), we formulate the caching problem, which makes the caching decision $\beta^t$ and vehicle request decision to minimize the total cost, as follows:

$$P1: \min_{x^t, \beta^t} \sum_{m=1}^{M} \sum_{v \in \mathcal{V}^t_m} \sum_{f \in Q^t_v} D^t_{v,m,f} \tag{4.9}$$

$$\text{s.t.} \quad x^t_{v,k,f} \le \alpha^t_{k,f}, \tag{4.9a}$$

$$\sum_{k \in I^t_v} x^t_{v,k,f} + x^t_{v,m,f} = 1, \tag{4.9b}$$

$$\sum_{f \in F} \beta^t_{m,f} s_f \le S, \tag{4.9c}$$

$$\beta^t_{m,f} \in \{0,1\}, \quad x^t_{v,k,f}, x^t_{v,m,f} \in \{0,1\}. \tag{4.9d}$$

Constraint (4.9a) specifies that the vehicle must request a sub-map from a vehicle that has cached it. Constraint (4.9b) stipulates that vehicle v is limited to acquiring only one instance of sub-map f. Constraint (4.9c) ensures the storage requirement of cached maps does not exceed the RSU's capacity. Directly solving P1 is unfeasible under the vast solution space, especially with the intricate interplay between caching and request decisions.

4.2.3 Distributed Caching and Requesting Algorithm

Therefore, in the subsequent discussion, we make the caching and request decisions by sequentially solving two decomposed subproblems, i.e., the vehicle request problem and the caching placement problem. Initially, we randomly cache the contents based on $\beta^t$ to find a potential optimal request decision $x^t_o$. This decision is further used as the basis of the caching placement problem, which is solved by a distributed MAMAB algorithm.

4.2.3.1 Freshness-Aware Request

Since the RSU is desired to make caching and request decisions in each time slot, i.e., $\beta^t = \mathbf{1}$, we transform Eq. (4.8) into a novel cost

$$D'^{t}_{v,m,f} = x^t_{v,m,f} C^t_{v,m,f} + \sum_{k \in I^t_v} x^t_{v,k,f} \alpha^t_{k,f} C^t_{v,k,f}. \tag{4.10}$$

Therefore, P1 can be transformed into

$$P2: \min_{x^t} \sum_{m=1}^{M} \sum_{v \in \mathcal{V}^t_m} \sum_{f \in Q^t_v} D'^{t}_{v,m,f} \tag{4.11}$$

$$\text{s.t.} \quad (4.9a), (4.9b), \quad x^t_{v,k,f}, x^t_{v,m,f} \in \{0,1\}.$$

P2 is a typical Integer Linear Programming (ILP) problem, which has $(k+1)$ possible solutions for caching a sub-map. This results in a huge solution space, whose worst-case time complexity is $O\left((k+1)^{|Q^t_v|}\right)$ when solved by the classical B&B algorithm.
To deal with the high complexity, we propose a freshness-aware request method based on matching. First, vehicle v covered by RSU m is indexed by its caching cost $C^t_{v,m,f}$ for receiving sub-map f from RSU m and the cost $C^t_{v,k,f}$ from vehicle k once $\alpha^t_{k,f} = 1$. Subsequently, for vehicle v, we derive the cost difference $\Delta L^t_k = C^t_{v,m,f} - C^t_{v,k,f}$, $k \in I^t_v$:
• If $\Delta L^t_k > 0$, vehicle k is regarded as a candidate selection. All the candidate selections are sorted based on this value in descending order to form a candidate set $\psi^t_v$.
• Otherwise, the vehicle v will cache the sub-map from the RSU.
We iteratively find the maximum cost discrepancy in $\psi^t_v$ and decide whether or not to cache sub-maps by receiving contents from other vehicles via V2V links, as sketched below. Here, the convergence of the iteration is determined by the average cost.
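A minimal sketch of this candidate-selection step: for each required sub-map we compute $\Delta L^t_k$ for every reachable vehicle that caches it and pick the V2V source with the largest positive saving, falling back to the RSU otherwise; the cost inputs are hypothetical.

```python
# Sketch of the freshness-aware source selection; costs are toy numbers.
def choose_source(c_rsu, c_vehicles):
    """c_rsu: cost of fetching from the RSU; c_vehicles: {vehicle_id: cost}
    for neighbors that cache the sub-map (alpha = 1)."""
    # Candidates with positive saving Delta L, sorted in descending order
    candidates = sorted(
        ((c_rsu - c, k) for k, c in c_vehicles.items() if c_rsu - c > 0),
        reverse=True)
    return candidates[0][1] if candidates else "RSU"

print(choose_source(c_rsu=0.9, c_vehicles={"k1": 0.7, "k2": 0.4, "k3": 1.2}))
# -> "k2": the neighbor offering the largest cost saving over the RSU
```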

4.2.3.2 MAMAB-Based Caching

For sub-map f, a high number of vehicles in block m caching it via V2I indicates a large reduction space for the cost of caching it at RSU m. Thus, the corresponding reward for a caching decision can be defined as

$$r^t_{m,f} = \sum_{v \in \mathcal{V}^t_m} \mathbb{I} \left( f, G^t_{v,m} \right). \tag{4.12}$$

$G^t_{v,m}$ is the set of sub-maps that vehicle v requires to request from RSU m in time slot t. $\mathbb{I}(f, G^t_{v,m}) = 1$ if $f \in G^t_{v,m}$, and $\mathbb{I}(f, G^t_{v,m}) = 0$ otherwise. Hence, we transform P2 into an MAMAB problem, i.e.,

$$P3: \max_{\beta^t} \sum_{t=1}^{T} \sum_{m=1}^{M} \sum_{f \in F} \beta^t_{m,f} r^t_{m,f} \tag{4.13}$$

$$\text{s.t.} \quad (4.9c), \quad \beta^t_{m,f} \in \{0,1\}.$$

The increasing number of RSUs also greatly expands the action space, leading to additional latency under the traditional centralized approach. Furthermore, to handle the additional communication overhead, we implement a distributed MAMAB method, where each RSU and each sub-map are regarded as an agent and an arm, respectively. Meanwhile, we employ the Upper Confidence Bound (UCB) to maintain a balance between exploration and exploitation.
Initially, RSU m randomly caches the sub-maps and updates the caching frequency $J^t_{m,f}$ of sub-map f to calculate the average reward $\bar{R}^t_{m,f}$, which gives the UCB estimate

$$\hat{R}^t_{m,f} = \bar{R}^{t-1}_{m,f} + \sqrt{\frac{3 \log \left( \gamma_f^2 t \right)}{2 J^{t-1}_{m,f}}}, \tag{4.14}$$

where $\gamma_f^2$ is the maximum reward of RSU m caching sub-map f. Since the exploration count of sub-map f is positively correlated with $J^t_{m,f}$, a large value of the term $\sqrt{3 \log(\gamma_f^2 t) / (2 J^{t-1}_{m,f})}$ leads to a high probability of selecting sub-map f. As the exploration of sub-map f proceeds, the value of $J^t_{m,f}$ becomes larger, shifting the focus toward $\bar{R}^t_{m,f}$, which signifies the exploitation phase of sub-map f.

Each RSU m optimizes its caching strategy by .max Ff =1 βm,f t t
R̂m,f under
the storage limitation. Then, the average reward is updated based on the caching
t−1 t−1
R̄m,f Jm,f +rm,f
t
t−1
t
decisions. If .βm,f = 1, .R̄m,f
t = t−1
t
and .Jm,f = Jm,f + 1. Otherwise,
Jm,f +1
t−1 t−1
t
R̄m,f
. = R̄m,f t
and .Jm,f = Jm,f .
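One RSU's UCB-driven caching loop can be sketched as follows; the reward stream, reward bound, and capacity are toy assumptions, and because all sub-maps have equal size here, the storage-constrained selection reduces to picking the top-ranked arms.

```python
import math
import random

# Sketch of one RSU's UCB caching loop, Eqs. (4.12)-(4.14); toy reward stream.
F, CACHE_SLOTS, GAMMA2 = 8, 3, 1.0       # arms, capacity in sub-maps, reward bound

J = [1e-9] * F                           # caching frequency; tiny value avoids /0
R_bar = [0.0] * F                        # average reward per sub-map

def ucb(f, t):                           # UCB estimate, Eq. (4.14)
    return R_bar[f] + math.sqrt(3 * math.log(GAMMA2 * t) / (2 * J[f]))

for t in range(1, 2001):
    # Cache the CACHE_SLOTS sub-maps with the highest UCB scores
    cached = sorted(range(F), key=lambda f: ucb(f, t), reverse=True)[:CACHE_SLOTS]
    for f in cached:
        r = max(0.0, random.gauss(0.1 * f, 0.05))  # toy V2I-hit reward, Eq. (4.12)
        R_bar[f] = (R_bar[f] * J[f] + r) / (J[f] + 1)
        J[f] += 1

print(sorted(range(F), key=lambda f: R_bar[f], reverse=True)[:CACHE_SLOTS])
```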

4.2.4 Performance Evaluation

We assess performance through the Simulation of Urban Mobility (SUMO) software. Our evaluation is conducted on a $1600 \times 1600$ m$^2$ grid road network featuring $K = 40$ RSUs and 160 sub-maps. Each sub-map occupies a size of $s_f = 100$ Mbits, while the cache capacity of each RSU is $S = 2$ Gbits [18].

The bandwidths allocated for vehicles and RSUs are configured at 50 and 150 MHz, respectively. A vehicle's communication range spans 100 m. Additionally, transmission powers for vehicles and RSUs are set to 300 mW and 2 W, respectively. Gaussian channel noise and channel gain parameters are established at $10^{-6}$ mW and 5 dB, respectively [21]. The simulation duration spans $T = 20000$ time slots, with the threshold $\tau$ defined at 5 time slots. We set $\omega$ to 0.5. The subsequent benchmarks are employed for comparative analysis:
benchmarks are employed for comparative analysis:
• Caching without V2V [17]: RSUs independently cache sub-maps to optimize
the average caching reward without leveraging V2V collaboration. The dis-
tributed MAMAB method is employed for caching decisions.
• Latency-Aware Caching [15]: Vehicle request decisions prioritize minimizing
download latency, disregarding the freshness of dynamic sub-maps. Both V2I
and V2V communications are taken into account.
• Location-Aware Caching: RSUs cache sub-maps based on proximity, starting
from nearby blocks and progressing outward until their cache capacity is reached.
Figures 4.2, 4.3, and 4.4 depict the total cost, average freshness loss, and average
download latency across all methods, where the cache size of each RSU varies from
1 to 5 Gbits. Caching without V2V exhibits the lowest average freshness loss, albeit

Fig. 4.2 The total cost with different RSU cache sizes

Fig. 4.3 The total average freshness loss with different RSU cache sizes

Fig. 4.4 The total average download latency with different RSU cache sizes

Fig. 4.5 The total cost with different RSU bandwidths

with the highest download latency, as it relies solely on RSUs or the cloud platform
for sub-map caching. Conversely, latency-aware caching yields the highest average
freshness loss but boasts the lowest download latency, prioritizing this metric
in vehicle request decisions. Location-aware caching, leveraging the MAMAB
algorithm, achieves an average freshness loss comparable to caching without
V2V, showcasing the efficacy of the MAMAB algorithm. However, its download
latency ranks second-highest due to suboptimal caching decisions. The proposed
freshness-aware caching method strikes a balance between average freshness loss
and download latency, resulting in the lowest total cost. Additionally, an observed
trend indicates a decrease in total cost across all methods with an increase in RSU
cache size, attributed to the increased cacheable sub-maps from RSUs.
Figures 4.5, 4.6, and 4.7 further illustrate the performance of the four methods, in
terms of cost, freshness loss, and download latency under varying RSU bandwidths.
As previously discussed, the proposed method achieves the lowest total cost.
Meanwhile, with the bandwidth of RSU increasing, the download latency for
obtaining sub-maps decreases, as well as the total cost.
The simulation outcomes unequivocally demonstrate the superiority of the proposed freshness-aware caching method over other caching strategies. However, in practical scenarios, it is imperative to recognize that not all vehicles may be inclined to share their

Fig. 4.6 The total average freshness loss with different RSU bandwidths

Fig. 4.7 The total average download latency with different RSU bandwidths

cache resources voluntarily. Hence, implementing a suitable incentive mechanism


becomes paramount to incentivize vehicles for active participation in the HD map
caching and sharing process. This aspect remains an avenue for future research
endeavors.

4.3 Multicategory Video Caching

4.3.1 Statement of Problem

The emergence of vision techniques on smartphones and Head Mounted Devices (HMDs), 360-degree video streaming, and VR has garnered considerable attention. A Goldman Sachs survey [22] projected approximately 100 million VR users and over $80 billion in revenue by 2025. Typically, a critical feature of 360-degree video is its high storage requirement, which can reach 6 times that of general video. Certainly, streaming the entire high-quality 360-degree video directly from a remote cloud can impose significant stress on the

core network. For 360-degree video streaming, it is difficult to enhance the QoE for users, with video quality and rebuffering being two pivotal metrics.
The previous studies focus on tile-based streaming and thus augment video quality and decrease rebuffering, ultimately enhancing QoE [23, 24]. The fundamental idea involves spatially dividing the video into multiple tiles, each of which is flexible to choose a certain bitrate based on available bandwidth [25]. Furthermore, in 360-degree video, the users often focus on a specific part of the video within their FoV [26, 27]. Consequently, a series of efficient methods for FoV prediction springs up, e.g., Linear Regression (LR) [28, 29], DL-based algorithms [25, 30], and so on. These methods transmit the tiles in the FoV with a high bitrate, while other tiles outside the FoV are transmitted at a low bitrate or omitted entirely, contributing to an improvement in video quality.
These methods rely on temporal correlation between frames, which is difficult
to apply to mobile VR scenarios as the freedom of VR diminishes the temporal
correlation. Numerous studies introduced saliency-driven approaches to enhance
QoE performance of 360-degree video. These solutions capitalize on the sub-
stantial correlation between historical view trajectory and pixel saliency within
the video [31, 32]. Moreover, certain works have delved into a more refined
assessment of the significance of FoV in bitrate selection, playing an essential role
in maintaining video quality with less bandwidth [33–35].
However, although the above approaches try to improve the overall performance with the given resources, they cannot break the bottleneck when resources are heavily limited, where edge caching is regarded as a potential solution [36, 37]. Several studies make
caching and bitrate selection decisions to further improve QoE [38, 39]. Given a
constant bitrate selection, as shown in Fig. 4.8, tiles in FoV can be cached in advance
at an edge, allowing the user to access request content promptly, thus decreasing
latency. Otherwise, i.e., the bitrate selection changed adaptively according to the
bandwidth, tiles in the FoV are cached at the edge with a high-bitrate format. It is
evident that decisions regarding edge caching and bitrate selection directly affect
the final video quality and latency.
However, the limitation of existing QoE-driven strategies lies in the uniform QoE function, while, in fact, distinct applications exhibit different preferences among the factors for estimating the QoE. Research has demonstrated that different
video genres place different importance on aspects such as video quality and
rebuffering [40–42]. Taking the game as an example, a higher weight tends to be
assigned to rebuffer, as users prefer the smoothness of the gaming experience, while
for a landscape video, a higher weight for video quality is warranted, given the
preference for distortion-free landscape images.
Figure 4.9 illustrates that the QoE estimation from one perspective is inadequate
for accommodating the various requirements of different categories (e.g., game and
landscape). The three strategies are detailed below:
• The quality-first (QF) strategy aims to cache a few portions of tiles with high
quality at the edge. Note that this approach may fall into the local optimum for
users engaging in games.

Fig. 4.8 360-degree video streaming caching system


Fig. 4.9 Different strategies for multicategory 360-degree video streaming

• The rebuffer-first (RF) strategy tends to store as many tiles as possible with low quality. It can offer a smoother experience for users engaging in games, while it may not satisfy the requirements of landscape videos.
• The optimal strategy involves selecting different qualities for different categories to satisfy the corresponding requirements. By doing so, there is considerable potential for enhancing the average QoE across the board.
Addressing multicategory 360-degree video streaming involves partitioning
storage space for various video categories and applying a dedicated QoE-driven
strategy to each category using its specific QoE function. Indeed, rigidly dividing
cache space is not feasible for a random request state. Moreover, the decisions
regarding caching and bitrate selection suffer from a huge decision space as they
are greatly coupled, which poses a formidable challenge in maximizing the average
QoE for users.

4.3.2 FoV-Based QoE of Users

We formulate the edge caching and bitrate selection for 360-degree video streaming
problem as a multicategory optimization problem in an industrial edge computing
system.
As shown in Fig. 4.10, the multicategory 360-degree video streaming system
is composed of a remote cloud, an edge server, and U users, denoted by
$\mathcal{U} = \{1, \cdots, u, \cdots, U\}$. Specifically, let C denote the caching capacity of the edge
server. The link between the edge server and the users is a wireless link, while that
between the edge server and the cloud is a high-capacity backhaul link.
The remote cloud caches all video categories, denoted by $\mathcal{O} = \{1, \cdots, o, \cdots, O\}$.
Each video category contains multiple videos, denoted by $\mathcal{V}_o = \{1, \cdots, z, \cdots, Z\}$.
The video duration is divided into multiple segments $\mathcal{T} = \{1, \cdots, t, \cdots, T\}$,
each of which has a tile set $\mathcal{M} = \{1, \cdots, m, \cdots, M\}$. Each tile is available in K
bitrate versions, indexed by $\mathcal{K} = \{1, \cdots, K\}$, where the k-th version requires
caching capacity $c_k$. The transmission latencies between edge and users and
between edge and cloud are denoted by $d^E$ and $d^C$, respectively.

Fig. 4.10 System architecture



We use a lightweight two-layer LSTM model to predict the FoV [24]. The output
of the LSTM model is the predicted FoV of the t-th segment for user u, i.e.,
$I_{u,t} = \{I_{u,t,1}, \cdots, I_{u,t,A}\}$, where $a \in \mathcal{A} = \{1, \cdots, A\}$ indexes the tiles.
The operator decides whether each tile is cached at the edge and its corresponding
bitrate level. The bitrate selection decisions for category o are represented
by $x_o = \{x_{o,u} \mid u \in \mathcal{U}_o, o \in \mathcal{O}\}$. From the perspective of users, we further denote
$x_{o,u} = \{x_{u,t,a} \mid t \in \mathcal{T}, a \in \mathcal{A}\}$. Here, $x_{u,t,a} \in \mathcal{K}$ is an integer variable denoting
the bitrate selected for the a-th tile of segment t's FoV in user u's viewpoint.
Moreover, let $y_o = \{y_{o,u} \mid u \in \mathcal{U}_o, o \in \mathcal{O}\}$ denote the edge caching decisions of
all categories. Here, $y_{o,u} = \{y_{u,t,a}^{x} \mid t \in \mathcal{T}, a \in \mathcal{A}, x \in \mathcal{K}\}$ is a set of binary
variables, where $y_{u,t,a}^{x} = 1$ denotes that the a-th tile with the x-th bitrate of segment t
viewed by user u is cached, and $y_{u,t,a}^{x} = 0$ otherwise. Based on the above concepts,
we decompose the QoE into the following aspects:
Average FoV Quality Given that the user mainly perceives content in the FoV,
which significantly impacts the QoE, utilizing FoV prediction, the average FoV
quality of segment t for user u is computed as follows:

$QoE_{u,t}^{1} = \frac{\sum_{a=1}^{A} q(x_{u,t,a})}{A}. \quad (4.15)$
Here, $q(\cdot)$ maps the bitrate selection to the video quality experienced by users, i.e.,

$q(x) = R_x / R_K, \quad (4.16)$

where $R_K$ is the K-th bitrate [43].


Average FoV Temporal Variations Abrupt shifts in quality among consecutive
segments can result in a jarring experience for users. Consequently, it is crucial
to monitor the quality difference among segments, i.e.,

$QoE_{u,t}^{2} = \left| QoE_{u,t}^{1} - QoE_{u,t-1}^{1} \right|. \quad (4.17)$

Average FoV Spatial Variations Variations in quality between neighboring tiles
result in heterogeneity within a segment, influencing the user's viewing experience.
Normally, these spatial variations can be estimated by

$QoE_{u,t}^{3} = \frac{1}{A} \cdot \sum_{a=1}^{A} \sum_{j \in G(a)} \left| q(x_{u,t,a}) - q(x_{u,t,j}) \right|, \quad (4.18)$

where $G(a)$ includes a's neighboring tiles in the FoV.



Rebuffer To ensure a seamless playback experience, it is essential for the buffer
length to exceed the segment download latency. Consequently, the rebuffering
duration is

$QoE_{u,t}^{4} = \max\{D_{u,t} - B_{u,t}, 0\}, \quad (4.19)$

where $D_{u,t}$ and $B_{u,t}$ are the download latency and buffer length of user u when
downloading segment t. Note that the tiles outside the FoV are cached with the lowest
bitrate. Let $D'_{u,t} = (M - A) \cdot l_1 \cdot d^E$ denote the transmission latency of tiles in
the non-FoV region; the total download latency is then

$D_{u,t} = \sum_{a=1}^{A} \left[ y_{u,t,a}^{x} \cdot l_x \cdot d^E + (1 - y_{u,t,a}^{x}) \cdot l_x \cdot d^C \right] + D'_{u,t}, \quad (4.20)$

where $l_x$ is the size of a tile with the x-th bitrate. Monitoring the buffer length of a user is
essential for calculating rebuffer events, and thus we calculate the evolution of the buffer
length as follows:

$B_{u,t+1} = \max\{B_{u,t} - D_{u,t}, 0\} + \Delta t, \quad (4.21)$

where $\Delta t$ is the playback duration of segment t.


Our goal is to maximize the QoE, weighted by two nonnegative parameters $\alpha_o$
and $\beta_o$, i.e.,

$QoE_u^t = \alpha_o \cdot QoE_{u,t}^{1} - QoE_{u,t}^{2} - QoE_{u,t}^{3} - \beta_o \cdot QoE_{u,t}^{4}. \quad (4.22)$

Based on Eq. (4.22), we make edge caching and bitrate selection decisions
according to the predicted FoV to maximize the average QoE, which can be formulated
as

$P1: \max_{x,y} \ \frac{1}{T} \sum_{t=1}^{T} \frac{1}{O} \sum_{o=1}^{O} \frac{1}{|\mathcal{U}_o|} \sum_{u \in \mathcal{U}_o} QoE_u^t, \quad (4.23)$

$\text{s.t.} \ \sum_{k=1}^{K} \sum_{o=1}^{O} f_{o,k,t} \cdot c_k \leq C, \ \forall t \in \mathcal{T}, \quad (4.24)$

$0 \leq B_{u,t} \leq B_u^{max}, \ \forall u \in \mathcal{U}, \forall t \in \mathcal{T}, \quad (4.25)$

where $f_{o,k,t} = \sum_{u \in \mathcal{U}_o} \sum_{a \in \mathcal{A}} y_{u,t,a}^{k}$ is the number of tiles with bitrate k of video category
o cached at the edge. Constraints (4.24) and (4.25) indicate that the storage requirement
of caching must not exceed the edge server's capacity, and that the buffer length does
not surpass the maximum capacity $B_u^{max}$, respectively.
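To make the QoE model concrete, the following minimal Python sketch computes the per-segment QoE of Eqs. (4.15)–(4.22) for a single user, including the buffer evolution of Eq. (4.21). The bitrate ladder, tile sizes, latencies, and neighbor map are illustrative assumptions, not values prescribed by the formulation.

    # Minimal sketch of the per-segment QoE terms in Eqs. (4.15)-(4.22).
    R = [2, 5, 8, 16]                       # assumed bitrate ladder (Mbps), K = 4

    def q(x):
        """Eq. (4.16): map a 1-based bitrate index x to perceived quality."""
        return R[x - 1] / R[-1]

    def segment_qoe(x_fov, x_prev_fov, y_fov, B, neighbors, l,
                    dE=1/14, dC=1/2.9, M=24, alpha=30.0, beta=30.0, dt=1.0):
        A = len(x_fov)
        qoe1 = sum(q(x) for x in x_fov) / A                  # Eq. (4.15)
        qoe1_prev = sum(q(x) for x in x_prev_fov) / A
        qoe2 = abs(qoe1 - qoe1_prev)                         # Eq. (4.17)
        qoe3 = sum(abs(q(x_fov[a]) - q(x_fov[j]))            # Eq. (4.18)
                   for a in range(A) for j in neighbors[a]) / A
        # Eq. (4.20): cached FoV tiles come from the edge, others from the
        # cloud; non-FoV tiles are fetched at the lowest bitrate via the edge.
        D = sum(l[x - 1] * (dE if cached else dC)
                for x, cached in zip(x_fov, y_fov)) + (M - A) * l[0] * dE
        qoe4 = max(D - B, 0.0)                               # Eq. (4.19)
        B_next = max(B - D, 0.0) + dt                        # Eq. (4.21)
        return alpha * qoe1 - qoe2 - qoe3 - beta * qoe4, B_next   # Eq. (4.22)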

4.3.3 Multi-agent Soft Actor–Critic Caching

It can be found that P1 exhibits typical Markov properties, as the video experience
is strongly correlated across consecutive epochs. Hence, we transform P1 into a
multi-agent MDP, where each video category is regarded as an agent to deal with the
different QoE requirements among categories. The agents make decisions
collaboratively to maximize the long-term QoE.
State S In the given system, the state s comprises the local observations of all agents,
represented as $s = \{s_1, \cdots, s_o, \cdots, s_O\}$. The observation of agent o comprises the
weight of the category $W_o$, the request states $R_{u,t}$ and buffer lengths $B_{u,t}$, and the
user set of agent o after the FoV-aware operation, $\bar{\mathcal{U}}_o$:

$s_o^t = (W_o, \{R_{u,t}\}_{u \in \bar{\mathcal{U}}_o}, \{B_{u,t}\}_{u \in \bar{\mathcal{U}}_o}, C), \ \forall o \in \mathcal{O}, \quad (4.26)$

where $R_{u,t}$ is composed of the video id, segment id, and the predicted viewing
probabilities of all tiles.
Action A The actions $a = \{a_1, \cdots, a_o, \cdots, a_O\}$ of the agents comprise the bitrate
selection and edge caching decisions, i.e.,

$a_o^t = (\{x_{u,t}\}_{u \in \bar{\mathcal{U}}_o}, \{y_{u,t}\}_{u \in \bar{\mathcal{U}}_o}), \ \forall o \in \mathcal{O}, \quad (4.27)$

where $x_{u,t} = \{x_{u,t,1}, \cdots, x_{u,t,A}\}$ represents the bitrate selection of tiles in the FoV of user
u, and $y_{u,t} = \{y_{u,t,1}, \cdots, y_{u,t,A}\}$ is the corresponding edge caching decision.
Reward Each agent is expected to optimize the average QoE within a given state $s_o^t$.
Thus, the overall reward obtained by agent o can be calculated by

$QoE_{o,t}^{ave} = \begin{cases} \frac{1}{|\mathcal{U}_o|} \sum_{u \in \mathcal{U}_o} \left( QoE_u^t + QoE_u^b \right), & 0 < C' \leq C, \\ -5, & C' > C, \end{cases} \quad (4.28)$

where

$QoE_u^b = \begin{cases} 0, & 0 \leq B_u^t \leq B_u^{max}, \\ -2, & \text{otherwise}, \end{cases}$

is a penalty term applied when the buffer length of user u violates its constraint, and
$C'$ denotes the cache capacity actually used. Note that when $C'$ violates the caching
capacity constraint, i.e., $C' > C$, the average QoE of agent o is set to $QoE_{o,t}^{ave} = -5$.
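As a sketch, the reward of Eq. (4.28) can be computed as follows; the per-user QoE values, buffer lengths, and used cache capacity are assumed to be provided by the environment.

    def agent_reward(qoe_per_user, buffer_per_user, used_cache, C, B_max):
        """Sketch of the reward in Eq. (4.28) for one agent (video category)."""
        if used_cache > C:                 # cache capacity violated
            return -5.0
        total = 0.0
        for qoe_u, B_u in zip(qoe_per_user, buffer_per_user):
            qoe_b = 0.0 if 0.0 <= B_u <= B_max else -2.0   # buffer penalty
            total += qoe_u + qoe_b
        return total / len(qoe_per_user)   # average over the agent's users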
Fig. 4.11 The structure of MASAC for multicategory video caching

Nevertheless, there is a significant issue in estimating the state-transition probability
function of P1, especially before an action is taken, because the equilibrium is further
complicated by the dynamic state as well as the tightly coupled relationships between
agents. To tackle these issues, we propose an MASAC approach to learn optimal
policies while considering the interference among categories, as illustrated in Fig. 4.11.
Notably, MASAC leverages the similarity of users' FoVs to limit the action space,
thereby expediting training convergence. Detailed information on the actor and critic
modules of this algorithm is given in Sect. 3.3.3.

4.3.4 Performance Evaluation

We evaluate the overall performance of FA-MASAC in industrial edge computing
systems. We first detail the experimental datasets, the experimental setup, and the
baseline methods. Subsequently, the parameter settings used for training are
listed. Finally, we conduct a comprehensive analysis of the results.
Two publicly available datasets are used for our evaluation, both widely used in
existing studies that focus on 360-degree video streaming:
• Dataset1 [44] contains the FoV trajectories of 48 users across eight
360-degree videos. The traces of 30 users are used as the training set, while the
remaining data is used for testing [24]. We also extract 480 traces to construct the
360-degree video library: each of the 480 traces is randomly assigned an index
corresponding to one of the available videos with equal probability and, according
to the assigned video index, uniformly selects one of 18 available FoV traces.
• Dataset2 [45] comprises the FoV trajectories of 60 users across 28 360-degree
videos. 20 users’ traces are used for training, and the rest are earmarked for
evaluation.
We model a typical cellular network scenario, featuring a remote cloud, a single
edge, and a cohort of mobile users. In our simulations, we deploy .U = 30 users
randomly distributed within the edge’s coverage area. As per the configuration
in [38], the transmission latency for one Mbit from the edge to a user is set at
$d^E = 1/14$ s/Mbit, while the corresponding latency from the remote cloud to the user
is $d^C = 1/2.9$ s/Mbit. The edge is equipped with a cache capacity capable of storing
20% of the 360-degree videos from the generated video library. To streamline
analysis, we assume that all users commence video playback simultaneously,
with a uniform duration (e.g., 30 s) of video consumption. We categorize videos
into three distinct groups (i.e., Category1, Category2, and Category3) based on
the varied QoE requirements for 360-degree videos. The weight parameters are set
differently to guide optimal edge caching and bitrate selection decisions for service
operators and users. For instance, C1 places equal emphasis on both video quality
and rebuffering, reflected in the weight parameters $\alpha_1{:}\beta_1 = 30:30$. C2 prioritizes
minimizing rebuffering, assigning a weight ratio of $\alpha_2{:}\beta_2 = 1:30$. Conversely,
C3 prioritizes video quality, featuring a larger weight for video quality with
$\alpha_3{:}\beta_3 = 30:1$. Table 4.2 delineates the QoE metric weights based on video
categories. Each 360-degree video is segmented into 30 segments, each lasting 1 s.
Within each segment, we divide the frame into $M = 24$ tiles. Employing
FFmpeg [46] with H.264, each tile is encoded into $K = 4$ different bitrate levels:
360p (2 Mbps), 720p (5 Mbps), 1080p (8 Mbps), and 2K (16 Mbps). Additionally,
we assume that each FoV consists of $A = 9$ tiles.

Table 4.2 Weighting factors of QoE metrics

Video category   α    β
Category1 (C1)   30   30
Category2 (C2)   1    30
Category3 (C3)   30   1
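For reference, the simulation constants just listed can be collected in one place; the following sketch simply restates the setup values above.

    # Simulation constants from the setup above (a sketch).
    SETUP = {
        "num_users": 30,
        "d_edge": 1 / 14,        # s/Mbit, edge -> user
        "d_cloud": 1 / 2.9,      # s/Mbit, cloud -> user
        "cache_ratio": 0.20,     # fraction of the video library cached
        "segments": 30, "segment_len_s": 1,
        "tiles_per_segment": 24, "fov_tiles": 9,
        "bitrates_mbps": {"360p": 2, "720p": 5, "1080p": 8, "2K": 16},
        "weights": {"C1": (30, 30), "C2": (1, 30), "C3": (30, 1)},
    }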
The benchmarks used for comparison with FA-MASAC are as follows:
• RF [38] generates a virtual viewport according to the overlap of requests. It
makes decisions via a DQN model to maximize the video quality subject to limited
rebuffering (0.2 s).
• QF [28] constantly updates a common FoV based on previous users' historical
trajectories. The common FoV guides the caching decision-making process
toward maximum video quality. Upon receiving a request, it selects either a high
or a low bitrate for the tiles to serve the user, choosing the same quality for all tiles in the FoV.
• Quality–Rebuffer–Balance (QRB) uses the SAC algorithm [47] to decide
which tiles are cached in the edge server and with what bitrate. The objective
of QRB is to maximize the average QoE. Given the FoV prediction, QRB treats
all video categories as Category 1, assigning equal weight to video quality
and rebuffering to inform the decisions. Note that RF, QF, and QRB ignore the
distinct QoE requirements of different categories.
• Multi-video Category Based on A3C (MVC-A3C) utilizes three A3C networks
[48, 49], one per video category, to make the same types of decisions as
QRB. In this method, the cache resources of the edge are evenly divided into three
parts. Each video category employs its respective A3C network to independently
make optimal bitrate selection and edge caching decisions based on its own QoE
function.

We implement FA-MASAC using PyTorch [50] in Python. The FA-MASAC
architecture incorporates three fully connected layers, comprising the input layer,
one hidden layer, and the output layer. For each agent, the actor state vector size is
$(M + 5) \times \bar{U}_o$ nodes, where $\bar{U}_o$ represents the number of users of agent o after the
FoV-aware operation. Furthermore, the hidden layer and the output layer consist of
$2 \times A \times \bar{U}_o$ nodes, aligning with the $2 \times A \times \bar{U}_o$ actions. Conversely, the critic state
size is $(M + 2 \times A + 5) \times \bar{U}$, encapsulating the environment states and the actions of
the users of all agents after the FoV-aware operation. We use the linear function for
activation and the Adam optimizer for training, which lasts 400 epochs over historical
transitions. Regarding parameters, the learning rate is set at $\alpha = 0.0003$, with an
exploration noise of 0.1. The discount factor is configured to 0.99. Additionally, the
experience replay buffer capacity is $D = 500$, and the mini-batch size is $MB = 64$.
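For illustration, a minimal PyTorch sketch of one agent's actor network with the layer sizes described above follows; the numeric values of M, A, and the per-agent user count are placeholders, and the MASAC-specific policy head is omitted for brevity.

    import torch.nn as nn

    M, A, U_BAR = 24, 9, 10   # tiles per segment, FoV tiles, users (assumed)

    class Actor(nn.Module):
        """Sketch of one agent's actor: three fully connected layers whose
        sizes follow the description above; activations are linear."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear((M + 5) * U_BAR, 2 * A * U_BAR),   # input -> hidden
                nn.Linear(2 * A * U_BAR, 2 * A * U_BAR),     # hidden -> output
            )

        def forward(self, state):
            return self.net(state)   # one value per bitrate/caching action

Such a network would be trained with the Adam optimizer and the hyperparameters listed above.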

4.3.4.1 QoE Performance

Figure 4.12 provides a comparative analysis of the normalized average QoE


across multiple categories with distinct QoE requirements. FA-MASAC achieves
superior performance over other baseline approaches, yielding a higher normalized
average QoE for all video categories. Notably, FA-MASAC exhibits substantial
enhancements in performance when compared to the best baseline methods for
specific video categories. Specifically, in contrast to QRB, RF, and QF, FA-
MASAC showcases performance improvements of 18.25%, 2.5%, and 15.1% for
C1, C2, and C3, respectively. This improvement stems from the strategic utilization
of cooperation and competition dynamics among video categories, enabling the
selection of optimal bitrate and cache decisions tailored to each video category,
consequently improving users' average QoE.
Fig. 4.12 Average QoE by different methods of Dataset1

In the comparison among baseline methods, QRB stands out as superior in C1
due to its consideration of both video quality and rebuffering in optimizing
performance. In C2, QF, which emphasizes video quality, yields an unsatisfactory QoE, while
RF excels, delivering excellent QoE. This success is attributed to RF prioritizing
minimal rebuffer in its decision-making, allowing for the storage of more tiles with
relatively low-quality levels in the edge, meeting the QoE requirements of C2.
Conversely, prioritizing video quality, QF greatly benefits C3, while RF struggles
to deliver good QoE in this category. The MVC-A3C, which adopts an independent
strategy for each video category, performs reasonably well in each category but fails
to achieve optimal performance due to the lack of consideration for the interplay
between video categories. Additionally, the consistent performance comparison
results across various datasets, as depicted in Figs. 4.12 and 4.13, highlight the
robustness of FA-MASAC in real-world scenarios.
Fig. 4.13 Average QoE by different methods of Dataset2

Fig. 4.14 Average video quality

Fig. 4.15 QoE metrics by different methods based on video category

Figures 4.14 and 4.15 show the video quality selected by all methods. As can be
seen from Fig. 4.14, FA-MASAC and MVC-A3C dynamically select appropriate
video quality for different video categories, successfully increasing the average
QoE. These methods show better performance, i.e., offering higher video quality, in
C3, and vice versa in C2. Since MVC-A3C makes bitrate selection and edge caching
decisions independently for each video category, it lacks the adaptability to optimize
decisions based on dynamic user requests, failing to deliver a higher average QoE.
In contrast, other baseline methods consistently allocate the same video quality to all
three video categories, regardless of their distinct QoE requirements. For instance,
RF assigns low video quality to all video categories, whereas QF assigns high video
quality to all of them. The mismatch between video quality and QoE requirements
is bound to adversely impact users’ average QoE. The results demonstrate that
adaptively assigning video quality to different video categories greatly improves
average QoE.
Figure 4.15 illustrates that FA-MASAC effectively reduces rebuffering for all
video categories by judiciously utilizing the limited cache resources of the edge
to store more tiles. It strategically allocates more cache resources to store high-
quality tiles, resulting in a lower rebuffer value for Category 3. QF, with its strategy
of caching a small number of high-quality tiles at the edge, leads to high rebuffering
values for all video categories. Notably, this method is particularly unfavorable to
Category 2, which prioritizes rebuffering. MVC-A3C makes decisions based on the
specific QoE function of each video category, yielding relatively low rebuffering
for Category 1, lower rebuffering for Category 2, and relatively high rebuffering for
Category 3. Conversely, RF and QRB provide relatively low rebuffering. However, a
distinct drawback is that their inability to adaptively allocate differentiated video quality
across video categories hampers the average QoE, especially for Category 3.
These results highlight that FA-MASAC not only collaboratively leverages the
edge’s cache resources but also employs different strategies to mitigate the mismatch
between a single unified QoE function and the diverse QoE requirements of various
video categories, thereby enhancing the average QoE for users.
Fig. 4.16 Long-term average QoE

Fig. 4.17 Average QoE by different methods

Figure 4.16 depicts the average QoE of all five methods over time. FA-MASAC
stands out by achieving the highest average QoE as it considers the diverse QoE
requirements of multiple video categories to select appropriate bitrates and tiles
for caching. The MVC-A3C method makes independent decisions for each video
category, while QRB considers both video quality and rebuffer in its decision-
making process. These methods lack adaptability to optimize bitrate selection and
edge caching decisions dynamically in response to changing user requests, resulting
in an average QoE inferior to FA-MASAC. Prioritizing rebuffering, RF achieves a
lower average QoE compared to QRB. Moreover, the performance of QF is the
worst, because it selects a high-quality level only for FoV tiles, leading to
high rebuffering for multicategory 360-degree videos. In summary, the results prove
the significance of considering the diverse QoE requirements of multiple video
categories for achieving superior performance.
The Cumulative Distribution Function (CDF) of normalized average QoE among
all methods is depicted in Fig. 4.17. FA-MASAC consistently outperforms others in
terms of normalized average QoE. The normalized average QoE of FA-MASAC
reaches 0.84, representing improvements of 3.7% and 5% over QRB and MVC-
A3C, respectively. Also, the normalized average QoE of RF is 0.63 across all
video categories. Since QF emphasizes video quality, it has the lowest performance,
i.e., 0.42, for multicategory 360-degree videos with diverse QoE requirements. FA-
MASAC exhibits a notably higher proportion of normalized average QoE exceeding
0.8, reaching as high as 0.68, while the baseline methods are distributed at 0.6,
0.5, 0.4, and 0.3, respectively. This observation reveals the limitations of methods
employing a single unified QoE function in delivering satisfactory average QoE
among categories.

4.3.4.2 Impact of the Proportion of Requests

We explore the impact of proportions of user requests for different video categories,
as shown in Figs. 4.18 and 4.19, where the proportion for C2 varies from 10% to
50%. Here, the remaining requests are equally divided into C1 and C3. The results
demonstrate that FA-MASAC always outperforms other baseline methods across
all datasets, showcasing its adaptability to different proportions of user requests.
This superiority is attributed to FA-MASAC’s utilization of distinct edge caching
and bitrate selection strategies for each video category, effectively mitigating the
mismatch between a single unified QoE function and the specific QoE requirements
of each video category. Among the baselines, QRB consistently achieves a higher
normalized average QoE than the other methods across different proportions of user requests.
Notably, the normalized average QoE of MVC-A3C exhibits an increasing trend
as the proportion of user requests for C2 rises, but it declines when this proportion
exceeds 30%. This behavior is a consequence of MVC-A3C dividing the edge’s
cache resource into three parts, allocating each to a specific video category. If the
actual user requests for these three video categories are unbalanced, the performance
of MVC-A3C deteriorates.

Fig. 4.18 Average QoE by different methods of Dataset1



Fig. 4.19 Average QoE by different methods of Dataset2

With an increase in the proportion of user requests for Category 2, the normalized
average QoE of RF rises, while that of QF declines. This behavior arises because
both QF and RF solely consider video quality or rebuffer in their edge caching and
bitrate selection decisions, rendering them inflexible to changes in the proportion
of user requests for different video categories. Specifically, when the proportion of
user requests for Category 2 is below 30%, QF outperforms RF. This is because the
proportion of user requests for Category 2 is lower than that for Category 3, and QF
prioritizes video quality, favoring Category 3. Consequently, the average QoE of QF
surpasses that of RF. However, when the proportion of user requests for Category 2
exceeds 30%, RF may outperform QF.

4.3.4.3 Impact of Cache Size

Figure 4.20 shows the impact of cache size on average QoE, where the cache
capacity C varies within the range of 5% to 25%. FA-MASAC achieves a better
performance than other methods. This superiority arises from FA-MASAC's ability
to adaptively select bitrates for different video categories, ensuring a more rational
allocation of cache resources and avoiding wastage. Since FA-MASAC allows
tiles in FoV to select different qualities, it exhibits flexibility in selecting the
bitrate quality of cached tiles, thus improving the video quality and reducing
latency. Additionally, with the increasing cache capacity, more and more popular
video content can be stored at the edge, resulting in a gradual stabilization of the
performance of all methods, especially with large cache capacity (e.g., 25%).
The comprehensive simulation results consistently demonstrate the superiority of
FA-MASAC over other baseline methods in terms of average QoE. It is important
to note that this work does not consider the collaboration of multiple edge networks.
Future research could explore more complex and realistic scenarios to further
enhance the understanding of collaborative edge computing environments.

Fig. 4.20 Average QoE by different methods over different cache sizes

References

1. Konstantinos Poularakis, Jaime Llorca, Antonia Maria Tulino, Ian J. Taylor, and Leandros
Tassiulas. Joint service placement and request routing in multi-cell mobile edge computing
networks. In 2019 IEEE Conference on Computer Communications, INFOCOM 2019, Paris,
France, April 29–May 2, 2019, pages 10–18. IEEE, 2019.
2. Pawani Porambage, Jude Okwuibe, Madhusanka Liyanage, Mika Ylianttila, and Tarik Taleb.
Survey on multi-access edge computing for internet of things realization. IEEE Communica-
tions Surveys Tutorials, 20(4):2961–2991, 2018.
3. Prithwish Basu, Theodoros Salonidis, Brent Kraczek, Sayed M. Saghaian N. E., Ali Sydney,
Bongjun Ko, Tom La Porta, and Kevin S. Chan. Decentralized placement of data and analytics
in wireless networks for energy-efficient execution. In 39th IEEE Conference on Computer
Communications, INFOCOM 2020, Toronto, ON, Canada, July 6-9, 2020, pages 486–495.
IEEE, 2020.
4. B. N. Bharath, Kyatsandra G. Nagananda, and H. Vincent Poor. A learning-based approach to
caching in heterogeneous small cell networks. IEEE Trans. Commun., 64(4):1674–1686, 2016.
5. Yu-Jia Chen, Kai-Min Liao, Meng-Lin Ku, and Fung Po Tso. Mobility-aware probabilistic
caching in UAV-assisted wireless D2D networks. In 2019 IEEE Global Communications
Conference, GLOBECOM 2019, Waikoloa, HI, USA, December 9-13, 2019, pages 1–6. IEEE,
2019.
6. Yuris Mulya Saputra, Dinh Thai Hoang, Diep N. Nguyen, and Eryk Dutkiewicz. A novel
mobile edge network architecture with joint caching-delivering and horizontal cooperation.
IEEE Trans. Mob. Comput., 20(1):19–31, 2021.
7. Tuyen X. Tran and Dario Pompili. Adaptive bitrate video caching and processing in mobile-
edge computing networks. IEEE Trans. Mob. Comput., 18(9):1965–1978, 2019.
8. Yong Xiao and Marwan Krunz. QoE and Power Efficiency Tradeoff for Fog Computing
Networks with Fog Node Cooperation. In IEEE Conference on Computer Communications,
INFOCOM, Atlanta, GA, USA, May 1-4, pages 1–9, 2017.
9. Qixia Hao, Jiaxin Zeng, Xiaobo Zhou, and Tie Qiu. Freshness-aware high definition map
caching with distributed MAMAB in internet of vehicles. In Lei Wang, Michael Segal,
Jenhui Chen, and Tie Qiu, editors, Wireless Algorithms, Systems, and Applications—17th
International Conference, WASA 2022, Dalian, China, November 24-26, 2022, Proceedings,
Part III, volume 13473 of Lecture Notes in Computer Science, pages 273–284. Springer, 2022.

10. Jiaxin Zeng, Xiaobo Zhou, and Keqiu Li. MADRL-based joint edge caching and bitrate
selection for multicategory 360-degree video streaming. IEEE Internet of Things Journal,
pages 1–1, 2023.
11. Jeffrey Minoru Adachi. Accuracy of global navigation satellite system based positioning using
high definition map based localization, 2021.
12. Zhou Su, Yilong Hui, Qichao Xu, Tingting Yang, Jianyi Liu, and Yunjian Jia. An edge caching
scheme to distribute content in vehicular networks. IEEE Trans. Veh. Technol., 67(6):5346–
5356, 2018.
13. Georgios S. Paschos, George Iosifidis, Meixia Tao, Don Towsley, and Giuseppe Caire. The
role of caching in future communication systems and networks. IEEE J. Sel. Areas Commun.,
36(6):1111–1125, 2018.
14. Lei Yang, Lingling Zhang, Zongjian He, Jiannong Cao, and Weigang Wu. Efficient hybrid
data dissemination for edge-assisted automated driving. IEEE Internet Things J., 7(1):148–
159, 2020.
15. Xiaoge Huang, Ke Xu, Qianbin Chen, and Jie Zhang. Delay-aware caching in internet-of-
vehicles networks. IEEE Internet Things J., 8(13):10911–10921, 2021.
16. Shiyu Tang, Ali Alnoman, Alagan Anpalagan, and Isaac Woungang. A user-centric cooperative
edge caching scheme for minimizing delay in 5g content delivery networks. Trans. Emerg.
Telecommun. Technol., 29(8), 2018.
17. Yunzhu Wu, Yan Shi, Zixuan Li, and Shanzhi Chen. A cluster-based data offloading strategy for
high definition map application. In 91st IEEE Vehicular Technology Conference, VTC Spring
2020, Antwerp, Belgium, May 25-28, 2020, pages 1–5, 2020.
18. Xianzhe Xu, Shuai Gao, and Meixia Tao. Distributed online caching for high-definition maps
in autonomous driving systems. IEEE Wirel. Commun. Lett., 10(7):1390–1394, 2021.
19. Rong Liu, Jinling Wang, and Bingqi Zhang. High definition map for automated driving:
Overview and analysis. Journal of Navigation, 73(2):1–18, 2019.
20. Sanjit Krishnan Kaul, Roy D. Yates, and Marco Gruteser. Real-time status: How often should
one update? In Albert G. Greenberg and Kazem Sohraby, editors, Proceedings of the IEEE
INFOCOM 2012, Orlando, FL, USA, March 25-30, 2012, pages 2731–2735, 2012.
21. Penglin Dai, Kaiwen Hu, Xiao Wu, Huanlai Xing, and Zhaofei Yu. Asynchronous deep
reinforcement learning for data-driven task offloading in MEC-empowered vehicular networks.
In 40th IEEE Conference on Computer Communications, INFOCOM 2021, Vancouver, BC,
Canada, May 10-13, 2021, pages 1–10, 2021.
22. H. Bellini. The real deal with virtual and augmented reality. Available: http://www.goldmansachs.com/our-thinking/pages/virtual-and-augmented-reality.html.
23. Yuanxing Zhang, Pengyu Zhao, Kaigui Bian, Yunxin Liu, Lingyang Song, and Xiaoming
Li. DRL360: 360-degree video streaming with deep reinforcement learning. In 2019 IEEE
Conference on Computer Communications, INFOCOM 2019, Paris, France, April 29–May 2,
2019, pages 1252–1260, 2019.
24. Xiaosong Gao, Jiaxin Zeng, Xiaobo Zhou, Tie Qiu, and Keqiu Li. Soft actor-critic algorithm for
360-degree video streaming with long-term viewport prediction. In International Conference
on Mobility, Sensing and Networking, 2021.
25. Xueshi Hou, Sujit Dey, Jianzhong Zhang, and Madhukar Budagavi. Predictive adaptive stream-
ing to enable mobile 360-degree and VR experiences. IEEE Transactions on Multimedia,
23:716–731, 2021.
26. Yuanxing Zhang, Yushuo Guan, Kaigui Bian, Yunxin Liu, Hu Tuo, Lingyang Song, and
Xiaoming Li. EPASS360: QoE-aware 360-degree video streaming over mobile devices. IEEE
Trans. Mob. Comput., 20(7):2338–2353, 2021.
27. Xuekai Wei, Mingliang Zhou, Sam Kwong, Hui Yuan, and Weijia Jia. A hybrid control scheme
for 360-degree dynamic adaptive video streaming over mobile devices. IEEE Transactions on
Mobile Computing, 2021.

28. Anahita Mahzari, Afshin Taghavi Nasrabadi, Aliehsan Samiei, and Ravi Prakash. FoV-aware
edge caching for adaptive 360-degree video streaming. In Proceedings of the 26th ACM
International Conference on Multimedia, 2018.
29. Liyang Sun, Fanyi Duanmu, Yong Liu, Yao Wang, Yinghua Ye, Hang Shi, and David Dai. A
two-tier system for on-demand streaming of 360 degree video over dynamic networks. IEEE
Journal on Emerging and Selected Topics in Circuits and Systems, 9:43–57, 2019.
30. Zhiqian Jiang, Xu Zhang, Yiling Xu, Zhan Ma, Jun Sun, and Yunfei Zhang. Reinforcement
learning based rate adaptation for 360-degree video streaming. IEEE Transactions on Broad-
casting, 67(2):409–423, 2021.
31. Shibo Wang, Shusen Yang, Hailiang Li, Xiaodan Zhang, Chen Zhou, Chenren Xu, Feng Qian,
Nanbi Wang, and Zongben Xu. SalientVR: saliency-driven mobile 360-degree video streaming
with gaze information. In Proceedings of the ACM International Conference on Mobile Computing and Networking (MobiCom), 2022.
32. Shibo Wang, Shusen Yang, Hairong Su, Cong Zhao, Chenren Xu, Feng Qian, Nanbin Wang,
and Zongben Xu. Robust saliency-driven quality adaptation for mobile 360-degree video
streaming. IEEE Transactions on Mobile Computing, 2023.
33. Ming Hu, Lifeng Wang, and Shi Jin. Two-tier 360-degree video delivery control in multiuser
immersive communications systems. IEEE Transactions on Vehicular Technology, 2022.
34. Xuekai Wei, Mingliang Zhou, and Weijia Jia. Towards low-latency and high-quality adaptive
360-degree streaming. IEEE Transactions on Industrial Informatics, 2022.
35. Xianda Chen, Tianxiang Tan, and Guohong Cao. Macrotile: Toward QoE-aware and energy-
efficient 360-degree video streaming. IEEE Transactions on Mobile Computing, 2022.
36. Pantelis Maniotis, Eirina Bourtsoulatze, and Nikolaos Thomos. Tile-based joint caching
and delivery of 360◦ videos in heterogeneous networks. IEEE Transactions on Multimedia,
22(9):2382–2395, 2020.
37. Yanwei Liu, Jinxia Liu, Antonios Argyriou, Liming Wang, and Zhen Xu. Rendering-aware
VR video caching over multi-cell MEC networks. IEEE Transactions on Vehicular Technology,
70(3):2728–2742, 2021.
38. Pantelis Maniotis and Nikolaos Thomos. Viewport-aware deep reinforcement learning
approach for 360◦ video caching. IEEE Transactions on Multimedia, 2021.
39. Qi Cheng, Hangguan Shan, Weihua Zhuang, Lu Yu, Zhaoyang Zhang, and Tony Q. S. Quek.
Design and analysis of MEC- and proactive caching-based 360 mobile VR video streaming.
IEEE Transactions on Multimedia, 2021.
40. Ivan Slivar, Mirko Suznjevic, and Lea Skorin-Kapov. Game categorization for deriving
QoE-driven video encoding configuration strategies for cloud gaming. ACM Transactions on
Multimedia Computing, Communications, and Applications, 2017.
41. Chunyu Qiao, Jiliang Wang, and Yunhao Liu. Beyond QoE: Diversity adaption in video
streaming at the edge. In Proceedings of 39th International Conference on Distributed
Computing Systems, 2019.
42. Guanghui Zhang, Jie Zhang, Yan Liu, Haibo Hu, Jack Lee, and Vaneet Aggarwal. Adaptive
video streaming with automatic quality-of-experience optimization. IEEE Transactions on
Mobile Computing, 2022.
43. Xiaoqi Yin, Abhishek Jindal, Vyas Sekar, and Bruno Sinopoli. A control-theoretic approach
for dynamic adaptive video streaming over http. In ACM SIGCOMM, 2015.
44. Chenglei Wu, Zhihao Tan, Zhi Wang, and Shiqiang Yang. A dataset for exploring user
behaviors in VR spherical video streaming. In Proceedings of the 8th ACM on Multimedia
Systems Conference, 2017.
45. Afshin Taghavi Nasrabadi, Aliehsan Samiei, Mylene C. Q. Farias, and Marcelo M. Carvalho.
A taxonomy and dataset for 360° videos. In Proceedings of the 10th ACM Multimedia
Systems Conference, 2019.
46. FFmpeg. About FFmpeg. Available: https://ffmpeg.org. [Online].
47. Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. Soft actor-critic: Off-policy
maximum entropy deep reinforcement learning with a stochastic actor. In Proceedings of the
35th International Conference on Machine Learning, 2018.

48. Nuowen Kan, Junni Zou, Chenglin Li, Wenrui Dai, and Hongkai Xiong. Rapt360: Reinforce-
ment learning-based rate adaptation for 360-degree video streaming with adaptive prediction
and tiling. IEEE Transactions on Circuits and Systems for Video Technology, 32(3):1607–1623,
2022.
49. Yongkai Huo and Hongye Kuang. Ts360: A two-stage deep reinforcement learning system for
360-degree video streaming. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), 2022.
50. Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan,
and et al. PyTorch: An imperative style, high-performance deep learning library. In Proceed-
ings of 33rd Conference on Neural Information Processing System, 2019.
Chapter 5
Service Migration in Industrial Edge Computing

In industrial edge computing systems, users, such as automatic inspection robots,
constantly move around the factory, which means the controller must frequently
update the service caching strategy. To deal with this issue, this chapter highlights
the interference of user mobility with caching and introduces the concept of
migration to fill this gap. We design an energy-efficient migration method based on
Lyapunov optimization to save the long-term energy consumption of the system.
Then, further taking the location privacy of the users into account, a location
privacy-aware migration method is designed to reduce the risk of users being
attacked. This chapter rigorously demonstrates the effectiveness and security of the
methods in multiuser scenarios. Overall, we suggest combining caching and
migration so that they can be applied to various complicated industrial scenarios.

5.1 Introduction

The surge in latency-sensitive services, such as AI control algorithms [1], is
propelled by the advancement of industrial edge computing systems. To match the
tight latency demands of various services, a heterogeneous dense cellular network
structure is employed to form 5G networks, with a density reaching 40–50 BSs
per square kilometer [2]. Meanwhile, advanced VM technologies enable a BS
to cache services to serve users. Normally, only a subset of services can
be deployed on a resource-limited edge server [3]. A potential solution involves
routing requests to neighboring BSs or the remote cloud server when the requested
services are not deployed locally, albeit at the expense of increased response latency.


As mentioned before, in a VM, each service preserves an instance for its serving
users, encompassing intermediate results or historical data. This enables all users to
enjoy extremely low latency [4–6]. In multiuser dense industrial edge computing
systems, however, the efficiency of caching may diminish under user mobility. As users
move among the coverage areas of BSs, a user usually connects to the BS from which it
receives the highest signal strength indication, i.e., signal handoff. This results in extra
energy consumption, since requests must be routed among BSs, as well as extra
communication latency, which may even reach unacceptable levels. Despite the option to
update service caching policies frequently, resource limitations in core networks
introduce extra latency and energy consumption for transmission between edge
servers and the cloud via backhaul links [7].
Driven by user mobility, services should seamlessly follow users, which is
referred to as service migration, thereby minimizing request routing latency [8–10].
However, this process consumes additional energy. Existing research focuses on
designing single-user migration strategies, which independently select the optimal
edge via a trajectory-prediction-based MDP of the user, without taking other users'
actions into account [11].
However, in multiuser industrial edge computing systems, this approach presents
two primary challenges. Firstly, the presence of uncertain interference among users,
denoted as resource conflict, may result in migration failures. When multiple users
independently make service migration decisions, their services may be migrated
to a BS that cannot bear the storage requirement. Additionally, shared
migration strategies for services among multiple users are inevitably ignored in
existing single-user strategies, resulting in the shortage and misuse of resources [12].
Secondly, migration effectiveness heavily depends on accurate trajectory
prediction. Existing strategies overlook how long users stay connected to their target BSs,
which can be efficiently reflected by their future trajectories. Moreover, the state space
expands dramatically with the increasing number of users, which further complicates
precise trajectory prediction [13].
This chapter mainly introduces migration toward low energy consumption
and high privacy within latency constraints. The Energy-efficient service
miGration for multiuser heterOgeneous dense cellular networks algorithm (EGO)
is introduced in Sect. 5.2. The objective is to achieve minimum overall energy
consumption while satisfying service latency requirements under limited resources.
Meanwhile, EGO also takes the interference among users into consideration to
improve system performance [14]. Section 5.3 introduces a location privacy-aware
migration algorithm that protects against attacks. To estimate the risks of
location privacy leakage, we propose a specific entropy-based location privacy
metric [15].

5.2 Energy-Efficient Migration Based on 3-Layer VM Architecture

5.2.1 Statement of Problem

Energy consumption in multiuser industrial edge computing systems has gained
significant importance. Surveys from MTN and Huawei [16, 17] show that a 5G
BS consumes two or more times the energy of a 4G BS.
Projections suggest that this consumption could account for 18% of operational
expenses in Europe and 32% in India [18]. Especially with a high density of
BSs, service migration may occur more frequently, resulting in substantial energy
consumption. Presently, there is a growing focus on energy consumption in
Next-Generation Networks (NGNs), highlighting the necessity of designing an
energy-efficient migration strategy.

5.2.2 Energy-Efficient Service Migration Model Under 3-Layer VM Architecture

Fig. 5.1 Service migration illustration in multiuser industrial edge computing systems

Figure 5.1 shows a typical multiuser industrial edge computing system, including
several BSs and users distributed over a geographical area. The BSs, each equipped
with an edge server, are deployed with great overlap of their coverage
regions due to their high spatial density. It also means that mobile users are always
located within the serving coverage of multiple BSs. Generally, a user receives different
signal strengths from different BSs, influenced by distance, bandwidth, and so on,
and connects to the BS with the best signal [19]. The BSs are interconnected via
wireless links, allowing users to access services across the network by request
routing. Note that a change of the user's connected BS does not necessarily mean that a
migration will occur.
The services are hosted in VMs on edge servers with limited resources; migrating a
service incurs more energy consumption than request routing but yields lower latency
afterward. A service can be abstracted into a 3-layer VM model [20], sketched in code
after the following list:
• Base layer: An operating system that provides basic support and is always
employed in the BSs to build the VM.
• Application layer: The essential data to provide the service for users, which is
shared by multiple users.
• Instance layer: Individual state, e.g., historical pattern, privacy, etc.
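A minimal sketch of the 3-layer abstraction as a data structure; all field and function names are illustrative assumptions.

    from dataclasses import dataclass, field

    @dataclass
    class ServiceVM:
        """Sketch of the 3-layer VM model of a service."""
        base_os: str                                   # base layer (shared OS)
        app_data_size: float                           # application layer size
        instances: dict = field(default_factory=dict)  # per-user state sizes

    def layers_to_migrate(target_has_app, stateless=False):
        """Only the layers missing at the target BS need to be transferred."""
        layers = [] if stateless else ["instance"]
        if not target_has_app:
            layers.append("application")  # base layer is assumed pre-deployed
        return layers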
When a user generates a service request, it must be served by a BS that possesses
both the corresponding application layer and the user's instance layer. If the target
edge server already hosts the required application layer, there is no need to spend
time transmitting the application data again; in this case, only the instance layer,
with its small data size, must be migrated from the previous serving node. Otherwise,
both layers need to be migrated. Also, in the special case of stateless services, we set
the data size of the instance layer to zero. The migration process is exemplified in a
simplified scenario depicted in Fig. 5.1.
The system involves three BSs and two users, named Emma and Steve. The
services deployed on $BS_1$, $BS_2$, and $BS_3$ are {Steam}, {Steam, Facebook},
and {AR}, respectively. Initially, Emma is covered by $BS_1$, and Steve is covered by
$BS_2$. Emma generates requests for the Steam service, while Steve requests Steam
and Facebook. As they move, Emma's connectivity switches to $BS_2$, and Steve's to
$BS_3$. It can be seen that $BS_2$ hosts the service Steam, and thus only the instance layer
data needs to be migrated for Emma, which greatly reduces response latency.
Meanwhile, Steve continues to request Steam from $BS_2$ through request routing.
Alternatively, migrating the application and instance layers of Facebook to $BS_3$ allows
Steve to be served locally.
In industrial edge computing systems, the following questions must be addressed
when minimizing the average energy consumption:
• Which service should be migrated to which BS, considering user mobility, BS heterogeneity,
service latency deadlines, and interference among users?
• Which layers, i.e., the application and/or instance layer, should be migrated?
In multiuser industrial edge computing systems, there are N BSs and U users.
In time slot $t \in \{0, 1, \cdots, T\}$, each user requests a certain service $m \in
\{1, 2, \cdots, M\}$ based on its historical request statistics, where one time slot lasts
for $\tau$.

We use a U-by-N matrix $x_t^c$ to represent whether or not user u connects to BS n in
time slot t. Here, $x_t^c(u,n) = 1$ indicates that user u connects to BS n, while $x_t^c(u,n) = 0$
indicates that it does not. Note that a user can connect to at most one BS, i.e.,

$\sum_{n=1}^{N} x_t^c(u,n) = 1, \ \forall t, u. \quad (5.1)$

Generally, the controller determines the BS n from which user u can request service m
in the next time slot $t+1$, denoted by a U-by-N matrix $x_{m,t}$, where $x_{m,t}(u,n) = 1$
means that service m is provided by BS n to user u in time slot t. Also, let a U-by-N
matrix $x'_{m,t}$ represent the migration decision (i.e., the target BS) in time slot t, which
satisfies

$x_{m,t+1}(u,n) = x'_{m,t}(u,n). \quad (5.2)$

The physical meaning of $x_{m,t}(u,n) = 1$ is that BS n has cached the instance layer
of service m for the user in time slot t. Since the user requests the service from one
certain server or does not request the service at all, the migration decision should satisfy

$\sum_{n=1}^{N} x'_{m,t}(u,n) \leq 1. \quad (5.3)$

Additionally, as the application layer is shared among users, let $P_t =
\{P_t(m,n) \mid m = 1, \cdots, M; n = 1, \cdots, N\}$ denote the state of the applications
on the edge servers, i.e.,

$P_t(m,n) = \min\left\{1, \sum_{u=1}^{U} x_{m,t}(u,n)\right\}. \quad (5.4)$

More specifically, $P_t(m,n) = 1$ denotes that service m is cached on BS n in time slot t.
Similarly, $P'_t$ is defined as the application state after migration based on
$x'_{m,t}(u,n)$, i.e.,

$P'_t(m,n) = \min\left\{1, \sum_{u=1}^{U} x'_{m,t}(u,n)\right\}. \quad (5.5)$

Normally, services that achieve different functions have different requirements
in CPU cycles, energy consumption, and so on. In this book, we abstract
service m into an 8-tuple $\langle \lambda_m, \gamma_m, D_m, f_{m,u}, \theta_m^A, \theta_{m,u}^I, W_{m,u}, \omega_m \rangle$, i.e.:

• $\lambda_m$: data size.
• $\gamma_m$: computation density of the request, in CPU Kcycles/bit.
• $D_m$: maximum tolerable latency.
• $f_{m,u}$: CPU cycle requirement for user u.
• $\theta_m^A$/$\theta_{m,u}^I$: the application/instance layer data size for user u.
• $W_{m,u}$: channel bandwidth requirement.
• $\omega_m$: the number of requests per time slot, also known as the service frequency.
Let $F_n$, $S_n$, and $W_n$ denote the maximum CPU frequency, storage size, and
bandwidth of BS n. With the requirements of service m and the capacity of BS n,
we can determine whether or not the migration decisions break the corresponding
computation, storage, and channel bandwidth constraints, i.e.,

$\sum_{m=1}^{M} \sum_{u=1}^{U} f_{m,u} x'_{m,t}(u,n) \leq F_n, \ \forall t, n, \quad (5.6)$

$\sum_{m=1}^{M} \left[ \sum_{u=1}^{U} \theta_{m,u}^{I} x'_{m,t}(u,n) + \theta_m^A P'_t(m,n) \right] \leq S_n, \ \forall t, n, \quad (5.7)$

$\sum_{m=1}^{M} \sum_{u=1}^{U} W_{m,u} x'_{m,t}(u,n) \leq W_n, \ \forall t, n, \quad (5.8)$

respectively.
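A minimal sketch of the feasibility checks of Eqs. (5.6)–(5.8) follows; the Service profile mirrors the 8-tuple above, and all names are illustrative assumptions.

    from dataclasses import dataclass

    @dataclass
    class Service:
        """Sketch of the 8-tuple profile of service m."""
        lam: float        # data size lambda_m
        gamma: float      # computation density (Kcycles/bit)
        D: float          # latency deadline D_m
        f: dict           # per-user CPU requirement f_{m,u}
        theta_A: float    # application layer size
        theta_I: dict     # per-user instance layer size
        W: dict           # per-user bandwidth requirement W_{m,u}
        omega: int        # requests per time slot (service frequency)

    def feasible(services, x_prime, P_prime, F_n, S_n, W_n):
        """Check Eqs. (5.6)-(5.8) for one BS n, with x_prime[m][u] in {0, 1}
        the migration decision and P_prime[m] in {0, 1} the application state."""
        cpu = sum(s.f[u] * x_prime[m][u]
                  for m, s in services.items() for u in s.f)
        sto = sum(sum(s.theta_I[u] * x_prime[m][u] for u in s.theta_I)
                  + s.theta_A * P_prime[m] for m, s in services.items())
        bw = sum(s.W[u] * x_prime[m][u]
                 for m, s in services.items() for u in s.W)
        return cpu <= F_n and sto <= S_n and bw <= W_n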

5.2.2.1 Service Latency

To accurately estimate the transmission latency in the real world, we introduce a
widely used nonlinear function [21], which assumes a Rayleigh fading channel. It takes
several relevant factors into account, including the transmission power $p_a$, the
noise power $N_0$, the channel bandwidth W, the complex fading vector h, and the
reference distance $d(a,b)$ between nodes a and b. The transmission
rate $R(a,b)$ between nodes a and b is

$R(a,b) = W \log_2\left(1 + \frac{p_a h d(a,b)^{-3}}{N_0}\right). \quad (5.9)$

The transmission latency for user u is composed of the latency for sending requests
and that for routing requests from the user's connected BS to its serving BS, i.e.,

$l_{m,t}^{tra}(u) = \frac{\lambda_m}{R(\pi_{t,u}^c, \pi_{m,t,u})} + C, \quad (5.10)$
where

$\pi_{t,u}^c = \arg\max_n x_t^c(u,n), \quad (5.11)$

$\pi_{m,t,u} = \arg\max_n x_{m,t}(u,n). \quad (5.12)$

Note that it is difficult to obtain an exact measure of the transmission latency from user
to BS owing to users' mobility, and thus we approximate this latency by a constant
value C.
Also, we can calculate the computation latency as

$l_{m,t}^{com}(u) = \frac{\lambda_m \gamma_m}{f_{m,u}}. \quad (5.13)$

We use $\pi'_{m,t,u} = \arg\max_n x'_{m,t}(u,n)$ to represent the migration target BS.
Benefiting from the 3-layer VM model, a service's application layer that is not on BS
$\pi'_{m,t,u}$ can be downloaded from a nearby BS, while the ongoing service is
maintained, where the latency is

$l_{m,t}^{app}(n) = \min\left\{ \frac{\theta_m^A \left| P'_t(m,n) - P_t(m,n) \right|}{P'_t(m,n') R(n,n')} \;\Big|\; n' = 1, 2, \cdots, N \right\}. \quad (5.14)$

Actually, the service migration latency should only consider the instance layer
transmission latency as the application layer can be transmitted beforehand, i.e.,

I
θm,u
ins
lm,t
. (u) = ' . (5.15)
R(πm,t,u , πm,t,u )

'
Here, .πm,t,u = πm,t,u indicates that the service m is not migrated and its migration
latency is equal to zero. As each time slot may deal with multiple requests,
determined by the service frequency, the total service latency .Lm,t (u) can be
expressed as
⎾ ⏋
Lm,t (u) = ωm lm,t
.
tra
(u) + lm,t
com
(u) + lm,t
ins
(u). (5.16)

As the user must receive the results before the service deadline, the latency must be
less than the threshold .Dm , i.e.,

Lm,t (u) ≤ Dm , ∀u, m, t.


. (5.17)
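The latency model of Eqs. (5.9)–(5.16) can be sketched as follows; the helper names, the default constant C, and the bracketing of Eq. (5.16) (migration latency counted once per slot) are illustrative assumptions.

    import math

    def rate(p_a, h, d_ab, W, N0):
        """Eq. (5.9): transmission rate between nodes a and b."""
        return W * math.log2(1 + p_a * h * d_ab ** -3 / N0)

    def service_latency(lam, gamma, f_mu, theta_I, omega,
                        R_route, R_mig, C=0.01, migrated=True):
        """Sketch of Eqs. (5.10), (5.13), (5.15), and (5.16) for one user.
        R_route: rate on the request-routing path between connected and
        serving BS; R_mig: rate between source and target BS."""
        l_tra = lam / R_route + C          # Eq. (5.10); C approximates the
                                           # user-to-BS transmission latency
        l_com = lam * gamma / f_mu         # Eq. (5.13)
        l_ins = theta_I / R_mig if migrated else 0.0   # Eq. (5.15)
        return omega * (l_tra + l_com) + l_ins         # Eq. (5.16)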

5.2.2.2 Energy Consumption

Next, we detail the calculation of the critical energy consumption, which includes the
transmission energy

$E_{m,t}^{tra} = \sum_{u=1}^{U} p_{\pi_{m,t,u}} \omega_m l_{m}^{tra}(u), \quad (5.18)$

the computing energy consumption

$E_{m,t}^{com} = \kappa \omega_m \sum_{u=1}^{U} f_{m,u}^{3} l_{m,t}^{com}(u), \quad (5.19)$

and the migration energy consumption

$E_{m,t}^{mig} = \sum_{u=1}^{U} \sum_{n=1}^{N} p_{\pi'_{m,t,u}} \left[ l_{m,t}^{ins}(u) + l_{m,t}^{app}(n) \right]. \quad (5.20)$

Here, $\kappa$ is the unit energy consumption for processing one CPU cycle on BS n.
The migration process typically involves transmitting a larger amount of data
compared to regular service requests, resulting in increased energy consumption,
as well as latency. However, after migration, the energy consumption for routing
requests, along with the service latency, is significantly reduced in subsequent time
slots. Consequently, with optimal migration decisions, the cost associated with
migration is offset, leading to overall performance improvements. The total energy
consumption can be calculated by

$E_{m,t} = E_{m,t}^{tra} + E_{m,t}^{com} + E_{m,t}^{mig}. \quad (5.21)$

5.2.2.3 Problem Formulation

The objective is to achieve the minimum average energy consumption of migration
decisions $x'_{m,t}$ under the latency deadline requirements and limited resources, which is
formulated as the optimization problem below:

$P1: \min_{x'_{m,t}} \frac{1}{T} \sum_{m=1}^{M} \sum_{t=0}^{T-1} E_{m,t} \quad (5.22)$

$\text{s.t.} \ l_{m,t}^{app}(n) \leq \tau, \quad (5.23)$

$(5.1), (5.3), (5.6), (5.7), (5.8), (5.17).$

Constraint (5.23) guarantees that the ongoing service will not be disrupted by
application layer migration. It is noteworthy that the absence of future information
poses a challenge in deriving the optimal solution. In other words, solving P1
optimally demands comprehensive offline information, e.g., the historical trajectories
and the service request preferences of users, which is challenging to acquire.
Furthermore, even with known offline information, P1 remains an NP-hard Mixed-
Integer Nonlinear Programming (MINP) problem.
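Before moving on, a minimal sketch of the energy terms of Eqs. (5.18)–(5.21) entering P1's objective, restricted to one user of service m; all latency inputs are assumed precomputed (e.g., via the latency sketch above).

    def service_energy(p_serve, p_target, kappa, omega, f_mu,
                       l_tra, l_com, l_ins, l_app_total):
        """Sketch of Eqs. (5.18)-(5.21) for one user of service m."""
        e_tra = p_serve * omega * l_tra            # Eq. (5.18): transmission
        e_com = kappa * omega * f_mu ** 3 * l_com  # Eq. (5.19): computation
        e_mig = p_target * (l_ins + l_app_total)   # Eq. (5.20): migration
        return e_tra + e_com + e_mig               # Eq. (5.21)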

5.2.3 Lyapunov Optimization

In order to solve P1 with low complexity, we decouple it into M subproblems, each
of which makes migration decisions for a different service with certain allocated
resources. Let $\alpha_m(t,n)$, $\beta_m(t,n)$, and $\delta_m(t,n)$ represent the percentages of
computation, storage, and bandwidth resources allocated to service m, respectively. They
are calculated as follows:

$\alpha_m(t,n) = \frac{\omega_m E\{f_{m,u}^2 \mid x'_{m,t-1}(u,n) = 1, \forall u\}}{\sum_{m'=1}^{M} \omega_{m'} E\{f_{m',u}^2 \mid x'_{m',t-1}(u,n) = 1, \forall u\}}, \quad (5.24)$

$\beta_m(t,n) = \frac{\omega_m \left[\theta_m^A + E\{\theta_{m,u}^I \mid x'_{m,t-1}(u,n) = 1, \forall u\}\right]}{\sum_{m'=1}^{M} \omega_{m'} \left[\theta_{m'}^A + E\{\theta_{m',u}^I \mid x'_{m',t-1}(u,n) = 1, \forall u\}\right]}, \quad (5.25)$

$\delta_m(t,n) = \frac{\omega_m E\{W_{m,u}^2 \mid x'_{m,t-1}(u,n) = 1, \forall u\}}{\sum_{m'=1}^{M} \omega_{m'} E\{W_{m',u}^2 \mid x'_{m',t-1}(u,n) = 1, \forall u\}}. \quad (5.26)$

Based on this, the problem is decoupled into P2, i.e.,

$P2: \min_{x'_{m,t}} \frac{1}{T} \sum_{t=0}^{T-1} E_{m,t} \quad (5.27)$

$\text{s.t.} \ (5.1), (5.3), (5.17), (5.23), \ \forall m,$

$\sum_{u=1}^{U} f_{m,u} x'_{m,t}(u,n) = y_{n,t}^{f}(m) + \alpha_m(t,n) F_n, \quad (5.28)$

$\sum_{u=1}^{U} \theta_{m,u}^{I} x'_{m,t}(u,n) + \theta_m^A P'_t(m,n) = y_{n,t}^{s}(m) + \beta_m(t,n) S_n, \quad (5.29)$

$\sum_{u=1}^{U} W_{m,u} x'_{m,t}(u,n) = y_{n,t}^{w}(m) + \delta_m(t,n) W_n, \quad (5.30)$

where $y_{n,t}^{*}(m), * \in \{f, s, w\}$, are fine-grained tuning factors, and suitably setting
them helps precisely define the resources occupied by service m. The right-hand
terms of Eqs. (5.28), (5.29), and (5.30) limit the maximum available computation,
storage, and bandwidth resources in an elastic manner. In this case, the optimal
solution of P2 will approach that of P1.
Nevertheless, it is difficult to determine suitable values of $y_{n,t}^{*}(m), * \in \{f, s, w\}$,
in advance. Therefore, the Lyapunov optimization technique is utilized to build a
virtual resource queue for estimating whether the values are suitable, i.e., exhibit
little deviation from the expected resource utilization:

$q_{n,t}^{*}(m+1) = \max\left\{ q_{n,t}^{*}(m) + y_{n,t}^{*}(m), 0 \right\}, \ * \in \mathcal{F} = \{f, s, w\}. \quad (5.31)$
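The virtual-queue recursion of Eq. (5.31) is a one-line update per resource type; a minimal sketch:

    def update_resource_queues(q, y):
        """Eq. (5.31): q and y map each resource type in {'f', 's', 'w'} to
        its current backlog and deviation; backlogs are truncated at zero."""
        return {r: max(q[r] + y[r], 0.0) for r in ('f', 's', 'w')}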

The objective is to harmonize the resource utilization per service with the collective
utilization across all services. It minimizes energy consumption by optimizing
resource allocation in a way that controls the resources allocated to any single
service. A metric widely used to estimate the congestion level of resource
allocation is the quadratic Lyapunov function $L(x) \triangleq \frac{1}{2}x^2$:

$L(Q(m)) \triangleq \frac{1}{2T} \sum_{t=0}^{T-1} \sum_{n=1}^{N} \sum_{* \in \mathcal{F}} \left[ q_{n,t}^{*}(m) \right]^2. \quad (5.32)$

Equation (5.32) builds a virtual resource queue for services, where a small backlog
$L(Q(m))$ reflects plenty of resources, as well as high stability. The corresponding
one-slot conditional drift that shifts the quadratic Lyapunov function toward low
congestion is

$\Delta_1^R(Q(m)) \triangleq L(Q(m+1)) - L(Q(m)) \leq \frac{1}{2T} \sum_{t=0}^{T-1} \sum_{n=1}^{N} \sum_{* \in \mathcal{F}} \left[ y_{n,t}^{*}(m) \right]^2 + \frac{1}{T} \sum_{t=0}^{T-1} \sum_{n=1}^{N} \sum_{* \in \mathcal{F}} q_{n,t}^{*}(m) y_{n,t}^{*}(m) \leq B + \frac{1}{T} \sum_{t=0}^{T-1} \sum_{n=1}^{N} \sum_{* \in \mathcal{F}} q_{n,t}^{*}(m) y_{n,t}^{*}(m), \quad (5.33)$

where $B = \sum_{* \in \mathcal{F}} B_*$, with

$B_* = \sup\left\{ \frac{1}{2T} \sum_{t=0}^{T-1} \sum_{n=1}^{N} \left[ y_{n,t}^{*}(m) \right]^2 \right\}. \quad (5.34)$

Minimizing Eq. (5.32) ensures that the total resource utilization does not exceed the
resource bound, i.e., minimizing the per-time-slot drift-plus-penalty function

$\Delta_1^R(Q(m)) + \frac{V}{T} \sum_{t=0}^{T-1} E_{m,t} \leq \frac{V}{T} \sum_{t=0}^{T-1} E_{m,t} + B + \frac{1}{T} \sum_{t=0}^{T-1} \sum_{n=1}^{N} \sum_{* \in \mathcal{F}} q_{n,t}^{*}(m) y_{n,t}^{*}(m). \quad (5.35)$

Here, $V > 0$ serves as a parameter for balancing energy consumption and resource
utilization. With the above queues, we can decompose P2 into multiple service-
oriented subproblems P3, where the elastic resource allocation is taken into account.
The goal is to find a state of resource allocation that mitigates the conflict between
energy consumption and resource utilization. Hence, we obtain the migration
decisions for each service by solving P3, i.e.,

$P3: \min_{x'_{m,t}} \frac{1}{T} \sum_{t=0}^{T-1} \left[ V E_{m,t} + \sum_{n=1}^{N} Q_{n,t}(m) \right]$

$\text{s.t.} \ (5.1), (5.3), (5.17), (5.23), \ \forall m, \quad (5.36)$

where

$Q_{n,t}(m) = q_{n,t}^{f}(m) \sum_{u=1}^{U} f_{m,u} x'_{m,t}(u,n) + q_{n,t}^{s}(m) \left[ \sum_{u=1}^{U} \theta_{m,u}^{I} x'_{m,t}(u,n) + \theta_m^A P'_t(m,n) \right] + q_{n,t}^{w}(m) \sum_{u=1}^{U} W_{m,u} x'_{m,t}(u,n). \quad (5.37)$

It can be seen that P3 is an offline problem that inherently considers long-term
energy consumption and thus cannot be solved online, while, in practice, the decision
regarding service migration must be made instantly without access to future
information. To tackle this challenge, we further utilize Lyapunov optimization to
transform P3 into T per-slot subproblems.

Let $E_{m,t}^{mig}$ and $\hat{e}_m(t)$ represent the actual and expected migration energy
consumption, where

$\hat{e}_m(t) = \frac{1}{t} \sum_{t'=0}^{t-1} E_{m,t'}^{mig}. \quad (5.38)$

The difference between them in time slot t is

$y_m^e(t) = E_{m,t}^{mig} - \hat{e}_m(t). \quad (5.39)$

Based on Eq. (5.39), we initialize an empty energy queue

$e_m(t+1) = \max\{e_m(t) + y_m^e(t), 0\}; \quad (5.40)$

more specifically, $e_m(0) = 0$. When $e_m(t) \gg 0$, less energy should be
consumed for migration in the future. In this case, the migration probability is
restricted, i.e., the value of $E_{m,t}^{mig}$ is driven toward 0, to compensate for the energy
deficiency and thus stabilize the energy queue, and vice versa. The Lyapunov function of
the migration energy consumption is expressed as

$L(e_m(t)) = \frac{1}{2}\left[ e_m(t) \right]^2. \quad (5.41)$

Also, we use $B_e$ to denote the upper bound of $\frac{1}{2} y_m^e(t)^2$. Similar to Eq. (5.33), the
one-slot conditional Lyapunov drift is

$\Delta_1^E(e_m(t)) \triangleq L(e_m(t+1)) - L(e_m(t)) \leq \frac{1}{2} y_m^e(t)^2 + e_m(t) y_m^e(t) \leq B_e + e_m(t) y_m^e(t).$

To ensure the stability of the energy queue and the optimality of the solution of
P3, we minimize a supremum bound of the drift-plus-penalty term:

$\Delta_1^E(e_m(t)) + V'\left[ V E_{m,t} + \sum_{n=1}^{N} Q_{n,t}(m) \right] \leq B_e + e_m(t) y_m^e(t) + V'\left[ V E_{m,t} + \sum_{n=1}^{N} Q_{n,t}(m) \right] = B_e + e_m(t) E_{m,t}^{mig} - e_m(t) \hat{e}_m(t) + V'\left[ V E_{m,t} + \sum_{n=1}^{N} Q_{n,t}(m) \right], \quad (5.42)$

where $B_e$ and $e_m(t)\hat{e}_m(t)$ are two constants.



Therefore, we obtain the final migration decisions in an online manner by solving
P4, i.e.,

$P4: \min_{x'_{m,t}} V'\left[ V E_{m,t} + \sum_{n=1}^{N} Q_{n,t}(m) \right] + e_m(t) E_{m,t}^{mig}$

$\text{s.t.} \ (5.1), (5.3), (5.17), (5.23), \ \forall m, t. \quad (5.43)$

$V' > 0$ is a positive weighting parameter that makes a trade-off between the
objective value and the energy queue stability influenced by migration. Hence, the
energy queue enables the BSs to save total energy without future information
and thus approximate optimal decision-making. Note that, to simplify the
expressions in the following section, we let $\bar{z}_{m,t}(x'_{m,t})$ denote the objective function of
P4.

EGO involves several steps: In times slot t, the controller updates the service
placement .P t . The resource queues of services are also updated based on Eq. (5.31).
After this step, .P4 is solved using a modified Particle Swarm Optimization (PSO)
algorithm, as detailed in Sect. 5.2.4. Subsequently, the energy queue is updated,
and finally, we use the migration decisions from .P4 to update the next system
environment.
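A high-level Python rendering of this per-slot loop is given below. The callables passed in are assumed hooks standing in for the components described in this section, and EnergyQueue refers to the sketch shown earlier; the function is a structural outline, not a complete implementation.

```python
def ego_online_loop(T, V, V_prime, update_placement, update_resource_queues,
                    solve_p4, apply_migration, step_env):
    """Illustrative EGO control loop; one iteration per time slot t."""
    energy_queue = EnergyQueue()                     # Eqs. (5.38)-(5.40)
    for t in range(T):
        placement = update_placement(t)              # refresh service placement P_t
        queues = update_resource_queues(t)           # resource queues, Eq. (5.31)
        # Solve P4 with the modified PSO of Sect. 5.2.4.
        x_migration = solve_p4(placement, queues, energy_queue.backlog, V, V_prime)
        e_mig = apply_migration(x_migration)         # measured migration energy
        energy_queue.update(e_mig)                   # stabilize the energy queue
        step_env(t, x_migration)                     # advance to the next slot
```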

5.2.4 Probabilistic Particle Swarm Optimization Algorithm

Taking the NP-hard nature of P4 into account, we propose a modified PSO method. The traditional PSO algorithm operates by constructing a population, known as a swarm, consisting of candidate solutions referred to as particles. These particles, denoted as $x_k$ for $k = 1, \cdots, K$, navigate within the search space following the formulas below. The velocity $v_k$ of particle k is influenced by the best particle position $p_k$ and the best group position $g$:

$$v_k^{i+1} = \omega v_k^{i} + c_1 r_1 \left( p_k^{i} - x_k^{i} \right) + c_2 r_2 \left( g^{i} - x_k^{i} \right), \tag{5.44}$$

where i is the iteration index, $\omega$ is the inertia weight, $c_1$ and $c_2$ are the learning factors, and $r_1, r_2 \in [0, 1]$ are random numbers. In Eq. (5.44), $v_k^{i+1}$ comprises three distinct components. First, the inertia component reflects particle k's inclination to preserve its current velocity. Second, the cognition component indicates that particle k leans toward its local optimum. Last, the society component suggests that particle k gravitates toward the global optimum. The movement of particle $x_k^{i}$ is then determined by these components:

$$x_k^{i+1} = x_k^{i} + v_k^{i}. \tag{5.45}$$

The above steps are repeated to find a solution with the best fitness indicator, i.e., $\bar{z}_{m,t}(x_k^{i})$, whose value is negatively correlated with the fitness. Based on this, the particle position and group position are found by

$$p_k^{i} = \arg\min_{p_k^{i'}} \bar{z}_{m,t}\left(p_k^{i'}\right), \quad i' = 1, \cdots, i-1, \tag{5.46}$$

and

$$g^{i} = \begin{cases} g^{i-1}, & \bar{z}_{m,t}(\hat{p}_k^{i}) > \bar{z}_{m,t}(g^{i-1}), \\ \hat{p}_k^{i}, & \text{otherwise}, \end{cases} \tag{5.47}$$

where

$$\hat{p}_k^{i} = \arg\min_{p_k^{i}} \bar{z}_{m,t}\left(p_k^{i}\right), \quad k = 1, \cdots, K. \tag{5.48}$$

Generally, the particles constantly move around the space according to their best individual positions and the collective population position $g^{i}$. This process continues until convergence, specifically when $\max \bar{z}_{m,t}(p_k^{i}) - \bar{z}_{m,t}(g^{i}) < \varepsilon$, or when a predefined number of iterations is reached.
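For reference, a minimal continuous-space PSO step implementing Eqs. (5.44) and (5.45) can be written as follows; the parameter values and array shapes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_step(x, v, p_best, g_best, omega=0.7, c1=1.5, c2=1.5):
    """One velocity and position update per Eqs. (5.44)-(5.45).

    x, v   : (K, D) particle positions and velocities.
    p_best : (K, D) best position visited by each particle.
    g_best : (D,) best position visited by the swarm.
    """
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v_next = omega * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
    return x + v_next, v_next        # position update, Eq. (5.45)

# Toy usage with 5 particles in a 3-dimensional search space.
x = rng.random((5, 3))
v = np.zeros((5, 3))
x, v = pso_step(x, v, p_best=x.copy(), g_best=x[0])
```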
The particle $x_{k,m,t}^{i}$ can be regarded as a potential optimal solution, $x_{k,m,t}^{i} \in \{0, 1\}^{U \times N}$. This binary structure prevents $x_{k,m,t}^{i}$ from being updated through the traditional PSO update rule, i.e., Eqs. (5.44) and (5.45). To deal with this issue, two modifications of PSO are made:
(1) A novel updating rule is designed for particles to improve the exploration capacity. The velocity of $x_{k,m,t}^{i}$ is

$$v_{k,m,t}^{i+1} = \frac{\omega v_{k,m,t}^{i} + c_1 r_1 p_{k,m,t}^{i} + c_2 r_2 g_{m,t}^{i}}{\omega + c_1 r_1 + c_2 r_2}, \tag{5.49}$$

where $\omega + c_1 r_1 + c_2 r_2$ is a normalization term that ensures

$$\sum_{n=1}^{N} v_{k,m,t}^{i}(u, n) = 1, \quad \forall u. \tag{5.50}$$

The particles in traditional PSO always select BS

$$n^{*,i} = \arg\max_{n} v_{k,m,t}^{i}(u, n) \tag{5.51}$$

as the target BS. The velocity $v_{k,m,t}^{i+1}$ is updated according to Eq. (5.49). At the beginning of the i-th iteration, the best particle position $p_k^{i}(u, n)$ is equal to $v_{k,m,t}^{i}(u, n)$,

in which case $n^{*,i+1}$ may always be equal to $n^{*,i}$, causing the solution to fall into a local optimum.

Hence, we define $F_{k,m,t}^{i}(u)$, a discrete probability distribution, to explore more potential optimal solutions:

$$F_{k,m,t}^{i}(u) \sim \begin{pmatrix} 1 & 2 & \cdots & N \\ v_{k,m,t}^{i}(u, 1) & v_{k,m,t}^{i}(u, 2) & \cdots & v_{k,m,t}^{i}(u, N) \end{pmatrix},$$

where $v_{k,m,t}^{i}(u, n)$ is the probability of user u migrating the service to BS n. We randomly sample $r_3$ according to $F_{k,m,t}^{i}(u)$ as the target BS. Thus, particle $x_{k,m,t}^{i+1}$ can be updated as follows:

$$x_{k,m,t}^{i+1}(u, n) = \begin{cases} 1, & \text{if } n = r_3, \\ 0, & \text{otherwise}. \end{cases} \tag{5.52}$$

Equation (5.52) expands the exploration space to prevent convergence to local optima. Subsequently, we adjust the fitness of the particles, as well as the individual and global optimal positions, by comparing with the previous fitness (Eqs. (5.46) and (5.47)), as sketched below.
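In code, the probabilistic rule of Eqs. (5.49)–(5.52) replaces the deterministic argmax of Eq. (5.51) with sampling from $F_{k,m,t}^{i}(u)$. The sketch below assumes that every row of the inputs sums to 1, as guaranteed by Eq. (5.50); names and shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def probabilistic_update(v, p_best, g_best, omega=0.7, c1=1.5, c2=1.5):
    """Normalized velocity update, Eq. (5.49), and sampled position, Eq. (5.52).

    v, p_best, g_best : (U, N) nonnegative arrays whose rows each sum to 1.
    Returns the new velocity and a binary (U, N) particle position.
    """
    r1, r2 = rng.random(), rng.random()
    v_next = (omega * v + c1 * r1 * p_best + c2 * r2 * g_best) \
             / (omega + c1 * r1 + c2 * r2)   # rows still sum to 1, Eq. (5.50)
    x_next = np.zeros_like(v_next)
    for u in range(v_next.shape[0]):
        target_bs = rng.choice(v_next.shape[1], p=v_next[u])  # sample r3 ~ F(u)
        x_next[u, target_bs] = 1.0                            # Eq. (5.52)
    return v_next, x_next
```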
(2) We introduce a correction scheme to accelerate convergence by aligning the migration decisions of two sequential time slots. Here, we define $D(k, m, t)$ as

$$D(k, m, t) = \| x_{k,m,t}^{i} - x'_{m,t-1} \|, \tag{5.53}$$

i.e., the Euclidean distance between the position of particle k in the current time slot and the migration decision in the previous time slot, denoted as $x'_{m,t-1}$.

To converge rapidly, we narrow the search space by ensuring each particle satisfies $D(k, m, t) \leq T_0$, where $T_0$ is an empirical threshold. When a particle satisfies $D(k, m, t) > T_0$, let

$$x_{k,m,t}^{i+1}(u, n) = \begin{cases} 1, & n = n(u), \\ 0, & \text{otherwise}, \end{cases} \tag{5.54}$$

where

$$n(u) = \arg\max_{n}\ x_{k,m,t}^{i+1}(u, n) + x'_{m,t-1}(u, n) + g_{m,t}^{i}(u, n).$$

A compact sketch of this correction is given below.
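The following Python fragment is one possible rendering of the correction step under an assumed threshold value; the array names are illustrative.

```python
import numpy as np

def correct_particle(x_new, x_prev, g_best, T0=2.0):
    """Snap a particle back toward the previous decision, Eqs. (5.53)-(5.54).

    x_new, x_prev, g_best : (U, N) arrays; T0 is an empirical threshold.
    """
    if np.linalg.norm(x_new - x_prev) <= T0:   # distance check, Eq. (5.53)
        return x_new
    corrected = np.zeros_like(x_new)
    score = x_new + x_prev + g_best            # votes of current, previous, global
    corrected[np.arange(x_new.shape[0]), score.argmax(axis=1)] = 1.0
    return corrected
```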

Nevertheless, the modified PSO might still converge to a local optimum by consistently selecting superior decisions. To circumvent this potential issue, we extend it to a multiple-population (J) version. Increasing the number of populations J enhances

the likelihood of converging to the globally optimal solution with a greater number
of iterations, as outlined in Theorem 5.1.
Theorem 5.1 As $J > 0$ increases, the probability of the modified PSO algorithm reaching the global optimum of P4 increases correspondingly. In particular, when $J \to \infty$, the probability approaches 1.
Proof Let $\phi_k^{i} = (x_{k,m,t}^{i+1}, v_k^{i}, p_k^{i}, g^{i})$ represent particle k's state in the i-th iteration. $\{\phi_k^{i}, i \geq 1\}$ is a Markov chain, whose transition probability is [22]

$$\Pr\left(\phi_k^{l} \,|\, \phi_k^{i}\right) = \Pr\left(x_k^{l} \,|\, x_k^{i}\right) \cdot \Pr\left(v_k^{l} \,|\, v_k^{i}\right) \Pr\left(p_k^{l} \,|\, p_k^{i}\right) \Pr\left(g^{l} \,|\, g^{i}\right), \tag{5.55}$$

where

$$\Pr\left(p_k^{i+1} \,|\, p_k^{i}\right) = \begin{cases} 1, & \bar{z}_{m,t}(p_k^{i+1}) \leq \bar{z}_{m,t}(p_k^{i}), \\ 0, & \text{otherwise}, \end{cases} \tag{5.56}$$

$$\Pr\left(g^{i+1} \,|\, g^{i}\right) = \begin{cases} 1, & \bar{z}_{m,t}(g^{i+1}) \leq \bar{z}_{m,t}(g^{i}), \\ 0, & \text{otherwise}. \end{cases} \tag{5.57}$$

Similarly, let $\xi^{i} = (\phi_1^{i}, \phi_2^{i}, \cdots, \phi_K^{i})$ denote the state of the population; $\{\xi^{i}, i \geq 1\}$ is also a Markov chain, whose transition probability is

$$\Pr\left(\xi^{l} \,|\, \xi^{i}\right) = \prod_{k=1}^{K} \Pr\left(\phi_k^{l} \,|\, \phi_k^{i}\right). \tag{5.58}$$

Let $p_k^{*}$ and $g^{*}$ denote the best particle and group positions; the optimal particle state set can be expressed as

$$\mathcal{K} = \left\{ \phi_k^{*} = (x_k^{i}, v_k^{i}, p_k^{*}, g^{*}),\ i \geq 1 \right\}. \tag{5.59}$$

Also, the optimal population state set is

$$\mathcal{G} = \left\{ \xi^{i,*},\ i \geq 1 \,|\, \xi^{i,*} = (\phi_1^{i}, \phi_2^{i}, \cdots, \phi_K^{i}),\ \exists \phi_k^{i} \in \mathcal{K} \right\}. \tag{5.60}$$

Note that both the set $\mathcal{K}$ and the set $\mathcal{G}$ are closed sets. We further build two closed sets $\mathcal{B}$ and $\mathcal{H}$, where $\mathcal{H} = \mathcal{G} \cup \mathcal{B}$. The probability of $\xi^{i+1} \notin \mathcal{H}$ is

$$\Pr(\xi^{i+1} \notin \mathcal{H}) = \Pr(\xi^{i+1} \notin \mathcal{H} \,|\, \xi^{i} \notin \mathcal{H}) \Pr(\xi^{i} \notin \mathcal{H}) + \Pr(\xi^{i+1} \notin \mathcal{H} \,|\, \xi^{i} \in \mathcal{H}) \Pr(\xi^{i} \in \mathcal{H}). \tag{5.61}$$

As $\mathcal{H}$ is a closed set, $\Pr(\xi^{i+1} \notin \mathcal{H} \,|\, \xi^{i} \in \mathcal{H}) = 0$, and we have

$$\Pr(\xi^{i+1} \notin \mathcal{H}) = \Pr(\xi^{i+1} \notin \mathcal{H} \,|\, \xi^{i} \notin \mathcal{H}) \cdot \Pr(\xi^{i} \notin \mathcal{H} \,|\, \xi^{i-1} \notin \mathcal{H}) \cdots \Pr(\xi^{l} \notin \mathcal{H}) = \Pr(\xi^{l} \notin \mathcal{H}) \prod_{l} \Pr(\xi^{l+1} \notin \mathcal{H} \,|\, \xi^{l} \notin \mathcal{H}). \tag{5.62}$$

When $\xi^{i} = (\phi_1^{i}, \phi_2^{i}, \cdots, \phi_K^{i}) \notin \mathcal{H}$, $\forall k \in [1, K]$, $\phi_k^{i} \notin \mathcal{B}$ and $\phi_k^{i} \notin \mathcal{G}$, i.e.,

$$\Pr(\xi^{l+1} \notin \mathcal{H} \,|\, \xi^{l} \notin \mathcal{H}) = \prod_{k=1}^{K} \Pr(\phi_k^{l+1} \notin \mathcal{H} \,|\, \phi_k^{l} \notin \mathcal{H}) = \prod_{k=1}^{K} \left[ 1 - \Pr(x_k^{i} \,|\, x_k^{i-1}) \Pr(v_k^{i} \,|\, v_k^{i-1}) \cdot \Pr(p_k^{i} \,|\, p_k^{i-1}) \Pr(g^{i} \,|\, g^{i-1}) \right]. \tag{5.63}$$

By summing $\Pr(\xi^{i} \notin \mathcal{H})$ over i, we have

$$\sum_{i=1}^{\infty} \Pr(\xi^{i} \notin \mathcal{H}) = \sum_{i=1}^{\infty} \Pr(\xi^{l} \notin \mathcal{H}) \prod_{l=1}^{i-1} \prod_{k=1}^{K} \left\{ 1 - \Pr(g^{i} \,|\, g^{i-1}) \cdot \Pr(x_k^{i} \,|\, x_k^{i-1}) \Pr(v_k^{i} \,|\, v_k^{i-1}) \Pr(p_k^{i} \,|\, p_k^{i-1}) \right\}. \tag{5.64}$$


As $\sum_{i=1}^{\infty} \Pr(\xi^{i}) < \infty$ and $\sum_{i=1}^{\infty} \Pr(\xi^{i} \notin \mathcal{H}) < \infty$, we obtain

$$\lim_{i \to \infty} \left\{ 1 - \Pr(x_k^{i} \,|\, x_k^{i-1}) \Pr(v_k^{i} \,|\, v_k^{i-1}) \cdot \Pr(p_k^{i} \,|\, p_k^{i-1}) \Pr(g^{i} \,|\, g^{i-1}) \right\} = 0. \tag{5.65}$$

Therefore, we have

$$\lim_{i \to \infty} \Pr(x_k^{i} \,|\, x_k^{i-1}) \Pr(v_k^{i} \,|\, v_k^{i-1}) \cdot \Pr(p_k^{i} \,|\, p_k^{i-1}) \Pr(g^{i} \,|\, g^{i-1}) = 1. \tag{5.66}$$

This is equivalent to

$$\lim_{i \to \infty} \Pr(x_k^{i} \,|\, x_k^{i-1}) = \lim_{i \to \infty} \Pr(v_k^{i} \,|\, v_k^{i-1}) = \lim_{i \to \infty} \Pr(p_k^{i} \,|\, p_k^{i-1}) = \lim_{i \to \infty} \Pr(g^{i} \,|\, g^{i-1}) = 1. \tag{5.67}$$

By substituting Eq. (5.67) into Eqs. (5.56) and (5.57), when $i \to \infty$, we have $p_k^{i} = p_k^{i+1}$ and $g^{i} = g^{i+1}$. Thus, we obtain $\Pr(\lim_{i \to \infty} \xi^{i} \in \mathcal{H}) = 1$, which proves that $\{\xi^{i}, i \geq 1\}$ converges to $\mathcal{H}$. If $g^{i} = g^{i+1} = g^{*}$, then $\{\xi^{i}, i \geq 1\}$ converges to $\mathcal{G}$, indicating that the modified PSO reaches the global optimum within a single population; otherwise $g^{i} = g^{i+1} \neq g^{*}$. We denote the no-convergence probability by $\Pr(g^{i} \neq g^{*})$. Thus, the convergence probability with J populations is

$$\lim_{J \to \infty} \Pr(g^{*} \in \mathcal{G}) = 1 - \lim_{J \to \infty} \left[ \Pr(g^{i} \neq g^{*}) \right]^{J} = 1. \tag{5.68}$$

As J increases, $\Pr(g^{*} \in \mathcal{G})$ approaches 1, i.e., the probability of reaching the global optimum approaches 100%.

5.2.5 Performance Evaluation

We utilize ArcGIS to handle real-world trajectories collected from the Bologna dataset [23]. The trajectories are sampled at a frequency of 1 Hz. The geographical road map is obtained from OpenStreetMap and mapped to a 1500 × 1060 m area. Based on actual communication tower locations, 77 BSs are deployed in this area, as depicted in Fig. 5.2. On this scaled-down map, the velocity averages around 1.4 m/s, closely resembling the typical walking speed of a user.
We conduct simulations using Matlab. Users move according to their trajectories and request services from a certain BS. The services deployed in each BS are randomly selected from the services in Table 5.1. The transmission power p and

Fig. 5.2 The distribution of BSs

Table 5.1 Service parameters

Type             D_m (s)   λ_m (bits)   γ_m (cycles/bit)   ω_m   θ_m^A (MB)   θ_m^I (MB)
Emergency stop   0.1       3200         36,000             10    47           [1,3]
Collision risk   0.1       4800         40,000             10    70           [1,3]
Accident report  0.5       4800         28,000             2     97           [2,3]
Parking          0.1       1200         80,000             1     85           [3,5]
Platoon          0.5       4800         88,000             2     95           [4,6]
Face detection   0.5       3200         50,000             1     363          [5,10]
Video stream     0.1       1200         10,000             10    500          0

noise power $N_0$ are set to [0.1, 0.4] W and $2 \times 10^{-10}$ W, respectively. Moreover, h is modeled as a symmetric complex Gaussian variable [24]. The computation resource and bandwidth, i.e., $f_{m,u}$ and $W_{m,u}$, range over [500, 1000] cycles/bit and [1, 10] MHz, respectively. For each BS, the maximum resources $F_n$, $S_n$, and $W_n$ are randomly sampled in [50, 100] GHz, [1, 5] GB, and [0.5, 1] GHz. Also, C and $\kappa$ are set to 0.02 s and $10^{-26}$, respectively.


Three benchmarks are used to show the performance of the EGO algorithm in terms of energy consumption, latency, and other metrics, listed as follows:
• Static optimization placement (SOP): This strategy employs a static service
placement approach, where services are deployed onto edge servers and remain
in place without migration throughout the simulation period.
• Partial dynamic optimization algorithm (PDOA) [25]: PDOA aims to mini-
mize the average service latency from the viewpoint of an individual user, where
interference is not taken into account.
• Dynamic Markov decision process (DMDP) [11]: DMDP is employed as a
single-user service migration algorithm in our simulations. This approach utilizes
an MDP model to predict the trajectory, enabling optimal service migration
decisions aimed at minimizing energy consumption. Like PDOA, DMDP is
applied independently to each user.

5.2.5.1 Impact of User Number

Figure 5.3 illustrates the average energy consumption for the four methods, with
varying user numbers from 100 to 3000. Notably, SOP exhibits the highest average
energy consumption among the methods. This is attributed to the absence of service
migration as users move, resulting in increased transmission energy consumption.
Additionally, SOP’s average energy consistently rises with an increasing number
of users, reflecting higher computing energy consumption to meet the latency
requirements of all users.


Fig. 5.3 Average energy consumption, where the user numbers vary from 100 to 3000

On the contrary, PDOA, DMDP, and EGO initially experience a decline in


average energy consumption as the user number rises. This is due to the shared
migration energy consumption of the application layer among multiple users.
However, beyond a certain user threshold, the average energy increases, driven by
additional migration energy consumption caused by user interference.
Remarkably, when the user number is below 1000, DMDP achieves the lowest
average energy consumption because its objective is centered on minimizing
average energy consumption. However, as the user number exceeds 1000, EGO
emerges as the method with the lowest average energy consumption. EGO’s
adaptive adjustment of elastic resource bounds for each service proves effective
in managing interference among users. In comparison to DMDP, EGO achieves a
40.7% energy saving when there are 3000 users. This highlights EGO’s suitability
for multiuser industrial edge computing systems.
Figure 5.4 illustrates the average service latency for the four methods, with
the user numbers ranging from 100 to 3000. As the user number increases, the
average service latency also rises across all four methods. SOP, which does not
involve service migration and entails constant user movement, exhibits the highest
service latency due to additional latency caused by request routing. DMDP follows
as the second-highest in average service latency, prioritizing the minimization
of average energy consumption over latency concerns. PDOA initially achieves
the lowest service latency, especially when the user number is below 1700, as
services consistently migrate to follow the user. However, beyond 1200 users,
PDOA experiences an exponential growth in average service latency, indicating a
dominance of interference among users. EGO’s performance closely aligns with
PDOA when the user number is below 1800. However, when the user number
surpasses 1800, EGO attains the lowest average service latency. This is attributed to
EGO’s utilization of elastic resource bounds to mitigate interference among users.


Fig. 5.4 Average service latency, where the user numbers vary from 100 to 3000


Fig. 5.5 Average deadline guarantee rate, where the user numbers vary from 100 to 3000

In comparison to PDOA, EGO achieves a notable 12.6% reduction in service latency


with 3000 users.
Figure 5.5 presents the average deadline guarantee rate, representing the percent-
age of requests receiving a response from the edge server before their deadlines,
for the four methods. SOP, with its extended transmission path, exhibits the lowest
average deadline guarantee rate. DMDP follows as the second-lowest, and its rate
steadily decreases with an increasing user number. PDOA, prioritizing average
latency minimization, achieves the highest deadline guarantee rate when the user
number is below 1600. Beyond this threshold, the rate declines sharply due to
intensified interference among users. On the contrary, EGO consistently maintains

Fig. 5.6 Average energy consumption with 3000 users, where the BS number varies from 35 to 77

the highest deadline guarantee rate, especially with a large user number. With 3000 users, EGO achieves a rate of 93.2%, surpassing the rates of the other three methods, which remain below 87%.

5.2.5.2 Impact of BS Number

In Fig. 5.6, the average energy consumption is presented with 3000 users, varying
the number of BSs from 35 to 77. Notably, the average energy consumption
decreases for SOP, PDOA, DMDP, and EGO as the number of BSs increases. This
reduction is attributed to the increased resources and diminished interference among
users facilitated by a higher number of BSs. Comparatively, PDOA and DMDP
exhibit steeper declines in their average energy consumption curves in response
to the growing number of BSs, emphasizing their heightened sensitivity to user
interference.

5.2.5.3 Impact of Duration τ

In Fig. 5.7, the average energy consumption is illustrated for varying time slot durations ($\tau$) ranging from 1 to 32. Notably, the value of $\tau$ exerts no influence on SOP, as it remains inactive in each time slot. For PDOA, DMDP, and EGO, an increase in $\tau$ results in less frequent service migration, thereby reducing migration energy consumption and, consequently, the average energy consumption. However, once $\tau$ surpasses a specific threshold, transmission energy consumption in request routing becomes dominant. Consequently, the average energy consumption of PDOA begins to rise.

Fig. 5.7 Average energy consumption with 3000 users, where $\tau$ varies from 1 to 32

Fig. 5.8 Average service latency with 3000 users, where $\tau$ varies from 1 to 32

In Fig. 5.8, the average service latency is presented for varying time slot durations ($\tau$) ranging from 1 to 32. Once again, SOP remains unaffected by the value of $\tau$ due to its inactivity in each time slot. For PDOA, DMDP, and EGO, an increase in $\tau$ initially leads to a decline in average service latency, attributed to the decline in migration latency. However, after $\tau$ exceeds a specific threshold, the average service latency begins to increase.

5.2.5.4 Impact of Mobility

Figure 5.9 illustrates the average service latency under different mobility with 3000
users. It is apparent that the velocity greatly influences the performance of SOP


Fig. 5.9 Average service latency with different user mobility


Fig. 5.10 Average deadline guarantee rate with different user mobility

as it does not have any tools to deal with mobility. However, with the increasing
of average velocity, the average service latency of all the other methods also
inevitably grows. This is attributed to the heightened interference among users
caused by frequent service migration. Figure 5.10 further demonstrates that the
deadline guarantee rate of EGO decreases as the average velocity of users increases.

5.2.5.5 Impact of V and V'

Figure 5.11 illustrates the average energy consumption and average service latency of the EGO algorithm with varying values of V, ranging from $10^{-1}$ to $10^{3}$. As V increases from $10^{-1}$ to $10^{3}$, EGO places more emphasis on average energy consumption than on resource utilization. The results indicate that as average energy consumption decreases, there is a corresponding increase in average service latency. By setting a suitable value of V, EGO ensures a balance between average energy consumption and average service latency.

Figure 5.12 illustrates the impact of $V'$ on energy consumption and service latency. As $V'$ grows, the migration energy consumption is taken into account preferentially, leading to frequent migration as well as high service latency. This means that EGO should adopt a suitable empirical value of $V'$ ($V' = 20$) to make a trade-off between energy consumption and service latency.

5.2.5.6 Average Service Latency of Services

Additionally, Fig. 5.13 provides a breakdown of the average service latency for each
of the seven services with 3000 users. The red lines represent the deadlines for
each service. SOP exhibits the poorest performance, as the average service latency
exceeds the response deadline for all services due to its lack of service migration.
DMDP and PDOA experience average service latency beyond the response deadline
for five services, highlighting the impact of interference among users. In contrast,
EGO successfully meets the deadline requirements for all services, even for low-
priority and data-intensive services like “platoon.”


Fig. 5.11 Average energy consumption of the EGO algorithm, where V varies from $10^{-1}$ to $10^{3}$


Fig. 5.12 Average energy consumption of the EGO algorithm, where $V'$ varies from $10^{-1}$ to $10^{3}$

Fig. 5.13 Average service latency of each service with 3000 users

The simulation, conducted with real-world vehicle trajectory data from Bologna, demonstrates that the proposed EGO solution outperforms state-of-the-art solutions by 40.7%, 12.6%, and 7.2% in terms of average energy consumption, average service latency, and service deadline guarantee rate, respectively, particularly in multiuser industrial edge computing systems. Moreover, this chapter assumes that a mobile user connects to a single base station. Future research may explore more advanced service migration algorithms for MEC scenarios where users can connect to multiple base stations simultaneously.

5.3 Location Privacy-Aware Service Migration

5.3.1 Statement of Problem

While service migration works well in guaranteeing service continuity, it may also
incur severe location privacy issues. The correlation between service migration
trajectories and user trajectories introduces potential privacy concerns. As illustrated
in Fig. 5.14, the service migration trajectory closely follows the user's movement from $u(t)$ to $u(t+3)$. This correlation raises the risk of privacy breaches, where
malicious entities such as untrusted or compromised service providers could exploit
service migration records to stealthily infer user locations. The implications of such
privacy violations include stalking, blackmail, and even kidnapping, as highlighted
in previous studies [26].
To safeguard user location privacy in the context of service migration, various
LPPMs have been explored. Common approaches include cloaking-based algo-
rithms [27], dummy-based algorithms [28], and differential privacy (DP)-based
algorithms [29], which have primarily been developed for Location-Based Service
(LBS). Cloaking-based algorithms and differential privacy-based algorithms focus
on introducing ambiguity in users’ locations, either by creating cloaking areas or by
adding location noise, to conceal precise location details.
In practice, the continuous nature of service migration poses challenges, as adver-
saries can leverage historical migration trajectories to infer users’ true locations,

Fig. 5.14 Migration trajectory vs. user trajectory



mitigating the impact of noise and cloaking areas. Dummy-based algorithms attempt to conceal real migration trajectories [26, 30]. However, maintaining these decoy migration services entails additional computation and storage resources, so an effective location privacy-aware service migration method is desired.
There are two key challenges that should be overcome. Besides the interfer-
ence among users, accurate measurement of location privacy leakage risk under
adversaries’ location inference attacks is difficult. Traditional metrics, such as the
communication distance between the user and the edge where the requested service
is deployed [31, 32], rely on the assumption that longer distances correspond to
lower location privacy leakage risk and vice versa.
However, this distance-based metric falls short of accurate privacy leakage risk
evaluation when suffering from adversary location inference attacks. Adversaries,
armed with users’ historical movement and service migration trajectories, leverage
Bayesian attacks [33] to infer potential user locations. In such scenarios, although
migrating service to a remote edge, the location privacy leakage risk still remains
high.

5.3.2 Adversary’s Location Inference Attack

In industrial edge computing systems, there are M distributed users and N BSs equipped with edge servers on the map. Let $\mathcal{N} = \{1, 2, \cdots, N\}$ and $\mathcal{M} = \{1, 2, \cdots, M\}$ denote the sets of BSs and users, respectively. For user $m \in \mathcal{M}$, its mobility determines its trajectory across the coverage of different BSs, i.e., its connected BS $c_m^t$ constantly changes, and thus we update the location $u_m^t$ in each time slot. Note that since the storage of a BS limits its deployed services, the connected BS of user m is not always the serving BS $s_m^t$ that provides the required service for the user. Both the connected and serving BSs of the user greatly influence the migration decision made in time slot t, i.e., whether or not to migrate services, and to which target BS $a_m^t \in \mathcal{N}$.

Here, Fig. 5.15 depicts the service migration process in MEC systems. In time slot t, users $u_1$ and $u_2$ request services $s_1$ and $s_2$ from BS 1, while $u_3$ requests service $s_3$ from BS 2. By time slot $t+1$, as the users move, services $s_1$, $s_2$, and $s_3$ migrate to BS 2, BS 3, and BS 4, respectively, thereby reducing service latency. Next, in time slot $t+2$, the connected BS of the three users changes to BS 4, which encourages $u_1$ to migrate service $s_1$ from BS 2 to BS 4 due to the long communication distance between $u_1$ and its serving BS 2. Simultaneously, $u_2$ continues to request service $s_2$ from BS 3 instead of BS 4, strategically alleviating the resource competition pressure at BS 4. In this scenario, an effective strategy should account for user mobility and resource competition, minimizing communication latency and mitigating interference for a seamless user experience.

Fig. 5.15 Illustration of service migration

Most existing malicious adversaries attempt to infer a user's location based on its observed historical migration trajectories, which raises two urgent problems to be solved:
• How to accurately estimate the risk of location privacy leakage when migration occurs?
• How to select which service to migrate to which BS under unknown user mobility and uncertain resource competition, while jointly considering user mobility, resource competition, and location privacy leakage risk?
We assume that the role of the adversary is a service provider, which is honest but curious. It tends to infer a user's location based on the migration trajectories it covertly monitors. Normally, only the migration trajectories are collected by the adversary, in which case the adversary treats the location of the user as the nearby BS that has deployed the corresponding service [26], i.e., $u_m(t) = s_m(t)$. Location inference in this manner, without any extra background knowledge of the user, is known as a knowledge-free attack (KFA).


On the contrary, a radical adversary further steals the historical movement pattern, mounting a knowledge-based attack (KA), via public data mining, data brokers, etc. This greatly helps the adversary to build the bridge between migration and location through Bayesian probability [34]. Thus, with the given background knowledge, the adversary explores the probability distribution of the user's mobility $P_m^{u}$ and constructs a mapping relationship between the user location and its service location

$P_m^{u2s}$. We represent the probabilities that user m moves from $u_m^t$ to $u_m^{t+1}$ and that BS $s_m^t$ can provide the service to the user in location $u_m^t$ by $P_m^{u}(u_m^{t+1} \,|\, u_m^t)$ and $P_m^{u2s}(s_m^t \,|\, u_m^t)$, respectively. According to Bayesian theory, the posterior probability of the user location conditioned on the observed service location, $P_m^{s2u}$, is

$$P_m^{s2u}(u_m^t \,|\, s_m^t) = \frac{P_m^{u2s}(s_m^t \,|\, u_m^t)\, P_m^{u}(u_m^t \,|\, \cdot)}{\sum_{u_m^{t+1} \in \mathcal{N}} P_m^{u2s}(s_m^t \,|\, u_m^{t+1})\, P_m^{u}(u_m^{t+1} \,|\, \cdot)}. \tag{5.69}$$

Equation (5.69) shows that the user's locations can be tracked based on the collected service locations. In contrast to KFAs, KAs can eliminate locations that do not align with the background knowledge, thereby enhancing the accuracy of inferring the user's true location.

Next, we introduce the proposed entropy-based location privacy metric. A user who suffers from a KA has a high probability of leaking privacy even though it migrates the service to a remote BS. To strike a balance between protecting location privacy and improving the user experience, inspired by information theory, we propose the concept of privacy entropy to enhance the efficiency of risk estimation, i.e.,

$$H_m(t) = -\sum_{u_m^t \in \mathcal{N}} P_m^{s2u}\left(u_m^t \,|\, s_m^t\right) \log\left( P_m^{s2u}(u_m^t \,|\, s_m^t) \right). \tag{5.70}$$

Equation (5.70) means that the location entropy value is negatively correlated with the inference accuracy of the adversary; a high entropy implies a relatively low location privacy leakage risk. The user's location privacy leakage risk $R_m(t)$ can thus be estimated by

$$R_m(t) = -H_m\left(s_m^t\right). \tag{5.71}$$
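Putting Eqs. (5.69)–(5.71) together, the adversary's posterior, the privacy entropy, and the leakage risk can be computed in a few lines. The prior and likelihood values in the sketch below are illustrative placeholders, not data from this chapter:

```python
import numpy as np

def privacy_risk(prior_u, lik_s_given_u, s_obs):
    """Posterior over locations (Eq. (5.69)), entropy (5.70), and risk (5.71).

    prior_u       : (N,) mobility prior over candidate user locations.
    lik_s_given_u : (N, N) matrix; entry [s, u] = P^{u2s}_m(s | u).
    s_obs         : index of the observed serving BS.
    """
    joint = lik_s_given_u[s_obs] * prior_u     # numerator of Eq. (5.69)
    posterior = joint / joint.sum()            # P^{s2u}_m(u | s)
    nonzero = posterior[posterior > 0]
    entropy = -(nonzero * np.log(nonzero)).sum()   # H_m(t), Eq. (5.70)
    return posterior, entropy, -entropy            # R_m(t) = -H_m, Eq. (5.71)

prior = np.array([0.5, 0.3, 0.2])
likelihood = np.array([[0.7, 0.2, 0.1],
                       [0.2, 0.6, 0.3],
                       [0.1, 0.2, 0.6]])
posterior, H, R = privacy_risk(prior, likelihood, s_obs=0)
print(posterior, H, R)
```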

Generally, the service latency is the interval from the time of request initiation to that of receiving the response, comprising the communication, computation, and migration latency. Let a 4-tuple $\langle p_m, \lambda_m^{req}, \lambda_m^{ser}, \delta_m \rangle$ represent the service information requested by mobile user m, where $p_m$, $\lambda_m^{req}$, $\lambda_m^{ser}$, and $\delta_m$ are the transmission power, the data sizes of the request and the service, and the computation intensity (i.e., CPU cycles/bit), respectively. The latency can be calculated according to the equations in Sect. 5.2.2.1.

5.3.3 Location Privacy-Aware Multiuser Service Migration Algorithm

P1 exhibits a memoryless sequential decision-making structure satisfying the Markov property, which supports the use of an MDP approach. However, an MDP necessitates

access to all users’ information (such as location and requested services) and all
BSs’ states to determine the optimal migration decisions, which is impractical
in practice. Additionally, owing to interference, migration decisions among users
mutually influence each other, leading to great coupling between service latency
and location privacy leakage risks. Consequently, we convert P1 into a partially
observable Markov decision process (POMDP) problem and address it using the
MADRL algorithm.

5.3.3.1 Problem Transformation

In the POMDP framework, the system is defined by a 7-tuple $\langle \mathcal{E}, \mathcal{A}, \mathcal{P}, \mathcal{O}, U, r, \gamma \rangle$. $\mathcal{E}$ represents the state set of the entire environment; $\mathcal{A}$ is the action set; $\mathcal{P}(e^{t+1} \,|\, e^t, a^t)$ is the state-transition probability from state $e^t$ to $e^{t+1}$ given action $a^t$. $\mathcal{O}$ is the observation set of the local environment state from the perspective of a single user. The observation distribution is denoted by U, where $U(o^t \,|\, a^{t-1}, e^t)$ is the probability of a user observing state $o^t$ given the action $a^{t-1}$ and the environment state $e^t$. $r(o^t, a^t)$ and $\gamma$ signify the corresponding instantaneous reward and the long-term discount factor.

Normally, it is difficult to accurately predict the transition $\mathcal{P}$ and observation distribution U without knowing the users' movements, which exhibit significant uncertainty. To capture this uncertainty, we introduce the DRL technique, which uses DNNs to learn the corresponding probability distributions [35]. More specifically, we propose a MASAC algorithm based on SAC and the POMDP to find the optimal migration decisions:
1. Environment State: It includes the users' information (e.g., user location, service location, and requested service) and the BSs' configurations (e.g., computation capacity) in time slot t, i.e.,

$$e^t = \{u_1^t, u_2^t, \cdots, u_M^t, s_1^t, s_2^t, \cdots, s_M^t, J_1, J_2, \cdots, J_M, f_1, f_2, \cdots, f_N\}. \tag{5.72}$$

2. Observation: The users only observe the partial environment state without information exchange among multiple users. We use $o_m^t$ to represent the observation of user m in time slot t, i.e.,

$$o_m^t = \left\{ u_m^t, s_m^t, J_m, f_1, f_2, \cdots, f_N \right\}. \tag{5.73}$$

The environment state $e^t = \{o_1^t, o_2^t, \cdots, o_M^t\}$ is the composition of all users' observations.

3. Migration Action: The candidate action set for users is denoted by $\mathcal{A}^t$, including any BS near the users. The action of user m in time slot t, $a_m^t \in \mathcal{A}^t$, indicates the target BS to which the service is migrated.

4. Reward: Let $r_m^t$ denote the instantaneous reward under action $a_m^t$ and observation $o_m^t$. The negative total cost in time slot t serves as the instantaneous reward, that is, $r_m^t = -C_m(t)$. A compact sketch of these containers is given below.
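A simple way to view the state/observation split of Eqs. (5.72) and (5.73) is as plain containers, as in the sketch below; all field names are illustrative assumptions:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Observation:
    """Per-user partial view o^t_m of Eq. (5.73)."""
    user_location: int          # u^t_m, index of the user's connected BS
    service_location: int       # s^t_m, index of the serving BS
    service: int                # requested service J_m
    bs_capacities: List[float]  # f_1, ..., f_N

@dataclass
class EnvState:
    """Global state e^t of Eq. (5.72): the composition of all observations."""
    observations: List[Observation]

def reward(total_cost: float) -> float:
    """Instantaneous reward r^t_m = -C_m(t)."""
    return -total_cost
```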

5.3.3.2 MASAC Service Migration Algorithm

The MASAC algorithm’s architecture, as illustrated in Fig. 5.16, treats each user
as an SAC agent responsible for independently determining service migration
decisions. Further details on this algorithm are provided in Sect. 3.3.3.
As mentioned earlier, we adhere to the centralized training and decentralized exe-
cution paradigm, where other agents’ observation states and actions are observable
during training but unobservable during execution.
During the training stage, all agents gather the historical environment state
from the experience replay buffer, including observations, actions, as well as
rewards. Samples from this buffer are then used to centrally train the actor–critic
models for each agent. The agent takes interference from other agents’ actions into
consideration to make migration decisions that maximize rewards for all agents.
We first initialize each agent’s policy .μm , soft state-value function .ϱm , soft Q-
value function .θm , and the memory of the experience replay buffer .D. In time slot
t, the agents determine the migration action .am t under observation .ot based on the
m
μ
policy .πm to transition into a new state .e ' t+1 t of
. After that, the immediate reward .rm
t t t ' t
each agent is obtained. Then, we record a 4-tuple .(e , a , rm , e ) into experience
replay buffer .Dm. Finally, using mini-batch training, we update the actor–critic
network by learning the soft state value .V ϱm (et ) and soft Q-value .Qθm (et , a t ).
With the benefits of centralized training, agents can collaborate without direct
information exchange. During execution, the trained policy network guides the
agents to make migration decisions independently toward low service latency and
location privacy leakage risk. It can efficiently reduce the interference among agents.
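At a pseudocode level, the centralized-training/decentralized-execution procedure described above can be sketched as follows. The env, agent, and buffer objects are assumed interfaces rather than a concrete SAC implementation, and the actual update rules live inside the agents:

```python
def masac_train(env, agents, buffers, episodes, batch_size=64):
    """Illustrative MASAC outer loop (centralized training, decentralized execution)."""
    for _ in range(episodes):
        obs = env.reset()                      # list of o^t_m, one per agent
        done = False
        while not done:
            # Decentralized execution: each agent acts on its own observation.
            actions = [agent.act(o) for agent, o in zip(agents, obs)]
            next_obs, rewards, done = env.step(actions)
            for m, buffer in enumerate(buffers):
                # Store the 4-tuple (e^t, a^t, r^t_m, e'^t).
                buffer.add(obs, actions, rewards[m], next_obs)
            obs = next_obs
        # Centralized training: critics may see all observations and actions.
        for agent, buffer in zip(agents, buffers):
            agent.update(buffer.sample(batch_size))
```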

Fig. 5.16 Multi-agent soft actor–critic framework for location privacy-aware service migration
algorithm

5.3.4 Performance Evaluation

In this section, we conduct experiments to verify the efficiency of our proposed method. We deploy 13 distributed BSs in a 1000 × 1000 m area, each with a communication radius of 200 m and a computation capacity in [5, 20] GHz. For the users in this area, we simulate trajectories according to real-world user movement in the GeoLife dataset [36]. The requested service is regarded as a latency-sensitive service, whose parameters are listed below: the sizes of the request data $\lambda_m^{req}$ and service data $\lambda_m^{ser}$, and the computation intensity $\delta_m$, are uniformly selected from [1, 5] MB, [10, 50] MB, and [100, 500] CPU cycles/bit, respectively.

Each user sends requests to its connected BS via a wireless channel with a transmission power $p_m \in [0.5, 1]$ W. The wireless bandwidth W and noise power $N_0$ are set to [5, 25] MHz and $10^{-8}$ W, respectively. Meanwhile, we use a circularly symmetric complex Gaussian random variable [37] to simulate the fading vector h. The connection among BSs is achieved by wired communication with a transmission rate $r^b$, where the interruption latency $\xi$ is 0.05 s/hop. The details of the variables are listed in Table 5.2.
can also be referred to as Table 5.2.
Five methods are used as benchmarks for the proposed method from the perspective of service latency and location privacy leakage risk:
• DMDP [38]: The detail of DMDP has been introduced in Sect. 5.2.5.
• MASAC: MASAC optimizes the service latency while easing the resources
competition among users without taking the location privacy leakage risk into
account.
• DMDP with distance-based location privacy (DMDP-distance) [32]: This
algorithm is a variation of DMDP, which further introduces the location privacy
to minimize both service latency and location privacy leakage risk. It uses a
distance-based location privacy metric to estimate the location privacy leakage
risk.
• MASAC with distance-based location privacy (MASAC-distance): The loca-
tion privacy leakage risk estimation in MASAC-distance is the same as that

Table 5.2 Experiment Parameter Value


parameters req
.λm [1, 5] MB
.λm
ser [10, 50] MB
.δm [100, 500] CPU cycles/bit
.fn [5, 20] GHz
.pm [0.5, 1] W
W [5-25] MHz
.N0 .10
−8 W

.rb 100 Mbps


.ξ 0.05 s/hop

in DMDP-distance. It makes service migration decisions based on the MASAC algorithm.
• MASAC service migration with differential privacy (MASAC-dp): This method [39] serves as an effective approach for reducing location privacy leakage risk. It achieves this by introducing random noise into user locations. Specifically, we adopt the DP algorithm for location obfuscation and employ the MASAC algorithm to make service migration decisions.

5.3.4.1 Impact of Wireless Bandwidth

Figure 5.17 illustrates the privacy entropy with different network bandwidths. Figures 5.18 and 5.19 show the location accuracy under knowledge-free and knowledge-based attacks, respectively. Regarding KFAs, the DMDP-distance, MASAC-distance, MASAC-dp, and our proposed algorithms achieve approximately a 14% to 27% decline in location accuracy. The performance of the proposed method is significantly superior to the DMDP and MASAC algorithms. When confronted with KAs, the location accuracy under the DMDP-distance and MASAC-distance algorithms increases by around 52% to 65%. This is attributed to adversaries gathering auxiliary knowledge to enhance the accuracy of user location inference. In comparison, MASAC-dp and our proposed algorithms can restrict location accuracy to below 30%. This is achieved by reducing the aforementioned correlation through increased migration decision randomness.
Figure 5.20 illustrates the impact of wireless bandwidth on service latency.
The results show a decrease in service latency with all six algorithms as wireless
bandwidth increases. The MASAC algorithm stands out, achieving the lowest
service latency due to its focus on optimizing service latency for multiple users.

Fig. 5.17 Privacy entropy with different network bandwidths



Fig. 5.18 Location accuracy under adversary’s KFA with different network bandwidths

Fig. 5.19 Location accuracy under adversary’s KA with different network bandwidths

Fig. 5.20 Service latency with different network bandwidths



Fig. 5.21 Communication latency with different network bandwidths

Fig. 5.22 Migration latency with different network bandwidths

Our proposed algorithm achieves the second-lowest service latency. In contrast, the DMDP, DMDP-distance, MASAC-distance, and MASAC-dp algorithms exhibit higher service response latency. Examining Fig. 5.23, it is evident that the DMDP-based algorithms, i.e., DMDP and DMDP-distance, have significantly higher computation latency than the other algorithms. This discrepancy arises because the resource competition among users is not taken into account.
Figure 5.21 reveals that DMDP-distance and MASAC-distance algorithms
exhibit high communication latency. This is attributed to the long communication
distance between the user and its serving BS to protect privacy. As depicted in
Figs. 5.21 and 5.22, since the location noise interferes with service migration
decisions, MASAC-dp suffers from large communication and migration latency.
With varying wireless bandwidths, our proposed algorithm demonstrates near-optimal service latency performance with a low risk of location privacy leakage (Fig. 5.23).

5.3.4.2 Impact of Request Size

Figure 5.24 exhibits the location privacy protection ability of the six algorithms, where the request size $\lambda_m^{req}$ varies from 1 to 5 MB. Here, the service data size $\lambda_m^{ser}$ ranges over [10, 50] MB. Even with dramatic variation of $\lambda_m^{req}$, the proposed algorithm

Fig. 5.23 Computation latency with different network bandwidths

Fig. 5.24 Performance with different request sizes: (a) the privacy entropy, (b) the location
accuracy under adversary’s KFA, and (c) the location accuracy under adversary’s KA

Fig. 5.25 Service latency with different request sizes

demonstrates effective protection of user location privacy while maintaining low location accuracy for the adversary.

Figure 5.25 illustrates the impact of $\lambda_m^{req}$ on service latency. As the requested data size increases, the service latency of MASAC and our proposed algorithms grows more slowly compared with the others. Specifically, as seen in Fig. 5.26, when $\lambda_m^{req} > 2$ MB, the communication latency of the DMDP-distance and MASAC-distance algorithms grows rapidly. This is because they reduce the frequency of migration

Fig. 5.26 Communication latency with different request sizes

Fig. 5.27 Migration latency with different request sizes

Fig. 5.28 Computation latency with different request sizes

with $\lambda_m^{ser}$ increasing, leading to an increase in communication distance as the user moves.

Figure 5.27 presents the migration latency results. The migration latency of MASAC-dp and our proposed algorithms is high due to frequent migration for enhancing location privacy protection. On the contrary, the MASAC-distance algorithm has the lowest migration latency. Figure 5.28 displays the computation latency results, where the computation latencies of DMDP and DMDP-distance significantly exceed those of the MASAC-based algorithms.

5.3.4.3 Impact of User Number

Figure 5.29 illustrates the location privacy protection capabilities of the six algorithms with varying user numbers, ranging from 16 to 80. As the number of users increases, the privacy entropy of the DMDP-based algorithms remains relatively stable, while the privacy entropy of the MASAC-based algorithms increases. This is attributed to the MASAC algorithm migrating services to different BSs to alleviate resource competition, thereby enhancing the migration randomness. Consequently, even under different location inference attacks, the location accuracy of the MASAC-based algorithms gradually decreases as the number of users increases. Our proposed algorithm consistently demonstrates effective location privacy protection with varying user numbers.
Figures 5.30, 5.31, 5.32, and 5.33 display the variations in latency performance,
including response latency, communication latency, migration latency, and com-
putation latency, with different numbers of users (ranging from 16 to 80). As
the number of users increases, service latencies for MASAC, MASAC-distance,
MASAC-dp, and our proposed algorithms experience smooth growth, while DMDP

Fig. 5.29 Performance with different numbers of users: (a) the privacy entropy, (b) the location accuracy under adversary's KFA, and (c) the location accuracy under adversary's KA

Fig. 5.30 Service latency with different numbers of users

Fig. 5.31 Communication latency with different numbers of users

Fig. 5.32 Migration latency with different numbers of users

Fig. 5.33 Computation latency with different numbers of users

and DMDP-distance algorithms do otherwise. This is caused by the high computation latency under intense resource competition among users.

Figure 5.33 shows that the computation latency of DMDP exceeds that of the other algorithms as the number of users increases. It also demonstrates that the MASAC-based algorithms can mitigate computation latency growth by controlling the migration frequency. When the number of users is small, all algorithms achieve a relatively low migration latency since there is no resource competition among users. Otherwise, the MASAC-based algorithms tend to migrate services to edges without resource competition, which incurs extra migration latency.

With varying network bandwidths, service request data sizes, and numbers of users, extensive simulations show the superior performance of the proposed algorithm. It ensures that services achieve seamless migration with low latency and high QoS. In the future, efforts will be directed toward enhancing the scalability of the proposed method to suit flexible industrial edge computing systems.

References

1. Yuyi Mao, Changsheng You, Jun Zhang, Kaibin Huang, and Khaled Ben Letaief. A survey on
mobile edge computing: The communication perspective. IEEE Communications Surveys and
Tutorials, 19(4):2322–2358, Dec. 2017.
2. X. Ge, S. Tu, G. Mao, C. Wang, and T. Han. 5G ultra-dense cellular networks. IEEE Wireless
Communications, 23(1):72–79, Feb. 2016.
3. Adyson Magalhães Maia, Yacine Ghamri-Doudane, Dario Vieira, and Miguel Franklin
de Castro. Optimized placement of scalable IoT services in edge computing. In IFIP/IEEE
International Symposium on Integrated Network Management, IM, pages 189–197, Washing-
ton DC USA, Apr. 2019.
4. Jie Xu, Lixing Chen, and Pan Zhou. Joint service caching and task offloading for mobile edge
computing in dense networks. In IEEE Conference on Computer Communications, INFOCOM,
pages 207–215, Honolulu, HI, USA, Apr. 2018.
5. T. Ouyang, Z. Zhou, and X. Chen. Follow me at the edge: Mobility-aware dynamic service
placement for mobile edge computing. IEEE Journal on Selected Areas in Communications,
36(10):2333–2345, Oct. 2018.
6. Tie Qiu, Aoyang Zhao, Feng Xia, Weisheng Si, and Dapeng Oliver Wu. ROSE: robustness
strategy for scale-free wireless sensor networks. IEEE/ACM Transactions on Networking,
25(5):2944–2959, Sep. 2017.
7. X. Zhang and Q. Zhu. Hierarchical caching for statistical QoS guaranteed multimedia trans-
missions over 5G edge computing mobile wireless networks. IEEE Wireless Communications,
25(3):12–20, Jun. 2018.
8. Wahida Nasrin and Jiang Xie. SharedMEC: Sharing clouds to support user mobility in mobile
edge computing. In IEEE International Conference on Communications, ICC, pages 1–6,
Kansas City, MO, USA, May 2018.
9. Y. Sun, S. Zhou, and J. Xu. EMM: Energy-aware mobility management for mobile edge
computing in ultra dense networks. IEEE Journal on Selected Areas in Communications,
35(11):2637–2646, Nov. 2017.
10. T. Taleb, A. Ksentini, and P. A. Frangoudis. Follow-me cloud: When cloud services follow
mobile users. IEEE Transactions on Cloud Computing, 7(2):369–382, Apr. 2019.
11. S. Wang, R. Urgaonkar, M. Zafer, T. He, K. Chan, and K. K. Leung. Dynamic service migration
in mobile edge computing based on Markov decision process. IEEE/ACM Transactions on
Networking, 27(3):1272–1288, Jun. 2019.
12. Andrew Machen, Shiqiang Wang, Kin K. Leung, Bongjun Ko, and Theodoros Salonidis.
Migrating running applications across mobile edge clouds: poster. In International Conference
on Mobile Computing and Networking, MobiCom, pages 435–436, New York City, NY, USA,
Oct. 2016.
13. Adam Sadilek and John Krumm. Far out: Predicting long-term human mobility. In Interna-
tional Conference on Artificial Intelligence, AAAI, pages 814–820, Toronto, Ontario, Canada,
Jul. 2012.
14. Xiaobo Zhou, Shuxin Ge, Tie Qiu, Keqiu Li, and Mohammed Atiquzzaman. Energy-efficient
service migration for multi-user heterogeneous dense cellular networks. IEEE Transactions on
Mobile Computing, 22(2):890–905, 2023.
150 5 Service Migration in Industrial Edge Computing

15. Weixu Wang, Xiaobo Zhou, Tie Qiu, Xin He, and Shuxin Ge. Location privacy-aware service
migration against inference attacks in multi-user MEC systems. IEEE Internet of Things
Journal, pages 1–1, 2023.
16. Matt Walker. Operators facing power cost crunch. https://www.mtnconsulting.biz/product.
Accessed Nov 7, 2020.
17. D. Chen and W. Ye. 5G power: Creating a green grid that slashes costs, emissions & energy
use. https://www.huawei.com/en/publications/communicate/89/5g-power-green-grid-slashes-
costs-emissions-energy-use. Accessed Nov 7, 2020.
18. Valentin Poirot, Mårten Ericson, Mats Nordberg, and Karl Andersson. Energy efficient multi-
connectivity algorithms for ultra-dense 5G networks. IEEE Wireless Networks, 26(3):2207–
2222, Jun. 2020.
19. Li Ping Qian, Yuan Wu, Bo Ji, Liang Huang, and Danny H. K. Tsang. HybridIoT: Integration
of hierarchical multiple access and computation offloading for IoT-based smart cities. IEEE
Network, 33(2):6–13, 2019.
20. Andrew Machen, Shiqiang Wang, Kin K. Leung, Bongjun Ko, and Theodoros Salonidis. Live
service migration in mobile edge clouds. IEEE Wireless Communication, 25(1):140–147, Mar.
2018.
21. Qi Zhang, Lin Gui, Fen Hou, Jiacheng Chen, Shichao Zhu, and Feng Tian. Dynamic task
offloading and resource allocation for mobile-edge computing in dense cloud RAN. IEEE
Internet Things Journal, 7(4):3282–3299, Jun. 2020.
22. Ning Lai and Fei Han. A hybrid particle swarm optimization algorithm based on migration
mechanism. In Intelligence Science and Big Data Engineering—7th International Conference,
IScIDE, pages 88–100, Dalian, China, Sep. 2017.
23. Esri. ArcGIS. https://developers.arcgis.com/.
24. Y. Wang, M. Sheng, X. Wang, L. Wang, and J. Li. Mobile-edge computing: Partial com-
putation offloading using dynamic voltage scaling. IEEE Transactions on Communications,
64(10):4268–4282, 2016.
25. X. Yu, M. Guan, M. Liao, and X. Fan. Pre-migration of vehicle to network services based on
priority in mobile edge computing. IEEE Access, 7:3722–3730, Jan. 2019.
26. Ting He, Ertugrul Necdet Ciftcioglu, Shiqiang Wang, and Kevin S. Chan. Location privacy in
mobile edge clouds: A chaff-based approach. IEEE Journal on Selected Areas in Communica-
tions, 35(11):2625–2636, 2017.
27. F. Fei, S. Li, H. Dai, C. Hu, W. Dou, and Q. Ni. A k-anonymity based schema for location
privacy preservation. IEEE Transactions on Sustainable Computing, 4(2):156–167, April 2019.
28. Pasika Ranaweera, Anca Delia Jurcut, and Madhusanka Liyanage. Survey on multi-access edge
computing security and privacy. IEEE Communications Surveys Tutorials, 23(2):1078–1124,
2021.
29. Weiqi Zhang, Guisheng Yin, Yuhai Sha, and Jishen Yang. Protecting the moving user’s
locations by combining differential privacy and k-anonymity under temporal correlations in
wireless networks. Wirel. Commun. Mob. Comput., 2021:6691975:1–6691975:12, 2021.
30. Jian Kang, Doug Steiert, Dan Lin, and Yanjie Fu. MoveWithMe: Location privacy preservation
for smartphone users. IEEE Transactions on Information Forensics and Security, 15:711–724,
2020.
31. Xiaofan He, Juan Liu, Richeng Jin, and Huaiyu Dai. Privacy-aware offloading in mobile-edge
computing. In GLOBECOM 2017—2017 IEEE Global Communications Conference, pages 1–
6, 2017.
32. Weixu Wang, Shuxin Ge, and Xiaobo Zhou. Location-privacy-aware service migration in
mobile edge computing. In 2020 IEEE Wireless Communications and Networking Conference
(WCNC), pages 1–6, 2020.
33. Rinku Dewri. Local differential perturbations: Location privacy under approximate knowledge
attackers. IEEE Transactions on Mobile Computing, 12(12):2360–2372, 2013.
34. Reza Shokri, George Theodorakopoulos, Jean-Yves Le Boudec, and Jean-Pierre Hubaux.
Quantifying location privacy. In 2011 IEEE symposium on security and privacy, pages 247–
262. IEEE, 2011.
References 151

35. Fang Fu, Yunpeng Kang, Zhicai Zhang, F. Richard Yu, and Tuan Wu. Soft actor–critic DRL
for live transcoding and streaming in vehicular fog-computing-enabled IoV. IEEE Internet of
Things Journal, 8(3):1308–1321, 2021.
36. Yu Zheng, Hao Fu, Xing Xie, Wei-Ying Ma, and Quannan Li. GeoLife GPS trajectory
dataset—User Guide, GeoLife GPS trajectories 1.1 edition, July 2011.
37. Yanting Wang, Min Sheng, Xijun Wang, Liang Wang, and Jiandong Li. Mobile-edge com-
puting: Partial computation offloading using dynamic voltage scaling. IEEE Transactions on
Communications, 64(10):4268–4282, 2016.
38. Shiqiang Wang, Rahul Urgaonkar, Murtaza Zafer, Ting He, Kevin Chan, and Kin K. Leung.
Dynamic service migration in mobile edge computing based on Markov decision process.
IEEE/ACM Transactions on Networking, 27(3):1272–1288, 2019.
39. Xuewen Dong, Tao Zhang, Di Lu, Guangxia Li, Yulong Shen, and Jianfeng Ma. Preserving
geo-indistinguishability of the primary user in dynamic spectrum sharing. IEEE Transactions
on Vehicular Technology, 68(9):8881–8892, 2019.
Chapter 6
Application-Oriented Industrial Edge Computing

Industrial edge computing applications cover nearly every possible scenario in our daily life. In the current era of AI, the most promising scenario is edge-assisted model inference, a typical application of which is object detection. Object detection is the basis for making any other control decisions and also plays an irreplaceable role in preventive maintenance and quality control. Therefore, this chapter introduces edge-assisted object detection for two typical data types, i.e., images and point clouds. Meanwhile, fluctuations in wireless bandwidth may incur long communication latency for both edge-assisted methods, and thus we propose a teacher–student learning framework to further accelerate the inference.

6.1 Image-Oriented Object Detection

6.1.1 Statement of Problem

Real-time object detection with high accuracy greatly supports the development of mobile vision applications in industrial edge computing systems, such as autonomous driving [1]. Generally, there is a mismatch between the limited resources of mobile end devices, e.g., robots, and the significant resource requirements of DNN-based, computation-intensive object detection. To deal with this problem, mobile vision applications should find a way to reduce the resource requirements on devices while maintaining accuracy.

Various studies have been devoted to breaking this resource bottleneck, and they can be classified into two categories. (i) One strategy involves executing object detection tasks directly on mobile devices and employing model compression techniques such as weight sharing [2] and knowledge distillation [3] to transform computation-intensive CNN models into more lightweight versions [3]. However, these lightweight models suffer from significant degradation in detection


accuracy. (ii) Alternatively, a widely adopted approach is to offload object detection tasks to a remote cloud platform for execution, but the prolonged transmission latency between devices and the cloud may violate real-time requirements [4].

With MEC, there is a growing interest in leveraging edge servers for offloading object detection tasks [5, 6]. Studies such as [7–10] have demonstrated the feasibility of meeting accuracy and real-time requirements by offloading either the entire CNN model or a portion of it to powerful edge servers. Alternatively, mobile devices can be better served by multiple, even less powerful, edge servers rather than by relying on a single powerful edge server [11].

This motivates the decomposition of the object detection task, i.e., transforming the task into multiple sub-tasks through segmentation of the CNN model. The multiple edge servers can then execute the sub-tasks in parallel to reduce latency [12–16]. However, considering the varied computation workload and input/output data volume of each layer within a CNN, a sophisticated strategy for model splitting and sub-task generation is required. This strategy aims to minimize the additional communication overhead among edge servers during parallel execution of object detection tasks. Additionally, an adaptive sub-task offloading strategy is crucial to accommodate the resource diversity among edge servers and to cope with channel conditions that change over time, ensuring the minimization of detection latency.
accommodate the resource diversity among edge servers and cope with changing
channel conditions over time, ensuring the minimization of detection latency.

6.1.2 Entry Point Selection

The devices connect to the edge server via wireless links, while the edge servers
communicate with each other via wired links. Each mobile device is equipped with
a high-resolution camera, which is used for image collection. The collected data
will be sent to the edge server to execute the object detection algorithm. Figure 6.1 shows the architecture of MASS, which achieves rapid detection with only a slight accuracy decline in industrial edge computing systems [17]. MASS is composed of the following four modules:
• Entry Point Selection Module: Upon receiving a frame of an image, the
operator partitions the CNN into two parts based on a selected parallel entry
point. The sub-tasks that should be processed locally are referred to as the head
part. The rest of the sub-tasks, named tail part, will be sent to multiple edge
servers and be executed in parallel.
• Estimation Module: It is used to evaluate the computation cost of sub-tasks.
• Adaptive Sub-task Generation and Offloading Module: This is the core
module in MASS, which acts on the tail part to solve how to adaptively divide
these sub-tasks, where the resources in edge servers are taken into account.
Additionally, we design a uniformly sampled zero-padding scheme to minimize
communication costs among edge servers while preserving detection accuracy.
• Result Merging Module: It gathers and merges the outcomes of each sub-task, and the ultimate results are sent back to the mobile device.

Fig. 6.1 The architecture of MASS

We use $H_j^{in}$, $W_j^{in}$, and $C_j^{in}$ to represent the height, width, and number of channels of the input of CNN layer $j$. Meanwhile, for the output of CNN layer $j$, the number of channels is represented by $C_j^{out}$. Here, each layer of the CNN model [8, 18] can be a parallel entry point. Nevertheless, different entry points result in distinct gains and costs associated with parallel offloading, ultimately impacting detection latency. Here, $K_j$ represents the kernel size of CNN layer $j$, $j \in \{1, 2, \cdots, M\}$, where $M$ is the number of layers in the CNN model. Note that MASS can be applied to most CNN-based object detection models, allowing for the substitution of the CNN models [19–21].
The operator can select a certain layer $i$, $i \in \{1, 2, \cdots, M\}$, as the parallel entry point. In this case, the revenue and cost of parallel offloading can be calculated by
$$RPO_i = Amt_i^{com} + Amt_i^{mem} \tag{6.1}$$
and
$$CPO_i = Amt_i^{trans1} + Amt_i^{trans2}. \tag{6.2}$$

Here, $Amt_i^{com}$, $Amt_i^{mem}$, and $Amt_i^{trans1}$ are the computation cost, memory cost, and communication cost when offloading sub-tasks [22, 23], which are
$$Amt_i^{com} = \sum_{j=i}^{M} H_j^{in} \times W_j^{in} \times \left(C_j^{in} \times K_j^2 + 1\right) \times C_j^{out}, \tag{6.3}$$
$$Amt_i^{mem} = \sum_{j=i}^{M} \left(H_j^{in} \times W_j^{in} \times C_j^{in} + C_j^{out} + C_j^{in} \times C_j^{out}\right), \tag{6.4}$$
Fig. 6.2 The data halo appearing when splitting the CNN layer into $2 \times 2$ slices

Fig. 6.3 The four slices exchange data for accurate detection, such that the data halo in slice A is filled with the data from the other three slices

and
$$Amt_i^{trans1} = H_i^{in} \times W_i^{in} \times C_i^{in}. \tag{6.5}$$

It is important to recognize the presence of a data halo when dividing the CNN
layer into slices, as illustrated in Figs. 6.2 and 6.3. To uphold detection accuracy, the
sub-tasks need to exchange information within the data halo [24]. Consequently, the
communication cost between two edge servers is

$$Amt_i^{trans2} = \sum_{j=i}^{M} \left(H_j^{in} + W_j^{in}\right) \times C_j^{in}. \tag{6.6}$$

Next, we use parallel efficiency to measure the influence of selecting layer $i$, i.e.,
$$PE_i = \frac{RPO_i}{CPO_i}. \tag{6.7}$$

The $PE$ values for various parallel entry points across different CNN models are depicted in Fig. 6.4. Essentially, a higher $PE_i$ value signifies greater offloading revenue and lower offloading cost. Consequently, we choose the optimal entry point as $i = \arg\max PE_i$. During this process, the optimal entry point strikes a balance between the revenue and cost caused by offloading.
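To make the selection concrete, here is a minimal Python sketch of Eqs. (6.1)–(6.7): it scores every candidate entry point of a CNN described by per-layer shapes and picks the arg max. The Layer fields and the toy network below are illustrative assumptions, not part of MASS itself.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    h_in: int    # H_j^in, input height
    w_in: int    # W_j^in, input width
    c_in: int    # C_j^in, input channels
    c_out: int   # C_j^out, output channels
    k: int       # K_j, kernel size

def parallel_efficiency(layers, i):
    """PE_i = RPO_i / CPO_i for entry point i (0-based), per Eqs. (6.1)-(6.7)."""
    tail = layers[i:]
    # Eq. (6.3): computation cost of the tail part
    com = sum(l.h_in * l.w_in * (l.c_in * l.k ** 2 + 1) * l.c_out for l in tail)
    # Eq. (6.4): memory cost of the tail part
    mem = sum(l.h_in * l.w_in * l.c_in + l.c_out + l.c_in * l.c_out for l in tail)
    # Eq. (6.5): input data volume uploaded at the entry point
    trans1 = layers[i].h_in * layers[i].w_in * layers[i].c_in
    # Eq. (6.6): data-halo exchange among edge servers
    trans2 = sum((l.h_in + l.w_in) * l.c_in for l in tail)
    return (com + mem) / (trans1 + trans2)

# Toy 4-layer CNN; the optimal entry point maximizes PE_i
net = [Layer(224, 224, 3, 32, 3), Layer(112, 112, 32, 64, 3),
       Layer(56, 56, 64, 128, 3), Layer(28, 28, 128, 256, 3)]
best = max(range(len(net)), key=lambda i: parallel_efficiency(net, i))
print("optimal entry point:", best)
```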
Fig. 6.4 The $PE$ value versus different entry points using Faster R-CNN, SSD, and YOLO, respectively

Fig. 6.5 An example of zero-padding, where the data halo is filled with zeros

6.1.3 Computation Cost Estimation

Normally, it is necessary for sub-tasks in the tail part to exchange data within the data halo of each layer, which produces an extra communication latency, denoted by $Amt_i^{trans2}$. A possible solution is to eliminate this data exchange by employing zero-padding along the edges of CNN slices, as depicted in Fig. 6.5. However, this results in the accumulation of errors in the data halo over layers, leading to a significant degradation in detection accuracy.
To fill this gap, we further introduce uniform sampling. Initially, we uniformly sample some layers from the tail part. Data exchange is permitted only at the sampled layers $i, i+l, i+2l, \cdots$ to periodically correct the accumulated error in the data halo, while the remaining layers use zero-padding to avoid the exchange. The value of $l$ is a nonnegative empirical integer, which greatly reduces communication costs among edge servers without accuracy degradation. The utilization of uniform sampling helps strike a balance in the communication cost among edge servers, which becomes
$$Amt_i^{trans3} = \sum_{j=0}^{\lfloor (M-i)/l \rfloor} \left(H_{i+jl}^{in} + W_{i+jl}^{in}\right) \times C_{i+jl}^{in}. \tag{6.8}$$

Before offloading, it is essential to estimate the computation cost of these sub-tasks on different hardware platforms. Following the approach outlined in [23], we employ theoretical Floating-point Operations (FLOPs) as a metric to gauge the computation complexity of CNNs. The theoretical FLOPs are defined as
$$Amt_1^{theory\_com} = \sum_{j=1}^{M} H_j^{in} \times W_j^{in} \times \left(C_j^{in} \times K_j^2 + 1\right) \times C_j^{out}. \tag{6.9}$$

However, theoretical FLOPs $Amt_1^{theory\_com}$ are unable to measure the completion latency of a sub-task on a specific hardware platform. The completion latency of a sub-task is intricately tied to the hardware platform's configuration, encompassing factors such as CPU frequency, memory size, GPU pipeline, and cache [25]. Therefore, mapping the theoretical FLOPs to experimental FLOPs can be abstracted as a regression problem. We build a cubic polynomial for the regression, i.e.,
$$Amt_1^{real\_com} = \beta_1 \times \left(Amt_1^{theory\_com}\right)^3 + \beta_2 \times \left(Amt_1^{theory\_com}\right)^2 + \beta_3 \times Amt_1^{theory\_com} + \beta_4. \tag{6.10}$$
Here, $\beta_1$, $\beta_2$, $\beta_3$, and $\beta_4$ are the regression parameters.


The sub-tasks are generated based on partitioning ratios, i.e., 1/32, 1/16, 1/8, 1/4, 1/2, and 1. Based on this, we calculate the theoretical FLOPs by Eq. (6.9) and record the experimental FLOPs obtained on the hardware platform, i.e., an NVIDIA Jetson TX2. With these measurements, we can parameterize the four coefficients through the Least Squares Method. The regression accuracy is evaluated, achieving an error of less than 7%. The final results for the parameters are listed in Table 6.1 for Faster R-CNN, SSD, and YOLO.

Table 6.1 Parameters of experimental FLOPs

Model | β1 | β2 | β3 | β4 | Error
Faster R-CNN | 1.07 | −0.347 | 0.368 | −0.085 | 2.764%
SSD | −2.138 | 5.946 | −3.576 | 0.732 | 6.416%
YOLO | 1.369 | −0.68 | 0.246 | 0.059 | 2.232%
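As a sketch of this calibration step (the measured values below are placeholders, not the data behind Table 6.1), the cubic mapping of Eq. (6.10) can be fitted with an ordinary least squares polynomial fit:

```python
import numpy as np

# Placeholder measurements: theoretical FLOPs of the generated sub-tasks
# (normalized partitioning ratios) and hypothetical FLOPs observed on the board.
theory = np.array([1/32, 1/16, 1/8, 1/4, 1/2, 1.0])
measured = np.array([0.05, 0.09, 0.16, 0.31, 0.58, 1.10])  # hypothetical

# Least squares fit of the cubic mapping in Eq. (6.10):
# real = b1*theory^3 + b2*theory^2 + b3*theory + b4
b1, b2, b3, b4 = np.polyfit(theory, measured, deg=3)

def real_flops(t):
    return b1 * t**3 + b2 * t**2 + b3 * t + b4

# Relative fitting error, analogous to the last column of Table 6.1
err = np.abs(real_flops(theory) - measured) / measured
print(f"max relative error: {err.max():.2%}")
```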

6.1.4 Adaptive Offloading

The traditional methods split CNNs into multiple slices with the same size, and each
slice is executed by an edge server, as shown in Fig. 6.6. However, for heterogeneous
edges, we must adaptively offload according to the capacity differences among edge
servers. This adjustment takes into account factors such as computation resources,
memory resources, and communication resources, aiming to minimize detection
latency, as depicted in Fig. 6.7.
The set of sub-tasks in the tail part is denoted by $S = \{S_p \mid p \in \{1, 2, \cdots, P\}\}$, where $P$ is also the number of available edge servers, i.e., $S_p$ is offloaded to edge server $E_p$. The transmission latency of this offloading is
$$T_p^{trans} = \frac{\alpha_p \times \left(Amt_i^{trans1} + Amt_i^{trans3}\right)}{b_p}, \tag{6.11}$$
where $\alpha_p$ and $b_p$ are the task partitioning ratio and the allocated network bandwidth for transmission, respectively.
Based on Eq. (6.10), the computation cost of the tail part is
$$Amt_i^{real\_com} = \beta_1 \times \left(Amt_i^{com}\right)^3 + \beta_2 \times \left(Amt_i^{com}\right)^2 + \beta_3 \times Amt_i^{com} + \beta_4. \tag{6.12}$$

Thus, we calculate the execution latency of sub-task $S_p$ on edge server $E_p$ as follows:
$$T_p^{exec} = \frac{\alpha_p \times Amt_i^{real\_com}}{x_p}, \tag{6.13}$$

Fig. 6.6 The existing model partitioning strategy [13, 15] splits the CNN layer into slices with equal size

Fig. 6.7 Our proposed adaptive sub-task generation strategy

where $x_p$ is the allocated computation resource of $E_p$. Similarly, the corresponding memory consumption is
$$y_p = \alpha_p \times Amt_i^{mem}. \tag{6.14}$$
Overall, the completion latency of sub-task $S_p$ and the average completion latency are
$$T_p = T_p^{trans} + T_p^{exec} \tag{6.15}$$
and
$$\bar{T} = \frac{\sum_{p=1}^{P} T_p}{P}. \tag{6.16}$$
In industrial edge computing systems, we should find the partition ratio $\alpha_p$, network bandwidth $b_p$, computation resource $x_p$, and memory resource $y_p$ for each sub-task, $p \in \{1, 2, \cdots, P\}$, to minimize the completion latency. We formulate the adaptive sub-task generation and offloading problem as follows:
$$\min_{\alpha, x, y, b} \ \frac{\sum_{p=1}^{P} T_p + \sum_{p=1}^{P} (T_p - \bar{T})^2}{P} \tag{6.17a}$$
$$\text{s.t.} \ \sum_{p=1}^{P} \alpha_p = 1, \tag{6.17b}$$
$$0 < b_p \leq B_{E_p}, \ p \in \{1, 2, \cdots, P\}, \tag{6.17c}$$
$$0 < x_p \leq X_{E_p}, \ p \in \{1, 2, \cdots, P\}, \tag{6.17d}$$
$$0 < y_p \leq Y_{E_p}, \ p \in \{1, 2, \cdots, P\}, \tag{6.17e}$$
$$\alpha_p \in [0, 1], \ p \in \{1, 2, \cdots, P\}. \tag{6.17f}$$
Here, $X_{E_p}$, $Y_{E_p}$, and $B_{E_p}$ are the available computation, memory, and communication resources of $E_p$, respectively. Constraint (6.17b) ensures the complete partitioning and offloading of the tail part of the CNN model. Constraints (6.17c), (6.17d), and (6.17e) ensure that the resources used by sub-tasks do not exceed the capacity of the edge servers. The problem can be effectively solved using Sequential Least Squares Programming (SLSQP).
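A hedged illustration of how problem (6.17) might be handed to SciPy's SLSQP solver; the capacities, cost constants, and units below are invented, and the memory variable $y_p$ with constraint (6.17e) is omitted for brevity:

```python
import numpy as np
from scipy.optimize import minimize

# Per-edge capacities (hypothetical numbers): bandwidth and compute
B = np.array([100., 80., 60., 40.])   # B_Ep, Mbps
X = np.array([4., 3., 2., 1.])        # X_Ep, relative GPU capability
P = len(B)
TRANS = 50.0     # Amt_i^{trans1} + Amt_i^{trans3} (illustrative units)
COMP = 200.0     # Amt_i^{real_com}                (illustrative units)

def latency(v):
    """Completion latency of each sub-task, Eqs. (6.11), (6.13), (6.15)."""
    alpha, b, x = v[:P], v[P:2*P], v[2*P:3*P]
    return alpha * TRANS / b + alpha * COMP / x

def objective(v):
    """Eq. (6.17a): mean latency plus a variance-like balancing term."""
    T = latency(v)
    return (T.sum() + ((T - T.mean()) ** 2).sum()) / P

cons = [{"type": "eq", "fun": lambda v: v[:P].sum() - 1.0}]   # Eq. (6.17b)
bounds = ([(0.0, 1.0)] * P                                    # alpha, Eq. (6.17f)
          + [(1e-3, Bi) for Bi in B]                          # b_p,  Eq. (6.17c)
          + [(1e-3, Xi) for Xi in X])                         # x_p,  Eq. (6.17d)

v0 = np.concatenate([np.full(P, 1.0 / P), B * 0.9, X * 0.9])  # feasible start
res = minimize(objective, v0, method="SLSQP", bounds=bounds, constraints=cons)
print("partition ratios:", np.round(res.x[:P], 3))
```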

6.1.5 Performance Evaluation

Figure 6.8 exhibits the testbed used in this chapter. We employ a mobile phone as the mobile device, and two NVIDIA Jetson AGX Xavier and two NVIDIA Jetson TX2 development boards as edge servers to capture the heterogeneity. The edge servers communicate with each other via 1 Gbps Ethernet cables, and with the mobile device via 5 GHz WiFi.
We implement the mobile side functions on the mobile phone. It constantly captures video frames and subsequently offloads them to the edge servers for processing. We use three popular object detection models: Faster R-CNN, SSD, and YOLO. To ensure repeatability and consistency, we utilize the COCO 2017 dataset [26] for validation.

6.1.5.1 Object Detection Accuracy

The object detection accuracy of MASS, implemented with Faster R-CNN, SSD,
and YOLO, is examined, where the numbers of edge servers vary from 1 to 4. The
parameter l is set to 50, 35, and 53 for Faster R-CNN, SSD, and YOLO, respectively.
MASS achieves nearly identical detection accuracy compared to the original models
with one edge server. Figure 6.9 exhibits the result for the same frame with the
original Faster R-CNN model and MASS based on Faster R-CNN, respectively, each
of which can detect all vehicles. Note that, in Fig. 6.10, there are multiple bounding
boxes for one object achieved by MASS. This is caused by the feature pyramid
network (FPN) in CNN, which is executed among multiple servers. Consequently,

Fig. 6.8 Testbed with four edges



Fig. 6.9 Object detection results of the original Faster R-CNN model

Fig. 6.10 Object detection results of MASS based on Faster R-CNN

Fig. 6.11 The object detection accuracy of MASS based on Faster R-CNN, SSD, and YOLO

the result merging module is designed for fusing these boxes by the non-maximum
suppression (NMS) algorithm.
Figure 6.11 depicts the average object detection accuracy of MASS, where the number of edge servers varies from 1 to 4. When comparing MASS with the original object detection models, a minimal degradation in object detection accuracy is observed for MASS based on Faster R-CNN and YOLO, staying below 1.4% with two edge servers. Notably, MASS based on SSD experiences an accuracy degradation of less than 0.1%. The detection accuracy of MASS based on Faster R-CNN, SSD, and YOLO decreases by 2.9%, 0.2%, and 1.9%, respectively, with four edge servers. Also, in this case, the accuracy of MASS under SSD exceeds that under YOLO. This highlights the varying sensitivity to partitioning of different object detection models. The setting of l has an impact on detection accuracy, but the accuracy degradation can be ignored when l is lower than 36.

6.1.5.2 Object Detection Latency

Figure 6.12 shows the object detection latency of MASS for the above three CNN models, where the number of edge servers varies from 1 to 4. Compared to the original model, MASS based on Faster R-CNN reduces object detection latency by 40.98%, 56.54%, and 64.83% when the number of edge servers is 2, 3, and 4, respectively. Furthermore, when utilizing four edge servers, MASS based on SSD and YOLO achieves latency reductions of 60.97% and 46.4%. This demonstrates that MASS effectively reduces the detection latency for both two-stage (e.g., Faster R-CNN) and one-stage models (e.g., SSD and YOLO).
Additionally, the acceleration ratio for MASS based on Faster R-CNN and SSD remains consistent. When the number of edge servers is limited, MASS based on SSD outperforms that based on Faster R-CNN. Nevertheless, MASS based on Faster R-CNN gradually exceeds that based on SSD and YOLO with an increasing number of edge servers. This is because of the heavier computation requirements of the two-stage Faster R-CNN model, allowing for more significant gains with additional edge servers. Conversely, one-stage object detection models generally have shorter completion times than two-stage models; for instance, Faster R-CNN requires 1.429 s to detect an image, whereas SSD requires 0.238 s. Consequently, while more edge servers can greatly reduce the execution latency, they also result in a more substantial transmission latency.

Fig. 6.12 The object detection latency of MASS based on Faster R-CNN, SSD, and YOLO, where the number of edge servers varies from 1 to 4 and l is set to 50, 35, and 53

6.1.5.3 The Performance of Adaptive Sub-task Generation and Offloading

To verify the effectiveness of the proposed adaptive sub-task generation and offloading strategy, we conduct a comparison with random and greedy offloading strategies, whose performances are depicted in Figs. 6.13 and 6.14 for a scenario with four edge servers. In the random offloading strategy, the detection task is randomly divided into P parts, which are then offloaded to the P edge servers. The greedy offloading strategy, on the other hand, divides and offloads the sub-tasks according to the resources of the edge servers.
Figure 6.13 reveals that the proposed algorithm achieves a lower completion latency for all sub-tasks than the random and greedy algorithms, producing up to a 46.86% completion latency reduction. Furthermore, as illustrated in Fig. 6.14, the standard deviation with the proposed algorithm is the lowest, measuring less than 0.1, when compared to random offloading and greedy offloading.

Fig. 6.13 The completion latency of the sub-tasks on four edge servers

Fig. 6.14 The STD of sub-tasks completion latency of four edge servers

Fig. 6.15 The object detection accuracy based on Faster R-CNN with different l, entry points i on
four edge servers

6.1.5.4 The Impact of Uniformly Sampled Zero-Padding Scheme

Figure 6.15 displays the object detection accuracy of MASS based on Faster R-CNN with different values of l and entry points. For a given entry point i, increasing l leads to a decline in detection accuracy due to the inevitable cumulative error in the uniformly sampled execution of sub-tasks. The value of l is positively correlated with the accumulated errors, so a larger l causes a greater degradation in accuracy. Additionally, the detection accuracy is negatively correlated with the position of entry point i. This is because the closer to the last layer of the CNN, the more global knowledge is required to extract comprehensive object features.
In general, when l < 36, MASS maintains a relatively high detection accuracy, proving the effectiveness of the uniformly sampled zero-padding. It is noteworthy that, for cases where the entry point i ≥ 24, Fig. 6.15 does not display detection accuracy for certain l values. This omission is due to the total number of CNN layers in the Faster R-CNN model being less than i + l in these instances.
Experimental results indicate that MASS achieves up to a 64.83% reduction in detection latency with a low (i.e., around 3%) accuracy decline. In future studies, we intend to leverage machine learning techniques to further enhance sub-task offloading decisions.

6.2 Point Cloud Oriented Object Detection

6.2.1 Statement of Problem

In industrial edge computing systems, the perception ability of a single device (e.g., vehicle) is greatly restrained by the sensors' capacity, such as their coverage. Hence, cooperative perception [27] is regarded as an efficient way to compensate for the lack of data caused by capacity limitations through cooperation with other devices. This also benefits from V2V communication, which supports data exchange among neighboring devices, as depicted in Fig. 6.16.
In many perception systems, the core functionality revolves around 3D object
detection based on data with the format of point clouds. This process involves
estimating 3D bounding boxes that provide information about the size, 3D pose,
and class of objects. Its primary goal is to precisely discover the target objects in a
certain frame of point clouds, making it a central challenge in autonomous driving
applications. DNNs are widely used in 3D object detection. Many voxel-based models, such as SECOND and Pointpillars, have garnered significant attention due to their strong performance. As illustrated in Fig. 6.17, the detection procedure is:
1. The voxelization module takes the point clouds as input and partitions their 3D
space into uniform-spaced voxels. Hence, the point clouds can be grouped by
their located voxels.
2. The voxel feature encoding module is used to embed the individual point clouds
in each voxel into a point-wise feature space. The embedded features will then
aggregate with the local features.
3. The sparse convolution layer module can transform the voxel feature into a high-
dimension representation via 2D/3D sparse convolution.

Fig. 6.16 An illustration of cooperative perception helps autonomous vehicles extend sensing
range and improve detection precision

Fig. 6.17 A typical procedure of voxel-based 3D object detection models


4. The Region Proposal Networks (RPNs) construct a high-resolution feature map based on the volumetric representations. It is finally used to learn the box regression of the object and its corresponding classification confidence.
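As a toy illustration of the voxelization module (step 1 above), the following sketch groups a point cloud into uniformly spaced voxels; the grid bounds, voxel sizes, and random cloud are arbitrary assumptions:

```python
import numpy as np

def voxelize(points, voxel_size=(0.2, 0.2, 0.4),
             pc_range=(-40, -40, -3, 40, 40, 1)):
    """Group points (N, 3+) into uniform voxels, as in step 1 above.

    Returns a dict mapping voxel grid index (ix, iy, iz) -> array of points.
    """
    xmin, ymin, zmin, xmax, ymax, zmax = pc_range
    mask = ((points[:, 0] >= xmin) & (points[:, 0] < xmax) &
            (points[:, 1] >= ymin) & (points[:, 1] < ymax) &
            (points[:, 2] >= zmin) & (points[:, 2] < zmax))
    pts = points[mask]
    idx = np.floor((pts[:, :3] - np.array([xmin, ymin, zmin])) /
                   np.array(voxel_size)).astype(int)
    voxels = {}
    for i, key in enumerate(map(tuple, idx)):
        voxels.setdefault(key, []).append(pts[i])
    return {k: np.stack(v) for k, v in voxels.items()}

# Example: 1000 random points (x, y, z, reflectance) in front of the vehicle
cloud = np.random.uniform([-10, -10, -2, 0], [10, 10, 0.5, 1], size=(1000, 4))
vox = voxelize(cloud)
print(f"{len(vox)} non-empty voxels")
```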
Normally, cooperative perception is classified into three categories according to
which type of data, i.e., raw, feature, and object, is shared among vehicles.
Raw-level cooperation indicates the exchange of raw point cloud data between vehicles [28]. Each vehicle aligns and fuses the received raw data with its own, creating a new point cloud frame. After that, 3D object detection is applied to the new point cloud to generate results. This can achieve high detection precision by preserving the most relevant information, but it also involves a substantial data size. For instance, a commercial 64-beam LiDAR collects point clouds (approximately 2 MB per frame) at 5–20 Hz [29], imposing significant pressure on V2V communication bandwidth. The transmission of excessive data beyond the V2V communication bandwidth, known as bandwidth saturation [30], leads to latency or information loss, ultimately impacting detection precision.
Object-level collaboration is executed after the detection results based on the vehicle's local point clouds are obtained [31]. Initially, each vehicle processes point clouds using its 3D object detection models. Subsequently, the vehicle exchanges its object data. Compared to raw data, object data is significantly smaller, typically by 100x or even 1000x [32]. However, object-level cooperative perception sacrifices detection precision due to the loss of a substantial amount of valuable information. Despite consuming only a fraction of the available V2V communication bandwidth in most cases, this approach is subject to bandwidth underutilization.
To address the above drawbacks, recent proposals, such as feature-level cooperative perception (e.g., F-Cooper [33] and FS-COD [34]), aim to strike a balance between accuracy and bandwidth resources. Features are shared among vehicles, and each vehicle aligns and fuses the received features with its own feature data. The processed data is then used for classification and regression to detect objects. Feature-level cooperative perception demonstrates improvements in both detection precision and data transmission, positioning it between raw- and object-level cooperative perception. However, despite these advancements, the data transmitted via the V2V link under dynamic networking conditions still leads to bandwidth saturation or underutilization, impacting detection precision and, subsequently, driving safety.
Based on this, we conduct an analysis of the detection precision and bandwidth
requirements associated with raw-, feature-, and object-level cooperative percep-
tions. For this analysis, we randomly select 1004 frames of consecutive point clouds
in a single-vehicle perspective from the KITTI dataset, obtained from a Velodyne
64-beam LiDAR.
To simulate the data in a two-vehicle perspective [28], we utilize two point cloud
frames from distinct time segments.
Fig. 6.18 Performance of Pointpillars in raw-level, feature-level, and object-level cooperative perceptions

We evaluate the average precision of raw-, feature-, and object-level cooperative perceptions with sufficient V2V bandwidth, as illustrated in Fig. 6.18a. It indicates that exchanging data between vehicles contributes to enhancing vehicles' perception ability, regardless of the 3D object detection model employed. For instance, with the Pointpillars model, the raw-, feature-, and object-level detection achieve 60.9%, 55.2%, and 52.9% precision, respectively, outperforming single-vehicle detection (46.4%). The average precision of raw-level perception is the highest, while that of object-level perception is the lowest.
Here, we give a detailed analysis of the bandwidth requirements of the above strategies. We use
$$t_{e2e} = t_{transmit} + t_{process} \tag{6.18}$$
to denote the end-to-end latency with cooperation, where $t_{transmit}$ and $t_{process}$ are the transmission latency via V2V and the processing latency of the object detection model, respectively. Taking the urgent real-time requirement into account, $t_{e2e}$ is set to 100 ms, i.e., the LiDARs scan at 10 fps. $t_{process}$ in the SECOND, Pointpillars, PartA2-Net, and PV-RCNN models is 50.61 ms, 16.46 ms, 80.11 ms, and 79.66 ms, respectively, tested on a desktop (Intel i7 CPU with NVIDIA 1080 Ti GPU). The bandwidth requirements of raw-, feature-, and object-level cooperative perceptions are depicted in Fig. 6.18b, considering a WiFi 2.4G V2V link. The observation is that the bandwidth requirements remain stable, resulting in frequent occurrences of either bandwidth saturation or underutilization under dynamic Channel State Information (CSI).
In Fig. 6.18a, it is evident that different levels lead to distinct average precision. Generally, the more data exchanged among vehicles, the higher the average precision achieved. However, the existing cooperative schemes deviate from the optimal as they suffer from long transmission latency and information loss caused by bandwidth saturation or underutilization.
Hence, we introduce a novel cooperative perception scheme, named ML-Cooper,
toward more flexible and adaptive perception within limited bandwidth. Specifically,
ML-Cooper bridges the transmitted data with the dynamic CSI of the V2V link.

6.2.2 Point Cloud Partition

Figure 6.19 [35] exhibits the architecture of ML-Cooper. Two vehicles, distinguished as sender and receiver, collect point clouds from their own LiDAR sensors and connect to each other via a V2V link. The point cloud frame is processed into feature and object data through the 3D object detection model. ML-Cooper allows vehicles to perform hybrid data sharing, i.e., the data at each level can be sent to other vehicles. For the sender, its processed point cloud data is divided into three parts, each of which contains partial raw, feature, and object data. After this data is sent to the receiver, the receiver executes an alignment based on the positions and angles and fuses the data with its local data. Finally, the fused data is fed to the 3D detection model. Additionally, ML-Cooper can be applied to several SOTA 3D object detection models, e.g., SECOND, Pointpillars, and so on.1 Taking into account the different influences of the data levels on precision, we aim to balance the ratios of the three parts within the limited bandwidth to improve the average precision.

Fig. 6.19 ML-Cooper architecture

Fig. 6.20 Point clouds partition methods
The most important module of ML-Cooper is point cloud partitioning. Different from 2D images, the point cloud frame is sparse, irregular, orderless, and continuous. Thus, as depicted in Fig. 6.20, we design two specific point cloud partitioning methods, i.e., angle-based and density-based partitioning:
• Angle-based partitioning pays attention to the view directly ahead of the vehicle, and thus the raw data here is transmitted at the raw level to avoid information loss.
• Density-based partitioning focuses on the point clouds far away from the vehicle, where the density of points is ultra-low. Similar to angle-based partitioning, the point clouds in the far range are sent to the other vehicle at the raw level.
Both the angle-based and density-based partition methods reach a high detection precision with low complexity, which is detailed in Sect. 6.2.6. Indeed, more sophisticated partition methods may enhance perception performance. However, such enhancements come at the cost of increased complexity in cooperative perception systems, a topic that warrants further exploration in future studies. It is essential to recognize that once the boundaries of the three parts are established, extracting the corresponding data is straightforward, as both raw and feature data are point-wise. However, object data, being non-point-wise, introduces a challenge, particularly when a certain object straddles the borderline between two parts. In such cases, it is regarded as the object data of the third part.
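A minimal sketch of the angle-based idea, under the assumption that the partition proportions map to azimuth sectors around the heading direction (the mapping itself is illustrative, not ML-Cooper's exact rule):

```python
import numpy as np

def angle_partition(points, alpha, beta):
    """Split a point cloud by azimuth into raw/feature/object regions.

    alpha, beta, and gamma = 1 - alpha - beta are the partition proportions
    decided by ML-Cooper; assigning them to angular sectors around the
    heading direction (+x) is an illustrative assumption.
    """
    az = np.abs(np.arctan2(points[:, 1], points[:, 0]))  # 0 = straight ahead
    raw_lim = alpha * np.pi            # innermost sector: share raw data
    feat_lim = (alpha + beta) * np.pi  # next sector: share feature data
    raw = points[az < raw_lim]
    feature = points[(az >= raw_lim) & (az < feat_lim)]
    obj = points[az >= feat_lim]       # outermost sector: share object data
    return raw, feature, obj

cloud = np.random.uniform(-40, 40, size=(2000, 3))
raw, feat, obj = angle_partition(cloud, alpha=0.3, beta=0.4)
print(len(raw), len(feat), len(obj))
```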

1 Note that ML-Cooper can also be extended to multiple vehicle scenarios.


6.2.3 Data Alignment

Normally, each vehicle has a different view of the environment, depending on many factors, e.g., its location. The received data of a receiver should be aligned with the sender's view, which greatly influences the fusing efficiency. This means that, in practice, extra information from the sender should also be transmitted, e.g., the LiDAR configuration, GPS/IMU data, and so on. The IMU helps the receiver to obtain a transformation matrix as follows:
$$R = R_z(\theta_{yaw}) R_y(\theta_{pitch}) R_x(\theta_{roll}), \tag{6.19}$$
where $R_z(\theta_{yaw})$, $R_y(\theta_{pitch})$, and $R_x(\theta_{roll})$ are three basic $3 \times 3$ rotation matrices, and $\theta_{yaw}$, $\theta_{pitch}$, and $\theta_{roll}$ represent the differences in yaw, pitch, and roll angles, respectively, as shown in Fig. 6.21.


We use $(X_s, Y_s, Z_s)$ and $(X_s', Y_s', Z_s')$ to represent the coordinates of one point in the local views of the sender and receiver, respectively. In this case, the aligned data is obtained by
$$\begin{bmatrix} X_s' \\ Y_s' \\ Z_s' \end{bmatrix} = R \times \begin{bmatrix} X_s \\ Y_s \\ Z_s \end{bmatrix} + \begin{bmatrix} \Delta d_x \\ \Delta d_y \\ \Delta d_z \end{bmatrix}, \tag{6.20}$$
where $(\Delta d_x, \Delta d_y, \Delta d_z)$ is the GPS gap between the sender and receiver. We regard the GPS data as accurate, since advanced localization technologies have reached centimeter-level accuracy [36, 37]. Once the LiDAR sensor is properly calibrated, the feature and object data can be aligned in the same manner, because the features retain the location information of the objects.
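A small sketch of Eqs. (6.19) and (6.20) with NumPy; the pose difference and point set below are hypothetical:

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def align(points, yaw, pitch, roll, gps_gap):
    """Align the sender's points to the receiver's view, Eqs. (6.19)-(6.20)."""
    R = rot_z(yaw) @ rot_y(pitch) @ rot_x(roll)   # Eq. (6.19)
    return points @ R.T + np.asarray(gps_gap)     # Eq. (6.20), row-vector form

# Hypothetical pose difference: 5 degrees of yaw and a 2 m offset in x
pts = np.random.uniform(-20, 20, size=(100, 3))
aligned = align(pts, np.deg2rad(5.0), 0.0, 0.0, (2.0, 0.0, 0.0))
```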

Fig. 6.21 The coordinate systems of vehicles



6.2.4 Multilevel Data Fusion

After transformation, we fuse the three levels of data as follows (a minimal code sketch follows the list):
• Raw Data Fusion: Generally, the set of LiDAR point clouds is composed of 3D coordinates and other attributes, e.g., color or reflection. The set of fused point clouds $P_f$ is obtained by fusing the receiver's original point cloud set $P_r$ and the aligned raw data $P_s'$, i.e.,
$$P_f = P_r \cup P_s'. \tag{6.21}$$
This means that, after alignment, the received points are added to the raw data point set of the receiver.
• Feature Data Fusion: We use the voxel feature fusion method for encoded feature map fusion [33]. The non-empty voxels of the sender and receiver are transformed into two 128-dimension vectors $V_r = \{V_r^i \mid i = 1, 2, \cdots, 128\}$ and $V_s = \{V_s^i \mid i = 1, 2, \cdots, 128\}$ for fusion with an element-wise maxout. It can efficiently estimate the importance of features for cooperation, and thus the fused features $V_f$ can be obtained by
$$V_f^i = \max\left(V_r^i, V_s^i\right), \ i = 1, \ldots, 128. \tag{6.22}$$
• Object Data Fusion: The receiver adds the newly detected objects to its results, while repeated objects take the maximum confidence score from the local detection and the sender's detection.
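The three fusion rules can be sketched as follows; the object record format (an id plus a confidence score) is a made-up convention for illustration:

```python
import numpy as np

def fuse_raw(p_r, p_s_aligned):
    """Eq. (6.21): union of the receiver's points and the aligned sender points."""
    return np.concatenate([p_r, p_s_aligned], axis=0)

def fuse_features(v_r, v_s):
    """Eq. (6.22): element-wise maxout over the 128-d voxel feature vectors."""
    return np.maximum(v_r, v_s)

def fuse_objects(local, remote):
    """Keep new objects; for duplicates, keep the higher confidence score.

    Objects are assumed here to be dicts {"id": ..., "score": ...}, where
    duplicates share the same id after alignment (an illustrative convention).
    """
    merged = {o["id"]: o for o in local}
    for o in remote:
        if o["id"] not in merged or o["score"] > merged[o["id"]]["score"]:
            merged[o["id"]] = o
    return list(merged.values())
```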

6.2.5 K-Soft Actor–Critic Algorithm

Normally, it is difficult to accurately estimate the impact of different levels of data on the final detection precision. Here, we detail how ML-Cooper determines the values of $\alpha$, $\beta$, and $\gamma$ in a highly dynamic vehicular network, which is solved by a modified SAC algorithm (the details can be found in Sect. 4.3.3).
The modified SAC algorithm, named K-SAC, is an effective way to solve the exploration–exploitation dilemma in finding the optimal $\alpha$, $\beta$, and $\gamma$. In K-SAC, key frames are distinguished from the other frames to enhance exploitation rather than exploration. More specifically, these frames tend to use the best-known partition strategy. In K-SAC, the middle point cloud frame in a consecutive sequence of a certain length is regarded as the key frame [38].

At the beginning of time slot t,2 the sender and receiver obtain their point cloud frames, and the sender is regarded as an agent engaging with an observed state $s_t$ within discrete decision time slots. It then takes an action $a_t$ based on its policy $\pi_\theta$, parameterized by neural networks with parameters $\theta$, while an immediate reward $r_t$ and the next state $s_{t+1}$ are returned according to the dynamic network conditions. The objective is to find an optimal policy $\pi_\theta^*$ in each time slot with maximum discounted cumulative reward $R_0 = \sum_{t=0}^{\infty} \delta^t r_t$. Here, $\delta \in [0, 1)$ is the long-term discounting factor. Next, we detail the state space, action space, and reward below:


State Space The state space comprises the information observed from the system by the agent. We use $s_t$ to represent this state, including the CSI of the V2V link. Here, the instantaneous CSI of the V2V link is simulated via channel estimation [39].
Action Space The action $a_t$ to be made is the partition proportions of raw, feature, and object data, denoted by $\alpha$, $\beta$, and $\gamma$, respectively.
Reward The reward is defined as
$$r_t = \sum_{n=0}^{M-1} (\phi_{n+1} - \phi_n) \, p_{interp}(\phi_{n+1}), \tag{6.23}$$
where
$$p_{interp}(\phi) = \max_{\tilde{\phi} \geq \phi} p(\tilde{\phi}), \tag{6.24}$$
where M is the number of estimated bounding boxes, $p(\tilde{\phi})$ is the measured precision at recall $\tilde{\phi}$, and $p_{interp}(\phi)$ is a smoothed version of the precision curve $p(\phi)$ [40]. The recall value $\phi_i \in \{\phi_1, \ldots, \phi_M\}$ is obtained by setting the confidence threshold equal to the confidence score of the i-th bounding box in the estimated bounding box set when sorted by confidence score in descending order.
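A compact sketch of this reward, i.e., the interpolated average precision of Eqs. (6.23) and (6.24); the precision/recall points below are hypothetical and assumed sorted by descending confidence:

```python
import numpy as np

def interpolated_ap(precision, recall):
    """Eqs. (6.23)-(6.24): area under the interpolated precision-recall curve.

    precision[i], recall[i] correspond to thresholding at the confidence of
    the (i+1)-th highest-scoring box, so recall is non-decreasing.
    """
    p = np.asarray(precision, dtype=float)
    phi = np.concatenate([[0.0], np.asarray(recall, dtype=float)])
    # p_interp(phi) = max precision at any recall >= phi (running max from right)
    p_interp = np.maximum.accumulate(p[::-1])[::-1]
    # Eq. (6.23): sum of precision weighted by recall increments
    return float(np.sum((phi[1:] - phi[:-1]) * p_interp))

# Hypothetical precision/recall points from 5 detections
print(interpolated_ap([1.0, 0.5, 0.67, 0.75, 0.6], [0.2, 0.2, 0.4, 0.6, 0.6]))
```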
The K-SAC agent aims to find the policy $\pi(a|s)$ that also maximizes an entropy term $-\log \pi(a_t|s_t)$, which encourages exploration of the policy, i.e.,
$$L(\pi) = \mathbb{E}\left\{\left.\sum_{t=0}^{\infty} \delta^t \left[r_t - \lambda (1 - K_t) \log \pi(a_t|s_t)\right]\right|\pi\right\}. \tag{6.25}$$
Here, $K_t$ indicates whether or not the frame is a key frame, distinguishing key and non-key frames. $\lambda$ is a temperature parameter used for balancing the importance of the entropy against the system reward. Since the entropy term carries the weight $(1 - K_t)$, it is suppressed for key frames, which therefore favor exploiting the best-known partition strategy, and the overall impact of the entropy term is smoothed.

2 Here, each time slot lasts for $\Delta t = 100$ ms.



6.2.6 Performance Evaluation

In this section, the KITTI dataset [41] and the dataset collected from two real
vehicles are used for the evaluation of ML-Cooper. Also, we compare ML-Cooper
with the following benchmarks:
• Cooper [28] is a raw-level cooperative perception method, which shares and
fuses the raw point clouds collected from different vehicles.
• F-Cooper [33] takes the processed feature maps for fusion, which greatly
reduces the bandwidth requirement.
• L3 [42] is a typical object-level cooperative perception method. The resource-
limited vehicle broadcasts its local sensing results to other vehicles.
• AFS-COD [43] is a feature-level cooperative perception method. It adaptively
transmits and aggregates feature maps with different sizes, where the dynamic
bandwidth is taken into account.
We introduce an adaptive bandwidth mechanism to Cooper and F-Cooper,
enabling them to selectively share only a portion of the raw or feature data. The
performance of these modified schemes, denoted as Cooper-BA and F-Cooper-BA,
respectively, is then compared with that of ML-Cooper. To ensure the fairness of the
comparison, the 3D object detection models are consistently set to be the same.

6.2.6.1 Evaluations on KITTI Dataset

Experimental Setting We select 1004 frames of consecutive point clouds from the
KITTI dataset, captured by a vehicle equipped with a Velodyne 64-beam LiDAR
sensor. The 3D object detection models are executed on a desktop system featuring
an Intel i7-8700 CPU, 48 GB memory, 240 GB+1 TB hard disk, NVIDIA 1080
Ti GPU, running Ubuntu 18.04 with a Linux 5.4.0 kernel. Since the KITTI data
originates from a single vehicle, we utilize two point cloud frames from different
time segments to simulate data generated from two vehicles [28, 33]. The feasibility
of this approach is demonstrated in Figs. 6.22 and 6.23.
As the vehicle moves, two subsequent point cloud frames are collected in time slots t1 (Fig. 6.22a) and t2 (Fig. 6.22b), respectively. The vehicle uses SECOND to detect objects, where the ground truth and detected results are bounded by green and red boxes, respectively. Figure 6.22a exhibits three objects whose point clouds are in the far range and thus sparse, and which are consequently not detected. In our experiment, these two frames are regarded as two independent point cloud frames collected by two different vehicles, i.e., sender and receiver. It can be seen from Fig. 6.23 that the detection accuracy after sharing raw, feature, and object data greatly improves compared with the detection results from the perspective of a single vehicle.
Experimental Results The average precision results for Cooper, F-Cooper, L3,
Cooper-BA, F-Cooper-BA, and ML-Cooper are depicted in Fig. 6.24 using four
different 3D detection models and considering three distinct V2V link scenarios:

Fig. 6.22 Two frames of point clouds from KITTI dataset, and their detection results

Fig. 6.23 Cooperative detection by sharing different levels of data

Cellular 4G, WiFi 2.4G, and WiFi 5G, all under the angle-based point clouds
partition scheme. Notably, L3 consistently achieves the same performance across
the three V2V link channels, indicating that the small size of object data ensures
successful transmission from the sender to the receiver vehicle. Conversely, as
bandwidth increases, both Cooper and F-Cooper show improved average preci-
sion. For instance, Cooper with Pointpillars exhibits average precision values of
53.3%, 54.6%, and 60.8% in the Cellular 4G, WiFi 2.4G, and WiFi 5G channels,
respectively. Similarly, F-Cooper with SECOND achieves average precision values
of 57.1%, 57.4%, and 58.5% in the three channels, respectively. This improvement
is attributed to the higher bandwidth facilitating more data transmission, thereby
enhancing perception performance. However, Cooper’s average precision with
PartA2-Net and PV-RCNN remains the same across the channels due to these mod-
els requiring more processing time, leaving limited time for data transmission and
leading to consistent bandwidth saturation. Interestingly, in some cases, Cooper’s
Fig. 6.24 The average precision of Cooper, F-Cooper, L3, AFS-COD, Cooper-BA, F-Cooper-BA, and ML-Cooper with four different 3D detection models

Fig. 6.25 Cooperative detection results, where the bandwidth is 150 Mbps. The green and red
boxes represent the ground truth and detected cars, respectively. The yellow and red shadows
indicate the raw point clouds data and feature data received from the sender, respectively

average precision is even lower than that of F-Cooper and L3 due to bandwidth
saturation.
In Fig. 6.24, although Cooper-BA and F-Cooper-BA show slight improvements
in average precision compared to Cooper and F-Cooper, respectively, they still fall
short of ML-Cooper’s performance. The incremental improvement is attributed to
the fact that a certain portion of information is missing in Cooper-BA and F-Cooper-
BA, resulting in completely undetected objects in the scene. Figure 6.25 provides
visualizations of cooperative perception results for Cooper-BA, F-Cooper-BA, and
ML-Cooper, where the bandwidth is set to 150 Mbps, and the angle-based partition
scheme is applied. It is evident that 5 and 4 objects are not detected in Fig. 6.25a
and b, respectively, because the sensing data from the edge is not shared by the
sender. Consequently, the receiver must rely solely on its own sensing data for
object detection. In contrast, Fig. 6.25c illustrates that ML-Cooper can detect more
objects by supplementing the missing raw data with feature data or object data.
This approach enables the sender to provide maximum assistance to the receiver,
resulting in superior perception performance.
For ML-Cooper and AFS-COD, a feature-level cooperative perception method,
it is evident in Fig. 6.25c that AFS-COD outperforms F-Cooper by reducing
data discarding. However, despite AFS-COD’s ability to adjust the size of the
transmitted feature, it still cannot surpass the extreme detection precision achieved
by ML-Cooper. Moreover, since the data size per channel remains essentially fixed,
AFS-COD struggles to accurately adapt to continuous bandwidth variations.
The results in Fig. 6.24 further demonstrate that ML-Cooper consistently
achieves the highest average precision across all cases. As mentioned earlier,
ML-Cooper optimally utilizes the available bandwidth of the V2V link in each time
slot by dynamically adjusting the values of α, β, and γ . This approach effectively
eliminates the impact of bandwidth saturation and underutilization, leading to
superior performance.
Figure 6.26 depicts the average precision of angle-based and density-based ML-
Cooper, where the bandwidths vary from 10 to 1000 Mbps. It is observed that the
performances of the two methods are similar to each other.

Fig. 6.26 The performance comparison of angle-based and density-based point clouds partition
methods

Fig. 6.27 The two autonomous vehicles used in the experiments

6.2.6.2 Evaluation on Dataset Collected from Two Real Vehicles

Evaluating ML-Cooper on a dataset collected from two real vehicles adds realism
to the assessment. This approach considers factors such as sensor measurements at
different timestamps and the obstruction of the view of the vehicle behind by the
one in front. It addresses some of the limitations associated with simulating vehicle-
to-vehicle cooperation using frames from the KITTI dataset.
Experimental Setting We use two Great Wall WEY VV7 vehicles (Fig. 6.27) to
collect images and point clouds used for this experiment. The configurations of the
Table 6.2 Configurations of the autonomous vehicles

Item | Specification | Quantity
360° LiDAR | RoboSense-16 | 1
Front-view camera | Sekonix SF332X-10X | 3
Surround-view camera | Sekonix SF332X-10X | 4
Radar | Delphi ESR | 1
Inertial and GPS sensor | CGI-610 | 1
Computing device | NVIDIA Jetson AGX Xavier | 2
DSRC transceiver | WB-L20B | 1

vehicles are listed in Table 6.2. The common scenarios where we collect the data are as follows:
• Multilane roads. This urban scene is quite common, characterized by numerous
dynamic vehicles driving at high speeds, with car following being a frequent
occurrence. Such complex traffic scenarios are ideal for testing the performance
of our system.
• Road intersections. Another typical scenario is a busy road intersection, where
vehicles congregate in large numbers and congestion easily occurs. Due to the
diverse behaviors of traffic participants and the complexity of traffic conditions
at intersections, real-time cooperative perception plays a crucial role in ensuring
driving safety. Therefore, we include this scenario as one of our test cases.
• Parking lots. This is a crowded environment with numerous obstacles. As busy aboveground parking lots are representative of congested areas, we include this scenario as one of our test cases.

Experimental Results The average precision of Cooper, F-Cooper, L3, Cooper-BA, F-Cooper-BA, AFS-COD, and ML-Cooper using four different 3D detection
models, with the V2V link being DSRC, is illustrated in Fig. 6.28. In this scenario,
the angle-based point cloud partition method is adopted for ML-Cooper. Notably,
Cooper and Cooper-BA exhibit the poorest performance among the schemes.
This can be attributed to frequent “bandwidth saturation” due to the low average
bandwidth of DSRC, resulting in the unsuccessful transmission of raw data to
enhance perception ability with the Cooper scheme. However, the performance
of Cooper-BA improves due to reduced information loss. Similarly, F-Cooper-BA
outperforms F-Cooper. It is noteworthy that the average precision of F-Cooper and
F-Cooper-BA with Pointpillars is the same, i.e., both are 42.7%.
On the contrary, L3 fails to achieve the best performance due to bandwidth underutilization, as evidenced by its lower average precision compared to F-Cooper. For instance, with SECOND and Pointpillars, the average precision of L3 is 43.8% and 40.3%, respectively, both lower than that of F-Cooper. However, with PartA2-Net and PV-RCNN, the average precision of L3 surpasses that of F-Cooper but remains lower than that of F-Cooper-BA. This improvement is attributed
to the bandwidth adaptation mechanism, which alleviates the bandwidth saturation problem encountered by F-Cooper in these cases.

Fig. 6.28 The average precision of Cooper, F-Cooper, L3, AFS-COD, Cooper-BA, F-Cooper-BA, and ML-Cooper with four different 3D object detection models and DSRC V2V link
Furthermore, AFS-COD's performance is generally inferior to that of F-Cooper-BA. For instance, with Pointpillars, PartA2-Net, and PV-RCNN, the average precision of AFS-COD is 42.3%, 51.5%, and 54.7%, respectively, while that of F-Cooper-BA with the same models is 42.7%, 53.2%, and 55.6%. This discrepancy arises because the 16-beam point clouds are already quite sparse, and the feature map is even sparser. In particular, the small feature map obtained after multilayer convolution contains too little information to be of significant assistance.
Indeed, the superior performance of ML-Cooper across various scenarios underscores its adaptability and efficiency in handling bandwidth variations. Its ability to dynamically adjust data transmission, ensure data integrity, and preserve essential information positions ML-Cooper as a robust solution for cooperative perception in V2V communication. The flexibility of ML-Cooper in optimizing the trade-off between data richness and bandwidth constraints makes it a promising approach for enhancing perception accuracy and overall safety in vehicular networks.
The performance of angle-based and density-based point clouds partitioning
methods with the V2V link bandwidth varying from 2 Mbps to 50 Mbps is depicted
in Fig. 6.29. It is observed that the average precision of ML-Cooper with the two
partition methods is quite close to each other.
The experimental results demonstrating ML-Cooper’s superior performance on
both synthetic and real-world datasets highlight its effectiveness in diverse V2V
channel conditions. The potential for further improvement through advanced point
cloud partitioning methods opens avenues for future research, emphasizing the
continual evolution and refinement of cooperative perception schemes for enhanced
autonomous vehicle capabilities.

Fig. 6.29 The performance comparison of angle-based and density-based point clouds partition
methods

6.3 Video Inference with Knowledge Distillation

6.3.1 Statement of Problem

The emergence of intelligent applications, such as autonomous driving and augmented reality, gives rise to a huge demand for real-time video inference on mobile
devices [44]. Video inference utilizes state-of-the-art (SOTA) DNN models, which
are growing larger and larger, to maintain a high accuracy performance in tasks like
object detection, pose estimation, and semantic segmentation [45]. However, the
computation cost of heavyweight SOTA models is usually unaffordable for mobile
devices, leading to an unacceptable execution latency.
MEC is able to reduce the execution latency by enabling devices to offload
their tasks to the nearby BS equipped with an edge server [46]. With MEC, video
inference with heavyweight models can be executed on the edge server instead
of transmitting the video frames to the remote cloud server over the congested
backhaul network. In MEC networks, to further reduce the inference latency, the
resources of multiple edge servers are aggregated and utilized to run heavyweight
models with data parallel or model parallel approaches. Nonetheless, fluctuations
in wireless bandwidth may incur long communication latency for both approaches,
especially with a substantial volume of raw video data or intermediate features to be
transmitted.
Recently, teacher–student learning has emerged as a promising framework
for real-time video inference on resource-constrained mobile devices in MEC
networks [47]. With teacher–student learning, heavyweight teacher models are
deployed on edge servers, and lightweight student models distilled from teacher
models are deployed on mobile devices.

It is shown in [48] that a student model can achieve a frame rate of 30 FPS on a Samsung Galaxy S10+ smartphone for semantic segmentation. Unfortunately, student models suffer from a decline in accuracy, as their finite parameters cannot maintain accuracy similar to that of the teacher model across the different visual scenes in video streams [49]; this decline is caused by data drift. To deal with data drift and eliminate the accuracy decrease, the student model has to be updated periodically with the help of the teacher model through a training process on the edge server. Note
that different training configurations, such as training epochs and frozen layers,
lead to different accuracy improvements with different resource requirements (i.e.,
training cost). It is pointed out in [49] that training costs vary by up to 200-fold
depending on different training configurations, and higher resource usage does not
always translate into higher accuracy.
In multi-device heterogeneous MEC networks, the network operator has to make
optimal updating decisions for each device to achieve high inference accuracy. The
updating decision includes the offloading decision, i.e., which edge to offload, and
the configuration selection decision, i.e., which training configuration to select.
However, it is quite challenging due to resource heterogeneity and limited com-
puting resources of edge servers. First, updating the student model with expensive
training configuration under frequent update requests from all devices poses a great
challenge to the limited resources of edge servers. Moreover, resource heterogeneity
further complicates the problem. Second, the offloading decisions and configuration
selection decisions are strongly coupled with each other, resulting in an extremely
huge solution space, which makes it difficult to find the optimal decision.
To solve these problems, we propose an adaptive teacher–student framework for
real-time video inference in industrial edge computing systems in the following part.

6.3.2 Inference Accuracy Estimation

The system model considered in this chapter is shown in Fig. 6.30, where there are multiple devices and multiple BSs randomly distributed in the MEC networks. Each BS is equipped with an edge server, and thus devices can offload the updating tasks to their connected BS through wireless channels.
Let $\mathcal{M} = \{1, 2, \cdots, M\}$ denote the set of BSs, indexed by m. The GPU processing capabilities of the BSs for executing a task are represented by an M-dimension vector $\mathbf{W}$, indexed by $\omega_m$. When a BS has multiple tasks to execute, the "First Come, First Served" rule is abided by, i.e., the task received earlier is executed sooner, and up to one task can be processed at a time.
Let $\mathcal{N} = \{1, 2, \cdots, N\}$ denote the set of devices, indexed by n. Each device has a continuous video stream to be inferred, which requires the device to maintain a lightweight student model for video inference. According to a given training configuration, the student model is updated with the help of the teacher model by sending part of the sampled video frames to the BS. After the training process at the BS, the new model is sent back to the device.
Fig. 6.30 The system model of video inference with teacher–student learning in multi-device heterogeneous MEC networks

In time slot $t \in \{1, 2, \cdots, T\}$, device n can offload the training task to edge server m to update its model, or continue to use the old model with an accuracy $\delta_t^-$. Here, each time slot lasts $\tau$ seconds. We use two M-dimension vectors $\mathbf{B}_{n,t}^u$ and $\mathbf{B}_{n,t}^d$ to denote the upload and download transmission rates between device n and the BSs, where $B_{n,t}^u(m)$ and $B_{n,t}^d(m)$ are the upload and download transmission rates between device n and BS m, respectively. To maximize the average inference accuracy, the system should make updating decisions for each device, including the configuration selection and offloading decisions. The configuration selection decision determines the accuracy improvement and resource usage. The offloading decision indicates which BS to offload to, which is greatly influenced by the available resources of the edge servers and the network conditions. The definitions are shown in Table 4.2.
Let a 3-tuple $\boldsymbol{\alpha}_{n,t} = \langle c^u, c^d, c^e \rangle$ denote the configuration selection decision for device n in time slot t, i.e., the hyperparameters of the training configuration, where $\alpha_{n,t}(x)$, $x \in \{1, 2, 3\}$, indicates the three elements of the tuple in sequence. The total candidate configuration set is denoted by $\mathcal{C}$. $c^u$ is the ratio of the sampled video frames sent to the edge server. $c^d$ is the ratio of unfrozen layers in the DNN model; the parameters of these unfrozen layers will be updated during the training process. $c^e$ denotes the number of training epochs. Intuitively, an expensive training configuration with large $c^u$, $c^d$, and $c^e$ values results in a higher accuracy improvement of the student model. Here we establish the relationship between the inference accuracy and the training configuration.
inference accuracy and the training configuration.
Fig. 6.31 An example of obtaining the approximate accuracy improvement ratio with the deeplabv3+ model on an NVIDIA RTX 2080Ti GPU

Let $c^g(\alpha_{n,t})$ denote the GPU seconds3 required by configuration $\alpha_{n,t} \in \mathcal{C}$. Let $\delta_{n,t}^*$ denote the maximum inference accuracy that device n can reach in time slot t. The accuracy improvement ratio $\eta_{n,t}$ is a function of $c^g(\alpha_{n,t})$, i.e., $\eta_{n,t} = g(c^g(\alpha_{n,t}))$. Then the estimated inference accuracy of device n in time slot t can be obtained as
$$\delta_{n,t} = \delta_{n,t}^* \cdot \eta_{n,t}. \tag{6.26}$$
However, it is quite difficult to derive a closed-form expression of $g(\cdot)$. Therefore, we use a measurement-based method to approximate $g(\cdot)$.
We take the deeplabv3+ model [50] on the cityscapes [51] and A2D2 [52] datasets as an example. The training process was conducted on an NVIDIA RTX 2080Ti GPU. First, we randomly select several training configurations from $\mathcal{C}$ and conduct the training process. Then, we measure the accuracy improvement ratios with these configurations and plot the results in Fig. 6.31. Finally, curve fitting is used to obtain an approximation of $g(\cdot)$, which is
$$g(c^g(\alpha_{n,t})) = 0.1946 \times \ln c^g(\alpha_{n,t}) + 0.3615. \tag{6.27}$$
Note that this approach is general and can be applied to other models and GPU hardware.
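A sketch of this measurement-based calibration using SciPy's curve_fit; the GPU-second/ratio pairs below are placeholders rather than the measurements behind Eq. (6.27):

```python
import numpy as np
from scipy.optimize import curve_fit

def g(cg, a, b):
    """Accuracy improvement ratio as a function of GPU seconds, Eq. (6.27)."""
    return a * np.log(cg) + b

# Placeholder measurements: (GPU seconds, measured improvement ratio)
gpu_seconds = np.array([0.5, 1.0, 2.0, 4.0, 8.0, 16.0])
ratio = np.array([0.23, 0.36, 0.50, 0.63, 0.77, 0.90])  # hypothetical

(a, b), _ = curve_fit(g, gpu_seconds, ratio)
print(f"g(cg) = {a:.4f} * ln(cg) + {b:.4f}")

# Eq. (6.26): estimated accuracy = max accuracy * improvement ratio
delta_star = 0.8
print("estimated accuracy:", delta_star * g(4.0, a, b))
```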
Let an M-dimension binary vector $\boldsymbol{\beta}_{n,t}$ denote the offloading decision of device n in time slot t, where $\beta_{n,t}(m) = 1$ indicates that device n offloads the task to BS m in time slot t. Note that a device can offload the task to at most one BS, so we have
$$\sum_{m=1}^{M} \beta_{n,t}(m) \leq 1. \tag{6.28}$$

3 GPU seconds refer to the time taken for training with 100% GPU processing capabilities.

The total latency of a model update process of device n in time slot t consists of four parts: the training data upload latency, the computation latency, the queuing latency at the edge server, and the model download latency.
We assume that the frame rate f of each video and the resolution s of the sampled images are fixed; the training data contains every frame in the previous time slot, so its total amount is $S_{n,t} = f \cdot s \cdot \tau$. Hence, the training data upload latency for device n in time slot t is
$$l_{n,t}^u = \sum_{m=1}^{M} \beta_{n,t}(m) \frac{\alpha_{n,t}(1) S_{n,t}}{B_{n,t}^u(m)}. \tag{6.29}$$
Similarly, the model download latency for receiving the updated model for device n in time slot t is
$$l_{n,t}^d = \sum_{m=1}^{M} \beta_{n,t}(m) \frac{\alpha_{n,t}(2) S^d}{B_{n,t}^d(m)}, \tag{6.30}$$
where $S^d$ is the size of the parameters of the student model. Then, the computation latency of device n in time slot t is
$$l_{n,t}^c = c_{n,t}^g \frac{\hat{\omega}}{\omega_m}, \tag{6.31}$$
where $\hat{\omega}$ denotes the 100% GPU processing capability of the edge server. Let $l_{n,t}^q$ denote the queuing time of device n in time slot t; then the total update latency is
$$L_{n,t} = l_{n,t}^u + l_{n,t}^c + l_{n,t}^q + l_{n,t}^d. \tag{6.32}$$
Note that the updating process should be finished within each time slot, so we have
$$L_{n,t} \leq \tau. \tag{6.33}$$

With the predicted inference accuracy and model update latency, we obtain the estimated average inference accuracy of all the devices in time slot t as

$$
R(t) = \frac{1}{\tau N} \sum_{n=1}^{N} \left\{ \delta^{-}_{n,t} L_{n,t} + \delta_{n,t}\, (\tau - L_{n,t}) \right\}, \tag{6.34}
$$

where $\delta^{-}_{n,t}$ is the accuracy of the not-yet-updated student model, which the device keeps using during the update latency $L_{n,t}$.

The objective is to maximize the average inference accuracy by making optimal updating decisions, which can be formulated as

$$
\max_{\alpha_{n,t},\, \beta_{n,t}} \ \frac{1}{T} \sum_{t=1}^{T} R(t) \qquad \text{s.t. (6.28), (6.33).} \tag{6.35}
$$
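The following is a minimal sketch of the latency and reward model in Eqs. (6.29)–(6.34). All numerical values, array shapes, the queuing latency, and the per-server capability ratios $\hat{w}_m$ are illustrative placeholders, not values from the evaluation.

```python
import numpy as np

TAU = 10.0  # slot duration tau in seconds

def update_latency(beta, c_u, c_d, S_nt, S_d, B_u, B_d, gpu_sec, w_hat, l_q):
    """Total model-update latency L_{n,t} of one device, Eqs. (6.29)-(6.32).
    beta: length-M one-hot offloading vector (Eq. (6.28));
    c_u, c_d: the chosen alpha_{n,t}(1) and alpha_{n,t}(2);
    B_u, B_d: per-BS upload/download rates; w_hat: per-BS capability ratios."""
    l_u = np.sum(beta * c_u * S_nt / B_u)     # upload latency, Eq. (6.29)
    l_d = np.sum(beta * c_d * S_d / B_d)      # download latency, Eq. (6.30)
    l_c = np.sum(beta * gpu_sec / w_hat)      # computation latency, Eq. (6.31)
    return l_u + l_c + l_q + l_d              # total, Eq. (6.32)

def slot_reward(L, delta_old, delta_new):
    """R(t) of Eq. (6.34): stale accuracy while updating, new accuracy after."""
    return np.sum(delta_old * L + delta_new * (TAU - L)) / (TAU * len(L))

# Toy usage: one device choosing BS 1 out of M = 2.
beta = np.array([1, 0])
L1 = update_latency(beta, c_u=0.02, c_d=0.25, S_nt=300.0, S_d=80.0,
                    B_u=np.array([5.0, 8.0]), B_d=np.array([12.0, 15.0]),
                    gpu_sec=5.0, w_hat=np.array([1.0, 2.5]), l_q=0.5)
assert L1 <= TAU                              # feasibility, Eq. (6.33)
L = np.array([L1, 4.2])                       # second latency: placeholder
print(slot_reward(L, np.array([0.60, 0.63]), np.array([0.70, 0.71])))
```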



The formulated problem is an integer nonlinear optimization problem, which is NP-hard and cannot be solved directly. Meanwhile, heuristic algorithms are prone to falling into local optima, while conventional DRL algorithms such as actor–critic are sensitive to hyperparameters and thus struggle with high-dimensional problems.

6.3.3 Cross Entropy Method (CEM)

In this section, we present the CEM-MASAC algorithm for solving the optimization problem; its architecture is shown in Fig. 6.32. In CEM-MASAC, users, i.e., agents, take their own actions with a soft value function to interact with the environment, based on SAC as introduced in Sect. 3.3.3. CEM-MASAC also leverages the cross entropy method to further explore the action space via population evolution, aiming to avoid local optima and improve exploration efficiency.
We formulate the problem as a POMDP $\langle \mathcal{E}, \mathcal{S}, \mathcal{A}, \mathcal{R} \rangle$:
• Environment $\mathcal{E}$: In this chapter, let $e_t \in \mathcal{E}$ denote the environment, including the network conditions, the computation resources of all edge servers, and the maximum accuracy improvements of all devices:

$$
e_t = \left( B^{u}_{1,t}, \cdots, B^{u}_{N,t}, B^{d}_{1,t}, \cdots, B^{d}_{N,t}, W, \delta^{*}_{t} \right). \tag{6.36}
$$

Fig. 6.32 The architecture of the CEM-MASAC algorithm



• State $\mathcal{S}$: We assume the absence of information exchange among multiple devices. In this case, device n only observes a partial state $s_{n,t} \in \mathcal{S}$ in time slot t, including its maximum accuracy improvement, network condition, and the computation resources of the edge servers, i.e.,

$$
s_{n,t} = \left( B^{u}_{n,t}, B^{d}_{n,t}, W, \delta^{*}_{n,t} \right). \tag{6.37}
$$

• Action $\mathcal{A}$: The action $a_{n,t} \in \mathcal{A}$ of device n in time slot t contains the configuration selection decision and the offloading decision, i.e.,

$$
a_{n,t} = (\alpha_{n,t}, \beta_{n,t}). \tag{6.38}
$$

• Reward $\mathcal{R}$: We define the reward as the average inference accuracy of the devices according to Eq. (6.34). After all the agents take actions in each time slot, the environment returns an immediate reward

$$
r_{n,t} = r(s_{n,t}, a_{n,t}) = R(\alpha_{n,t}, \beta_{n,t}). \tag{6.39}
$$
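As an illustration of how the composite discrete action in Eq. (6.38) can be encoded for a DRL agent, the sketch below enumerates the joint space of training configurations and offloading targets and decodes a flat action index back into the pair $(\alpha_{n,t}, \beta_{n,t})$; the candidate sets mirror Table 6.3, while the flat-index encoding itself is an implementation choice assumed here, not one prescribed by the text.

```python
from itertools import product
import numpy as np

# Candidate configuration sets (values as in Table 6.3) and BS count.
C_U = [0.01, 0.02, 0.05, 0.1, 0.2]   # ratio of sampled frames uploaded
C_D = [0.05, 0.1, 0.25, 1.0]         # ratio of unfrozen layers
C_E = [10, 20, 30, 40]               # number of training epochs
M = 3                                 # number of base stations

# Joint discrete action space: one entry per (configuration, BS) pair.
ACTIONS = list(product(C_U, C_D, C_E, range(M)))

def decode(index):
    """Map a flat action index to (alpha_{n,t}, beta_{n,t})."""
    c_u, c_d, c_e, m = ACTIONS[index]
    beta = np.zeros(M, dtype=int)
    beta[m] = 1            # exactly one BS selected, so Eq. (6.28) holds
    return (c_u, c_d, c_e), beta

alpha, beta = decode(7)
print(alpha, beta)         # -> (0.01, 0.05, 30) [0 1 0]
```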

The CEM is an estimation-of-distribution algorithm; combining evolutionary search with DRL helps avoid falling into local optima and enables more efficient exploration of the action space.
First, in each training iteration, CEM generates a population of K individuals, each of which represents the network parameters of all the actors (denoted by $\phi_k$), sampled from a Gaussian distribution $\mathcal{N}(\mu, \sigma^2)$, i.e., $\phi_k \sim \mathcal{N}(\mu, \sigma^2)$, where $\mu$ and $\sigma^2$ are the mean and the variance of the distribution, respectively. Second, every individual in the population interacts with the environment to obtain a reward. The individuals are then ranked by their rewards, and their experiences are sent to the replay buffer. Finally, the top-performing individuals are used to update the distribution to $\mathcal{N}(\mu_{\text{new}}, \sigma^2_{\text{new}})$, where

$$
\mu_{\text{new}} = \sum_{i=1}^{K/2} \lambda_i \phi_i, \tag{6.40}
$$

and

$$
\sigma^2_{\text{new}} = \sum_{i=1}^{K/2} \lambda_i (\phi_i - \mu)^2 + \epsilon. \tag{6.41}
$$

Here $\lambda_i$ is the weight assigned to the i-th ranked elite individual $\phi_i$, and $\epsilon$ is noise added to the usual covariance update to prevent premature convergence.
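A minimal NumPy sketch of the population update in Eqs. (6.40)–(6.41) is given below, treating each individual as a flat parameter vector and assuming uniform elite weights $\lambda_i = 2/K$; the reward function is a toy stand-in for a full environment rollout.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, K, EPS = 8, 10, 1e-3        # parameter dim, population size, noise

def rollout_reward(phi):
    """Stand-in for evaluating an actor parameterized by phi; a real
    implementation would run episodes scored with Eq. (6.34)."""
    return -np.sum((phi - 1.0) ** 2)   # toy objective, optimum at all ones

mu, sigma2 = np.zeros(DIM), np.ones(DIM)
for _ in range(50):
    # 1) Sample K individuals phi_k ~ N(mu, sigma^2).
    pop = mu + np.sqrt(sigma2) * rng.standard_normal((K, DIM))
    # 2) Evaluate each individual and keep the top K/2 as elites.
    rewards = np.array([rollout_reward(p) for p in pop])
    elites = pop[np.argsort(rewards)[::-1][: K // 2]]
    # 3) Distribution update, Eqs. (6.40)-(6.41), uniform lambda_i = 2/K;
    #    sigma^2 uses the old mean, and EPS prevents premature collapse.
    lam = np.full(K // 2, 2.0 / K)
    mu_new = lam @ elites
    sigma2 = lam @ (elites - mu) ** 2 + EPS
    mu = mu_new

print("final mean parameters:", np.round(mu, 2))
```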

6.3.4 Performance Evaluation

Datasets We used two datasets to evaluate our method: cityscapes [51] (driving in Frankfurt, 46 mins long) and A2D2 [52] (2 videos, 25 mins in total), which cover a variety of scenes captured with fixed cameras and with cameras moving at walking and driving speeds. We split each video into 10-second segments. The upload and download transmission rates were set based on two sets of 1200 traces from the real-world FCC communication traces [53], which range from 1 Mbps to 10 Mbps and from 1 Mbps to 20 Mbps, respectively.
Inference Models We considered semantic segmentation tasks in our system. We used deeplabv3+ [50] with Xception65 and with mobilenetv2 as the backbone for the teacher model and the student model, respectively. The teacher model (deeplabv3+ with Xception65) labels the video frames to produce the ground truth, which is then used to supervise the student model training.
Evolutionary Deep Reinforcement Learning Hyperparameters We set the learning rate of the critic networks to 10⁻³ and the future reward discount γ to 0.99. The population size was set to 10, and each population selected the top half of individuals by fitness as the elite individuals.
Retraining Configurations We used the set $\mathcal{C}^u$ = {0.01, 0.02, 0.05, 0.1, 0.2} to simulate the ratio of sampled frames sent to the edge server, the set $\mathcal{C}^d$ = {0.05, 0.1, 0.25, 1} to simulate the ratio of unfrozen layers of the student model, and the set $\mathcal{C}^e$ = {10, 20, 30, 40, 50} to simulate the number of retraining epochs. The experiment setups are shown in Table 6.3.
Baseline We compared the performance of our method with the following meth-
ods:
• No Update: Each mobile device performs the inference task without offloading
the model updating task to the edge server.
• R-F($\alpha_1$): Each mobile device offloads the model updating task to a random edge server with a fixed retraining configuration, i.e., 2% training frames, 25% unfrozen layers, and 20 epochs, to simulate a low-cost retraining configuration. Note that AMS [48] considers a single-BS MEC system and executes the model updating tasks in a polling manner; since our environment is a heterogeneous MEC network, we add random offloading to AMS for fairness.

Table 6.3 Experiment setups

The number of time slots (T): 50
The duration of each time slot (τ): 10 s
The ratio of data sent to the edge ($c^u$): {0.01, 0.02, 0.05, 0.1, 0.2}
The ratio of unfrozen layers ($c^d$): {0.05, 0.1, 0.25, 1}
The number of training epochs ($c^e$): {10, 20, 30, 40}
The future reward discount (γ): 0.99
The learning rate of critic networks: 10⁻³
• R-F(α2 ): Each mobile device offloads the model updating task to a random
edge server with a fixed retraining configuration, i.e., 5% training frames, 100%
unfrozen layers, and 40 epochs, to simulate a high-cost retraining configuration.
• S-O: Each mobile device makes the optimal decision independently by traversing the potential solution space, without considering resource contention.

6.3.4.1 Overall Accuracy Improvement

We evaluated our method and the baseline methods along two dimensions: average accuracy and rewards. With the above two datasets, the average inference accuracy of the five methods is depicted in Figs. 6.33 and 6.34, where the time slot index varies from 0 to 50. Note that there is a significant drop in accuracy caused by data drift. With the fixed configurations $\alpha_1$ and $\alpha_2$, the average inference accuracy of R-F improves by up to 1.4% and 3.6%, respectively, when compared to

Fig. 6.33 The average inference accuracy of 12 devices with 3 BSs on cityscapes dataset

Fig. 6.34 The average inference accuracy of 12 devices with 3 BSs on A2D2 dataset

No Update. With R-F, devices randomly offload the model updating tasks to edge servers and obtain random average inference accuracy improvements. It is also found that the average inference accuracy of S-O improves by up to 4.42% compared to No Update, which is similar to R-F. This is because the devices suffer from resource contention, and some devices cannot update their models in time. Our method outperforms all the baseline methods, with an accuracy improvement of up to 9.24% compared to R-F, because it jointly considers the available resources and network conditions to adaptively make updating decisions, whereas the other methods do not take both into account.

6.3.4.2 Impact of BS Number

Since the edge servers are heterogeneous in our scenario, we study three combinations of edge servers with a fixed number of devices, i.e., 12 devices. The combinations, shown in Table 6.4, differ in the number and resources of the edge servers. As shown in Figs. 6.35 and 6.36, with more powerful edge servers, the average inference accuracy of our method becomes higher. This is because more edge servers mean more resources for each device in a time slot, leading to greater accuracy improvement. It can also be found that S-O does not yield as much accuracy improvement as our method, because most devices offload to the same edge server without considering resource contention, so their updates cannot be finished in time. Our method makes the optimal decision as the edge resources change and achieves the best accuracy improvement compared to the other methods.

Table 6.4 Edge server combinations

Combination1: 2080Ti, V100
Combination2: 2080Ti, V100, A100
Combination3: 2080Ti, 2080Ti, V100, A100

Fig. 6.35 The rewards with different combinations of edge servers on cityscapes dataset

Fig. 6.36 The rewards with different combinations of edge servers on A2D2 dataset

References

1. Zhengxia Zou, Zhenwei Shi, Yuhong Guo, and Jieping Ye. Object detection in 20 years: A
survey. CoRR, abs/1905.05055, 2019.
2. Tejalal Choudhary, Vipul Mishra, Anurag Goswami, and Jagannathan Sarangapani. A compre-
hensive survey on model compression and acceleration. Artif. Intell. Rev., 53(7):5113–5155,
2020.
3. Lei Deng, Guoqi Li, Song Han, Luping Shi, and Yuan Xie. Model compression and hardware
acceleration for neural networks: A comprehensive survey. Proc. IEEE, 108(4):485–532, 2020.
4. Jangwon Lee, Jingya Wang, David J. Crandall, Selma Sabanovic, and Geoffrey C. Fox. Real-
time, cloud-based object detection for unmanned aerial vehicles. In First IEEE International
Conference on Robotic Computing, IRC 2017, Taichung, Taiwan, April 10–12, 2017, pages
36–43. IEEE Computer Society, 2017.
5. Yiwen Han, Xiaofei Wang, Victor C. M. Leung, Dusit Niyato, Xueqiang Yan, and Xu Chen.
Convergence of edge computing and deep learning: A comprehensive survey. CoRR,
abs/1907.08349, 2019.
6. Zhi Zhou, Xu Chen, En Li, Liekang Zeng, Ke Luo, and Junshan Zhang. Edge intelligence:
Paving the last mile of artificial intelligence with edge computing. Proceedings of the IEEE,
107(8):1738–1762, 2019.
7. Surat Teerapittayanon, Bradley McDanel, and H. T. Kung. Distributed deep neural networks
over the cloud, the edge and end devices. In 37th IEEE International Conference on Distributed
Computing Systems, ICDCS 2017, Atlanta, GA, USA, June 5–8, 2017, pages 328–339. IEEE
Computer Society, 2017.
8. Chuang Hu, Wei Bao, Dan Wang, and Fengming Liu. Dynamic adaptive DNN surgery for
inference acceleration on the edge. In 2019 IEEE Conference on Computer Communications,
INFOCOM 2019, Paris, France, April 29–May 2, 2019, pages 1423–1431. IEEE, 2019.
9. Shigeng Zhang, Yinggang Li, Xuan Liu, Song Guo, Weiping Wang, Jianxin Wang, Bo Ding,
and Di Wu. Towards real-time cooperative deep inference over the cloud and edge end devices.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 4(2):69:1–69:24, 2020.
10. Mikolaj Jankowski, Deniz Gündüz, and Krystian Mikolajczyk. Joint device-edge inference
over wireless links with pruning. In 21st IEEE International Workshop on Signal Processing
Advances in Wireless Communications, SPAWC 2020, Atlanta, GA, USA, May 26–29, 2020,
pages 1–5. IEEE, 2020.
11. Wuyang Zhang, Zhezhi He, Luyang Liu, Zhenhua Jia, Yunxin Liu, Marco Gruteser, Dipankar
Raychaudhuri, and Yanyong Zhang. Elf: accelerate high-resolution mobile deep vision with
content-aware parallel offloading. In Proceedings of the 27th Annual International Conference
on Mobile Computing and Networking, pages 201–214, 2021.

12. Rafael Stahl, Zhuoran Zhao, Daniel Mueller-Gritschneder, Andreas Gerstlauer, and Ulf
Schlichtmann. Fully distributed deep learning inference on resource-constrained edge devices.
In Embedded Computer Systems: Architectures, Modeling, and Simulation—19th International
Conference, SAMOS 2019, Samos, Greece, July 7–11, 2019, Proceedings, volume 11733 of
Lecture Notes in Computer Science, pages 77–90. Springer, 2019.
13. Li Zhou, Mohammad Hossein Samavatian, Anys Bacha, Saikat Majumdar, and Radu Teodor-
escu. Adaptive parallel execution of deep neural networks on heterogeneous edge devices.
In Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, SEC 2019, Arlington,
Virginia, USA, November 7–9, 2019, pages 195–208. ACM, 2019.
14. Thaha Mohammed, Carlee Joe-Wong, Rohit Babbar, and Mario Di Francesco. Distributed
inference acceleration with adaptive DNN partitioning and offloading. In 39th IEEE Confer-
ence on Computer Communications, INFOCOM 2020, Toronto, ON, Canada, July 6–9, 2020,
pages 854–863. IEEE, 2020.
15. Zhuoran Zhao, Kamyar Mirzazad Barijough, and Andreas Gerstlauer. DeepThings: Distributed
adaptive deep learning inference on resource-constrained IoT edge clusters. IEEE Trans.
Comput. Aided Des. Integr. Circuits Syst., 37(11):2348–2359, 2018.
16. Sai Qian Zhang, Jieyu Lin, and Qi Zhang. Adaptive distributed convolutional neural network
inference at the network edge with ADCNN. In ICPP 2020: 49th International Conference on
Parallel Processing, Edmonton, AB, Canada, August 17–20, 2020, pages 10:1–10:11. ACM,
2020.
17. Duanyang Li, Zhihui Ke, and Xiaobo Zhou. MASS: multi-edge assisted fast object detection
for autonomous mobile vision in heterogeneous edge networks. In Periklis Chatzimisios,
Rodolfo W. L. Coutinho, and Mirela Notare, editors, Q2SWinet 2021: Proceedings of the 17th
ACM Symposium on QoS and Security for Wireless and Mobile Networks, Alicante, Spain,
November 22–26, 2021, pages 61–68. ACM, 2021.
18. En Li, Liekang Zeng, Zhi Zhou, and Xu Chen. Edge AI: on-demand accelerating deep neural
network inference via edge computing. IEEE Trans. Wireless Communications, 19(1):447–457,
2020.
19. Shaoqing Ren, Kaiming He, Ross B. Girshick, and Jian Sun. Faster R-CNN: towards real-
time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell.,
39(6):1137–1149, 2017.
20. Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott E. Reed, Cheng-Yang
Fu, and Alexander C. Berg. SSD: single shot multibox detector. In Computer Vision—
ECCV 2016—14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016,
Proceedings, Part I, volume 9905 of Lecture Notes in Computer Science, pages 21–37.
Springer, 2016.
21. Joseph Redmon, Santosh Kumar Divvala, Ross B. Girshick, and Ali Farhadi. You only look
once: Unified, real-time object detection. In 2016 IEEE Conference on Computer Vision and
Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27–30, 2016, pages 779–788.
IEEE Computer Society, 2016.
22. Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, and Jian Sun. Shufflenet V2: practical
guidelines for efficient CNN architecture design. In Computer Vision—ECCV 2018—15th
European Conference, Munich, Germany, September 8–14, 2018, Proceedings, Part XIV,
volume 11218 of Lecture Notes in Computer Science, pages 122–138. Springer, 2018.
23. Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, and Jan Kautz. Pruning convo-
lutional neural networks for resource efficient inference. In 5th International Conference on
Learning Representations, ICLR 2017, Toulon, France, April 24–26, 2017, Conference Track
Proceedings. OpenReview.net, 2017.
24. Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkate-
san, Brucek Khailany, Joel S. Emer, Stephen W. Keckler, and William J. Dally. SCNN: an
accelerator for compressed-sparse convolutional neural networks. In Proceedings of the 44th
Annual International Symposium on Computer Architecture, ISCA 2017, Toronto, ON, Canada,
June 24–28, 2017, pages 27–40. ACM, 2017.

25. Norman P. Jouppi, Cliff Young, and Nishant Patil et al. In-datacenter performance analysis
of a tensor processing unit. In Proceedings of the 44th Annual International Symposium on
Computer Architecture, ISCA 2017, Toronto, ON, Canada, June 24–28, 2017, pages 1–12.
ACM, 2017.
26. Tsung-Yi Lin, Michael Maire, Serge J. Belongie, James Hays, Pietro Perona, Deva Ramanan,
Piotr Dollár, and C. Lawrence Zitnick. Microsoft COCO: common objects in context. In
Computer Vision—ECCV 2014—13th European Conference, Zurich, Switzerland, September
6–12, 2014, Proceedings, Part V, volume 8693 of Lecture Notes in Computer Science, pages
740–755. Springer, 2014.
27. Florian A. Schiegg, Ignacio Llatser, Daniel Bischoff, and Georg Volk. Collective perception:
A safety perspective. Sensors, 21(1):159, 2021.
28. Qi Chen, Sihai Tang, Qing Yang, and Song Fu. Cooper: Cooperative perception for connected
autonomous vehicles based on 3d point clouds. In 39th IEEE International Conference on
Distributed Computing Systems, Dallas, TX, USA, pages 514–524, 2019.
29. Velodyne lidar hdl-64e. https://www.velodynelidar.com/hdl-64e.html.
30. Jingda Guo, Dominic Carrillo, Sihai Tang, Qi Chen, Qing Yang, Song Fu, Xi Wang, Nannan
Wang, and Paparao Palacharla. CoFF: cooperative spatial feature fusion for 3D object detection
on autonomous vehicles. IEEE Internet of Things Journal, 8(14):11078–11087, 2021.
31. Moreno Ambrosin, Ignacio J. Alvarez, Cornelius Bürkle, Lily L. Yang, Fabian Oboril,
Manoj R. Sastry, and Kathiravetpillai Sivanesan. Object-level perception sharing among
connected vehicles. In IEEE Intelligent Transportation Systems Conference, Auckland, New
Zealand, pages 1566–1573, 2019.
32. Zijian Zhang, Shuai Wang, Yuncong Hong, Liangkai Zhou, and Qi Hao. Distributed dynamic
map fusion via federated learning for intelligent networked vehicles. In IEEE International
Conference on Robotics and Automation, Xi’an, China, pages 953–959, 2021.
33. Qi Chen, Xu Ma, Sihai Tang, Jingda Guo, Qing Yang, and Song Fu. F-cooper: feature based
cooperative perception for autonomous vehicle edge computing system using 3d point clouds.
In Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, Arlington, Virginia,
USA, pages 88–100, 2019.
34. Ehsan Emad Marvasti, Arash Raftari, Amir Emad Marvasti, Yaser P. Fallah, Rui Guo, and
Hongsheng Lu. Cooperative LIDAR object detection via feature sharing in deep networks. In
92nd IEEE Vehicular Technology Conference, Victoria, BC, Canada, pages 1–7, 2020.
35. Qi Xie, Xiaobo Zhou, Tie Qiu, Qingyu Zhang, and Wenyu Qu. Soft actor-critic-based
multilevel cooperative perception for connected autonomous vehicles. IEEE Internet of Things
Journal, 9(21):21370–21381, 2022.
36. High performance INS for ADAS and autonomous vehicle testing. https://www.oxts.com/
products/rt3000-v3/.
37. Verizon hyper precise location. https://thingspace.verizon.com/services/hyper-precise-
location/.
38. Yu Feng, Shaoshan Liu, and Yuhao Zhu. Real-time spatio-temporal LiDAR point cloud
compression. In IEEE/RSJ International Conference on Intelligent Robots and Systems, Las
Vegas, NV, USA, pages 10766–10773, 2020.
39. Hansong Wang, Xi Li, Hong Ji, and Heli Zhang. Federated offloading scheme to minimize
latency in MEC-enabled vehicular networks. In IEEE Globecom Workshops, Abu Dhabi,
United Arab Emirates, pages 1–6, 2018.
40. Mark Everingham, Luc Van Gool, Christopher K. I. Williams, John M. Winn, and Andrew
Zisserman. The Pascal Visual Object Classes (VOC) challenge. Int. J. Comput. Vis., 88(2):303–
338, 2010.
41. Andreas Geiger, Philip Lenz, and Raquel Urtasun. Are we ready for autonomous driving?
the KITTI vision benchmark suite. In IEEE Conference on Computer Vision and Pattern
Recognition, Providence, RI, USA, pages 3354–3361, 2012.

42. Qi Chen, Sihai Tang, Jacob Hochstetler, Jingda Guo, Yuan Li, Jinbo Xiong, Qing Yang,
and Song Fu. Low-latency high-level data sharing for connected and autonomous vehicular
networks. In IEEE International Conference on Industrial Internet, Orlando, FL, USA, pages
287–296, 2019.
43. Ehsan Emad Marvasti, Arash Raftari, Amir Emad Marvasti, and Yaser P. Fallah. Bandwidth-
adaptive feature sharing for cooperative LIDAR object detection. In 3rd IEEE Connected and
Automated Vehicles Symposium, Victoria, BC, Canada, pages 1–7, 2020.
44. Bin Dai, Fanglin Xu, Yuanyuan Cao, and Yang Xu. Hybrid sensing data fusion of cooperative
perception for autonomous driving with augmented vehicular reality. IEEE Systems Journal,
15(1):1413–1422, 2021.
45. Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. Realtime multi-person 2d pose
estimation using part affinity fields. In IEEE Conference on Computer Vision and Pattern
Recognition, CVPR 2017, Honolulu, HI, USA, July 21–26, pages 1302–1310, 2017.
46. Weisong Shi, Jie Cao, Quan Zhang, Youhuizi Li, and Lanyu Xu. Edge computing: Vision and
challenges. IEEE Internet of Things Journal, 3(5):637–646, 2016.
47. Lin Wang and Kuk-Jin Yoon. Knowledge distillation and student-teacher learning for visual
intelligence: A review and new outlooks. IEEE Transactions on Pattern Analysis and Machine
Intelligence, 44(6):3048–3068, 2022.
48. Mehrdad Khani Shirkoohi, Pouya Hamadanian, Arash Nasr-Esfahany, and Mohammad
Alizadeh. Real-time video inference on edge devices via adaptive model streaming. In
IEEE/CVF International Conference on Computer Vision, ICCV 2021, Montreal, QC, Canada,
October 10–17, pages 4552–4562, 2021.
49. Romil Bhardwaj, Zhengxu Xia, Ganesh Ananthanarayanan, Junchen Jiang, Yuanchao Shu,
Nikolaos Karianakis, Kevin Hsieh, Paramvir Bahl, and Ion Stoica. Ekya: Continuous learning
of video analytics models on edge compute servers. In 19th USENIX Symposium on Networked
Systems Design and Implementation, NSDI 2022, Renton, WA, USA, April 4–6, pages 119–135,
2022.
50. Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam.
Encoder-decoder with atrous separable convolution for semantic image segmentation. In
European Conference on Computer Vision, ECCV 2018, Munich, Germany, September 8–14,
volume 11211, pages 833–851, 2018.
51. Marius Cordts, Mohamed Omran, Sebastian Ramos, Timo Rehfeld, Markus Enzweiler,
Rodrigo Benenson, Uwe Franke, Stefan Roth, and Bernt Schiele. The cityscapes dataset for
semantic urban scene understanding. In IEEE Conference on Computer Vision and Pattern
Recognition, CVPR 2016, Las Vegas, NV, USA, June 27–30, pages 3213–3223, 2016.
52. Jakob Geyer, Yohannes Kassahun, Mentar Mahmudi, Xavier Ricou, Rupesh Durgesh,
Andrew S. Chung, Lorenz Hauswald, Viet Hoang Pham, Maximilian Mühlegg, Sebastian
Dorn, Tiffany Fernandez, Martin Jänicke, Sudesh Mirashi, Chiragkumar Savani, Martin Sturm,
Oleksandr Vorobiov, Martin Oelker, Sebastian Garreis, and Peter Schuberth. A2D2: Audi
Autonomous Driving Dataset. CoRR, abs/2004.06320, 2020.
53. Federal Communications Commission. 2016. Raw data Measuring Broadband America.
https://www.fcc.gov/reports/research/reports/measuring-broadband-america/raw-data-
measuring-broadband-america-20.
Chapter 7
Future Research Directions

As we delve into the future of industrial edge computing, the convergence of the-
oretical research and practical applications becomes a pivotal focus for innovation.
This field, at the forefront of technological progress, requires a seamless integration
of theoretical concepts and real-world implementations. Our exploration of future
research directions will highlight the incorporation of emerging technologies like
digital twin and collaborative cloud–edge data analysis, applied in industrial
contexts such as smart manufacturing and predictive maintenance. This approach
not only advances theoretical knowledge but also anchors these developments in practical, industry-specific use cases. Through this lens, we navigate the evolving realm of industrial edge computing to uncover innovative solutions and strategies that will define the future of industrial technology.

7.1 Theory Exploration for Future Directions

The exploration of theoretical frameworks for future directions in industrial edge


computing encompasses multifaceted domains, focusing specifically on the inte-
gration of digital twin technology, collaborative data analysis in the cloud–edge
paradigm, and the optimization of real-time communication with reduced data loads.

7.1.1 Digital Twin for Industrial Edge Computing System

The integration of digital twin technology into industrial edge computing systems
marks a significant shift in our approach [1]. Digital twins, as detailed virtual models
of physical entities, greatly enhance real-time monitoring and control capabilities
at the edge [2]. This section thoroughly examines the theoretical foundations and


potential applications of digital twins, highlighting the complexities of incorporating


this concept into the industrial edge computing framework. The integration of
digital twins in industrial edge computing holds immense potential for real-time
monitoring, control, and system efficiency. Theoretical exploration focuses on the
interaction between physical and virtual domains, offering deep insights into real-
time performance, predictive maintenance, and optimization opportunities [3, 4].
Key aspects include:
• Real-Time Monitoring and Control: The digital twin enables real-time moni-
toring of physical assets and processes at the edge. By mirroring the state of the
physical system, it allows for instantaneous feedback and control adjustments,
thereby enhancing operational efficiency.
• Predictive Analytics: Through the analysis of real-time data generated by the
digital twin, predictive analytics can be employed to forecast potential issues
and prescribe proactive solutions. This capability is particularly valuable in
preventing unplanned downtime and optimizing resource utilization.
• Integration with Edge Devices: The digital twin concept aligns seamlessly with
edge computing architectures. Edge devices, strategically positioned closer to
the data source, can leverage digital twin data for localized decision-making,
reducing latency and ensuring rapid response times.
• Resource Optimization: By continuously analyzing data from the digital
twin, industrial edge computing systems can optimize resource allocation. This
includes energy usage, production scheduling, and the efficient utilization of
connected devices within the industrial ecosystem.
• Enhanced Security Measures: The integration of digital twins can contribute
to bolstering cybersecurity measures. By simulating potential cyber threats in
the virtual environment, security protocols can be fine-tuned, creating a more
resilient industrial edge computing system.
In summary, exploring digital twins in the context of industrial edge computing
highlights their potential to transform how industries leverage real-time data and
control systems. This theoretical groundwork paves the way for practical applica-
tions that can revolutionize industrial processes, enhancing agility, resilience, and
efficiency in a digitized era.

7.1.2 Cloud–Edge Collaborative Data Analysis

The synergistic relationship between cloud and edge computing creates a fertile environment for collaborative data analysis, an area with significant implications for system efficiency and responsiveness [5–8]. Investigating
the intersection of cloud–edge collaboration in data analysis is crucial. This section
aims to delve into the theoretical complexities, exploring how these technologies
can work together to optimize data processing, storage, and analytics:

• Enhanced Data Synchronization and Consistency Mechanisms: Future


research must focus on developing sophisticated mechanisms for data
synchronization between cloud and edge environments, ensuring data
consistency and integrity despite the inherent dynamism and network instability
typical of edge computing contexts. This involves crafting advanced protocols
that intelligently discern the priority and relevance of data, thereby managing
synchronization processes more effectively. Such advancements are critical in
maintaining accurate and reliable data flows, which are essential for decision-
making processes in industrial edge computing scenarios.
• Adaptive Data Processing and Storage Strategies: The exploration of dynamic data processing and storage strategies, contingent upon the criticality and urgency of data, is paramount. Investigative efforts should concentrate on algorithms that determine the optimal location for data processing and storage, either at the edge or in the cloud, thereby optimizing cost and latency. Such strategies would enable prioritized, local processing of essential data at the edge, while deferring less critical data to cloud-based processing, enhancing overall system efficiency and responsiveness (see the sketch at the end of this subsection).
• Intelligent Data Analysis and Machine Learning Models: There is a pressing
need to develop intelligent data analysis models and lightweight machine learn-
ing algorithms that are efficiently executable within cloud–edge environments.
Research should focus on devising machine learning models that are sustainable
on edge devices with limited computational capacity, thereby reducing depen-
dence on cloud resources. These advancements will pave the way for more
autonomous, real-time decision-making at the edge, leveraging the power of AI
and machine learning in data-intensive industrial applications.
• Enhanced Security and Privacy Protections: Future research must address the
development of robust security protocols and privacy-preserving mechanisms,
especially for sensitive data traversing the cloud–edge collaborative framework.
Emphasis should be placed on exploring advanced encryption and anonymization
techniques, safeguarding data integrity and confidentiality during collaborative
processing and analysis. Such initiatives are crucial in mitigating the risks asso-
ciated with data breaches and privacy violations in increasingly interconnected
industrial edge computing environments.
• Resource Optimization and Load Balancing: Investigative efforts should be
channeled toward the dynamic allocation of computational and storage resources
between the cloud and edge, aiming to achieve optimal performance and cost
efficiency. The development of intelligent load balancing algorithms is essential
to evenly distribute workloads across various edge nodes, preventing resource
overload or underutilization. Such approaches will significantly contribute to
enhancing the operational efficiency and resilience of cloud–edge ecosystems.
• Seamless Integration of Edge and Cloud Computing: The creation of a unified
framework that facilitates seamless integration of edge and cloud computing is
an essential research trajectory. This includes exploring advanced networking
technologies, such as 5G, to bolster communication and collaboration efficiency
between the cloud and edge. Such integrative efforts are imperative for enabling

cross-platform data processing and analysis, thereby harnessing the full potential
of both cloud and edge computing paradigms in industrial applications.
• Energy Efficiency and Sustainability: Optimizing the energy efficiency of
cloud–edge collaborative data analysis, particularly on edge devices, is a critical
area for future research. This includes exploring the use of renewable energy
sources and energy-saving technologies to reduce the overall energy consump-
tion of cloud–edge systems. Pursuing sustainable data analysis practices is
paramount, not only to minimize environmental impact but also to ensure the
long-term viability and cost-effectiveness of industrial edge computing solutions.
Each of these research directions holds the potential to significantly advance the
field of cloud–edge collaborative data analysis, addressing current limitations and
unlocking new possibilities for industrial edge computing.
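As a toy illustration of the adaptive placement idea sketched in the list above, the function below chooses between edge and cloud by comparing estimated transfer-plus-compute latencies; every rate, capability ratio, and threshold here is a hypothetical placeholder rather than a proposal from this chapter.

```python
from dataclasses import dataclass

@dataclass
class Task:
    size_mb: float     # input data volume
    work: float        # abstract compute demand
    deadline_s: float  # latency budget
    critical: bool     # privacy-sensitive data pinned to the edge

# Hypothetical tier profiles: (link rate in Mbps, relative compute speed).
EDGE = (80.0, 1.0)
CLOUD = (20.0, 8.0)

def latency(task, tier):
    rate, speed = tier
    return task.size_mb * 8.0 / rate + task.work / speed

def place(task):
    """Pick the tier with the lower estimated latency; keep critical data
    local, and fall back to the edge when no tier meets the deadline."""
    if task.critical:
        return "edge"
    l_edge, l_cloud = latency(task, EDGE), latency(task, CLOUD)
    best = "edge" if l_edge <= l_cloud else "cloud"
    return best if min(l_edge, l_cloud) <= task.deadline_s else "edge"

print(place(Task(size_mb=50.0, work=4.0, deadline_s=12.0, critical=False)))
```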

7.1.3 Real-Time Communication with Less Data

The third dimension of this exploration focuses on the crucial aspect of real-time
communication optimization in edge computing, with an emphasis on minimizing
data transmission. This involves theoretical considerations aimed at developing
advanced communication protocols and algorithms [9, 10]. The goal is to reduce
latency and bandwidth usage, ensuring quick and accurate information exchange
with minimal data payload. This section will not only delve into the theoretical
underpinnings but also suggest potential methodologies for achieving efficient real-
time communication in industrial edge computing environments.
In today’s world, where timely information is paramount, the need for real-
time communication, coupled with the necessity to minimize data transmission,
has become a key area of focus [11]. “Real-time Communication with Less
Data” represents a significant shift in approach, recognizing the importance of
instant information exchange while optimizing network resource utilization. The
theoretical exploration of this concept includes the following aspects:
• Low Latency Protocols: The theoretical underpinnings revolve around the
development of low latency communication protocols. These protocols aim
to minimize the time it takes for data to traverse the network, ensuring that
information reaches its destination with the utmost promptness.
• Bandwidth Optimization Strategies: The exploration includes strategies for
optimizing bandwidth usage. This involves the development of compression
algorithms, data aggregation techniques, and other methodologies to reduce the
amount of data transmitted without compromising the integrity and accuracy of
the information.
• Edge Computing for Localized Communication: Real-time communication
is enhanced by leveraging edge computing capabilities. Edge devices facilitate
localized communication, allowing critical information to be exchanged without

the need for extensive data transfer to centralized servers, thereby reducing
latency.
• Prioritized Data Transmission: The theoretical framework includes mechanisms for prioritizing data transmission. By discerning between critical and noncritical data, the communication system can ensure that essential information is transmitted swiftly, while less time-sensitive data can follow, optimizing the use of network resources (see the sketch after this list).
• Dynamic QoS Adaptation: The exploration involves the theoretical develop-
ment of dynamic QoS adaptation mechanisms. These mechanisms enable the
communication system to adjust its parameters based on the current network
conditions, ensuring optimal performance in varying situations.
• Security Measures: The theoretical exploration also encompasses security
measures. Efficient real-time communication requires robust security protocols
to safeguard the transmitted data from unauthorized access and potential threats.
This involves encryption, authentication, and other security measures integrated
into the communication framework.
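As a toy sketch of the prioritized-transmission mechanism mentioned above, the code below drains a priority queue so that critical messages are always sent before less urgent ones; the priority levels, message format, and the send stub are hypothetical, not part of any protocol defined in this book.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Message:
    priority: int                      # 0 = most critical
    payload: bytes = field(compare=False)

queue: list = []

def enqueue(msg):
    heapq.heappush(queue, msg)

def transmit_next():
    """Send the most urgent pending message first."""
    if queue:
        send(heapq.heappop(queue).payload)

def send(payload):
    print("sending", payload)          # stand-in for a link-layer send

# A critical alarm preempts routine telemetry queued earlier.
enqueue(Message(priority=5, payload=b"routine telemetry"))
enqueue(Message(priority=0, payload=b"machine overheat alarm"))
transmit_next()   # -> machine overheat alarm
transmit_next()   # -> routine telemetry
```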
In conclusion, the pursuit of real-time communication with minimal data is a
crucial theoretical response to the growing need for instant information exchange
in today’s highly connected world. This exploration goes beyond addressing the
technical complexities of reducing latency and optimizing bandwidth. It sets the
groundwork for a communication framework that is not only adaptive and secure but
also precisely calibrated to meet the requirements of contemporary interconnected
systems. This endeavor is pivotal in ensuring that industrial edge computing systems
can operate efficiently, responsively, and securely in an environment where rapid
data exchange is increasingly vital.

7.2 Application Scenarios

In this section, the book explores practical implementations of industrial edge


computing through various application scenarios. These include prognostics and
health management in industrial environments, smart grid systems, manufacturing,
intelligent connected vehicles, and smart logistics [12, 13]. Prognostics and health
management in industrial settings leverage edge computing for real-time monitoring
and predictive maintenance, enhancing equipment efficiency and reducing down-
time. In smart grids, edge computing is crucial for managing energy distribution and
consumption, enabling quicker data processing for energy balance and integration
of renewable sources. The manufacturing sector benefits from edge computing
through improved production line efficiency and quality control, facilitated by
immediate data processing. In the realm of intelligent connected vehicles, edge
computing supports essential functionalities like autonomous driving by processing
data in real time, crucial for vehicle safety and operational efficiency. Lastly, smart
logistics utilize edge computing for real-time tracking and optimization of supply
chain management, ensuring efficient resource utilization and route planning. These application scenarios are illustrated in Fig. 7.1, highlighting the wide-ranging and transformative impact of edge computing across different industrial sectors.

Fig. 7.1 Application scenarios of industrial edge computing

7.2.1 Prognostics and Health Management

Prognostics and Health Management (PHM) is increasingly recognized globally


as a valuable solution for monitoring and managing the health of systems using
operational information collected by various sensors. This approach involves mon-
itoring and evaluating the overall health of the system, predicting potential failures,
and taking preemptive measures to avoid them [14, 15]. A PHM system typically
includes capabilities like performance detection, fault detection, fault isolation, fault
prediction, enhanced diagnosis, health management, and component life tracking. It
integrates autonomous support systems with joint distributed information systems
to predict failures in terms of time and location, thereby saving maintenance costs,
enhancing operational reliability, and enabling condition-based maintenance.
The main objective of PHM is to monitor wear and tear, aging, corrosion, and
failure of components during equipment operation, preventing unplanned downtime,
which poses a significant risk to life and property. With a multitude of sensors
to monitor components in real time, PHM is an important application scenario in
industrial edge computing [16]. Due to the enormous volume of sensor data and the
potential risks associated with delays in cloud processing, the integration of edge
computing into PHM systems is crucial. Implementing industrial edge computing
for initial data processing near the sensors significantly reduces the data volume
uploaded to the cloud and decreases the delay in emergency decision-making.
Presently, the application of industrial edge computing to PHM is extensive, with railway track safety detection being a notable case. Edge computing is utilized for real-time

feature extraction [17] and anomaly detection in railway tracks. This facilitates
monitoring the real-time performance of railways and trains, predicting potential
failures to prevent downtime and support optimization decisions. Additionally,
drones are being employed as a source of information for railway track detection,
enhancing the scope of edge computing in PHM [18].

7.2.2 Smart Grid

The smart grid stands as a prime example of industrial edge computing in action.
Its primary goal is to facilitate node monitoring and information exchange for the
transmission of electrical energy from power plants to end users [19, 20]. The
smart grid offers significant advancements over traditional power grids through
the integration of various aspects of power production, transmission, distribution,
and security protection using advanced information technology. This integration
allows both power grid companies and users to access real-time information about
the status of the grid and electricity consumption, thereby enhancing the overall
efficiency of the power system.
In modern smart grids, a vast array of smart meters and various types of sensing
devices are deployed. This results in a complex overall structure with heterogeneous
data types and substantial instantaneous data volumes. To address these challenges,
edge servers can be deployed near smart meters and sensing devices. These servers
perform in data analysis and make partial decisions, facilitating regional equipment
management and energy efficiency optimization. This approach not only improves
management efficiency but also meets real-time operational requirements. The edge
servers collect data essential for equipment maintenance and structural optimization
and then upload it to the cloud center for centralized processing, analysis, and
training [21].
A smart grid system based on industrial edge computing can intelligently detect
grid structures [22], distribute computing, storage, and control services to the edge
network, and effectively allocate intelligent resources of the entire power system
closer to end users [23]. Such a system can support high-demand functions like
intelligent low-voltage area management, user power management, and monitoring
of external force damage risks [24], showcasing the transformative potential of edge
computing in enhancing smart grid capabilities.

7.2.3 Manufacturing

In the context of industrial manufacturing, the challenges are multifaceted due to


the diversity of industrial devices, the sheer volume of real-time data, complex com-
munication network topologies and protocols, and the need for high-performance,
accurate, and real-time information transmission [25]. Coordinating production

equipment and software management systems at industrial sites is a significant


hurdle. Industrial edge computing, particularly when integrated with Network
Function Virtualization (NFV) technology and real-time network transmission
technology, offers a solution. It enables high-quality network connections from
cloud platforms to edge computing platforms at industrial sites, ensuring flexible,
isolated application deployment, and providing intelligent, real-time, secure, and
quality-assured industrial network and edge computing services.
A notable application in this domain is real-time image or video processing
for purposes like product defect detection and classification [26], worker motion
correction, or checking for assembly errors in equipment components [27]. This
process typically involves image acquisition, preprocessing, segmentation, feature
extraction, and matching recognition [28], often combined with artificial intelli-
gence algorithms. The image detection model, pre-trained using industrial images
or video datasets, may utilize incremental learning algorithms to continuously refine
the model for improved recognition accuracy [29]. By incorporating industrial edge
computing, the training process remains on the cloud platform, while the recognition
tasks are handled at the edge, ensuring both accuracy and reduced latency.
Manufacturing sites often employ various network access methods, like indus-
trial Ethernet or Fieldbus, each with multiple protocols, complicating intercon-
nectivity. Industrial edge computing platforms can convert different protocols
into a common one, addressing the connectivity issue across diverse industrial
networks. Additionally, these platforms offer management and data interfaces and
use lightweight network and application virtualization management for remote
management, upgrading, and maintenance of a vast array of devices and appli-
cations. They allow for remote configuration and monitoring, data cleaning and
desensitization to ensure data usability without leaking sensitive information, and
integrate chip-level secure booting and secure key authentication to provide a secure
environment for industrial networks.

7.2.4 Intelligent Connected Vehicles

The emergence of the 5G era heralds significant advancements in the field of Intel-
ligent Connected Vehicle (ICV), poised to become a crucial scenario in industrial
edge computing [30]. A core solution for ICV in this new era is the collaboration
between edge and cloud computing. Cloud computing acts as a super brain for
vehicles, handling complex processes such as area-wide traffic forecasting [31, 32],
while edge computing functions like the vehicles’ nerve endings, performing more
immediate and “subconscious” reactions such as gathering driving information
about nearby cars or initiating automatic emergency braking.
A key focus within industrial edge computing applied to ICV is autonomous
driving. Regional autonomous driving, which is relatively straightforward, involves
pre-planning the path and speed of a vehicle based on the environmental information
of the entire running area. This enables automatic vehicle operation within a designated area, such as a small amusement park. In situations like the sudden entry of
pedestrians or other vehicles, edge computing swiftly processes image information
from cameras against onboard road data to facilitate immediate responses.
However, adaptive autonomous driving in varying environments is considerably
more complex, encompassing diverse scenarios such as cruising, lane-changing
assistance, navigating intersections, automatic parking, speed control, and path
planning. In these situations, the detection of surrounding vehicles, identification
of traffic signals, and response to emergency obstacles require prompt processing,
which cannot afford the delays of cloud data uploading. Thus, industrial edge com-
puting becomes the processing hub in these scenarios, determining the prioritization
of events for processing either on the edge server or the onboard system.
Beyond driving functionalities, onboard entertainment and services are integral
to the ICV experience. As vehicles travel at high speeds, roadside fixed edge servers
support real-time communication with the vehicles, akin to mobile phone services
but with different movement ranges and speeds. Consequently, the application
scenario for ICV closely aligns with research in MEC [33]. The frequent interactions
among vehicles, edge servers, and the cloud platform necessitate efficient data
routing, caching, and offloading strategies to fulfill the needs of ICVs in the 5G
era.

7.2.5 Smart Logistics

The logistics industry is increasingly becoming a significant application scenario for


industrial edge computing [34, 35]. Traditional logistics, primarily based on RFID
technology, focuses on recording the storage and distribution information of goods
and performing basic management tasks. However, this approach often falls short in
achieving fully automated logistics operations without manual intervention. As the
commodity economy grows, the demand for complete automation in logistics and
storage, along with comprehensive data recording and management of the logistics
process, is becoming more pronounced. This includes tracking transportation routes,
vehicle statuses, driving behaviors, and the storage environments of goods.
In the warehousing and distribution stages of logistics, goods typically undergo
packaging, sorting, stacking, and loading, with RFID tags used to identify infor-
mation about the goods. The incorporation of robotic arms and automated systems
can significantly reduce manual intervention in traditional logistics processes like
packing, stacking, and loading. Additionally, the introduction of logistics robots
can optimize the classification and sorting of goods, paving the way for fully
automated warehousing and distribution. During the sorting process, logistics
robots, functioning as intelligent terminal devices, communicate with edge servers
to inform them about the types of goods identified via RFID tags. The edge server
then plans the optimal path for transporting the goods from the shelf to the logistics vehicle and communicates it back to the robot. The robot, following these
instructions, ensures the correct matching of goods and vehicles on site [36].

Logistics vehicles, which transport goods between warehouses, often travel on


roads with sparse edge servers or base stations. To effectively record and monitor
vehicle statuses, routes, and other pertinent information, and to respond promptly to
emergencies, onboard intelligent terminals based on edge computing are necessary.
These terminals continuously monitor vehicle status and provide real-time alerts
and actions for any detected abnormalities, storing and uploading data to cloud
service platforms for further analysis. Monitoring and alerting systems for driver
behavior are also essential to reduce accident risks. For goods requiring specific
environmental conditions during transport, real-time monitoring of the storage
environment is crucial to minimize transportation losses. Data exchange occurs
when vehicles pass by roadside base stations or edge servers, ensuring continuous
communication and data sharing.

References

1. Sawsan AbdulRahman, Safa Otoum, Ouns Bouachir, and Azzam Mourad. Management of
digital twin-driven IoT using federated learning. IEEE J. Sel. Areas Commun., 41(11):3636–
3649, 2023.
2. Xiangyi Chen, Guangjie Han, Yuanguo Bi, Zimeng Yuan, Mahesh K. Marina, Yufei Liu,
and Hai Zhao. Traffic prediction-assisted federated deep reinforcement learning for service
migration in digital twins-enabled MEC networks. IEEE J. Sel. Areas Commun., 41(10):3212–
3229, 2023.
3. Felipe Arraño-Vargas and Georgios Konstantinou. Modular design and real-time simulators
toward power system digital twins implementation. IEEE Trans. Ind. Informatics, 19(1):52–
61, 2023.
4. Sangeen Khan, Sehat Ullah, Habib Ullah Khan, and Inam Ur Rehman. Digital-twins-based
internet of robotic things for remote health monitoring of COVID-19 patients. IEEE Internet
Things J., 10(18):16087–16098, 2023.
5. Peiyin Xing, Yaowei Wang, Peixi Peng, Yonghong Tian, and Tiejun Huang. End-edge-cloud
collaborative system: A video big data processing and analysis architecture. In 3rd IEEE
Conference on Multimedia Information Processing and Retrieval, MIPR 2020, Shenzhen,
China, August 6–8, 2020, pages 233–236, 2020.
6. Zhichen Ni, Honglong Chen, Zhe Li, Xiaomeng Wang, Na Yan, Weifeng Liu, and Feng Xia.
MSCET: A multi-scenario offloading schedule for biomedical data processing and analysis
in cloud-edge-terminal collaborative vehicular networks. IEEE ACM Trans. Comput. Biol.
Bioinform., 20(4):2376–2386, 2023.
7. Qing Han, Xuebin Ren, Peng Zhao, Yimeng Wang, Luhui Wang, Cong Zhao, and Xinyu
Yang. Eccvideo: A scalable edge cloud collaborative video analysis system. IEEE Intell. Syst.,
38(1):34–44, 2023.
8. Xilai Liu, Zhihui Ke, Xiaobo Zhou, Tie Qiu, and Keqiu Li. QoE-oriented adaptive video
streaming with edge-client collaborative super-resolution. In IEEE Global Communications
Conference, GLOBECOM 2022, Rio de Janeiro, Brazil, December 4–8, 2022, pages 6158–
6163, 2022.
9. Baoquan Yu, Yueming Cai, Xianbang Diao, and Yong Chen. AoI minimization scheme
for short-packet communications in energy-constrained IIoT. IEEE Internet Things J.,
10(22):20188–20200, 2023.

10. Chenlu Zhuansun, Kedong Yan, Gongxuan Zhang, Chanying Huang, and Shan Xiao.
Hypergraph-based joint channel and power resource allocation for cross-cell M2M commu-
nication in IIoT. IEEE Internet Things J., 10(17):15350–15361, 2023.
11. Shaoling Hu and Wei Chen. Joint lossy compression and power allocation in low latency wire-
less communications for IIoT: A cross-layer approach. IEEE Trans. Commun., 69(8):5106–
5120, 2021.
12. Zakaria Abou El Houda, Bouziane Brik, Adlen Ksentini, Lyes Khoukhi, and Mohsen Guizani.
When federated learning meets game theory: A cooperative framework to secure IIoT
applications on edge computing. IEEE Trans. Ind. Informatics, 18(11):7988–7997, 2022.
13. Wenhao Fan, Shenmeng Li, Jie Liu, Yi Su, Fan Wu, and Yuanan Liu. Joint task offloading
and resource allocation for accuracy-aware machine-learning-based IIoT applications. IEEE
Internet Things J., 10(4):3305–3321, 2023.
14. X. Yi, Y. Chen, P. Hou, and Q. Wang. A survey on prognostic and health management for
special vehicles. In 2018 Prognostics and System Health Management Conference (PHM-
Chongqing), pages 201–208, 2018.
15. Carlos Pedroso, Yan Uehara de Moraes, Michele Nogueira, and Aldri Santos. Relational
consensus-based cooperative task allocation management for IIoT-health networks. In 17th
IFIP/IEEE International Symposium on Integrated Network Management, IM 2021, Bordeaux,
France, May 17–21, 2021, pages 579–585, 2021.
16. A. L. Ellefsen, V. Æsøy, S. Ushakov, and H. Zhang. A comprehensive survey of prognostics
and health management based on deep learning for autonomous ships. IEEE Transactions on
Reliability, 68(2):720–740, 2019.
17. Z. Liu and Others. Industrial AI enabled prognostics for high-speed railway systems. In 2018
IEEE International Conference on Prognostics and Health Management (ICPHM), pages 1–8,
2018.
18. J. Yang, X. Cheng, Y. Wu, Y. Qin, and L. Jia. Railway comprehensive monitoring and warning
system framework based on space-air-vehicle-ground integration network. In 2018 Prognostics
and System Health Management Conference (PHM-Chongqing), pages 1314–1319, 2018.
19. Lulu Wen, Kaile Zhou, Wei Feng, and Shanlin Yang. Demand side management in smart
grid: A dynamic-price-based demand response model. IEEE Trans. Engineering Management,
71:1439–1451, 2024.
20. H. Farhangi. The path of the smart grid. IEEE Power and Energy Magazine, 8(1):18–28, 2010.
21. James Cunningham, Alexander J. Aved, David Ferris, Philip Morrone, and Conrad S. Tucker.
A deep learning game theoretic model for defending against large scale smart grid attacks.
IEEE Trans. Smart Grid, 14(2):1188–1197, 2023.
22. G. Lin and Others. Community detection in power grids based on Louvain heuristic algorithm.
In 2017 IEEE Conference on Energy Internet and Energy System Integration (EI2), pages 1–4,
2017.
23. H. Wang, Q. Wang, Y. Li, G. Chen, and Y. Tang. Application of fog architecture based on
multi-agent mechanism in CPPS. In 2018 2nd IEEE Conference on Energy Internet and Energy
System Integration (EI2), pages 1–6, 2018.
24. C. Jinming, J. Wei, J. Hao, G. Yajuan, N. Guoji, and C. Wu. Application prospect of
edge computing in smart distribution. In 2018 China International Conference on Electricity
Distribution (CICED), pages 1370–1375, 2018.
25. F. Shrouf, J. Ordieres, and G. Miragliotta. Smart factories in industry 4.0: A review of the
concept and of energy management approached in production based on the internet of things
paradigm. In 2014 IEEE International Conference on Industrial Engineering and Engineering
Management, pages 697–701, 2014.
26. L. Li, K. Ota, and M. Dong. Deep learning for smart industry: Efficient manufacture inspection
system with fog computing. IEEE Transactions on Industrial Informatics, 14(10):4665–4673,
2018.
27. H. Kanzaki, K. Schubert, and N. Bambos. Video streaming schemes for industrial IoT. In 2017
26th International Conference on Computer Communication and Networks (ICCCN), pages
1–7, 2017.

28. A. Sabu and K. Sreekumar. Literature review of image features and classifiers used in leaf
based plant recognition through image analysis approach. In 2017 International Conference on
Inventive Communication and Computational Technologies (ICICCT), pages 145–149, 2017.
29. Y. Wang and M. Weyrich. An adaptive image processing system based on incremental learning
for industrial applications. In Proceedings of the 2014 IEEE Emerging Technology and Factory
Automation (ETFA), pages 1–4, 2014.
30. C. Chen, J. Hu, T. Qiu, M. Atiquzzaman, and Z. Ren. CVCG: Cooperative V2V-aided
transmission scheme based on coalitional game for popular content distribution in vehicular
ad-hoc networks. IEEE Transactions on Mobile Computing, pages 1–18, 2018.
31. A. Thakur and R. Malekian. Fog computing for detecting vehicular congestion, an internet
of vehicles based approach: A review. IEEE Intelligent Transportation Systems Magazine,
11(2):8–16, 2019.
32. S. Yang, Y. Su, Y. Chang, and H. Hung. Short-term traffic prediction for edge computing-
enhanced autonomous and connected cars. IEEE Transactions on Vehicular Technology,
68(4):3140–3153, 2019.
33. F. Giust, V. Sciancalepore, D. Sabella, M. C. Filippou, S. Mangiante, W. Featherstone, and
D. Munaretto. Multi-access edge computing: The driver behind the wheel of 5g-connected
cars. IEEE Communications Standards Magazine, 2(3):66–73, 2018.
34. Pirmin Fontaine, Stefan Minner, and Maximilian Schiffer. Smart and sustainable city logistics:
Design, consolidation, and regulation. Eur. J. Oper. Res., 307(3):1071–1084, 2023.
35. Hiren Dutta, Saurabh Nagesh, Jawahar Talluri, and Parama Bhaumik. A solution to blockchain
smart contract based parametric transport and logistics insurance. IEEE Transactions on
Services Computing, 16(5):3155–3167, 2023.
36. C. Lin and J. Yang. Cost-efficient deployment of fog computing systems at logistics centers in
industry 4.0. IEEE Transactions on Industrial Informatics, 14(10):4603–4611, 2018.
