5G NETWORK OPERATIONS

AI/ML BASED RECURSIVE AUTONOMIC OSS
TABLE OF CONTENTS

Executive Summary
Introduction
5G Network Architecture
Network Slicing
Key Challenges in Operating a 5G Network
Altran’s Proposed Solution – Reinforcement Learning Based Recursive Autonomic OSS Systems
Conclusion
Altran
EXECUTIVE SUMMARY

5G is the 5th generation of mobile networks designed to meet
the large growth in data and connectivity of today’s highly mobile
and fully connected society. 5G will bring new unique network and
service capabilities. Firstly, it will ensure user experience continuity in
challenging situations such as high mobility (e.g., in trains), very dense
or sparsely populated areas, and journeys covered by heterogeneous
technologies. In addition, 5G will be a key enabler for IoT by providing
a platform to connect a massive number of sensors, rendering devices
and actuators with stringent energy and transmission constraints.
Furthermore, mission critical services requiring very high reliability,
global coverage and very low latency will become natively supported
by the 5G infrastructure.

5G networks use the concept of end-to-end network slicing, which
enables the concurrent deployment of multiple end-to-end logical,
self-contained and independent shared or partitioned networks on
a common infrastructure platform, to achieve the performance and
scalability requirements. Recursive network slicing, i.e., slices overlaid
on top of other network slices, is also supported.

To manage such highly scalable and recursively sliced 5G networks
and to maintain the customer QoE and SLAs in real time, the OSS
systems managing the operations must be autonomous and self-
driven. They should be AI/ML based and must support cognitive
algorithms for automation of network operations.

Altran proposes a Reinforcement Learning based Recursive
Autonomic OSS Solution for operating the 5G networks. The
proposed solution is analogous to the concept of autonomic control
systems and reinforcement learning present in the human body.

INTRODUCTION

The proliferation of connected objects and devices in the 5G networks
will pave the way to a wide range of new services and associated
business models enabling automation in various industry sectors and
vertical markets (e.g., energy, e-health, smart city, connected cars,
industrial manufacturing, etc.). In addition to more pervasive human
centric applications, e.g., virtual and augmented reality,
4K video streaming, etc., 5G networks will support the communication
needs of machine-to-machine and machine-to-human type
applications. Autonomously communicating devices will create mobile
traffic with significantly different characteristics than today's human-
to-human traffic. The coexistence of human centric and machine type
applications will impose very diverse functional and KPI/performance
requirements that 5G networks will have to support.

There are three major categories of use cases for 5G:

• Massive Machine Type Communications (mMTC) – Enables
the machine-to-machine (M2M) and Internet of Things (IoT)
services that involve connecting billions of devices without human
intervention at a scale not seen before

• Ultra-Reliable Low Latency Communications (uRLLC) –
Supports mission-critical services including real-time control of
devices, industrial robotics, vehicle-to-vehicle communications and
safety systems, autonomous driving and safer transport networks

• Enhanced Mobile Broadband (eMBB) – Provides significantly
faster data speeds and higher capacity to keep the world
connected. New applications will include fixed wireless internet
access for homes, outdoor broadcast applications without the
need for broadcast vans and greater connectivity for people on
the move

The management, orchestration and operation of the 5G networks
supporting the above categories of services and associated
performance requirements require unprecedented efficiency and
automation levels. The OSS solutions for operating such networks
need to be intelligent enough to self-heal and auto-revive in real
time in order to address complex use cases and dynamic network
behavior.

This paper starts with an overview of the 5G network architecture,
network slicing concept and key challenges in operating a 5G
network. It then describes Altran’s proposed reinforcement learning
based recursive autonomic OSS solution for operating the 5G
networks.

5G NETWORK ARCHITECTURE

The 5G network architecture is highly service oriented: services
are provided via a common framework to the network functions that
are permitted to make use of them. Modularity, reusability
and self-containment of network functions are additional design
considerations as per 3GPP specifications for a 5G network
architecture.

Using SDN and NFV technologies, radio access (RAN), transport and
core networks in 5G have been designed to be cloud oriented. Cloud
orientation allows for better support for diversified 5G services and
enables the key technologies of E2E network slicing.

NETWORK SLICING

Network slicing is an end-to-end concept covering all 5G network
segments, including radio access, core, transport and edge networks.
In a 5G network, it is defined as “a composition of adequately
configured network functions, network applications, and the
underlying cloud infrastructure (physical, virtual or even emulated
resources, RAN resources, etc.), that are bundled together to meet
the requirements of a specific use case, e.g., bandwidth, latency,
processing, and resiliency, coupled with a business purpose.”

Network slicing allows multiple logical networks to simultaneously
run on top of a shared physical network infrastructure across multiple
domains and technologies to create tenant or service specific
networks. It aims to build dedicated logical networks that exhibit
functional architectures customized to the respective telco services,
e.g., eMBB, uRLLC, mMTC, etc. From a business point of view, a slice
includes a combination of all the relevant network resources, network
functions, service functions and enablers required to fulfill a specific
business case or service, including OSS and BSS.

A pictorial representation of the Network Slice concept is given
below:

Figure 1: Network Slicing Concept

KEY CHALLENGES IN OPERATING A 5G NETWORK

5G networks are expected to support use cases such as enhanced
mobile broadband (eMBB), ultra-reliable low latency communications
(uRLLC) and massive machine type communications (mMTC) based
on characteristics such as ultra-low latency, high data speeds
and the ability to connect many IoT devices cost effectively. While
the use cases and capabilities expected to be supported by 5G
are compelling, operating such networks poses many different
challenges. Some of these are described below:

• Management of the network slice instances recursively overlaid
on top of other network slice instances

• Ensuring uninterrupted availability, rapid scalability, resilience
and high levels of automation for the monitoring and provisioning
of network slice resources

• Adhering to the highly demanding SLAs associated with real-time
communication services

• Unified performance monitoring of a hybrid network composed
of 3G/4G/5G physical, traditional, NFV and cloud infrastructure

• Monitoring network slice instances to ensure that virtual
resources spun up for specific applications or customers will
meet the required service levels and committed QoE levels

• Implementation of automated cross-domain and inter/intra
slice correlation, root cause analysis and self-healing

• Accurate fault predictions and forecasting

• Automated self-driving processes and workflows for
performance monitoring and fault resolution

• (Near) real-time processing of massive amounts of data generated
from billions of devices

• Implementation of advanced data analytics to generate
intelligence that can infer, predict, forecast, fix and preempt
network faults

ALTRAN’S PROPOSED SOLUTION – REINFORCEMENT LEARNING BASED
RECURSIVE AUTONOMIC OSS SYSTEMS

Altran proposes a reinforcement learning based recursive
autonomic OSS system that is analogous to reinforcement learning
and autonomic control in the human body. The sections below detail
the autonomic functions of the human body and the proposed solution
that is analogous to them.

Autonomic Control Systems & Reinforcement Learning in the Human Body

Reinforcement learning is an area of machine learning inspired
by behaviorist psychology. Humans rely on reflexes for some of their
behaviors, and reinforcement learning shapes how these reflexes
manage resources and adapt to unpredictable changes in the
environment.

The figure below shows how reflex actions and reinforcement
learning work in the human body. A reflex action carried out by the
spinal cord occurs automatically: the stimulus is processed by the
spinal cord as a reflex arc and a response is sent back to the
periphery through the motor neuron. At the same time, a signal is
sent to the brain by a connecting interneuron (relay neuron) for
further interpretation and additional reaction.

Figure 2: Reflex Actions and Reinforcement Learning in Humans

Two control centers take part in this reinforcement learning: the
Central Autonomic Control System (CACS) and the Local Autonomic
Control System (LACS). Both are inherent in the human
body. The reflex arc is analogous to the LACS and the brain
is analogous to the CACS. The reflex arc processes any signal
that needs an immediate response, such as a finger touching a hot
surface, and at the same time propagates the reflex action to the
brain as a signal. The brain learns from this signal (reinforcement
learning), processes it further based on signals received from other
sensors and receptors, and applies this learning the next time the
body comes near a hot surface.

Autonomic Control Systems & Reinforcement Learning in 5G Network Operations

The concepts of “autonomic control systems” and “reinforcement
learning” can be applied in network operations to achieve the
dynamism and agility required for managing the 5G networks.
The figure below shows a one-to-one mapping of each step taken by
the human body (when a hot surface is touched with a finger) to the
corresponding step to be taken by the network when an event such
as a spike in the data traffic is captured.

Figure 3: Reinforcement Learning in Network Operations

The captured event is propagated to the LACS, which functions
as the reflex arc and takes immediate remedial action. The event,
the remedial action and its result are propagated to the CACS, which
stores them in its knowledge plane.

Local Autonomic Control System (LACS)

The LACS in network operations is implemented as a closed loop
automation system driven by predefined rules, policies and the
knowledge cache. The knowledge cache is created through offline
and online mechanisms.

In the offline method, the knowledge/learning data is prepopulated
in the form of training data sets that can be used for making
decisions. In the online method, the knowledge/learning data is
pushed from the CACS on an ongoing basis.
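
The closed loop described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Altran's implementation: the class name LocalAutonomicControl, the event kinds, the resource identifiers and the action names are all hypothetical, and the knowledge cache is reduced to a simple mapping from event kind to remedial action.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Event:
    kind: str       # hypothetical event type, e.g. "traffic_spike"
    resource: str   # hypothetical identifier of the affected resource


@dataclass
class LocalAutonomicControl:
    """Hypothetical LACS: a rule-driven closed loop with a knowledge cache."""
    report_to_cacs: Callable[[Event, Optional[str]], None]
    knowledge_cache: Dict[str, str] = field(default_factory=dict)

    def load_offline(self, training_rules: Dict[str, str]) -> None:
        # Offline method: prepopulate the cache before going live.
        self.knowledge_cache.update(training_rules)

    def push_from_cacs(self, learned_rules: Dict[str, str]) -> None:
        # Online method: newer learnings from the CACS override older entries.
        self.knowledge_cache.update(learned_rules)

    def handle(self, event: Event) -> Optional[str]:
        # Reflex: pick the cached action immediately, then report upward.
        action = self.knowledge_cache.get(event.kind)
        self.report_to_cacs(event, action)
        return action


reports: List[tuple] = []
lacs = LocalAutonomicControl(report_to_cacs=lambda e, a: reports.append((e.kind, a)))
lacs.load_offline({"traffic_spike": "scale_out"})
first = lacs.handle(Event("traffic_spike", "upf-1"))
lacs.push_from_cacs({"traffic_spike": "reroute"})
second = lacs.handle(Event("traffic_spike", "upf-1"))
unknown = lacs.handle(Event("link_flap", "gnb-7"))
```

In this sketch an event with no cached action is still reported upward, so that a central system would have the chance to learn a response for it.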

Central Autonomic Control System (CACS)

CACS is implemented like the human brain, an intelligent system
that can identify the various scenarios and patterns in network traffic,
decide on the best possible action(s) to be taken, learn from the
outcomes of the actions and update its knowledge plane with the
learnings.

The CACS comprises deep machine learning models based on
neural networks. These models are applied to identify traffic patterns
and to reason about the best possible action.

The CACS propagates all the learnings back to all the LACSs which
are connected to it. This ensures that the local knowledge bases of
all the LACSs are updated with this learning and they can apply this
knowledge next time such an event occurs.
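
A minimal sketch of the learn-and-propagate cycle is given below. It deliberately swaps the deep neural network models described above for a plain table of action values, updated a step at a time toward an observed reward, so that the flow stays readable. The class name, the reward scale, the learning rate and the action names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class CentralAutonomicControl:
    """Hypothetical CACS that learns action values from reported outcomes."""

    def __init__(self, learning_rate: float = 0.5) -> None:
        self.learning_rate = learning_rate
        # Knowledge plane: estimated value of each (event kind, action) pair.
        self.values: Dict[Tuple[str, str], float] = defaultdict(float)
        # Knowledge caches of the connected LACSs (plain dicts in this sketch).
        self.subscribers: List[Dict[str, str]] = []

    def attach(self, lacs_cache: Dict[str, str]) -> None:
        self.subscribers.append(lacs_cache)

    def learn(self, event_kind: str, action: str, reward: float) -> None:
        # Move the estimate a step toward the observed reward.
        key = (event_kind, action)
        self.values[key] += self.learning_rate * (reward - self.values[key])
        self._propagate(event_kind)

    def best_action(self, event_kind: str) -> str:
        candidates = {a: v for (k, a), v in self.values.items() if k == event_kind}
        return max(candidates, key=candidates.get)

    def _propagate(self, event_kind: str) -> None:
        # Push the current best action to every connected LACS cache.
        best = self.best_action(event_kind)
        for cache in self.subscribers:
            cache[event_kind] = best


cacs = CentralAutonomicControl()
cache_a: Dict[str, str] = {}
cache_b: Dict[str, str] = {}
cacs.attach(cache_a)
cacs.attach(cache_b)
cacs.learn("traffic_spike", "scale_out", reward=0.2)  # partially effective
cacs.learn("traffic_spike", "reroute", reward=1.0)    # resolved the spike
```

After the second outcome is learned, both attached caches hold "reroute" for a traffic spike, which mirrors the idea that every connected LACS can apply the learning the next time such an event occurs.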

The figure below shows the working of the CACS at a high level.

Figure 4: High Level Working of CACS

Autonomic Control Systems in the 5G OSS Architecture

In the OSS architecture for 5G, distributed OSS systems are
implemented as autonomic control systems, as shown in the figure
below.

Figure 5: Autonomic Control Systems in OSS Architecture for 5G

The CACS is implemented as the central autonomic OSS, responsible
for the overall management and orchestration of network slices and
services over the network operator’s 5G network.

The LACS is implemented as the local autonomic OSS, which acts
as the network slice manager and is responsible for the lifecycle
management and orchestration of network slices and services within
each slice.

The overall OSS architecture for the 5G network operations is
depicted below:

Figure 6: Overall OSS Architecture for 5G

The different layers in the architecture are detailed below:

• Network Layer: This layer hosts all the 5G network functions
(physical & virtual) of the 5G network operator

• Network Management & Control Layer: All the element
managers, VNF managers and network controllers are a part of
this layer

• Network Orchestration Layer: Network resource orchestrators
of different domains (access, core, transport, etc.) and multi
domain network orchestrators are a part of this layer. These are
responsible for interfacing with the element & VNF managers
and SDN controllers to allocate and activate the network
resources as required

• Network Slice & Network Service Orchestration Layer: This layer
comprises the central autonomic OSS system. It is responsible
for the fulfillment and assurance of network slices and services
provided by the 5G operator

• Network Slice Management Layer: A network slice can be
overlaid on top of another network slice, and this can happen
recursively. Whenever a new network slice is created and
activated, a local autonomic OSS system is spawned by
the parent (central/local) autonomic OSS for the resource
management. This layer comprises multiple local autonomic OSS
systems responsible for the fulfillment and assurance of all
recursively overlaid parent and child network slices

The concept of recursive spawning of local autonomic OSS systems is
detailed below.

Recursive OSS Architecture

The central autonomic OSS is the OSS system managing the
fulfillment and assurance of the 5G operator’s network resources and
the services & slices utilizing those resources. The local autonomic
OSS is the OSS system managing the assurance of local network
resources and any network slices recursively created using those
resources.

The local autonomic OSS typically only captures the data related
to the events and performance metrics of the network resources
comprising the network slice of the tenant and would not intrude
into the data related to the tenant’s customers and product & service
offerings.

A pictorial representation of the recursive OSS architecture is shown
below:

Figure 7: Recursive OSS Systems for 5G Network Operations

In the above figure, two network slices are created from the 5G
network for two different CSP customer tenants. The recursive
spawning of the local OSS systems for both tenants in different
scenarios is described below:

Network Slice 1: A local OSS system is spawned by the central OSS
system when this slice is created. This local OSS system reports the
fault and performance data of this slice and all its child slices to the
central OSS system.

This CSP tenant further sells network slices to two of its customers –
an MVNO and an OTT player. These are named Network Slice 1.1
and Network Slice 1.2. Two local OSS systems are spawned by the
local OSS system (of the CSP) when these slices are created. These
local OSS systems report the fault and performance data of these two
slices and all their child slices to their parent local OSS system.

Network Slice 2: A local OSS system is spawned by the central OSS
system when this slice is created. This local OSS system reports the
fault and performance data of this slice and all its child slices to the
central OSS system.

This CSP tenant further sells a network slice to its MVNO customer.
This is named as Network Slice 2.1. The MVNO further sells a network
slice to its OTT customer. This is named as Network Slice 2.1.1.

Two local OSS systems are spawned by the local OSS systems (of the
CSP and the MVNO) when these slices are created. These local OSS
systems report the fault and performance data of these two slices and
all their child slices to their parent local OSS systems.
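
The two scenarios above can be modeled as a small tree, sketched below under assumptions. The class AutonomicOSS and its methods are hypothetical names; only the slice names come from the description. Each new slice spawns a local OSS under the OSS that owns its parent slice, and a report raised at any level is relayed up the chain to the central OSS.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AutonomicOSS:
    """Hypothetical OSS node; the root plays the role of the central OSS."""
    name: str
    parent: Optional["AutonomicOSS"] = None
    children: List["AutonomicOSS"] = field(default_factory=list)
    received: List[str] = field(default_factory=list)  # reports seen at this node

    def create_slice(self, slice_name: str) -> "AutonomicOSS":
        # Creating a slice spawns a local OSS under the parent slice's OSS.
        child = AutonomicOSS(name=slice_name, parent=self)
        self.children.append(child)
        return child

    def report(self, message: str) -> None:
        # Fault/performance data is recorded here and relayed to the parent.
        self.received.append(message)
        if self.parent is not None:
            self.parent.report(message)

    def depth(self) -> int:
        return 0 if self.parent is None else 1 + self.parent.depth()


central = AutonomicOSS("central")
slice_1 = central.create_slice("Network Slice 1")
slice_1_1 = slice_1.create_slice("Network Slice 1.1")        # MVNO
slice_1_2 = slice_1.create_slice("Network Slice 1.2")        # OTT player
slice_2 = central.create_slice("Network Slice 2")
slice_2_1 = slice_2.create_slice("Network Slice 2.1")        # MVNO
slice_2_1_1 = slice_2_1.create_slice("Network Slice 2.1.1")  # OTT customer

slice_2_1_1.report("threshold breach on Network Slice 2.1.1")
```

A report raised in Network Slice 2.1.1 reaches the OSS of slices 2.1 and 2 and finally the central OSS, while the OSS systems of Network Slice 1 and its children never see it.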

The local OSS systems report the following data to their parent OSS
(central/local) system:

• Faults/events generated by network resources and their
corrective actions along with the corresponding results
• Performance metrics and KPIs
• Performance threshold breach faults/events and their corrective
actions along with the corresponding results

All the above data is stored in the data lake within the central
autonomic OSS system. The corrective actions and their results
are used for continuous reinforcement learning, and the learnings are
recursively propagated from the central OSS system to all the child
OSS systems.
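
The downward direction, in which learnings flow from the central OSS to every child OSS, can be sketched as a simple recursive walk. The class OSSNode, its knowledge dictionary and the example learning are hypothetical and serve only to show the recursion.

```python
from typing import Dict, List


class OSSNode:
    """Hypothetical OSS node holding a local knowledge base."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.children: List["OSSNode"] = []
        self.knowledge: Dict[str, str] = {}

    def add_child(self, child: "OSSNode") -> "OSSNode":
        self.children.append(child)
        return child

    def propagate(self, learnings: Dict[str, str]) -> int:
        # Apply the learnings locally, then recurse into every child OSS.
        # Returns the number of OSS systems that were updated.
        self.knowledge.update(learnings)
        return 1 + sum(child.propagate(learnings) for child in self.children)


central = OSSNode("central")
slice_1 = central.add_child(OSSNode("Network Slice 1"))
slice_1.add_child(OSSNode("Network Slice 1.1"))
slice_1.add_child(OSSNode("Network Slice 1.2"))
slice_2 = central.add_child(OSSNode("Network Slice 2"))
slice_2_1 = slice_2.add_child(OSSNode("Network Slice 2.1"))
leaf = slice_2_1.add_child(OSSNode("Network Slice 2.1.1"))

updated = central.propagate({"traffic_spike": "reroute"})
```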

Central Autonomic OSS

The functional architecture of the Central Autonomic OSS is depicted
below:

Figure 8: Functional Architecture – Central Autonomic OSS

The central autonomic OSS provides fulfillment and assurance
functions for the network slices, services & resources in the 5G
network. The different components are described below:

• Network Slice & Network Service Order Management: This
component includes catalogs for the network slice and network
service templates and provides the order orchestration and
policy & workflow management functions for the instantiation of
network slices and services

• Dynamic Inventory: This component stores the inventory of
the slice and service instances, and their mappings with the
network resources. It also includes a function for the real-time
synchronization of inventory data with the network

• Fault Management: This component provides the fault
management functions such as event collection, filtering,
processing and correlation

• Performance Management: This component provides
performance management functions such as metrics collection,
aggregation, KPI calculation, thresholding and reporting

• Service Quality Management: This component provides
capabilities for service modeling, monitoring and service impact
analysis

• Closed Loop Automation: This component provides capabilities
for defining the rules and policies that specify the actions to be
taken against different fault conditions for self-healing
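
How the Performance Management and Closed Loop Automation components might fit together is sketched below. The latency threshold, the rule table and the action name are placeholder assumptions rather than values from this paper, and the KPI is reduced to an arithmetic mean of raw samples.

```python
from statistics import mean
from typing import Dict, List, Optional

# Hypothetical threshold and rule table; the values are placeholders only.
LATENCY_THRESHOLD_MS = 10.0
RULES: Dict[str, str] = {"latency_breach": "scale_out_user_plane"}


def latency_kpi(samples_ms: List[float]) -> float:
    """Aggregate raw latency samples into a single KPI (arithmetic mean)."""
    return mean(samples_ms)


def evaluate(samples_ms: List[float]) -> Optional[str]:
    """Return a threshold-breach event name, or None when the KPI is healthy."""
    return "latency_breach" if latency_kpi(samples_ms) > LATENCY_THRESHOLD_MS else None


def closed_loop(samples_ms: List[float]) -> Optional[str]:
    """Map a breach event to its configured self-healing action, if any."""
    event = evaluate(samples_ms)
    return RULES.get(event) if event else None


healthy = closed_loop([4.0, 6.0, 8.0])     # mean 6.0 ms, below the threshold
breached = closed_loop([9.0, 12.0, 15.0])  # mean 12.0 ms, above the threshold
```

Keeping the rule table separate from the KPI calculation reflects the split between the two components: thresholding raises an event, and the rules decide what to do about it.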

The Central Autonomic OSS System also provides the following
functions:

• Rules based self-healing, optimization and configuration
• Reinforced continuous learnings based on corrective &
optimization actions and their outcomes
• Data lake storing the massive amount of data for network traffic
patterns, error scenarios and fault resolutions
• Deep learning models based on neural networks & cognitive
algorithms for data analytics, self-healing, forecasting, network
optimization and fault prediction and preemption

Network Slice Manager – Local Autonomic OSS

The functional architecture of the Local Autonomic OSS is depicted
below:

Figure 9: Functional Architecture – Local Autonomic OSS

The local autonomic OSS acts as the network slice manager and
provides fulfillment and assurance functions for the network slices
and resources. The different components are described below:

• Network Slice Management: This component includes a
catalog for the network slice templates and provides the order
orchestration and policy & workflow management functions for
the instantiation of network slices

• Dynamic Inventory: This component stores the inventory
of the network slice instances and their mappings with the
network resources. It also includes a function for the real-time
synchronization of inventory data with the network

• Fault Management: This component provides the fault
management functions such as event collection, filtering,
processing and correlation

• Performance Management: This component provides
performance management functions such as metrics collection,
aggregation, KPI calculation, thresholding and reporting

• Closed Loop Automation: This component provides capabilities
for defining the rules and policies that specify the actions to be
taken against different fault conditions for self-healing

In addition, it provides local autonomic control functions for rules
based self-healing, optimization and configuration.
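
As an illustration of the event filtering and correlation mentioned under Fault Management, the sketch below drops low-severity alarms and then groups the remaining alarms on the same resource into incidents when they arrive close together in time. The Alarm fields, the severity scale and the time window are assumptions; production correlation engines typically also use topology and inventory data.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Alarm:
    resource: str
    severity: int      # hypothetical scale: higher means more severe
    timestamp: float   # seconds


def filter_alarms(alarms: List[Alarm], min_severity: int) -> List[Alarm]:
    # Filtering: discard alarms below the severity of interest.
    return [a for a in alarms if a.severity >= min_severity]


def correlate(alarms: List[Alarm], window_s: float) -> List[List[Alarm]]:
    # Correlation: an alarm joins the open incident of its resource when it
    # arrives within window_s of that incident's latest alarm.
    incidents: List[List[Alarm]] = []
    open_incident: Dict[str, List[Alarm]] = {}
    for alarm in sorted(alarms, key=lambda a: a.timestamp):
        group = open_incident.get(alarm.resource)
        if group and alarm.timestamp - group[-1].timestamp <= window_s:
            group.append(alarm)
        else:
            group = [alarm]
            open_incident[alarm.resource] = group
            incidents.append(group)
    return incidents


raw = [
    Alarm("gnb-7", 3, 0.0),
    Alarm("gnb-7", 1, 1.0),    # filtered out (low severity)
    Alarm("gnb-7", 4, 2.0),
    Alarm("upf-1", 5, 2.5),
    Alarm("gnb-7", 3, 60.0),   # outside the window, so a new incident
]
incidents = correlate(filter_alarms(raw, min_severity=2), window_s=5.0)
```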

CONCLUSION

5G network operations pose many different challenges such as
real-time monitoring and management of recursively overlaid network
slices, zero touch provisioning of network resources, adherence to
highly demanding SLAs and real-time processing of massive amounts
of data generated from billions of devices.

A highly efficient, scalable and AI/ML enabled OSS system is required
for operating the 5G networks.

Altran’s proposed OSS solution uses the recursive model for
reinforcement learning to achieve the required scalability and
efficiency. The local OSS systems manage the individual network
slices while the central OSS system performs the data analytics and
machine learning work. The proposed OSS solution also includes a
data lake to store the massive amount of network data that can be used
for reinforcement learning and to train the deep machine learning
models for network optimization and fault predictions.

References
1. 5G-PPP-5G-Architecture-White-Paper_v3.0_PublicConsultation.pdf
2. https://www.viavisolutions.com/en-us/5g-architecture
3. Heavy Reading Whitepaper – Overcoming the 5G Challenges of Monitoring, Assurance and Automation
4. https://5g-ppp.eu/wp-content/uploads/2015/02/5G-Vision-Exec-Summary-v1.pdf

Contact us
[email protected]

About the Author


Ashish Srivastava
Director – Technology & Senior OSS Architect
Product Support & Services

About Altran

Altran is the world leader in engineering and R&D services. Altran offers its clients a
unique value proposition to meet their transformation and innovation challenges. Altran
supports its clients, from concept through industrialization, to develop the products and
services of tomorrow and has been working for more than 35 years with major players
in many sectors: Automotive, Aeronautics, Space, Defense & Naval, Rail, Infrastructure
& Transport, Energy, Industrial & Consumer, Life Sciences, Communications,
Semiconductor & Electronics, Software & Internet, Finance & Public Sector. Altran has
more than 50,000 employees operating in over 30 countries.

Altran is an integral part of Capgemini, a global leader in consulting, digital
transformation, technology and engineering services. The Group is at the forefront of
innovation to address the entire breadth of clients’ opportunities in the evolving world of
cloud, digital and platforms. Building on its strong 50-year + heritage and deep industry-
specific expertise, Capgemini enables organizations to realize their business ambitions
through an array of services from strategy to operations. Capgemini is driven by the
conviction that the business value of technology comes from and through people. Today,
it is a multicultural company of 270,000 team members in almost 50 countries. With
Altran, the Group reported 2019 combined revenues of €17 billion.

www.altran.com
Altran © 2020 Altran. All rights reserved.
