2504.11233v2
Copyright may be transferred without notice, after which this version may no longer be accessible.
arXiv:2504.11233v2 [cs.NI] 25 Apr 2025

Abstract—Modern cellular networks adopt a software-based and disaggregated approach to support diverse requirements and mission-critical reliability needs. While softwarization introduces flexibility, it also increases the complexity of the network architectures, which calls for robust automation frameworks that can deliver efficient and fully-autonomous configuration, scalability, and multi-vendor integration. This paper presents AutoRAN, an automated, intent-driven framework for zero-touch provisioning of open, programmable cellular networks. Leveraging cloud-native principles, AutoRAN employs virtualization, declarative infrastructure-as-code templates, and disaggregated micro-services to abstract physical resources and protocol stacks. Its orchestration engine integrates Large Language Models (LLMs) to translate high-level intents into machine-readable configurations, enabling closed-loop control via telemetry-driven observability. Implemented on a multi-architecture OpenShift cluster with heterogeneous compute (x86/ARM CPUs, NVIDIA GPUs) and multi-vendor Radio Access Network (RAN) hardware (Foxconn, NI), AutoRAN automates deployment of O-RAN-compliant stacks—including OpenAirInterface, NVIDIA ARC RAN, Open5GS core, and O-RAN Software Community (OSC) RIC components—using CI/CD pipelines. Experimental results demonstrate that AutoRAN is capable of deploying an end-to-end Private 5G network in less than 60 seconds with 1.6 Gbps throughput, validating its ability to streamline configuration, accelerate testing, and reduce manual intervention, with performance similar to that of non-cloud-based implementations. With its novel LLM-assisted intent translation mechanism and performance-optimized automation workflow for multi-vendor environments, AutoRAN has the potential to advance the robustness of next-generation cellular supply chains through reproducible, intent-based provisioning across public and private deployments.

Index Terms—O-RAN, Open RAN, Automation, Testing, Zero-touch, 5G, 6G

S. Maxenti, R. Shirkhani, M. Elkael, L. Bonati, T. Melodia, and M. Polese are with the Institute for the Wireless Internet of Things, Northeastern University, Boston, MA, U.S.A. E-mail: {maxenti.s, shirkhani.r, m.elkael, l.bonati, melodia, m.polese}@northeastern.edu. Salvatore D'Oro is with zTouch Networks, Inc., Boston, MA, U.S.A. Email: [email protected]. This work was partially supported by the National Telecommunications and Information Administration (NTIA)'s Public Wireless Supply Chain Innovation Fund (PWSCIF) under Award No. 25-60-IF054 and by the U.S. National Science Foundation under grant CNS-2117814.

I. INTRODUCTION

Today's cellular networks serve a variety of customers and use cases with heterogeneous deployments and technologies that operate under diverse user requirements and channel conditions [1]. This diversity of requirements and strategic support for society and the economy makes cellular networks extremely complex systems. An end-to-end deployment of a 5th generation (5G) cellular system counts tens of micro-services for the core network, a distributed Radio Access Network (RAN) with a disaggregated protocol stack capable of handling hundreds of users from a single base station, and additional services for management and optimization of the network [2].

Even carrier networks, where operators have the know-how to manage and control such systems, incur outages and anomalies due to the complex nature of cellular networks [3]. This major pain point is further exacerbated in private 5G networks, often operated by enterprise Information Technology (IT) departments, which face challenges in reliably planning, deploying, operating, and scaling private connectivity solutions. Compared to Wi-Fi, which is usually deployed as an integrated solution with simple and inexpensive access points, private cellular systems can provide the necessary performance guarantees to enable mission-critical use cases, and deliver the ultra-low latency and high throughput connectivity needed to enable Industry 4.0 and automation [4]. On the other hand, cellular systems require expert knowledge and a more involved management and monitoring effort. For this reason, enterprises often resort to system integrators, which streamline the deployment and management process but add an intermediate layer between the network and the enterprise. This also increases integration costs [5] and reduces the ability to customize the network.

Part of this complexity also stems from the recent transition of cellular systems to software-based and disaggregated architectures such as C-RAN, vRAN, and Open RAN. This transition brings flexibility and programmability, as well as support for multi-vendor deployments. However, the decoupling of RAN elements results in an increased number of network components (e.g., Radio Unit (RU), Distributed Unit (DU), and Central Unit (CU) in a disaggregated Next Generation Node Base (gNB)) that add complexity to the network and call for proper automation tools capable of taming such complexity. Indeed, while Continuous Integration (CI)/Continuous Deployment (CD) and automation are widely used in cloud systems, existing techniques cannot be directly applied to cellular systems due to the heterogeneous nature of the problem, which involves radio transmissions, spectrum allocation policies, distributed deployments, and core and RAN elements that need to be orchestrated to guarantee low latency and high throughput, and to support the real-time processing of wireless signals.

In this paper, we address the above challenges and propose,
design, and develop AutoRAN, an automated and intelligent framework for zero-touch configuration and provisioning of open and programmable cellular networks. AutoRAN builds on a set of cloud-native abstractions that we design to provide support for RAN-specific workloads, with automation and virtualization that extend from the core network to the cell site.

Fig. 1: Key abstractions for the AutoRAN system (virtualization; compute, network, and acceleration abstractions; orchestration; observability; intent via NLP/LLMs; micro-services; code, services, and APIs).

The design of AutoRAN, which is shown in Figure 1, is based on a set of key abstractions and capabilities: (i) the physical infrastructure is abstracted through virtualization of networking, hardware accelerators, and compute resources. This is programmatically configured and operationalized using a (ii) declarative approach, where code and templates are used to describe the characteristics and configuration of the system, enabling version control, automated configuration roll-outs, and end-to-end optimization of infrastructure parameters. The applications, which in this case are RAN-related workloads, are deployed as (iii) disaggregated micro-services, coexisting on the same infrastructure as traditional IT and cloud deployments, while exposing Application Programming Interfaces (APIs) for reconfiguration, interaction across services, and telemetry. For example, a base station is split into multiple micro-services (i.e., DU and CU) connected through open interfaces. The service life-cycle is managed through (iv) end-to-end orchestration to automatically converge to a valid set of services with the appropriate status and configuration across software applications and infrastructure. The orchestrator is guided by (v) high-level intents representing the requirements of the operator (e.g., coverage area, minimum Quality of Service (QoS) level) and translated into machine-readable configurations through Natural Language Processing (NLP) and Large Language Models (LLMs). Finally, (vi) observability makes it possible to track the end-to-end system status and coordinate with the orchestrator to implement closed-loop control.

We designed and implemented a system that abides by such principles on a fully-programmable OpenShift cluster with heterogeneous compute architectures, accelerators, and end-to-end components for the software stacks. Specifically, we design a RAN-Infrastructure-as-Code solution that, based on an intent expressed by the user through an LLM prompt, deploys and configures gNBs on a cluster with 13 nodes with x86 and ARM Central Processing Units (CPUs); NVIDIA L40, A100, and GH200 Graphics Processing Units (GPUs); radios from various radio manufacturers (Foxconn and NI); an optimized RAN based on OpenAirInterface (OAI) [6], [7], on NVIDIA ARC-OTA [8], [9], or on srsRAN [10]; a core network based on Open5GS; and RIC components from the O-RAN Software Community (OSC). Once the system is deployed, the automation framework orchestrates end-to-end performance and functional tests. We show how to transition from generic bare metal deployments to integrated and automated deployments on clusterization platforms.

Overall, the contributions of the paper are as follows:
• Conceptualize, implement and evaluate AutoRAN, an end-to-end automation solution for zero-touch deployment, configuration and testing of multi-vendor cellular networks;
• Automate multiple protocol stacks to achieve an end-to-end deployment in less than 60 seconds and peak throughput of 1.6 Gbps on the cloud-based AutoRAN infrastructure;
• Develop an LLM that converts high-level intents into verifiable and bespoke cluster configuration and RAN deployment policies;
• Extensively profile AutoRAN performance on different tasks, including deploying and testing cellular networks.

The remainder of the paper is organized as follows. In Section II, we review literature related to ours. In Section III, we discuss the foundational design principles behind AutoRAN, while in Section IV we discuss the implementation of such principles on the cluster infrastructure. In Section V, we focus on the automation workflows for deployment and testing and describe the design and training procedures of our LLM. Finally, in Section VI, we provide results and metrics, while in Section VII, we draw our conclusions and discuss future works.

II. RELATED WORK

AutoRAN proposes a solution to automatically configure, deploy, and operate a multi-vendor Open RAN system. It also provides a convenient way to perform repeatable tests with different specifications, and deployment through an LLM interface. The work encompasses various aspects of configuring infrastructure for automated deployment and testing of Open RAN while investigating Artificial Intelligence (AI) applications in telecommunications.

The authors of [11] introduce an enterprise-scale Open RAN testbed that enables realistic, high-fidelity research. By automating Open RAN deployment and real-time telemetry collection, they aim at enabling testing and optimization of network functions. NeutRAN [12] provides zero-touch multitenancy through RAN/spectrum sharing on OpenShift, dynamically allocating resources via optimization rApps for efficient utilization. Similarly, 5G-CT [13] automates end-to-end 5G/O-RAN networks using OpenShift and GitOps workflows, integrating capabilities for continuous integration, deployment, and testing with OAI, a commercial core, and Software-defined Radios (SDRs). 5GShell [14] aims at reducing human-based network configuration by providing a plug-and-play framework designed to automate the deployment of 5G cellular networks. It enables users to deploy different cores,
two different protocol stacks, and softwarized User Equipment (UE). A demo of cloud-native 5G network automation using Kubernetes and OpenShift operators is presented in [15]. This solution automates deployment, configuration, service upgrade, and switching between monolithic and disaggregated RAN architectures based on network traffic. Although these works focus on automating cloud computing and virtualization for Open RAN, they do not include hardware accelerators such as NVIDIA GPUs in the infrastructure. AutoRAN, instead, has a more diverse and heterogeneous infrastructure both for RUs and gNB stacks, and is built with cloud-computing automation and scalability in mind.

RAN accelerators are instead considered in [8], [16]. X5G [8] is an open, programmable, and multi-vendor private 5G testbed that integrates NVIDIA GPUs to accelerate the 5G physical layer, interfacing with OAI for the DU-high. CloudRIC [16] is a virtualized O-RAN solution that reduces cost and improves energy efficiency by pooling hardware accelerators such as NVIDIA GPUs and FPGAs across multiple DUs. Although these works showcase the advantages of using accelerators to improve the performance of RAN deployments, they do not focus on automating the configuration of the infrastructure, and are limited to simple and non-scalable deployments mostly based on bare-metal implementations.

From the orchestration perspective, SoftRAN [17] introduces a software-defined model that replaces the traditional distributed control plane with a logically centralized controller, which improves global network optimization, management, scalability, and coordination. OrchestRAN [18] proposes optimized allocation of resources to gNBs varying from edge to cloud. ATHENA [19], instead, is a cloud-native, multi-x network management and orchestration framework for the automation of the network workload lifecycle. It has a declarative, intent-based, and multi-vendor-compatible software-defined architecture that aims at improving network management flexibility. The authors of [20] propose a cloud-based federation framework that automates testbed integration, resource sharing, and remote experimentation on Amazon Web Services. It supports heterogeneous testbeds, allowing seamless access to distributed research infrastructure. Building on these efforts to automate and virtualize Open RAN infrastructure, recent research has begun to explore the role of LLMs in further simplifying and streamlining network configuration and management. Some works explore solutions to automatically produce network configurations using language models [21]. Others focus on mobile radio networks and 5G deployments, e.g., [22], where the authors analyze various applications for domain knowledge, code, and network configuration generation.

Compared to these works, which focus on features such as acceleration, orchestration, automation, or LLMs, AutoRAN provides a holistic approach that covers all aspects of deployment and testing of a cellular network. It is also the first solution to simplify cellular network operations and management by enabling the intent-based deployment of network workloads and tests on a heterogeneous infrastructure with accelerators, different CPU architectures, and multi-vendor 5G software and radio devices.

III. AUTORAN CLOUD-NATIVE DESIGN

In this section, we discuss the design of AutoRAN, focusing on the foundational principles that we leveraged to introduce zero-touch automation in end-to-end software-driven cellular networks (Figure 1). Specifically, we review how virtualization, RAN-Infrastructure-as-Code, disaggregation and micro-services, orchestration, intent-based configuration, and observability are leveraged in AutoRAN, whose high-level architecture is shown in Figure 2.

The systems and services that are required for 5G networks combine (i) elastic workloads, for core network, orchestration, and management-related services; and (ii) the RAN, which has stringent performance requirements to run complex Digital Signal Processing (DSP). Due to their nature, elastic workloads can be placed directly on the infrastructure (e.g., at the edge, or in the cloud) with the caveat that connectivity and latency requirements are satisfied (e.g., a core network may run in cloud data centers such as AWS or Azure as long as the latency to reach the cloud is not too high). As shown in Figure 2 (light blue), these include components for the core network, RAN Intelligent Controllers (RICs), and various non-latency-sensitive Machine Learning (ML) applications. In contrast, RAN workloads (dark blue in Figure 2) need to be distributed to edge or cell site locations, close to the end users, to satisfy the high data rates and predictable low latency required by RAN functions [23]. For instance, the deployment of a gNB—which comprises the disaggregated elements CU, DU, and RU—requires that DU and RU are in close proximity to minimize latency over the fronthaul interface carrying high-capacity traffic related to I/Q streams. Additionally, DUs at cell sites might also need to host low-latency control and sensing applications such as dApps [24], [25], [26]. These might perform spectrum sensing, beam management, anomaly detection, or channel estimation, which need direct access to real-time I/Q sample streams; this calls for edge deployments rather than cloud ones due to the high bandwidth required to transmit I/Q streams.

Figure 2 shows how AutoRAN leverages a fast and programmable network to implement the fronthaul interface between the DU and multiple options for radios, including commercial RUs, RU emulators, and software-defined radios. To holistically manage and optimize such a diverse architecture, AutoRAN implements automated workflows for infrastructure configuration, end-to-end deployment, and testing, based on the following design principles.

A. Abstracting Compute, Networking, and Acceleration

As we discuss in Section IV, AutoRAN is built on top of an infrastructure with a diverse set of compute, networking, and acceleration resources. To manage this complexity and diversity, we design AutoRAN to abstract each individual component of the infrastructure via virtualization, thus creating a homogeneous software layer that hides the details of the underlying infrastructure to simplify deployment and control. At the same time, we configure the infrastructure nodes and software (e.g., through proper kernel profiles) so that the virtualization overhead does not compromise the performance
[Fig. 2: High-level architecture of AutoRAN (labels recovered from the figure: Select RAN, RAN Templates, Helm Charts with services and resources, Edge CU, Edge DU, AI/ML, USRPs).]
compared to bare metal, non-virtualized setups, as shown in Section VI-D2. We leverage Podman as the base container technology for OpenShift, and rely on Single Root I/O Virtualization (SR-IOV) to virtualize the Network Interface Card (NIC) and enable multiplexing of different traffic flows on the same interface. GPUs are presented to the workloads through the NVIDIA Docker plugin for GPUs, which makes sure that containers can fully access their compute capabilities. Finally, to account for different CPU architectures and guarantee optimized performance of CPU operations, we avoid translating instructions and develop a native, multi-architecture build pipeline that targets both ARM and x86.

Further, the infrastructure is organized as a cluster, i.e., an abstraction where compute nodes work under a centralized control plane (e.g., an orchestrator as in Section III-D), rather than as isolated servers. Combined with virtualization, this approach simplifies deployment and management by replicating configurations across nodes, supports heterogeneous computing, and enables on-demand reconfiguration, allowing applications to migrate between nodes for load balancing and to satisfy latency-sensitive requirements. Moreover, it allows for a logical separation of data and application logic, e.g., to provide resilient and dynamic RAN services with storage for state and configurations that persists across the life cycle of individual services.

B. Declarative Approach for RAN-Infrastructure-as-Code

As discussed above, the AutoRAN cluster combines a heterogeneous set of compute, acceleration, and networking components, and additional abstractions on top of it. The system needs to go through multiple stages for its configuration: (i) at pre-deployment, for planning (i.e., day 0); (ii) at deployment, i.e., whenever new devices or services need to be added or updated (i.e., day 1); and (iii) post-deployment, to manage the complete life-cycle (day 2). Manually configuring the cluster infrastructure and software is a task that is time consuming, prone to errors, and lacking verifiability and accountability. Misconfigurations can cause outages and instability in the system, and a non-systematic approach to configuring AutoRAN may lead to delays in troubleshooting and detecting root causes of failures.

Therefore, we introduce a RAN-Infrastructure-as-Code declarative approach for AutoRAN, as shown through the template flows in Figure 2. We define the status and configuration of the system through templates and configuration files, with key/value pairs describing how each hardware and software component of AutoRAN needs to be managed (e.g., specifying how many cores are reserved or isolated, or the GPU configuration). Specific examples are provided in Section IV. We implement a GitOps workflow to version, track, and deploy such configurations. A central git server hosts the template files, which include machine configurations, Dockerfiles, Helm charts, and other text-based templates. CI/CD pipelines automatically synchronize the repository on the git server with the hardware and services on the cluster. This guarantees that updates are automatically synchronized and aligned across production and the versioning server, and enforces a single and specific workflow to propagate changes and updates.

The CI/CD for the declarative approach is implemented through two main components, ArgoCD and Tekton. ArgoCD is used to synchronize configurations between the git server (e.g., GitHub) and the AutoRAN cluster. This also guarantees a stateless cluster, which can be deployed (or re-deployed) in a few simple steps, compared to a long set of tedious manual configurations. Tekton is an open-source framework for designing and running CI/CD. Tekton pipelines consist of a declarative set of tasks executed one after the other, configured from sets of parameters passed as input. The parameters can
be provided through default values as part of the pipeline definition, customized to ingest the output of other processes (as we discuss in Section V-A), or manually overridden. Tekton is accessed through APIs, making it easier to deploy applications on demand from within and outside the cluster.

C. Disaggregation, APIs, and Micro-Services

The software that constitutes AutoRAN components is deployed as a set of micro-services, which are atomic—yet connected—units deployed as pods on the cluster. A pod is a set of one or more Docker containers that provide the functionalities to fully express a micro-service. AutoRAN micro-services include the actual applications for the cluster, i.e., RAN, core network, RICs, and edge services, as shown at the center of Figure 2, and the software that supports the cluster and automation itself. In this sense, there exists a unified workflow with common management procedures for both classes of services, simplifying the overall design of the system.

As we discuss in Section IV, a specific effort has been put into the design of the micro-services to (i) identify the degree of disaggregation that enables flexibility, automation, and scaling without compromising performance (e.g., whether different micro-services can be used for L1 and L2/L3 in a DU, or whether to split DU and CU into different pods, among others); and to (ii) separate state and logic as much as possible. The latter allows the deployment of lightweight micro-services that embed the application logic but not complex data structures, which are instead kept on a permanent storage layer. This layer provides redundancy through disk replication via Redundant Array of Independent Disks (RAID) and exposes its resources to the micro-services, or pods, so that they do not have to replicate state whenever they are deployed, or perform complex operations on tear-down to store and manage the state.

The disaggregation is based on a functional split. For the 5G end-to-end application, it is aligned to 3rd Generation Partnership Project (3GPP) and O-RAN specifications, which include a Service Based Architecture (SBA) for the core network and a gNB split into CU and DU (software-based) and RU. Furthermore, the RICs are also deployed as a set of micro-services, which can be extended by onboarding custom logic (xApps, rApps). Interfaces for cellular network components are typically defined by standards or technical specifications from 3GPP, facilitating functionality over IP networks. In contrast, services such as automation and cluster functionalities are functionally split without strict adherence to specific standards. Their interfaces are implemented via application-level endpoints or APIs, often based on frameworks like Flask. Detailed API designs for various micro-services (e.g., automated testing or intent-based deployment) are discussed in Section V.

Similarly to the infrastructure, micro-services are also defined through a declarative approach, with Helm charts defining a set of unit elements to deploy (e.g., containers in a pod, the associated storage, and networking capabilities, among others) and Dockerfiles specifying the features within a specific container (e.g., what software is used to execute DU functionalities). As discussed in Section IV, we have designed an automated container build process targeting multiple architectures, as well as continuous deployment solutions, all based on Tekton pipelines.

D. Orchestration

The micro-services lifecycle is managed through an orchestrator, which handles the complexity associated with matching resources, micro-services, configurations, and cluster capabilities. The orchestrator takes care of micro-service deployment, scaling, networking, and lifecycle management, based on declarative input provided through the CI/CD approach discussed in Section III-B. Once the desired state of the system is defined (e.g., the number of replicas or specific configurations), the system continuously self-heals to maintain that state. This is done automatically, solving problems related to system complexity, ensuring high availability through automatic failover and scaling, and optimizing resource usage. The orchestrator controls operations over the infrastructure cluster, as defined in Section III-A, matching micro-services to available resources. For example, it can ensure that a DU requiring GPU acceleration is instantiated on a compute node with an available GPU and NIC.

In our setup, and as described in Section IV, we use Red Hat OpenShift, a commercial version of Kubernetes, as the orchestration engine for our micro-services, which we tune, instrument, and configure to support the variety of 5G workloads provided by AutoRAN.

The orchestrator also provides additional tools that can be used to manage the system. AutoRAN leverages (i) namespaces, i.e., logical partitions to organize micro-services across, for example, application domains (e.g., pods for a core network, a RIC, the RAN) or tenants (e.g., different operators sharing the same infrastructure); and (ii) advanced networking capabilities, which enable automated service discovery and the establishment of complex network overlays across micro-services (e.g., dynamically establishing routes between core network micro-services and new CUs instantiated through automation).

E. High-Level Intents to Represent Network Status

While automation, CI/CD, orchestration, and declarative approaches simplify the management of a 5G network from a system perspective, they still represent complex tools to use for the variety of end users that could be interested in managing and operating such networks. Consider, for example, private 5G deployments, where the enterprise IT team comes with limited knowledge of radio systems: expressing a rich configuration for the system components may be challenging, even if it is then automatically deployed and applied to the network. Similarly, teams with expertise in radio systems may have more limited knowledge of configuring orchestrators and micro-services.

Therefore, we design AutoRAN with a mechanism to provide high-level input for the system configuration, which is then translated into actionable templates, pipeline triggers, and inputs that are automatically applied to the system. As shown in the top left part of Figure 2, users can express requests to
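To make the declarative workflow of Sections III-C and III-D concrete, a RAN micro-service can be declared as a pod whose resource requests let the orchestrator place it on a suitable node. The sketch below is illustrative only: the pod name, image, NFD label, network attachment, and extended resource identifiers are assumptions for exposition, not AutoRAN's actual manifests.

```yaml
# Illustrative declaration of a GPU-accelerated DU micro-service.
# The orchestrator schedules it only on a node that advertises the
# requested GPU and SR-IOV virtual function (Section III-D).
apiVersion: v1
kind: Pod
metadata:
  name: oai-du
  namespace: ran                      # logical partition per application domain
  annotations:
    k8s.v1.cni.cncf.io/networks: fronthaul-net   # SR-IOV attachment (assumed name)
spec:
  nodeSelector:
    # NFD-exported label for an NVIDIA PCI device (class 0300, vendor 10de)
    feature.node.kubernetes.io/pci-0300_10de.present: "true"
  containers:
  - name: du
    image: registry.example.com/ran/oai-du:latest  # hypothetical image
    resources:
      limits:
        nvidia.com/gpu: 1             # exposed by the NVIDIA GPU Operator
        openshift.io/fronthaul_vf: 1  # VF resource from the SR-IOV operator (assumed)
        hugepages-1Gi: 4Gi            # HugePages for fast packet processing
```

Applying such a manifest (or the Helm template that renders it) lets the control plane match the micro-service to available compute, acceleration, and networking resources without manual node selection.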
The most relevant operators used in AutoRAN are summarized in Table II and detailed in the remainder of this section.

TABLE II: List of OpenShift operators.
Node Feature Discovery: exports hardware and software features of nodes and labels nodes.
NVIDIA GPU Operator: installs NVIDIA GPU drivers on all nodes and exposes various capabilities.
NVIDIA Network Operator: upgrades and configures NVIDIA NIC firmware.
SR-IOV Operator: splits the NIC into partitions and instantiates networks.
PTP Operator: provides ns-level clock accuracy to the pods on the node.

Node Feature Discovery. Since cloud-based deployments may run on very different hardware, the Node Feature Discovery (NFD) operator automatically exposes the features of the underlying physical infrastructure (e.g., CPU model and architecture, number of cores, amount of memory, GPU availability and type, among others) to applications and workloads using labels. These labels allow the cluster to perform targeted deployments, either on a particular set of nodes or on a specific node. This ensures that components, operators, and deployments are instantiated on the nodes with the required resources.

PTP Operator. RAN systems require precise timing synchronization across their components. In general, clock synchronization is achieved through the Network Time Protocol (NTP), which is able to provide ms accuracy over Local Area Network (LAN) setups. However, the level of accuracy delivered by NTP is not sufficient to support RAN applications, where the Open Fronthaul interface requires nanosecond accuracy. We therefore use PTP, a protocol that shares a synchronization signal over Ethernet, to sync all nodes and radios to a common clock source—a Qulsar clock with a GPS input in our case. The Linux implementation of the protocol uses two pieces of software, namely ptp4l and phc2sys, which are installed via the OpenShift PTP operator. The operator also gives the possibility of increasing the priority of these processes, allowing one to use them on shared CPUs rather than on a dedicated CPU per process, saving one extra core for other workloads. Listing 1 shows the PTP configuration adapted for the Grace Hopper server.

apiVersion: ptp.openshift.io/v1
kind: PtpConfig
metadata:
  name: ptp-gh
  namespace: openshift-ptp
spec:
  profile:
  - interface: enp1s0f0np0
    name: ptp-gh
    phc2sysOpts: '-a -r -r -n 24'
    ptp4lConf: |
      [global]
      dataset_comparison G.8275.x
      G.8275.defaultDS.localPriority 128
      maxStepsRemoved 255
      logAnnounceInterval -3
      logSyncInterval -4
      logMinDelayReqInterval -4
      G.8275.portDS.localPriority 128
      network_transport L2
      domainNumber 24
      tx_timestamp_timeout 30
      slaveOnly 1

      clock_servo pi
      step_threshold 1.0
      egressLatency 28
      pi_proportional_const 4.65
      pi_integral_const 0.1

      [enp1s0f0np0]
      announceReceiptTimeout 3
      delay_mechanism E2E
      network_transport L2
    ptpClockThreshold:
      holdOverTimeout: 5
      maxOffsetThreshold: 50
      minOffsetThreshold: -50
    ptpSchedulingPolicy: SCHED_FIFO
    ptpSchedulingPriority: 65
  recommend:
  - match:
    - nodeLabel: node-role.kubernetes.io/worker-gh
    priority: 4
    profile: ptp-gh

Listing 1: Example of a PTP profile on a Grace Hopper node.

NVIDIA Operators. These include the NVIDIA GPU Operator and the NVIDIA Network Operator. The NVIDIA GPU Operator is in charge of installing and maintaining NVIDIA GPU drivers. GPUs are used in AutoRAN both for AI workloads (e.g., LLMs, training and testing of models) and for the NVIDIA ARC deployment [8], which uses GPUs to accelerate the DU-low operations of 5G gNBs. Specifically, NVIDIA ARC requires the Open Kernel Modules instead of the proprietary drivers. We also provision GDRCopy [29] in the NVIDIA operator so that it is automatically deployed on all nodes and can directly access the GPU memory. This is a low-latency GPU memory copy library based on NVIDIA GPUDirect RDMA technology that creates a CPU mapping of GPU memory.

The NVIDIA Network Operator is used to enable intercommunication between GPU and NIC. It is used to access the firmware of the NIC, for example, to enable higher timing accuracy and specific QoS settings.

SR-IOV Operator. This operator is in charge of creating virtual slices—or virtual functions—of the physical NIC interfaces and of exposing them as objects inside OpenShift. In OpenShift and Kubernetes deployments, these are created using configuration files called SR-IOV Network Node Attachments. After splitting the interfaces, OpenShift implements Network attachments to specify the IP addressing of the network and the VLAN tag. As will be discussed in Section IV-D, NVIDIA ARC deployments require interfaces to be passed as untagged, whereas OAI requires the tag to be set on the fronthaul port. This becomes key as different deployments need different settings on the physical port, and it enables parallelization of multiple workloads, and thus concurrent accelerated workloads on the same compute nodes. Common NVIDIA ARC deployments are designed to

HugePages to directly allocate blocks of memory (1 GB for x86 nodes, 512 MB for ARM deployments); and (iii) fine-tune kernel parameters related to interrupts and energy, e.g., disabling sleep states and offsetting the periodic clock ticks to mitigate jitter. An example of a PerformanceProfile for a Grace Hopper node is shown in Listing 3.

apiVersion: performance.openshift.io/v2
2 kind: PerformanceProfile
directly use physical network interfaces, whereas our SR-IOV 3 metadata:
approach, which enables running the L1 over a virtualized 4 name: gh-performanceprofile
5 annotations:
interface, makes it possible to parallelize multiple throughput- 6 performance.openshift.io/ignore-cgroups-version:
demanding applications. To the best of our knowledge, we are "true"
7 kubeletconfig.experimental: |
the first in following this approach with NVIDIA ARC. 8 ## various kubelet parameters for fine-tuned
An example of definition of networks over SR-IOV devices optimizations
9 spec:
is shown in Listing 2. A resourceName exposes the un- 10 additionalKernelArgs:
derlying sets of virtual functions (8 in this case) created from 11 - "tsc=reliable"
12 - "nohz_full=4-64"
the selected NIC. A network is then created on the virtualized 13 - "preempt=none"
NIC with parameters including VLAN tags, and IPs. 14 - "..."
15 cpu:
16 isolated: 4-64
1 apiVersion: sriovnetwork.openshift.io/v1 17 reserved: 0-3,65-71
2 kind: SriovNetworkNodePolicy 18 hugePages:
3 metadata: 19 defaultHugePagesSize: "512Mi"
4 name: gh-vf-sriov 20 pages:
5 namespace: openshift-sriov-network-operator 21 - size: "512Mi"
6 spec: 22 count: 48
7 resourceName: vfgh 23 nodeSelector:
8 nodeSelector: 24 node-role.kubernetes.io/worker-gh: ""
9 node-role.kubernetes.io/worker-gh: "" 25 machineConfigPoolSelector:
10 numVfs: 8 26 machineconfiguration.openshift.io/role: worker-gh
11 mtu: 9216 27 numa:
12 priority: 1 28 topologyPolicy: "none"
13 nicSelector: 29 workloadHints:
14 vendor: "15b3" 30 realTime: true
15 deviceID: "a2dc" 31 highPowerConsumption: true
16 rootDevices:
17 - "0000:01:00.0" Listing 3: Example of performance profile for a Grace Hopper machine.
18 isRdma: true
19 needVhostNet: true In addition, different PTP profiles are also applied to each
20 deviceType: netdevice
21 --- server, taking into account their hardware specifications and
22 apiVersion: sriovnetwork.openshift.io/v1 configuration parameters (for example, the name of the specific
23 kind: SriovNetwork
24 metadata: network interface that should receive the synchronization
25 name: gh-vnf-net signal).
26 namespace: openshift-sriov-network-operator
27 spec: Profiles linked to a label are immediately applied to each
28 ipam: | new labeled node. Nodes update in a rolling strategy and,
29 ## any IP address configurations, if needed
30 resourceName: vfgh at the end of the update, their configurations are aligned,
31 networkNamespace: aerial which eliminates the risk of stale drivers. This flexibility
32 # vlan: 2 # if a VLAN tag is needed
in node labeling and zero-touch configuration is even more
Listing 2: Example of a SR-IOV network policy and its network attachment. relevant for extendability and scalability of the framework.
Finally, it is worth mentioning that differently from bare-metal
or Kubernetes-based deployments, OpenShift only supports
C. Configuration Profiles
generic or real-time kernels (from no other sources than Red
Due to the stringent latency and processing requirements Hat itself), and does not support the low-latency ones generally
of RAN workloads, it is important to make sure that host recommended for gNB protocol stacks. Since at this time
machines use an appropriate kernel version and are configured the NVIDIA L1 deployment requires drivers that are not
to meet such requirements. This includes setting isolated available for Red Hat’s real-time kernel, we adopt the generic
CPUs, HugePages and disabling energy saving functionalities Linux kernel on all the cluster nodes. Experimental evidence
that might put the processor in an idle state and decrease in [13] also shows the instabilities of real-time kernels for
performance. As mentioned in Section IV-A, we use labels RAN deployments, which results in poorer performance when
to automatically apply configurations to nodes. Nodes with compared to the generic one.
the same physical hardware and configurations are grouped
together in an Machine Configuration Pool (MCP). Then, for
each MCP, we apply the same set of configurations. D. Integrating RAN Deployments
We use PerfomanceProfile objects to: (i) config- We validated a set of deployments that can be instantiated on
ure the number of reserved and isolated cores; (ii) enable the AutoRAN infrastructure at this time. These deployments,
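The constraints encoded in a PerformanceProfile are easy to get wrong by hand: the isolated and reserved CPU sets must be disjoint and together cover all cores, and the pinned HugePages memory is the page size times the page count (48 × 512 MiB = 24 GiB in Listing 3). A minimal sketch of these consistency checks, using hypothetical helper names that are not part of AutoRAN:

```python
def parse_cpuset(spec: str) -> set:
    """Expand a kernel-style cpuset string such as '0-3,65-71' into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = map(int, part.split("-"))
            cpus.update(range(lo, hi + 1))
        else:
            cpus.add(int(part))
    return cpus

def check_profile(isolated: str, reserved: str, page_size_mib: int, page_count: int):
    """Verify the CPU split is disjoint and compute the pinned HugePages memory."""
    iso, res = parse_cpuset(isolated), parse_cpuset(reserved)
    assert not (iso & res), "isolated and reserved CPU sets must be disjoint"
    hugepages_gib = page_size_mib * page_count / 1024
    return sorted(iso | res), hugepages_gib

# Values from Listing 3: CPUs 4-64 are isolated for RAN workloads, 0-3 and
# 65-71 are reserved for housekeeping, and 48 pages of 512 MiB pin 24 GiB.
all_cpus, hugepages_gib = check_profile("4-64", "0-3,65-71", 512, 48)
```

On the 72-core Grace Hopper node, the two sets in Listing 3 cover CPUs 0–71 with no overlap.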
D. Integrating RAN Deployments

We validated a set of deployments that can be instantiated on the AutoRAN infrastructure at this time. These deployments, shown in Table III, include deployments with both commercial radios with support for the O-RAN 7.2 split, SDRs (e.g., Universal Software Radio Peripherals (USRPs)), as well as CPU- and GPU-accelerated DU solutions.

TABLE III: Protocol stack and OTA RU pairs validated on AutoRAN.

We extend the USRP-based deployment proposed in [13], [12]. Pods are assigned one SR-IOV device, whose physical NIC is connected through a switch to the USRPs deployed in our laboratory environment. This kind of deployment resembles the O-RAN 8.1 split, since the entire logic of the disaggregated gNB is focused on the CU/DU. We then proceed to include the O-RAN 7.2 split-based deployments, focusing in particular on OAI and NVIDIA ARC. This split divides the functionalities of a traditional Base Band Unit (BBU) into two parts: high-PHY processing remains in the DU, while low-PHY processing is moved to the RU. This enables effective centralization of compute resources in the DU, while the RU can be simplified and deployed closer to the antennas. It also supports robust signal processing capabilities and reduces latency by minimizing the fronthaul bandwidth requirements between DU and RU.

To support the 7.2 split, it is necessary to use high-performance and low-latency compute nodes that can sustain the high processing requirements of the Open Fronthaul and the low-DU. OAI integrates an OSC library that provides support for the 7.2 split, while ARC uses an NVIDIA library for x86 and ARM architectures.1 Moreover, both the OpenAirInterface and NVIDIA ARC platforms require the Data Plane Development Kit (DPDK), a set of libraries that accelerates the processing of packets to support real-time operations. For the 7.2 OAI deployment, we deploy a single pod containing the OAI gNB, which requires two tagged SR-IOV devices, one for the control plane and one for the user plane, each accelerated by DPDK.

Differently, the NVIDIA ARC setup requires a more complex deployment. Specifically, NVIDIA ARC usually runs as a combination of two containers, one for the L1 accelerator, called cuBB, and the other one for the L2, based on OAI. The communication between L1 and L2 happens via Inter-process Communication (IPC).

We choose the latter deployment option (shown in Figure 4, where we also add the E2 termination described in [8]) to support the instantiation of multiple NVIDIA ARC containers (e.g., to deploy multiple gNBs), because both L1 and L2 are tightly coupled and require shared memory and networking.

[Figure 4: RAN container with E2 termination and cuBB (DU low) communicating over IPC/nFAPI, with PTP (ns accuracy), an SR-IOV network (1 VF), and an NVIDIA A100 GPU.]
Fig. 4: ARC deployment in AutoRAN.

E. Container Image Building

Another important aspect for a virtualized deployment is the building of container images. Unlike cloud computing use cases, where images are generic, RAN deployments require instruction sets optimized for the specific CPU architecture and host family where the container will be instantiated. A possible approach is to build the image for a specific architecture by specifying flags in the Dockerfiles (e.g., a cascadelake architecture flag). This makes it possible to build images on any node in the cluster. However, it requires different Dockerfiles for different deployments. Another approach, which is the one we follow, consists in building the image on the target deployment node (e.g., if we want to run OAI on a Grace Hopper node, we build the image on a Grace Hopper directly). In this way, we build images for each node without the need for different Dockerfiles.

[Figure 5: chained image builds—cuBB for NVIDIA L1 accelerator (Mellanox+CUDA+7.2); x86 and ARM Docker images stored in a private registry.]
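The build-on-target policy above reduces to a simple scheduling rule: a build job carries a nodeSelector equal to the target node's role label, so the image is always compiled on the CPU family where it will run. The sketch below is illustrative only; the inventory and function names are hypothetical, not AutoRAN's actual code:

```python
# Hypothetical inventory of node role labels and their CPU architectures.
NODES = {
    "worker-gh": "arm64",                # Grace Hopper (ARM)
    "worker-x86-cascadelake": "amd64",   # x86 build/runtime nodes
}

def build_node_selector(target_role: str) -> dict:
    """Return the nodeSelector a build pod should use so that the image
    is built on the same node family where it will later be deployed."""
    if target_role not in NODES:
        raise ValueError(f"unknown node role: {target_role}")
    return {f"node-role.kubernetes.io/{target_role}": ""}

# Building an OAI image destined for a Grace Hopper node:
selector = build_node_selector("worker-gh")
```

The same label (node-role.kubernetes.io/worker-gh) already drives the PTP, SR-IOV, and performance profiles, so reusing it for builds keeps a single source of truth per node family.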
As we will show later in Section VI-E, thanks to the isolation between cores and processes, the build process does not affect the performance of the RAN.

To maximize image reuse and reduce build times, images are built in a chained manner, as shown in Figure 5. Specifically, different gNB images (e.g., OAI, srsRAN) and their subsequent versions (e.g., among different weekly tags in the case of OAI) might share common dependencies, e.g., in terms of NIC drivers, DPDK, etc. Instead of rebuilding all the dependencies for each new image, we maintain dedicated images with the required dependencies installed but no gNB software. We then use these images as a base to build the gNB images, thus speeding up the build process at the cost of additional images (i.e., the dependency images) stored on the registry.

F. Additional Software Components

In addition to the RAN workloads, a fully operational Open RAN requires several other elements. The core network is in charge of operations such as user authentication and the creation of Packet Data Unit (PDU) sessions. Among the various core networks available in the literature, we deployed Open5GS.

Another key element of an O-RAN deployment is the RIC, a component that enables observability and control of the network [30]. Specifically, near-real-time (RT) control is achieved via applications called xApps deployed on the Near-RT RIC, while non-RT control is achieved via rApps deployed on the Non-RT RIC. We deploy an OSC Near-RT RIC ("E" release) and connect it to OAI via the E2 agent provided in [8], [31] and publicly available as part of the OpenRAN Gym framework [32]. Finally, the cluster also hosts a Non-RT RIC to enable observability at larger time scales.

V. AUTORAN END-TO-END WORKFLOWS

In this section, we focus on providing details on how to leverage such blocks to automate the deployment, management, and testing of a private 5G network Over-The-Air (OTA). Specifically, we extend our previous work [13] to support a broader set of testing operations and technologies, deploy and test automatically different combinations of protocol stacks and RUs (see Table III) in a repeatable manner, and analyze their performance under different conditions (e.g., number of users, target data rate, and arbitrary protocol configuration files). As we describe later, this is done through a set of pipelines that process high-level, user-specified requirements regarding the specific configuration to test (e.g., OAI with a certain RU and MIMO configuration) and convert them into deployment and testing operations. These involve (i) instantiation of the 5G software containers as pods; (ii) initialization and attachment of UEs; and (iii) collection of relevant performance metrics to evaluate test success. Additionally, such deployment and testing configurations can be automatically generated through LLMs starting from a high-level intent, and then actuated by the above pipelines, as discussed in Section V-A. To streamline and automate testing procedures and support remote execution of tests, our UEs are Sierra Wireless EM9191 5G modems connected to mini-PCs (e.g., Intel NUC, or Raspberry Pi). The UEs' mini-PCs are used as host machines that pilot the UE to perform attachment operations via AT commands embedded in the Qualcomm chipset, generate traffic, and collect data.

AutoRAN Workflows. The workflows for the deployment and testing of Open RAN components are shown in Figure 6. The blue blocks in the figure concern the instantiation of workloads, while the pink ones carry out the functionalities related to testing. The deployment workflow involves steps from the creation of generic deployment specifications to their specialization to the hardware infrastructure and available resources. Similarly, the test workflow concerns the creation of generic test specifications, their specialization to the testing infrastructure, the actual test execution, and data collection and analysis. The steps for both deployment and testing workflows will be detailed in Sections V-B and V-C, respectively.

A. Intent-based Instantiation and Testing

In the following paragraphs, we describe how we leverage AutoRAN to simplify and streamline network instantiation and testing procedures via intent-based deployment and testing, leveraging natural language processing. The goal of this feature is to eliminate complexity related to network configuration and allow even non-expert users to deploy and test a full-fledged network using only Natural Language (NL). Examples of possible intents include queries such as "deploy a 5G gNB with OAI and NVIDIA Aerial Research Cloud (ARC)" or "perform a 15 Mbps iPerf test with 3 UEs".
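As a toy illustration of how such intents map to structured requests, the sketch below uses naive keyword rules; in AutoRAN this translation is performed by an LLM (Section V-A), so every rule and field name here is hypothetical:

```python
import re

# Naive keyword rules standing in for the LLM-based intent translation;
# the field names and vocabulary below are illustrative only.
STACKS = {"oai": "oai", "srsran": "srsran"}
L1S = {"arc": "arc", "aerial": "arc"}

def parse_intent(intent: str) -> dict:
    """Map a natural-language intent to a structured deployment/test request."""
    text = intent.lower()
    cfg = {"action": "test" if "test" in text else "deploy"}
    cfg.update({"stack": v for k, v in STACKS.items() if k in text})
    cfg.update({"l1": v for k, v in L1S.items() if k in text})
    if m := re.search(r"(\d+)\s*mbps", text):
        cfg["target_rate_mbps"] = int(m.group(1))
    if m := re.search(r"(\d+)\s*ues?\b", text):
        cfg["num_ues"] = int(m.group(1))
    return cfg

cfg = parse_intent("deploy a 5G gNB with OAI and NVIDIA Aerial Research Cloud (ARC)")
```

Keyword rules obviously do not generalize the way an LLM does; the point of the sketch is only the shape of the output, a flat dictionary of fields that a downstream validator can check.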
In order to process NL, we opt for using LLMs, as they provide capabilities to parse text, understand context and need, and convert them into structured and machine-understandable directives. Given the large availability of pre-trained models with excellent performance in accomplishing a variety of tasks under different benchmarks, we opt for using a pre-trained LLM (Qwen 2.5) in an agentic setup, i.e., where the LLM makes calls to a set of tools (defined thereafter) in iterative loops.

Although appealing for their simplicity, LLMs are also well known for sometimes drifting away from initial instructions and producing hallucinations. In our case, this is particularly relevant, as any hallucination could result in an incorrect network configuration that might produce wrong or misleading test results and even generate outages. To prevent this unwanted behavior, we design a set of procedures (depicted in Figure 7) that constrain the output of the LLM, forcing it to output the correct network/test configurations.

Achieving this in practice is not trivial, and presents several challenges:
• First, it should be noted that not all the RAN components available in AutoRAN are compatible with each other. For example, OAI supports the ARC L1, while srsRAN does not, meaning we need to prevent the LLM from trying to configure a srsRAN base station with ARC, which would result in a non-working deployment. It is therefore fundamental to ensure that the LLM only selects compatible components and configurations (e.g., OAI with ARC, plain OAI/srsRAN, as shown in Table III). For this reason, we store a compatibility graph indicating which network components (DU, CU, RU, Core Network) are compatible with one another. Similarly, for each test type, a graph of the list of compatible parameters is fed to the LLM (for example, tests using different traffic generators, such as Multi-Generator (MGEN), can include parameters such as the distribution of traffic, while iPerf tests do not). In the workflow of Figure 7, this graph is consumed when validating values.
• Second, the LLM might output some text that drifts slightly from the anticipated output, causing potential parsing issues and undefined behaviors, such as outputting incorrect attribute values. We solve this issue by using the tool-calling feature of recent LLMs. With this feature, we can feed the LLM with a list of tools which it can explicitly call by outputting special tokens. In our case, those tools consist of a set of pre-defined Python functions which (i) set the values of the network parameters in a dictionary, e.g., associate each possible parameter (such as CU/DU/RU) with a value (in Figure 7, this corresponds to the update of parameters); and (ii) verify that the set values are correct, i.e., that they correspond to valid parameters which are compatible with one another (in the value validation step).
• Another issue is that, while most LLMs can follow instructions, the complexity of the network configuration task calls for multiple long instructions (this includes describing the use case, the infrastructure, the compatibility graph, etc.). The result is that most models manage to follow some of the instructions, but rarely all of them at the same time. For this reason, instead of relying on a single prompting round, as shown in Figure 7, we resort to a looping mechanism in which the LLM is fed back with reports on misconfigurations and missing configuration fields. This enables the LLM to build the requested configuration iteratively, in a two-phase manner, where it first loops until all parameters have been populated with a correct value. Then, in the second phase, the validity is verified by ensuring that the configuration matches the compatibility graph.

Once we obtain the correct configuration (i.e., when there is no more error message in Figure 7), we convert it into JSON format and send it to the correct pipeline (either the testing pipeline or the deployment pipeline) for execution on the cluster. This process is essential as it ensures we map intents to actionable network deployments, making sure that the selected software and hardware elements can interact with each other and can effectively deliver network services to users.

B. Deployment Workflow

The deployment workflow is in charge of automatically instantiating the specified workloads (e.g., OAI gNB, core network, RIC) on the virtualized infrastructure. The main steps of this workflow, shown with blue boxes in Figure 6, involve: (1) creating the Open RAN deployment file; and (2) specializing this to the specific hardware infrastructure abstracted by OpenShift (e.g., selecting specific CUs/DUs and RUs). The deployment file is used to specify the workloads to instantiate on the physical infrastructure, e.g., an OAI CU/DU with Foxconn RU (see Table III for the list of possible combinations). An example of this file (and possible output of the LLM) is shown in Listing 4, where the core network (marked as core_network in the listing) is set to Open5GS, the protocol stack to OAI for CU/DU-high (cu and
stack software affect the network performance, enabling users to quickly spot degradations with respect to previous tests.

VI. EXPERIMENTAL EVALUATION

This section discusses several results that profile the performance and effectiveness of the proposed AutoRAN approach. Specifically, in Section VI-A we review the time needed to onboard a new node in the cluster, comparing automated and manual solutions. In Section VI-B we describe the LLM deployment and report related metrics. In Section VI-C, we analyze the time needed to execute deployment and testing workflows using AutoRAN, showing how a network based on microservices can be instantiated in a matter of seconds. In Section VI-D, we provide performance metrics obtained from the deployed gNBs. Finally, in Section VI-E we analyze the coexistence between RAN and other generic workloads.

A. Onboarding Nodes on the AutoRAN Cluster

AutoRAN is designed to ensure scalability. Even the extension of the cluster follows the same principle. To showcase the benefits of automated node creation and configuration, in this section we provide a comparison between adding a new RAN worker node manually and using AutoRAN. Manually deploying an NVIDIA ARC node requires performing several error-prone steps that need to be repeated on each compute node of the cluster just to set up the basic infrastructure. This usually requires additional effort to maintain the node with up-to-date drivers and software. Using the profiles and configurations offered by AutoRAN, extending the cluster with an already configured profile requires minimal human intervention, as shown in Figure 8 (i.e., it is only required to add the node to the cluster—using coreos-installer—and assign a label to it—using oc label), and guarantees a completely configured node ready to accept any targeted workload in approximately 40 minutes—without the same level of effort of manually configuring a node. After their initial provisioning, nodes can be repurposed for different roles (e.g., edge vs. RAN node) by modifying their labels. After this operation, nodes will reboot and auto-configure themselves as required.

[Figure: coreos-installer install … → joining the cluster → auto installation (~20 minutes) → oc label $node $label → adding label → auto-configurations (~20 minutes) → node provisioned]
Fig. 8: Steps required to provision a new node.

B. LLM-Based Deployment

In this section, we evaluate our LLM-based deployment pipeline. To do so, we use Claude 3.7 Sonnet to generate a series of 33 different deployment prompts. Each prompt provides some specific requirements in NL and is associated with the required element(s) in the deployed network. For example, if the query is "generate a 5G network based on GPU acceleration", the associated requirement is that the DU-low uses ARC. We evaluate the deployment pipeline by running each prompt 10 times, for different sizes of LLMs. The LLM is deployed on one of the control-plane nodes, which are equipped with an NVIDIA L40S GPU with 40 GB of VRAM. Figure 9a shows that the larger model is significantly better at providing satisfactory configurations, with Qwen2.5:32B reaching up to 80% correct responses, and with correctness consistently decreasing as we decrease the size of the model. Furthermore, Figure 9b shows that this extra performance does not necessarily come at the cost of a larger runtime, with Qwen2.5:1.5B having the largest runtime despite its low success rate. We observe (see Figure 9c) that this concurs with the high number of iterations in the initial phase (where we verify if values are missing, see Figure 7): the model struggles to find any valid configuration until it times out. On the other hand, larger models have similar iteration counts due to their ability to output a correct configuration (in the sense that we almost always obtain a deployable network, but that network is not necessarily the one the user asked for, as per Figure 9c).

C. End-to-End Deployment and Testing Workflows

In this section, we show experimental results to validate the automated end-to-end test workflows described in Section V, where we use Sierra Wireless 5G modems as UEs.

Figure 10 shows the time required to execute the different tasks of the automated testing pipeline discussed in Section V.
Fig. 9: Performance metrics for different sizes of Qwen2.5 LLM: (a) success rate, (b) runtime for successful deployments, and (c) number of iterations required.
about 8 s. As the RU is always in an idle state (i.e., in the on state but without actively transmitting or receiving) and is initialized outside of OpenShift via M-plane-like functionalities before the gNB pod deployment (see Section IV), we do not include initialization (approximately 60 s for a Foxconn RU) and reconfiguration times in the instantiation time of the CU/DU. This is different from the USRP deployments, where initialization times are negligible. Indeed, the time taken to start the gNB and connect it to the USRP for both srsRAN and OAI is around 10 s. Finally, the last two bars show the time taken for the UE (a Sierra Wireless 5G modem in our case) to connect to the gNB, and the time taken to collect the results and send them to the data collector pod, respectively. It is worth mentioning that we do not report the duration of the data transmissions between gNB and UE, as it depends on the value specified in the test specifications (e.g., run an iPerf test for 60 s, see Listing 5).

Fig. 12: Bar plot of achieved TCP downlink datarate of Sierra 5G modem with five different configurations.

We notice that in the downlink direction, the gNB deployed with ARC on the GH200 server achieves an average throughput of 275 Mbps, which surpasses the throughput of both the ARC gNB deployed on the Gigabyte server and that of the deployments without ARC. These experiments also show that OAI achieves slightly better performance than srsRAN when
Fig. 13: Bar plot of achieved TCP uplink datarate of Sierra 5G modem with five different configurations.

OTIC [33]. RuSIM is an RU emulator for the 7.2 split, mirroring all parameters that a real RU should expose. In addition to this, it allows the simulation of users and of the wireless channel. It is usually used jointly with other Keysight tools, including CoreSIM (to emulate the core network and traffic generation) and Air Mosaic (for managing and orchestrating tests). AutoRAN provides a flexible testing platform, and the integration of RuSIM only requires the addition of a second SR-IOV interface to communicate with CoreSIM, since it is external to the cluster. This shows the flexibility of the AutoRAN virtualization capability, since the same physical interface is used twice, for fronthaul and backhaul, using two different virtual functions. We test the performance of 15 simulated users at 1600 Mbps, using all slots (1600) available every second over 100 MHz with a DDDDDDDSUU TDD pattern.

[Figure: CDF of downlink throughput for Docker bare-metal and AutoRAN deployments, over-the-air and with RuSIM; the inset zooms on the 1,562–1,572 Mbps region for shared cores.]
Fig. 15: Performance comparison when applying load on either N isolated and shared cores using RuSIM at full capacity.

This shows how AutoRAN does not introduce relevant delays or performance degradation into the gNB stack, while using an additional layer of virtualization and exposing a new set of features that streamlines and simplifies 5G lifecycle management.
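The 1600-slot figure can be sanity-checked from the numerology: at 30 kHz subcarrier spacing (the usual choice for a 100 MHz carrier, assumed here), 5G NR produces 2000 slots per second, and a DDDDDDDSUU pattern leaves 8 out of every 10 slots for the downlink, counting the special slot as downlink-capable:

```python
def dl_slots_per_second(pattern: str, scs_khz: int = 30) -> int:
    """Downlink-capable slots per second for a TDD pattern, counting
    D and S slots; the slots-per-ms count scales with the SCS."""
    slots_per_ms = scs_khz // 15        # 2 slots per ms at 30 kHz SCS
    total_slots = 1000 * slots_per_ms   # 2000 slots per second
    dl_slots = sum(1 for s in pattern if s in "DS")
    return total_slots * dl_slots // len(pattern)

n = dl_slots_per_second("DDDDDDDSUU")   # 1600 downlink-capable slots per second
```

This matches the "all slots (1600) every second" budget used in the RuSIM test above.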
VII. CONCLUSIONS AND FUTURE WORK

Starting from the Open RAN principles of disaggregation and standardization, we proposed AutoRAN, an open automation framework which leverages cloud computing and virtualization techniques to seamlessly deploy, reconfigure, and test heterogeneous private 5G RANs and related components. We showcased the building blocks of AutoRAN, highlighting its advantages with respect to traditional deployment techniques, which are often monolithic and designed for highly experienced users. We experimentally evaluated the capabilities of AutoRAN, showing how the virtualization and automation functionalities that it offers introduce enhanced flexibility and seamless deployment and testing capabilities that take as input simple high-level intents expressed in natural language, instead of complex and detailed configurations. In future work, we will extend the same approach to manage additional NICs and hardware combinations (including RU configuration through M-PLANE), as well as the automatic deployment of auxiliary components to the RAN.

ACKNOWLEDGMENTS

The authors would like to express their gratitude to Jing Xu, Anupa Kelkar, and Chris Dick from NVIDIA Corporation for their feedback on NVIDIA ARC.

REFERENCES

[1] A. Narayanan, M. I. Rochman, A. Hassan, B. S. Firmansyah, V. Sathya, M. Ghosh, F. Qian, and Z.-L. Zhang, "A Comparative Measurement Study of Commercial 5G mmWave Deployments," in IEEE Conference on Computer Communications, 2022, pp. 800–809.
[2] E. Dahlman, S. Parkvall, and J. Skold, 5G NR: The Next Generation Wireless Access Technology. Academic Press, 2020.
[3] Uptime Intelligence, "Annual outage analysis 2024," Uptime Intelligence, Tech. Rep., 2024, https://datacenter.uptimeinstitute.com/rs/711-RIA-145/images/2024.Resiliency.Survey.ExecSum.pdf.
[4] 3GPP, "5G for Industry 4.0," https://www.3gpp.org/technologies/tsn-v-lan, 2024, online; accessed 14 April 2025.
[5] NEC Corporation, "Moving to Open RAN," NEC Corporation, Tech. Rep., 2021, https://www.nec.com/en/global/solutions/5g/download/pdf/Moving to Open RAN.pdf.
[6] F. Kaltenberger, T. Melodia, I. Ghauri, M. Polese, R. Knopp, T. T. Nguyen, S. Velumani, D. Villa, L. Bonati, R. Schmidt et al., "Driving Innovation in 6G Wireless Technologies: The OpenAirInterface Approach," arXiv preprint arXiv:2412.13295, 2024.
[7] N. Nikaein, M. K. Marina, S. Manickam, A. Dawson, R. Knopp, and
[12] L. Bonati, M. Polese, S. D'Oro, S. Basagni, and T. Melodia, "NeutRAN: An Open RAN Neutral Host Architecture for Zero-Touch RAN and Spectrum Sharing," IEEE Transactions on Mobile Computing, pp. 1–13, August 2023.
[13] L. Bonati, M. Polese, S. D'Oro, P. Brach del Prever, and T. Melodia, "5G-CT: Automated Deployment and Over-the-Air Testing of End-to-End Open Radio Access Networks," IEEE Communications Magazine, pp. 1–7, April 2024.
[14] F. Mancini, L. Tamiano, and G. Bianchi, "5GShell: a plug-and-play framework for automating the deployment of 5G cellular networks," in 2023 26th Conference on Innovation in Clouds, Internet and Networks and Workshops (ICIN), 2023, pp. 39–41.
[15] O. Arouk and N. Nikaein, "5G Cloud-Native: Network Management & Automation," in NOMS 2020 - 2020 IEEE/IFIP Network Operations and Management Symposium, 2020, pp. 1–2.
[16] L. L. Schiavo, G. Garcia-Aviles, A. Garcia-Saavedra, M. Gramaglia, M. Fiore, A. Banchs, and X. Costa-Perez, "CloudRIC: Open Radio Access Network (O-RAN) Virtualization with Shared Heterogeneous Computing," in Proceedings of the 30th Annual International Conference on Mobile Computing and Networking, ser. ACM MobiCom '24. Washington D.C., DC, USA: Association for Computing Machinery, 2024, pp. 558–572.
[17] A. Gudipati, D. Perry, L. E. Li, and S. Katti, "SoftRAN: software defined radio access network," in Proceedings of the Second ACM SIGCOMM Workshop on Hot Topics in Software Defined Networking, ser. HotSDN '13. Hong Kong, China: Association for Computing Machinery, 2013, pp. 25–30.
[18] S. D'Oro, L. Bonati, M. Polese, and T. Melodia, "OrchestRAN: Network Automation through Orchestrated Intelligence in the Open RAN," in IEEE Conference on Computer Communications (INFOCOM). IEEE Press, 2022, pp. 270–279.
[19] A. Mohammadi and N. Nikaein, "Athena: An Intelligent Multi-x Cloud Native Network Operator," IEEE Journal on Selected Areas in Communications, vol. 42, no. 2, pp. 460–472, 2024.
[20] M. McManus, T. Rinchen, A. Dey, S. Thota, Z. Zhang, J. Hu, X. Wang, M. Ji, N. Mastronarde, E. S. Bentley, M. Medley, and Z. Guan, "Cloud-Based Federation Framework and Prototype for Open, Scalable, and Shared Access to NextG and IoT Testbeds," 2024. [Online]. Available: https://arxiv.org/abs/2408.14460
[21] Y. Wei, X. Xie, Y. Zuo, T. Hu, X. Chen, K. Chi, and Y. Cui, "Leveraging LLM Agents for Translating Network Configurations," 2025. [Online]. Available: https://arxiv.org/abs/2501.08760
[22] H. Zhou, C. Hu, Y. Yuan, Y. Cui, Y. Jin, C. Chen, H. Wu, D. Yuan, L. Jiang, D. Wu, X. Liu, C. Zhang, X. Wang, and J. Liu, "Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities," IEEE Communications Surveys & Tutorials, pp. 1–1, 2024.
[23] O-RAN Alliance, "O-RAN Control, User and Synchronization Plane Specification," O-RAN Alliance, Tech. Rep. O-RAN.WG4.CUS.0-R004-v16.01, March 2024, https://www.o-ran.org/specifications.
[24] S. D'Oro, M. Polese, L. Bonati, H. Cheng, and T. Melodia, "dApps: Distributed Applications for Real-Time Inference and Control in O-RAN," IEEE Communications Magazine, vol. 60, no. 11, pp. 52–58, 2022.
[25] O-RAN next Generation Research Group (nGRG), "dApps for Real-Time RAN Control: Use Cases and Requirements (Report ID: RR-2024-
C. Bonnet, “OpenAirInterface: A flexible platform for 5G research,” 10),” https://tinyurl.com/5n82pwpx, October 2024.
ACM SIGCOMM Computer Communication Review, vol. 44, no. 5, pp. [26] A. Lacava, L. Bonati, N. Mohamadi, R. Gangula, F. Kaltenberger,
33–38, 2014. P. Johari, S. D’Oro, F. Cuomo, M. Polese, and T. Melodia, “dApps:
[8] D. Villa, I. Khan, F. Kaltenberger, N. Hedberg, R. S. da Silva, S. Max- Enabling Real-Time AI-Based Open RAN Control,” Computer Networks
enti, L. Bonati, A. Kelkar, C. Dick, E. Baena, J. M. Jornet, T. Melodia, (to appear), 2025. [Online]. Available: arxiv.org/abs/2501.16502
M. Polese, and D. Koutsonikolas, “X5G: An Open, Programmable, [27] “Observium.” [Online]. Available: https://observium.org
Multi-vendor, End-to-end, Private 5G O-RAN Testbed with NVIDIA [28] “TrueNAS,” https://www.truenas.com/, 2024, online; accessed 12
ARC and OpenAirInterface,” arXiv:2406.15935 [cs.NI], pp. 1–15, June September 2024.
2024. [29] NVIDIA, “GDRCopy,” https://github.com/NVIDIA/gdrcopy, 2024, on-
[9] A. Kelkar and C. Dick, “NVIDIA Aerial GPU Hosted AI-on-5G,” in line; accessed 12 September 2024.
2021 IEEE 4th 5G World Forum (5GWF), 2021, pp. 64–69. [30] M. Polese, L. Bonati, S. D’Oro, S. Basagni, and T. Melodia, “Under-
[10] SRS, “srsRAN,” https://www.srsran.com/5g, 2024, online; accessed 11 standing O-RAN: Architecture, Interfaces, Algorithms, Security, and
April 2025. Research Challenges,” IEEE Communications Surveys & Tutorials,
[11] P. Bahl, M. Balkwill, X. Foukas, A. Kalia, D. Kim, M. Kotaru, Z. Lai, vol. 25, no. 2, pp. 1376–1411, 2023.
S. Mehrotra, B. Radunovic, S. Saroiu, C. Settle, A. Verma, A. Wolman, [31] E. Moro, M. Polese, A. Capone, and T. Melodia, “An Open RAN
F. Y. Yan, and Y. Zhang, “Accelerating Open RAN Research Through Framework for the Dynamic Control of 5G Service Level Agreements,”
an Enterprise-scale 5G Testbed,” in Proceedings of the 29th Annual in IEEE Conference on Network Function Virtualization and Software
International Conference on Mobile Computing and Networking, ser. Defined Networks (NFV-SDN), Dresden, Germany, November 2023.
ACM MobiCom ’23. Madrid, Spain: Association for Computing [32] L. Bonati, M. Polese, S. D’Oro, S. Basagni, and T. Melodia, “OpenRAN
Machinery, 2023. Gym: AI/ML Development, Data Collection, and Testing for O-RAN
17
on PAWR Platforms,” Computer Networks, vol. 220, pp. 1–11, January Salvatore D’Oro is the CTO and co-founder of
2023. zTouch Networks, a company focused on the de-
[33] G. Gemmi, M. Polese, P. Johari, S. Maxenti, M. Seltser, and T. Melodia, velopment of zero-touch automation solutions for
“Open6G OTIC: A Blueprint for Programmable O-RAN and 3GPP O-RAN systems. He is also a Research Associate
Testing Infrastructure,” in IEEE 100th Vehicular Technology Conference, Professor at Northeastern University. He received his
2024, pp. 1–5. Ph.D. degree from the University of Catania and is
an area editor of Elsevier Computer Communica-
tions journal. He serves on the TPC of IEEE INFO-
COM, IEEE CCNC & ICC and IFIP Networking.
He is one of the contributors to OpenRAN Gym,
the first open-source research platform for AI/ML
applications in the Open RAN. His research interests include optimization,
Stefano Maxenti is a Ph.D. Candidate in Computer AI & network slicing for NextG Open RANs.
Engineering at the Institute for the Wireless Internet
of Things (WIoT) at Northeastern University, under
Prof. Tommaso Melodia. He received a B.Sc. in
Engineering of Computing Systems in 2020 and a
M.Sc. in Telecommunication Engineering in 2023
from Politecnico di Milano, Italy. His research is
linked with AI applications for wireless communica- Tommaso Melodia is the William Lincoln Smith
tions and orchestration, integration, and automation Chair Professor with the Department of Electrical
of O-RAN networks. and Computer Engineering at Northeastern Univer-
sity in Boston. He is also the Founding Director
of the Institute for the Wireless Internet of Things
and the Director of Research for the PAWR Project
Office. He received his Ph.D. in Electrical and
Computer Engineering from the Georgia Institute of
Technology in 2007. He is a recipient of the National
Ravis Shirkhani is a Ph.D. Candidate in Computer Science Foundation CAREER award. Prof. Melodia
Engineering at the Institute for the Wireless Internet has served as Associate Editor of IEEE Transactions
of Things (WIoT) at Northeastern University, under on Wireless Communications, IEEE Transactions on Mobile Computing,
Prof. Tommaso Melodia. She received a B.Sc. in Elsevier Computer Networks, among others. He has served as Technical
Electrical Engineering (Communication Systems and Program Committee Chair for IEEE INFOCOM 2018, General Chair for
Networks) in 2023 from Sharif University of Tech- IEEE SECON 2019, ACM Nanocom 2019, and ACM WUWnet 2014. Prof.
nology, Iran. Her research focuses on automation Melodia is the Director of Research for the Platforms for Advanced Wireless
of O-RAN networks, power consumption across O- Research (PAWR) Project Office, a $100M public-private partnership to
RAN components, and exploring optimization ap- establish four city-scale platforms for wireless research to advance the US
proaches for network energy efficiency. wireless ecosystem in years to come. Prof. Melodia’s research on modeling,
optimization, and experimental evaluation of Internet-of-Things and wireless
networked systems has been funded by the National Science Foundation, the
Air Force Research Laboratory the Office of Naval Research, DARPA, and
the Army Research Laboratory. Prof. Melodia is a Fellow of the IEEE and a
Distinguished Member of the ACM.