Software-enabler framework for AI-based RAN automation
Our research and supporting proof-of-concept systems have shown that a set of software enablers is required to best facilitate new AI approaches to sophisticated closed-loop automated network design. These enablers must be tightly integrated into a software framework that allows flexible and open information exchange. As shown on the left side of Figure 1, our software-enabler framework for AI-based RAN automation consists of nine key enablers:
- A data layer (E1)
- A message bus (E2)
- Training and inference (E3)
- Model life-cycle management (E4)
- Open-source AI software (E5)
- Integrated simulation (E6)
- A decision-making agent layer (E7)
- Rules and policies for intent-based management (E8)
- A digital sibling for knowledge harvesting (E9)
Perhaps the most essential enabler of all is the production of industry-standard, machine-readable data (E1) at a high temporal rate by RAN network nodes (eNodeB or gNodeB). This data can be aggregated, stored, filtered and distributed by advanced real-time, hierarchical message-bus (E2) solutions. A distributed training and inference (E3) architecture can efficiently consume this data and use it to facilitate the training of ML models to create new AI algorithms and/or parameter selection for classic algorithm design [2].
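The data flow described above can be sketched in miniature. The snippet below is a deliberately simplified, in-memory stand-in for the message bus; the topic name and counter fields are illustrative, not a real RAN schema:

```python
from collections import defaultdict

# Minimal in-memory stand-in for a hierarchical message bus (E2).
# Topic names and record fields below are illustrative only.
class MessageBus:
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, record):
        for callback in self.subscribers[topic]:
            callback(record)

# A training consumer (E3) aggregating per-cell counters from the bus.
training_buffer = []
bus = MessageBus()
bus.subscribe("ran.cell.counters", training_buffer.append)

# RAN nodes (eNodeB/gNodeB) would publish machine-readable data (E1).
bus.publish("ran.cell.counters", {"cell_id": 1, "prb_util": 0.62, "drops": 3})
bus.publish("ran.cell.counters", {"cell_id": 2, "prb_util": 0.41, "drops": 0})
```

In a production system the bus would of course be distributed and persistent; the point here is only the decoupling between data producers (E1) and training consumers (E3).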
There are many alternative approaches [3] to creating and training models for ML algorithms, which fall into three main classifications:
- Supervised learning, in which algorithms are trained using labeled examples
- Unsupervised learning, in which algorithms find structure in unlabeled data using clustering or grouping techniques
- Reinforcement learning (RL) [4], in which a target reward strategy guides the algorithm toward an optimal set of actions.
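The three classifications can be illustrated with deliberately tiny, self-contained examples; all data and reward values below are synthetic:

```python
import random

# 1. Supervised learning: fit y ~ w*x from labeled examples
#    (least squares through the origin).
xs, ys = [1, 2, 3, 4], [2.1, 3.9, 6.0, 8.1]   # labels roughly 2*x
w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# 2. Unsupervised learning: assign unlabeled points to the nearer of
#    two centers (one k-means-style assignment step).
points = [0.1, 0.2, 0.9, 1.1]
clusters = [0 if abs(p - 0.0) < abs(p - 1.0) else 1 for p in points]

# 3. Reinforcement learning: epsilon-greedy action selection driven by
#    a reward signal, drifting toward the higher-reward action.
random.seed(0)
rewards = {"a": 0.2, "b": 0.8}      # hidden reward per action
values = {"a": 0.0, "b": 0.0}       # the agent's running estimates
for _ in range(200):
    explore = random.random() < 0.2
    action = random.choice(list(values)) if explore \
        else max(values, key=values.get)
    values[action] += 0.1 * (rewards[action] - values[action])
best_action = max(values, key=values.get)
```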
Depending on the problem targeted, the best ML model and supporting life cycle (E4) can vary considerably in complexity. For simpler problems, linear regression, smaller decision trees and simple neural networks (NNs) with a few nodes and layers can be sufficient. For more complex problems, large decision trees or deep neural networks (DNNs) with many layers and nodes and several convolutional layers could be necessary to achieve the required accuracy.
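A small synthetic example shows why model complexity must match the problem: the closed-form linear fit that captures a linear relationship exactly leaves a large residual on curved data, where a more complex model would be needed:

```python
# Fit a least-squares line and compare its error on linear vs
# nonlinear synthetic data.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    w = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - w * mx
    return lambda x: w * x + b

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [0, 1, 2, 3, 4]
linear_ys = [2 * x for x in xs]        # linear relationship
quad_ys = [x * x for x in xs]          # nonlinear relationship

linear_err = mse(fit_line(xs, linear_ys), xs, linear_ys)
nonlinear_err = mse(fit_line(xs, quad_ys), xs, quad_ys)
# The simple model fits the linear data perfectly but underfits the
# curved data, leaving a much larger residual error.
```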
RL approaches and supporting agents (E7) for controlling optimization goals can be especially effective at learning novel RAN management strategies. Training RL models depends on active exploration through a software agent's trial-and-error experiments, which will not always be possible or appropriate in a live RAN system. To help solve this problem, and to generate the required quantities of data to train models, we have included simulation (E6) in our set of software enablers.
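As a sketch of this idea, the toy example below replaces a live RAN with a trivial simulator (E6) and lets a simple value-learning agent (E7) discover the best of three hypothetical antenna-tilt settings by trial and error; the reward numbers are purely illustrative, not measured RAN behavior:

```python
import random

# A toy simulator (E6) standing in for a live RAN: one of three
# hypothetical tilt settings yields the highest (noisy) reward.
def simulate_reward(tilt):
    true_reward = {0: 0.3, 1: 0.9, 2: 0.5}[tilt]
    return true_reward + random.uniform(-0.05, 0.05)  # measurement noise

# An epsilon-greedy agent exploring tilt settings against the simulator,
# which would be unsafe to do on a production network.
random.seed(42)
q = {tilt: 0.0 for tilt in (0, 1, 2)}   # running reward estimates
alpha, epsilon = 0.1, 0.2
for _ in range(500):
    if random.random() < epsilon:
        tilt = random.choice([0, 1, 2])  # explore
    else:
        tilt = max(q, key=q.get)         # exploit current best estimate
    q[tilt] += alpha * (simulate_reward(tilt) - q[tilt])

best_tilt = max(q, key=q.get)
```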
Once trained, an ML model can be used in the inference phase (part of E3), where a selection of data is used as input into the model that will then produce a set of predictions, actions or rules, with the exact details depending on the ML algorithm type. In RAN, the hardware and software requirements on the training and inference phases can be vastly different. Training typically requires powerful central processing unit (CPU) or specialized graphics processing unit (GPU) hardware with large memory and data storage.
AI software platforms, such as TensorFlow, Keras and PyTorch, together with other extensive, open-source (E5), often Python-based, ML software ecosystems need to be integrated into the software engineering flow. During the inference phase, a trained model (or models) is made available to the RAN application(s) through model life-cycle management (E4). For latency-critical RAN edge applications, the inference needs to be efficiently realized, with low latency, power consumption and memory footprint, taking the characteristics of the target hardware and software architecture into account.
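A minimal sketch of the life-cycle step (E4), with a hypothetical registry interface whose class and method names are ours rather than from any specific product: a versioned registry holds trained models, and RAN applications always fetch the version currently promoted for inference:

```python
# Versioned model registry sketch for life-cycle management (E4).
class ModelRegistry:
    def __init__(self):
        self._versions = {}  # (name, version) -> model object
        self._active = {}    # name -> version promoted for inference

    def register(self, name, version, model):
        self._versions[(name, version)] = model

    def promote(self, name, version):
        if (name, version) not in self._versions:
            raise KeyError(f"{name} v{version} not registered")
        self._active[name] = version

    def get_active(self, name):
        return self._versions[(name, self._active[name])]

# Usage: a trained "model" here is just a callable producing a prediction.
registry = ModelRegistry()
registry.register("load_predictor", 1, lambda prb: 0.8 * prb)
registry.register("load_predictor", 2, lambda prb: 0.75 * prb + 0.05)
registry.promote("load_predictor", 2)
prediction = registry.get_active("load_predictor")(0.6)
```

Promoting a new version is a single metadata update, so an application at the edge always resolves one indirection to reach the approved model, rather than embedding model files directly.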
Our software enablers are fully compatible with intent-based management solutions [5]. When moving toward AI-based automation, operators will no longer spend time tweaking RAN performance by adjusting individual parameters exposed by the RAN system. Instead, well-defined high-level rules and policies (E8) for expressing overall operational intent are used. The AI-based automation machinery is then responsible for realizing these goals.
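The shift from parameter tweaking to intent can be sketched as follows; the KPI name, threshold and action labels are purely illustrative:

```python
# Sketch of intent-based management (E8): the operator states a
# high-level goal; the automation loop decides whether action is needed.
intent = {"kpi": "drop_rate", "target": 0.01, "direction": "below"}

def evaluate_intent(intent, observed_kpis):
    # Check whether the observed KPI satisfies the stated goal.
    value = observed_kpis[intent["kpi"]]
    if intent["direction"] == "below":
        return value <= intent["target"]
    return value >= intent["target"]

def choose_action(intent, observed_kpis):
    # A trivial policy: if the intent is violated, request corrective
    # action; real machinery would select among learned strategies.
    if evaluate_intent(intent, observed_kpis):
        return "no_op"
    return "trigger_optimization"

action = choose_action(intent, {"drop_rate": 0.03})  # intent violated
```

The operator never touches the individual RAN parameters; only the intent declaration at the top changes.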
Long term, to allow the most sophisticated forms of automation, we see a need to build partial digital representations of deployed systems. Our unique digital sibling (E9) concept makes it possible to harvest ML models to build context-specific knowledge.