Artificial Intelligence Applications in Telecommunication Clouds
Artificial intelligence (AI) was established as a field in the 1950s and experienced two prosperous periods, in the 1960s and the 1980s. In the 21st century, the development of the Internet and cloud computing and the widespread adoption of information infrastructure increased the capabilities of computing hardware such as the CPU, Graphics Processing Unit (GPU), and Field Programmable Gate Array (FPGA). This, in turn, led to the development of Deep Learning (DL), which drives the current period of prosperity for AI. Real-world examples include the AlphaGo computer program defeating the human Go champion and the wide use of facial recognition in ID authentication.
Because the telecommunication cloud provides the infrastructure of information communication, there are many potential opportunities to adopt AI technology, for example, in networking automation and networking optimization. However, there are also challenges and problems in AI adoption for the telecommunication cloud, including:
- How to monitor network elements and analyze the data with AI to support judgment, prediction, and decision-making.
- How to design, construct, operate, run, maintain, and optimize telecommunication networks with AI.
- How to empower physical, virtualized, or containerized network functions with AI, and how to orchestrate and schedule resources onto the right clusters and nodes to offer better quality and performance for networking services.
This article describes challenges and opportunities in different aspects of the telecommunication transformation with AI, including leveraging AI in telecommunication cloud applications, operations, and business expansions. We also introduce a joint proof-of-concept developed by China Mobile*, Quanta*, and Intel to create a reference framework for network analytics end-to-end closed loop automation with AI, specifically in the network orchestration and operation layer. Together with low-level hardware technologies, the reference framework uses Distributed Analytics as a Service (DAaaS) to collect data, and AI to analyze that data and make determinations accordingly.
CHALLENGES AND PROBLEMS
As telecommunication develops, the telecommunication network constantly faces new challenges and problems. Addressing them requires the network to evolve technologically, from IP networks to cloudification and software defined infrastructure, so that it can offer services flexibly and efficiently and be transformed into the telecommunication cloud.
Technologies such as Internet of Things (IoT), virtualization, cloudification, software defined infrastructure, and 5G are important milestones in the development of current telecommunication clouds. The implementation of each technology brings about significant changes to the current network architecture, and it also brings huge challenges to the design, operation, and maintenance of the telecommunication cloud, including:
- Network traffic and the amount of network equipment grow dramatically.
- Complexity of operation and maintenance increases significantly when the networks are software-defined and cloudified.
- 5G technology is increasingly diverse and flexible as it matures.
- IoT requires devices to be connected any time and anywhere.
- Edge computing imposes strong requirements for high performance and low latency on the infrastructure architecture of telecommunication clouds.
AI OPPORTUNITIES IN THE TELECOMMUNICATION CLOUD
The telecommunication cloud is the most important infrastructure for the information industry. With its large scale, complex structure, and numerous network elements, it is among the most direct participants in and supporters of the information society.
While the telecommunication cloud faces the above challenges, the rapid development of AI technology has brought it new opportunities. The ability to collect data and to mine information is an important capability for telecom carriers. AI uses powerful data analysis and information extraction capabilities to help operators convert data into useful information and to assist them with further actions. The industry hopes that introducing AI technology will help solve the efficiency and capability problems encountered in current telecommunication clouds. The industry also plans to provide flexible digital and information services to people all over the planet, so that the communication network ultimately achieves network intelligence with a “smart brain”.
The following areas in the telecommunication cloud are most suited for AI adoption:
- Networking efficiency improvement and cost savings with network optimization and operation automation.
- Value mining and security protection in massive network data with powerful analytics based on big data.
- Networking resource management and layer decoupling based on open source infrastructure, and creation of unified open interfaces and standards for interoperability.
From the network level point of view, AI technology can be applied to the layer of network links and elements, the layer of network management and control, the layer of service orchestration and operation, and the upper-level business layer. At each layer, the AI technology can leverage its unique data regression, classification, inference, and optimization techniques to serve the entire stack.
From the process point of view, AI technology can be applied to different aspects of network planning, design, operation and maintenance, optimization and future service.
From the network scope point of view, AI technology can be applied to performance optimization of link communication, resource allocation and optimization for network elements, and management, control and coordination of the network and its subnets.
Figure 1. AI Applications and Architectural Layers
The above figure is similar to the NFV reference architecture: the layer of Network Links and Elements corresponds to the hardware layer, while Network Management and Control corresponds to the virtualized infrastructure layer. On top of that layer, telecom carriers are able to run Physical Network Functions (PNFs), Virtualized Network Functions (VNFs), or Containerized Network Functions (CNFs) by deploying, orchestrating and operating them in the Orchestration and Operation layer. Finally, different applications and upper-level services to users consume the services provided by those PNFs, VNFs and CNFs, in the Applications and Services layer.
In the higher layers of the figure, applications can leverage AI technology for user scenarios such as smart monitoring, smart transportation, intelligent deployment and orchestration, and business prediction. Higher layers, which are all about different user applications or use cases, are generally less sensitive to real-time requirements. They require more powerful computation, generate or collect more data, and need stronger analysis capability. For these reasons, applications on higher layers are suitable for centralized training and inference.
Lower layers, which are close to devices or edges, have stricter resource constraints and higher demands for real-time, low-latency operation. They are generally suited to inference only, or to inference with lightweight training workloads. Therefore, in lower layers, AI technology can be used for resource rescheduling, intelligent network slicing, fault positioning, and optimization of flow and congestion algorithms.
AI technology helps automate and optimize the telecommunication cloud, and, in turn, the application of AI technology also benefits from the telecommunication cloud in these areas:
- Big Data: Various network elements, terminals and service systems in the telecommunication cloud generate a large amount of data at all times, such as network element status, link traffic, alarm events, signal quality, service logs, etc. This data contains a wealth of valuable structures and information that can be analyzed and extracted using AI-related algorithms to help the network optimize its operations.
- Powerful Computation: AI algorithms, as represented by deep learning, need substantial computing power during training. Telecom carriers themselves have a large number of data center hardware facilities and cloud computing software infrastructures. Particularly in the current network evolution trend of “convergence of cloud and network”, both the centralized data centers and the edge data centers will have breadth and depth of computation. Further enhancements are conducive to the construction of large-scale AI facilities for computation and acceleration that support those AI algorithms.
- Abundant Scenarios: There are abundant AI scenarios in the telecommunication cloud, including internal applications and external applications. First, the telecommunication cloud itself is a complex information system with a large scale, a wide distribution, and a large amount of data, all of which is constantly changing and growing. From the research and development of communication and networking technology to the planning, construction, operation, and maintenance of the actual network, there are a large number of potential scenarios where AI technology can improve performance or efficiency. Second, the telecommunication cloud serves all aspects of society, supporting the digitalization and intelligence needs of a large number of vertical industries, including smart city, security, transportation, medical, education, finance, industry, and agriculture. AI technology can be developed and promoted as the telecommunication cloud provides more applications and services.
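As a small illustration of the Big Data item above, the sketch below flags unusual samples in a link traffic series. It is only a statistical baseline under stated assumptions, not a method taken from this article: the function name `flag_anomalies`, the window size, the threshold, and the sample values are all hypothetical.

```python
from statistics import mean, pstdev


def flag_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate strongly from recent history.

    Each sample is compared with the mean and population standard deviation
    of the `window` samples immediately before it.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # A perfectly flat history gives no scale for a z-score; skip it.
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical link traffic in Mbit/s, with one spike at index 6.
link_traffic = [100, 102, 98, 101, 99, 100, 250, 101]
spikes = flag_anomalies(link_traffic)
```

A production system would more likely learn seasonality and correlations across many network elements, but the same collect, score, and flag structure applies.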
AI APPLICATIONS IN NETWORK LINKS AND ELEMENTS
Traditional machine learning algorithms, such as linear models, decision trees, and k-means clustering, have matured and are in wide use. In recent years, deep learning methods such as Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), along with reinforcement learning, have also developed rapidly and have made major breakthroughs in areas such as cognitive technology. Typical application scenarios involving statistics, inference, fitting, optimization, and clustering of data are easy to find.
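To make one of the traditional algorithms named above concrete, here is a minimal k-means sketch that groups network elements by two normalized metrics. The metric values, the choice of two clusters, and the starting centroids are assumptions made for illustration only.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, max_iterations=20):
    """Plain k-means: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(old)  # keep an empty cluster where it was
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


# Hypothetical (CPU utilization, link utilization) pairs for four elements.
metrics = [(0.10, 0.20), (0.15, 0.25), (0.90, 0.80), (0.85, 0.95)]
centroids, labels = kmeans(metrics, [(0.0, 0.0), (1.0, 1.0)])
```

With these inputs the first two elements fall into one cluster and the last two into the other, which is the kind of grouping that could separate lightly loaded elements from heavily loaded ones.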
At present, AI algorithms can be applied to the underlying link communication technology, to the operation and control of network elements, and to the management and coordination of the network. In actual networks, however, real applications of those algorithms are rare. This is because systematic architectures and functions for AI applications are lacking in that domain, and because most of the algorithms have not been fully validated and remain theoretical results on paper.
However, network intelligence has gradually gained recognition from the industry, which is increasing its investment in research and development of standards and technologies. For instance, the 3rd Generation Partnership Project (3GPP) has started to define intelligence-related functions and interfaces in the 5G core network and access network. The Focus Group on Machine Learning for Future Networks including 5G (FG-ML5G) was established by ITU-T for research on intelligent network architecture and on basic telecommunication datasets and formats.
As another example in that layer, a typical application of AI is optimizing the underlying algorithms for network transmission, such as Carrier-Sense Multiple Access with Collision Detection (CSMA/CD) in Ethernet, or self-adaptation for Multiple-Input and Multiple-Output (MIMO) in wireless communication.
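For context on what such an optimization would start from, the sketch below shows the fixed rule that CSMA/CD uses in classic half-duplex Ethernet: truncated binary exponential backoff. It illustrates the baseline only. The function name `backoff_slots` is hypothetical, and how a learned policy would replace the fixed rule is not described in this article.

```python
import random

BACKOFF_LIMIT = 10   # the exponent stops growing after 10 collisions
ATTEMPT_LIMIT = 16   # the frame is abandoned after 16 attempts


def backoff_slots(collisions, rng=random):
    """Slot times to wait after `collisions` consecutive collisions.

    The wait is drawn uniformly from 0 .. 2**k - 1, where k is the
    collision count capped at BACKOFF_LIMIT.
    """
    if collisions < 1:
        raise ValueError("collisions must be at least 1")
    if collisions >= ATTEMPT_LIMIT:
        raise RuntimeError("excessive collisions: frame is abandoned")
    exponent = min(collisions, BACKOFF_LIMIT)
    return rng.randrange(2 ** exponent)


# Example: wait after the third consecutive collision (0 to 7 slot times).
wait = backoff_slots(3, random.Random(42))
```

An AI-based variant might, as one assumption, choose the contention window from observed load instead of doubling it blindly. The fixed rule above is what it would have to outperform.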
AI APPLICATIONS IN MANAGEMENT, CONTROL, ORCHESTRATION AND OPERATION
Among all the challenges in the telecommunication network transformation, operation challenges have become an important obstacle on the way to complete network cloudification, because both the size of the telecommunication network and its complexity are increasing. Those operation challenges include, but are not limited to, operation inefficiency, high network complexity, and the difficulty of maintaining Service Level Agreements (SLAs).
The traditional mode, in which problems are handled manually, cannot support the demands of next-generation network transformation, especially the demands for effective reduction of operation costs, more efficient network operation and maintenance, and more accurate resource scheduling.
On the other hand, the mobile communication network has evolved to 5G. Key performance indicators such as transmission rate, transmission latency, and connection scale are constantly improving together with the fixed-line communication network, and application scenarios are becoming more and more abundant on mobile devices and clients.
Significant and fundamental changes occur in performance and flexibility. At the same time, the complexity of the network is expanding significantly, and the flexibility requirements are also increasing, which brings unprecedented challenges to the operation and maintenance of 5G networks. These challenges are especially prominent under the traditional operation and maintenance mode.
Looking back to the evolution of the telecommunication cloud, as more and more network functions run on software-defined infrastructure and servers for telecommunication network transformation, public cloud services offered by cloud providers provide powerful and simple platforms for running those network functions. Under most circumstances, management and operation are handled by the cloud providers. However, telecom carriers are keen on running virtual network functions on top of cloud native environments they have built, in order to control and manage the entire underlying infrastructure themselves.
The figure below shows the procedures most telecom carriers follow. They might choose open source solutions for their Virtualized Infrastructure Manager (VIM), for instance, OpenStack* for virtual machines or bare metal to run VNFs or PNFs, or Kubernetes* for containers to run CNFs.
Figure 2. Choices for Telecom Carriers
On top of the VIM layer, the telecom carriers might choose other open source solutions for a higher layer called NFV management and orchestration (MANO) software, such as Open Network Automation Platform (ONAP*) and Open Source MANO (OSM). This layer primarily handles the problem of deployment and orchestration of PNFs/VNFs/CNFs, and their life-cycle management on VIM, corresponding to the layer of Orchestration and Operation in Figure 1. In the MANO layer, “closed loop” is assumed, that is, the MANO system regulates the process of PNF/VNF/CNF operations without human interaction, for example, by using automatic onboarding.
Furthermore, regarding the challenges of network operations, telecom carriers expect to leverage AI for customer management, network optimization, network security, and network automation. Additionally, all automation by AI is expected to be “closed loop”.
Based on ONAP DAaaS, Intel, Quanta Computer, and China Mobile have been jointly developing a proof-of-concept of the reference framework for end-to-end closed loop automation for PNFs/VNFs/CNFs by leveraging hardware technologies. Starting from monitoring the infrastructure and the PNFs/VNFs/CNFs, the prototype collects metric data and trains the AI model, runs inference on the metric data to predict the future workload, and takes appropriate actions based on that prediction, such as preventing SLA-impacting events and adapting power for performance or energy saving. The following figure is a snapshot of the prototype, showing power inference on vCMTS.
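The monitor, predict, and act cycle described above can be sketched in a few lines. This is not the prototype's implementation, and the article does not describe its model: a least-squares linear trend stands in for the trained AI model, and the thresholds, action names, and function names are hypothetical.

```python
from statistics import mean


def fit_linear_trend(samples):
    """Least-squares fit of y = a + b*t over t = 0..n-1; returns (a, b)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    t_mean = (n - 1) / 2
    y_mean = mean(samples)
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(samples))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return y_mean - slope * t_mean, slope


def predict_next(samples, steps_ahead=1):
    """Extrapolate the fitted trend `steps_ahead` intervals past the data."""
    intercept, slope = fit_linear_trend(samples)
    return intercept + slope * (len(samples) - 1 + steps_ahead)


def choose_action(predicted_load, high=0.8, low=0.3):
    """Map a predicted utilization (0..1) to a hypothetical action name."""
    if predicted_load >= high:
        return "scale_up"      # act early to avoid an SLA-impacting event
    if predicted_load <= low:
        return "save_power"    # lower power while demand is low
    return "hold"


def closed_loop_step(samples):
    """One pass of the loop: predict from collected metrics, then decide."""
    predicted = predict_next(samples)
    return predicted, choose_action(predicted)


# Hypothetical utilization samples collected by the monitoring stage.
predicted, action = closed_loop_step([0.5, 0.7, 0.9])
```

In a real deployment the decision would be handed to the orchestration layer rather than returned as a string, and the model would be retrained as new metric data arrives.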
Figure 3. Prototype for Power Inference on vCMTS
AI APPLICATIONS IN BUSINESS
On the top level, AI can be used to empower services from telecom carriers, expand business capabilities, and improve service quality. Because the telecommunication network is cloudified and software-defined, and as IoT, edge computing, and 5G technology evolve and mature, the business layer can adopt AI and use the infrastructure offered by the telecommunication cloud to create smart applications and services, such as smart city, smart home, smart community, smart hospital, intelligent transportation, and intelligent finance services. New market channels, new business, and new scenarios are opened up on top of telecommunication clouds to make our lives better.
At present, machine learning, speech recognition, and natural language processing based on deep learning have already had practical and successful commercial cases, which can be quickly combined with business operation of the telecommunication cloud to form large-scale AI capabilities to provide people with services and experiences.
In telecommunications, with the promotion and adoption of SDN/NFV architecture, the acceleration of network cloudification, and the evolution of new systems and technologies, current telecommunication systems are encountering more and more challenges, which puts huge pressure on telecom carriers. At the same time, as the telecommunication network is transformed into the telecommunication cloud, more and more opportunities become available.
AI can benefit from the resources brought by the telecommunication cloud and the big data generated by that cloud. Meanwhile, AI can assist telecom carriers in efficiently managing and optimizing network resources, automating orchestration and operation of network functions, and enriching different kinds of intelligent applications and services in the cloud. AI technology and the cloudified telecommunication infrastructure support each other and grow together. In open source projects for telecommunication, we have already seen some initiatives that leverage AI technology for VNF operation and edge computing, for instance, Distributed Monitoring and Analytics (DMA) and Distributed Analytics as an ONAP4K8S use case.
We’re looking forward to more AI initiatives and more innovations that are contributed by the community during telecommunication transformation. We invite people to stop by our booth at KubeCon & CloudNativeCon North America 2019 and watch our demonstration for network analytics and closed loop automation with AI.
- ONAP.org website
- Distributed Analytics as a Service (Dublin Summary) - Edge Automation
- Distributed Monitoring and Analytics (DMA)
- Use case: Deploying distributed external functions (applications/NFs)
Shane Wang, Individual Director of OpenStack Foundation Board and Engineering Manager at Intel System Software Products