From CDN Edge to Fornax: Toward a Next-Gen Edge Cloud Platform
You may like watching YouTube videos anywhere, or swiping up for the next TikTok video from your mobile phone anytime. Whether the content is a simple static web page or a live video, some content delivery network (CDN) providers are playing important roles behind the scenes.
You may also have heard about edge computing or edge cloud, which has been attracting attention not only from the cloud industry, but also from major CDN providers. Fornax, our open source project under the Linux Foundation, is one such edge cloud project.
While surveying the fields of edge computing, we find it interesting to understand CDNs’ existing development in this area as a reference, especially the overlapping features in their user scenarios. What are the differences between CDN edge and edge cloud? What can we learn from CDN edge to benefit the design of our Fornax edge computing platform? Let’s take a close look.
CDN Edge vs. Edge Cloud
CDN providers are among the pioneers who brought edge computing capability into production networks ahead of other industries, mainly to facilitate fast content delivery.
Edge cloud, on the other hand, aims to distribute cloud computing capabilities outside the cloud data centers, near end users or where the data is created. As with CDN providers, the goals include lower latency and less data transfer (the latter sometimes for security reasons), achieved by performing computation “on site.”
In addition, edge cloud shares some requirements with traditional cloud data centers, such as resource management and orchestration. The following chart illustrates the fields of commonality and difference between CDN edge and edge cloud.
The common area of limited computing and storage means only certain forms of computing and storage can be provided, such as serverless and key-value store (we’ll talk more about this later).
Beyond this limited computing, CDN edge does not offer more computing power or flexibility, since it largely serves content hosting and delivery. In contrast, the edge cloud has the typical components of a central cloud and can therefore serve a wider range of applications.
While edge cloud and CDN edge differ from each other, they share some common business interests and interact with each other. For example, China Telecom presented its experience in designing and deploying CDN edge nodes based on KubeEdge at the Cloud Native Edge Computing Forum in November 2021. Similarly, the Fornax project from Futurewei was established with a vision of an edge cloud platform that features highly flexible configurability, connectivity and fault tolerance.
Understanding existing CDN edge solutions is a necessary starting point for designing future edge cloud platforms.
Edge Is Not New to CDN
The concept of “edge” has existed for a long time in the CDN area. Typically, a major CDN provider manages and operates a large number of content servers geographically distributed across regions, countries or even continents. Besides the origin server, copies of the content are cached on different servers. Cached content offers clients a faster loading experience since their requests can be answered by geographically closer servers.
Such servers are called CDN edge servers relative to the origin servers. Not only is content retrieval latency reduced, but the network traffic is more distributed and balanced without generating traffic spikes and heavy workload on the origin server. Additionally, the origin server’s exposure to cyberattacks and other threats is alleviated.
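To make the caching idea concrete, here is a minimal Python sketch of an edge server that answers requests from a local cache and contacts the origin only on a miss. The class names `OriginServer` and `EdgeServer`, the path and the request count are all hypothetical and chosen for illustration; real CDN edge servers also handle expiry, invalidation and routing, which this sketch leaves out.

```python
from dataclasses import dataclass, field


@dataclass
class OriginServer:
    """Hypothetical origin that holds the authoritative copy of each object."""
    content: dict[str, bytes]
    hits: int = 0  # number of requests that reached the origin

    def fetch(self, path: str) -> bytes:
        self.hits += 1
        return self.content[path]


@dataclass
class EdgeServer:
    """Hypothetical edge node: serves cached copies, falls back to the origin."""
    origin: OriginServer
    cache: dict[str, bytes] = field(default_factory=dict)

    def get(self, path: str) -> bytes:
        if path not in self.cache:      # cache miss: one trip to the origin
            self.cache[path] = self.origin.fetch(path)
        return self.cache[path]         # cache hit: served locally


origin = OriginServer({"/index.html": b"<h1>hello</h1>"})
edge = EdgeServer(origin)

for _ in range(1000):                   # 1,000 client requests hit the edge
    body = edge.get("/index.html")

print(origin.hits)                      # prints 1
```

Only the first request travels to the origin; the remaining 999 are answered at the edge, which is where the reduced latency and lighter origin workload come from.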
Besides geographic content caching, the requirements for computing on the edge have extended further on the edge cloud platform side. The use cases include not only content retrieval, but also those that require computation and interactive communication. Let’s imagine you put on your virtual reality/augmented reality (VR/AR) headset and walk into a virtual office in the metaverse. You say “hello” to the quirky software developer, Mark (of course this is 3D virtual and live Mark), and he replies without even turning his head, busy typing something in the terminal. You sit down in front of your virtual minimalist desk and launch your email inbox, starting your day virtually.
This type of scenario may not happen to everyone today, except to some video game players, but its adoption by general users can be expected soon as the metaverse’s key ecosystem technologies ramp up quickly. Clearly, caching content geographically is no longer sufficient. More and more computing and networking capabilities have to be brought to edge platforms; otherwise, you may see a blurred, mosaic Mark typing with glitches.
CDN providers have started building edge computing environments, providing compute resources at the network edge. Edge compute servers are envisioned to provide data processing for massive raw data and latency-sensitive applications, including Internet of Things, AR/VR, autonomous driving, 5G, etc.
How CDN Providers Design an Edge Platform
Let’s take a look at how CDN providers design and offer their edge platforms today. We chose Akamai, CloudFlare, Fastly, Amazon CloudFront and Verizon Edgecast for comparison. Among these providers, Akamai is a traditional and major CDN provider in the market. CloudFlare and Fastly are two rapidly rising players bringing a lot of innovation and competition recently. Amazon CloudFront and Verizon Edgecast are two providers touching this market from the cloud and telecommunication areas, respectively.
The edge use cases these CDN providers target are fairly similar, focusing mainly on advanced content caching, service localization and customization, and lightweight web services, while Verizon Edgecast focuses on latency-sensitive applications in 5G networks. Such similarity is easy to understand, since it comes naturally from the low-latency advantage of edge computing, which current CDN services can leverage. So far, none of the providers has stepped out of its comfort zone to take on the futuristic applications mentioned earlier.
For compute solutions, serverless or Function-as-a-Service (FaaS) is the trend. Originating from mainstream cloud development, this computing model can provide more cost-effective compute platforms with a shortened development cycle and eliminate the need for infrastructure configuration and management. The advantage of this computing model will be seen clearly when a large number of edge nodes come into play.
Key-value (KV) stores are the common storage solution the CDN providers are adopting. A key-value store or database is a type of nonrelational database that uses a simple key-value method to store data. Let’s look more closely at the KV store solutions. We list only Akamai, CloudFlare and Fastly for comparison since Amazon CloudFront and Verizon Edgecast do not provide sufficient information on their storage solutions.
Clearly, KV stores support applications with fast, frequent reads and infrequent writes. This access pattern still fits the content caching and delivery domain.
The eventual consistency model is commonly used across providers. Edge computing applications usually have no strict requirement for fast propagation: for CloudFlare Workers KV, it may take up to 60 seconds for changes to propagate to all edge locations. However, this model may suffer data loss if nodes holding the latest changes fail before those changes propagate.
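The propagation delay and the data-loss risk can be illustrated with a toy Python model. This is only a sketch under simplifying assumptions, not how any provider implements its store: the `EdgeNode` and `EventuallyConsistentKV` classes, the node names and the explicit `propagate()` step are all hypothetical, and replication is modeled as a single batch rather than a timed background process.

```python
class EdgeNode:
    """Hypothetical edge location holding a local copy of the KV data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class EventuallyConsistentKV:
    """Toy model: a write lands on one node and is replicated later."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.pending = []  # (source_node, key, value) not yet propagated

    def write(self, node, key, value):
        node.data[key] = value
        self.pending.append((node, key, value))

    def read(self, node, key):
        return node.data.get(key)

    def propagate(self):
        """Replicate pending writes; a write is lost if its source node died."""
        for source, key, value in self.pending:
            if not source.alive:
                continue  # the only copy of this change is gone
            for node in self.nodes:
                if node.alive:
                    node.data[key] = value
        self.pending.clear()


tokyo, paris = EdgeNode("tokyo"), EdgeNode("paris")
kv = EventuallyConsistentKV([tokyo, paris])

kv.write(tokyo, "theme", "dark")
stale = kv.read(paris, "theme")   # None: the change has not reached Paris yet
kv.propagate()
fresh = kv.read(paris, "theme")   # "dark": the replicas have converged

kv.write(tokyo, "theme", "light")
tokyo.alive = False               # the node fails before propagation
kv.propagate()
lost = kv.read(paris, "theme")    # still "dark": the latest write is lost
```

The first read shows the stale window that eventual consistency allows, and the last read shows why a failure inside that window can silently drop the newest change.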
Key and value sizes determine how much information you can place in a KV store. We can see CloudFlare provides the largest value size among all the providers. From an application perspective, 256 KB supports few use cases beyond text-based content, while 25 MiB enables many more possibilities for other types of values, such as image or video clip content.
For read and write limits, Akamai, which measures them per second, allows relatively more than the other providers. The write limit is much lower than the read limit for Akamai and CloudFlare. Although Fastly does not specify numbers for read/write, it limits the API calling rate, and that limit is also fairly low.
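A quick check shows what the two value-size limits mean in practice. In the Python sketch below, the payload names and sizes are hypothetical examples rather than measurements, and "KB" is read as decimal kilobytes, which is an assumption about the provider's unit.

```python
KB = 1000            # assumption: "KB" read as decimal kilobytes
MiB = 1024 * 1024

VALUE_LIMITS = {
    "256 KB limit": 256 * KB,
    "25 MiB limit": 25 * MiB,
}

# Hypothetical payload sizes, chosen only for illustration.
PAYLOADS = {
    "JSON config": 4 * KB,
    "product photo": 2 * MiB,
    "short video clip": 20 * MiB,
}


def fits(payload_bytes: int, limit_bytes: int) -> bool:
    """Return True if a value of this size can be stored under the limit."""
    return payload_bytes <= limit_bytes


for limit_name, limit in VALUE_LIMITS.items():
    accepted = [name for name, size in PAYLOADS.items() if fits(size, limit)]
    print(f"{limit_name}: {accepted}")
```

With these example sizes, only the JSON config fits under the 256 KB limit, while all three payloads fit under 25 MiB.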
Compliance is an interesting aspect of all the CDN edge platforms. Both Akamai and Fastly warn users that they do not support storing private or sensitive information. CloudFlare’s CEO Matthew Prince ranks compliance as the top requirement and speed as the least important.
Challenges for Future Edge Cloud
While the CDN providers have already started deploying their edge solutions into production, they still face challenges in designing and architecting an edge cloud.
- It’s a “wild west” for resource management in an edge environment. Computing resources on the edge can vary greatly in computing power, storage, networking capability and reliability. Few of these are of concern for larger-scale cloud data centers, including CDNs’ regional data centers where engineers maintain dedicated resources with quick and easy access.
- Complex application hierarchy. To enjoy the benefits of edge computing, such as low latency and high scalability, edge applications usually adopt a certain level of hierarchical architecture. This requires fundamental design changes for both application and edge frameworks.
- Scale beyond clusters. Edge computing also has the potential to handle application service and coordination at greater scale. Networking and storage would span beyond the boundary of clusters or data centers, and smart scheduling mechanisms would help achieve the balance of high resource utilization and performance over large logical and geographical areas.
- Fast and frequent read/write are important for future use cases. Interactive communication in applications becomes increasingly important at the edge, even within the content domain. How to better serve both frequent reads and writes is a challenge for future edge designs.
- Where to store keys and values? In memory or on disks? We did not find an answer to this question in the providers’ documentation. Perhaps it is proprietary information and not disclosed. The answer is important since it directly impacts read/write performance and cost.
- The eventual consistency model cannot serve all use cases. Some future use cases, such as autonomous vehicles, may require a strong consistency model where all the information needs to be updated in a timely fashion.
- How to achieve global propagation? The first question is how to define “global.” The adopted mechanism needs to consider multicluster communication and synchronization, which is not a trivial problem.
- Regulatory and security compliance is a common concern. So far, none of the providers has the capability and confidence to handle private or sensitive data on the edge. This probably will become a big topic when private data on the move must be dealt with for some use cases, like the metaverse or autonomous vehicles.
Our goal in surveying the CDN edge computing offerings is to gather experience and build a vision for the design of next-generation edge computing platforms such as the Fornax project. Whether it’s conferencing in the metaverse, managing industrial automation, conducting a medical operation, drilling for oil far out in the ocean or monitoring a wind farm, edge computing poses the special challenges of a large-scale distributed system in volatile and sparse physical environments. CDN providers have introduced edge computing capabilities to production to handle content caching and relatively lightweight computation.
Looking ahead, next-generation, highly secure and flexible, quick-responding and scalable applications call for much more capable edge cloud platforms with features like flexible edge cluster hierarchy, efficient intercluster networking, storage and scheduling. Our Fornax project has these in the feature landscape while appreciating the trailblazing exploration by the CDN providers.