This article explores how Intel®'s Converged Edge Media Platform (Reference Architecture), combined with Varnish Enterprise and Kubernetes, leverages Intel® hardware optimizations to maximize efficiency and performance. Specifically, we demonstrate how using SR-IOV can achieve identical high-performance results with reduced CPU usage. Through performance testing, we compare baseline and optimized setups to highlight the tangible benefits of Intel® hardware and Kubernetes optimizations.
Intel®’s Converged Edge Media Platform Architecture is a media-focused implementation of Intel®’s Edge Platform. It provides container-based, cloud-native capabilities that enable Communication Service Providers (CoSPs) to quickly, efficiently, and cost-effectively deploy multiple media services. This architecture capitalizes on the fast-growing edge computing market by offering a platform tailored to media workloads. Built on Kubernetes (K8s) and leveraging Intel® hardware, it is optimized for the network edge and capable of hosting multiple services.
The Kubernetes cluster is enhanced with several key features, chief among them SR-IOV.
What is SR-IOV?
SR-IOV (Single Root I/O Virtualization) is a technology that enhances network performance in virtualized environments. It allows a single physical NIC to appear as multiple virtual NICs, giving virtual machines or containers direct access to NIC hardware. This reduces latency and increases throughput, making it ideal for high-performance workloads.
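In Kubernetes, SR-IOV virtual functions are typically exposed to pods through the SR-IOV network device plugin together with Multus. As a hedged sketch (the resource name, network name, and subnet below are illustrative placeholders, not values from this deployment), attaching a pod to an SR-IOV network looks roughly like:

```yaml
# NetworkAttachmentDefinition describing an SR-IOV network.
# Resource and network names here are illustrative only.
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: sriov-net
  annotations:
    k8s.v1.cni.cncf.io/resourceName: intel.com/sriov_netdevice
spec:
  config: '{
    "type": "sriov",
    "cniVersion": "0.3.1",
    "name": "sriov-net",
    "ipam": { "type": "host-local", "subnet": "10.10.0.0/24" }
  }'
---
# Pod requesting one virtual function from that network.
apiVersion: v1
kind: Pod
metadata:
  name: varnish
  annotations:
    k8s.v1.cni.cncf.io/networks: sriov-net
spec:
  containers:
    - name: varnish
      image: varnish:stable
      resources:
        requests:
          intel.com/sriov_netdevice: "1"
        limits:
          intel.com/sriov_netdevice: "1"
```

The pod's resource request is what causes the device plugin to allocate a virtual function and pass it through to the container, bypassing the host's software network stack.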
In a Kubernetes (K8s) cluster, master nodes and worker nodes serve distinct roles, each critical to the operation of the cluster.
Master Node: Responsible for managing the Kubernetes cluster and orchestrating all activities across the worker nodes.
Worker Node: Executes workloads (pods) assigned by the master node.
The testing process is divided into two phases:

- Phase 1: Baseline, without SR-IOV
- Phase 2: Optimized, with SR-IOV enabled

Key elements shared across both phases: the same VCL configuration, the same load-generation tooling, and the same 1MB synthetic responses.
For the tests, the prng VMOD is used to generate 1MB synthetic responses. Here is the VCL configuration:
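The original VCL is not reproduced here; below is a minimal sketch of the idea. The `prng.string()` call is an assumption standing in for the actual Varnish Enterprise prng VMOD API, which may differ:

```vcl
vcl 4.1;

import prng;          # Varnish Enterprise VMOD; function name below is assumed

backend default none; # no real origin: every response is synthetic

sub vcl_backend_error {
    # With no backend, every fetch lands here; we turn the "error"
    # into a cacheable 200 carrying 1MB of pseudo-random data.
    set beresp.status = 200;
    set beresp.ttl = 1h;
    synthetic(prng.string(1048576));  # hypothetical call: 1MB payload
    return (deliver);
}
```

Because the synthetic object is given a TTL, subsequent requests are served from cache, which is consistent with the 100% hit rates in the tables below.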
Load generation is performed using wrk to simulate traffic and measure performance with and without SR-IOV. We limited the number of CPUs in the values.yaml file to test how low we could go in CPU usage while still saturating the NIC.
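The exact wrk invocation is not given in the source; a typical run matching one of the thread/connection pairs in the tables below might look like this (the endpoint URL is a placeholder):

```shell
# 16 threads, 60 connections, with latency statistics enabled
wrk -t16 -c60 -d60s --latency http://varnish.example.local/
```

wrk reports requests per second, transfer rate, and the latency percentiles recorded in the result tables.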
The values.yaml file in a Helm chart is a YAML file that contains the chart's default configuration values. It allows you to customize the behavior and resources of a Kubernetes application when deploying it with Helm. These defaults can be overridden by providing your own values file or by passing --set flags to the helm install or helm upgrade commands.
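As an illustration (release name, chart reference, and value path are placeholders, not the actual chart schema), overriding a value at install time looks like:

```shell
# Install the chart while overriding a single default value
helm install my-varnish varnish/varnish-enterprise \
  --set server.resources.limits.cpu=4
```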
To learn more, see the Varnish Helm chart documentation.
In the values.yaml file, we added:
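The exact values are not reproduced in the source. As an assumption-labeled sketch (the key names follow common Helm chart conventions and may not match the Varnish chart's actual schema), capping the pod at 8 CPUs might look like:

```yaml
# Hypothetical values.yaml fragment: limit the Varnish pod's CPU allocation.
# Key names are illustrative; consult the Varnish Helm chart for the real schema.
server:
  resources:
    requests:
      cpu: "8"
    limits:
      cpu: "8"
```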
Using 8 CPUs, the NIC throughput reached its maximum capacity (100Gbps).
To ensure a fair comparison between the two phases, we are now limiting the setup to 4 CPUs.
| Threads | Connections | Req/sec | Transfer/sec (GB/s) | 90% Latency (ms) | 99% Latency (ms) | Hit Rate | Errors (Timeout) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 4 | 20 | 5849.92 | 5.71 | 11.42 | 23.57 | 100% | 0 |
| 8 | 40 | 5393.02 | 5.27 | 40.76 | 53.63 | 100% | 0 |
| 16 | 60 | 4692.99 | 4.69 | 53.30 | 68.73 | 100% | 0 |
| 32 | 120 | 3416.72 | 3.34 | 84.26 | 107.04 | 100% | 0 |
| 64 | 240 | 1593.56 | 1.56 | 290.28 | 485.39 | 100% | 0 |
| 128 | 480 | 1716.83 | 1.68 | 514.99 | 929.11 | 100% | 0 |
We observe a decline in requests per second and a rise in latency as CPU congestion builds, leading to client performance degradation.
With SR-IOV enabled, a NIC throughput of 10.94GB/s and a request rate of 11195.59 req/sec are achieved, comparable to the Phase 1 results (10.24GB/s and 10485.96 req/sec) but with significantly fewer CPUs (5 versus 8). This demonstrates that SR-IOV delivers the same performance with less CPU.
Limiting the setup to 4 CPUs:
| Threads | Connections | Req/sec | Transfer/sec (GB/s) | 90% Latency (ms) | 99% Latency (ms) | Hit Rate | Errors (Timeout) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 4 | 20 | 7566.83 | 7.39 | 3.99 | 8.31 | 100% | 0 |
| 8 | 40 | 8232.02 | 8.04 | 21.25 | 33.79 | 100% | 0 |
| 16 | 60 | 8386.16 | 8.19 | 22.39 | 36.26 | 100% | 0 |
| 32 | 120 | 7700.7 | 7.52 | 40.69 | 62.71 | 100% | 0 |
| 64 | 240 | 6393.45 | 6.25 | 73.59 | 127.94 | 100% | 0 |
| 128 | 480 | 5557.95 | 5.43 | 124.02 | 265.08 | 100% | 0 |
The results underscore the substantial performance gains achieved by integrating SR-IOV and Intel® hardware optimizations into the Kubernetes infrastructure. This approach not only boosts throughput but also significantly reduces CPU usage, making it an optimal solution for edge computing workloads. It is particularly suited for use cases that demand low latency or involve processing large datasets that are unsuitable for public cloud environments. When tailored for the network edge, these use cases are often video-related. With Varnish, the CPU savings can be repurposed to further enhance edge computing capabilities.
Gang Shen, Software Architect, Intel
Brandon Gavino, Product Solution Architect, Intel
Arindam Saha, Product Owner, Intel