eBPF: Put the Kubernetes Data Plane in the Kernel
eBPF could provide a “fundamentally better data plane” for cloud native operations, explained Daniel Borkmann, one of the core eBPF maintainers as well as an engineer at Linux networking company Isovalent, speaking at the virtual eBPF Summit last week.
The extended Berkeley Packet Filter is a general-purpose execution engine with a small, C-friendly instruction set that runs inside the Linux kernel. It has 114 instructions and 11 registers (compared with roughly 2,000 instructions and 16 registers for x86) and is event-driven. eBPF instructions are mapped to native machine instructions with low overhead, and the code is checked to be safe to run inside the kernel, Borkmann said.
While originally groomed for telemetry, eBPF could provide a more efficient way to control load balancing, firewalling and other rules-based, event-driven actions from within the kernel itself, speeding operations while cutting down on CPU usage, according to the presentation.
In Kubernetes deployments, load balancers are often co-located on the node with the actual workload. Rather than running them from a subsystem, operators can run them through eBPF, along with XDP (eXpress Data Path, a Linux hook to speed packet processing). This approach can equal the performance of DPDK (Data Plane Development Kit, a set of libraries for speeding packet processing). Firewall policies for inbound packets can be processed using this speedier approach as well, because it requires neither CPU polling nor hopping between kernel mode and user mode. Facebook’s Katran, Cilium and Cloudflare’s Unimog all use this approach.
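To make the idea concrete, here is a minimal user-space sketch in Python of the kind of per-packet decision such a program makes: drop unwanted traffic early, and pin each connection to one backend by hashing its flow tuple. It is a model only, not eBPF code and not how Katran, Cilium or Unimog are implemented. The addresses, the `Flow` type and the function names are hypothetical; only the verdict names mirror real XDP return codes.

```python
import hashlib
from typing import NamedTuple, Optional, Tuple


class Flow(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str


# Hypothetical values, chosen purely for illustration.
BLOCKED_SOURCES = {"203.0.113.7"}
SERVICE_VIP = ("10.0.0.1", 80)
BACKENDS = ["10.1.0.11", "10.1.0.12", "10.1.0.13"]


def pick_backend(flow: Flow) -> str:
    """Hash the flow tuple so all packets of one connection reach the same backend."""
    digest = hashlib.sha256(repr(tuple(flow)).encode()).digest()
    return BACKENDS[int.from_bytes(digest[:4], "big") % len(BACKENDS)]


def handle_packet(flow: Flow) -> Tuple[str, Optional[str]]:
    """Return (verdict, backend) for one inbound packet."""
    # Firewall check first: unwanted packets are discarded as early as possible.
    if flow.src_ip in BLOCKED_SOURCES:
        return ("XDP_DROP", None)
    # Traffic for the service address is load-balanced to a backend.
    if (flow.dst_ip, flow.dst_port) == SERVICE_VIP:
        return ("XDP_TX", pick_backend(flow))
    # Everything else continues up the regular network stack.
    return ("XDP_PASS", None)
```

Calling `handle_packet()` twice with the same `Flow` returns the same backend, which is the property a load balancer needs to keep a TCP connection intact.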
An eBPF data plane, in conjunction with cgroups v2, could enforce policies for traffic moving in and out of containers and pods. With the help of bpf_redirect_peer() and bpf_redirect_neigh(), traffic arriving at a network card on the host can be forwarded to a pod without running it up the host stack. Other examples he mentioned include TCP congestion control and custom TCP header creation. Such a data plane can also operate alongside other eBPF programs, such as one for tracing. eBPF simplifies the data plane when it is used in multiple subsystems, given that there are fewer dependencies and moving parts.
“The potential is so huge that it really feels like we are still at the beginning,” Borkmann said.
Google has been using eBPF to run a data plane for its Google Kubernetes Engine (GKE), according to a presentation on day two of the summit, by Google Software Engineer Zang Li. Network policy logging was the first feature Google created with this technology.
Kubernetes network policy defines which pods are allowed to communicate with one another. Although Kubernetes offers an API for setting network policy, it has no reference implementation. As a result, different vendors have implemented it in different ways, often extending it with proprietary features. Typically they rely on iptables or Open vSwitch (OVS).
Google itself was dissatisfied with iptables, which it found clumsy, hard to extend and not very scalable, Li explained. So the company implemented Cilium, an eBPF-based approach, for policy logging. eBPF policy enforcers are installed on each node, where they examine each packet against a set of rules that can be updated on the fly, or even dynamically generated. If the packet doesn’t match any of the rules, it is dropped. This setup can also push packet information to a user application for logging, Li explained. Information such as event type, source, identity, direction and packet headers can all be parsed and passed along to an application at wire speed.
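The enforcement logic described here (allow on a rule match, drop otherwise, and emit an event either way) can be sketched as a small user-space model in Python. This is an illustration under stated assumptions, not Cilium's or GKE's implementation: the `PolicyEnforcer` class, its method names and the event fields are all hypothetical. In a real eBPF program the rules would more likely live in a BPF map that user space updates, rather than in a Python set.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Rule = Tuple[str, str, int]  # (source identity, destination identity, destination port)


@dataclass
class PolicyEnforcer:
    """Hypothetical per-node enforcer: default deny, with a log of every decision."""

    allowed: Set[Rule] = field(default_factory=set)
    events: List[Dict[str, object]] = field(default_factory=list)

    def allow(self, src: str, dst: str, port: int) -> None:
        # Rules can be added while traffic is flowing.
        self.allowed.add((src, dst, port))

    def revoke(self, src: str, dst: str, port: int) -> None:
        # Removing a rule takes effect on the next packet.
        self.allowed.discard((src, dst, port))

    def check(self, src: str, dst: str, port: int, direction: str = "ingress") -> bool:
        """Return True if the packet may pass; record an event for the logger."""
        permitted = (src, dst, port) in self.allowed
        self.events.append({
            "verdict": "allow" if permitted else "drop",
            "source": src,
            "destination": dst,
            "port": port,
            "direction": direction,
        })
        return permitted
```

A packet between two pods is dropped until a matching rule is added with `allow()`, and dropped again once the rule is removed with `revoke()`. Each decision is appended to `events`, which stands in for the information pushed to a logging application.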
“We already see great potential. eBPF’s ability to augment network packets with custom metadata enables a long list of possible use cases. It can help us to enhance the observability, security and many other Kubernetes-aware package manipulations without sacrificing performance,” Li said.