What David Flanagan Learned Fixing Kubernetes Clusters
In one case, the submitter replaced a ‘c’ character with a Unicode doppelganger that looked identical to a c in the terminal output, causing an error that led Flanagan to doubt himself and his ability to fix clusters.
“I really hate that guy,” Flanagan confided at the Civo Navigate conference last week in Tampa. “That was a long episode, nearly two hours we spent trying to fix this. And what I love about that clip — because I promise you, I’m quite smart and I’m quite good with Kubernetes — but it had me doubting things that I know are not the fault. The fact that I thought a six digit number is going to cause any sort of overflow on a 64 bit system — of course not. But debugging is hard.”
After that show, Klustered adopted a policy of no Unicode breaks.
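A break like this is hard to see but easy to scan for. The sketch below is one possible check, not something shown in the talk: it reports every non-ASCII character in a piece of text along with its Unicode name. The sample manifest line and the choice of the Cyrillic letter U+0441 as the lookalike are assumptions made for illustration; the article does not say which character the submitter used.

```python
import unicodedata


def find_non_ascii(text):
    """Return (line, column, char, Unicode name) for every non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                hits.append((line_no, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


# Hypothetical manifest snippet: the "c" in "cpu" is really U+0441,
# a Cyrillic letter that renders like a Latin c in most terminal fonts.
sample = "resources:\n  limits:\n    \u0441pu: 500m\n"

for line_no, col, ch, name in find_non_ascii(sample):
    print(f"line {line_no}, col {col}: U+{ord(ch):04X} {name}")
```

Running a check like this over a suspect config file points straight at the line and column of any character that only looks like ASCII.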
“You only learn when things go wrong,” Flanagan said. “This is why I really love doing Klustered. If you just have a cluster that just works, you’re never really going to learn how to operate that beyond a certain level of scale. And Klustered brings us a situation where we can have people bring their failures from their own companies, their own organizations, their own teams, and we replicate those issues on a live stream format, but it allows us to see how individuals debug it as well.”
Debugging is hard, he said, even when you have a team from Red Hat working to resolve the problem, as he learned during another episode featuring teams from Red Hat and Talos. In that situation, Red Hat had removed the executable bit from important binaries such as kubectl, kubeadm, and even Perl (which can make most syscalls on a machine), limiting the Talos team’s ability to fix the fault.
“What we learned from this episode is you can actually execute the dynamic linker on Linux. So we have this ld-linux.so, and you can actually execute any binary on a machine, proxying it through that linker. So you can /bin/chmod, like so, which is a really cool trick.”
/lib/ld-linux.so /bin/chmod +x /bin/chmod
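As a rough sketch of the same idea, the snippet below checks whether a file has lost its execute bits and builds the argument list for running it through the dynamic linker. The helper names are hypothetical, and the linker's real file name varies by distribution and architecture, so the glob patterns are a guess at common locations rather than a definitive list. The trick is generally understood to apply to dynamically linked binaries.

```python
import glob
import os
import stat


def is_executable(path):
    """True if any execute bit (user, group, or other) is set on the file."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


def find_dynamic_linkers():
    """Candidate dynamic linker paths (assumed locations; these vary by system)."""
    patterns = ["/lib64/ld-linux*.so*", "/lib/ld-linux*.so*", "/lib/*/ld-linux*.so*"]
    return sorted({p for pattern in patterns for p in glob.glob(pattern)})


def via_linker(linker, binary, *args):
    """Build an argv that runs `binary` by passing it to the dynamic linker."""
    return [linker, binary, *args]


# Mirrors the command from the talk; substitute a path from
# find_dynamic_linkers() for the linker on a real machine.
print(via_linker("/lib/ld-linux.so", "/bin/chmod", "+x", "/bin/chmod"))
```

The resulting list can be handed to `subprocess.run` on a machine where chmod itself is no longer executable.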
People have also modified attributes on a Linux file system.
“Anyone know what attributes are in a Linux file system?” he asked. “No, of course not. Why should you?”
But these attributes allow you to get really low-level with the file system. He showed how they marked a file as immutable.
“So you can pick a file that, you know, kubectl or Kubernetes has to write to and mark it as immutable, and you’ve immediately broken the system,” he said. “You’re not going to be able to detect that break by running your regular ls commands, you actually do need to do an lsattr on the file, and understand what these obscure references mean when you list them all. So, again, Klustered just gives us an environment where we get to extract all of this knowledge from people that have done stuff that we haven’t done before.”
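The flag field that lsattr prints is a string of dashes and letters, where an `i` marks a file as immutable. The following is a minimal sketch of scanning that output for immutable files. The function names and the sample paths are hypothetical, and the `scan` helper assumes the lsattr utility from e2fsprogs is installed.

```python
import re
import subprocess

# An lsattr flag field contains only dashes and attribute letters.
FLAG_FIELD = re.compile(r"^[-A-Za-z]+$")


def immutable_paths(lsattr_output):
    """Return paths whose lsattr flag field contains 'i' (immutable)."""
    flagged = []
    for line in lsattr_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank lines and "directory:" headers from -R
        flags, path = parts
        if FLAG_FIELD.match(flags) and "i" in flags:
            flagged.append(path)
    return flagged


def scan(directory):
    """Run lsattr recursively on a directory and report immutable files."""
    result = subprocess.run(
        ["lsattr", "-R", directory], capture_output=True, text=True, check=False
    )
    return immutable_paths(result.stdout)


# Hypothetical lsattr output for two files; only the second is immutable.
sample = (
    "--------------e------- /etc/kubernetes/admin.conf\n"
    "----i---------e------- /etc/kubernetes/manifests/kube-apiserver.yaml\n"
)
print(immutable_paths(sample))
```

Once a file is identified, root can clear the flag with `chattr -i` on that path.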
On another episode, he had Kris Nóva, a kernel hacker who has worked in security and Kubernetes, along with Thomas Stromberg, a previous maintainer of minikube while at Google, who has also worked in forensic analysis of intrusions. Stromberg had to fix a cluster broken by Nóva, a security industry elite.
“Thomas came on and runs this fls command,” he said. “It’s a very old toolkit, written in the late 90s, called Sleuth Kit that does forensic analysis of Linux file systems.”
“By running this command, he got a time ordered change of every modification to the Linux file system. He had every answer to every question he wanted to answer for the last 48 hours…. So I love that we have these opportunities of complete serendipity to share knowledge with everyone,” he added.
Network Breaks Common
Networking breaks are fairly common on the show. Kubernetes has core network policies in place to keep them from happening, but they still happen.
“However, we’re now seeing fragmentation as other CNI providers bring on their own adaptations to network policies,” Flanagan relayed. “It’s not enough to check for network policies or cluster network policies. …You need to know to successfully operate a Kubernetes cluster from a networking level [that] continues to evolve and get very cumbersome, scary, complicated, but also easier.”
Flanagan’s biggest frustration with Kubernetes is the default DNS policy.
“Who thinks the default DNS policy in Kubernetes is the default DNS policy? Now we have this DNS policy called default,” he said. “But it’s not the default. The default is cluster first, which means it’s going to try and resolve the DNS name within the cluster. And the default policy actually resolves to the default routing on the host.”
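The naming trap can be captured in a few lines. This sketch is an illustration, not Kubernetes code: it models what a pod ends up with when the `dnsPolicy` field is left out of its spec. The four policy names are the values Kubernetes accepts; the helper function is hypothetical.

```python
VALID_DNS_POLICIES = {"ClusterFirst", "ClusterFirstWithHostNet", "Default", "None"}


def effective_dns_policy(pod_spec):
    """Return the dnsPolicy a pod gets, given a pod spec as a plain dict."""
    # An omitted field means ClusterFirst, not the policy named "Default".
    policy = pod_spec.get("dnsPolicy", "ClusterFirst")
    if policy not in VALID_DNS_POLICIES:
        raise ValueError(f"unknown dnsPolicy: {policy!r}")
    return policy


# No dnsPolicy set: the pod resolves names through the cluster DNS first.
print(effective_dns_policy({"containers": []}))

# Explicit "Default": the pod inherits name resolution from its node.
print(effective_dns_policy({"dnsPolicy": "Default", "containers": []}))
```

In other words, a pod only gets the node's resolver configuration when someone asks for `Default` by name.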
Flanagan said he’s been discussing with people like Tim Hockin and other commentators on Kubernetes how the community can remove some of the anomalies that trip up people who simply haven’t encountered these problems before.
eBPF Changing the Landscape
eBPF is changing the landscape as well, he said. Developers no longer simply go into a Linux machine and run iptables -L, which he noted has been ingrained into developers’ skulls for the past 20 years. Now they are supposed to listen to all the eBPF probes and traffic policies. Essentially, you need other eBPF tools that can understand the existing eBPF tools.
He recommended checking out Hubble for a visual representation of older network policies — Kubernetes and Cilium specifically, he added. Hubble also ships with a CLI.
“We have the tools to understand networking within our cluster. If you’re lucky enough to be using Cilium, if you’re using other CNI, you will have to find other tools, but they do exist as well,” he said.
He also recommended Cilium Editor.
“You can build a Kubernetes networking policy, or a Cilium network policy by dragging boxes, changing labels and changing port numbers,” Flanagan said. “So you don’t actually need to learn how to navigate these esoteric YAML files anymore.”
Cilium Editor lets you drag and drop to build out a Kubernetes network policy, he said.
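For a sense of what such an editor produces, here is a minimal sketch that builds a standard Kubernetes NetworkPolicy as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The policy name, namespace, labels, and port are all hypothetical, and this is a hand-written example rather than output from Cilium Editor.

```python
import json


def allow_ingress_policy(name, namespace, target_labels, source_labels, port):
    """Build a NetworkPolicy admitting TCP traffic on one port from labelled pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Pods this policy applies to.
            "podSelector": {"matchLabels": target_labels},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    # Pods allowed to connect, and on which port.
                    "from": [{"podSelector": {"matchLabels": source_labels}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


# Hypothetical example: only pods labelled app=frontend may reach
# pods labelled app=backend on TCP port 8080.
policy = allow_ingress_policy(
    "allow-frontend", "default", {"app": "backend"}, {"app": "frontend"}, 8080
)
print(json.dumps(policy, indent=2))
```

The labels and the port number are the same fields a visual editor exposes as boxes to drag and values to change.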
There are other ways to break Kubernetes clusters, of course. You can attack the container runtime, he noted. People have rolled back the kubectl binary as many as 25 versions; 25 versions is what it took to actually break backwards compatibility so that it can’t speak to the API server. Storage is another consideration with your own CSI providers, he added.
What he’d like to normalize is engineers admitting what they don’t know and sharing knowledge.
“The one rule I give people is please don’t sit there quietly, Googling off camera to get an answer and go, Oh, I know how to fix this,” he said. “I’d love to get senior engineers to set better norms for the newcomers in our industry and remove the hero culture we’ve established over the last 30 years.”
Civo paid for Loraine Lawson’s travel and accommodations to attend the conference.