Meshy and Happy with Kubernetes Ingress
Cloud computing depends on one critical thing: reliable networking. Reliable means not only that a connection exists between point A and point B, but that it is secure, observable, and delivers a definable quality of service.
Networking is already a classically hard problem in computing. Did you hear the one about the network admin who brought down YouTube with one bad config? It happens! Security isn’t in a much better state, because the need for scale and connectivity opens us up to more vulnerabilities. Security researcher James Mickens sounded the alarm in his 2015 talk, and it is still relevant today. More IPs and ports? More problems. Let’s face it: security is probably the last thing we think of when building applications. Increasing demand for more distributed applications means we could be heading over a networking cliff.
I spent several days trying out different options around this complex topic. Hopefully, this article will shortcut your learning and get you to a real solution faster. We are going to look at:
- What Kubernetes brings to network management
- Why Ingress is the tool to use in Kubernetes
- Where this is used to create a Service Mesh
- How to put it together with some examples around Apache Cassandra™
Kubernetes to the Rescue?
When you choose a deployment platform for your application, you expect the hard things to be managed for you — or at least made easier. Kubernetes is trying to deliver on the promise of building applications with scale, resilience and elasticity — which includes not only compute, but networking. The Kubernetes project has chosen to manage security in a way that is arguably the most sensible: just say “no”. No outside access in and no inside access out. As you deploy pods in a cluster, Kubernetes places them on a private network that is completely unroutable from outside the cluster. This means that as a first-time user, you find yourself wondering whether there is any way to access your newly created service, since there are no external IPs or ports by default. Not surprisingly, this is how cloud providers have operated for years, with explicit port access and VPCs (virtual private clouds).
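As a sketch of that default behavior, here is a plain Service manifest (the names are hypothetical). The default type, ClusterIP, assigns a virtual IP that is only routable inside the cluster; nothing external is created:

```yaml
# Minimal Service sketch. With no type specified, Kubernetes
# defaults to ClusterIP: a virtual IP reachable only from
# inside the cluster. No external IP or port exists.
apiVersion: v1
kind: Service
metadata:
  name: my-app          # hypothetical service name
spec:
  selector:
    app: my-app         # pods carrying this label receive traffic
  ports:
    - port: 8080        # cluster-internal port on the virtual IP
      targetPort: 8080  # container port on the backing pods
```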
Kubernetes supplies a limited number of native ways to route traffic to IPs and ports inside the cluster. There are four established ways (so far) to access your newly created service:
- Pod IP: This is similar to punching a hole in your firewall for a specific private IP and port. The configuration is not dynamic, however, and will not adapt when pod IPs change. Great for your laptop, terrible for production.
- Service NodePort: When you create a NodePort service, Kubernetes allocates a port (from a fixed range, 30000–32767 by default) that is opened on every node and directs traffic to the backing service. Each service gets its own port, which means a lot of port numbers to remember if you deploy a lot of services. The destination pod running the service is chosen by the node, not the requestor, so services should be stateless or you might get surprised.
- Service LoadBalancer: In your deployment YAML, you can configure a LoadBalancer service. Typically, this maps to a cloud provider load balancer, like an ELB from AWS or a Network Load Balancer from GCP. The outside IP is static and the load balancing policy is controlled by the implementation. This is closest to classic network load balancing, like F5 appliances or HAProxy software.
- Ingress: Similar to a LoadBalancer service but limited to HTTP(S) traffic. Cloud providers offer their own implementations; however, Ingress can also be run locally with one of several ingress controller projects. Since it lives in the Kubernetes control plane, Ingress provides many more options for traffic control and observability.
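To make the differences concrete, here are hedged sketches of the last three options as Kubernetes manifests. All service names, ports, and hostnames are hypothetical:

```yaml
# NodePort: the service is exposed on a port (default range
# 30000-32767) that is opened on every node in the cluster.
apiVersion: v1
kind: Service
metadata:
  name: my-app-nodeport
spec:
  type: NodePort
  selector:
    app: my-app
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 30080   # reachable at <any-node-ip>:30080
---
# LoadBalancer: asks the cloud provider to provision an external
# load balancer (for example, an AWS ELB) with a stable outside IP.
apiVersion: v1
kind: Service
metadata:
  name: my-app-lb
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
---
# Ingress: HTTP routing rules, interpreted by whichever ingress
# controller is installed in the cluster.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
spec:
  rules:
    - host: app.example.com      # hypothetical hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app
                port:
                  number: 8080
```

Note how only the Ingress resource carries routing rules (hosts and paths); the two Service types simply open a door to a single backend.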
Ingress to the Rescue!
Ingress is where the most interesting things are happening with Kubernetes networking. The addition of the Ingress API to Kubernetes has changed the way we manage networking — not only outside of clusters, but inside too. All of the pain points of networking at scale — like routing, automatic service attachment, security and failover — are managed with the least amount of expertise needed by the end-user of Kubernetes.
There are several controller implementations to choose from right now. Istio, Kong, Traefik, Skipper, NGINX… the list is really long, which should be some indication not only of the popularity, but of the subtle differences between controller implementations. The use case is important when choosing an Ingress controller, so let’s look at an example of when you need a controller and how it is deployed.
In this example, we have a three-node Apache Cassandra cluster running inside Kubernetes. The application that needs access to the data is outside of the cluster, so by the default rules of Kubernetes: sorry, no access granted. To provide connectivity, we also include a Kong ingress controller, declared in our deployment YAML.
The result is a static IP address for outside services to use. As nodes come online, they are automatically added to the Ingress controller destination IP list. The best part of this deployment is that communication is secure by default with SNI (server name indication) and mTLS (mutual transport layer security). The latter is critical for the Ingress controller: it not only ensures that the requesting application has access permissions, but also assures the application that it is securely talking to the cluster it intended to reach.
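One wrinkle worth sketching: Cassandra clients speak CQL over raw TCP (port 9042), not HTTP, so a standard Ingress resource will not carry this traffic. Kong provides a TCPIngress custom resource for exactly this case. The manifest below is a minimal sketch under that assumption; the hostname and service name are hypothetical:

```yaml
# Hedged sketch using Kong's TCPIngress CRD to route CQL traffic.
# The client connects to cassandra.example.com:9042 over TLS, and
# Kong uses the SNI hostname from the handshake to pick the route.
apiVersion: configuration.konghq.com/v1beta1
kind: TCPIngress
metadata:
  name: cassandra-ingress
  annotations:
    kubernetes.io/ingress.class: kong
spec:
  rules:
    - host: cassandra.example.com   # matched via SNI during the TLS handshake
      port: 9042                    # port Kong listens on for CQL
      backend:
        serviceName: cassandra-dc1  # hypothetical service fronting the nodes
        servicePort: 9042
```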
Cool Firewall, but Where’s the Mesh…
The previous example was about providing outside access to inside resources, but isn’t a mesh a bunch of things communicating with each other? The other feature of Ingress is routing traffic between internal services with the same level of control we saw when routing external traffic.
This example is similar to the one above; however, now the services are inside the same Kubernetes cluster. Since traffic is already routable between pods, why add an Ingress controller like Kong? Just because our traffic stays inside the cluster doesn’t mean we can lower our guard. The same security with SNI and mTLS is available by default. In addition, we can manage changes inside our cluster as services change IPs or become unavailable. All the pain we had to endure with network management before is now handled in one deployment.
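A sketch of what that internal routing could look like: the same Ingress mechanics, but with a hostname that only resolves inside the cluster. Instead of calling a peer service directly, internal clients go through the controller, which centralizes routing, TLS, and failover. All names here are hypothetical:

```yaml
# In-cluster routing through the same ingress controller.
# Internal callers use orders.internal rather than the pod or
# service IP, so routing policy lives in one place.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: internal-routes
  annotations:
    kubernetes.io/ingress.class: kong   # handled by the Kong controller
spec:
  rules:
    - host: orders.internal             # resolvable only inside the cluster
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: orders-service    # hypothetical backend service
                port:
                  number: 8080
```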
If you take this example and extend it to all of your services running inside Kubernetes, you can see where the mesh name is now used properly. Microservices talking to each other are now given the same default level of security and network management. As services are added and removed, there’s no need for a network administrator to be involved in changing configurations. Developers can build with the confidence of knowing the microservices they deploy are communicating properly and with the best security.
Where Is This Going for Cassandra?
The two examples shown were about how to have applications communicate with a running Cassandra cluster, but what does this mean for Cassandra running in Kubernetes? The first and most pressing need for Cassandra is enabling multi-datacenter Cassandra communications. Because this is such a common use case for Cassandra deployments, using the Ingress controller to secure and route data between clusters will further simplify deployments.
The other area of interest for Cassandra users is in the observability aspects of Ingress. Controllers like Kong and Istio already provide detailed statistics on network traffic. There is a real interest in the Cassandra community to provide Layer 7 statistics on CQL traffic. Error rates and latencies on all CQL traffic to and from a Cassandra cluster would add a layer of observability not available now.
To Mesh or Not to Mesh
If you are just getting started with Kubernetes, put Ingress on your shortlist of things to learn. Kubernetes networking can be hard to understand, so skip the legacy options and go straight to Ingress; it’s where the project is heading. A service mesh includes so much by default that it will be hard to deploy an application outside of Kubernetes again. For microservices, there is almost no other method that is as reliable. This is what we want in our cloud native applications: scale, resilience and elasticity. Building your application on service mesh technologies is the right tool for the right job.
If you are looking to learn more, DataStax offers a lot of free learning for the topics you need to succeed on our Kubernetes Skill Page.