AWS Re:Invent and KubeCon: The Race to Invisible Infrastructure

13 Dec 2017 8:00am, by

Yaron Haviv
Yaron Haviv, the CTO and co-founder of iguazio, is a serial entrepreneur involved in big data, cloud, storage and networking. Prior to iguazio, Yaron was the Vice President of Datacenter Solutions at Mellanox, where he led tech innovation, software development and solution integrations. Yaron tweets as @yaronhaviv.

I’m finally back home after two weeks of full-on cloud infusion, starting at AWS re:Invent in Las Vegas and ending with KubeCon in Austin.

The common theme at KubeCon was that Kubernetes and containers are becoming boring. It echoed the theme emphasized at re:Invent: automating development and making infrastructure invisible. KubeCon opened with a much-anticipated presentation by Google developer advocate Kelsey Hightower, who demonstrated a typical development flow: every time code was committed to GitHub, Google Cloud automatically built and ran a container (pod) and gave it an HTTP endpoint for testing.

Once the code was finally merged, it appeared under a production HTTP URL. Throughout the demo, Hightower didn’t need to use or think about the underlying Kubernetes infrastructure. To bring the point home, he created a Kubernetes cluster with a voice command: “Ok Google, create a Kubernetes cluster!”
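The commit-to-endpoint flow in the demo can be pictured as a mapping from repository events to deployment steps. The sketch below is a toy model only, not Google Cloud's API: the `RepoEvent` type, the `plan_steps` function, the event names and the `example.com` hosts are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RepoEvent:
    kind: str    # "commit" or "merge" (hypothetical event names)
    branch: str
    sha: str


def plan_steps(event: RepoEvent) -> List[str]:
    """List the steps an automated pipeline might run for a repository event."""
    tag = event.sha[:7]
    steps = [f"build image app:{tag}", f"run pod app:{tag}"]
    if event.kind == "merge" and event.branch == "master":
        # Merged code is served from the production URL (placeholder host).
        steps.append("expose https://app.example.com/")
    else:
        # Every other commit gets its own throwaway test endpoint.
        steps.append(f"expose https://{tag}.test.example.com/")
    return steps


if __name__ == "__main__":
    for step in plan_steps(RepoEvent("commit", "feature-x", "0123456789abcdef")):
        print(step)
```

The point of the model is that the developer only produces events (commits and merges); everything listed in `steps` happens without anyone touching the cluster.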

Impressive as it was, I couldn’t help scratching my head. Hightower had just shown us what I’d seen the week before at re:Invent, namely “serverless” working at Amazon…

At another session, Kubernetes co-creator Brendan Burns, who is now a Microsoft distinguished engineer working on Azure, presented his Metaparticle project. Positioned as the layer above Kubernetes, it features simplified distributed computing API semantics for workload load-balancing, distributed locks, master-election, sharding, etc.

Metaparticle is still in its infancy, but it will allow us to build scalable applications without thinking about the underlying infrastructure, by automatically building the required infrastructure from our code. I assume we will see more attempts at helping mainstream developers adopt cloud-native patterns, potentially by building such patterns into serverless platforms.
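To make one of these patterns concrete, here is a minimal sketch of sharding in plain Python: a request key is hashed and mapped to one of a fixed set of replicas, so the same key always lands on the same replica. It does not use Metaparticle's API; the function name `shard_for` and the replica names are assumptions made for illustration.

```python
import hashlib
from typing import Sequence


def shard_for(key: str, shards: Sequence[str]) -> str:
    """Pick a shard for a key using a stable hash (illustrative only)."""
    if not shards:
        raise ValueError("at least one shard is required")
    # sha256 is stable across processes, unlike Python's built-in hash().
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    replicas = ["replica-0", "replica-1", "replica-2"]  # hypothetical names
    for user in ("alice", "bob", "carol"):
        print(user, "->", shard_for(user, replicas))
```

Simple modulo sharding like this moves most keys whenever the number of replicas changes. Consistent hashing reduces that movement, at the cost of more code, which is exactly the kind of detail a higher-level layer could hide.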

GitHub now uses Kubernetes in production. It shared that it had tried to design its own solution a couple of years ago and concluded that the challenge was too great. The engineers essentially put the effort on hold until now, when the ecosystem offers a proper solution for managing containers. GitHub built an automated software delivery pipeline over Kubernetes in which every code update goes through several automated tests and integration steps. At the end, the update is deployed at small scale using canary deployments before being rolled out across the entire production site.
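The canary step can be sketched as a simple control loop: send a small share of traffic to the new version, then either widen that share or roll back depending on the observed error rate. This is a generic illustration, not GitHub's pipeline; the function names, the 1% error threshold and the doubling step are all assumptions.

```python
from typing import Iterable, List


def next_canary_weight(current: float, error_rate: float,
                       max_error_rate: float = 0.01,
                       step: float = 2.0) -> float:
    """Return the new version's traffic share after one evaluation.

    0.0 means rolled back, 1.0 means fully promoted.
    """
    if not 0.0 < current <= 1.0:
        raise ValueError("current weight must be in (0, 1]")
    if error_rate > max_error_rate:
        return 0.0  # too many errors: roll back
    return min(1.0, current * step)  # healthy: widen the rollout


def rollout(error_rates: Iterable[float], start: float = 0.05) -> List[float]:
    """Replay observed error rates and return the sequence of traffic shares."""
    weight = start
    history = [weight]
    for rate in error_rates:
        weight = next_canary_weight(weight, rate)
        history.append(weight)
        if weight in (0.0, 1.0):
            break  # rolled back or fully promoted
    return history


if __name__ == "__main__":
    print(rollout([0.0, 0.0, 0.0], start=0.25))
    print(rollout([0.0, 0.5], start=0.25))
```

A real pipeline would read the error rate from monitoring and apply the weight through its load balancer or Kubernetes resources; the decision logic itself stays this small.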

I couldn’t help thinking that the difference between the two shows was one of consumption vs. creation. AWS re:Invent was about “this is what we built and how you use it to run faster” (i.e., consumption), while the theme at KubeCon was “this is how we are building it and here’s where you can download the sources” (a community of creators). Judging by the number of people at both shows, advanced technology consumers seem to outnumber us creators by about 10 to 1.

The serverless trend didn’t skip KubeCon, which had a slew of presentations and announcements supporting the core message of making infrastructure invisible to developers. Open source platforms like nuclio, Kubeless and Fission are natively integrated with Kubernetes, making functions yet another managed cluster resource. In his presentation, Eliran Bivas, a senior big data architect at my company, iguazio, explained how to build an end-to-end real-time analytics pipeline over Kubernetes from scratch, without the complexity of Hadoop and YARN.

A single Helm command was used to automatically deploy big data tools such as Spark and Zeppelin, along with nuclio serverless functions for real-time event processing, AI and dashboards. From this point, data scientists can query data via interactive Zeppelin notebooks, and developers can add or update code through nuclio’s interactive IDE or a GitHub push. Infrastructure is invisible.
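For a sense of what developers actually write in this model, here is a minimal sketch of an event-processing function that follows nuclio's Python convention of a `handler(context, event)` entry point. The payload format, the threshold and the stand-in `context` and `event` objects used for the local run are assumptions for illustration, not part of the presentation.

```python
import json
from types import SimpleNamespace


def handler(context, event):
    """Flag sensor readings above a threshold (hypothetical payload format)."""
    body = event.body
    if isinstance(body, (bytes, bytearray)):
        body = body.decode("utf-8")
    reading = json.loads(body) if isinstance(body, str) else body
    alert = reading["value"] > 100  # hypothetical threshold
    context.logger.info("processed reading")
    return json.dumps({"sensor": reading["sensor"], "alert": alert})


if __name__ == "__main__":
    # Stand-ins for the objects the platform would pass in at runtime.
    fake_context = SimpleNamespace(
        logger=SimpleNamespace(info=lambda *args, **kwargs: None)
    )
    fake_event = SimpleNamespace(body=b'{"sensor": "s1", "value": 120}')
    print(handler(fake_context, fake_event))
```

Nothing in the function refers to pods, nodes or queues; the platform decides where it runs and which events reach it.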

The Cloud Native Computing Foundation reached important milestones at KubeCon. In a face-to-face discussion of the serverless working group, in which I participate, we finalized the whitepaper and managed to converge the various proposals for common event APIs into one. This is a first step toward making serverless functions portable across platforms and eliminating current lock-ins.
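Why a common event API matters for portability can be shown with a small sketch: events from different sources are normalized into one envelope, and the function logic is written once against that envelope. The `CommonEvent` fields and both source formats below are hypothetical placeholders, not the working group's proposal.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class CommonEvent:
    """A hypothetical common envelope; field names are illustrative only."""
    event_type: str
    source: str
    event_id: str
    data: Any


def from_queue_message(raw: Dict[str, Any]) -> CommonEvent:
    """Adapt a made-up message-queue record to the common envelope."""
    return CommonEvent(
        event_type=raw["topic"],
        source="queue",
        event_id=str(raw["offset"]),
        data=raw["payload"],
    )


def from_http_request(raw: Dict[str, Any]) -> CommonEvent:
    """Adapt a made-up HTTP request record to the common envelope."""
    return CommonEvent(
        event_type=raw["path"].strip("/"),
        source="http",
        event_id=raw["headers"]["x-request-id"],
        data=raw["body"],
    )


def handle(event: CommonEvent) -> str:
    """Function logic written once, against the common envelope only."""
    return f"{event.event_type}:{event.event_id} -> {event.data}"


if __name__ == "__main__":
    print(handle(from_queue_message({"topic": "orders", "offset": 7, "payload": "new"})))
    print(handle(from_http_request(
        {"path": "/orders", "headers": {"x-request-id": "7"}, "body": "new"}
    )))
```

Without an agreed envelope, each platform's adapter lives inside the function itself, which is where today's lock-in comes from.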

We made less progress in the storage working group, as we’re still debating what “cloud-native storage” actually is (read my views here). I guess that was the one place at the show where infrastructure was not invisible.

The Cloud Native Computing Foundation, Google and Microsoft are sponsors of The New Stack.

