Kubeflow Co-Founder: Machine Learning Workflows on Kubernetes Can Be Simple
The KubeCon + CloudNativeCon Conference sponsored this podcast.
Machine learning (ML) should have a profound effect on many facets of our lives in the future, from self-driving cars and trucks to utility grid management. As artificial intelligence (AI) engineers and data scientists develop advanced systems based on neural networks and other technologies designed to teach machines to learn and act in ways similar to the human brain, workflows that integrate all facets of the underlying software development and deployments for ML applications will play an obvious and critical role.
To that end, David Aronchick, co-founder of Kubeflow, described how Kubeflow can make setting up machine learning software production pipelines easier during a podcast that Alex Williams, founder and editor-in-chief of The New Stack, recorded at KubeCon + CloudNativeCon 2018 in Shanghai.
The Kubeflow project began last year. Originally created as a way to run Google Brain-created TensorFlow on Kubernetes, Kubeflow was designed to help simplify the deployment of open source systems for ML applications on Kubernetes platforms at scale.
“We looked at the landscape of machine learning and realized one of the most fundamental things people were having trouble doing was just the basics of setting up a reliable machine learning platforms and [production] pipelines,” Aronchick said. “We knew Kubernetes was going to be great at this. It had already transformed the industry when it came to so many types of applications.”
The idea has been to build an ML framework “around norms and packaging, deployment and all those kinds of things to make rolling out a machine learning pipelines portable and composable (you can pick and choose what you want) and distribute them to take advantage of the essence of Kubernetes,” Aronchick said.
One of the biggest stumbling blocks of ML software deployments has been “the essence of portability,” Aronchick said. “Data scientists today will often go and grab their laptop… and execute a bunch of commands to install a complex machine learning platform for them to experiment with a model,” Aronchick said. “That’s great, but the second they want to bring that model to production or bring it into the enterprise, things get challenging.”
An ML software developer’s laptop, for example, “probably does not look anything like their ultimate production system,” Aronchick said. An ML developer will thus have to “go in and rewrite the entire thing…to bring the same data science platform into the enterprise,” Aronchick said. “Worse than that, setting up these kinds of complicated distributed systems to do these training to do these things at scale in terabytes and petabytes of data was really challenging.”
Instead, Kubeflow “describes what it is you are doing in ways that work the same on your laptop and in production-ready environments,” Aronchick said. “It allows you to make that move from your laptop to the ultimate enterprise environment that much easier.”
In large-scale ML deployments on Kubernetes, typically in the cloud, the training process, while obviously extremely challenging to develop, represents only a small part of the entire process across the different layers of abstraction. “Kubeflow is not going to go out and re-design your storage system, your nodes or anything like that,” Aronchick said. “More often than not the most common machine learning platforms…were hand done by a bunch of really smart people in a single enterprise that couldn’t work literally in the next building over.”
Since its launch in December 2017, Kubeflow “has been on fire,” with “hundreds of committers all around the world” contributing a total of 1,900 commits. “[Kubeflow’s reception] has been well beyond our expectations,” Aronchick said. “It has all come together to give machine learning engineers and data scientists very easy ways to [build] things out.”
In this Edition:
3:01: What was the impediment or the block that made Kubeflow so significant?
10:01: Exploring the Kubeflow 0.3 release and Kubeflow pipelines.
12:45: Clean layers of abstraction and how these impact the developer experience.
16:58: Addressing the issue of data portability and application framework portability.
21:17: What are the intersections with application architectures?
25:01: How do you think about the dueling forces of companies versus open source projects?