MapR’s Ted Dunning: The Intersection of Machine Learning and Containers
How would you orchestrate a containerized architecture whose purpose is not just to retain state, but actually construct it block-by-block, like an Egyptian pyramid or a presidential excuse?
In this latest edition of The New Stack Makers podcast, MapR’s chief application architect, Ted Dunning, presented an intriguing idea: a kind of “reverse load-balancing” architecture in which stateless containers are given machine learning construction tasks, along with a limited time in which to either complete the job or put their tools down and go home. He calls it rendezvous architecture.
If every application in the world could be easily containerized with an automated process, it might already have happened. We’ve talked before in The New Stack about the many approaches to maintaining state — which, in a less esoteric, more honest-sounding world, is the same thing as “having one’s own database.” For a stateless system, a stateful construct is a pretty neat trick. Or at least for now, it’s your choice of several pretty neat tricks, and maybe a few more not-so-neat ones.
But machine learning is a different order of beast. By implication, something you learn is something you retain. If you can accept the notion that a machine can “learn” anything, then that notion is meaningless if the machine is also capable of wiping its own slate clean with each iteration. Not even a virtual thing can be real unless it retains the same virtues.
“It isn’t a universal architecture that can handle every machine learning deployment problem,” said Dunning, in an interview for The New Stack Makers podcast. “It handles a particular, widely-used class of decisioning systems, where you ask a question, give it some request, and you get a response back that’s finite size.”
The architecture tackles what Dunning warns are the “boring” aspects of machine learning — specifically, logistics. With a typical academic artificial intelligence system, a program is given a task to resolve, and it returns with its best effort at a solution within a specified period. Rendezvous would work in a similar way, except with a plurality of containers contributing to the response within that given time interval. Whatever solutions the containers have produced are delivered to a kind of proxy when the clock runs out, ready or not. That proxy filters the good outputs from the bad, and renders a result — along with, perhaps, a confidence index — to the function that called it.
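The shape of that proxy can be sketched in a few lines. The snippet below is a minimal illustration, not MapR’s implementation: the model functions (`fast_baseline`, `slow_accurate`, `too_slow`) are hypothetical stand-ins for containerized models, and the scoring scheme is an assumed, simplified stand-in for a real confidence index. The one idea it demonstrates is the rendezvous pattern itself: fan a request out to several workers, collect whatever answers arrive before a deadline, and let the proxy pick the best one.

```python
import concurrent.futures
import time

# Hypothetical stand-ins for containerized models. Each accepts a
# request and returns a (confidence, answer) pair at its own pace.
def fast_baseline(request):
    time.sleep(0.01)
    return (0.6, "baseline:" + request)

def slow_accurate(request):
    time.sleep(0.05)
    return (0.9, "accurate:" + request)

def too_slow(request):
    time.sleep(0.5)  # will miss the deadline below
    return (0.99, "late:" + request)

def rendezvous(request, models, deadline):
    """Fan the request out to every model, then gather whatever
    results have arrived when the deadline expires."""
    results = []
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [pool.submit(m, request) for m in models]
        done, _not_done = concurrent.futures.wait(futures, timeout=deadline)
        for f in done:
            if f.exception() is None:  # filter out failed workers
                results.append(f.result())
    if not results:
        return None, 0.0
    # The proxy picks the highest-confidence on-time answer;
    # the score doubles as a crude confidence index.
    confidence, answer = max(results)
    return answer, confidence

# too_slow misses the 0.2-second deadline, so the proxy returns
# the best answer that arrived in time.
answer, confidence = rendezvous(
    "q1", [fast_baseline, slow_accurate, too_slow], deadline=0.2)
```

The deadline is what makes the pattern SLA-friendly: the caller always gets an answer (or an explicit miss) within a bounded interval, regardless of how many of the contributing containers finish their work.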
In This Edition:
2:08: Dunning’s thoughts on how machine learning should integrate into containerized environments.
9:08: Moving the learning framework into containerization.
14:32: Exploring the differences in skill sets among data specialists, DataOps engineers, and data scientists.
21:41: Discussing the education and training needed to show practitioners what containers, and the mathematics behind them, are used for.
25:46: How education drives incentive to utilize what individuals are learning in an immediate and productive way.
33:57: How machine learning systems handle SLA-style contracts and the possibility of confidence indexes.
Feature image: An artist’s depiction of a single atom-thick sheet of synthetic graphene by James Hedberg, licensed under Creative Commons 3.0.