Here’s something that probably doesn’t cross most developers’ minds: In a distributed system whose components don’t share state with one another, how does one produce an application whose stated goal, if you will, is to create and maintain a state — specifically, something learned?
We already know that artificial intelligence routines such as decision trees don’t need to have state pre-assessed for them in order to render results that seem rational enough. Chess move algorithms, for example, don’t have to retain a concept of the active chessboard in memory to rate the quality of a potential move. They may not produce the best chess programs on the market, but they can evaluate moves, and they can beat amateurs.
Well, by definition, a machine learning application is expected to retain something — that’s what the entire “learning” part is supposed to mean.
“The thing that you really want to do from the get-go is, make a couple of decisions,” Bonsai co-founder and CEO Mark Hammond told us, in this latest edition of The New Stack Makers podcast. “The first is going to be, are you going to have a system where online learning matters — that is to say, where the system needs to learn concurrently, with users utilizing the system? Or can there be a lag in that? Is it possible for you instead to record the data that has come in, as users use the system, and then learn on those recordings in a more batch-style operation layer?”
It’s not an esoteric point. Any machine learning application has to be trained; for instance, if its job is to ascertain the normal behavior of a system, such as a telecommunications network, then it has to be shown data representing normalcy. Never mind why that data represents a normal state; the training algorithm needs to see it. Hammond’s point is that if that data can be accumulated over time, then the recording mechanism could certainly be envisioned as stateless microservices.
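To make the batch-style option concrete, here is a minimal sketch in Python. It is not Bonsai’s method or anything described in the podcast; the names `record_event`, `train_batch`, and `is_anomalous`, the `latency_ms` field, and the three-standard-deviation threshold are all assumptions made for illustration. The recorder keeps nothing in memory between calls, and the notion of “normal” is learned later from the recordings.

```python
import io
import json
import statistics
from typing import Iterable, TextIO


def record_event(log: TextIO, event: dict) -> None:
    """Stateless recorder (hypothetical): append one observation to an
    external log. Nothing is kept in memory between calls."""
    log.write(json.dumps(event) + "\n")


def train_batch(lines: Iterable[str]) -> dict:
    """Batch step (hypothetical): learn a baseline of normal behavior,
    here just the mean and standard deviation of recorded latencies."""
    values = [json.loads(line)["latency_ms"] for line in lines if line.strip()]
    return {"mean": statistics.mean(values), "stdev": statistics.stdev(values)}


def is_anomalous(model: dict, latency_ms: float, threshold: float = 3.0) -> bool:
    """Flag a reading that sits more than `threshold` standard deviations
    from the learned mean. The threshold is an assumed value."""
    return abs(latency_ms - model["mean"]) > threshold * model["stdev"]


# An in-memory buffer stands in for whatever durable store a real
# deployment would use (an assumption made to keep the sketch runnable).
log = io.StringIO()
for latency in [20.0, 22.0, 19.0, 21.0, 18.0]:
    record_event(log, {"latency_ms": latency})

# Later, and separately from serving, learn from the recordings.
model = train_batch(log.getvalue().splitlines())
```

In this arrangement the only state lives in the log and in the trained model, so the recording side could be scaled out as independent services, with the lag between recording and learning that Hammond describes.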
“The next thing is to understand whether or not maintaining state matters,” Hammond continued, “for what you are doing when you are serving requests. There’s a big difference between, for example, searching for reviews of a product and needing to get results back, where even in that case, you still want to have intelligence in the system.” It’s easy enough to use a natural language learning algorithm to refine the results of a query. But if your application pertains to a persistent physical object, such as a robotic manufacturing system, there needs to be an underlying layer that ensures that events in the microservices layer are routed to other services correctly. And that layer may not be microservices as we know them.
“As you build out these machine learning systems, it’s a whole lot more complicated than just, which machine learning algorithm should I use, and how should I structure my training data?” said Hammond.
After the podcast, be sure to download Bonsai’s white paper on the challenges of constructing real AI for the enterprise without getting lost in the metaphors, from bons.ai.
In This Edition:
2:55: The extension of artificial intelligence (AI) and machine learning into today’s distributed systems and containerized environments.
13:06: What type of component can make a distributed environment possible for a machine learning scenario?
17:13: When a developer chooses Bonsai, how many decisions are taken off the table and automated for them?
27:36: How does adaptivity work with artificial intelligence?
33:03: The danger of accepting what a system tells you in an unsupervised machine learning environment.
38:36: Can machine learning platforms run on container orchestration platforms such as Kubernetes, or should they run on a different kind of platform than what is available today?
Feature image: An Indian harvester ant nest by Rushil, licensed under Creative Commons 3.0.