While there are only a few companies using artificial intelligence in production, it’s certainly where the future lies. In this wide-ranging episode of The New Stack Makers podcast, we talk with Ted Dunning, chief application architect at MapR and author, along with Ellen Friedman, of the new O’Reilly book “AI and Analytics in Production.”
We all have an image in the back of our minds of computers taking over the world, but the truth for the short-term, said Dunning, is that some of the best value for artificial intelligence (AI) is going to be some of the most boring stuff. AI, at least in the beginning, will replace boring repetitive tasks and mine massive amounts of data in ways not previously imaginable.
As you accumulate data, said Dunning, it begins to have interesting synergistic effects, so that its raw value can go up in unexpected ways. In Calgary, a research effort has been collecting blood samples for 30 years. By mining that extraordinary wealth of information with these new AI techniques, the researchers have implemented processes to control infections that can save 50,000 lives a year in Alberta alone.
This doesn’t revolutionize medicine, said Dunning, but it has an enormous impact.
But in order to take on these new jobs, he said, “a significant shift in thinking is required for operations, analytics, and application development teams. That’s a lot, from a business perspective.”
What’s missing in a lot of the talk about AI, said Dunning, is that we do not actually know the solutions that the machine-learning system is going to find. “The fact that we do not know enough to be able to write exact specifications of what it should do, in all cases, means that the software development lifecycle is fundamentally changed.”
So, if you don’t know exactly what the model is supposed to do, you can’t write a conventional unit test. For example, if you don’t know how AI is going to recognize fraud by sorting through terabytes of bank data, there is no exact expected output to test against.
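One common way around this (a sketch of the general practice, not a method described in the episode) is to test aggregate behavior instead of exact outputs: hold out labeled data, score it, and assert the model clears a quality threshold. The names here (`score_transaction`, the 0.8 threshold, the sample data) are all illustrative.

```python
# A conventional unit test asserts an exact output; a model test can only
# assert aggregate behavior on labeled data. Everything below is a
# hypothetical stand-in, not a real fraud model.

def score_transaction(amount, merchant_risk):
    """Stand-in for a trained fraud model: returns 1 for 'fraud'."""
    return 1 if amount * merchant_risk > 500 else 0

# Held-out labeled examples: (amount, merchant_risk, true_label)
holdout = [
    (1000, 0.9, 1),
    (20, 0.1, 0),
    (700, 0.8, 1),
    (50, 0.2, 0),
    (900, 0.7, 1),
]

correct = sum(
    1 for amount, risk, label in holdout
    if score_transaction(amount, risk) == label
)
accuracy = correct / len(holdout)

# Instead of asserting an exact answer, assert the model clears a
# quality bar on the evaluation set.
assert accuracy >= 0.8, f"model accuracy {accuracy:.2f} below threshold"
print(f"accuracy: {accuracy:.2f}")
```

The test still fails deterministically when the model regresses, which is what a delivery pipeline needs, even though no single prediction is specified in advance.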
While the business requirements are still the same, he said, you have to build a reproducible and reliable engineering process that delivers products on schedule and within a budget, but the specific methods used to do so are noticeably different.
Store It All
It used to be that storing data was expensive, but now companies can store massive amounts of it for a fraction of what it cost even a decade ago. Much of the storage workflow seems analogous to what was done before: data comes in and is collected in a rough, ugly form. But the next step is radically different.
The next step, traditionally, is “Let’s model that data, and understand it completely, and discard anything we don’t understand.”
But now, the key is storing the data in as raw a form as possible, and we will model it incrementally. “We will model the parts that we understand today, and that we need today, but we will save the rest, as much as we can,” he said. This leaves the door open to the possibility that later we’ll understand some new thing, or we might automatically do some level of modeling on it.
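This "store raw, model incrementally" idea can be sketched in a few lines. The example below is purely illustrative (the field names, the `aux` payload, and `extract_known_fields` are assumptions, not anything from the episode): the raw records are kept exactly as they arrived, and only the fields we understand today are extracted.

```python
import json

# Hypothetical raw events, kept exactly as they arrived; fields we
# don't yet understand ("aux") are preserved rather than discarded.
raw_events = [
    '{"user": "a", "amount": 12.5, "aux": {"beacon": 7}}',
    '{"user": "b", "amount": 3.0, "aux": {"beacon": 9, "geo": "yyc"}}',
]

# Model only the parts we understand and need today...
def extract_known_fields(line):
    record = json.loads(line)
    return {"user": record["user"], "amount": record["amount"]}

known = [extract_known_fields(line) for line in raw_events]

# ...while the raw lines stay untouched, so a future pass can model
# "aux" once its meaning is understood.
total = sum(r["amount"] for r in known)
print(total)  # 15.5
```

The design choice is that the extraction step, not the ingestion step, decides what gets modeled, which is what leaves the door open to modeling new fields later.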
That’s a huge shift, but the motions aren’t that different. We’re just delaying some of the work.
Listen in to find out why Dunning says, “Don’t pick a solution before you understand the problem,” and why he recommends experimenting in AI like you messed around with blocks as a kindergartner.
1:36: Why this book? Why now?
6:58: The paradigm of data storage and how this has changed over the years
10:33: Exploring the “Significant shift in thinking required for operations, analytics, and application development teams.”
15:58: The value for AI in doing ‘boring’ business-related tasks
19:33: Working with the CI/CD pipeline, database structures, and static data
24:21: Discussing the term ‘DataOps’ and what it means for developers
Raygun sponsored this podcast, which was produced independently by The New Stack.
Feature image via Pixabay.