AI Development Needs to Focus More on Data, Less on Models
To be used successfully by most businesses, artificial intelligence needs to be less focused on building models and more focused on data, said Andrew Ng in his talk at Insight Partners’ ScaleUp:AI conference held in New York earlier this month.
To date, AI development has been centered around building models, in which an algorithm is developed that will fit as closely as possible to the data. To test, researchers would download some dataset from the internet. This has led to a lot of widely used general purpose neural nets.
Ng’s data-centric approach is different. It aims to put less effort into code and more into trying to “systematically work on the data,” Ng said.
Ng thinks that, going forward, this approach will provide a “more efficient way” of building AI apps.
Ng has a long resume as an AI pioneer. He co-founded both the Coursera and DeepLearning.ai online learning centers. His most recent venture is Landing AI, which provides AI as a service.
AI for All
To date, much of the impact AI has had in the industry has benefitted large internet-scale companies for search, recommendations, user-identification and the like. AI has seen less success in the more general day-to-day businesses, such as health care. Even among large companies, only a small percentage benefit from AI.
The way that the massive internet companies develop and deploy AI systems isn’t practical for most traditional businesses, Ng argues.
For one, most businesses have massively smaller data sets to work with. A technology that was built around digesting 100 million or more images may not be very effective for the number of images coming in from some factory floor process, which could number only in the low dozens, for instance.
The good news is that surprisingly solid AI systems can be built around a smaller data set, Ng pointed out. “It’s not about big data,” he said. Nor should the systems be gargantuan. The internet-scale companies are used to sinking millions of dollars into a monolithic project, but a company with a factory line, for instance, may need dozens of smaller-scale AI systems to inspect the product at each stage.
In other words, you could not build a monolithic platform to handle all the needs of a factory, or of a hospital.
Ng advises looking toward highly customized vertical platforms for each use case. Building them wouldn’t be up to the IT departments of each vertical but rather to the subject matter experts. The trick would be to build the tools so they would allow the experts not to write code but to “engineer the data in a way that lets them express the domain knowledge,” he said.
What is involved in data engineering? For one, it involves cleaning up messy data. What one person tags as a “chip,” another person may tag as a “scratch.” One person may identify a single scratch where someone else may mark it down as a series of scratches.
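The kind of labeling inconsistency described above can be surfaced with a very small script. The following is a minimal sketch, not Ng’s or Landing AI’s tooling: it assumes each image has been labeled independently by several inspectors, and the function names, input format and example data are all hypothetical. It flags the items on which inspectors disagree so that a subject matter expert can review them and settle on a clear definition.

```python
from collections import Counter


def find_label_disagreements(annotations):
    """Return items whose annotators did not all agree, with the vote counts.

    `annotations` maps an item id to the list of labels that different
    annotators assigned to it (a hypothetical input format).
    """
    disagreements = {}
    for item_id, labels in annotations.items():
        counts = Counter(labels)
        if len(counts) > 1:  # more than one distinct label means a conflict
            disagreements[item_id] = dict(counts)
    return disagreements


def agreement_rate(annotations):
    """Fraction of items on which every annotator chose the same label."""
    if not annotations:
        return 1.0
    unanimous = sum(1 for labels in annotations.values() if len(set(labels)) == 1)
    return unanimous / len(annotations)


# Made-up example data: three inspectors labeling four images.
example = {
    "img_001": ["scratch", "scratch", "scratch"],
    "img_002": ["chip", "scratch", "chip"],
    "img_003": ["ok", "ok", "ok"],
    "img_004": ["scratch", "chip", "dent"],
}

flagged = find_label_disagreements(example)
rate = agreement_rate(example)
```

Here `flagged` contains only `img_002` and `img_004`, the two images the inspectors labeled differently. Tracking `agreement_rate` over time would be one simple way to check whether clearer labeling instructions are working.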
Ng relayed a story about his work with Dupont, which was attempting to improve its computer vision system for inspecting sheets of steel for defects. At the time, the system was at 76% accuracy, and the company wanted to get that to 90%. Engineers took the traditional approach of refining the model, with little improvement. Ng’s own team took the opposite approach, looking at the inconsistencies in data quality and what was causing them. This approach, in turn, led to a model with much higher accuracy. Having the subject matter experts express clearly what constitutes a “defect” improved the model.
The idea behind data-centric operations is “to build tools and show consistently high quality data from all stages of an AI project,” he said. “Because if you can show that for all of these stages, you have the right tooling to consistently high quality data that often solves a lot of practical problems that arise.”
When it comes to deriving value from AI, it turns out that “AI infrastructure is way more important than we would have thought as a source of differentiation,” noted Lonne Jaffe, a managing director at the investment firm Insight Partners (which owns The New Stack), during another talk at the conference.
The trick is to make ML very scalable, he said. This becomes doubly tricky at the edge, where ML systems may require both preprocessing and inference capabilities within a consumer device as well as coordination with a backend data center. Most database systems today are not built for these federated architectures.
One Insight portfolio company that makes hay from scalability is Run:AI, which offers what Jaffe likened to a “VMware for GPUs.” The idea is not just to make a service cheaper but to manage infrastructure in such a way as to make it “qualitatively different and better.” Because of its more efficient infrastructure, Hour One, which just announced a round of funding this week, lets you pay only when you have finished iterating through to the final product.
The field is growing so large that it is bifurcating in various ways, he noted. AI-based analytics, for instance, “is turning out to be more different from the traditional data analytic stack than we originally thought,” he said, pointing to the AI experiment tracking and hyper-parameter optimization specialist Weights & Biases, which doesn’t have a perfect analog in traditional analytics.
Even the MLOps stack for structured data is diverging from that for unstructured data. The approach taken by Overjet, which offers image recognition for tasks such as dental examinations, is very different from that used by Zest AI, which uses more structured data to drive lending decisions.
Insight has been investing in software for about 20 years, and Jaffe has been a managing director at the company for about five years, building out a portfolio that includes AI infrastructure support companies as well as those that use AI to provide a specific service.
Companies that provide AI software as their product need a different kind of talent than the usual software engineer: people who want to stay current with the academic community while also enjoying the benefits of working for a commercial organization.
In addition to AI infrastructure and AI as a product, Jaffe sees a third nascent but growing category: Captive AI.
With Captive AI, the business can’t easily consume AI as a service from a vendor since the AI is core to the strategy. “Companies are learning systems, humans are learning systems. And so if the core learning system of an organization is now powered by AI, either existing incumbents will try to ‘do it themselves,’ or if they can’t hire the right talent or make the right investments, a class of new, disruptive ScaleUps might emerge, and we’re starting to see this more and more,” Jaffe said.
One example is Netflix, which has moved into original programming, shaped by the preferences of its customer base. Another is Prose, one of Insight’s own investment companies, which formulates personalized shampoos for customers based on AI analysis of myriad factors, including hair type and texture, lifestyle habits, environmental exposures and even diet and stress levels. Such recommendations would be impossible to carry out, at least on a production basis, without AI.
With Captive AI, “the shape of the firm, or the industry itself can change,” he said.