“AI is revolutionizing how we process information,” said Al Brown, senior vice president of engineering at Veritone. Veritone is a four-year-old company providing what I’m calling AI-as-a-Service (although I admit AIaaS doesn’t have the same ring as SaaS). Brown called the company “a place to process content for AI.” It is part of a wave of second-generation AI companies that abstract away the heavy AI lifting and let their customers simply get the results.
Give me your data, says Veritone, and I’ll give you back just what you want to know. None of that pesky AI training for you! No need to understand the complexities of containers and engines and message buses and microservices for YOU! Give me vast quantities of data, structured, unstructured, whatever you’ve got, we’ll take it. And you get the answers to the questions you are burning to have answered.
There are three main categories of customers that have embraced AI and the capabilities it represents, said John Ward, Veritone’s vice president of marketing. The first is legal and compliance, where companies use AI to monitor recorded phone calls and determine whether they are in compliance. There are about 30 billion audio compliance calls per year, he said, and about four billion of those are in the financial sector alone. So how can you tell whether the calls banks make to their customers comply with the rules of the FTC and other regulatory agencies without having people listen to billions of calls? AI, of course.
The media and entertainment industry has also jumped on board. There are lots of uses, mostly in compliance and advertising, said Ward. News networks are using AI to measure advertising in two ways: first, to verify that ads actually ran the contracted number of times, and second, to track ads embedded in content, where it’s harder to follow the subsequent clicks than in traditional advertising.
The government, including law enforcement agencies, has been quick to see the advantages of AI’s facial recognition capabilities. Agencies can now match mug shots against security footage, crime scene photos and surveillance footage. This works not only for identifying criminals but also for finding missing and exploited children.
How Do They Make It Work?
The Veritone stack ingests media, not just directly from its customers but from a variety of public and private sources, and sends it through processing units they call engines. They excel at working with unstructured data, which comes from audio and video files.
“A computer knows that it’s an mp4 file,” Ward said, “but it doesn’t know what’s inside it because it can’t see and it can’t hear. Structured data, coming from databases, is made up of 1s and 0s and it’s much clearer as to what it is.”
Structured data is much easier to work with, but it’s the unstructured data where the gold lies. Examples include video footage from security cameras and audio files of call center phone calls.
Veritone starts with the unstructured data, Brown explained. They consider themselves to be an operating system — “a place to process content through AI,” he said.
Their platform has several extension points. First are the “adaptors,” which connect to a data source and emit data. The next extension point is “engines.”
Each engine wraps an AI service (or any service), listens for specific types of data, then processes it. The engine then produces more data, which in turn triggers other engines.
Most jobs will run the same data through several engines before turning over the result to their customer. For example, the first engine will transcribe the audio on a video file, then other engines take the now-structured data and process it, connecting it with data from Weather.com, point-of-sale systems or any other of their 190 engines (and counting).
Brown explained that there are over a dozen engines for facial recognition alone. One specializes in nighttime photos because nighttime lighting casts shadows that can be tricky to see through, he said. Another facial recognition engine focuses on profile views, and so on.
All of this is coordinated by an event bus, which manages the rules that cause the engines to fire.
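To make the flow concrete, here is a minimal Python sketch of engines chained through a rule-driven bus. It is an illustration only, not Veritone's code: the `Message` and `EventBus` classes, the `transcribe` and `find_keywords` functions, and the `"video"`, `"transcript"` and `"keywords"` labels are all hypothetical names chosen for the example, and the "engines" are trivial stand-ins for real AI services.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Message:
    kind: str      # hypothetical data-type label, e.g. "video" or "transcript"
    payload: str


# An engine takes one message and returns zero or more new messages.
Engine = Callable[[Message], List[Message]]


class EventBus:
    """Toy in-memory bus: routes each message to engines subscribed to its kind."""

    def __init__(self) -> None:
        self.rules: Dict[str, List[Engine]] = {}
        self.log: List[Message] = []

    def subscribe(self, kind: str, engine: Engine) -> None:
        # Rule: "when data of this kind appears, fire this engine".
        self.rules.setdefault(kind, []).append(engine)

    def run(self, first: Message) -> List[Message]:
        queue = deque([first])
        while queue:
            msg = queue.popleft()
            self.log.append(msg)
            # Output of one engine is queued and may trigger further engines.
            for engine in self.rules.get(msg.kind, []):
                queue.extend(engine(msg))
        return self.log


def transcribe(msg: Message) -> List[Message]:
    # Stand-in for a speech-to-text engine; a real one would call an AI service.
    return [Message("transcript", f"text of {msg.payload}")]


def find_keywords(msg: Message) -> List[Message]:
    # Stand-in for a downstream engine that works on the transcript.
    words = [w for w in msg.payload.split() if w.endswith(".mp4")]
    return [Message("keywords", ",".join(words))]


bus = EventBus()
bus.subscribe("video", transcribe)
bus.subscribe("transcript", find_keywords)
result = bus.run(Message("video", "call.mp4"))
print([m.kind for m in result])
```

An adaptor in this picture would simply be whatever creates the first `Message`. A production system would presumably use a durable queue rather than an in-memory deque, but the triggering pattern is the same.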
Below that is Conductor, which orchestrates AI engines to actually solve a problem, like providing better transcripts. Part of what they’re doing with Conductor is developing a more accurate transcript that learns, said Brown. This experiment in applying ensemble learning through Conductor has increased overall transcription accuracy by 15 to 20 percent. Buoyed by this success, they are applying the concept to other engine types as well.
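The article does not say how Conductor combines engine outputs. As one simple illustration of the ensemble idea, the Python sketch below takes a word-by-word majority vote across several transcripts of the same audio. The function name `vote_transcript` and the sample sentences are invented for the example, and it assumes the transcripts are already aligned to the same length, which real transcripts generally would not be.

```python
from collections import Counter
from typing import List


def vote_transcript(candidates: List[List[str]]) -> List[str]:
    """Pick the most common word at each position across candidate transcripts.

    Assumes every candidate has the same number of words (already aligned);
    real transcripts would need an alignment step first.
    """
    if not candidates:
        return []
    length = len(candidates[0])
    if any(len(c) != length for c in candidates):
        raise ValueError("candidates must be aligned to the same length")
    merged = []
    for position in range(length):
        words = Counter(c[position] for c in candidates)
        merged.append(words.most_common(1)[0][0])
    return merged


# Three hypothetical engine outputs, each with a different single-word error
# or none at all.
outputs = [
    "please hold the line".split(),
    "please fold the line".split(),
    "please hold the lime".split(),
]
print(" ".join(vote_transcript(outputs)))
```

Because each engine tends to make different mistakes, the vote can recover a sentence that is more accurate than most individual outputs, which is the intuition behind ensembling.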
Next is the application tier, where they have an ecosystem of apps. They view all of the apps as partners, Brown explained, automating processes that humans had to do before. “It’s not really about the algorithm, it’s about how to use it and apply it to business.”
So What’s in Your Stack?
When they started, Chad Steelberg, Veritone’s founder and CEO, decided to process all content through their system as soon as it’s created. Every YouTube video and all the news stations from around the world are among the data processed on a daily basis. They’re currently tracking over 8,000 companies, said Brown, and this enormous task led to the decision to go serverless.
AI pushed them to a standards-based approach with Docker containers, which scale easily depending on the load, said Brown. They chose Oracle’s Pipeline as their foundation because Oracle’s standards-based approach allows them the greatest flexibility. The Docker containers are orchestrated by Iron.io.
The apps tier uses React with Node.js. The API tier is currently switching from REST to GraphQL. The rest of the stack is a mixture of code they’ve built using Golang or Node.js.
For queuing, they’re using a mix of Kafka and NSQ, and they use Elasticsearch for their search indices.
For the Developers
Also included is the Veritone Developer App, which allows developers to build their own engines. Engines are simply Docker containers or Docker images, Brown said. The Developer App offers a simple way to push an engine into the Veritone system, so the customer’s code runs in its own Docker container but is connected to the entire Veritone system.
“If you’re an AI developer, I want you to focus on the AI,” said Brown. “Not on the data, or the pipeline, or how to operationalize it, that’s what we take care of.”
The New Stack is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Docker.