Secure file-sharing platform Box has launched a private beta of Box Image Recognition, an advanced image recognition and intelligent classification service built on Google’s Cloud Vision API.
Google Cloud Vision provided the most advanced image recognition available, in no small part because Google Images is the largest collection of images on the planet, Wacker explained. The company’s system has been learning from labels and photo descriptions supplied by the photographers.
To offer an example, Wacker uploaded a photo of his sister-in-law’s wedding to Box Capture, and the service correctly identified it as a Pakistani wedding, not an Indian one, “which is incredibly specific,” he said. “It’s astonishing.”
Another advantage of the Cloud Vision API is its ability to capture Google Optical Character Recognition (OCR) metadata “in the wild,” he said. For example, you can take a picture of a streetcar, upload the image, and Cloud Vision can extract text from the image. “It’s super, super powerful.”
Box It Up
Enterprises need ways of organizing and understanding their growing unstructured content, especially with the surge in the use of images, video, and audio, Wacker said. Box Image Recognition can detect individual objects, text, surroundings and colors, then automatically add keyword labels. The Box app uses this data to attach metadata to images and text.
It’s not just image capture that’s important, or that you can extract text and assign labels. It’s not just that the service has the potential to greatly reduce manual data input. The real business value, argued Wacker, comes from using the image data to automate business processes, thereby speeding up workflows.
One startling time-saving use case is for retail catalogs. An employee could upload an image of a guy wearing blue jeans and Box Image Recognition identifies the photo and automatically creates keywords and metadata. Colors, type of clothing, fabric type, whether the model is male or female, all searchable, all automatically added. The image upload can trigger an API to add the photo to a catalog, eliminating time-consuming manual data entry and manipulation.
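To make the catalog scenario concrete, here is a minimal sketch in Python of how detected labels might be turned into searchable keywords and handed to a catalog step. It does not call Box or Google Cloud Vision. The `Label` and `CatalogEntry` types, the `on_image_uploaded` hook, the 0.7 confidence cutoff, and the sample labels are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Label:
    """One detected label (hypothetical shape, not a real API type)."""
    description: str  # e.g. "Jeans"
    score: float      # detector confidence between 0 and 1


@dataclass
class CatalogEntry:
    """A catalog record built from an uploaded image (hypothetical)."""
    file_name: str
    keywords: List[str] = field(default_factory=list)


def labels_to_keywords(labels: List[Label], min_score: float = 0.7) -> List[str]:
    """Keep confident labels, lowercase them, and drop duplicates.

    Keywords come back ordered from most to least confident.
    """
    seen = set()
    keywords = []
    for label in sorted(labels, key=lambda item: item.score, reverse=True):
        keyword = label.description.strip().lower()
        if label.score >= min_score and keyword and keyword not in seen:
            seen.add(keyword)
            keywords.append(keyword)
    return keywords


def on_image_uploaded(
    file_name: str,
    labels: List[Label],
    add_to_catalog: Callable[[CatalogEntry], None],
) -> CatalogEntry:
    """Simulate the upload trigger: build keywords, then call the catalog step."""
    entry = CatalogEntry(file_name, labels_to_keywords(labels))
    add_to_catalog(entry)
    return entry


# Made-up detector output for a photo of a model in blue jeans.
catalog: List[CatalogEntry] = []
detected = [
    Label("Jeans", 0.97),
    Label("Blue", 0.91),
    Label("Denim", 0.88),
    Label("jeans", 0.80),  # duplicate once lowercased
    Label("Sofa", 0.31),   # below the confidence cutoff
]
entry = on_image_uploaded("model-0142.jpg", detected, catalog.append)
print(entry.keywords)  # ['jeans', 'blue', 'denim']
```

The cutoff is the main design choice here: a higher value keeps the catalog cleaner at the cost of dropping some correct labels.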
Another example is a user applying for a job or home loan, who could take a photo of their driver’s license through the Box Capture app. This app bypasses the phone’s camera storage and uploads the image directly into Box Image Recognition, which then extracts the data (picture, name, address, Social Security number, state ID number, etc.). Labels are added to make the data easily searchable, and all of the data is loaded into Box Metadata so customers can then search and organize it through APIs. But the exciting part, said Wacker, is that you can then use that upload to trigger a workflow, starting with an automatic background check.
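The upload-triggers-a-workflow idea can be sketched as a small event dispatcher. This is only an illustration under stated assumptions: `WorkflowRegistry`, the `drivers_license` document type, the field names, and `start_background_check` are hypothetical, and nothing here reflects how Box actually wires up its workflows.

```python
from typing import Callable, Dict, List

# A handler receives the extracted fields and returns a status message.
Handler = Callable[[Dict[str, str]], str]


class WorkflowRegistry:
    """Maps a document type to the workflow steps it should start (hypothetical)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def on(self, document_type: str, handler: Handler) -> None:
        """Register a handler to run when a document of this type is uploaded."""
        self._handlers.setdefault(document_type, []).append(handler)

    def dispatch(self, document_type: str, fields: Dict[str, str]) -> List[str]:
        """Run every handler registered for the type, in registration order."""
        return [handler(fields) for handler in self._handlers.get(document_type, [])]


def start_background_check(fields: Dict[str, str]) -> str:
    """Stand-in for a real background-check request."""
    return f"background check queued for {fields['name']}"


registry = WorkflowRegistry()
registry.on("drivers_license", start_background_check)

# Made-up fields, standing in for data extracted from the uploaded photo.
results = registry.dispatch(
    "drivers_license",
    {"name": "Jane Doe", "state_id": "X0000000"},
)
print(results)  # ['background check queued for Jane Doe']
```

Document types with no registered handler simply produce an empty result list, so new upload types can be added without touching existing workflows.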
For business, this means extraordinary time savings. One customer said each driver’s license upload will save an hour. That’s a hell of an ROI.
It’s a Data Explosion
One thing is for sure: Adding image capture is going to increase the amount of data Box manages for its clients. The company provides not only data capture but also storage that falls under legal compliance for its enterprise customers, meaning that most of this new data must be stored for a minimum of three years.
Since Box launched in 2011, it has amassed 30 billion files, with 10 billion of those coming in the last year alone. And once Box Image Recognition gets out of beta, that number might continue to increase exponentially.
To prepare for the deluge, Box has been working on making its system flexible and scalable. It was an early adopter of the Kubernetes open source container orchestration engine and is currently partnering with Heptio to tap into the latest technology updates.
“We relish the explosion of data,” said Wacker.