
With Computer Vision, Amazon’s Got the Goods to Banish Long Lines from Whole Foods

20 Jun 2017 11:50am, by

Long check-out lines could vanish from Whole Foods stores once Amazon completes the $13.4 billion purchase of the food retailer.

That is thanks to computer vision, one of the technologies Amazon plans to use to make shopping a breeze. Amazon’s Go service, announced in December, uses computer vision, sensor input and artificial intelligence so users can shop and leave a store without checking out at a counter.

Amazon could ultimately implement Go technologies in its Whole Foods stores. With Go, once a buyer picks up a product, it is automatically added to a shopping list. Buyers are charged on their Amazon account after leaving the store, and don’t need to stand in the checkout line.
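
As a rough illustration of that flow, the Python sketch below models a virtual cart that fills up as products are picked up and is billed on exit. Amazon has not published how Go works, so the `VirtualCart` class, its method names and the prices are all hypothetical.

```python
from collections import Counter


class VirtualCart:
    """Hypothetical cart that updates as a shopper handles products."""

    def __init__(self, prices):
        self.prices = prices      # product name -> unit price in cents (made-up values)
        self.items = Counter()    # product name -> quantity currently held

    def pick_up(self, product):
        # A "pick up" event, e.g. as reported by a vision system, adds the item.
        self.items[product] += 1

    def put_back(self, product):
        # Returning an item to the shelf removes it from the cart again.
        if self.items[product] > 0:
            self.items[product] -= 1

    def charge_on_exit(self):
        # Total in cents to bill the shopper's account when they leave the store.
        return sum(self.prices[name] * qty for name, qty in self.items.items())


cart = VirtualCart({"apple": 50, "milk": 299})
cart.pick_up("apple")
cart.pick_up("apple")
cart.pick_up("milk")
cart.put_back("apple")
total_cents = cart.charge_on_exit()
print(total_cents)  # 349: one apple plus one milk
```

The hard part in a real store is producing reliable pick-up and put-back events in the first place; the bookkeeping after that is simple.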

The first Amazon Go store opened recently in Seattle and is only for company employees, but more stores will pop up in the future. The Whole Foods stores could be next on Amazon’s Go list, though it remains to be seen whether the technology will actually improve the customer experience.

It’s not clear how Amazon Go uses computer vision, but cameras may be used to recognize humans and the products they pick up. In a simple sense, object recognition could be a long-term replacement for RFID tags, and that could have major implications for logistics, warehousing and transportation of products across a wide range of industries.

At the center of computer vision are cameras, which can now do more than just shoot great pictures. Like human eyes, cameras can make sense of images by recognizing objects, tracking depth and adding context to surroundings.

The Promise of Computer Vision

Computer vision can add a new layer of functionality and artificial intelligence to devices and automation. The retail industry is quickly adopting the technology, which is also at the core of self-driving cars that use 3-D cameras and sensors to recognize objects, signals and lanes.

Microsoft’s Kinect camera, which can track gestures and movement, was an early example of computer vision in play. Today, high-resolution security cameras can identify individuals in large crowds, and face recognition is used for biometric authentication. Computer vision is also relevant in retail, Internet of Things, drone and virtual reality applications.

With 3-D cameras, robots can use computer vision to navigate, inspect product quality and automate tasks. One example is robots that examine the quality of fruit.

Lowe’s and Wayfair are using a mix of computer vision and augmented reality (AR) to simulate how furniture would look in a customer’s living room. Voice assistants like Samsung’s Bixby — which is on the Galaxy S8 smartphones — use image recognition to connect potential buyers to products. A user can scan a wine bottle and check pricing for the product on different retail websites.

Other industries are also seeing the value of computer vision. GE has identified opportunities for computer vision in the mining industry, where cameras could recognize items dropping off conveyor belts and report them to nearby systems. Via its HoloLens AR headset, Microsoft has shown computer vision being used for surgical training and other medical applications. AR headsets could also be used to troubleshoot equipment problems.

Research on computer vision has been going on for decades, but the technology is now gaining commercial relevance. The associated hardware, software and algorithms are maturing quickly, though there are still some challenges in identifying and implementing applications.

Taking pictures via 3-D cameras is just the start of building a computer vision system. The process involves studying and classifying pixels with a computer-vision algorithm and using machine learning for accurate analysis of images. Such systems can be deployed on or off premises.
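
To make the idea of classifying pixels concrete, here is a deliberately tiny Python sketch. It is not how production systems work (those typically rely on deep neural networks); it swaps in a nearest-centroid classifier that "learns" an average colour per label from a few examples. The function names and the colour values are invented for illustration.

```python
def mean_color(pixels):
    """Average the (R, G, B) values of an image given as a list of pixels."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def train(labeled_images):
    """'Learn' one average colour per label from example images."""
    by_label = {}
    for label, pixels in labeled_images:
        by_label.setdefault(label, []).append(mean_color(pixels))
    # mean_color also works on a list of colour tuples, so reuse it here.
    return {label: mean_color(colors) for label, colors in by_label.items()}


def classify(pixels, centroids):
    """Return the label whose learned colour is closest to the image's."""
    color = mean_color(pixels)

    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(color, centroids[label]))

    return min(centroids, key=distance)


# Made-up training data: each image is just two pixels.
training = [
    ("banana", [(240, 220, 40), (250, 230, 60)]),
    ("banana", [(230, 210, 30), (245, 225, 50)]),
    ("lime", [(60, 200, 50), (80, 210, 70)]),
]
centroids = train(training)
label = classify([(235, 215, 45), (242, 222, 52)], centroids)
print(label)  # banana
```

Even in this toy, the shape of the pipeline is the same: turn pixels into numbers, learn from labeled examples, then compare new images against what was learned.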

Work Still to Be Done

Lambda Labs is one company that provides deep-learning hardware and software tools, while companies like Amazon, Google and Microsoft provide varying levels of machine learning and image recognition capabilities in the cloud.

Companies first need to identify an application that would benefit from visual sensing. Computer vision systems can take time to mature: as more images are fed into the system, recognition and analysis become more accurate.

Google and Facebook have sophisticated image-recognition systems in their mega data centers, but those took years to build. Machine-learning frameworks like TensorFlow, Caffe2 and Theano help build and improve learning models. Companies like Intel and Ambarella are building sophisticated depth-sensing cameras for computer vision.

Despite its promise, computer vision has a long way to go. False positives on image recognition are a big problem, and sophisticated computer vision algorithms are being developed by researchers and companies to mitigate that problem.
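
One simple lever for trading off false positives, offered here as a general illustration and not as the approach any particular researcher or company uses, is a confidence threshold: ignore any detection the model is not sure enough about. The Python sketch below uses invented scores to show how raising the threshold cuts false positives at the cost of missing real objects.

```python
def count_errors(detections, threshold):
    """Count false positives and missed objects at a confidence threshold.

    Each detection is (confidence, is_real): the model's score and whether
    the object was actually present.
    """
    false_positives = sum(
        1 for conf, real in detections if conf >= threshold and not real
    )
    missed = sum(1 for conf, real in detections if conf < threshold and real)
    return false_positives, missed


# Made-up detections for illustration only.
detections = [
    (0.95, True), (0.90, True), (0.80, False),
    (0.70, True), (0.60, False), (0.40, False),
]
for threshold in (0.5, 0.75, 0.85):
    print(threshold, count_errors(detections, threshold))
# 0.5 (2, 0)
# 0.75 (1, 1)
# 0.85 (0, 1)
```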

Beyond motion tracking, computer vision is mostly relegated to static images, which limits its scope. Google and other companies are chasing video recognition by analyzing a string of images bunched together in a sequence. The benefits of video recognition are enormous as computers will be able to identify activities and events.
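
The idea of treating video as a sequence of images can be sketched in a few lines of Python. The example assumes some image recognizer has already labeled each frame (the labels below are invented) and then smooths those labels with a sliding-window majority vote so that a single misread frame does not change the detected activity. It is a toy, not a description of Google's method.

```python
from collections import Counter


def smooth_labels(frame_labels, window=3):
    """Replace each frame's label with the most common label in a window around it."""
    half = window // 2
    smoothed = []
    for i in range(len(frame_labels)):
        neighbors = frame_labels[max(0, i - half): i + half + 1]
        smoothed.append(Counter(neighbors).most_common(1)[0][0])
    return smoothed


# Hypothetical per-frame labels; the lone "run" at index 2 is likely a misread.
frames = ["walk", "walk", "run", "walk", "walk", "run", "run", "run"]
smoothed = smooth_labels(frames)
print(smoothed)
# ['walk', 'walk', 'walk', 'walk', 'walk', 'run', 'run', 'run']
```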

The integration of computer vision with other AI technologies like natural language processing and voice recognition is also a work in progress. Implementing computer vision on a robot or drone can be challenging, considering that processing high-resolution imagery requires a lot of computing power and bandwidth, which can drain battery life.
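
A quick back-of-the-envelope calculation shows why bandwidth is a concern. The Python snippet below assumes uncompressed 1080p video at 24-bit colour and 30 frames per second; those figures are illustrative assumptions, and real systems compress video heavily.

```python
def raw_video_bytes_per_second(width, height, bytes_per_pixel, frames_per_second):
    """Data rate of uncompressed video, in bytes per second."""
    return width * height * bytes_per_pixel * frames_per_second


# Assumed settings: 1920x1080 pixels, 3 bytes per pixel, 30 frames per second.
rate = raw_video_bytes_per_second(1920, 1080, 3, 30)
print(f"{rate / 1_000_000:.1f} MB/s")  # 186.6 MB/s
```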

Amazon’s also relying on computer vision for its Prime Air service, in which drones deliver products straight to your door. Drone delivery of Whole Foods products may also become a reality if you don’t want to visit a store.

