How Zuck Built His Jarvis AI Bot from APIs

23 Dec 2016 4:43am, by

You may have heard by now that Facebook co-founder Mark Zuckerberg’s personal project for 2016 was to build his own Artificial Intelligence (AI) bot, which he affectionately named Jarvis. Zuckerberg’s AI is far from Iron Man’s fully functional cognitive assistant, called Jarvis, or even Rosie, the beleaguered maid of “The Jetsons.”

Still, for 100 hours’ worth of work, it manages to accomplish a few basic tasks. Zuckerberg built it using a combination of Python, PHP and Objective C, overlaid with natural language processing, speech recognition, face recognition and reinforcement learning APIs. He can talk to Jarvis on his phone or computer and control connected appliances: turning lights and music on and off, launching a gray t-shirt from his t-shirt cannon, and even having warm toast ready for him in the morning.

But just how does one build an AI? Iddo Gino, CEO of RapidAPI, connected the dots. RapidAPI is the world’s largest API connector, enabling developers to gather and connect APIs in one platform. “We get a great view of the space, and especially the AI space where people consume many APIs,” Gino said in an email.

https://www.facebook.com/notes/mark-zuckerberg/building-jarvis/10154361492931634/

The Jarvis Schema from Zuckerberg’s blog post.

Opening the black box of artificial intelligence (AI) or machine learning (ML) reveals clusters of APIs. Gino noticed that most of Jarvis’ core functionalities come from connecting APIs, not from coding the functionality from scratch. In a blog post, he laid out the structure for Jarvis, which really applies to all AIs, including Siri and Amazon’s Echo. Even IBM’s Watson is, at its base, a cluster of interconnected APIs.

Gino identifies three clusters of APIs needed to power Jarvis, or any AI, grouped by the action each one performs.

Tell Jarvis a Command

A wide variety of user interface APIs allow Zuckerberg to connect with Jarvis, including Facebook Messenger, iOS voice commands and a door camera. First, Zuckerberg issues a command (e.g., “shoot me a t-shirt” or “alert me when my next appointment arrives at the gate”); the Messenger Bot API then tells the Jarvis system to accomplish the task.
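As a rough illustration of this first cluster, the sketch below normalizes requests arriving from different front ends into one common structure before anything tries to interpret them. It assumes voice input has already been transcribed to text. The channel names, the `Command` class and the `receive` function are hypothetical; none of them comes from Zuckerberg’s code or from the Messenger Bot API.

```python
from dataclasses import dataclass

# Hypothetical channel names, loosely mirroring the interfaces named above.
KNOWN_CHANNELS = {"messenger", "ios_voice"}


@dataclass
class Command:
    channel: str  # which interface the request arrived on
    user: str     # who issued it
    text: str     # the natural language request, normalized


def receive(channel: str, user: str, raw_text: str) -> Command:
    """Turn raw input from any supported interface into one common Command."""
    if channel not in KNOWN_CHANNELS:
        raise ValueError(f"unknown channel: {channel}")
    # Collapse runs of whitespace and lowercase the text so that
    # later stages always see a consistent form.
    text = " ".join(raw_text.split()).lower()
    if not text:
        raise ValueError("empty command")
    return Command(channel=channel, user=user, text=text)


if __name__ == "__main__":
    print(receive("messenger", "mark", "  Shoot me a  gray T-shirt "))
```

Funneling every interface into the same structure means the interpretation step only has to handle one input shape, however many front ends are added later.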

Command Interpretation

AI system APIs help Jarvis make sense of the commands passed through the user interface APIs. When Zuckerberg or any cognitive assistant user issues a command, they do it using natural language and not computer-speak.

Next is a process familiar to developers everywhere: breaking down the natural language command into component parts that the computer can act on. For example, the simple task “Shoot me a gray t-shirt” is broken down into component parts (load cannon with gray t-shirt, fire cannon) by APIs for speech recognition and natural language processing, which extract tasks and intents from the words.
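To make that decomposition concrete, here is a deliberately tiny sketch that turns the t-shirt request into an ordered task list. It uses plain keyword matching as a stand-in for the speech recognition and natural language processing APIs a real assistant would call; the task names, the color list and the `interpret` function are all assumptions made for illustration.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical vocabulary; a real system would rely on NLP services instead.
COLORS = ("gray", "black", "white")


def interpret(text: str) -> List[Tuple[str, Optional[str]]]:
    """Map a natural language request to an ordered list of (task, argument)."""
    # Split into lowercase words, keeping hyphenated words such as "t-shirt".
    words = re.findall(r"[a-z\-]+", text.lower())
    if "shoot" in words and "t-shirt" in words:
        # Use the first known color mentioned, or None if no color is given.
        color = next((w for w in words if w in COLORS), None)
        return [("load_cannon", color), ("fire_cannon", None)]
    return []  # nothing recognized


if __name__ == "__main__":
    print(interpret("Shoot me a gray t-shirt"))
```

A keyword matcher like this breaks as soon as the phrasing varies, which is exactly the gap that dedicated language APIs are meant to fill.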

Recognizing the person standing at his gate and granting or refusing entry is a more complex process, but the core process remains the same: recognize the voice (or text) command and break it into component parts that can be translated into tasks for the computer to complete.

These commands are consumed through APIs.

Taking Action

Home Systems/Data APIs are what actually get the job done. So far, the command has been recognized, then broken into component tasks. Now the Home Systems APIs carry out those tasks by connecting to Internet of Things devices such as light switches, thermostats and door locks. This cluster also includes APIs that retrieve data from services outside the Jarvis system, for example, getting a song or playlist from Spotify or using the Internet Movie Database API to find out which actor voiced Jarvis in the Iron Man movies.
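One way to picture this last cluster is as a dispatcher that maps each component task to the device able to perform it. The sketch below is a minimal version under stated assumptions: `HomeHub`, the task names and the handlers are hypothetical, and each handler simply returns a string where a real one would call a device or data API.

```python
from typing import Callable, Dict, List, Optional, Tuple

Task = Tuple[str, Optional[str]]
Handler = Callable[[Optional[str]], str]


class HomeHub:
    """Routes named tasks to whichever handler was registered for them."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        self._handlers[name] = handler

    def run(self, tasks: List[Task]) -> List[str]:
        """Execute tasks in order and collect what each handler reports."""
        results = []
        for name, arg in tasks:
            if name not in self._handlers:
                raise KeyError(f"no device registered for task: {name}")
            results.append(self._handlers[name](arg))
        return results


# Stand-in handlers; real ones would talk to hardware or outside services.
hub = HomeHub()
hub.register("load_cannon", lambda color: f"cannon loaded with {color} t-shirt")
hub.register("fire_cannon", lambda _: "cannon fired")
hub.register("lights", lambda state: f"lights turned {state}")

if __name__ == "__main__":
    for line in hub.run([("load_cannon", "gray"), ("fire_cannon", None)]):
        print(line)
```

Keeping devices behind a registry like this means a new appliance can be added by registering one more handler, without touching the interpretation step.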

Zuckerberg is limited in which appliances he can use, according to an article in Fast Company.

Although Jarvis is restricted to Zuckerberg’s private residence, his home network is under Facebook’s corporate infrastructure. All devices hooked into the Facebook infrastructure are required to have a Facebook security certificate that meets very rigid standards. IoT refrigerators, for example, do not yet have Facebook security certificates. In addition, he found that he needed to make some hardware changes to his t-shirt cannon and dog food dispenser in order to get them to work with Jarvis. He also hacked a 1950s toaster to allow him to push the lever down with the bread inserted and the power off so that it could start toasting automatically at a future time. Safety regulations prevent modern toasters from working this way.

Building Your Own

So Jarvis is mostly working, although the demo for the Fast Company reporter often required repeating commands, to Zuckerberg’s embarrassment.

Zuckerberg said he spent about 100 hours on this project. He started with the Facebook Messenger Bot Framework (duh) and used the Buck build system, developed to build large projects quickly. Working with Facebook open source platforms really saved him time. He cited Nuclide, FastText and various other projects from Facebook Research as shaving a lot of time off his project.

He suggested researching the GitHub repo if you’re interested in spinning up your own AI. Of course, other companies are building their own machine learning/cognitive assistant/AI platforms, and you could hitch your wagon to platforms that have spent thousands of developer hours on creating this functionality.

Machine learning, cognitive assistants and AI are on the rise, as is the stream of APIs coming onto the market daily. In his blog post, Gino wrote that “AIs will only be as good as the APIs they use. The better the APIs and the more data and actions they let AI take – the better AIs will be.”

Feature Image: Pee-wee Herman catching toast from his Breakfast Machine, “Pee-wee’s Big Adventure.”

