
Bonsai’s Mark Hammond on Artificial Intelligence: It’s About Teaching

3 Mar 2017 1:00am, by

In a way, the phrase “machine learning” is misleading to programmers, because successful artificial intelligence is really more about “teaching” than “learning.” So asserts Mark Hammond, founder and CEO of Bonsai, a middleware platform for artificial intelligence (mission statement: “AI for Everyone”).

Last week in San Francisco, I sat down to chat with Hammond, a self-proclaimed cognitive entrepreneur who combines computer science, cognitive science and business smarts. He started programming in the first grade and, while in high school, landed a job with Microsoft that lasted through his degree in neural science from Caltech. From there he applied his understanding of technology and of how the brain works at numerous startups and in academic research, co-founding JobCompass in 2010, then founding Bonsai in 2014. His passion is turning his understanding of how the mind works into beneficially applied technology.

We talked about artificial intelligence (AI), machine learning, the singularity, and teaching baseball.

I saw a snarky tweet that asserted that most of what people call AI is just a ton of “if” statements. I know that is wrong, but I find I can’t articulate the difference. Can you?

Fundamentally, machine learning from a programming standpoint isn’t that complicated, but it’s a very different way of thinking about a problem than most engineers are used to. So, this is where I’m going to get “mathy.”

If you’re a programmer, you have some function “f” that takes some input “x” and gives you “f of x.” Your entire training has been: how do I make a really good “f”? That’s what you are trained to do.

And machine learning flips that on its head. Look, I’m not going to give you “f,” and I’m not going to ask you to make “f.” I’m going to give you “x,” and I’m going to give you the outputs “f of x,” and I’m going to ask you to learn what “f” is. So it’s going backward. Don’t engineer “f” for me; derive “f” given the inputs and the outputs. That is machine learning in a nutshell.

And all the stuff you hear about — deep learning, Markov Chains, probabilistic generative models, we could go on and on — they all are just different techniques for learning what “f” is.

The problem that every one of them is trying to solve is exactly the same. It’s given “x” and “f of x,” can you derive what “f” is. That’s it.

So it’s definitely not a bunch of “if” statements.

It’s a collection of techniques that you can use, given a set of inputs and the outputs they map to, to figure out what function mapped those inputs to those outputs.

So that’s it at a high level. If you want me to elaborate, let me know.
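To make the contrast concrete, here is a minimal Python sketch, offered as an editorial illustration rather than anything Hammond or Bonsai wrote. It assumes “f” happens to be a straight line and recovers it from five (x, f of x) pairs using ordinary least squares. The names f_handwritten and learn_linear_f are invented for this example.

```python
# Traditional programming: the engineer writes f by hand.
def f_handwritten(x):
    return 3.0 * x + 2.0


# Machine learning: we are handed inputs and outputs and must recover f.
def learn_linear_f(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


xs = [0.0, 1.0, 2.0, 3.0, 4.0]          # the "x" we are given
ys = [f_handwritten(x) for x in xs]     # the "f of x" we are given
f_learned = learn_linear_f(xs, ys)      # the "f" we derive
```

Here f_learned(10.0) agrees with f_handwritten(10.0), even though learn_linear_f never saw the formula, only the samples. Real problems rarely come with a known straight line, which is why so many other techniques exist.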

Go ahead, talk a little more about that. Please do. Our readers are people who are spending their lives programming “f.”

O.K., so there are many ways you can think about how you derive “f.”

You can do simple linear regression. You could use Bayes’ theorem to learn the probabilities related to things; you can use genetic algorithms. There is a wealth of algorithmic paths to take. They all take fundamentally different approaches to solving the problem.

If your audience wants to dive into detail, Pedro Domingos has a really good overview book called “The Master Algorithm,” which details the major classes of approaches and the school of thought for each of them.
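As one small illustration of the Bayesian path, the sketch below applies Bayes’ theorem to update a probability after seeing a piece of evidence. It is an editorial example, not something from the interview; the function name posterior and the numbers passed to it are hypothetical.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E)."""
    p_evidence = (p_evidence_if_true * prior
                  + p_evidence_if_false * (1.0 - prior))
    return p_evidence_if_true * prior / p_evidence


# Hypothetical numbers: 1% of messages are spam, a flagged word shows up in
# 90% of spam and in 5% of legitimate mail.
p_spam_given_word = posterior(0.01, 0.9, 0.05)
```

With these made-up inputs the posterior comes out to 2/13, roughly 15 percent: the evidence raises the probability a great deal, but the low prior still dominates.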

If you want to dig a little deeper, then we think about neural networks, because a lot of the time when people talk about AI and machine learning right now, they’re specifically thinking of deep learning or neural networks, because this is all the rage. It’s the resurgence of everything old is new again. It’s the same neural networks as before with “deep” added to the front. And they’re interesting, as opposed to a probabilistic programming kind of model. Those are much more akin to what traditional programmers are used to thinking about in terms of programming: “I can break it up into these inference rules that I’m learning,” and so on, whereas neural networks are kind of this mysterious thing.

And the reason that that’s different is it is fundamentally not a Von Neumann architecture. And people are not used to thinking about non-Von Neumann architectures because you get taught about a Von Neumann architecture in year one of computer science school, and then you never think of it again because everything is derived from that.

A Von Neumann architecture is, succinctly, a processing unit, storage, and memory, and they work together. I’m being overly simplistic. In a neural network, computation and memory take place in the same place. They are not separate. Computation is memory and memory is computation. They are the same thing.

Mark Hammond. Photo by TC Currie.

And so it gets a little weird for thinking about how all of that works because it’s foreign. But on a technical level, that is why it’s not an “if” statement. It’s not an empirical, logical system.

There’s no processor that reads from some memory. The connections between any two computational elements ARE the memory. So it’s just a fundamentally different computing substrate than most people are used to thinking about.
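A single artificial neuron is enough to illustrate the point. The following Python sketch is an editorial illustration under simple assumptions, not Bonsai code: it trains one perceptron to reproduce logical AND from four examples. Nobody writes a rule for AND; after training, everything the unit “remembers” sits in its two connection weights and a bias. The names train_perceptron, predict and AND_SAMPLES are invented for this example.

```python
# Four (input, target) examples of logical AND; no rule for AND is coded.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def predict(weights, bias, x1, x2):
    """Fire (1) when the weighted sum of the inputs exceeds zero."""
    return int(weights[0] * x1 + weights[1] * x2 + bias > 0)


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Classic perceptron rule: nudge the weights toward each missed target."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - predict(weights, bias, x1, x2)
            weights[0] += learning_rate * error * x1
            weights[1] += learning_rate * error * x2
            bias += learning_rate * error
    return weights, bias


# The learned behavior is stored entirely in these numbers.
weights, bias = train_perceptron(AND_SAMPLES)
```

After training, predict(weights, bias, x1, x2) matches the target for all four inputs. A deep network is vastly larger, but the idea that the connections hold what was learned is the same.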

To be fair, if you ask a lot of people how it works, they will be able to tell you algorithmically how it works, but if you ask what does it DO? What are these neural units actually doing? It’s not well understood.

I started out doing PeopleSoft implementations for Fortune 50 companies and was accused at a party of putting people out of work by automating their jobs. The truth was more complex: departments were transformed, but jobs still needed to be done. Now I’m hearing the same thing about AI. There’s a study being released next week that says 80 percent of developers are worried that AI will increase unemployment. What would you say to those developers?

First of all, it’s not AI; it’s technology. Technology does this universally. We have the word “sabotage” because people threw wooden shoes into the machinery that was taking their jobs. Now to be fair, with AI, it seems to be happening faster than with other types of technology, so I can understand that it’s disconcerting. However, people’s imagination of what it can do is driven by hype more than reality. AI is much more going to augment what we do, instead of replacing what we do. Now, will there be people whose jobs will be automated? Yes. Jobs will be replaced, but it also opens up new opportunities, new jobs.

The kinds of technology that people are building on platforms like ours are more about business decision support than about automating away the analyst: give the analyst better information and help them do more, not get rid of the analyst. It’s augmented reality, not replaced reality.

Look at the airline industry. Autopilot has been around for a long time, but pilots are still necessary.

Autonomous vehicles are fun to pick on because they get so much hype. But look at trucking fleets and shipping. Highways are long and straight and boring, so suppose you can enable the movement of goods from coast to coast on a truck and enable the driver to operate more hours in a day. At the same time, the system helps them and, just like autopilot on a plane, gets them to engage when the situation calls for it. That’s not replacing them but enabling them to do more, providing a lot of value to the trucker, to the trucking company, to the economy as a whole, right?

So augmentation is something you’re going to see a lot more of before outright replacement. It’s not like we’re going to wake up tomorrow and the machines will have taken all our jobs. I understand where the fear is coming from, and it’s not really unfounded, but like all technology, expectations exceed reality, and there are leading indicators for what’s going to come. So I don’t worry about it too much.

My message to developers in particular: machine learning systems by design have to learn. Someone has to teach them. Your job is going to be to teach them.

The Bonsai platform.

In a previous interview with TNS, you said that in machine learning, you define the concepts to be learned, the curricula for that learning, whether the learning is adequate or appropriate, and how best to improve the learning. So you created your Inkling program to define the learning.

It defines the teaching. When you write Inkling code, at the end of the day you are writing what you are going to teach and how you are going to teach it. You are literally building pedagogy to feed to the engine. Then the engine uses the pedagogy to select the appropriate low-level mechanical learning models.

It’s best to think about [Inkling] as a database. When you write SQL code, you don’t worry about where the data will be stored or the data structure of the indices; those layers are abstracted away from you.

What you do worry about is the kind of business questions you want to ask of your data. We [Bonsai] do the same thing. It’s just about intelligence instead of persistent data storage.

It’s about learning versus teaching. It’s kind of funny; teaching has become a total afterthought. Everyone thinks “I’ll just throw more data at the problem.” But when you start thinking of the problem in terms of teaching, it becomes more intuitive for most people because we’ve been around children. When you’re around children, you don’t know how they’re learning, you don’t know how the brain is doing what it’s doing, and yet we’re still able to teach them the things we want them to learn.

We have an entire science and art around how to teach things effectively. Because the expertise and intelligence we actually have institutionalized in our companies and in ourselves are the things we are going to teach. It’s not “how do you do back-propagation on a deep learning neural network.” That’s like me asking you how the transistor logic works on a modern central processing unit. Do you worry about that when you’re writing your program? No. The whole point is that the compiler is going to deal with that for you.

From my perspective, AI is going to be the same way. We can make that a compile step, and have the developers focus on the teaching part and capture our subject matter expertise and domain knowledge that we want to teach.

And that’s a lot of work. A lot, a lot of work. We are not trying to make sentient machines. The goal is to have the tools be smarter to help you do whatever you want to do. You don’t want your fridge to dictate that you need to eat salad. That’s not the goal. The goal is just to have the tools be smarter and help you in trying to do what you want to do.

So building a system that is conscious or sentient such that it can run arbitrary things on its own — that’s not the goal.

So you’re not that spun up about the singularity?

No. I think that the beauty of the teaching-centric approach is you can teach an arbitrarily complex amount of information to the computer, which it can use to accomplish whatever objectives you have for it, and it never need be imbued with any of that stuff. We can completely avoid that problem.

Choosing to add consciousness to machines is a choice. And you don’t have to choose to do it. Ninety-nine percent of applications will never need to do that. So no, I’m not too worried about the Singularity.

Neo Technology founder Emil Eifrem is a graph enthusiast, asserting the human brain is the most intricate graph there is. How have your neuroscience studies fed into your work at Bonsai? What traits from the human brain are necessary for AI, and which traits are not necessary?

Interesting. There’s this idea that everything should be logically pure, mathematically correct and perfectly rational, but that leads to questions like, “O.K., you have this teaching-centric approach. Isn’t the AI supposed to learn everything it needs to know from first principles on its own?”

That’s mathematically correct, but humanity learned calculus from first principles and took many, many generations to do so, over thousands of years until Newton.

Do you want to be pragmatic about solving real problems that enterprises have right now, or do you want to be focused on the foundational components of the science? That’s the difference in perspective.

So when I sat down to solve the problem of how to make AI broadly accessible to programmers, I came at it from the angle that the one example we have of intelligence is the brain. And I realized the brain doesn’t work from first principles. There’s no reason that it should be different for a machine learning AI than for natural intelligence.

When I go to teach my son, who’s four, baseball, I do not throw fastballs at him. That’s intuitively stupid. He’s four, why would you do that to him? I put the Wiffle Ball on the tee. When he gets good at that, I set the Wiffle Ball to pop up. When he gets good at that, I throw underhand at close range. It’s the simulation, not the real thing. It’s intentionally fake in a way that is designed to be instructive.

Simulations can do that. Simulations can very easily present the edge cases you’re trying to teach about. You don’t have to go out in your self-driving car and drive miles and miles to get samples of these weird edge conditions. You can simply simulate them. Simulations are going to be a big part of how these systems get taught. Procedural generation and apprenticeship data, where the system is watching other people who already know how to do the thing: these are all techniques that are not data-driven and are just as important as, if not more important than, just throwing lots and lots of data into the system.
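The staged, simulated lessons described here can be sketched as a toy curriculum. The Python below is an editorial illustration only, not how Bonsai’s engine or Inkling works. It assumes a made-up learner that improves only on tasks within its reach; the lesson names, difficulty numbers, and the ToyLearner and attempts_needed names are all hypothetical.

```python
# Hypothetical lessons and difficulty levels, invented for this illustration.
CURRICULUM = [("tee", 1.0), ("pop-up", 2.0),
              ("underhand toss", 3.0), ("fastball", 4.0)]


class ToyLearner:
    """A made-up learner that only improves on tasks within its reach."""

    def __init__(self):
        self.skill = 0.0

    def practice(self, difficulty):
        """Run one simulated attempt and return True if it succeeds."""
        success = self.skill >= difficulty
        if difficulty - self.skill <= 1.0:  # close enough to learn from
            self.skill += 0.25
        return success


def attempts_needed(lessons, max_attempts_per_lesson=100):
    """Total attempts to pass every lesson, or None if one is never passed."""
    learner = ToyLearner()
    total = 0
    for _name, difficulty in lessons:
        for attempt in range(1, max_attempts_per_lesson + 1):
            if learner.practice(difficulty):
                total += attempt
                break
        else:
            return None  # the learner never succeeded at this lesson
    return total


with_curriculum = attempts_needed(CURRICULUM)
fastballs_only = attempts_needed([("fastball", 4.0)])
```

Under these assumptions the staged curriculum gets the learner to the fastball in a handful of attempts, while starting with fastballs alone never succeeds and returns None. The numbers are artificial; the point is only that the order and difficulty of lessons is something the teacher designs.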

Throwing fastballs at my son to teach him how to play baseball is the equivalent of me giving a machine a big-data data set and saying, “Hey, figure it out.”

I used to use the analogy of stock trading: people learning how to do technical trading analysis. I do not come to teach you that by dropping a stack of Wall Street Journals in front of you, circling points on charts and asking you to magically intuit what the heck we’re talking about.

And yet we have no qualms about doing that to computers and then wondering why we need ten million samples before they figure something out.

When you start to think about how to teach it, data, data and more data is not the right answer. You need the data at the end because that’s real life.
