Google DeepMind Psychlab Assesses AI with Cognitive Psychology

Research into artificial intelligence in recent years has focused on developing AI that is generally intelligent: rather than specializing in one specific task, such as playing a single kind of game, a generally intelligent agent can learn and acquire a wide variety of skills, much as a human would.
But even the simplest of tasks can involve a number of cognitive functions. So even as AI research takes more cues from how human brains work to develop new algorithms and architectures, it is still not always clear which cognitive skills an AI agent is actually using when it completes a task successfully.
To help identify how these cognitive components come into play, Google’s AI research lab DeepMind recently released an open source toolkit that lets developers study the behavior of AI agents in a controlled environment, much as cognitive psychologists use tests devised to study human behavioral processes such as attention, perception, memory, thinking, creativity and problem solving.
According to a post on their website, DeepMind’s Psychlab platform is built on DeepMind Lab, a customizable, first-person, simulated 3D environment for training and testing autonomous AI agents in various tasks.
Much as human psychology experiments are run under controlled conditions in a clinical setting, Psychlab establishes an equivalent framework within DeepMind Lab’s virtual environment for testing the cognitive abilities of AI agents alongside human subjects.
“This usually consists of a participant sitting in front of a computer monitor using a mouse to respond to the onscreen task,” explained DeepMind researcher Joel Leibo. “Similarly, our environment allows a virtual subject to perform tasks on a virtual computer monitor, using the direction of its gaze to respond. This allows humans and artificial agents to both take the same tests, minimizing experimental differences. It also makes it easier to connect with the existing literature in cognitive psychology and draw insights from it.”
Some of the “classic experimental tasks” Psychlab offers include: visual search (searching an array of items for a target); continuous recognition (memory for a growing list of items); arbitrary visuomotor mapping (recall of stimulus-response pairings); change detection (detecting changes in an array of objects that reappears after a delay); visual acuity and contrast sensitivity (identifying small and low-contrast stimuli); Glass pattern detection (global form perception); random dot motion discrimination (perceiving coherent motion); and multiple object tracking (tracking moving objects over time).
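For readers who want to try one of these tasks, below is a minimal sketch of what loading it through DeepMind Lab’s Python API might look like. The level path (“contributed/psychlab/visual_search”) and the observation name are assumptions based on the open source repository layout, so check the deepmind/lab documentation for the exact identifiers.

```python
# Minimal sketch (not DeepMind's own example code) of loading a Psychlab task
# through DeepMind Lab's Python API. The level path and observation name are
# assumptions; consult the deepmind/lab repository for the exact identifiers.
import numpy as np
import deepmind_lab

env = deepmind_lab.Lab(
    level='contributed/psychlab/visual_search',   # assumed Psychlab level path
    observations=['RGB_INTERLEAVED'],             # the agent's view of the virtual monitor
    config={'width': '640', 'height': '480', 'fps': '60'},
)
env.reset()

# Psychlab agents respond by shifting their gaze, so the relevant part of the
# standard DeepMind Lab action vector is the look-left/right component.
spec = env.action_spec()
action = np.zeros(len(spec), dtype=np.intc)
look = next(i for i, a in enumerate(spec) if a['name'] == 'LOOK_LEFT_RIGHT_PIXELS_PER_FRAME')
action[look] = 10  # rotate the view a few pixels to the right each step

while env.is_running():
    reward = env.step(action, num_steps=1)
    frame = env.observations()['RGB_INTERLEAVED']  # pixels of the virtual monitor
```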
Bringing these experimental methods over to AI research has some big advantages. “Psychlab makes possible ways of analyzing experimental data that are common in psychology but relatively unknown in AI research,” wrote the paper’s authors. “For example, we describe methods for measuring psychometric functions, detection thresholds, and reaction times for artificial agents that can be directly compared to those of humans.”
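As an illustration of that kind of analysis, the sketch below fits a logistic psychometric function to per-contrast accuracy and reads off a detection threshold. The function form, the SciPy fitting routine and the placeholder numbers are choices made here for illustration, not anything prescribed by the Psychlab paper.

```python
# Illustrative sketch: fit a logistic psychometric function to an agent's
# accuracy at each stimulus contrast and estimate a detection threshold.
import numpy as np
from scipy.optimize import curve_fit

# Placeholder data purely for illustration: contrast levels and the fraction
# of trials answered correctly at each level (replace with real trial logs).
contrast = np.array([0.01, 0.02, 0.04, 0.08, 0.16, 0.32])
p_correct = np.array([0.52, 0.55, 0.68, 0.83, 0.95, 0.99])

def psychometric(x, threshold, slope):
    """Logistic psychometric function rising from chance (0.5) toward 1.0."""
    return 0.5 + 0.5 / (1.0 + np.exp(-slope * (np.log(x) - np.log(threshold))))

# Fit threshold and slope; the threshold is the contrast giving 75% accuracy.
(threshold, slope), _ = curve_fit(psychometric, contrast, p_correct, p0=[0.05, 2.0])
print(f"estimated detection threshold: {threshold:.3f} contrast")
```

The same threshold estimate can then be computed for a human participant and an artificial agent alike, which is exactly the kind of direct comparison the authors highlight.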
In addition, human test subjects produced results in the virtual DeepMind Lab environment consistent with those from real-world trials. The visual search test, for instance, measures selective attention by having the participant locate one particular object among many others. The team found that humans completed the task in roughly the same amount of time in both the real and virtual settings when the target differed from the rest in only one way, such as the one differently colored bar or the one bar oriented differently. Human reaction times increased, however, when the target differed in more than one way, such as one pink bar hidden in a set of bars of different colors and shapes. The AI agents, by comparison, completed the various visual search tasks in the same amount of time regardless of how many features set the target apart.
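To make that comparison concrete, here is a small illustrative sketch of the kind of summary such an experiment produces: mean reaction time per subject type and per search condition. The trial records are hypothetical placeholders, not numbers from DeepMind’s experiments.

```python
# Illustrative sketch: summarize reaction times by subject type and search
# condition. The records below are hypothetical placeholders standing in for
# whatever logging a Psychlab experiment produces.
from collections import defaultdict
from statistics import mean

trials = [
    # (subject, condition, reaction_time_ms)
    ("human", "one feature", 420), ("human", "one feature", 450),
    ("human", "two features", 560), ("human", "two features", 610),
    ("agent", "one feature", 300), ("agent", "one feature", 310),
    ("agent", "two features", 305), ("agent", "two features", 315),
]

by_group = defaultdict(list)
for subject, condition, rt in trials:
    by_group[(subject, condition)].append(rt)

for (subject, condition), rts in sorted(by_group.items()):
    print(f"{subject:5s} | {condition:12s} | mean RT {mean(rts):.0f} ms")
```

A human slowdown on the harder condition against a roughly flat agent reaction time is the divergence described above and shown in the figure below.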

Difference in reaction times between humans and artificial agents on the visual search task in Psychlab.
According to the team, such findings show that certain cognitive functions in AI agents work differently than in humans, and will no doubt help to inform how AI is engineered in the future. With AI research increasingly drawing inspiration from other disciplines such as neuroscience, developments such as this are a step toward creating AI that not only learns like a human, but potentially thinks and behaves like one too.
Others can build their own cognitive tasks for their artificial agents to perform. DeepMind’s open source, “flexible and easy-to-learn” API for Psychlab can be found on GitHub, along with the accompanying research paper.
Images: DeepMind.