What happens when machines dream? It’s a question that has been explored by science fiction authors and artificial intelligence experts alike, and thanks to Google’s engineers, we now have insights into what a computer brain’s daydreams might actually look like, as seen in the trippy, psychedelic images that have recently been making the rounds on the Web. Google has since open-sourced DeepDream, its code for spawning these images, and people are running with it, turning mundane photos into hallucinatory works of art, or horror, depending on one’s perspective.
Deep Dreams of an Artificial Neural Network
Produced by Google’s artificial neural network (ANN) for image recognition, these wildly imaginative visuals come from a system that is essentially a stack of statistical learning models, powered by deceptively simple algorithms loosely modelled on biological neurons. Researchers have “trained” these networks by feeding them millions of images and gradually adjusting the network’s parameters until it gives the desired classifications.
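To make "gradually adjusting the network’s parameters" concrete, here is a minimal sketch of training a single artificial neuron by gradient descent. It is an illustration only, not Google’s training code: the toy two-pixel "images", the labels, and the names `train` and `predict` are all assumptions made for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, pixels):
    """Probability that the input belongs to class 1."""
    return sigmoid(sum(w * p for w, p in zip(weights, pixels)) + bias)

def train(examples, labels, steps=2000, learning_rate=0.5):
    """Fit one neuron by repeatedly nudging its parameters to reduce the error."""
    weights = [0.0] * len(examples[0])
    bias = 0.0
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for pixels, label in zip(examples, labels):
            error = predict(weights, bias, pixels) - label  # log-loss gradient
            for i, p in enumerate(pixels):
                grad_w[i] += error * p
            grad_b += error
        # Small step against the averaged gradient.
        weights = [w - learning_rate * g / len(examples)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(examples)
    return weights, bias

# Hypothetical toy data: two pixel intensities per "image"; 1 = bright, 0 = dark.
EXAMPLES = [[0.9, 0.8], [0.8, 0.95], [0.1, 0.2], [0.2, 0.05]]
LABELS = [1, 1, 0, 0]

weights, bias = train(EXAMPLES, LABELS)
print(predict(weights, bias, [0.9, 0.9]))  # should lean towards 1
print(predict(weights, bias, [0.1, 0.1]))  # should lean towards 0
```

A real image classifier has millions of such parameters spread over many layers, but the loop is the same in spirit: predict, measure the error, adjust slightly, repeat.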
One can imagine it almost as a stacked sieve for information: these neural networks consist of 10 to 30 interconnected layers of artificial neurons, with some designated as “input,” “output” and intermediate “hidden” layers (here, “deep learning neural networks” refers to systems with five or more layers). The lower input layers interpret basic features, like edges or corners — analogous to an infant’s ability to perceive the fuzzy edges of a familiar face — while the intermediate layers take these basic interpretations and look for overall shapes. The output layers then assemble these into a final interpretation, an “answer” delivered by neurons that determine whether the image best depicts a house, an animal, or a fruit.
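The stacked-sieve picture can be sketched in a few lines. The example below is a toy under stated assumptions: the hand-picked weights, the two class names, and the helper names `dense`, `relu`, `softmax` and `forward` are invented for illustration, and a real network would have far more layers and neurons.

```python
import math

def dense(weights, biases, inputs):
    """One layer of artificial neurons: a weighted sum per neuron plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(values):
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(layers, pixels):
    """Return the activations of every layer, from the input to the final answer."""
    activations = [pixels]
    for index, (weights, biases) in enumerate(layers):
        summed = dense(weights, biases, activations[-1])
        is_output = index == len(layers) - 1
        activations.append(softmax(summed) if is_output else relu(summed))
    return activations

# Hypothetical hand-picked weights: 3 pixels -> 2 hidden neurons -> 2 classes.
LAYERS = [
    ([[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]], [0.0, 0.0]),  # hidden layer
    ([[2.0, -1.0], [-1.0, 2.0]], [0.0, 0.0]),            # output layer
]
CLASS_NAMES = ["house", "fruit"]

activations = forward(LAYERS, [0.9, 0.2, 0.1])
scores = activations[-1]
answer = CLASS_NAMES[scores.index(max(scores))]
print(answer)
```

Keeping every layer’s activations, rather than only the final answer, is a deliberate choice here: those intermediate values are exactly what one would later inspect to see what each layer responds to.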
Due to the non-linearity of these frameworks, the process of how neural nets actually arrive at such dreamlike outputs is still a bit of a mystery to researchers, though there are now some tools available to help decipher this. According to the Google Research blog: “One of the challenges of neural networks is understanding what exactly goes on at each layer. We know that after training, each layer progressively extracts higher and higher-level features of the image, until the final layer essentially makes a decision on what the image shows.”
A Reverse Hallucination Technique
So the researchers decided to reverse the process, to “turn it upside down” in order to better visualize the network’s inner workings. By giving it free rein and asking it to interpret and “enhance an input image in such a way as to elicit a particular interpretation,” they were hoping to get more insight into which features the trained networks had actually learned and which they had not.
What happened next was striking: the researchers found that not only could these neural networks discern between different images, they had plenty of information to generate images too, resulting in these surprising computational representations. For example, in response to the team’s queries for ordinary things like ants, bananas, starfish and so on, the network produced these rather unorthodox images.
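One way to picture "turning the network upside down" is to freeze the classifier and adjust the pixels instead of the weights, climbing towards whatever the network considers a given class. The sketch below assumes a tiny, hypothetical linear classifier over a four-pixel image; `CLASS_WEIGHTS`, `class_probabilities` and `enhance` are illustrative names, not part of the DeepDream code.

```python
import math

# Hypothetical frozen classifier: one row of weights per class, four pixels.
CLASS_WEIGHTS = {
    "banana": [1.0, -0.5, 0.8, -0.2],
    "ant": [-0.7, 0.9, -0.3, 0.6],
}

def class_probabilities(pixels):
    """Softmax over the linear score of each class."""
    logits = {name: sum(w * p for w, p in zip(row, pixels))
              for name, row in CLASS_WEIGHTS.items()}
    top = max(logits.values())
    exps = {name: math.exp(v - top) for name, v in logits.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}

def enhance(pixels, target, steps=50, step_size=0.1):
    """Gradient ascent on the image so the target class becomes more likely."""
    pixels = list(pixels)
    for _ in range(steps):
        probs = class_probabilities(pixels)
        for i in range(len(pixels)):
            # Derivative of log p(target) with respect to pixel i.
            expected = sum(probs[name] * CLASS_WEIGHTS[name][i]
                           for name in CLASS_WEIGHTS)
            gradient = CLASS_WEIGHTS[target][i] - expected
            # Keep pixel intensities in the valid range [0, 1].
            pixels[i] = min(1.0, max(0.0, pixels[i] + step_size * gradient))
    return pixels

start = [0.5, 0.5, 0.5, 0.5]
dreamed = enhance(start, "banana")
print(class_probabilities(start)["banana"], class_probabilities(dreamed)["banana"])
```

The weights never change in this loop; only the image does. With a deep network the gradient would be computed by backpropagation rather than by hand, but the direction of travel is the same.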
Feeding images into lower layers generated interpretations with soft, curvilinear forms, as these lower neural layers are concerned with identifying edges and corners. The team explains: “Each layer of the network deals with features at a different level of abstraction, so the complexity of features we generate depends on which layer we choose to enhance.”
Probing further, when images were fed into the higher-level layers, where more abstraction occurs, more detailed and unexpected results emerged, especially when the team asked the network: “Whatever you see there, I want more of it!”
This creates a feedback loop: if a cloud looks a little bit like a bird, the network will make it look more like a bird. This in turn will make the network recognize the bird even more strongly on the next pass and so forth, until a highly detailed bird appears, seemingly out of nowhere.
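The feedback loop can be sketched as gradient ascent on the strength of one layer’s activations: detectors that already fire a little pull the image hardest, so they fire more on the next pass. This is a simplified stand-in rather than the published implementation; the two feature detectors, the three-pixel "cloud" and the function names are assumptions made for the example.

```python
# Hypothetical feature detectors in one layer, each a row of pixel weights.
DETECTOR_NAMES = ["bird", "tower"]
DETECTORS = [
    [1.0, 0.2, -0.5],   # "bird"
    [-0.4, 1.0, 0.3],   # "tower"
]

def layer_activations(pixels):
    """ReLU response of each detector to the image."""
    return [max(0.0, sum(w * p for w, p in zip(row, pixels)))
            for row in DETECTORS]

def dream_step(pixels, step_size=0.1):
    """One pass of 'whatever you see there, I want more of it'."""
    activations = layer_activations(pixels)
    # Gradient of 0.5 * sum(activation^2) with respect to each pixel.
    gradient = [sum(a * row[i] for a, row in zip(activations, DETECTORS))
                for i in range(len(pixels))]
    scale = max(abs(g) for g in gradient) or 1.0  # normalise the step size
    return [p + step_size * g / scale for p, g in zip(pixels, gradient)]

cloud = [0.3, 0.1, 0.2]  # looks only faintly like a bird
image = cloud
for _ in range(20):
    image = dream_step(image)

before = dict(zip(DETECTOR_NAMES, layer_activations(cloud)))
after = dict(zip(DETECTOR_NAMES, layer_activations(image)))
print(before, after)
```

Because each step is driven by the current activations, a small initial lead for the "bird" detector compounds over the passes, which is the runaway effect the paragraph above describes.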
With this reverse hallucination technique that the team is dubbing “Inceptionism” — a film-inspired reference to the deep neural network’s efficient “architecture for computer vision” — the network created unanticipated results: trees becoming crystalline architectures, leaves translated into magical birds and insects. Essentially, these “over-interpretations” are an abstracted, fractalized fusion of previously learned features, produced by this feedback loop. Even more compelling were the incredibly rich landscapes that could be generated from an image initially filled only with random noise, by iteratively applying the algorithm again and again on successive versions of the original image.
It’s a kind of massive, data-driven pareidolia that companies like Google are uniquely positioned to lead, since large amounts of data are needed to train large neural nets, and if anyone has access to both huge amounts of data and unparalleled computational power, it would be Google. Though they look amazing, these evocative images elicit more questions than answers. For one, they show how easily deep neural networks can be fooled; on the flip side, these complex images also demonstrate how much remains unknown about these emergent neural networks. More profoundly, they also point to how little we know about the cognitive complexities of vision, and about the human brain and the creative process itself.
The next question would be how to develop these deep neural networks with more unsupervised and automated approaches to processing raw data, building on a base stack of artificial cognitive abilities such as visual recognition and natural language processing. Beyond that, we step into the mind-blowing realm of quantum machine learning, where quantum neural networks would, in theory, be able to process one and zero states simultaneously, thus allowing them to ‘see’ the big picture.
Researchers postulate that dreams are a risk-free way of learning, an adaptive mechanism that helped drive hominid evolution to increasing levels of complexity. Could it be the same for machines? It’s difficult to say. But what is certain is that the reality presented by these images is at once exhilarating and troubling; the current fallibility of machine intelligence means that our increasing reliance on it will no doubt have unforeseen consequences, perhaps in wars where intelligent killer machines could run amok, as some experts are warning. At this moment though, these images are weirdly fascinating by themselves, and you can make your own by downloading the DeepDream code from GitHub, or uploading an image to Psychic VR Lab, or checking out the Twitter feed for #deepdream. Read more over at Google Research.