The 26-year-old actor Kristen Stewart played Bella Swan, the protagonist in five “Twilight” vampire films. Vanity Fair listed her as the top-earning actress of 2010. And now, she is a co-author of a scholarly paper about machine learning.
Stewart is listed as the paper’s co-author, along with Bhautik J. Joshi, a research engineer at Adobe, and David Shapiro, a producer at Starlight Studios, where Stewart made her directorial debut with a short subject called “Come Swim.” CNET described the film as “a trippy meditation that blends artistic vision with one of today’s most advanced computing techniques.”
And that’s where the machine learning came in…
Posted on Arxiv, “Bringing Impressionism to Life With Neural Style Transfer in Come Swim” details the thoughtful way that the filmmakers went about fine-tuning the results of an AI attempt at replicating a painting’s style in order to achieve a specific emotional effect.
“Neural Style Transfer” is the technical name for the process, and it’s a trick that’s already been performed with various tools, including the freely available Caffe deep learning framework and the TensorFlow library for machine learning. Machine-based evaluations of one style get mapped into specific stylistic effects for another image, and in this case, the painting they were trying to mimic was one of Stewart’s own.
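The article does not spell out how a network “evaluates” a style. In the commonly cited formulation by Gatys and colleagues, style is summarized by correlations between feature channels (a Gram matrix), which ignore where in the image a feature occurs. The sketch below illustrates that idea in plain Python. The function names, the toy feature values, and the normalization are assumptions for illustration and are not taken from the Come Swim paper.

```python
def gram_matrix(features):
    """Channel-by-channel correlations of a feature map.

    `features` is a list of channels, each a flat list of activations of
    equal length (one value per spatial position). Dividing by the number
    of positions is one common normalization; implementations differ.
    """
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(ci, cj)) / n for cj in features]
        for ci in features
    ]


def style_loss(features_a, features_b):
    """Mean squared difference between the two Gram matrices."""
    ga, gb = gram_matrix(features_a), gram_matrix(features_b)
    c = len(ga)
    return sum(
        (ga[i][j] - gb[i][j]) ** 2 for i in range(c) for j in range(c)
    ) / (c * c)


# Toy "feature maps": 2 channels, 3 spatial positions each (made-up values).
style = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]
# The same activations with the spatial positions reordered.
shuffled = [[3.0, 1.0, 2.0], [0.0, 0.0, 1.0]]
# A genuinely different set of activations.
other = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

print(style_loss(style, shuffled))  # same texture statistics, different layout
print(style_loss(style, other))     # different texture statistics
```

Reordering the spatial positions leaves the Gram matrix unchanged, which is why this kind of measure captures texture and brushwork rather than the layout of the picture.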
The paper describes the resulting film as “a poetic, impressionistic portrait of a heartbroken man underwater,” adding that it also simulates that twilight state between waking and dreaming when you first wake up.
Carrying out this idea, Stewart immediately faced a challenge: “The novelty of the technique gave a false sense of a high-quality result early on,” their paper explained. Since it’s a relatively new visual effect, “seeing images redrawn as paintings is compelling enough that nearly any result seems passable.” But the team kept carefully scrutinizing their results “to factor out novelty” and figure out whether it was really helping to move the film’s story along.
The paper summarizes what they learned along the way.
To get that special look, they first tried evaluating the original image with the googLeNet and vgg19 neural networks, and their paper shares the specifics of their experience. They settled on vgg16 because the “vgg19 execution times were too high” and “googLeNet did not give the aesthetics we were seeking.” To narrow down the broad range of variables, they first established a “texture transfer” by closely cropping their original “style” image, so they could focus on fine-tuning other things, and also tried to pin down the appropriate colors early on.
“Whilst the technique appears to deliver impressionism on demand, steering the technique in practice is difficult,” their paper explained.
The paper describes “a meaningful set of shortcuts” that fine-tunes that mapping to “a reduced but meaningful set of creative controls,” and shares what the authors learned about the optimal path through the process — from quality of the original style image to the number of iterations, and even the intensity of the style transfer.
One big takeaway from the paper is that the ratio of the style transferred to that in the original had to differ by at least one additional power of ten before there was enough of a difference “for meaningful exploration.” It even shares an equation to express it — ratio = 10 ** u — where u becomes “a useful measure of unrealness… a rough way to map how impressionistic the style transferred image looked.”
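To make the quoted relationship concrete, here is a minimal sketch that evaluates ratio = 10 ** u for a few values of u and inverts it. The function names are hypothetical, and only the formula itself comes from the article.

```python
import math


def style_ratio(u):
    """Style-transfer ratio for an 'unrealness' value u, using ratio = 10 ** u."""
    return 10.0 ** u


def unrealness(ratio):
    """Inverse mapping: recover u from a ratio (the ratio must be positive)."""
    return math.log10(ratio)


# Each step of 1 in u multiplies the ratio by ten.
for u in (1, 2, 3, 4):
    print(u, style_ratio(u))
```

Working with u instead of the raw ratio turns a range that spans several orders of magnitude into a short linear scale, which fits the paper's description of u as a rough single control for how impressionistic a frame looks.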
It sounds like an arduous computational process. To get more powerful GPU hardware, the renderers switched to Amazon’s Elastic Compute Cloud (EC2), optimized the images, and “ended up with a compute time of around 40 minutes per frame per instance used.” And their paper shares their authoritative, based-on-experience conclusion: “Far from being automatic, Neural Style Transfer requires many creative iterations when trying to work towards a specific look for a shot.”
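For a rough sense of what 40 minutes per frame per instance means in practice, the sketch below estimates wall-clock time for a shot. Only the 40-minute figure comes from the paper. The frame count, the instance counts, and the assumption that frames split evenly across instances are hypothetical.

```python
import math

MINUTES_PER_FRAME = 40  # figure quoted from the paper


def render_hours(num_frames, num_instances, minutes_per_frame=MINUTES_PER_FRAME):
    """Estimated wall-clock hours to render a shot.

    Assumes every instance renders its share of frames one after another
    and that frames are divided as evenly as possible (an idealization).
    """
    if num_frames < 0 or num_instances < 1:
        raise ValueError("need num_frames >= 0 and num_instances >= 1")
    frames_per_instance = math.ceil(num_frames / num_instances)
    return frames_per_instance * minutes_per_frame / 60.0


# Hypothetical shot: 240 frames (10 seconds at 24 frames per second).
print(render_hours(240, 1))   # a single instance
print(render_hours(240, 10))  # ten instances in parallel
```

Under these assumptions a ten-second shot takes 160 hours on one instance and 16 hours on ten, which helps explain why each creative iteration was costly.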
It’ll be intriguing to see what they came up with in the end. In one video interview, actor Josh Kaye described the resulting movie as “a lot to do with heartbreak and the emotions that go along with that. Half of the movie is sort of this surreal internal torturous kind of things that’s going on, and then you realize that it’s just a manifestation of the emotions that somebody would feel — like, you know, we’ve all been there — when you just lose something or someone.”
But the film appears to be as much a technical achievement as it is an artistic one. The Verge collected some of its favorite reactions from Twitter — including a few from an associate professor at Georgia Tech’s school of interactive computing, who marveled in one tweet that the paper’s co-author was “once seduced by vampires.”
— Mark Riedl (@mark_riedl) January 19, 2017
The author of the Keras neural networks library even suggested that in the Hollywood of today, this was the new normal.
To be someone in Hollywood, you've got to put your ML papers on Arxiv and you better use TensorFlow… https://t.co/2Rcg1ccJ36
— François Chollet (@fchollet) January 19, 2017
Digital Trends seemed to agree, calling the research paper by a top Hollywood actress “a reminder of how important artificial intelligence techniques are becoming to the creative process.” And The Verge expanded on the point, suggesting that this is just one glimmer of what’s coming in our AI-assisted future.
“The fact is that these machine learning tools, once thought of as esoteric and specialized, have become increasingly mainstream,” wrote Verge writer James Vincent. He points out that even Facebook is now experimenting with AI-powered art filters for grafting the style of a famous painter onto your Facebook uploads, and argues that Stewart’s paper shows how an “AI revolution” is being powered by a community using openly accessible tools.
But maybe there’s also another message: that for AI to really work, you need domain experts, such as an actress turned film director who understands the requirements of digital filmmaking. It’s all the more impressive because Stewart’s background in technology seems limited. According to one interview, Stewart attended public school only through junior high before switching to “an independent studies correspondence thing.” The Telegraph points out the paper was published through the Cornell University library, “but not yet peer reviewed.”
But if a 26-year-old actor can wield the power of machine learning, don’t the results speak for themselves? Whether we recognize it or not, the world may already be changing. “Look, even if it’s not making sense, clever people on social media are certainly impressed,” wrote The Telegraph, citing the associate professor from the school of interactive computing.
And one of the commenters on The Verge just seemed delighted to see Stewart expanding her repertoire to a scientific research paper entitled “Bringing Impressionism to Life With Neural Style Transfer.”
“Sounds like a better love story than Twilight.”
Feature image from the Kristen Stewart site.