Could an AI Generate the First Line of a Novel?
Are we approaching a day where machines can out-perform us in “human” fields like the creative arts? If you gave the proper training set to a sophisticated neural network, would it eventually be able to compose the first line of a novel?
Not quite yet, as it turns out.
Janelle Shane has already started exploring questions like this by fooling around with a neural network. Shane’s methodology is simple. “All I have to do is give the neural network a long list of examples and it will try its best to teach itself to generate more like them,” she explains in one post. And behind all the tomfoolery there’s some high-powered technology, she says in an interview with The Next Web.
I’m using an open-source neural network framework called char-rnn, written by Andrej Karpathy in Torch. There are all sorts of fancy approaches people use to make neural networks better at understanding and generating sentences and paragraphs. I’m using none of that, just a machine learning algorithm that learns to spell words letter by letter with no sense of what they mean.
TheNextWeb concluded, “So perhaps it isn’t the singularity.”
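To get a feel for what "letter by letter with no sense of what they mean" looks like in practice, here is a minimal sketch. It is not char-rnn and not a neural network at all: it swaps in a much simpler character-level Markov chain, which only counts which character tends to follow each short run of characters. The function names, the `order` parameter, and the toy training lines are all assumptions made for illustration.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # hypothetical padding/end markers; assumes they never appear in the data


def train_char_model(lines, order=3):
    """Map every `order`-character context to the characters seen right after it."""
    model = defaultdict(list)
    for line in lines:
        padded = START * order + line + END
        for i in range(len(padded) - order):
            context = padded[i:i + order]
            model[context].append(padded[i + order])
    return model


def generate_line(model, order=3, max_len=200, rng=None):
    """Build a line one character at a time by sampling from the learned contexts."""
    rng = rng or random.Random()
    context = START * order
    out = []
    while len(out) < max_len:
        next_char = rng.choice(model[context])
        if next_char == END:
            break
        out.append(next_char)
        context = (context + next_char)[-order:]
    return "".join(out)


# Toy training data for illustration only; a real run would need thousands of lines.
toy_lines = [
    "The sky was dead.",
    "The night was over.",
    "The sun was coming.",
    "I am not a king.",
]
toy_model = train_char_model(toy_lines, order=3)
print(generate_line(toy_model, order=3, rng=random.Random(1)))
```

With so few training lines, a sketch like this mostly replays or splices its inputs, which is the same failure the article describes when the dataset is too small.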
But part of the fun is picking something that seems uniquely human. “Usually in my experiment, I give the neural network an unfair dataset — like paint colors,” Shane writes, “and it tries its best, but ends up with something unintentionally weird, like a brownish color called Stanky Bean, or a bright blue color called Dad.”
And of course, the other half of the fun is sharing the ridiculous results on her blog. For example, given a dataset of 37,265 fish names, the neural network tried its hardest to generate its own. And last week Shane trained it on the first lines from 224 Disney songs. But this month Shane turned her attention to one of the most human endeavors of all: writing a novel. “When you’re faced with a blank page, sometimes it’s just hard to get started,” Shane joked on her web page. “I wanted to see if I could train a computer program to help.”
“Turns out, she couldn’t,” quipped Geek.com.
For starters, it was hard to find enough data for training the neural network. After searching for first lines from novels, Shane only found a few hundred, when “ideally, I need thousands.” With so little data, the neural network suggests first lines that are nearly identical to its training data, or clumsily pastes two together with a few oddball tweaks. At one point it suggested a novel that started with Snoopy’s favorite opening clause but ended with the first line of James Joyce’s “Ulysses,” adding in a few random words for good measure.
"It was a dark and stormy night; the rain fell in torrents — except the station steps; plump Buck Mulligan came from the stairhead, bearing a bowl of people."
In “Ulysses,” Buck Mulligan bears “a bowl of lather on which a mirror and a razor lay crossed.” (Although admittedly it’s much more interesting if the bowl is filled with people.) But Shane wasn’t satisfied with the novel openings generated by the neural network. “Most didn’t make much sense, and/or were obvious mishmashes of famous lines.”
Though a few of them were oddly poignant.
"It was a wrong number that struggled against the darkness."
"The sky above the present century had reached the snapping point."
"The snow is gone sometime, and you said, Why, and I said, To be with the darkness."
Weeks later Shane admitted that “It didn’t go so well.” She’d found one more site offering an additional 900 first sentences for novels: The Bulwer-Lytton Fiction Contest challenges competitors to compose the worst possible first line for a novel. (For example, “This is a tale of love, pain, loss, and redemption — and of a baboon, Amelia.”)
Shane’s verdict? “It didn’t help.” She ended up with novel openings like….
"It was a dark and stormy night and the secret being a silver-backed gorilla."
In desperation, she turned to the internet — or at least, to the community of her site’s readers and her followers on Twitter. Yes, she asked strangers on the internet to send in the first lines from their favorite novels — or even from amateur novels they’d written themselves. Could her neural network craft a convincing opening for a novel from a crowdsourced dataset contributed by human volunteers?
awesome, thanks so much! Up to 175 entries already.
— Janelle Shane (@JanelleCShane) November 2, 2017
Or would online humans simply overwhelm it with their anomalous affinity for science fiction?
too late. Already got 3 Neuromancers. I can filter out duplicates later though
— Janelle Shane (@JanelleCShane) November 2, 2017
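Filtering out duplicates from crowdsourced submissions is simple to sketch. The version below is an assumption about how one might do it, not Shane's actual cleanup: it treats two submissions as the same line if they match after lowercasing, collapsing whitespace, and stripping surrounding quotation marks. The helper names `normalize` and `dedupe_lines` are hypothetical.

```python
import re


def normalize(line):
    """Collapse whitespace, strip surrounding quotes and spaces, and lowercase."""
    collapsed = re.sub(r"\s+", " ", line).strip()
    return collapsed.strip("\"'“”‘’ ").lower()


def dedupe_lines(lines):
    """Keep the first copy of each submitted line, comparing normalized forms."""
    seen = set()
    unique = []
    for line in lines:
        key = normalize(line)
        if key and key not in seen:
            seen.add(key)
            unique.append(line.strip())
    return unique


submissions = [
    "It was a dark and stormy night.",
    "  it was a dark   and stormy night. ",
    '"It was a dark and stormy night."',
    "The sky was dead.",
]
print(dedupe_lines(submissions))
```

Near-duplicates that differ in punctuation or wording would slip through this exact-match approach, so a real cleanup might need fuzzier matching.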
Shane waited patiently until the end of the month, even spending the week before Thanksgiving on an entirely different experiment — training her AI network to generate names of pies, based on the names of 2,237 pie recipes. But soon it was back to AI-generated novels, or at least, collecting enough data to give her neural network a sufficient training set.
Fortunately, at this stage, she got a little help from her friends:
— Electric Literature (@ElectricLit) November 3, 2017
And soon, helpful submissions started pouring in….
Ha, someone did just submit a bunch of kid's books.
9661 entries now! pic.twitter.com/Gf1vqlDRf7
— Janelle Shane (@JanelleCShane) November 22, 2017
Meanwhile, another throng was gathering on a different part of the internet. On Thursday, at the stroke of midnight, 384,126 aspiring novel-writers marked the end of “National Novel Writing Month,” the 18th annual attempt to crank out a quickie novel in 30 days using nothing but their human wits. Shane chose that same momentous day to finally reveal the results of her own experiment. Her readers had come through with an additional 11,135 first lines from novels.
“folks, you have made me and the neural network so very happy.”
Combining the user-submitted first lines with the 900 that she already had, Shane created a robust training dataset with 10,096 first lines of novels. Sounding more upbeat, she wrote that “The first results showed, if not promise, then at least evidence of the high number of ‘My Little Pony’ stories in the dataset.”
"...an of the the all stood ponyville at es that ev the."
But first results are always a little rough, and Shane reported that after training longer the neural network “soon made some improvement, and once in a while would produce a grammatically correct sentence as long as it was very short.”
"It was an hour of the night."
"I know they are from the mountain."
And it became eerily more human after a few more iterations through its training set. Or at least, as Shane writes, “It learned eventually how to begin a book by talking about the weather — although not always successfully.”
"The night is like a wounded carpets from the Crumzon."
"The sky was dead."
The training continued and continued, and eventually, Shane reports, many of the lines “almost made sense…especially the shortest ones.”
"I am not a king."
"The sky has gone."
"The sun was coming."
"The night was over."
By the end of the experiment, she’d concluded that some of the AI-generated first lines “were actually rather intriguing. I might read these books.”
"The silence was unlike a place."
"I am forced to write to my neighbors about the beast."
"The sun was probably for his wife."
"I am a story that was not a truth."
But along the way, she’d made an interesting discovery: she wasn’t the only one who was trying to use technology to write fiction. For the last few years National Novel Writing Month has been accompanied by an overlapping event: “National Novel Generating Month.” A community of geeks has been sharing their computer-generated novels on a GitHub repository, and Shane couldn’t resist joining in the fun.
So, in the end, her post about computer-generated lines for novels ended with a plot twist of its own. “This project is my entry for National Novel Generating Month, which means that I generated 140,000 words’ worth of first lines…”
And surprisingly, some parts of it are actually readable.
On the end of April, that was a large little boy before the mast. The book was nothing but stars, and the earth was hard to breathe. The day was enveloped in the year 1874.
Stephen King clearly wrote this. And he was on potent psychedelics when he did so.
— Phil Atkin (@RamonesKaraoke) November 30, 2017
Shane (and her community of followers) could now finally witness the results of their unholy collaboration.
Oooh you must be getting these from the big dataset! I haven't looked through it yet. These are nice gems – share more if you find them!
— Janelle Shane (@JanelleCShane) November 30, 2017
So what happens when you combine machine learning technology, crowdsourcing, and a human sense of humor?
Can't argue with this:
The first day of the world was a mistake.
— Dr Headgear (@DrHeadgear) November 30, 2017
When challenged to write what it could never fully understand, the neural network still came back with some delightful surprises.
I am a decent man who had been stranded to be in the last seat.
— Dr Headgear (@DrHeadgear) November 30, 2017
Of course, there were a few glitches, but they were apparently caused by human pranksters.
So sand gets everywhere? Was it coarse and irritating too?
— Practical Tim (@PracticalPeng) November 30, 2017
The experiment appeared to be a success, even if it didn’t lead humanity to the perfect machine-generated novel.
Because at least it gave us poor little ponies, lost in the darkness, that most human gift of all.
Feature image via Pixabay.