Games Data Play: The SQL Murder Mystery

5 Jan 2020 6:00am, by

Image from SQL Murder Mystery site

Last month found web surfers puzzling over an online game that lets them play detective using SQL queries. “There’s been a Murder in SQL City!” the page begins, describing itself as “both a self-directed lesson” and “a fun game.”

The SQL Murder Mystery site explains its simple premise: “The detective gave you the crime scene report, but you somehow lost it…” But at least you remember when and where the crime took place — on January 15, in SQL City — and every clue for the mystery is somewhere in the game’s virtual SQLite database.

The SQL Murder Mystery is a manifestation of the new kinds of culture emerging in our technology-enhanced world, one of those rare moments that combines how we work with how we play. But it also shows how projects in this world thrive by attracting new people, to play, to participate, and even to help move the project forward.

And behind it all is a serious purpose: to prepare the data workers of the future for the challenges ahead.

The Game of Data

The game helpfully gets you started by pointing out there’s an easy command to get a list of all the database’s tables.

SELECT name
FROM sqlite_master
WHERE type = 'table'

Soon it becomes clear that there’s a table named “crime_scene_report,” and there are lots of ways to proceed from there. The web page suggests that selecting the sql column from sqlite_master could show the structure of those individual tables, although I had a lot more fun by just displaying the entire table:

SELECT *
FROM crime_scene_report
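
For readers who want to try the structure lookup outside the browser, here is a minimal sketch using Python's built-in sqlite3 module. It builds a tiny in-memory stand-in rather than the game's real database, and the column list for crime_scene_report is an assumption made for illustration. What it relies on is that SQLite keeps each table's CREATE statement in the sql column of sqlite_master.

```python
import sqlite3

# In-memory stand-in for the game's database.
# The columns below are an assumption, not the game's confirmed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE crime_scene_report "
    "(date INTEGER, type TEXT, description TEXT, city TEXT)"
)

# sqlite_master stores the original CREATE statement in its "sql" column,
# so selecting it reveals the table's structure.
row = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
    ("crime_scene_report",),
).fetchone()
schema_sql = row[0]
print(schema_sql)
```

The printed CREATE statement lists the column names and types, which is usually enough to decide what to filter on next.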

“The rest is up to you!” the page says encouragingly. The interviews with the suspects suggest someone with a special gold membership at the local gym — and fortunately, you’ve also got access to the gym’s check-in database. And the queries, of course, return lots of results, incentivizing detectives to learn how to perform more filtering on their results.
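
The kind of filtering the game nudges players toward can be sketched roughly as follows. Everything here is hypothetical: the gym_member and gym_check_in tables, their columns, the sample rows, and the integer date encoding are all invented for illustration and are not taken from the game. The sketch narrows the crime reports with a WHERE clause, then joins members to check-ins to find gold-status visitors.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables and rows; names, columns, and date format are assumptions.
conn.executescript("""
CREATE TABLE crime_scene_report (date INTEGER, type TEXT, description TEXT, city TEXT);
INSERT INTO crime_scene_report VALUES
  (20180115, 'murder', 'Hypothetical report text.', 'SQL City'),
  (20180115, 'theft',  'Unrelated report.',         'SQL City'),
  (20180115, 'murder', 'Different city.',           'Another City');

CREATE TABLE gym_member (id TEXT, name TEXT, membership_status TEXT);
CREATE TABLE gym_check_in (membership_id TEXT, check_in_date INTEGER);
INSERT INTO gym_member VALUES ('A1', 'Member One', 'gold'), ('B2', 'Member Two', 'silver');
INSERT INTO gym_check_in VALUES ('A1', 20180109), ('B2', 20180109);
""")

# Narrow the reports to one crime type, city, and date with a WHERE clause.
reports = conn.execute(
    "SELECT description FROM crime_scene_report "
    "WHERE type = 'murder' AND city = 'SQL City' AND date = 20180115"
).fetchall()

# Join members to their check-ins and keep only gold-status members.
gold_visitors = conn.execute(
    "SELECT m.name FROM gym_member AS m "
    "JOIN gym_check_in AS c ON c.membership_id = m.id "
    "WHERE m.membership_status = 'gold'"
).fetchall()

print(reports)
print(gold_visitors)
```

Each extra condition in the WHERE clause trims the result set, which is the habit the game is designed to build.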

But just as interesting as the game is the story of its creation. Northwestern University’s School of Journalism wants to train “a new generation of multimedia journalists,” according to its website. The school’s Knight Lab is its collaboration with the college’s school of engineering to further “new media innovation” through “exploration and experimentation.” The SQL Murder Mystery was created by Joon Park and Cathy He, two fellows at the Knight Lab.

“We bring together students and educators from diverse areas of the university interested in how journalism and the broader media world are evolving in the midst of very rapid changes in how information is gathered, presented, and delivered,” explained Joe Germuska, who runs both the lab’s technology and its professional staff — as well as the student fellows. The school’s website gives his title as Chief Nerd.

“In classes and extracurricular activities, we try to introduce students to key issues and core technologies, to prepare them to go out and participate in making those changes and advancing the field,” he said.

The lab had a higher purpose for teaching people how to use relational databases. “First, they are still at the heart of many web-based applications, especially content-management systems,” says Germuska. “Secondly, they’re a powerful tool for managing data for analysis, such as for data journalism.

“It’s also the case that SQL is actually quite approachable for many people, if they have a motivation to get some exposure and some practice.”

And that’s where the SQL Murder Mystery comes in. As the lifestyle site Lifehacker describes it, “it works as a puzzle because you have a question compelling you to dive through the data.”

Like most projects, this one evolved gradually over time. Park and He finished the first version back in 2018, Germuska remembers, and it started out hosted on GitHub. “We tweeted a bit about it and got some positive feedback…”

A year later, Cory Doctorow was blogging enthusiastically about it at BoingBoing. “I love this kind of thing so much. Learning the abstruse syntaxes of power-users, network administrators and programmers gives users so much power over the computers they use,” he wrote. And his post prompted readers to share their own thoughts about SQL in the world today.

“I once encountered a stored procedure containing an IF statement that spanned two thousand lines,” remembered one of his commenters. “The mystery was how that didn’t result in murder…”

Another BoingBoing commenter was even more cynical. “As far as I can tell this whole thing is an exercise in using bulk data collection by private companies to unrigorously accuse someone of murder.”

Soon someone had even converted the game into R. But there was still room for a little improvement, Germuska tells us. “At the time, it required several steps to interact with the database, which wasn’t optimal for our beginner audience,” he said.

In October, Simon Willison, the co-creator of the Django web framework, announced that he’d created a version using Datasette, a tool he’d created for data journalists and other data-sharers to help publish and explore their data.

This got a lot of people interested, Germuska remembers, and prompted him to visit the website for Select Star SQL. It’s an interactive online book about SQL that lets readers run SQL directly in the browser. And more importantly, Germuska remembers, that site had shared its source code. “Over the course of a couple of days, I was able to adapt that code to our project… To be honest, most of that time was spent adapting and extending the ‘walkthrough’ to be better suited to the interactive form that the Select Star code enabled.”

“It’s been really fun to see how excited people are about it,” says Germuska.

Soon the new site had racked up a whopping 823 upvotes on Hacker News, with people offering their own reactions and sharing some related thoughts. One posted a link to their own port of IBM’s SqlDetective (from Informix to PostgreSQL and MySQL). And programmer William Edwards shared his own JavaScript-based SQL game.

“Writing an SQL engine is like writing a ray tracer or implementing a compression algorithm,” he joked in a later comment. “Every programmer should do it!”

Other users shared links to a JavaScript-themed game called “Untrusted,” and to a Bash shell-based game called OverTheWire: Bandit. And one comment remembered the venerable Command-line Murder Mystery, which it turns out was an inspiration for the SQL mystery site. The last line of the SQL Murder Mystery’s page on its GitHub repository says playfully that it was inspired “by a crime in the neighboring Terminal City.”

And then Germuska himself turned up in the Hacker News comments. “We love that people are enjoying this so much,” he wrote, “so we’re also grateful to Simon Willison for reviving attention on the original project, and Zi Chong Kao, whose SQL tutorial site showed us the possibility of mounting the whole thing in a browser page.”

It’s all a remarkable example of the different kinds of sharing that are powering the technological world today.

