Dissecting the Stack for Place, Reddit’s Collaborative Pixel Art Project
In 2005, an English student facing a three-year business school course was worried that he wouldn’t be able to repay the massive loans he’d need to attend college. He devised a plan to sell pixels on the Internet for $1 each. Insane as it may sound, Alex Tew actually built out what he called the Million Dollar Homepage, where he sold pixels to the tune of just over a million dollars.
It’s one of those crazy Internet success stories that sounds more like an Onion article than an actual business. Tew’s plan was just supposed to get him through college, but the site remains online today as a testament to how even a wild, seemingly dim idea can take hold online.
What better bit of Internet history to build an April Fools’ joke upon? Reddit is renowned for its April Fools’ Day pranks. Past pranks have included linking random users up in one-on-one, Omegle-style chats and dividing the Reddit userbase into two teams.
Thus was born this year’s idea: a bit of collaborative pixel art. The plan was to offer users a big page on which each could add only a single pixel of color every five minutes or so. It sounds simple, but in practice it became a project that could only have succeeded on top of Reddit’s heavyweight infrastructure.
Place, as the pixel page was called, saw over a million unique users in 72 hours. Those individuals grouped up into communities and created territories on the map over which they fought and negotiated. The rainbow road, for example, attempted to negotiate its way across the screen as users reached out to the projects that were in its way. Elsewhere, the Prequel Memes folks took over a massive chunk to retell the tale of Darth Plagueis.
Daniel Ellis, senior software engineer for infrastructure at Reddit, said that the Place project was run primarily in Amazon Web Services, as is Reddit proper. The stack used “RabbitMQ for connecting to a WebSockets cluster to broadcast messages in real-time to users. We had Fastly at the edge to cache the initial canvas loading. We used Python on the backend, with Redis and Cassandra,” said Ellis.
Redis was used to store the color and pixel information, while Cassandra held the information about each pixel’s author and timestamp, which could be seen by hovering the mouse over a specific pixel. This was an initial choice the team was faced with: which database should perform which task?
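The split described above, with colors in one store and authorship in another, can be sketched in a few lines of Python. This is only an illustration, not Reddit's code: the in-memory bytearray stands in for Redis, the dictionary stands in for Cassandra, and the names Canvas, PixelMeta, place, snapshot and inspect are all hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class PixelMeta:
    """Who placed a pixel and when (the data the article says lived in Cassandra)."""
    author: str
    placed_at: float


class Canvas:
    """Hypothetical sketch: two separate stores, one per access pattern."""

    def __init__(self, width: int, height: int) -> None:
        self.width = width
        self.height = height
        # Stand-in for Redis: one compact blob of colors, cheap to read whole.
        self.colors = bytearray(width * height)
        # Stand-in for Cassandra: per-pixel metadata, looked up one pixel at a time.
        self.meta: Dict[Tuple[int, int], PixelMeta] = {}

    def _offset(self, x: int, y: int) -> int:
        if not (0 <= x < self.width and 0 <= y < self.height):
            raise ValueError("pixel is outside the canvas")
        return y * self.width + x

    def place(self, x: int, y: int, color: int, author: str,
              now: Optional[float] = None) -> None:
        """Record a placement in both stores."""
        if not 0 <= color <= 255:
            raise ValueError("color must fit in one byte in this sketch")
        self.colors[self._offset(x, y)] = color
        placed_at = time.time() if now is None else now
        self.meta[(x, y)] = PixelMeta(author, placed_at)

    def snapshot(self) -> bytes:
        """Whole-canvas read, the kind of payload an edge cache could serve."""
        return bytes(self.colors)

    def inspect(self, x: int, y: int) -> Optional[PixelMeta]:
        """Single-pixel lookup, as when a user hovers over a pixel."""
        self._offset(x, y)  # validates the coordinates
        return self.meta.get((x, y))
```

The point of the sketch is the shape of the two reads: drawing the board needs every color at once but no authorship data, while a hover needs full detail for exactly one pixel.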
Much of the infrastructure behind Place was already in use by Reddit: RabbitMQ, for example, handles incoming upvotes and downvotes. The system also “sent events off to an event collector pipeline with Kafka. We have a huge event pipeline in Kafka. It’s fairly low throughput, and it’s pretty stable,” said Ellis.
“One issue we had with our RabbitMQ instance,” said Ellis, was its interactions with their WebSockets cluster. Specifically, the RabbitMQ management plugin was trying to do bookkeeping on the exchanges, which was unneeded and slowed things down. When bookkeeping was turned off, things shaped up, said Ellis.
This problem surfaced in the field, while the application was actually being used during its 72-hour window of existence. To handle it, Ellis said, the team lengthened the cooldown timer between pixel placements for each user. At first, users could add a pixel every 5 minutes, but when trouble surfaced in RabbitMQ, the team raised the wait to 10 minutes.
The move to 10 minutes prompted conspiracy theories from the users. At one point, Ellis turned it to 10:03, hoping people would notice and comment, but few did, he said, glumly.
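A per-user cooldown that operators can adjust while the system is live, as described above, might look something like the following Python sketch. It is an assumption-laden illustration rather than Reddit's implementation: the CooldownGate class and its methods are hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from typing import Dict


class CooldownGate:
    """Hypothetical sketch of a per-user placement cooldown."""

    def __init__(self, cooldown_seconds: float) -> None:
        # Operators can change this value at any time; it applies to
        # every later check, including users already waiting.
        self.cooldown_seconds = cooldown_seconds
        self._last_placed: Dict[str, float] = {}

    def seconds_remaining(self, user: str, now: float) -> float:
        """How long the user must still wait before placing again."""
        last = self._last_placed.get(user)
        if last is None:
            return 0.0  # first placement is always allowed
        return max(0.0, last + self.cooldown_seconds - now)

    def try_place(self, user: str, now: float) -> bool:
        """Accept the placement only if the user's cooldown has elapsed."""
        if self.seconds_remaining(user, now) > 0:
            return False
        self._last_placed[user] = now
        return True


gate = CooldownGate(cooldown_seconds=5 * 60)  # the initial 5 minutes
gate.try_place("alice", now=0)                # accepted
gate.cooldown_seconds = 10 * 60               # raised under load
gate.try_place("alice", now=5 * 60)           # rejected: 5 minutes still to go
```

Because the limit is a single number read at check time, lengthening it immediately reduces the rate of incoming writes, and a value such as 603 seconds (10:03) is as easy to set as a round one.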
In his work elsewhere at Reddit, Ellis said that the teams are trying to become more test-oriented. He said the team uses Drone for automated testing, and that he has learned the value of canary deploys since joining Reddit from Sauce Labs two years ago.
“We have a custom auto-scaler looking at internal metrics. The only times we get really large spikes out of the ordinary is when people are DDoSing us,” said Ellis.
As for Place, he said that at the end of 72 hours of public use, the exhausted team at Reddit decided it was time to close it down. “I wanted it to last longer originally, but on the third day in, we said it was done. We shut it down at the exact moment it started,” he said, giving Place a run of exactly 72 hours.