What happens when we apply cutting-edge modern technologies to the age-old game of chess? And what happens when a bored geek with a surplus of computing power conducts his own weird home experiments on the game?
This year a Pennsylvania software engineer who calls himself Tom Murphy VII decided to create some laughably bad chess-playing algorithms using everything from machine learning and neural networks to a gratuitous amount of CPU cycles — and then pitted the bad algorithms against each other to create a “tournament of fools.”
“It’s my idea of fun…” Murphy says in a humorous video describing the experiments.
Murphy has the brainpower to pull it off. He tells us that in 2007 he defended his computer science Ph.D. at Carnegie Mellon — which was the same year the students began holding the annual SIGBOVIK conference on April Fool’s Day. Sponsored by the “Association for Computational Heresy,” it was a kind of satirical special interest group devoted to a fictitious researcher named “Harry Quizmaster Bovik,” and including a call for goofy papers on topics like “artificial stupidity.”
For the last 11 years, Murphy’s been a senior staff software engineer at Google (in its Pittsburgh offices). But this year he decided to return again for the April Fool’s Day conference — and again, contributed some humorous research of his own.
Murphy brags proudly that some of his past presentations there involved ridiculous investigations that were “indistinguishable from ‘real’ research (for example The first level of Super Mario Bros. is easy with lexicographic orderings and time travel has some 20 citations in ‘real’ academic research).
But this was the year that Murphy turned his attention to chess.
Let the Games Begin
In Murphy’s initial game, human players are blindfolded, forcing them to remember where the pieces are. But what’s the equivalent for a computer? Telling it where pieces are located but not which pieces they are (or even what color)…
Of course, the computer wouldn’t be provided with the moves leading up to a position either. It wouldn’t even know whose turn it was. Yes, there’s a possibility that the computer’s king is in check — at which point pretty much every move is illegal except a move which leads your king to safety. But to get around that, Tom created a program that generates a list of possible moves, ranked in order of preference — from which the first legal move will be chosen.
“I like playing against it, because it’s not very good,” Murphy says in the video. “But the natural question is how not-very-good is it?” Testing it against chess-playing programs proves that yes, it loses with great regularity — as in, “every single time.”
Then he set out to build other bad chess-playing algorithms, so he could compare their relative performance…
One had a preference for placing its pieces on the white squares when it’s playing white, and on black squares when it’s playing black. (Its opponent? An algorithm which preferred placing its pieces on oppositely-colored squares.) In the end, they both played pretty badly. “They have a preference, but it doesn’t really have to do with winning.” In fact, they’re both just a little bit worse than the algorithm that chooses its moves at random.
If you find my intricate 42min video about 30 weird chess algorithms competing in a tournament of fools in order to assess my program that plays chess without knowing what pieces are on the board to be boring, then it is because you do not understand art:https://t.co/DkaEBGrwAf
— Tom 7 (@tom7) July 15, 2019
There was also two algorithms he named “Huddle” and “Swarm” — in which one automated player searches for moves keeping its pieces close to its own king, while the other searches for moves placing its pieces near its opponent’s king. This sometimes leads to Huddle’s king being forced to follow its own pawns across the board, wherein at least a few cases its pawn then accidentally promoted into more powerful pieces and inadvertently checkmated the opposing king.
But more often, it works the other way. “If you have a preference to attack the opponent, you’re going to accidentally checkmate it sometimes.” In the video, Murphy reveals that among the bad chess algorithms, this one is surprisingly not-as-bad. “‘Swarm’ is in fact much better than ‘Random Move.'”
And one more successful strategy is an algorithm that prioritizes four specific kinds of move (in this order): to checkmate, check, capture a piece, or push into its opponent’s territory.
But there are other terrible ideas — like an algorithm which seeks to mirror its opponent’s pieces, or to move all of its pieces to the other side of the board. One algorithm simply chooses whichever move comes first in alphabetical order.
And there’s another algorithm where each move is chosen from a list of possible moves — with the choices dictated arbitrarily by the digits of pi…
Survival in Chessland
But ultimately his most elaborate algorithm started with a question: which single chess piece is mostly likely to “survive” — to remain on the board through the end of the game, and emerging victorious with all of its fellow chess pieces? Dutifully investigating the answer, Murphy paid a visit to the free/libre chess site LiChess.org (which now sees more than a million games a day). It also offers games for downloading — so Murphy downloaded all 506,000,416 of them.
He summarized the results in a paper called “Survival in Chessland,” which explains his methodology. Murphy downloaded every complete game they had through November of 2018 — though another 200 million have apparently become available in the eight months since. Even his November haul represented a whopping 875GB of chess games, “so processing these took some care for efficiency and parallelism,” Murphy says in the video.
“Fortunately, I have a computer with just an obscene number of cores and truly excessive RAM, so you gotta use that for something.”
Er, how obscene? In an email, he describes his “gratuitous” home system, a 32-core AMD ThreadRipper 2990WX. Running the multi-threaded C++ programs for a few hours was enough to crunch through the entire dataset. “I think people underestimate what you can do efficiently with a single machine!”
“The work involved is pretty simple: I parse the PGN (which describes the game), then execute the moves in the game, keeping track of the fate of each piece, and then sum the counts for each fate…”
“I basically wrote all the code from scratch, because this is more fun than getting other people’s code to work. :)”
And then this led to more bad chess-playing algorithms. First, there’s the one which just moves pieces to those squares where they’re statistically more likely to survive. Or simply to the squares where they’re most likely to be at the end of a game. And other algorithms do the exact opposite — moving pieces to squares where they’re least likely to survive (or end up). Two algorithms calculate which square has the highest survival rate (versus capture rate) for a piece on each square, with one algorithm seeking the highest ratio favoring survival — and the “fatalist” algorithm seeking the lowest such ratio.
“Weirdly, it has the best winning chances of these strategies.”
And finally, he decided to compare ’em all to the Stockfish chess engine on several levels (including a special algorithm where it’s forced to play its least-promising move). “Looking at the results, unsurprisingly this is the overall worst strategy, and it manages to lose to almost everybody.”
But remember Tom’s original “blindfolded” algorithm, which didn’t know which piece occupied a square (or even its color)? He decided to augment it with a neural network, which performed machine learning — using the billions of positions available in the games from LiChess.org. It was now able to correctly predict the actual pieces at each position with 100% accuracy…about one-fifth of the time. And even when it’s wrong, it’s only incorrectly guessed an average of 3.22 pieces for each position.
And this becomes a jumping-off point for even more wild calculations. At some point, he deduces that all the games in their database eventually pass through 21,553,382,902 unique positions. With 204GB you could store them all — along with the next move — but there’s another interesting statistic. A whopping 76% of the positions occurred exactly once. “So these take up a lot of space, and they’re not very useful for playing, because I’m very unlikely to ever see them again.” Throwing away these one-off positions, every other possible game position can be stored with about 500MB of memory. This can be easily converted into an algorithm that plays the most popular move for any position it encounters (while swapping in a random move if it happens to find itself trapped in a unique position).
But then there’s one easy trick for beating it. “As soon as I make a little bit of a weird move, it just starts playing randomly. And then it’s very easy to beat.”
A final experiment involves various “dilutions” of the Stockfish engine — in which its preferred move is replaced by a randomly-chosen move x percent of the time.”
“We can actually assess how good a player is now by comparing it directly to a specific dilution of Stockfish…”
After all of Murphy’s elaborate chess-playing algorithms had been dutifully tested, he summarized the results in one massive table, and triumphantly concluded his video by announcing that he was finally ready to pursue other interests.
“Now it’s on to the next project, which is teaching this dog how to play chess.”