Try not to put off upgrading your vital infrastructure software for more than one release, and if you do make custom code changes to an open source project, prepare for the extra work of integrating those patches again when the updated version of the software appears.
Those are but a few of the lessons you might glean from a Facebook blog item posted Thursday, documenting the company’s sometimes-treacherous, large-scale leapfrog migration to MySQL version 8.0 from the ancient 5.6 version of Oracle’s open source MySQL database system that it had been using.
Yes, despite the fact that Facebook had created Cassandra, a scalable NoSQL store now managed by Datastax, the social media giant is still using the venerable MySQL relational database to store petabytes of valuable user information, according to the post’s authors, Facebook Production Engineering Manager Pradeep Nayak and Facebook Software Engineer Herman Lee. It has thus far taken the company nearly a year to migrate to MySQL 8.0 from 5.6, which was first released in 2013 and for which support ends this year. Oracle rolled out MySQL 8.0 in 2018.
Hard Pass on 5.7
The company skipped the MySQL 5.7 release, the major release between 5.6 and 8.0, entirely. At the time, Facebook was building its custom storage engine for MySQL, called MyRocks, and didn’t want to interrupt the implementation process, the engineers write. MyRocks is a MySQL adaptation of RocksDB, a storage engine optimized for fast write performance that Instagram has also used to optimize Cassandra. Facebook itself was using MyRocks to power its “user database service tier,” but would require some features in MySQL 8.0 to fully support such optimizations.
Skipping over version 5.7, however, complicated the upgrade process. “Skipping a major version like 5.7 introduced problems, which our migration needed to solve,” the engineers admitted in the blog post. Servers could not simply be upgraded in place. They had to use a logical dump to capture the data and rebuild the database servers from scratch, work that took several days in some instances. API changes from 5.6 to 8.0 also had to be rooted out, and supporting two major versions within a single replica set is just plain tricky.
At the beginning of the job (which is still underway), the company had 1,700 code patches originally written for v5.6 that needed to be ported to 8.0 in some form. Some APIs that were deprecated in 5.7 were taken out entirely by version 8.0, which meant that Facebook developers had to rewrite application code that relied on these old APIs.
The update team sorted the required patches into a number of different buckets, grouping multiple patches for the same feature together. Some of Facebook’s patches did not need to be carried forward at all, because the functionality they provided was subsequently covered by 8.0 itself. There were also non-server features that needed to be ported, such as those that supported the build environment, modified MySQL tools like mysqlbinlog, or added functionality like the async client API.
Some of the most complex features required significant changes to 8.0 to work properly. By the time the blog post went live, the team had evaluated 2,300 patches, with 1,500 of those making their way into 8.0.
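The triage described above can be pictured with a minimal Python sketch. It is not Facebook's tooling: the `triage` function, the patch IDs, and the feature labels are all hypothetical, chosen only to show patches being grouped by feature and then split into a "drop" bucket (already covered by 8.0) and a "port" bucket. The last line simply works out the share of evaluated patches that were ported, using the figures quoted in the post.

```python
from collections import defaultdict


def triage(patches, covered_upstream):
    """Group patches by feature, then split features into 'drop' and 'port'."""
    by_feature = defaultdict(list)
    for patch_id, feature in patches:
        by_feature[feature].append(patch_id)

    buckets = {"drop": {}, "port": {}}
    for feature, ids in by_feature.items():
        # A feature already provided by 8.0 does not need to be carried forward.
        key = "drop" if feature in covered_upstream else "port"
        buckets[key][feature] = ids
    return buckets


# Hypothetical patch IDs and feature labels, for illustration only.
patches = [
    ("p1", "async client API"),
    ("p2", "async client API"),
    ("p3", "mysqlbinlog changes"),
    ("p4", "feature now in 8.0"),
]
buckets = triage(patches, covered_upstream={"feature now in 8.0"})
print(buckets["port"])
print(buckets["drop"])

# Share of evaluated patches ported, from the article's figures (1,500 of 2,300).
print(f"{1500 / 2300:.0%}")
```

Grouping by feature before deciding what to port means a feature is kept or dropped as a whole, not patch by patch.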
Move It on Over
The actual migration effort involved bunching multiple mysqld instances (the actual MySQL core daemon) into larger replica sets, using specially crafted software to automate the changeover of large groups of replica sets. Each replica set (which could span different data centers) had a primary instance and lots of secondary instances. The conversion strategy was to create and then add 8.0 secondaries via a logical copy using mysqldump, then take the following steps to do the conversion:
- Enable read traffic on the 8.0 secondaries.
- Allow the 8.0 instance to be promoted to primary.
- Disable the 5.6 instances for read traffic.
- Remove all the 5.6 instances.
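To make the ordering of those steps concrete, here is a toy Python model of one replica set going through the conversion. It is only a sketch under stated assumptions: `Instance`, `ReplicaSet`, and `migrate` are hypothetical names, no real MySQL servers are involved, and the sketch promotes an 8.0 instance immediately once promotion is allowed, which simplifies what the post describes.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    version: str              # "5.6" or "8.0"
    serves_reads: bool = False
    promotable: bool = False


@dataclass
class ReplicaSet:
    primary: Instance
    secondaries: list = field(default_factory=list)

    def instances(self):
        return [self.primary] + self.secondaries


def migrate(rs, new_names):
    """Walk one replica set through the conversion steps, in order."""
    if not new_names:
        raise ValueError("at least one 8.0 secondary is required")

    # Preparation: add 8.0 secondaries (built from a logical copy in the post).
    new = [Instance(name, "8.0") for name in new_names]
    rs.secondaries.extend(new)

    # Step 1: enable read traffic on the 8.0 secondaries.
    for inst in new:
        inst.serves_reads = True

    # Step 2: allow 8.0 instances to be promoted; this sketch promotes one now.
    for inst in new:
        inst.promotable = True
    old_primary = rs.primary
    rs.primary = new[0]
    rs.secondaries.remove(new[0])
    rs.secondaries.append(old_primary)

    # Step 3: disable the 5.6 instances for read traffic.
    for inst in rs.instances():
        if inst.version == "5.6":
            inst.serves_reads = False

    # Step 4: remove all the 5.6 instances.
    rs.secondaries = [i for i in rs.secondaries if i.version != "5.6"]
    return rs


rs = ReplicaSet(Instance("a", "5.6", True, True), [Instance("b", "5.6", True)])
migrate(rs, ["c", "d"])
print([(i.name, i.version) for i in rs.instances()])
```

The point of the ordering is that read traffic and the primary role move to 8.0 before any 5.6 instance is taken out, so the set keeps serving throughout.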
To validate that the move would work, the team wrote integration tests for the automation software and ran them in a test environment, which surfaced a number of small edge-case bugs. For the MyRocks migration, the team built a MySQL shadow testing framework that captured production traffic and replayed it to test instances.
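The idea behind shadow testing can be sketched in a few lines of Python. This is not Facebook's framework, whose internals the post does not detail here: the `make_instance` and `replay` helpers are hypothetical, and in-memory SQLite databases stand in for the 5.6 and 8.0 test instances. A list of captured queries is replayed against both, and any query whose results differ is reported.

```python
import sqlite3


def make_instance(rows):
    """Create an in-memory database standing in for a test instance."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)", rows)
    conn.commit()
    return conn


def replay(captured, baseline, candidate):
    """Run each captured query on both instances; return the ones that differ."""
    mismatches = []
    for sql, params in captured:
        expected = baseline.execute(sql, params).fetchall()
        actual = candidate.execute(sql, params).fetchall()
        if expected != actual:
            mismatches.append((sql, params, expected, actual))
    return mismatches


# Hypothetical captured traffic: (query, parameters) pairs.
captured_traffic = [
    ("SELECT name FROM users WHERE id = ?", (1,)),
    ("SELECT COUNT(*) FROM users", ()),
]
rows = [(1, "ada"), (2, "grace")]
old_instance = make_instance(rows)   # stand-in for a 5.6 instance
new_instance = make_instance(rows)   # stand-in for an 8.0 instance
print(replay(captured_traffic, old_instance, new_instance))  # prints []
```

An empty list means the candidate returned the same results as the baseline for every replayed query; a real framework would also need to compare things like latency and error behavior.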
The migration work is continuing, but Facebook is already enjoying the benefits of MySQL 8.0, Nayak and Lee assert. For instance, some applications are using new features such as Document Store and improved DateTime support.
“Overall, the new version greatly expands on what we can do with MySQL at Facebook,” they write.
For all the gritty details, check out the blog post here.
Datastax and Oracle are sponsors of The New Stack.