MySQL 5.7 Gets Savvy with Cool Replication Hacks
When Airbnb’s MySQL databases grew too unwieldy to be handled effectively, especially over the busy summer months, company engineers decided to move some database tables off to their own individual servers.
Normally, moving a database table can be a treacherous process at best, especially for a 24/7 operation as busy as Airbnb’s. The Airbnb engineers came up with a neat hack, though. They used the MySQL replication feature to copy the database tables to their new primary locations in Amazon Web Services and incurred only minimum of downtime, requiring no migration code.
In the web-scale era, database replication is serving many more uses than the disaster recovery. So it is savvy on Oracle’s part to outfit MySQL version 5.7, out on October 27, to include a number of new options for enhancing replication, along with a number of other new features to keep the MySQL’s “M” designation in the LAMP stack relevant.
“Almost everyone uses replication, so we built a much more highly-performant replication system,” said Tomas Ulin, Oracle vice president of MySQL engineering.
Say what you will about Oracle’s handling of the many open source assets it acquired in its 2010 purchase of Sun Microsystems (think Java, OpenOffice), the company has kept MySQL competitive against a rising tide of popular NoSQL data stores.
This release continues the work. As Ulin noted, the most noteworthy embellishments are the major improvements in replication. The new group replication feature provides away to write data simultaneously across multiple database instances. This approach could pave the way for establishing high-availability operations, in which redundant identical databases are maintained so that if one goes down, another one can quickly step in to take its place.
This approach relieves one of the chief bottlenecks in using MySQL for high-volume work, said Peter Zaitsev, CEO of Percona Systems, which offers a version of MySQL, called Percona Server, tweaked for high performance work.
That the data can be instantly replicated relieves the chief database from servicing secondary duties, such as providing support for analysis, which could slow performance overall. Now a replicated copy of the database can assume that workload, and offer data that is just as fresh, Zaitsev said.
Users can set up sophisticated workflows that can utilize the replication capabilities, thanks to a new a piece of MySQL support middleware called the MySQL Router. MySQL Router can act as an intermediary between an application and a number of different databases, following a set of administrator rules concerning which database to direct traffic to at any given time.
MySQL now also supports multi-source replication, allowing a single database system to get inputs from multiple databases, thanks to an improvement of Global Transaction Identifiers (GTIDs) and multi-threaded processing.
With previous versions of MySQL, Zaitsev explained, throughput was limited due to the fact that all processing was handled by a single thread, running on a single core. Now MySQL can run across multiple cores in parallel.
Multiple source replication could be handy in a number of different scenarios, Ulin said. For instance, a national retail chain could set up different sales databases for each region, and then have each regional database, each with the same schema, copy its data to a central database, which would aggregate and analyze the data for a company-wide overview.
MySQL 5.7 is the first major update since February 2013, when version 5.6 was made available. With that earlier edition, Oracle chiefly focused on pleasing database administrators. This version adds more features for the developers, Zaitsev noted.
In prior editions, MySQL saved JSON data — which is basically plain-text data encoded in key-value pairs — as blobs in text fields. Querying JSON this way was slow, however.
Now MySQL can save JSON as a specific JSON data type, one formatted in binary for faster searching. All the keys are indexed so they can be searched much more quickly than they could through regular text searches, Ulin said.
Using MySQL instead of a new instance of MongoDB could cut administrator overhead, Ulin argued, if only because the organization doesn’t have to maintain two separate database systems, or train administrators in the new ways of a second database system.
MySQL 5.7 could also appeal to NoSQL users with improved online updating. Now users can alter tables without the need to take the entire database offline, an advantage heretofore ascribed to using schema-less NoSQL databases. This allows the organization react more quickly to changing needs in what data to collect.
MySQL operations folks get some new tools with this release as well. A new performance schema provides operational data for how well the database system is performing, in terms of memory usage, transaction and prepared statements speeds, and hundreds of other factors. The most commonly-used of these metrics around performance and health of the database system are gathered in a new schema called MySQL SYS Schema.
Feature image: The National Gallery of Art, taken by Joab Jackson.