Why the Redis Ecosystem is Flourishing
The in-memory, key-value data store Redis continues to consolidate its market position as more applications seek to use the open source project as a data structure server for on-the-spot, real-time application processing.
Cloud hosting infrastructure provider Rackspace is seeing a particularly strong upsurge in Redis use at present, according to Matthew Barker, product manager at Rackspace, who says that “it is growing rapidly as customers rip and replace Memcached and the proliferation of real-time applications grows.” Earlier this month, Rackspace hosted RedisConf 2015 in San Francisco at the Palace of Fine Arts. About 250 people attended the conference.
Rackspace’s perspective represents a trend observable across the database market. Database ranking website DB-Engines ranked Redis as the 10th most popular database management system in March, up from the 13th position a year ago.
DB-Engines calculates Redis’s popularity based on a number of factors, including number of positions advertised that specifically request Redis expertise, tweet and news mentions, Google Trends interest and discussions on key tech forums, such as Stack Overflow (which coincidentally uses Redis as part of its stack — as a caching layer for the entire Stack Overflow network).
Redis for Hot Data
More businesses are turning to Redis, an in-memory datastore, as they scale their applications and need to carry out quick processing tasks in memory. This is akin to providing a real-time sandbox for ‘hot data’: data endpoints that are in flux as they go through loops of read-write processing before being finalized.
This is how the content management API service Contentful uses Redis in its stack, according to cofounder Paolo Negri. Contentful is a platform-agnostic content management system that delivers content to any device via APIs. Current customers include e-commerce sites like Nasty Gal and longstanding media empires like Playboy. While their content is stored in data servers, during the editing phase it is copied to Redis to allow real-time editing before it is saved back into a more permanent datastore. “At the moment, we are using Redis as a temporary datastore, for very hot data, and from that point of view, it works very well. Then as soon as customers finish the editing process, their content goes to Elasticsearch and, ultimately, into our AWS infrastructure.”
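The hot-data pattern described above can be sketched in a few lines. This is an illustration only, not Contentful's implementation: the key naming (`draft:<id>`), the function names and the dict used as the permanent store are all assumptions. The `client` argument is expected to offer `hset`, `hgetall` and `delete` as the redis-py client does; a small hypothetical in-memory stand-in is included so the sketch runs without a server.

```python
class FakeHashes:
    """Hypothetical in-memory stand-in for the three client calls used below."""

    def __init__(self):
        self._hashes = {}

    def hset(self, name, key, value):
        fields = self._hashes.setdefault(name, {})
        is_new = key not in fields
        fields[key] = value
        return int(is_new)

    def hgetall(self, name):
        return dict(self._hashes.get(name, {}))

    def delete(self, *names):
        return sum(1 for n in names if self._hashes.pop(n, None) is not None)


def save_edit(client, entry_id, field, text):
    """Write one in-progress field change to the hot draft."""
    client.hset(f"draft:{entry_id}", field, text)


def finish_editing(client, entry_id, permanent_store):
    """Copy the finished draft to the permanent store, then drop the hot copy."""
    key = f"draft:{entry_id}"
    draft = client.hgetall(key)
    if draft:
        permanent_store[entry_id] = draft  # assumed stand-in for the durable datastore
        client.delete(key)
    return draft
```

Every keystroke-level change touches only the in-memory hash; the slower permanent store sees a single write when editing ends.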
Negri has been using Redis in a variety of IT roles pretty much since its creation in 2009, when he first played around with creating a few experimental libraries. But in 2011, at gaming platform Wooga, he looked at Redis more seriously as a replacement for MySQL. “In the beginning, we moved just a small part of the data and then more and more of the data of our games onto Redis. For us, it was a tool that allowed much higher performance than MySQL.
“One thing for us using Redis is that we didn’t have a strict requirement on persistence. You can afford to lose the last few minutes of a gaming session, unlike say finance, where you can’t afford to lose any transaction. Redis has a clear contract on persistence. It wasn’t a strict persistence model but it is very clear, very well-defined.”
Real-Time Needs of Applications
The real-time nature of gaming applications is making it a common use case for Redis. Barker from Rackspace says “a number of gaming companies are using Redis for their leaderboards.” (He also points to other leaderboards, such as up and down voting of articles on blog sites as another example of Redis in popular use.)
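Leaderboards of this kind are usually built on Redis sorted sets, which keep members ordered by score. The sketch below assumes a client exposing `zincrby` and `zrevrange` with the same argument order as redis-py; the key name `leaderboard`, the function names and the in-memory stand-in class are hypothetical and exist only so the example runs without a Redis server.

```python
class FakeSortedSets:
    """Hypothetical in-memory stand-in for the two sorted-set calls used below."""

    def __init__(self):
        self._sets = {}

    def zincrby(self, name, amount, value):
        scores = self._sets.setdefault(name, {})
        scores[value] = scores.get(value, 0.0) + amount
        return scores[value]

    def zrevrange(self, name, start, end, withscores=False):
        # Highest score first; ties fall back to reverse member order.
        ranked = sorted(self._sets.get(name, {}).items(),
                        key=lambda item: (item[1], item[0]), reverse=True)
        stop = None if end == -1 else end + 1  # Redis ranges are inclusive
        page = ranked[start:stop]
        return page if withscores else [member for member, _ in page]


def record_points(client, player, points):
    """Add points to a player's running total and return the new score."""
    return client.zincrby("leaderboard", points, player)


def top_players(client, count):
    """Return the `count` highest-scoring (player, score) pairs."""
    return client.zrevrange("leaderboard", 0, count - 1, withscores=True)
```

The same two calls cover up and down voting: a vote is an increment of +1 or -1 on an article's score. With a real redis-py client, members come back as bytes unless the client is created with `decode_responses=True`.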
Ankur Bulsara, CTO at games company Scopely, has doubled down on using Redis, moving from the open source project to signing with the enterprise provider Redis Labs. Scopely currently has four number one games in the app marketplace and has “had somewhere over 30 million installs of our games,” Bulsara says. Some of its largest games make two billion API calls a week each, “with some seeing 40-50,000 requests a second.”
“We started using Redis for the same things that attract a lot of people to Redis: leaderboards, in-memory, sorted sets and social apps.
“Philosophically, we like to let Amazon manage as much as possible, so when Amazon used [ElastiCache] with Redis, we switched to that. Then we met Redis Labs at AWS re:Invent, and they looked like they had a compelling suite over [ElastiCache]. In particular, they offered us a one-click way to manage our size. We were used to [DynamoDB], where we didn’t have to manage node size, so now we can use Redis Labs like an API.
“We use DynamoDB a lot as a primary datastore, then Redis for things like task or current features or to power a real-time events system. We use some Memcached, but increasingly we are looking at Redis for our caching solution, and we are slowly weaning ourselves off MySQL and Redshift as our analytics database.
“We have a number of stacks globally, we make our own games as well as publish games, and they have different requirements and the stacks are configured slightly differently. Generally, we are using Ansible and CloudFormation Playbooks for our configuration systems and from that we can configure DynamoDB tables or other things.
“One good example is we use Redis Labs as an analytics platform for real-time alerting, and we use the new HyperLogLog feature to count active users. Features like that have allowed us to increase our usage of Redis Labs beyond what we were originally pigeonholing it for.”
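Counting active users with HyperLogLog comes down to two Redis commands, PFADD and PFCOUNT. The sketch below is not Scopely's code: the `active:<day>` key scheme and the function names are assumptions, and the stand-in class uses exact Python sets so the example runs without a server. Real Redis keeps a small fixed-size probabilistic structure per key and returns an approximate count, with a standard error of about 0.81 percent.

```python
class FakeHyperLogLog:
    """Hypothetical stand-in: exact sets in place of Redis's probabilistic sketch."""

    def __init__(self):
        self._seen = {}

    def pfadd(self, name, *values):
        seen = self._seen.setdefault(name, set())
        before = len(seen)
        seen.update(values)
        return int(len(seen) > before)

    def pfcount(self, *sources):
        merged = set()
        for name in sources:
            merged |= self._seen.get(name, set())
        return len(merged)


def mark_active(client, day, user_id):
    """Record that a user was seen on the given day."""
    client.pfadd(f"active:{day}", user_id)


def count_active(client, *days):
    """Count distinct users seen across one or more days."""
    return client.pfcount(*[f"active:{day}" for day in days])
```

Passing several days to `count_active` yields the size of the union, so a user seen on both Monday and Tuesday is counted once.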
While gaming may seem like a specific vertical, the real-time nature of its applications points to some of the ways that other industries are making use of Redis’s in-memory datastore capacities. Barker’s example of up and down voting of articles shows the leaderboard pattern carrying over to use cases beyond gaming.
He also sees Redis powering message queues, e-commerce session stores, and basic caches, particularly for WordPress installations.
Three Ecosystem Streams for Redis
As Redis continues to strengthen its position as a cache and datastore in business technology stacks, the ecosystem surrounding the open source project is expanding. The two main ecosystem streams that flow outward from Redis use in the tech stack are:
- Task management projects running in conjunction with Redis.
- Complementary NoSQL technologies.
In addition, a third ecosystem stream is emerging:
- The traditional use of Redis to spin out SaaS platform installations.
Task Management Projects Running in Conjunction With Redis
Because Redis is an open source project, a number of other tools are sprouting up around it to extend its functionality for particular use cases.
Chief among these is message queuing, a use case that Rackspace itself runs internally on Redis. Tools that harness Redis’s in-memory caching capabilities for real-time message queuing include:
- Celery: Celery is an asynchronous job queue manager that uses distributed message passing to manage tasks synchronously or when a particular event or user input is triggered. Predominantly used for real-time task management, it is also used for task scheduling. Rackspace uses Redis and Celery internally for job queuing, and Barker confirms this is a use case he is seeing increasingly adopted by Rackspace customers using Redis as well.
- Resque: Similarly, Resque is a “Redis-backed Ruby library for creating background jobs, placing them on multiple queues, and processing them later.” Created by GitHub, it can be used with any Ruby class or module that responds to the “perform” method, and it has since managed over 10 million jobs. GitHub was seeking a solution to handle its heavy load of background jobs and needed a real-time option that could keep an eye on stale and bloated tasks, monitor pending jobs, distribute workers across multiple machines, manage failed jobs without releasing or retrying them, and handle other task management duties, all with a persistent state. “If we let Redis handle the hard queue problems, we can focus on the hard worker problems: visibility, reliability, and stats. And that’s Resque,” wrote Chris Wanstrath when introducing Resque in November 2009.
- Sidekiq: Created by Mike Perham and written with Resque compatibility in mind, Sidekiq uses multiple threads to process hundreds of messages in parallel. This creates lightweight, efficient background processing for Ruby and Rails environments. The idea is that Sidekiq can be used with Resque to create memory efficiencies when processing job queues. Sidekiq uses Redis to store all job and operational data.
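The tools in this list all build on the same basic idea: a Redis list acts as a queue, with producers pushing serialized jobs on one end and workers popping them off the other. The sketch below shows only that idea and is not how Celery, Resque or Sidekiq are implemented. The JSON job format, the `queue:<name>` key scheme, the function names and the in-memory stand-in are assumptions; `lpush` and `rpop` follow the redis-py signatures.

```python
import json
from collections import defaultdict, deque


class FakeLists:
    """Hypothetical in-memory stand-in for the two list calls used below."""

    def __init__(self):
        self._lists = defaultdict(deque)

    def lpush(self, name, *values):
        for value in values:
            self._lists[name].appendleft(value)
        return len(self._lists[name])

    def rpop(self, name):
        items = self._lists[name]
        return items.pop() if items else None


def enqueue(client, queue, task, **kwargs):
    """Serialize a job and push it onto the head of the queue."""
    job = json.dumps({"task": task, "kwargs": kwargs})
    return client.lpush(f"queue:{queue}", job)


def work_one(client, queue, handlers):
    """Pop the oldest job, run its handler, and return the result (None if idle)."""
    raw = client.rpop(f"queue:{queue}")
    if raw is None:
        return None
    job = json.loads(raw)
    return handlers[job["task"]](**job["kwargs"])
```

Pushing on the left and popping on the right gives first-in, first-out order. A production worker would more likely block on the queue, for example with BRPOP, and track failed jobs, which is the kind of work the libraries above take on.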
Other tools that are cropping up to empower Redis users include:
- ObjectRocket: Rackspace-acquired ObjectRocket aims to quickly spin up performance instances of Redis. The goal is to let users focus on scaling their application while the scaling of Redis is managed automatically.
- Commissar: The early-release, performance metric tool Commissar is able to do scenario testing of Redis stack environments, in addition to standard benchmarking tests.
- Datadog: Analytics products like Datadog also offer integrations with Redis to enable tracking of all Redis activity as visualizations, and the ability to analyze performance metrics on individual databases and Redis clusters.
Complementary NoSQL Technologies
- Elasticsearch: One of the most common tools used alongside Redis, Elasticsearch is a real-time full-text search engine. It is often used in conjunction with Redis, where cached data is stored in Redis and then queried via Elasticsearch. This is how HipChat has been able to scale. They use Redis for caching data such as who is online and which users are in which rooms. Elasticsearch is then used as the storage and search back-end that can monitor the usage data in the Redis cache and scale transparently by adding more nodes as needed.
- Logstash: Also from Elastic, Logstash is a central log file management application often used in a stack alongside Elasticsearch, Nginx and Redis. Vadiraj Joish wrote last year about how to establish a stack that can collect logs from multiple applications distributed across a number of servers; store the data in a central file; and manage a web GUI front-end in order to more easily analyze metrics and create granular reports. For this stack, Redis becomes a datastore to enable filtering of queries in memory, while Elasticsearch serves as the central store for all of the logs collated by Logstash.
- MongoDB: High volume, real-time applications are using a combination of MongoDB and Redis to respond to over 200,000 requests per second. Last year, MaxCDN documented how they use both MongoDB and Redis in their stack to power an analytics platform. Log data for all MaxCDN customers is stored in a specialized version of MongoDB (TokuMX). Redis is then used as a high-volume messaging queue between MaxCDN’s content delivery network and their TokuMX database cluster. “Redis is lightweight enough to keep up with the speed of incoming data,” writes Bryan Conklin.
Redis and SaaS Platforms
Marko Martinovic, a back-end developer specializing in e-commerce platforms, sees Redis solving many of the scaling problems that face e-commerce businesses as they begin to grow. In a post last year on using Redis as a cache back-end and for session storage, Martinovic notes that more common solutions like APC and Memcached begin to flounder once website visits and shopping sessions start to scale. Limitations, like not being able to tag groups of related cache entries, make those other cache options even harder to work with once the volume of transactions begins to be a factor in architecture decisions.
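Tagging groups of related cache entries can be modeled in Redis by keeping, for each tag, a set of the cache keys that carry it. The sketch below is one simple way to do that and is not taken from Martinovic's post or from any particular cache back-end. The `tag:<name>` key scheme, the function names and the in-memory stand-in are assumptions; `set`, `get`, `sadd`, `smembers` and `delete` mirror the redis-py client calls.

```python
class FakeCache:
    """Hypothetical in-memory stand-in for the client calls used below."""

    def __init__(self):
        self._data = {}

    def set(self, name, value):
        self._data[name] = value
        return True

    def get(self, name):
        return self._data.get(name)

    def sadd(self, name, *values):
        members = self._data.setdefault(name, set())
        before = len(members)
        members.update(values)
        return len(members) - before

    def smembers(self, name):
        return set(self._data.get(name, set()))

    def delete(self, *names):
        return sum(1 for n in names if self._data.pop(n, None) is not None)


def cache_with_tags(client, key, value, tags):
    """Store a cache entry and register its key under each tag."""
    client.set(key, value)
    for tag in tags:
        client.sadd(f"tag:{tag}", key)


def invalidate_tag(client, tag):
    """Delete every entry registered under the tag, then the tag set itself."""
    tag_key = f"tag:{tag}"
    keys = client.smembers(tag_key)
    if keys:
        client.delete(*keys)
    client.delete(tag_key)
    return len(keys)
```

When a product changes, a shop can call `invalidate_tag` for that product and clear every cached page that mentions it, without flushing the whole cache.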
The scenario is similar to the one driving Redis uptake among WordPress installations. In December last year, Scott Miller from DigitalOcean posted a tutorial on configuring Redis caching to speed up WordPress page loading, halving the load time. “The result is a WordPress site which is much faster, uses less database resources, and provides a tunable persistent cache,” writes Miller.
As more platform-oriented SaaS products reach maturity and build a customer base, Redis may well become an essential stack component as both a cache and in-memory store, as has already happened with WordPress and is currently occurring with Magento.