Redis sponsored this post.
At its core, database caching is a simple enough concept: It is the practice of putting an in-memory datastore in front of (or alongside) a database that acts as the system of record.
Caching allows organizations to keep frequently requested data in speedy RAM, enhancing application performance by reducing calls to the slower disk- or flash-based databases that persist system data.
While the concept is simple, in practice, caching can be complex. Efficient caching requires meticulous planning and a deep understanding of your application architecture and data needs. Smart decisions must be made around which type of caching pattern is the best fit, as well as how the cache should be configured.
Should an application use an inline cache? If so, is a read-through, write-through or read/write-through pattern best, or is a cache-aside pattern a better fit? It’s impractical to hold all of your data in memory, so how should your cache handle eviction to determine which subset of your data is retained?
Eviction policies range from least recently used (LRU) to oldest stored to random eviction, to name a few. There are countless additional decisions that need to be made as well: Should a cache be persistent or volatile? How will it scale? Will it be replicated, and if so, how?
And so on…
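To make two of these choices concrete, here is a minimal sketch of the cache-aside pattern combined with LRU eviction. Plain Python dictionaries stand in for a real cache and database, and the `query_database` helper and key names are purely illustrative:

```python
from collections import OrderedDict

# Hypothetical stand-in for the system of record; in a real
# deployment this would be a network call to the primary database.
PRIMARY_DB = {"user:1": "alice", "user:2": "bob", "user:3": "carol"}

def query_database(key):
    return PRIMARY_DB.get(key)

class CacheAsideLRU:
    """Cache-aside lookup with least-recently-used (LRU) eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()  # ordered: least recently used first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:            # cache hit: serve from RAM
            self.hits += 1
            self.store.move_to_end(key)  # mark as most recently used
            return self.store[key]
        self.misses += 1                 # cache miss: fall back to the database
        value = query_database(key)
        self.store[key] = value          # populate the cache for next time
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used entry
        return value

cache = CacheAsideLRU(capacity=2)
cache.get("user:1")   # miss: fetched from the database, then cached
cache.get("user:1")   # hit: served from the cache
cache.get("user:2")   # miss
cache.get("user:3")   # miss: capacity exceeded, "user:1" is evicted
```

Swapping the `OrderedDict` bookkeeping for a different policy (oldest stored, random) changes which entries survive, which is exactly why the eviction decision matters.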
So while caching is relatively simple in concept, in practice, it is highly complex. All this complexity leaves room for error, and those errors can be costly. In fact, implementing an inefficient cache is often worse than having no cache at all, and here’s why.
1. An Inefficient Cache Can Make Applications Slower
You cache to speed up application performance by making critical data available in near-real time. However, an inefficient cache can actually have an adverse effect on application performance and can make applications slower.
When a request results in a cache miss (the requested data does not currently reside in the cache), it increases the end-to-end response time of an application. The application must first query the cache and, finding nothing, then query the primary database as well. The fruitless cache lookup is stacked on top of the database call that was needed anyway. In these instances, the cache brings no benefit; it simply adds its own response time as extra latency.
This is not a problem if misses are infrequent, because while the initial cache miss is slower, all subsequent requests for that same data will be sped up (until the data is evicted from the cache). However, if a cache is inefficient and there are a high number of misses, then a cache can slow down an application. This is especially true if the speed of a cache hit is only marginally faster than calling the primary database directly.
For a cache to have a net positive impact, it has to bring a significant benefit in response time for a cache hit vs. a database call, as well as have a high percentage of cache hits.
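That trade-off can be made concrete with a small back-of-the-envelope model. The latency figures below are illustrative assumptions, not measurements:

```python
# A simple model of the hit/miss trade-off: a hit costs one cache
# lookup; a miss costs the cache lookup plus the database query it
# falls back to.

def expected_latency(hit_rate, t_cache, t_db):
    """Average response time with a cache in front of the database."""
    return hit_rate * t_cache + (1 - hit_rate) * (t_cache + t_db)

t_cache = 1.0   # assumed cache lookup time, ms
t_db = 20.0     # assumed primary-database query time, ms

# The cache only pays off when the average beats calling the database
# directly, i.e. when hit_rate > t_cache / t_db (here, above 5%).
avg_hot = expected_latency(0.90, t_cache, t_db)   # ~3 ms: big win
avg_cold = expected_latency(0.02, t_cache, t_db)  # ~20.6 ms: worse than no cache
```

The break-even point, `hit_rate > t_cache / t_db`, also shows why a cache that is only marginally faster than the database needs a very high hit rate to justify itself.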
2. An Inefficient Cache Consumes Additional Expensive Resources
An inefficient cache also wastes valuable technology resources and budget. Caching requires deploying expensive RAM-based infrastructure with the goal of reducing load on downstream databases. However, an inefficient cache doesn’t make good use of this incredibly expensive hardware.
For example, a cache with a suboptimal eviction policy leads to more cache misses, slower response times and more calls to the primary database. This can be incredibly expensive, especially if you’re paying for your database per operation. You then incur the expense of RAM without bringing requisite value — applications aren’t faster, and load isn’t reduced from the downstream databases that act as a system of record.
Additionally, there are indirect expenses associated with an inefficient cache, like the cost of lost revenue and customer attrition that come with unreliable and poorly performing applications.
3. An Inefficient Cache Can Harm Data Quality
In many instances, an application is only as good as the data that powers it. Think of the impact of incorrect shopping, user profile, gaming or financial transaction data to digital customer experiences.
This data is kept in memory because it is critical to applications and needs to be delivered quickly and/or frequently. Otherwise, it wouldn’t be worth the expense.
But inefficient caching introduces potential issues with the consistency of this critical data. An inefficient cache introduces the potential for a mismatch between the data held in memory in the cache and what is persisted in the database.
For example, an application could have written a new value to the cache that, for whatever reason, was never propagated to the primary database. Now you have a stale, or incorrect, value in the underlying database, and a cache that is inconsistent with the system of record.
Or perhaps the system of record was updated with a new value, but this was never updated in the cache serving data to an application. Now the cache has stale (incorrect) data and is again inconsistent.
Additionally, there can be consistency issues between individual cache instances when caches are distributed across multiple nodes. This is common when a cache is replicated for additional resilience or to achieve geographic distribution.
Reap the Many Rewards of Caching
The issues that come with an inefficient cache can be addressed with proper planning, an understanding of your application and its data needs, and proper decision-making around caching strategy.
Cache performance can be achieved with the right caching pattern and eviction policies; costs can be offset by the resulting performance gains; and consistency can be maintained with a process for invalidating stale data.
While an inefficient cache may be worse than no cache at all, an efficient cache brings tremendous value. It can vastly improve application performance, reduce the burden on downstream databases, and make applications more scalable and resilient. It’s why caching is so widely pursued with open source technologies such as Redis, which was purposefully designed for simplicity, efficiency, performance and resiliency, and maintains these principles to this day.