
How to Build Intelligence into Your Session Stores

10 May 2018 3:33am, by

Session stores are a common component of most web-based applications. In this article, the first in a series, I will discuss how you can build analytics into these session stores so they can enable intelligent decisions in real time. I explain how to implement common patterns such as surfacing the right content to your users, using activity patterns to drive suggestions/recommendations, and sending group notifications to large numbers of users.

What Is a Session Store?

Kyle J. Davis, Technical Marketing Manager, Redis Labs
Kyle J. Davis is the technical marketing manager at Redis Labs. Kyle is an enthusiastic full-stack developer and works frequently with Node.js and Redis, documented in his long-running blog series on the pair. Previously, Kyle worked in higher education, helping to develop tools and technologies for academics and administrators.

Simply put, a session store is a “chunk” of data that is connected to a user of a service, stored separately from the primary database in order to provide stickiness without direct, constant access to the database. “User,” in this case, is loosely defined. A user could be as simple as a mere visitor to a webpage, a user with an account in a phone app or even another service that accesses data via an API.

A session is often persisted between requests through a cookie. The server gives the client the cookie and the client stores it, then sends it back with each subsequent request. The server then uses the cookie string as a token to look up the data associated with that user.
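The cookie round trip can be sketched in a few lines. This is a minimal illustration, not a production design: the cookie name `sid`, the in-memory `SESSIONS` dictionary and both function names are assumptions made for the example, and a real service would keep the data in a session store rather than in process memory.

```python
import secrets
from http.cookies import SimpleCookie

# Stand-in for the session store: session id -> session data (hypothetical).
SESSIONS = {}

def start_session(data):
    """Create a session and return the Set-Cookie header value for it."""
    session_id = secrets.token_urlsafe(16)  # unguessable token
    SESSIONS[session_id] = dict(data)
    cookie = SimpleCookie()
    cookie["sid"] = session_id
    cookie["sid"]["httponly"] = True
    return cookie["sid"].OutputString()

def load_session(cookie_header):
    """Look up session data from the Cookie header the client sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("sid")
    return SESSIONS.get(morsel.value) if morsel else None

set_cookie = start_session({"name": "Robin"})
# On later requests the client echoes only the name=value pair.
cookie_header = set_cookie.split(";")[0]
session = load_session(cookie_header)
```

Here `session` holds the stored data for Robin, while an unknown or missing token yields `None`, which the server would treat as a new visitor.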

Oftentimes, session data is the data most frequently used by a single user (and by that user alone). It is also often a critical requirement for rendering a page or view.

Many times, session data is ephemeral and duplicated in some other data store. However, as we will explore shortly, this doesn’t always have to be true.

What Is an Intelligent Session Store?

In this article, we will be exploring session stores that move beyond “dumb” stores of data. The intelligent data in such a session store might be calculated, inferred or otherwise not directly supplied by the user of the service. In this way, you can store the traditional session data (username, preferences or other common stateful data) alongside intelligent data.

Some examples of the intelligent data we’ll be using are:

  • Group notifications — providing a single notification to a specific slice of users
  • Content surfacing data — a dataset that can be leveraged for users to be pointed to additional content
  • Activity data — information about the users’ behavior and usage of the service
  • Personalization data — data that can be used to make the service more specific and relevant to each user

Why Does Any Service Need a Session Store?

In a very simple world, one does not need a session store. In the simplest setup, the entire internet connects to your single web server, which is in turn backed by a single database. On every page view, the web server connects to the database and grabs the requisite information.

This might work great for small use cases, but when your website becomes busy, you’re likely to start having problems with the database becoming slow. After all, querying (or worse, writing to) most databases is very resource intensive compared with the duties of a web server. To remedy this situation, you alter your website code to start using a file-based session store. This is the simplest of the session store strategies: effectively, individual sessions are stored in text files that reside on your web server. The web server software then reads or manipulates the session data file directly.

This helps a great deal because the database is much less frequently accessed, and thus is much quicker. However, when your website grows even more, another problem arises. At this point, you’ve limited the number of requests being made to the database, but with all the file I/O, in addition to the normal duties of a web server, the web server itself starts to strain under the pressure of your popular site. The common solution to this conundrum is to add more web servers and a load balancer. The load balancer distributes traffic evenly to all web server boxes.

This solution might work well during testing, but actual users will probably have complaints. Let’s examine what happens when a user with a session that stores their name gets directed to two different web servers with a file-based session store.

 

As the graphic above illustrates, when the web server array is storing session data on a file and the load balancer just distributes to the next web server, the session data may be stored on a different server in a text file. As a result, the service doesn’t know that the user is called Robin. Let’s explore a better solution.
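The failure is easy to reproduce in a toy simulation. In this sketch, the `WebServer` class, the `round_robin` helper and the session ID `abc123` are all hypothetical, and plain dictionaries stand in for both the per-server session files and a shared store.

```python
from itertools import cycle

class WebServer:
    """A web server that keeps sessions in a store (its own, unless one is shared)."""
    def __init__(self, name, store=None):
        self.name = name
        self.store = {} if store is None else store

    def handle(self, session_id, set_name=None):
        session = self.store.setdefault(session_id, {})
        if set_name is not None:
            session["name"] = set_name
        return f"{self.name}: Hello, {session.get('name', 'stranger')}"

def round_robin(servers):
    """Return a function that sends each request to the next server in turn."""
    rotation = cycle(servers)
    return lambda *args, **kwargs: next(rotation).handle(*args, **kwargs)

# Local stores: the second request lands on a server that has never seen Robin.
local = round_robin([WebServer("web1"), WebServer("web2")])
local_first = local("abc123", set_name="Robin")
local_second = local("abc123")

# Shared store: both servers read and write the same session data.
shared_store = {}
shared = round_robin([WebServer("web1", shared_store), WebServer("web2", shared_store)])
shared_first = shared("abc123", set_name="Robin")
shared_second = shared("abc123")
```

With local stores, the second request is answered with “web2: Hello, stranger”; with the shared store, web2 greets Robin by name.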

In this example, we have a web server that stores the session data in Redis rather than an on-server text file. Also included is a session microservice that can manage our session’s intelligent session data. Of note in this architecture is that the web server is never directly connected to the microservice; everything is routed through, and stored in, the Redis database. Since Redis Enterprise is known for its ability to scale easily in order to accommodate gigantic amounts of traffic, this connection allows for near infinite growth of users, servers and microservice instances. Additionally, Redis Enterprise has extensive high-availability features, which are essential for the critical components of any architecture.

Why Use a Microservice?

Microservices have some very well known advantages over a monolithic approach to building a complex application. With microservices, you can more easily scale your application’s capacity, development and reliability. In this case, “development” refers to the developers/teams needed to implement complex constellations of functions that your application will need to cover.

Developers can be a picky bunch. One team may want to use an entirely different set of tools, languages and methodologies than another team. Merging these teams together can be difficult (or impossible) to efficiently accomplish. In addition, the larger a codebase grows, the harder it is to:

  • Understand
  • Test
  • Add features

By approaching a session store as a microservice, you can abstract the complexity of the store from the web serving layer and even use entirely different languages and tools. Indeed, you can precisely test a session store for appropriate behavior. Finally, you can add features independently from any other parts of the serving layer.

For capacity, microservices allow you to assign more infrastructure as needed, without having to scale out other parts of the architecture. In contrast to a monolithic approach, microservices allow infrastructure to be sized appropriately for discrete services. With our session microservice example, we can add significant extra functionality without worrying about it becoming a chokepoint later as we can scale out. The opposite is also true: if other parts of the architecture are resource intensive, we can keep the session management layer leaner.

Finally, microservices can add reliability to our application. Building microservices intrinsically requires attention to failure situations. Indeed, you can even build in functionality to allow for hot restarts of microservices or even complete failures of a particular microservice, causing your application to degrade rather than hard fail.

Transport Mechanisms

For our session store, we’ll be using Redis in two ways. Firstly, we’ll be using Redis as you might expect: as a database. Secondly, we’ll be using Redis as a transport. Redis has some very interesting and lightweight mechanisms to manage data flow.

The most commonly known Redis functionality of this type is pub/sub. The publish and subscribe pattern in Redis is classified as “fire and forget,” which means that once you publish a message, it’s gone—there is no automatic and durable record of that message being published. Much like the classic thought experiment:

“If a tree falls in a forest and no one is around to hear it, does it make a sound?”

The Redis equivalent would be:

“If a message is published to a channel and no one is subscribed, does it exist?”

Unlike the thought experiment, in Redis we have a straightforward answer: No.

You can subscribe to a single channel (SUBSCRIBE) or, more interestingly, you can subscribe to a pattern (PSUBSCRIBE) which monitors channels based on glob wildcards.
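To make the fire-and-forget behavior concrete, here is a small in-memory model of pattern subscriptions. It is not Redis itself: the `FireAndForgetBroker` class is hypothetical, and Python’s `fnmatchcase` only approximates Redis glob matching (the two agree for the simple `pageview:?*` pattern used in this article). As in Redis, `publish` reports how many subscribers received the message.

```python
from fnmatch import fnmatchcase

class FireAndForgetBroker:
    """In-memory model of pattern pub/sub with no message retention."""
    def __init__(self):
        self.pattern_subscribers = []  # (pattern, callback) pairs

    def psubscribe(self, pattern, callback):
        self.pattern_subscribers.append((pattern, callback))

    def publish(self, channel, message):
        """Deliver to current subscribers only; return how many received it."""
        delivered = 0
        for pattern, callback in self.pattern_subscribers:
            if fnmatchcase(channel, pattern):
                callback(channel, message)
                delivered += 1
        return delivered

broker = FireAndForgetBroker()
received = []

# Nobody is subscribed yet, so this message is dropped for good.
lost = broker.publish("pageview:sess1:req1", "hello")

broker.psubscribe("pageview:?*", lambda ch, msg: received.append((ch, msg)))
kept = broker.publish("pageview:sess1:req2", "hello again")

# "?" requires at least one character after the colon, so this does not match.
empty_suffix = broker.publish("pageview:", "no id")
```

The first message is never seen by the later subscriber, which is exactly the “does it exist?” answer given above.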

The other pattern we’ll be using is blocking lists. Lists are ordered collections of items that can easily be pushed to and popped from. A blocking list is the same underlying data type but with a twist: when the list is empty, a blocking pop (such as BLPOP) makes the calling client wait until either another client pushes an item into the list or a timeout elapses. Only the waiting client’s connection is blocked; the server keeps executing commands from other clients.
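The semantics of a blocking pop can be modeled with a condition variable. This is a sketch under stated assumptions rather than how Redis implements it: the `BlockingList` class is hypothetical, its method names merely echo the Redis commands, and its timeout is always a positive number of seconds (in Redis, a timeout of 0 means wait indefinitely).

```python
import threading
from collections import deque

class BlockingList:
    """In-memory model of a list with a blocking left pop."""
    def __init__(self):
        self._items = deque()
        self._changed = threading.Condition()

    def rpush(self, item):
        with self._changed:
            self._items.append(item)
            self._changed.notify()  # wake one waiting popper, if any

    def blpop(self, timeout):
        """Wait up to `timeout` seconds for an item; return None on timeout."""
        with self._changed:
            if not self._changed.wait_for(lambda: len(self._items) > 0, timeout):
                return None
            return self._items.popleft()

jobs = BlockingList()

# The list is empty and nobody pushes, so this call gives up after the timeout.
timed_out = jobs.blpop(timeout=0.05)

# Another "client" pushes shortly after the next pop starts waiting.
threading.Timer(0.05, jobs.rpush, args=["done"]).start()
result = jobs.blpop(timeout=2.0)  # returns as soon as the push lands
```

Only the thread calling `blpop` waits; the pushing thread is free to run, which mirrors how a blocked Redis client does not hold up other clients.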

With these two patterns, we can create a flexible microservice transport that supports both critical and non-critical pathways. Let’s examine the non-critical pathway first:

 

Step | Serving Layer | Session Microservice
0 | - | Pattern subscribed to anything with “pageview:?*”
1 | Gets HTTP request | -
2 | Publishes Redis message to a channel “pageview:[session id]:[request id]” | -
3 | Renders route | Gets message from serving layer
4 | Closes connection | Starts processing
5 | - | Ends processing
6 | - | Adds message to a list “async:pageview:[session id]:[request id]”

Step 0 occurs at some point prior to the request being received.

In this sequence, the serving layer is not waiting for the session microservice to respond. Indeed, the serving layer only sends out the message; it has no knowledge of whether or not this message was received by the intended recipient. This type of sequence is great for situations where the data being stored in the session is completely trivial and not critical to the rest of the rendering. The advantage of using this pattern is that the only performance impact on the serving layer is a single lightweight command: PUBLISH, whose cost depends only on the number of clients subscribed to the channel and the number of subscribed patterns. The session microservice processing can occur during or even after the response has been sent.
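The non-critical pathway might look roughly like the following. Everything here is an illustrative assumption: the function names, the JSON payloads and the `FakeRedis` test double are invented for the sketch, and IDs are assumed to contain no colons. `FakeRedis` exposes only `publish` and `rpush`, named after the corresponding redis-py client methods, so the same functions could be pointed at a real client.

```python
import json

class FakeRedis:
    """Test double exposing only the two calls used here (names match redis-py)."""
    def __init__(self):
        self.published, self.lists = [], {}

    def publish(self, channel, message):
        self.published.append((channel, message))
        return 0  # the sender learns nothing useful about delivery

    def rpush(self, key, value):
        self.lists.setdefault(key, []).append(value)
        return len(self.lists[key])

def handle_request(client, session_id, request_id, path):
    """Serving layer (steps 1 to 4): publish the page view, then render without waiting."""
    channel = f"pageview:{session_id}:{request_id}"
    client.publish(channel, json.dumps({"path": path}))
    return f"<html>rendered {path}</html>"

def on_pageview(client, channel, payload, process):
    """Session microservice (steps 3 to 6): process one message, then record the result."""
    _, session_id, request_id = channel.split(":", 2)
    result = process(session_id, json.loads(payload))
    client.rpush(f"async:pageview:{session_id}:{request_id}", json.dumps(result))

client = FakeRedis()
html = handle_request(client, "sess1", "req1", "/home")

# Later, and independently of the response, the microservice handles the message.
channel, payload = client.published[0]
on_pageview(client, channel, payload, lambda sid, event: {"seen": event["path"]})
```

Note that `handle_request` returns its HTML before `on_pageview` ever runs; nothing in the serving layer depends on the microservice’s result.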

Now, for the critical path, we’ll see how the microservice layer plays a larger role.

Step | Serving Layer | Session Microservice
0 | - | Pattern subscribed to anything with “pageview:?*”
1 | Gets HTTP request | -
2 | Publishes Redis message to a channel “pageview:[session id]:[request id]” | -
3 | Blocks with BLPOP on key “async:pageview:[session id]:[request id]” | Gets message from serving layer
4 | - | Starts processing
5 | - | Ends processing
6 | - | Adds message to a list “async:pageview:[session id]:[request id]”
7 | Response from blocking command; gets data from list | -
8 | Renders route | -
9 | Closes connection | -

Step 0 occurs at some point prior to the request being received.

The two sequences are the same up until step 3. In the critical pathway, step 3 blocks the serving layer’s Redis client (note: you either have to create a Redis client connection per request or, better, use a connection pool). The client sits idle until the microservice layer adds an item to the list or the timeout elapses (the timeout can be a very low value).

A very important point is that both the session and the request have unique IDs. Because of these unique IDs, there should only ever be one client blocking on any given “async:*” list. The microservice does its processing, then pushes a message to this unique list. This message can simply signal to the serving layer that the microservice has finished, or it can itself contain information.

This sequence is useful when you need to confirm that the microservice has completed, or when the rest of your steps depend on something the microservice has performed. As you might imagine, this sequence has relatively more overhead and potential latency, but keep in mind that these are very simple, lightweight operations that most likely complete on a sub-millisecond timescale (not counting, of course, whatever processing time the microservice requires).
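A rough sketch of the critical pathway follows, again built on assumptions: the `FakeRedis` stand-in, the function names and the message contents are hypothetical, and IDs are assumed to contain no colons. The stand-in mimics the shape of the redis-py calls `publish`, `rpush` and `blpop` (which returns a key and value pair, or `None` on timeout), but it only supports positive timeouts and runs the “microservice” on a thread instead of in a separate process.

```python
import queue
import threading

class FakeRedis:
    """Thread-safe stand-in exposing publish, rpush and blpop (redis-py names)."""
    def __init__(self):
        self._lists = {}
        self._lock = threading.Lock()
        self.subscriber = None  # callable(channel, message), set by the microservice

    def _list(self, key):
        with self._lock:
            return self._lists.setdefault(key, queue.Queue())

    def publish(self, channel, message):
        if self.subscriber is None:
            return 0  # fire and forget: nobody listening, message is gone
        threading.Thread(target=self.subscriber, args=(channel, message)).start()
        return 1

    def rpush(self, key, value):
        self._list(key).put(value)

    def blpop(self, key, timeout):
        try:
            return key, self._list(key).get(timeout=timeout)
        except queue.Empty:
            return None

def handle_request(client, session_id, request_id, timeout=0.5):
    """Serving layer: publish, block on the reply list, then render."""
    client.publish(f"pageview:{session_id}:{request_id}", "view")
    reply = client.blpop(f"async:pageview:{session_id}:{request_id}", timeout)
    if reply is None:
        return "rendered without session data"  # degrade rather than hard fail
    _key, value = reply
    return f"rendered with {value}"

def microservice(client):
    """Build the handler that processes a page view and pushes the reply."""
    def on_message(channel, message):
        _, session_id, request_id = channel.split(":", 2)
        client.rpush(f"async:pageview:{session_id}:{request_id}", f"views for {session_id}")
    return on_message

client = FakeRedis()
no_service = handle_request(client, "sess1", "req1", timeout=0.05)
client.subscriber = microservice(client)
with_service = handle_request(client, "sess1", "req2")
```

The timeout is what keeps the critical pathway from becoming a hard dependency: when no microservice answers, the request still completes, just without the extra data.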

From Submitted to Calculated Data

Most session stores simply store data that is at some point submitted by the user: account details, preferences, or maybe even shopping carts. In this way, the data is rather simple: effectively, the database is merely holding a value and serving it back to the user. Session stores can, however, be used for more sophisticated data. In the next part of this series, we will take a look at some data structures available in Redis and/or Redis modules (later in this series, we’ll illustrate how to use them in a session store situation). With all of these structures, we’ll describe their properties as they relate to use and storage characteristics, without diving too deeply into the algorithm implementation or theory.

Feature image via Pixabay.

