Using AWS Storage for Serverless Microservices
The phrase “microservices architectures in the cloud” is often heard together with the term “serverless.” Serverless does not mean that there are no servers; rather, it is an architectural practice of using fully managed cloud services as building blocks for microservices. These fully managed services are also called “serverless services.”
Serverless brings several advantages: a consumption-based cost model, plus security, reliability, performance, and efficiency that are largely managed by the cloud provider. The amount of manual configuration still required depends on the specific serverless service.
The following list contains serverless services commonly used to build microservices on Amazon Web Services.
- AWS Lambda — Serverless Compute (Running your Code)
- Amazon S3 — Serverless Object Storage (Store Files in Cloud)
- Amazon CloudFront — Serverless Proxy and CDN (Routing Requests and Caching)
- Amazon API Gateway — Serverless API Gateway (Gateway to Microservices)
- Amazon DynamoDB — Serverless NoSQL Database
- Amazon Aurora Serverless — Serverless Relational Database Service with Amazon RDS
- Amazon Cognito User Pools — Serverless Identity Provider
It is also important to understand that serverless is still an evolving field of technology. At the time of this writing, certain use cases still require combining serverless with other cloud services. For instance, a microservice that frequently modifies a large file is better served by file or block storage such as Amazon EFS or Amazon EBS, which at the time of this writing requires an Amazon EC2 instance, since Lambda cannot connect to them.
Storage for Serverless Microservices
AWS offers several storage options that can be used directly as building blocks for microservices. These include persistent storage services such as Amazon DynamoDB, Amazon Aurora Serverless, and Amazon S3, and temporary storage services such as Amazon Kinesis and Amazon SQS.
Connecting AWS Lambda and Storage
When building serverless microservices, one of the main advantages of these storage solutions is their ability to interact with the serverless compute service, AWS Lambda. These interactions come in two flavors. First, most of these storage services can be accessed directly from Lambda functions to read or write data, which is the most common use case. Second, a state change in a storage service can trigger a Lambda function. This allows building complex data flows, not only between storage and Lambda but also across different storage solutions at various stages of the data flow.
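As a rough illustration of both flavors in one function, the sketch below shows a Lambda-style handler that is triggered by queued messages and writes each one to a table. It assumes an SQS-style event (messages under "Records", payload in "body") and a table object that exposes put_item(Item=...), as a boto3 DynamoDB Table resource does. The make_handler factory, the InMemoryTable stand-in, and the order fields are hypothetical names used only so the example runs locally.

```python
import json


def make_handler(table):
    """Build a Lambda-style handler bound to a table object.

    `table` is assumed to expose put_item(Item=...); any object with that
    method works for this sketch.
    """
    def handler(event, context):
        written = 0
        # Trigger side: a queue-triggered invocation delivers messages
        # under "Records", with the payload in each record's "body".
        for record in event.get("Records", []):
            order = json.loads(record["body"])
            # Direct access side: the function writes to persistent storage.
            table.put_item(Item={"order_id": order["order_id"],
                                 "status": order["status"]})
            written += 1
        return {"written": written}
    return handler


class InMemoryTable:
    """Hypothetical stand-in for a DynamoDB table, for local runs only."""
    def __init__(self):
        self.items = []

    def put_item(self, Item):
        self.items.append(Item)


sample_event = {"Records": [
    {"messageId": "1", "body": json.dumps({"order_id": "o-1", "status": "NEW"})},
    {"messageId": "2", "body": json.dumps({"order_id": "o-2", "status": "PAID"})},
]}
orders_table = InMemoryTable()
result = make_handler(orders_table)(sample_event, None)
print(result)
```

Passing the table in, rather than creating a client inside the handler, keeps the storage dependency swappable for tests.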
Different Storage Use Cases
These storage services support a range of use cases: data storage, deployment storage, and data flow connectors that leverage the capabilities of AWS storage services. A few example use cases are listed below to give a broader sense of how AWS storage services are used in building serverless microservices.
NoSQL vs. Relational
Some use cases require choosing between Amazon DynamoDB (NoSQL) and Amazon Aurora Serverless (relational). For these, it is important to analyze not only the data models and relationships but also how the data is queried.
One common mistake I have seen occurs when people move from relational to NoSQL databases and try to model relational structures in NoSQL, which makes things very complex.
It is also important to understand that storage is becoming cheaper relative to compute, so storing computed or queried results and retrieving them directly is likely to be more cost-efficient than querying and processing each time.
Therefore, it is important to address these concerns properly when designing the database schema, data flow, and processing pipeline.
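The idea of storing computed results instead of recomputing them can be sketched as a simple read-through lookup. This is a minimal illustration, not a recommended schema: the get_monthly_total function, the customer_id/month key, and the InMemoryTable stand-in are all hypothetical, and the table is only assumed to offer get_item(Key=...) and put_item(Item=...) in the style of a boto3 DynamoDB Table resource.

```python
def get_monthly_total(table, customer_id, month, compute):
    """Return a stored result if present; otherwise compute and store it."""
    key = {"customer_id": customer_id, "month": month}
    stored = table.get_item(Key=key).get("Item")
    if stored is not None:
        # Cheap path: direct retrieval of a previously computed result.
        return stored["total"]
    # Expensive path: run the query or computation once, then persist it.
    total = compute(customer_id, month)
    table.put_item(Item={**key, "total": total})
    return total


class InMemoryTable:
    """Hypothetical stand-in for a key-value table, for local runs only."""
    def __init__(self):
        self._rows = {}

    def get_item(self, Key):
        row = self._rows.get((Key["customer_id"], Key["month"]))
        return {"Item": row} if row is not None else {}

    def put_item(self, Item):
        self._rows[(Item["customer_id"], Item["month"])] = Item


calls = []


def expensive_total(customer_id, month):
    # Placeholder for a costly query or aggregation.
    calls.append((customer_id, month))
    return 1250


results_table = InMemoryTable()
first = get_monthly_total(results_table, "c-1", "2020-01", expensive_total)
second = get_monthly_total(results_table, "c-1", "2020-01", expensive_total)
print(first, second, len(calls))
```

The second call is served from storage, so the expensive function runs only once. A real design would also need a rule for invalidating stored results when the underlying data changes.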
File Object Storage
If you are building microservices that need to store and retrieve file objects, consider using Amazon S3. One best practice is to let microservice consumers access Amazon S3 directly rather than going through the microservice gateway. However, it is also important to allow only authorized clients to access these objects. After validating access control, your microservice can grant access to the relevant objects by issuing a specialized URL or credential (e.g., Amazon CloudFront signed URLs or signed cookies, S3 presigned URLs, or AWS STS temporary credentials).
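A minimal sketch of the "validate, then hand out a URL" step might look like the following. It assumes the S3 client is passed in and offers generate_presigned_url in the style of a boto3 S3 client. The per-user key prefix rule, the issue_download_url function, and the FakeS3Client are hypothetical; the fake only returns a placeholder string so the example runs without AWS credentials.

```python
def issue_download_url(s3_client, user_id, bucket, key, expires_in=300):
    """Check access, then return a short-lived URL for one object."""
    # Hypothetical access rule: users may only read keys under their own prefix.
    if not key.startswith(f"users/{user_id}/"):
        raise PermissionError(f"{user_id} may not read {key}")
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # lifetime of the URL in seconds
    )


class FakeS3Client:
    """Hypothetical stand-in; in a deployment this would be a real S3 client."""
    def generate_presigned_url(self, ClientMethod, Params=None, ExpiresIn=3600):
        return (f"https://{Params['Bucket']}.example.invalid/"
                f"{Params['Key']}?method={ClientMethod}&expires={ExpiresIn}")


client = FakeS3Client()
url = issue_download_url(client, "u-1", "profile-files", "users/u-1/avatar.png")
print(url)

try:
    issue_download_url(client, "u-1", "profile-files", "users/u-2/avatar.png")
    denied = False
except PermissionError:
    denied = True
print(denied)
```

Keeping the expiry short limits how long a leaked URL remains useful, at the cost of clients having to request a new one more often.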
This approach can be used both to read objects from and to write objects directly to Amazon S3. If object metadata needs to be stored for queries, a common pattern is to use Amazon S3 triggers to invoke a Lambda function that saves the metadata to Amazon DynamoDB.
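The metadata pattern can be sketched as a function that reads the S3 event notification and writes one item per object. The event fields used here (bucket name, object key, size, eTag, eventTime) follow the S3 notification structure, and object keys arrive URL-encoded, so they are decoded first. The item attribute names, make_handler, and InMemoryTable are hypothetical, and the table is only assumed to expose put_item(Item=...).

```python
from urllib.parse import unquote_plus


def extract_metadata(event):
    """Turn an S3 event notification into a list of metadata items."""
    items = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        items.append({
            "bucket": s3["bucket"]["name"],
            # Keys are URL-encoded in the event (spaces arrive as "+").
            "object_key": unquote_plus(s3["object"]["key"]),
            "size": s3["object"].get("size", 0),
            "etag": s3["object"].get("eTag", ""),
            "event_time": record.get("eventTime", ""),
        })
    return items


def make_handler(table):
    """Bind the handler to a table exposing put_item(Item=...)."""
    def handler(event, context):
        items = extract_metadata(event)
        for item in items:
            table.put_item(Item=item)
        return {"saved": len(items)}
    return handler


class InMemoryTable:
    """Hypothetical stand-in for a DynamoDB table, for local runs only."""
    def __init__(self):
        self.items = []

    def put_item(self, Item):
        self.items.append(Item)


sample_event = {"Records": [{
    "eventTime": "2020-01-01T00:00:00.000Z",
    "s3": {
        "bucket": {"name": "profile-files"},
        "object": {"key": "reports/annual+summary.pdf", "size": 2048,
                   "eTag": "abc123"},
    },
}]}
metadata_table = InMemoryTable()
result = make_handler(metadata_table)(sample_event, None)
print(result, metadata_table.items[0]["object_key"])
```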
Pub-Sub Messaging for Data Storage Synchronization
A common pattern for replicating data across microservices uses pub-sub messaging middleware. On AWS, the serverless pub-sub messaging service Amazon SNS can be used for this purpose. Each microservice can connect to the SNS topics it is interested in as a publisher, a subscriber, or both. This can be used in conjunction with triggers or streams from services such as Amazon DynamoDB, Aurora, and Kinesis.
For example, let’s look at a scenario where user profile data is shared between two microservices: one for User Profile and one for User Analytics. The User Profile service uses Amazon DynamoDB and the User Analytics service uses Amazon RDS for their respective internal storage. If a user’s name is updated in the User Profile service, the User Analytics service should also receive this change. This can be handled with an SNS topic for user data changes, where the User Profile microservice publishes data changes and the User Analytics microservice subscribes to them.
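One way this scenario could be wired up is sketched below, assuming the User Profile table has a DynamoDB stream that includes the new item image, and that both sides run as Lambda-style functions. The topic ARN, the user_id and name attributes, the function names, and the FakeSns class are hypothetical. A plain dict stands in for the analytics database, and the fake delivers messages synchronously, which real SNS does not.

```python
import json

# Hypothetical topic ARN, for illustration only.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:user-data-change"


def publish_profile_changes(stream_event, sns_client, topic_arn=TOPIC_ARN):
    """Publisher side: forward profile updates from the table stream."""
    published = 0
    for record in stream_event.get("Records", []):
        if record.get("eventName") != "MODIFY":
            continue
        # Stream images use typed attribute values, e.g. {"S": "text"}.
        image = record["dynamodb"]["NewImage"]
        message = {"user_id": image["user_id"]["S"], "name": image["name"]["S"]}
        sns_client.publish(TopicArn=topic_arn, Message=json.dumps(message))
        published += 1
    return published


def apply_profile_changes(sns_event, analytics_users):
    """Subscriber side: apply each change to the analytics store."""
    for record in sns_event.get("Records", []):
        change = json.loads(record["Sns"]["Message"])
        analytics_users[change["user_id"]] = change["name"]


class FakeSns:
    """Hypothetical stand-in that delivers straight to one subscriber."""
    def __init__(self, subscriber):
        self.subscriber = subscriber

    def publish(self, TopicArn, Message):
        self.subscriber({"Records": [{"Sns": {"TopicArn": TopicArn,
                                              "Message": Message}}]})


analytics_users = {"u-1": "Alice"}  # stands in for the analytics database
sns = FakeSns(lambda event: apply_profile_changes(event, analytics_users))
stream_event = {"Records": [{
    "eventName": "MODIFY",
    "dynamodb": {"NewImage": {"user_id": {"S": "u-1"},
                              "name": {"S": "Alice Smith"}}},
}]}
published = publish_profile_changes(stream_event, sns)
print(published, analytics_users)
```

Neither service reads the other's database; the topic is the only shared contract, so either side can change its internal storage independently.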
Streaming Data Storage and Processing
For microservices that receive a large number of events, Amazon Kinesis Data Streams can be used to temporarily store those events while AWS Lambda processes them in batches. The advantages are in cost and efficiency, since compute resources process the events in batches rather than one at a time.
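Batch processing of stream records can be sketched as a single handler invocation that walks all records in the batch. In a Kinesis-triggered Lambda event, each record carries its payload base64-encoded under kinesis.data. The JSON payload shape with a "type" field, the counting logic, and the make_record helper are assumptions made for this example.

```python
import base64
import json


def handler(event, context):
    """Count events by type across one batch of stream records."""
    counts = {}
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded in the Lambda event.
        payload = base64.b64decode(record["kinesis"]["data"])
        item = json.loads(payload)
        counts[item["type"]] = counts.get(item["type"], 0) + 1
    return counts


def make_record(item):
    """Hypothetical helper that builds a record shaped like a Kinesis one."""
    data = base64.b64encode(json.dumps(item).encode("utf-8")).decode("ascii")
    return {"kinesis": {"data": data}}


sample_event = {"Records": [
    make_record({"type": "click", "user": "u-1"}),
    make_record({"type": "view", "user": "u-2"}),
    make_record({"type": "click", "user": "u-3"}),
]}
batch_counts = handler(sample_event, None)
print(batch_counts)
```

One invocation handles the whole batch, so per-invocation overhead is shared across many events instead of being paid for each one.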
Data plays a major role in designing microservices. It affects not only the internal design of each microservice but also how microservices connect with each other, both internally and externally. Using the right storage option for the data is equally important.
Selecting the right storage options affects overall reliability, performance, efficiency, and cost while providing the agility needed for change.
Therefore, when building serverless microservices, storage deserves close attention, not only for data storage but also for data flow, interactions, and the deployment environment itself.