Understanding AWS Cognito User and Identity Pools for Serverless Apps
Amazon Cognito is Amazon Web Services’ service for managing user authentication and access control. Although it was originally associated with AWS’s mobile backend-as-a-service (MBaaS) offering, it has recently gained the attention of the serverless crowd, who are looking for ways to offload user management concerns to a service provider. Cognito solves this problem by providing a fully managed, scalable and cost-effective sign-up/sign-in service, but at the cost of a steep learning curve. One reason is that Cognito consists of two services, User Pools and Identity Pools (a.k.a. Federated Identities), that look similar on the surface but are different under the hood. Both address the same broad problem (authentication and authorization), but they do so in very different ways. They can also be used separately or together, which provides flexibility but is also a source of confusion at first.
In this article, I’ll provide a gentle introduction to User Pools and Identity Pools, including the nuanced relationship between them. Before we dive into the explanation, however, we first need to explain two core security concepts: authentication vs. authorization and identity providers.
Authentication vs. Authorization
In the security world, the terms “authentication” and “authorization” have very specific meanings. Authentication is the process of verifying a user’s identity. Most commonly, users authenticate with a username (which identifies the user) and a password (which confirms the user is who they claim to be). Authorization, in contrast, is the process of granting users access to specific resources after they have been authenticated. For example, users might be placed into one or more groups based on their job title, and the application then determines which features are available to them based on their group membership.
Identity Providers
An Identity Provider is a service that manages authentication, providing a user login and the ability to verify a user’s identity. AWS Cognito has its own Identity Provider (using User Pools, which are explained below), but it can also integrate with well-established third-party Identity Providers like Facebook and Google. Additionally, Cognito can integrate with any Identity Provider that implements the SAML 2.0 or OpenID Connect (OIDC, which builds on OAuth 2.0) protocols. The process of integrating with a third party for authentication is called Federation.
User Pools vs. Identity Pools — Understanding the Difference
As we mentioned earlier, AWS Cognito is comprised of two separate, but related, services: User Pools and Identity Pools (also called Federated Identities). User Pools provide a user directory for your application, including all the bells and whistles that come with user management, like sign-up, sign-in, group management, etc. User Pools also provide your app with information like the user’s ID and group membership, so that your code can handle authorization. Identity Pools, in contrast, are used to assign IAM roles to users who authenticate through a separate Identity Provider. Because these users are assigned an IAM role, they each have their own set of IAM permissions, allowing them to access AWS resources directly.
Because Identity Pools map a user from an Identity Provider to an IAM role, they essentially allow you to delegate authorization for AWS resources to AWS itself. This is the critical distinction between User Pools and Identity Pools. User Pools (by themselves) don’t deal with permissions at the IAM level. Rather, they provide information like group membership and the user’s ID to your app, so you can deal with authorization yourself. Identity Pools, in contrast, grant users permissions at the IAM level. This means that Identity Pools allow for a much more granular set of permissions with respect to AWS services.
Let’s use an example to illustrate the distinction. Say you’re developing a serverless app using Cognito and Lambda. If you used User Pools to manage authentication, then you could configure API Gateway to pass through the user’s ID and group membership to your application. This would allow your code to determine if the user has sufficient permissions to access the requested functionality. However, the IAM permissions used to access the underlying AWS resources, like DynamoDB, would come from the Lambda execution role. All users who access your app would be operating under the same IAM role, and it would be up to you to make sure the right users get access to the right resources.
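To make the User Pools scenario concrete, here is a minimal sketch of a Lambda handler doing its own authorization check. It assumes a REST API in API Gateway with a Cognito User Pool authorizer and Lambda proxy integration, in which case the token claims typically arrive under requestContext.authorizer.claims. The group name "managers" and the function names are hypothetical, and the exact shape of the groups claim can vary by integration type, so the code normalizes it rather than relying on one format.

```python
import json

REQUIRED_GROUP = "managers"  # hypothetical group name, for illustration only


def extract_identity(event):
    """Return (user_id, groups) from an API Gateway proxy event.

    Assumes the Cognito User Pool authorizer has placed the token claims
    under requestContext.authorizer.claims.
    """
    request_context = event.get("requestContext") or {}
    authorizer = request_context.get("authorizer") or {}
    claims = authorizer.get("claims") or {}

    user_id = claims.get("sub")
    raw_groups = claims.get("cognito:groups", [])
    # The groups claim may arrive as a list or as a comma-separated string,
    # so normalize both forms into a list of names.
    if isinstance(raw_groups, str):
        groups = [g.strip() for g in raw_groups.split(",") if g.strip()]
    else:
        groups = list(raw_groups)
    return user_id, groups


def handler(event, context):
    user_id, groups = extract_identity(event)
    if user_id is None:
        return {"statusCode": 401, "body": json.dumps({"message": "Not authenticated"})}
    if REQUIRED_GROUP not in groups:
        return {"statusCode": 403, "body": json.dumps({"message": "Forbidden"})}
    # Any AWS calls made from here (e.g. to DynamoDB) run with the Lambda
    # execution role, not with permissions specific to this user.
    return {"statusCode": 200, "body": json.dumps({"userId": user_id})}
```

Note that the authorization decision lives entirely in the application code here; IAM only sees the Lambda execution role.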
However, if your application was using Identity Pools, then AWS would assign the user to an IAM role, and you could flow the permissions associated with that role through the application. This would mean, for example, that the user could access DynamoDB with her own IAM permissions, rather than the application-wide permissions that come from the Lambda execution role.
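As an illustration of what per-user IAM permissions can look like, the sketch below builds a policy document that could be attached to the role an Identity Pool assigns to authenticated users. It relies on the IAM policy variable ${cognito-identity.amazonaws.com:sub} together with the dynamodb:LeadingKeys condition key, which limits each user to items whose partition key equals their own Cognito identity ID. The function name, the list of actions and the table ARN are assumptions chosen for the example.

```python
import json


def per_user_dynamodb_policy(table_arn):
    """Build an IAM policy document that limits a user to their own items.

    The table is assumed to use the Cognito identity ID as its partition key.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query", "dynamodb:PutItem"],
                "Resource": table_arn,
                "Condition": {
                    # Only allow requests whose partition key values all match
                    # the caller's Cognito identity ID.
                    "ForAllValues:StringEquals": {
                        "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"]
                    }
                },
            }
        ],
    }


# Hypothetical table ARN, for illustration only.
policy = per_user_dynamodb_policy("arn:aws:dynamodb:us-east-1:123456789012:table/UserNotes")
policy_json = json.dumps(policy, indent=2)
```

With a policy like this on the role, DynamoDB itself rejects a request for another user’s items, regardless of what the application code does.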
User Pools + Identity Pools: Using Them Together
Now that we have a better understanding of what differentiates User Pools and Identity Pools, let’s explore how the two services work together. As mentioned earlier, the main purpose of an Identity Pool is to map users from an Identity Provider to an IAM role. An Identity Pool doesn’t have its own user directory, it just assigns users from other user directories to an IAM role in your AWS environment. Often, the Identity Provider is an external third-party, but it can also be your app’s own user directory if it’s implemented as a Cognito User Pool. Since a Cognito User Pool is itself an Identity Provider, you can configure your Identity Pool to use your app’s own User Pool as one of its Identity Providers. This gives you the ability to authenticate users with your User Pool and assign them an IAM role using an Identity Pool.
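The hand-off from User Pool to Identity Pool can be sketched roughly as follows. After a user signs in to the User Pool, the app holds an ID token; that token is then passed to the Identity Pool under a provider name of the form cognito-idp.<region>.amazonaws.com/<user pool ID>, and temporary AWS credentials come back. The function names and parameters are hypothetical, and identity_client is assumed to be a boto3 "cognito-identity" client created by the caller (it is passed in so the sketch itself has no third-party imports).

```python
def user_pool_provider_name(region, user_pool_id):
    """Return the provider name an Identity Pool uses for a Cognito User Pool."""
    return f"cognito-idp.{region}.amazonaws.com/{user_pool_id}"


def exchange_token_for_credentials(identity_client, identity_pool_id,
                                   region, user_pool_id, id_token):
    """Trade a User Pool ID token for temporary AWS credentials.

    identity_client is assumed to behave like boto3.client("cognito-identity").
    """
    logins = {user_pool_provider_name(region, user_pool_id): id_token}

    # Step 1: look up (or create) the identity for this login in the Identity Pool.
    identity = identity_client.get_id(IdentityPoolId=identity_pool_id, Logins=logins)

    # Step 2: get temporary credentials for the IAM role mapped to that identity.
    response = identity_client.get_credentials_for_identity(
        IdentityId=identity["IdentityId"], Logins=logins
    )
    return response["Credentials"]
```

In real use, the caller would create the client with something like boto3.client("cognito-identity", region_name=region) and use the returned AccessKeyId, SecretKey and SessionToken to sign requests to other AWS services.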
The confusing thing is that the term “Federated Identity” (which is used synonymously with Identity Pool in the AWS documentation) implies a service that is designed to integrate with external Identity Providers. So, it’s a little strange at first to see Cognito itself as an available Identity Provider.
A similar source of confusion is caused by the fact that you can integrate external social providers like Facebook and Google with User Pools directly, without using Federated Identities at all. Using this approach, users can sign up and sign in to your app with their Facebook login, but they never get assigned an IAM role. Instead, the User Pool service automatically assigns these users to a Facebook group, and then maps the attributes of their Facebook profile (e.g. name, email, location) to the user attributes you’ve defined in your User Pool. Again, the key distinction here is not whether the Identity Provider is internal or external, but rather whether an IAM role is assigned to the user after authentication.
User Pools or Identity Pools or Both: Which Approach Is Best?
Cognito provides a great deal of flexibility when securing applications. This allows Cognito to be used more broadly, but it also means there are more moving parts to understand. Below are three questions you should ask when designing your Cognito architecture.
First, consider whether it makes sense to manage security at the IAM level. If you have important AWS resources whose access can easily be segregated using IAM permissions, then consider using Identity Pools. For example, if you store confidential information in S3, and each bucket stores information for a different department, then it might make sense to use Identity Pools to let AWS manage access to those buckets for you via IAM. If, on the other hand, most of the information your app is managing resides in a Postgres database, then IAM permissions won’t help you much, as they don’t provide row- or column-level granularity within the database — only your application code can do that.
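For the S3 example, a per-department role might carry a policy along the lines of the sketch below, with the Identity Pool mapping each user to the role for their department. The bucket name, the function name and the chosen actions are assumptions made for illustration; a real setup would tailor them to its own buckets and access needs.

```python
def department_bucket_policy(bucket_name):
    """Build an IAM policy document granting access to one department's bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": bucket_arn,
            },
            {
                # Reading and writing apply to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"{bucket_arn}/*",
            },
        ],
    }


# Hypothetical bucket name, for illustration only.
finance_policy = department_bucket_policy("example-finance-documents")
```

Because no other bucket appears in the policy, a user holding this role has no access to another department’s bucket, and that rule is enforced by IAM rather than by application code.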
Second, determine whether your application needs to integrate with third-party Identity Providers. If it’s a small, internal business application, you likely don’t need third-party integration. Your employees will probably be fine managing stand-alone logins for your app. If, on the other hand, your app is a public, consumer-facing website that promotes self-registration, then giving users the ability to register with an existing social account is a must-have, as it will lead to more sign-ups.
Third, consider the trade-off between complexity and security. The more services you use, the more complex the development will be. If you can get away with using a User Pool with no third-party integration, you should do it, as this is the simplest option. However, if you’re dealing with highly confidential financial data that needs to be securely accessed by multiple external partners, then you should manage access at both the application level and the IAM level as a second layer of defense. This would require User Pools, Federated Identities and highly granular IAM permissions.
Once you understand the basic concepts behind User Pools and Identity Pools, you can start to appreciate the flexibility and power of Cognito for securing your applications. And it comes at an important time: the news is littered with accounts of data leaks resulting from broken authentication and busted access control. The more we understand application security — and how to use the security tools in our tool belt — the more secure our users’ data will be.