Cloud Native Identity and Access Management in Kubernetes
Cloud native applications deserve a cloud native identity and access management (IAM) system. In this article, I will highlight some cloud native principles and discuss their roles in deploying an IAM system. Further, I’ll demonstrate how a single IAM system can serve customized APIs in Kubernetes using cloud native principles and also show an API-first approach. Below, “IAM” is loosely used to refer to any type of IAM system, including a customer identity and access management (CIAM) system.
Four Principles for Cloud Native Applications
“Cloud native” refers to a software approach using the tools, services and technologies offered in cloud computing environments. Examples of cloud native technologies include containers, related platforms such as Kubernetes, and immutable infrastructure and declarative APIs with tools like Terraform. The goal is to take advantage of the scale, agility and resilience provided by modern cloud computing to build, deploy and manage dynamic applications. The following principles apply to any cloud native application but are essential for an IAM system.
Elasticity is the ability of a system to dynamically adapt its capacity in response to changing demands. Elastic systems scale up (“grow”) and scale down (“shrink”) depending on demand and policies. This flexibility is a primary driver of cloud computing. An IAM system may encounter demand spikes when many users log in simultaneously, for example, at the beginning of a work day or when launching a campaign. A cloud native approach allows for quickly allocating more resources, such as additional instances of the IAM system, without affecting the user experience.
Resilience describes the ability to handle failures and recover from faults. Applications should be designed to tolerate failures such as unexpected network latency. For example, on an architectural level, a resilient system might include redundant deployments in different availability zones, allowing one deployment to take over if the other fails. If the IAM system is down, no users, employees or customers can log in, which could significantly hurt a business’s revenue. Therefore, an IAM system must remain functional even when failures occur.
Observability provides visibility into the state of the system. It allows teams and tools to take action if required (like restarting an instance). Observability may be combined with traceability or auditability, and is therefore essential for compliance and security. It is important to know not only the operational state of the IAM system but also its security state. Observability is key to detecting fraudulent activity in real time and helping security teams react to security breaches.
Automation is the process of replacing manual tasks with automated processes. It includes DevOps features like continuous integration and deployment to deliver and change software on a regular basis. Such automation enables an IAM system to be deployed in a repeatable manner. In particular, it should be possible to automatically scale the IAM system and quickly replace or update instances.
Characteristics of a Cloud Native IAM
To fulfill the above principles, cloud native applications must adopt certain characteristics. For cloud native IAM systems, these common traits include:
- Independent services
- Standard interfaces
- Stateless components
- Environment parity
Cloud native is tightly coupled to microservice architecture, in which the features of an application are implemented by loosely coupled services. Each service has its own codebase and explicitly defined dependencies. Since the services are loosely coupled, a microservice architecture can scale horizontally. To scale horizontally means that one or more services can be duplicated to increase capacity if necessary. This task becomes particularly easy with containerized services.
Within an IAM system, these microservices typically include:
- Authentication service
- Token service
- User management service
All of these services should be able to scale independently, and any administrative or maintenance task should run separately from the application. That is, an IAM system should have a separate administrative service to manage the configuration. The administrative service could include a self-service portal for developers to register their OAuth 2.0 clients, for example. It can also be part of the continuous integration/continuous delivery (CI/CD) pipeline that triggers an update of the IAM system when configuration changes.
Microservices communicate over APIs, and a service commonly relies on many other backend services as part of its normal operation. A cloud native application should not make any distinction between local or third-party backend services but communicate over standard interfaces and protocols. In this way, it can maintain the loose coupling between services and resources.
With loose coupling and standard interfaces, any resource can be replaced during runtime without updating the related services. For an IAM system, these resources commonly include databases like credential stores, token stores or user stores. An IAM system must also integrate with email or SMS providers to send users one-time passwords (OTPs) or activation links.
As mentioned above, observability is vital for an IAM system. Therefore, a cloud native IAM system must support integration with observability tools. That includes publishing metrics in a standardized form, sending logs to the log-management solutions, and providing tracing and audit information.
Note that the IAM system itself is an important backend service for applications. As such, it should provide a standardized API. In other words, it should support standard protocols that other services can integrate with. OAuth 2.0 and OpenID Connect are two examples of protocols an IAM system is expected to support.
Each microservice should manage its own data, meaning that services should communicate via APIs and not share data stores. Therefore, the IAM’s authentication service, token service and user management service should store credentials, token data and user account data in separate locations. In addition to data isolation, cloud native applications should keep any stateful data, such as session data or other shared data, in backend services rather than in the service instances themselves. In this way, instances of services become disposable. They can start, stop or be replaced without losing data.
Stateless and disposable components are essential to automation and are key when deploying containers. Moreover, they simplify routing rules because a load balancer does not need to keep track of the state either. Consequently, authentication or authorization can continue seamlessly, even when a node of the IAM system gets torn down.
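To make the idea of disposable instances concrete, here is a minimal Python sketch. It assumes that all session state lives in a store outside the instances. The `SessionStore` and `AuthInstance` classes and their methods are hypothetical and only illustrate the principle; a dictionary stands in for a real database or cache.

```python
import secrets


class SessionStore:
    """Stand-in for an external store (database or cache); hypothetical."""

    def __init__(self):
        self._sessions = {}

    def save(self, session_id, data):
        self._sessions[session_id] = data

    def load(self, session_id):
        return self._sessions.get(session_id)


class AuthInstance:
    """One stateless instance of a hypothetical authentication service."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # the only place where state lives

    def start_login(self, username):
        # Persist the in-progress login outside the instance.
        session_id = secrets.token_urlsafe(16)
        self.store.save(session_id, {"user": username, "started_by": self.name})
        return session_id

    def complete_login(self, session_id):
        # Any instance can finish a login that another instance started.
        session = self.store.load(session_id)
        if session is None:
            raise KeyError("unknown session")
        return session["user"]


store = SessionStore()
node_a = AuthInstance("node-a", store)
session_id = node_a.start_login("alice")
del node_a  # the instance is torn down in the middle of the login

node_b = AuthInstance("node-b", store)
print(node_b.complete_login(session_id))  # prints "alice"
```

Because neither instance keeps anything in memory between requests, the load balancer is free to send the second request to any node.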
When working with cloud native applications, one recommendation is to separate the steps for building, running and deploying. Use a CI/CD tool to automate builds and deployments. For example, create container images for every build and deploy a new version of the system based on the new image.
Aim for the different environments (development, staging and production) to be as identical as possible. Once more, containers are a great tool for that purpose because you can easily reuse the same image in different environments.
Another good approach is to share the configuration between the environments but keep environment-specific configuration parameters in environment variables. If the configuration is file-based and does not contain secrets, it can easily be put into version control.
That way, the configuration can be changed outside the running application, and the CI/CD pipeline can take over administrative tasks. Then there is no need to enable administrative support in the IAM application, which reduces the attack surface and minimizes the risk of the environments diverging. At the same time, the version control system provides auditability for the configuration file.
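As a rough illustration of a shared configuration with environment-specific parameters, the following Python sketch fills a template from a set of variables. The configuration keys and the `IAM_*` variable names are assumptions made up for this example; they do not come from any particular IAM product.

```python
from string import Template

# Hypothetical shared configuration, kept under version control.
# It contains placeholders only, no secrets and no environment specifics.
CONFIG_TEMPLATE = """\
base_url: ${IAM_BASE_URL}
token_store_url: ${IAM_TOKEN_STORE_URL}
log_level: ${IAM_LOG_LEVEL}
"""


def render_config(template, env):
    """Fill environment-specific values into the shared template.

    Template.substitute raises KeyError when a variable is missing, so a
    misconfigured environment fails at deploy time instead of at run time.
    """
    return Template(template).substitute(env)


# In a real deployment, `env` would be os.environ; a plain dict is used
# here so the example is self-contained.
staging_env = {
    "IAM_BASE_URL": "https://login.staging.example.com",
    "IAM_TOKEN_STORE_URL": "postgres://db.staging.example.com/tokens",
    "IAM_LOG_LEVEL": "DEBUG",
}
print(render_config(CONFIG_TEMPLATE, staging_env))
```

The same template can then be rendered for production by the pipeline with a different set of variables, while the file in version control stays identical.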
Deploying a Cloud Native IAM in Kubernetes
As mentioned above, containers are a natural fit for cloud native apps. When used correctly, containers are self-contained, disposable, resilient and scalable. Kubernetes is the de facto platform for managing containers and containerized applications. It is, therefore, the first choice for managing a cloud native IAM system as well.
When deploying a cloud native IAM in Kubernetes, you get more power and control than when consuming a SaaS product. Obviously, you can choose the product for the IAM system. Some vendors, like Curity, use an elastic licensing model that does not add extra licensing costs when scaling automatically.
In addition, you can also control the deployment in Kubernetes and reduce the attack surface. For example, you can configure Kubernetes to expose only certain endpoints to the internet and keep other endpoints private.
An IAM system should expose status information and metrics. Health-status endpoints and metrics help to implement auto-healing and scaling for the containers. Kubernetes can replace a broken container automatically if health checks fail.
If certain metric values exceed a threshold, new instances can be added to increase the capacity. A cloud native IAM system must support some integration with the Kubernetes control plane, observability tools and auto-scaling features of the cloud platform to improve availability and resilience.
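To show how a metric threshold translates into a number of instances, here is a small Python sketch of the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the current metric value to the target value, rounded up. The function name, the metric (requests per second per instance) and the replica limits are assumptions for illustration. The sketch leaves out details of the real autoscaler such as tolerance and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Return the replica count suggested by the metric ratio.

    desired = ceil(current_replicas * current_metric / target_metric),
    clamped to the configured minimum and maximum.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Morning login spike: 2 instances at 90 requests/s each, target 60.
print(desired_replicas(2, 90, 60))  # prints 3

# Quiet period: 4 instances at 30 requests/s each, target 60.
print(desired_replicas(4, 30, 60))  # prints 2
```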
Typically, each IAM service is deployed in a separate container and together they form a cluster. Kubernetes takes care of service discovery and DNS resolution, among other duties within the cluster.
Consequently, new services are automatically detected, and routing rules are automatically set up so that services can receive requests and return responses. This feature is what makes auto-healing and scaling work, and it is equally important for rolling updates, where the parts of the IAM system are replaced with a new version one after the other so that the system keeps working throughout.
As an option, a service mesh can be added to the cluster to protect and improve inter-service communication. Typically, each container in a service mesh is accompanied by a proxy that can, for example, encrypt communication between the services, handle load balancing or enforce policies.
Security policies for the IAM cluster are configured in the same manner as for any other application running in a Kubernetes cluster. In particular, you can make use of technologies such as SPIFFE to automatically manage workload identities of the IAM system.
It may require several services to satisfy the diverse requirements of API consumers. For this and security reasons, place an ingress controller or API gateway at the edge of the IAM cluster. Not only can an API gateway obfuscate the internals of the cluster and provide customized APIs by packaging (micro) services into different products, but it also provides security measures such as throttling traffic or request validation.
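Throttling at the gateway is often implemented with a token bucket. The Python sketch below is one possible way to do it and not a description of how any specific gateway works; the `TokenBucket` class and its parameters are hypothetical. Time is passed in explicitly to keep the example deterministic.

```python
class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, now=0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.last = now

    def allow(self, now):
        # Refill according to the time elapsed since the last request.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the gateway would answer with HTTP 429 here


bucket = TokenBucket(rate=1.0, capacity=2)
print([bucket.allow(0.0) for _ in range(3)])  # prints [True, True, False]
print(bucket.allow(1.0))                      # prints True
```

A gateway would typically keep one bucket per client or per IP address, so that a single misbehaving client cannot exhaust the login endpoints for everyone else.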
IAM with an API-First Approach
The IAM system is ultimately a specialized API that must meet its consumers’ requirements despite the constraints imposed by standards. Therefore, the API-first approach is applicable even for an IAM. In an API-first approach, you start by designing the API according to the needs of its (future) consumers.
Now, I strongly discourage you from writing your own IAM system. However, I still want to stress the importance of selecting an IAM solution that lets you design an IAM-specific API as part of the deployment. Ensure that the API you expose via the API gateway meets the demands of your consumers (“clients,” in OAuth 2.0 terms). For example, the IAM system’s authentication and token service are typically considered “external” services, whereas user management typically serves an “internal” audience.
At Curity, we recommend issuing so-called opaque access tokens to external clients for security reasons. An organization cannot control external clients and should limit the impact of leaked access tokens. Opaque tokens are just random strings with no further meaning outside the IAM system.
Consequently, the impact of such a token being lost or stolen is limited. Internal consumers, on the other hand, may receive structured tokens like JWTs (JSON Web Tokens). This approach is called the phantom token pattern. The idea is to have pairs of opaque and structured tokens where a reverse proxy or an API gateway uses the opaque token as a reference to fetch the structured token that it can forward to the APIs of the application.
Internal consumers can benefit from the JWTs without compromising security. The pattern hides the token details from the client, hence the name phantom token. It only requires a simple plugin in the Kubernetes ingress controller, for example.
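The exchange at the heart of the phantom token pattern can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `TokenService` class, its `issue` and `introspect` methods and the `gateway_forward` function are hypothetical names, the token store is an in-memory dictionary, and the JWT is a placeholder string rather than a signed token.

```python
import secrets


class TokenService:
    """Hypothetical stand-in for the token service of the IAM system."""

    def __init__(self):
        self._jwt_by_opaque = {}

    def issue(self, jwt):
        # The client only ever sees a random reference, never the JWT.
        opaque = secrets.token_urlsafe(32)
        self._jwt_by_opaque[opaque] = jwt
        return opaque

    def introspect(self, opaque):
        # Returns the structured token, or None if the reference is unknown.
        return self._jwt_by_opaque.get(opaque)


def gateway_forward(headers, token_service):
    """Swap the opaque token for the JWT before calling the internal API."""
    scheme, _, opaque = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or not opaque:
        raise PermissionError("missing bearer token")
    jwt = token_service.introspect(opaque)
    if jwt is None:
        raise PermissionError("unknown or expired token")
    return {**headers, "Authorization": f"Bearer {jwt}"}


service = TokenService()
opaque_token = service.issue("header.payload.signature")  # placeholder JWT
incoming = {"Authorization": f"Bearer {opaque_token}"}
print(gateway_forward(incoming, service)["Authorization"])
# prints "Bearer header.payload.signature"
```

In a real deployment, the lookup would be an HTTP call from the gateway to the IAM system, and the gateway would usually cache the result for the lifetime of the token.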
Not all features and services of an IAM system are equally suitable for all types of clients. For example, user management is typically reserved for internal clients and not hosted next to common APIs for external clients. However, it is not enough to simply divide an IAM system into static services.
For an API-first approach, the services of the IAM system must be customizable. For example, the authentication service can serve many authentication methods, but for various reasons, some methods should be only accessible for certain types of clients.
The token service publishes endpoints for different OAuth 2.0 or OpenID Connect flows. You may need to configure the endpoints differently and only expose some for a certain group of clients and others for another group, or you may want to be able to scale different flows independently because some are more requested than others. The custom services and IAM system configuration should be manageable even for complex setups.
Preferably, there should be only one configuration applicable for all (customized) services of the IAM system.
To achieve that, the Curity Identity Server offers three runtime service types: an authentication service, a token service and a user management service. Each service provides a list of endpoints. Which endpoints a service includes depends partly on the supported features and can be adapted individually.
A service does not automatically expose all of its endpoints. Instead, endpoints are mapped to a service role. A runtime instance of the Curity Identity Server is then assigned a service role and deployed in a container that publishes the endpoints listed for that role. In that way, the overall set of features of a service can be divided and spread over (specialized) service roles and containers. The containers can scale independently and serve different clients.
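Conceptually, the mapping between services, endpoints and service roles looks like the following Python sketch. Note that all endpoint paths and role names are invented for illustration, and the data structure is not the configuration format of the Curity Identity Server or any other product.

```python
# Hypothetical endpoints offered by each runtime service type.
SERVICE_ENDPOINTS = {
    "authentication": {"/authn/password", "/authn/otp"},
    "token": {"/oauth/authorize", "/oauth/token", "/oauth/introspect"},
    "user-management": {"/um/users"},
}

# Hypothetical service roles: each role picks a subset of the endpoints,
# possibly across several services.
SERVICE_ROLES = {
    "external": {"/authn/password", "/authn/otp",
                 "/oauth/authorize", "/oauth/token"},
    "internal": {"/oauth/introspect", "/um/users"},
}


def endpoints_for(role):
    """Return the endpoints an instance with the given role would publish."""
    known = set().union(*SERVICE_ENDPOINTS.values())
    endpoints = SERVICE_ROLES[role]
    unknown = endpoints - known
    if unknown:
        raise ValueError(f"role {role!r} lists unknown endpoints: {sorted(unknown)}")
    return sorted(endpoints)


print(endpoints_for("internal"))  # prints ['/oauth/introspect', '/um/users']
```

Each container is started with exactly one role, so the instances that serve external clients never publish the user management endpoints in the first place.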
You can add new instances or remove old ones dynamically without affecting others. This is possible because the Curity Identity Server follows the principles of independent and stateless services.
A cluster of the Curity Identity Server may include runtime instances with different service roles. This is called an asymmetric cluster. However, even when runtime instances expose different endpoints and run different service roles, they all use the same (global) configuration.
Runtime instances are independent of each other and, in particular, of the admin service. They can operate in complete isolation as long as they have a working configuration. The configuration may be parameterized with environment variables so that it can easily be ported from one environment to another. With that approach, it is feasible to put the configuration file under version control, track and audit changes, and let the CI/CD pipeline handle the deployment in a fully automated way, as you would expect when working with cloud native applications.
Cloud native principles work well for an IAM system. If implemented properly, they improve the performance, reliability, and security of the deployment. By sticking to a standard model, such as containers and Kubernetes, it is possible to deploy the IAM system in any cloud computing environment, including private clouds.
To get the best out of an IAM system, ensure it is flexible and apply an API-first approach. This means first considering your requirements and then designing the IAM system to suit them. Standardized interfaces and customizable, independent services, as offered by the Curity Identity Server, help you run a cloud native IAM system that fits your needs.