The Data Your Access Token Reveals and How to Secure It
Information security, sometimes shortened to InfoSec, is an integral part of an organization’s security policy. It tries to ensure that a malicious actor cannot access or modify valuable information belonging to a company.
Unsurprisingly, InfoSec is equally important to APIs, although the topic is often overlooked. Many APIs struggle to keep the information they share with consumers to the necessary minimum, which is why excessive data exposure is listed as No. 3 on the OWASP API Security Top 10 list for 2019.
Where Is My Data?
The OWASP vulnerability focuses on developers exposing unfiltered data in API responses. For example, an API request may return a whole database object in the response body instead of sending just the bare minimum data that the client needs. But the response body is not the only place that can shed data from your system.
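As a minimal sketch of the difference between returning a whole database object and returning only what the client needs, the Python example below builds a response from an explicit allowlist of fields. The `UserRecord` type, its field names and the `PUBLIC_PROFILE_FIELDS` allowlist are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class UserRecord:
    """Hypothetical database row; the field names are illustrative only."""
    id: int
    display_name: str
    email: str
    password_hash: str
    date_of_birth: str


# Fields that clients of this (assumed) endpoint actually need.
PUBLIC_PROFILE_FIELDS = ("id", "display_name")


def to_public_profile(user: UserRecord) -> dict:
    """Build the response from an explicit allowlist instead of dumping the row."""
    row = asdict(user)
    return {field: row[field] for field in PUBLIC_PROFILE_FIELDS}


user = UserRecord(7, "alice", "alice@example.com", "hash-placeholder", "1990-01-01")
print(to_public_profile(user))  # {'id': 7, 'display_name': 'alice'}
```

An allowlist fails safe: a column added to the record later stays out of the response until someone deliberately exposes it.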
Access, refresh and ID tokens are mostly seen as security or utility entities, but they are, in fact, carriers of valuable information. Show me your access token, and I will tell you who you are, or at least obtain a lot of valuable information. ID tokens, in line with the OpenID Connect specification, are always in the form of a JSON Web Token (JWT). This means that their content, even though integrity-protected, can be read by anyone who gets hold of one.
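To illustrate how little it takes to read a signed JWT, the sketch below uses only the Python standard library. It first builds a demo token signed with HS256 and then decodes the payload without ever touching the key. The claim names, the key and the helper function names are made up for this example.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    """Base64url without padding, as used for JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Restore the stripped padding before decoding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def make_demo_jwt(claims: dict, key: bytes) -> str:
    """Create an HS256-signed JWT for demonstration purposes only."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url_encode(signature)}"


def read_claims_without_key(token: str) -> dict:
    """Decode the payload segment; no key or signature check is involved."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))


token = make_demo_jwt(
    {"sub": "alice", "email": "alice@example.com", "role": "admin"},
    b"server-side-secret",  # never leaves the issuer, yet the claims are readable
)
print(read_claims_without_key(token))
```

The signature prevents tampering with the claims, but it does nothing to hide them.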
Of course, an ID token is intended to give the client information about the user. Still, companies should always be aware of the claims that eventually end up in an ID token and whether those claims are the necessary minimum the client should receive.
Any JWT may contain a user’s sensitive data.
In the case of access and refresh tokens, things are not so straightforward. The OAuth specification does not require those tokens to have any specific format, but JWTs are most often used out of convenience. However, the information encoded in an access or refresh token is meant to be used by the API and not the client, and definitely not by anyone who happens to read such a token.
APIs need information about a user to perform fine-grained authorization decisions, but it’s the API that should have access to this information, not anyone else. When JWTs are used for access or refresh tokens, that information is leaked to the client or any malicious actor who intercepts the token.
The API and the authorization server often belong to the same organization, so the data can be safely shared between them. But clients often belong to a third party and thus shouldn’t have access to the information contained in tokens. Even the tokens of first-party clients are accessible to users and travel over the internet, so they are vulnerable to eavesdropping.
Which Data Should Be Secured?
Personally identifiable information (PII) is the type of data that typically first comes to mind when thinking about InfoSec. Companies must not leak any of their users’ PII and must take care when such information is put in tokens as claims. Other information about a user used as claims in tokens, which can be important from a business perspective, should also be protected. This could include any information used to profile users, such as interests and hobbies. Even if leaking such information doesn’t incur penalties, it might still damage the company’s operations and reputation.
However, user information is not the only type of token data that should be treated with caution. Malicious actors can use claims to gain information about the company’s infrastructure and the technologies it uses. Such information could help attackers breach the company’s security systems.
How to Protect Data in Tokens
Governance Is the Key
Take care with the contents of tokens at every step of the application life cycle: design, development and security audits. Companies must be aware of the data that ends up in access and refresh tokens and what possible user information might end up in an ID token. Put procedures in place to govern any modifications made to the contents of a token. Claims should not be added freely by anyone and should be audited and verified to ensure that including them does not constitute information disclosure.
Always Use a Token Service
Always use a dedicated service responsible for issuing tokens to facilitate the governance of token contents. This gives a clear separation of concerns and creates a single point of control. Business services should never issue tokens themselves; allowing them to do so quickly becomes unmaintainable. Without a centralized token issuer, identifying and fixing data leaks becomes difficult.
Often, dedicated authorization servers have robust token service implementations. They will allow you to easily customize the contents of a token based on a variety of input parameters. (The Curity Identity Server’s Token Designer allows just that.) For example, tokens could contain different claims for different clients or audiences. This provides a mechanism to prevent disclosing unnecessary data. The authorization server will also audit token issuance and provide endpoints for revoking tokens.
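The idea of issuing different claims to different clients can be sketched as a per-client allowlist. This is not the API of any particular authorization server. The client IDs, claim names and the `claims_for_client` function are hypothetical.

```python
# Everything the token service knows about the user (illustrative values).
USER_CLAIMS = {
    "sub": "alice",
    "email": "alice@example.com",
    "subscription_tier": "gold",
    "interests": ["cycling", "chess"],
}

# Hypothetical policy: which claims each registered client may receive.
CLAIMS_BY_CLIENT = {
    "first-party-web": {"sub", "email", "subscription_tier"},
    "partner-app": {"sub"},
}


def claims_for_client(client_id: str, user_claims: dict) -> dict:
    """Return only the claims approved for this client; unknown clients get none."""
    allowed = CLAIMS_BY_CLIENT.get(client_id, set())
    return {name: value for name, value in user_claims.items() if name in allowed}


print(claims_for_client("partner-app", USER_CLAIMS))  # {'sub': 'alice'}
```

Keeping the policy in one table also gives auditors a single place to review what each client can learn about a user.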
Data from a token can’t be leaked if you use opaque tokens, but it’s usually more convenient for your APIs to receive JWTs. That’s where the Phantom Token pattern comes in handy. In this approach, you issue opaque tokens to clients and use the API gateway to perform token introspection and exchange the opaque token for a JWT. Thus, the APIs receive JWT access tokens, even though clients deal with opaque tokens. This approach allows companies to be more lenient when putting data in tokens, because neither clients nor eavesdroppers outside the gateway can read the tokens’ contents.
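The flow of the Phantom Token pattern can be sketched in a few lines. In the example below, an in-memory stub stands in for the authorization server, and a plain method call stands in for the HTTP introspection request a real gateway would make. The class, the function names and the placeholder JWT string are all assumptions made for illustration.

```python
import secrets
from typing import Optional


class AuthorizationServerStub:
    """In-memory stand-in for an authorization server (illustration only)."""

    def __init__(self) -> None:
        self._jwt_by_opaque: dict = {}

    def issue(self, jwt: str) -> str:
        """Keep the JWT server-side and hand the client a random reference."""
        opaque = secrets.token_urlsafe(32)
        self._jwt_by_opaque[opaque] = jwt
        return opaque

    def introspect(self, opaque: str) -> Optional[str]:
        """Return the JWT behind an opaque token, or None if it is unknown."""
        return self._jwt_by_opaque.get(opaque)


def gateway_forward(headers: dict, server: AuthorizationServerStub) -> Optional[dict]:
    """Swap the opaque token for its JWT before forwarding to the API."""
    scheme, _, opaque = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or not opaque:
        return None  # the gateway would reject the request
    jwt = server.introspect(opaque)
    if jwt is None:
        return None  # unknown or revoked token
    return {**headers, "Authorization": f"Bearer {jwt}"}


server = AuthorizationServerStub()
opaque_token = server.issue("header.payload.signature")  # placeholder JWT
upstream_headers = gateway_forward({"Authorization": f"Bearer {opaque_token}"}, server)
print(upstream_headers)
```

The client only ever holds `opaque_token`, which carries no claims, while the API behind the gateway receives the JWT.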
In some cases, the Split Token approach might be more appropriate. This is a pattern similar to the Phantom Token, but the token introspection does not require online communication with the authorization server.
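The article does not spell out the mechanics of the Split Token approach. The sketch below follows one common description of it. The JWT's signature is handed to the client as the opaque token, and the header and payload are pushed to a gateway cache keyed by a hash of that signature. The gateway can then rebuild the JWT locally. Treat these details, along with every name in the code, as assumptions.

```python
import hashlib
from typing import Optional, Tuple


def split_token(jwt: str) -> Tuple[str, str, str]:
    """Split a JWT into (client token, cache key, header-and-payload part)."""
    head_and_payload, _, signature = jwt.rpartition(".")
    cache_key = hashlib.sha256(signature.encode("ascii")).hexdigest()
    return signature, cache_key, head_and_payload


class GatewayCache:
    """Hypothetical cache living next to the API gateway."""

    def __init__(self) -> None:
        self._entries: dict = {}

    def store(self, cache_key: str, head_and_payload: str) -> None:
        self._entries[cache_key] = head_and_payload

    def rebuild_jwt(self, client_token: str) -> Optional[str]:
        """Reassemble the JWT locally, with no call to the authorization server."""
        cache_key = hashlib.sha256(client_token.encode("ascii")).hexdigest()
        head_and_payload = self._entries.get(cache_key)
        if head_and_payload is None:
            return None
        return f"{head_and_payload}.{client_token}"


demo_jwt = "aGVhZGVy.cGF5bG9hZA.c2lnbmF0dXJl"  # placeholder segments, not a real token
client_token, cache_key, head_and_payload = split_token(demo_jwt)

cache = GatewayCache()
cache.store(cache_key, head_and_payload)  # done by the issuer at issuance time
print(cache.rebuild_jwt(client_token) == demo_jwt)  # True
```

The client holds only the signature, which reveals no claims. The cache holds the claims but cannot produce a valid token without the signature the client presents.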
Keep Tokens out of the Browser
Another way to secure the token’s contents from eavesdroppers is to keep tokens out of the browser altogether. With the help of a backend component called a Token Handler, tokens can be kept away from the user agent, and secure, encrypted cookies are used to handle user sessions instead. Because the tokens never reach the browser, malicious actors cannot intercept them there, which also means that the tokens’ contents cannot be leaked.
It’s easy to overlook particular parts of systems when thinking about information security. Tokens are often missed when information security is designed or tested. Companies tend to focus on what data an API should return, how to properly secure access to those APIs or how to properly secure the tokens themselves.
Rarely is the content of the token itself considered worthy of design forethought. But data contained in tokens in the form of claims should be protected in the same manner as data returned by APIs. Companies should look into hiding this data from the outside world, or, if that’s not possible, be aware of what information can be exposed through tokens.