14 Addressing Common Issues: Q&A for OAuth 2.0 #

Hello, I am Wang Xindong.

It has been over a month since this course was launched on June 29th. I have seen many comments from classmates, including thoughts and questions. First of all, I would like to thank you for your support, encouragement, and feedback on this course.

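While replying to your comments, I also noted down the questions you raised. As I reviewed them from start to finish while preparing for today's Q&A session, I thought further about the reasoning behind each one. In the end, I have summarized 6 questions:

What is the purpose of inventing OAuth?
Is OAuth 2.0 an identity authentication protocol?
Can the access token be kept valid with the refresh token?
Does using HTTPS ensure the data security of JWT format tokens?
Is there a connection between ID tokens and access tokens?
What problem does the PKCE protocol solve?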

Next, let’s take a look at each of these questions.

What is the purpose of inventing OAuth? #

The OAuth protocol lets an end user, who is the resource owner (e.g. Xiao Ming), delegate some of their permissions (e.g. querying today's orders) on a protected resource server (e.g. the JD.com open platform) to a third-party application (e.g. the Xiao Tu order software). The third-party application can then act on the user's behalf and perform those operations.

Instead of handing over usernames and passwords, OAuth generates a credential with limited access to the protected resources, known as an access token, for each combination of third-party software and user. The access token is generated between the user and the platform, so the third-party software never learns any user information.

This greatly simplifies the third-party software's logic: all it has to do is request an access token, use that access token, and access the protected resources. In addition, when the third-party software calls many APIs, it no longer transmits usernames and passwords, which reduces the attack surface.

From a security perspective, generating an access token for each combination of third-party software and user can reduce the harm caused to other users on the platform. If a single third-party software is compromised, only the users of that particular third-party software will be affected.
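The isolation described above can be sketched in a few lines. This is only a toy in-memory model, not how any real platform stores tokens; the class name, the app identifiers, and the `orders:read` scope string are all hypothetical.

```python
import secrets


class AuthorizationServer:
    """Toy in-memory token store; every name here is hypothetical."""

    def __init__(self):
        # token string -> (app_id, user_id, scopes)
        self._tokens = {}

    def issue_token(self, app_id, user_id, scopes):
        """Issue one token for one (third-party app, user) pair."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (app_id, user_id, frozenset(scopes))
        return token

    def check(self, token, required_scope):
        """Return the user the token acts for, or None if it is not valid."""
        entry = self._tokens.get(token)
        if entry is None or required_scope not in entry[2]:
            return None
        return entry[1]

    def revoke_app(self, app_id):
        """Invalidate every token that was issued to one compromised app."""
        self._tokens = {t: e for t, e in self._tokens.items() if e[0] != app_id}


server = AuthorizationServer()
t1 = server.issue_token("xiaotu-orders", "xiaoming", {"orders:read"})
t2 = server.issue_token("other-app", "xiaoming", {"orders:read"})

# Only tokens held by the compromised app stop working.
server.revoke_app("xiaotu-orders")
```

Because each token is tied to one app and one user, revoking the compromised app leaves `t2` usable, and neither token ever grants a scope that was not delegated.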

Now, some may ask: won't the focus of attacks then shift to the authorization server? That is true, but protecting one authorization server is far easier than protecting thousands of third-party applications written by different developers.

Is OAuth 2.0 an identity authentication protocol? #

In this course, I have actually been emphasizing that OAuth 2.0 is an authorization protocol that “solely focuses on doing authorization well,” and it is not an identity authentication protocol. However, when I first started learning OAuth 2.0, I also mistakenly believed that it was an identity authentication protocol.

At that time, I reasoned that because users are involved (for example, Xiao Ming has to log in to the authorization service and prove his identity before using the Xiao Tu order software), OAuth 2.0 must be an identity authentication protocol.

However, requiring Xiao Ming to log in before authorizing is an additional requirement, not part of OAuth 2.0 itself. Although the login step appears to be “embedded” in the OAuth 2.0 flow, in a production environment login and authorization are two independent systems. This “embedded” authentication therefore does not make OAuth 2.0 an identity authentication protocol.

Additionally, an identity authentication protocol tells the third-party software who the current user is, whereas OAuth 2.0 never reveals any user information to third-party software, as we noted when discussing the purpose of OAuth. Consider the Xiao Tu order software again: it never learns anything about Xiao Ming; it only requests an access token, uses it, and calls the API to query orders.

Can the access token be kept valid with the refresh token? #

To answer this question, let’s review a few key points about access tokens and refresh tokens.

First, the core of OAuth 2.0 is authorization, and the core of authorization is tokens, which are what we call access tokens.

Second, in Lesson 3, we mentioned that to improve user experience, OAuth 2.0 provides a mechanism called refresh token, which allows third-party software to request a new access token without requiring the user to authorize again when the access token expires.

Third, in terms of usage, refresh tokens can only be used at the authorization service, while access tokens can only be used at the protected resource service.

With this foundational knowledge, we can now analyze the question “Can the access token be kept valid with the refresh token?”

When the access token is “passed” to the protected resource service, the service needs to verify the access token and match the permissions associated with the access token to the requests from the third-party software. When the access token expires, the new access token we get using the refresh token is generated by the authorization service, not by extending the validity of the original access token.

Once this refresh token is used, the authorization service can decide whether to issue a new refresh token or to return the previous refresh token to the third-party software. For security reasons, our recommendation is to return a new refresh token. Now, you might have a question: The third-party software has already obtained a new access token while the refresh token still exists, so can it continue to use the refresh token to obtain access tokens indefinitely?

The answer is no, because refresh tokens also expire. Even when a new refresh token is issued, it keeps the expiration timestamp of the refresh token it replaces; the expiration is not extended. Once that time is reached, the refresh token can no longer be used to request new access tokens.
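The rule above, a fresh refresh token on every use but with the original expiration timestamp, can be sketched as follows. This is a minimal in-memory illustration; the class name, the two lifetime constants, and the field names in the returned dictionary are assumptions made for the example.

```python
import secrets

ACCESS_TTL = 3600              # assumed: access tokens live for 1 hour
REFRESH_TTL = 30 * 24 * 3600   # assumed: refresh tokens live 30 days from first issue


class TokenService:
    """Hypothetical authorization service that rotates refresh tokens."""

    def __init__(self):
        self._refresh = {}  # refresh token -> absolute expiry timestamp

    def authorize(self, now):
        """Initial grant, after the user has approved the request."""
        return self._issue(now, refresh_expires_at=now + REFRESH_TTL)

    def refresh(self, refresh_token, now):
        """Exchange a refresh token for a new token pair."""
        expires_at = self._refresh.pop(refresh_token, None)  # single use
        if expires_at is None or now >= expires_at:
            raise PermissionError("refresh token invalid or expired")
        # Rotation: the new refresh token inherits the ORIGINAL expiry.
        return self._issue(now, refresh_expires_at=expires_at)

    def _issue(self, now, refresh_expires_at):
        refresh_token = secrets.token_urlsafe(32)
        self._refresh[refresh_token] = refresh_expires_at
        return {
            "access_token": secrets.token_urlsafe(32),  # always a brand-new token
            "access_expires_at": now + ACCESS_TTL,
            "refresh_token": refresh_token,
            "refresh_expires_at": refresh_expires_at,
        }
```

Each call to `refresh` returns a different access token rather than extending the old one, and once `now` reaches the original `refresh_expires_at` the user has to authorize again.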

Does using HTTPS ensure the data security of JWT format tokens? #

OAuth 2.0 should never be used without HTTPS, because sensitive information such as access tokens and application secrets must be protected in transit. However, HTTPS only protects that information while it travels over the network.

In the OAuth 2.0 specification, access tokens should be opaque to third-party software and should never be parseable by any third-party software. Since JWT format tokens contain user-related information, such as user identifiers, simply signing them is not enough. To prevent third-party software from having the opportunity to access the information contained in access tokens, when using JWT format tokens in an environment where we interact with third-party software, we must also encrypt the tokens to ensure their security, instead of relying solely on HTTPS.
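To see why signing alone is not enough, the sketch below builds an HS256-signed JWT using only the Python standard library and then reads its payload without the key. The claim names, the key, and the helper function names are made up for the illustration.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT segments are."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, key: bytes) -> str:
    """Build an HS256-signed JWT: signed, NOT encrypted."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = b64url(hmac.new(key, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{signature}"


def peek_claims(token: str) -> dict:
    """What any holder of the token can do without knowing the key."""
    payload = token.split(".")[1]
    return json.loads(b64url_decode(payload))


token = sign_jwt({"sub": "xiaoming", "scope": "orders:read"}, key=b"server-only-secret")
claims_seen_by_client = peek_claims(token)
```

The signature stops the third-party software from altering the claims, but `peek_claims` shows that it can still read them. Hiding them requires encrypting the token as well, for example with JWE (RFC 7516), or issuing an opaque random token instead.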

Is there a connection between ID tokens and access tokens? #

In Lesson 9, we talked about ID tokens while implementing an OpenID Connect authentication protocol on top of OAuth 2.0. Some students were still confused about the relationship between ID tokens and access tokens, so I replied in that lesson's comment section. Here I will explain it again in a more organized way, because understanding how ID tokens and access tokens relate and differ is crucial when building an authentication protocol with OAuth 2.0.

First, let’s summarize the roles of ID tokens and access tokens:

ID tokens, also known as ID_TOKEN, represent the user's identity. Each one is a standalone authentication result and, unlike an access token, is never passed as a parameter to other external services.
Access tokens, also known as ACCESS_TOKEN, are the credentials that third-party software presents on the user's behalf when requesting protected resource services.

As you can see, these two tokens are fundamentally different. Now, let’s analyze where their differences lie.

First, the ID token complements the access token rather than replacing it. The dual-token approach keeps the access token opaque to third-party software, as it already is in OAuth 2.0, while letting the newly added ID token be parsed by the third-party software for use in the authentication protocol.

Second, ID tokens and access tokens have different lifecycles, and the ID token's is relatively shorter. The ID token represents a single authentication result and identifies the user. That identifier is not the username the user enters at login. When the user logs out or ends the session, the ID token's lifecycle ends as well.

In contrast, third-party software can keep using the access token to request protected resource services long after the user has left. For example, if Xiao Ming starts a batch order export in the Xiao Tu order software and the export takes a long time, Xiao Ming does not need to stay present until it finishes.

What problem does the PKCE protocol solve? #

In Lesson 7, when we covered the PKCE protocol, I saw many comments: some shared your own thoughts, and others discussed the protocol further. To understand what problem the PKCE protocol solves, we first need to look at the background of its introduction.

In October 2012, RFC 6749, the official OAuth 2.0 authorization framework, was formally released. In September 2015, RFC 7636, the PKCE protocol, was published as a supplement. The three years between the two were also a period of vigorous growth for mobile applications.

At the same time, there are special security issues with storing secrets in native mobile client applications, and clients that use the OAuth 2.0 authorization code grant type are susceptible to authorization code interception attacks.

Therefore, the background for the addition of the PKCE protocol is the rapid development of mobile applications and the security risks faced by native clients using OAuth 2.0. With this understanding, we can see that the main purpose of releasing the PKCE protocol is to mitigate attacks against public clients and improve the security of authorization code usage.
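As a reminder of the mechanism, here is a minimal sketch of the S256 transformation defined in RFC 7636, using only the Python standard library. The function names are hypothetical; the client would send the challenge with the authorization request and the verifier with the token request.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier() -> str:
    """Client side: a high-entropy random string kept in the app's memory."""
    # 32 random bytes encode to 43 URL-safe characters (RFC 7636 allows 43 to 128).
    return base64.urlsafe_b64encode(secrets.token_bytes(32)).rstrip(b"=").decode("ascii")


def make_code_challenge(code_verifier: str) -> str:
    """S256 method: BASE64URL(SHA256(ASCII(code_verifier))), without padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify(code_verifier: str, stored_challenge: str) -> bool:
    """Authorization server side: recompute the challenge at the token endpoint."""
    return hmac.compare_digest(make_code_challenge(code_verifier), stored_challenge)
```

An attacker who intercepts the authorization code sees at most the challenge, and cannot derive the verifier needed to redeem the code from it.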

Summary #

Today, we dedicated a class specifically to answering common questions about OAuth 2.0. Let me summarize the key points you need to understand:

The purpose of the OAuth protocol is to use tokens instead of usernames and passwords.

OAuth 2.0 cannot be used directly as an “identity authentication” protocol.

Although OAuth 2.0 must be used over HTTPS, HTTPS does not make JWT tokens opaque to third-party software, so the tokens still need to be encrypted.

Even with a refresh token, the access token cannot remain valid forever, as the refresh token also has an expiration period.

The ID token is a supplement to the access token, rather than a replacement for it.

PKCE is a supplementary protocol of OAuth 2.0, mainly used to mitigate the security risk of authorization code interception.

You may run into other issues while learning and practicing OAuth 2.0, but don't worry: the comment section is always open, and I will keep responding to your concerns and questions there.

Feel free to share your thoughts in the comment section and share today’s content with other friends who use OAuth 2.0. Let’s progress together.