03 Token How to Reduce the Traffic Pressure of User Identity Verification

Hello, I am Xu Changlong. In this lesson, we will explore how to use token algorithms to reduce traffic pressure on user authentication in the user center.

In their early stages, many websites implement user authentication through session-based login. After a user logs in successfully, their information is stored in the server’s session cache, and a session_id is assigned and saved in the user’s cookie. The browser sends this ID with every request, and the server uses it to look up the record that was stored in the session cache at login.

The flowchart is shown below:

The advantage of this approach is that all the information is stored on the server, without exposing any sensitive user data to the client. Additionally, all logged-in users share a single cache space (the session cache).

However, as traffic increases, this design exposes a significant problem: user authentication in the user center becomes unstable under heavy traffic. This is because the user center needs to maintain a large session cache, which is frequently accessed by various business subsystems. If the cache fails, no subsystem can verify user identities, so none of them can serve external requests.

This is mainly due to the tight coupling between the session cache and various subsystems. Every request to the entire site will access this cache at least once, which means the cache’s content length and response speed directly determine the upper limit of QPS (queries per second) for the entire site, resulting in poor system isolation and significant interdependencies between subsystems.
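To make the coupling concrete, here is a minimal sketch of the session flow described above. The names `SessionStore`, `SessionData`, `Login`, and `Lookup` are hypothetical, and an in-memory map stands in for the shared session cache; it is an illustration under those assumptions, not the implementation of any particular site.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"sync"
)

// SessionData is a hypothetical record kept on the server after login.
type SessionData struct {
	UserID   uint64
	NickName string
}

// SessionStore is an in-memory stand-in for the shared session cache.
type SessionStore struct {
	mu       sync.RWMutex
	sessions map[string]SessionData
}

// NewSessionStore returns an empty store.
func NewSessionStore() *SessionStore {
	return &SessionStore{sessions: make(map[string]SessionData)}
}

// Login stores the user's data and returns a random session_id for the cookie.
func (s *SessionStore) Login(data SessionData) (string, error) {
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	id := hex.EncodeToString(buf)
	s.mu.Lock()
	s.sessions[id] = data
	s.mu.Unlock()
	return id, nil
}

// Lookup is what every request from every subsystem has to do:
// one access to the shared cache per request.
func (s *SessionStore) Lookup(sessionID string) (SessionData, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	data, ok := s.sessions[sessionID]
	return data, ok
}
```

Every subsystem calls `Lookup` on every request, so the availability and latency of this one store bound the whole site.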

So, how do we reduce the coupling between the user center and various subsystems, and improve system performance? Let’s find out together.

The common way to handle user authentication is to use a signed token; the industry standard for this is JSON Web Token (JWT):

The diagram above shows the login process with JWT. After a user logs in, their user information is placed in a signed token. Note that a standard JWT is signed but not encrypted: the payload is only base64url-encoded. This token is included in the header or cookie of each request sent to the server. The server can then verify and decode the token to obtain the user’s information without having to interact with the user center.

Here is an example of code that generates a token:

import (
    "errors"
    "time"

    // Note: this package is archived; github.com/golang-jwt/jwt is its maintained successor
    "github.com/dgrijalva/jwt-go"
)

// Secret key for signing; it should be long and random so it cannot be brute-forced
// An asymmetric algorithm can also be used, allowing clients to verify the signature with a public key
var secretString = []byte("jwt secret string 137 rick")

type TokenPayLoad struct {
    UserId   uint64 `json:"userId"`   // User ID
    NickName string `json:"nickname"` // Nickname
    jwt.StandardClaims                // Standard (registered) claims such as exp and iss
}

// Generate JWT token
func GenToken(userId uint64, nickname string) (string, error) {
    c := TokenPayLoad{
        UserId:   userId,
        NickName: nickname,
        // Additional custom fields can be added here
        // The payload is readable by anyone, so sensitive values should be encrypted before being added

        // Standard (registered) claims
        StandardClaims: jwt.StandardClaims{
            // Expires in two hours
            ExpiresAt: time.Now().Add(2 * time.Hour).Unix(),
            // Issuer
            Issuer: "geekbang",
        },
    }
    // Create a token that will be signed with HS256
    token := jwt.NewWithClaims(jwt.SigningMethodHS256, c)
    // Sign and get the token string
    return token.SignedString(secretString)
}

As you can see, this token includes an expiration time. When a token is about to expire, the client automatically contacts the server to obtain a new one. Because each token is valid only for a limited time, an intercepted token can be used to impersonate the user only until it expires.

At the same time, the server can decouple from the user center. The business server can directly parse the token included in the request to obtain the user information without needing to make a request to the user center for every request. The token refresh can be initiated by the app client by requesting the user center, eliminating the need for the business server to make requests to the user center for token refreshing.

How does JWT ensure that the data has not been tampered with? Let’s take a look at its components.

As shown in the diagram above, the signed token consists of three parts separated by periods: Header, Payload, and Signature. The Header contains the signing algorithm type, the Payload contains the custom content, and the Signature is used to prevent tampering.

The data structure of the decoded JWT token is shown below:

// Header
// Describes how the token is signed
{
  "alg": "HS256", // Signing algorithm, be aware that some attacks set it to "none" to bypass signature verification
  "typ": "JWT" // Protocol type
}

// PAYLOAD
// Payload section, contains JWT standard fields and our custom data fields
{
  "userId": 9527, // Plaintext information we include; if it is sensitive, it is recommended to encrypt it first
  "nickname": "Rick.Xu", // Plaintext information we include; if it involves privacy, it is recommended to encrypt it first
  "iss": "geekbang", // Issuer
  "iat": 1516239022, // Token issuance time
  "exp": 1516246222 // Token expiration time
}

// Signature
// The signature is used to verify whether the two sections above have been tampered with.
// If they have, the recomputed signature will differ and will not match during validation
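To show how the three parts fit together, here is a minimal sketch that assembles an HS256 token by hand using only the standard library. The function name `signHS256` is hypothetical; in practice a JWT library does this for you, so treat it as an illustration of the format rather than production code.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
)

// signHS256 assembles "header.payload.signature" for an HS256 token.
// headerJSON and payloadJSON are the raw JSON documents.
func signHS256(headerJSON, payloadJSON, secret []byte) string {
	// JWT uses base64url without padding.
	enc := base64.RawURLEncoding
	signingInput := enc.EncodeToString(headerJSON) + "." + enc.EncodeToString(payloadJSON)

	// The signature is an HMAC-SHA256 over the first two encoded parts.
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(signingInput))
	signature := enc.EncodeToString(mac.Sum(nil))

	return signingInput + "." + signature
}
```

Because the signature covers the encoded header and payload, changing a single character in either part produces a different signature, and only a holder of the secret can recompute it.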

How do we check whether a token is genuine and whether it has expired? The specific method is as follows:

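// DecodeToken verifies the signature and expiration time, then returns the payload
func DecodeToken(tokenString string) (*TokenPayLoad, error) {
    token, err := jwt.ParseWithClaims(tokenString, &TokenPayLoad{}, func(tk *jwt.Token) (interface{}, error) {
        // Accept only HMAC algorithms, which rejects "none" and algorithm-substitution attacks
        if _, ok := tk.Method.(*jwt.SigningMethodHMAC); !ok {
            return nil, errors.New("unexpected signing method")
        }
        return secretString, nil
    })
    if err != nil {
        return nil, err
    }
    if decodeToken, ok := token.Claims.(*TokenPayLoad); ok && token.Valid {
        return decodeToken, nil
    }
    return nil, errors.New("token wrong")
}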

Decoding a JWT token is straightforward. The first and second parts are base64url-encoded, not encrypted. Decoding them gives access to all the data in the payload, such as the user’s nickname, user ID, permissions, and the token expiration time. To check whether a token has expired, we simply compare the expiration time with the current time on the server.

The token’s authenticity is established through signature verification. Any modification to the header or payload causes signature verification to fail. If verification passes, the token has not been tampered with and can be used directly.
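The two checks just described, signature first and expiration second, can be sketched with the standard library alone. The names `verifyHS256`, `sign`, and `claims` are hypothetical, and the sketch assumes HS256 is the only algorithm in use, so it never consults the `alg` header at all.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"strings"
)

// claims is a hypothetical subset of the payload fields used in this lesson.
type claims struct {
	UserID    uint64 `json:"userId"`
	ExpiresAt int64  `json:"exp"`
}

// sign is a demo helper that builds an HS256 token from a JSON payload.
func sign(payloadJSON string, secret []byte) string {
	enc := base64.RawURLEncoding
	input := enc.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`)) +
		"." + enc.EncodeToString([]byte(payloadJSON))
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(input))
	return input + "." + enc.EncodeToString(mac.Sum(nil))
}

// verifyHS256 checks the signature, then the expiry, and only then trusts the payload.
// nowUnix is the current time in Unix seconds.
func verifyHS256(token string, secret []byte, nowUnix int64) (*claims, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return nil, errors.New("malformed token")
	}
	enc := base64.RawURLEncoding

	// Recompute the signature over "header.payload" and compare in constant time.
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(parts[0] + "." + parts[1]))
	got, err := enc.DecodeString(parts[2])
	if err != nil || !hmac.Equal(got, mac.Sum(nil)) {
		return nil, errors.New("signature mismatch")
	}

	payload, err := enc.DecodeString(parts[1])
	if err != nil {
		return nil, err
	}
	var c claims
	if err := json.Unmarshal(payload, &c); err != nil {
		return nil, err
	}
	if nowUnix >= c.ExpiresAt {
		return nil, errors.New("token expired")
	}
	return &c, nil
}
```

Note that no network call is involved: a business server that holds the secret can run this check entirely on its own.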

The process is depicted in the diagram below:

As we can see, with the token approach, the most demanding interface in the user center can be taken offline. Each business server only needs to decode the token and verify its signature to obtain the user information. However, this approach also has a drawback: if a user is blacklisted, the ban takes effect on the client only after the current token expires, which introduces a delay in user management.

If real-time management of users is required, each newly issued token can be cached on the server and compared with the token presented on every request. However, this reintroduces a cache lookup per request and significantly impacts performance, so only a few companies implement it. Additionally, to enhance the security of the JWT system, tokens are generally set to expire after a relatively short time, typically around fifteen minutes. After expiration, the client automatically replaces the token.
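The server-side comparison mentioned above can be sketched as follows. `TokenRegistry` and its methods are hypothetical names, and an in-memory map stands in for whatever cache a real deployment would use.

```go
package main

import "sync"

// TokenRegistry is a hypothetical in-memory stand-in for a server-side token cache.
type TokenRegistry struct {
	mu     sync.RWMutex
	latest map[uint64]string // userID -> most recently issued token
}

// NewTokenRegistry returns an empty registry.
func NewTokenRegistry() *TokenRegistry {
	return &TokenRegistry{latest: make(map[uint64]string)}
}

// Remember records the token that was just issued to the user.
func (r *TokenRegistry) Remember(userID uint64, token string) {
	r.mu.Lock()
	r.latest[userID] = token
	r.mu.Unlock()
}

// Revoke removes the user's token, for example when the user is blacklisted.
func (r *TokenRegistry) Revoke(userID uint64) {
	r.mu.Lock()
	delete(r.latest, userID)
	r.mu.Unlock()
}

// IsCurrent reports whether the presented token is the one on record.
// It must be called on every request, which is where the extra cost comes from.
func (r *TokenRegistry) IsCurrent(userID uint64, token string) bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	stored, ok := r.latest[userID]
	return ok && stored == token
}
```

After `Revoke`, the user’s token is rejected immediately even though its signature and expiration time are still valid, at the price of one cache lookup per request.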

Token Replacement and Offline Verification #

So how are JWT tokens replaced, and how are they validated when the client has been offline?

Token replacement is quite simple. When the client detects that the current token is about to expire, it proactively calls the user center’s token replacement API, which issues a new token valid for another 15 minutes.

However, if the client fails to replace the token within those 15 minutes, for example because it is offline, the login lapses. To minimize this type of issue and keep the client working even after it has been offline for an extended period, the industry widely adopts the dual token approach. You can refer to the following flowchart:

In this solution, there are two types of tokens. The refresh_token is used to obtain a new access_token and is valid for 30 days. The access_token carries the current user information and permissions and is replaced every 15 minutes. If a request to the user center fails or the app has been offline, the user does not need to log in again as long as the local refresh_token has not expired: the client can obtain a new access_token once the user center is reachable. Only when the refresh_token itself expires is the user prompted to log in again. And because business servers verify the access_token locally, the business can still operate normally for a period of time even if the user center is down.
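The client-side decision in the dual token approach can be sketched as a small function. The names `Action` and `nextAction` are hypothetical, all times are Unix seconds, and the refresh window is an assumed parameter, not something the flowchart prescribes.

```go
package main

// Action is what the client should do before sending its next request.
type Action int

const (
	UseAccessToken     Action = iota // access_token is still comfortably valid
	RefreshAccessToken               // exchange the refresh_token for a new access_token
	LoginAgain                       // refresh_token expired, the user must log in
)

const (
	accessTTL  int64 = 15 * 60           // access_token lifetime: 15 minutes
	refreshTTL int64 = 30 * 24 * 60 * 60 // refresh_token lifetime: 30 days
)

// nextAction decides what to do given both expiry times and the current time.
// refreshWindow is how long before access_token expiry the client starts refreshing.
func nextAction(accessExp, refreshExp, now, refreshWindow int64) Action {
	switch {
	case now >= refreshExp:
		return LoginAgain
	case now >= accessExp-refreshWindow:
		// Also covers an access_token that expired while the app was offline.
		return RefreshAccessToken
	default:
		return UseAccessToken
	}
}
```

For example, a client that comes back online an hour after its access_token expired still gets `RefreshAccessToken` rather than `LoginAgain`, as long as it is within the 30-day refresh_token lifetime.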

The implementation of token replacement in the user center is as follows:

// If the token will expire within the next five minutes, replace it
if decodeToken.StandardClaims.ExpiresAt < TimestampNow()+300 {
  // Request the user center to check if the user is prohibited from logging in
  //....specific implementation omitted

  // Generate a new token
  token, err := GenToken(.....)
  if err != nil {
    return nil, err
  }
  // Update the returned token in the cookie
  resp.setCookie("xxxx", token)
}

This code replaces the token only when the current one is within five minutes of expiring. JWT is very friendly to offline app clients because they can store the token locally and simply parse it to use the user information when needed.

Security Recommendations #

Finally, I’d like to add a few more words. In addition to the comments in the code mentioned above, there are some key considerations when using the JWT scheme. Here I will share them with you.

Firstly, the communication process must use the HTTPS protocol to reduce the possibility of interception.

Secondly, it is important to limit the number of token replacements and regularly refresh tokens. For example, a user’s access_token can only be replaced up to 50 times per day. If exceeded, the user will be required to log in again. Additionally, tokens should be refreshed every 15 minutes. This can minimize the impact on users if a token is stolen.
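A daily cap on token replacements could be sketched like this. `RefreshLimiter` and `Allow` are hypothetical names, the counter lives in memory, and days are counted in UTC; a real user center would more likely keep these counters in a shared cache with an expiry.

```go
package main

import "sync"

// maxRefreshesPerDay mirrors the example limit of 50 replacements per day.
const maxRefreshesPerDay = 50

// dayKey identifies one user's counter for one UTC day.
type dayKey struct {
	userID uint64
	day    int64 // days since the Unix epoch, in UTC
}

// RefreshLimiter counts token replacements per user per day.
// Old entries are never cleaned up in this sketch.
type RefreshLimiter struct {
	mu     sync.Mutex
	counts map[dayKey]int
}

// NewRefreshLimiter returns a limiter with no recorded replacements.
func NewRefreshLimiter() *RefreshLimiter {
	return &RefreshLimiter{counts: make(map[dayKey]int)}
}

// Allow records one replacement and reports whether it is within the daily limit.
// nowUnix is the current time in Unix seconds.
func (l *RefreshLimiter) Allow(userID uint64, nowUnix int64) bool {
	key := dayKey{userID: userID, day: nowUnix / 86400}
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.counts[key] >= maxRefreshesPerDay {
		return false
	}
	l.counts[key]++
	return true
}
```

When `Allow` returns false, the user center would refuse to issue a new access_token and require the user to log in again.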

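Thirdly, when storing a web user’s token in a cookie, it is recommended to add restrictions such as HttpOnly, Secure, and SameSite=Strict. HttpOnly prevents malicious scripts from reading the cookie, Secure keeps it off plain HTTP connections, and SameSite=Strict stops the browser from sending it with cross-site requests.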
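With Go’s standard `net/http` package, such a cookie could be built as in the sketch below. The function name `newTokenCookie` and the cookie name `access_token` are assumptions made for illustration.

```go
package main

import "net/http"

// newTokenCookie builds a restricted cookie that carries the token.
// ttlSeconds should match the token's remaining lifetime.
func newTokenCookie(token string, ttlSeconds int) *http.Cookie {
	return &http.Cookie{
		Name:     "access_token", // hypothetical cookie name
		Value:    token,
		Path:     "/",
		MaxAge:   ttlSeconds,
		Secure:   true,                    // only sent over HTTPS
		HttpOnly: true,                    // not readable from JavaScript
		SameSite: http.SameSiteStrictMode, // not sent with cross-site requests
	}
}
```

A handler would then attach it with `http.SetCookie(w, newTokenCookie(token, 15*60))`.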

Summary #

The traditional session approach stores user login information in a centralized manner on the server using SessionID. The client and subsystems need to retrieve this information from the user center every time a request is made, resulting in heavy traffic on the user center and strong dependency on it for all operations.

To alleviate the traffic pressure on the user center and decouple the subsystems from it, we adopt a token-based approach in which trust is established by a signature. The user information is signed and issued to the client, which holds it locally. The subsystems verify the token’s signature and then read the user information from the token directly.

The core of this approach is to transmit and maintain user information outside the server to address the performance bottleneck of the user center. In addition, by regularly changing tokens, the user center also maintains a certain level of user control and increases the difficulty of cracking, achieving multiple benefits.

In fact, many similar designs reduce system pressure, such as using CRC32 checksums to confirm that a file was not corrupted during transmission, or using Bloom filters to quickly check whether a key may exist in a data set (a Bloom filter can rule a key out with certainty, but a positive answer may be a false positive). These techniques can greatly improve the efficiency of a system and reduce the pressure of interaction, even in an era of rapid advances in hardware capabilities.
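As a small illustration of the CRC32 case, the sketch below uses Go’s standard `hash/crc32` package. The helper names `checksum` and `intact` are hypothetical. Unlike a JWT signature, a CRC32 checksum detects accidental corruption only, since anyone can recompute it.

```go
package main

import "hash/crc32"

// checksum returns the CRC-32 (IEEE polynomial) of the data.
func checksum(data []byte) uint32 {
	return crc32.ChecksumIEEE(data)
}

// intact reports whether the received data still matches the checksum
// that the sender computed before transmission.
func intact(received []byte, want uint32) bool {
	return checksum(received) == want
}
```

The receiver needs no round trip to the sender: comparing one 32-bit value is enough to detect a corrupted transfer.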

Thought-provoking question #

How can we quickly change the user nickname saved in the token if the user changes their nickname?

You are welcome to communicate and discuss with me in the comment section. See you in the next lesson!