
021 SIGIR 2018 Paper Review - How to Model Click Behaviors on Search Pages as a Sequence #

Today, we will continue our look at papers from SIGIR 2018.

We have already shared the best paper from SIGIR 2018, which introduced how to model biases in recommendation systems; by understanding popularity bias, we can make more accurate recommendations. On Monday, we shared the best short paper of the conference, which focused on using adversarial learning techniques to make ranking models more robust and applicable across domains.

Today, we will share a paper titled “A Click Sequence Model for Web Search.”

The first author of the paper, Alexey Borisov, is from the Russian search engine Yandex and is pursuing a PhD at the University of Amsterdam. He has previously published several papers combining click models and deep learning models.

The second author, Martijn Wardenaar, the third author, Ilya Markov, and the final author, Maarten de Rijke, are also from the University of Amsterdam. Maarten is a Dutch computer scientist, an authority in European information retrieval research, and a member of the Royal Netherlands Academy of Arts and Sciences.

Main Contributions of the Paper #

First, let me summarize the core idea of this paper, which is to model user click behavior on search pages using deep learning models.

Traditionally, this kind of modeling of user click behavior on search pages is referred to as “click models”. Since around 2000, click models have been a very active research topic in information retrieval and search. Over the past decade, researchers have proposed dozens of different click models; broadly, these models encode different assumptions about user behavior in order to predict clicks more accurately.

In many traditional click models, a common assumption is made to simplify the model: that for each query keyword, a user makes only one click on the search result page. Under this simplified assumption, it becomes easier for researchers to model user browsing, clicking, and page biases (such as position bias). However, in many scenarios, this assumption is overly simplistic. Many users will click on multiple results on the search result page for the same query keyword. Therefore, modeling multiple click results becomes important.

This paper focuses on sequentially modeling user click behavior on search pages, which enables us to easily predict various aspects of each search page, such as the number of clicks and the positions of clicks.

Furthermore, this paper also contributes by utilizing recurrent neural networks (RNNs) in deep learning to model query keyword results, expanding the expressive power of traditional click models based on purely probabilistic modeling in the era of deep learning.

Core Methods of the Paper #

The core idea of the paper is to model all possible click sequences for each query keyword. This task is accomplished by building a neural network.

Specifically, the model proposed in the paper consists of two important modules: the encoder and the decoder.

The encoder is responsible for generating the “embedding vectors” for the query keyword and search results. In recent years, embedding vectors have become an important technique in deep learning: their goal is to transform discrete variables into continuous representations. In this case, both the query keyword and the search results are discrete inputs that need to be mapped into a common semantic space. The embedding can be thought of as an intermediate result, analogous to a latent variable in probabilistic models.
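To make the idea of mapping discrete inputs into a continuous space concrete, here is a minimal sketch of an embedding lookup. The table values, dimensions, and ids below are all made up for illustration; they are not the paper's actual parameters.

```python
import numpy as np

# Hypothetical embedding table: each of 10 discrete ids gets a 4-dim vector.
rng = np.random.default_rng(0)
vocab_size, embed_dim = 10, 4
embedding_table = rng.normal(size=(vocab_size, embed_dim))

def embed(token_ids):
    """Look up a continuous vector for each discrete id."""
    return embedding_table[np.asarray(token_ids)]

query_ids = [3, 7]             # made-up ids for query terms
result_ids = [1, 5, 2]         # made-up ids for search results
query_vecs = embed(query_ids)      # shape (2, 4)
result_vecs = embed(result_ids)    # shape (3, 4)
```

In a trained model the table entries are learned parameters, so nearby vectors come to represent semantically related queries and results.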

The decoder determines the likelihood of a click at each position, based on the query keyword and search results as represented by their embedding vectors; at each step this is essentially a multiclass classification problem over positions. The decoder also needs to decide when to terminate, so the authors introduce a special symbol representing the end of the sequence, a common device in deep sequence modeling.
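A single decoding step can be sketched as a softmax over the result positions plus one extra slot for the termination symbol. This is a simplified illustration, not the paper's exact decoder; the scores here are random placeholders for what a trained network would produce.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

n_positions = 10  # number of results on the page
# placeholder scores: one per position, plus one "end-of-sequence" slot
scores = np.random.default_rng(1).normal(size=n_positions + 1)
probs = softmax(scores)

next_action = int(np.argmax(probs))
is_termination = (next_action == n_positions)  # last slot ends the sequence
```

Sampling (or beam-searching) from these per-step distributions, rather than taking the argmax, is what lets the model produce multiple candidate click sequences.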

It can be said that the authors put a lot of effort into designing the structure of the encoder and decoder.

For the encoder, the authors believe that a good embedding vector should include the current result information, as well as the context information of the current result and the query keyword. This allows each search result to be treated as an independent unit with sufficient information for subsequent modeling.

Therefore, the authors first convert the query keyword and each search result into first-level embedding vectors. Then, using these first-level embeddings, they apply a recurrent neural network that encodes the surrounding results of the current result, both forward and backward, producing a second-level embedding vector. This second-level embedding is the final representation of each search result.
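The forward/backward encoding can be illustrated with a toy bidirectional RNN. This is a bare-bones tanh RNN in numpy, assuming the query information is already folded into the first-level embeddings; the paper's actual recurrent cell and dimensions will differ.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4  # first-level embedding size (also used as hidden size, for brevity)

# toy first-level embeddings for 5 search results on the page
first_level = rng.normal(size=(5, d))

W_in = rng.normal(size=(d, d)) * 0.1  # input-to-hidden weights
W_h = rng.normal(size=(d, d)) * 0.1   # hidden-to-hidden weights

def rnn_pass(xs):
    """Simple tanh RNN; returns the hidden state at every step."""
    h = np.zeros(d)
    states = []
    for x in xs:
        h = np.tanh(x @ W_in + h @ W_h)
        states.append(h)
    return states

forward = rnn_pass(first_level)               # left-to-right context
backward = rnn_pass(first_level[::-1])[::-1]  # right-to-left context, re-aligned

# second-level embedding: each result plus context from both directions
second_level = [np.concatenate([f, b]) for f, b in zip(forward, backward)]
```

Concatenating the two directions is what gives each result's final representation awareness of the results both above and below it on the page.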

For the decoder, the authors use the “attention” mechanism to apply different weights or attention to each search result. At each time step, after making the decision of whether to click, a new attention vector or a set of new attention weights is generated. The core here is a recurrent neural network that updates its internal state variables and predicts the next click position based on the attention vector and input embedding vector.
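The attention step, in its simplest form, weights each result embedding by its relevance to the decoder's current state. The sketch below uses plain dot-product attention; the paper may parameterize the scoring function differently, and all numbers here are placeholders.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(3)
d = 4
result_embeddings = rng.normal(size=(5, d))  # one vector per search result
decoder_state = rng.normal(size=d)           # RNN internal state at this step

# score each result against the current state, then normalize to weights
weights = softmax(result_embeddings @ decoder_state)
attention_vector = weights @ result_embeddings  # weighted sum, shape (d,)
```

The attention vector is then fed (together with the input embedding) back into the recurrent cell to update its state and score the next click position.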

With the encoder and decoder in place, the remaining challenge is generating the most probable click sequences. As mentioned earlier, the model assigns probability to many different click sequences, so it is necessary to generate the top K of them. The authors use the “beam search” method to approximately generate the top K sequences, with K set to 1024.
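Beam search keeps only the K highest-scoring partial sequences at each step instead of enumerating all possibilities. Here is a generic sketch over fixed per-step log-probabilities; in the actual model the distribution at each step would come from the decoder and depend on the sequence so far.

```python
import numpy as np

def beam_search(step_log_probs, k):
    """Approximate top-k sequences, keeping the k best prefixes per step.

    step_log_probs: list of 1-D arrays of per-action log-probabilities,
    one array per step (assumed independent of the prefix, for simplicity).
    """
    beams = [([], 0.0)]  # (sequence, cumulative log-probability)
    for log_probs in step_log_probs:
        candidates = [
            (seq + [a], score + float(log_probs[a]))
            for seq, score in beams
            for a in range(len(log_probs))
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:k]  # prune to the k best prefixes
    return beams

# toy example: two steps, three possible actions each, beam width 2
steps = [np.log(np.array([0.6, 0.3, 0.1])),
         np.log(np.array([0.5, 0.4, 0.1]))]
top = beam_search(steps, k=2)
```

With K as large as 1024, the beam covers the overwhelming majority of probability mass, which is what allows the model's predicted set to contain nearly all observed sequences.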

The training of the model uses standard SGD and Adam optimization methods. The authors also employ “gradient clipping” to prevent the occurrence of “gradient explosion” during optimization.
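Gradient clipping by global norm can be sketched in a few lines: if the combined L2 norm of all gradients exceeds a threshold, every gradient is rescaled by the same factor. This is a generic illustration of the technique, not the authors' training code.

```python
import numpy as np

def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so their combined L2 norm is at most max_norm."""
    total_norm = np.sqrt(sum(float((g ** 2).sum()) for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads

# toy gradients with global norm sqrt(9 + 16 + 0 + 144) = 13
grads = [np.array([3.0, 4.0]), np.array([0.0, 12.0])]
clipped = clip_by_global_norm(grads, max_norm=5.0)
```

Rescaling all gradients jointly preserves their relative directions while bounding the step size, which is what prevents a single exploding gradient from destabilizing the recurrent network.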

Experimental Results #

The authors conducted their experiments on data from Yandex, the Russian search engine. Since no prior models addressed exactly this task, there were no direct baselines to compare against. The main evaluation checked whether historical click sequences could be correctly predicted, i.e., whether they appeared among the model's top K predicted sequences. This motivated the choice of K = 1024: at that value, nearly 97% of the historical sequences were included in the model's predictions.

The authors also evaluated whether the model could predict the total number of clicks, along with other related click-prediction tasks. The proposed model assigned probability close to 1 to the observed clicks and outperformed earlier probability-based click models. It is fair to say the proposed model can effectively capture users' click behavior on search pages.

Summary #

Today I discussed a brilliant paper from SIGIR 2018.

Let’s review the main points: Firstly, we provided a detailed introduction to the problem that this paper addresses and its contributions. The main contribution is the sequence modeling of user click behavior on search pages. Secondly, we briefly introduced the core content of the proposed method, which includes two modules: the encoder and the decoder. Finally, we provided a brief overview of the experimental results in the paper.

Lastly, I leave you with a question to ponder: if you wanted to model click behavior across multiple consecutive query keywords (i.e., an entire search session), could you extend the approach proposed in this paper to do so?