
013 WSDM 2018 Paper Review - How the Google Team Estimates Position Bias #

WSDM (International Conference on Web Search and Data Mining) is a top conference on search, data mining, and machine learning. It has been held annually since 2008, for a total of 11 editions.

With only 11 editions behind it, WSDM is still a very young conference by computer science standards. Nevertheless, its rapidly accumulated influence has made it a top venue in data mining. According to data published by Google Scholar, WSDM is currently the second most influential academic conference in data mining, behind only KDD, which has been held for over 20 years.

One of the major features of WSDM is the strong participation of scholars from industry: many people with industry backgrounds submit and publish papers, serve on the review committees, and sit on the conference organizing committee. This is one reason WSDM attracts so much attention, as people value research results from industry and hope to learn from its latest experience.

The 2018 WSDM conference took place from February 5th to 9th in Los Angeles, United States. Today, we will share a paper from Google presented at WSDM 2018, titled “Position Bias Estimation for Unbiased Learning to Rank in Personal Search”. The core of this paper is how to combine “causal inference” with “learning to rank” to estimate biases in user interaction data and thereby learn from that data in an unbiased way.

Author Information #

This paper was authored by a group of researchers from Google. Here is a brief introduction to each author:

  • Xuanhui Wang is the first author of the paper. He has been working at Google since 2015. Prior to that, he worked at Facebook for three years, focusing on the development of advertising systems. Before his time at Facebook, he worked as a scientist at Yahoo for two years. Wang received his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign in 2009, under the supervision of Chengxiang Zhai, a renowned Chinese scholar in the field of information retrieval.

  • Nadav Golbandi is the second author of the paper. He joined Google in 2016, having previously worked as a Principal Research Engineer at Yahoo Research for 8 years, where he specialized in search-related research and development. Before his time at Yahoo Research, Golbandi worked at IBM Research in Israel for 6 years. He holds a Master’s degree in Computer Science from the Technion – Israel Institute of Technology.

  • Michael Bendersky is the third author of the paper. He joined Google in 2012 and has been primarily involved in the development of personal and enterprise information systems, specifically Google Drive. Bendersky received his Ph.D. in Computer Science from the University of Massachusetts Amherst in 2011, under the guidance of Bruce Croft, a well-respected academic authority in the field of information retrieval.

  • Donald Metzler is the fourth author of the paper. He also joined Google in 2012 and has been responsible for the research and development of search quality for personal and enterprise information systems, including Google Drive. Metzler previously worked at Yahoo Research for over two years, and he also served as faculty at the University of Southern California. He received his Ph.D. in Computer Science from the University of Massachusetts Amherst in 2007, under the supervision of Bruce Croft.

  • The last author of the paper is Marc Najork. He joined Google in 2014 and currently holds the position of Research Engineering Director. Prior to Google, Najork worked at Microsoft Research Silicon Valley for 13 years and DEC Research Labs for 8 years. Najork is an academic authority in the field of information retrieval and web data mining. He previously served as the editor-in-chief of the top ACM journal ACM Transactions on the Web. He has published numerous academic papers, with over 7,000 citations.

Main Contributions of the Paper #

Following our method of reading the paper, let’s first look at the main contributions of this article and clarify what problem it primarily solves.

As we all know, every search system exhibits various “biases,” and modeling these biases well has become an important challenge for machine learning in search systems.

One way is to rely on manual labeling of “relevance,” as in traditional information retrieval systems. Since no user interaction data is involved, the problem of estimating biases does not arise in this setting.

The second way, as the article notes, is to use traditional “click models.” Click models are probabilistic graphical models designed specifically to estimate both relevance and biases, and they have matured considerably over the past decade or so. The article also points out that most applications of click models focus on extracting relevance information rather than on accurately estimating biases.

The third way is a newer direction that has emerged in recent years: directly modeling biases by combining “causal inference” with learning to rank. This idea was showcased in the best paper at WSDM 2017 [1]. However, that paper did not explore the relationship between bias estimation and click models in depth.

In short, the main goal of this paper is to borrow ideas from click models to estimate biases more accurately, in order to learn better ranking results. At the same time, the paper explores how to estimate biases well with less reliance on randomized data. To this end, the authors propose a method called the “Regression-based EM” (Expectation-Maximization) algorithm.

Core Methods of the Paper #

The article first shows that if the “bias value” (propensity score), namely the probability that a user examines each document or position, is known, then we can construct an “unbiased” metric, such as “Unbiased Precision”, to measure the quality of the system.

Here, the unbiased effect comes mainly from re-weighting the results, which means that not every click counts equally. In general, a click on a higher-positioned document receives a lower weight, and a click on a lower-positioned document receives a higher weight. The underlying assumption is “position bias”: regardless of which document is shown, higher positions attract more clicks, so it is harder for documents in lower positions to be clicked at all.
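The re-weighting idea can be illustrated with a small sketch. This is illustrative only, not code from the paper; the function name, the propensity values, and the click log are all assumptions. Each click is weighted by the inverse of the propensity of its position, so clicks at lower, rarely examined positions count for more.

```python
def inverse_propensity_weight_sum(clicks, theta):
    """Sum of inverse-propensity weights over clicked results for one query.

    clicks: list of (position, clicked) pairs, positions 0-based.
    theta:  examination probability (propensity) for each position.
    """
    return sum(1.0 / theta[pos] for pos, clicked in clicks if clicked)

# With propensities [1.0, 0.6, 0.4, 0.25], a click at position 3 counts
# four times as much as a click at position 0: 1/1.0 + 1/0.25 = 5.0.
print(inverse_propensity_weight_sum([(0, True), (1, False), (3, True)],
                                    [1.0, 0.6, 0.4, 0.25]))
```

Summing such weights instead of raw click counts is what removes the position bias from the metric, provided the propensities are accurate.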

In practice, however, the bias value is generally not directly observable. Estimating it therefore becomes the core problem.

To estimate the bias value, the article first uses a classical click model, the “position bias model”, to model bias and relevance jointly. The position bias model assumes that the probability a user clicks on a document at a given position, for a given query, can be decomposed into the product of two probabilities: the probability that the user examines that position, and the probability that the document itself is relevant. The main task of the model is therefore to estimate these two probabilities.
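In the notation common to the click-model literature (the symbols here are assumptions, not taken from the article), the factorization reads:

```latex
% Click probability at position k for query q and document d factorizes into
% an examination term and a relevance term:
P(C = 1 \mid q, d, k)
  = \underbrace{P(E = 1 \mid k)}_{\text{examination: } \theta_k}
    \cdot
    \underbrace{P(R = 1 \mid q, d)}_{\text{relevance: } \gamma_{q,d}}
```

The model thus has one examination parameter per position and one relevance parameter per query-document pair.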

If we could fully randomize the results for each query, the first probability would not need to be estimated: the examination probability would then be the same for every document shown at a given position, so a document’s click-through rate would directly estimate its relevance. However, the authors show that complete randomization noticeably harms the user experience.

A method that is gentler on the user experience is to randomize not the entire result list but only adjacent “pairs”: for example, swap the positions of the first and second documents, or of the second and third, and so on. With this approach the authors can still estimate bias and relevance, while the user experience suffers less than under complete randomization. Even so, this method still sacrifices the user experience to some extent.
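A small sketch of why pair swapping works (illustrative, not the paper’s code; the function name and counts are assumptions): when the same traffic shows a document at positions k and k+1 equally often through random swaps, the document’s relevance cancels out of the click-through-rate ratio, which then estimates the ratio of examination probabilities.

```python
def propensity_ratio(clicks_at_k, clicks_at_k_plus_1):
    """Estimate theta[k+1] / theta[k] from clicks aggregated over impressions
    where the (k, k+1) pair was randomly swapped, so each document spends
    equal traffic at both positions and its relevance cancels out."""
    return clicks_at_k_plus_1 / clicks_at_k

# 100 clicks at position k versus 60 at position k+1 over the same swapped
# traffic implies theta[k+1] / theta[k] = 0.6; chaining these ratios across
# adjacent pairs yields each position's propensity relative to position 1.
print(propensity_ratio(100, 60))
```

Because only adjacent results ever trade places, the displayed ranking stays close to the original one, which is exactly why this scheme disturbs users less than full randomization.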

Therefore, the authors propose a third method: estimating the parameters of the position bias model directly. That is, rather than eliminating the examination probability through randomization, they estimate both the examination probability and the relevance probability from ordinary click logs.

Here, because two sets of probabilities must be estimated, the authors start from the traditional Expectation-Maximization (EM) algorithm and propose a method called “Regression-based Expectation-Maximization”. Why do this? In the standard EM formulation, a separate relevance parameter must be estimated for every query-document pair, but in real user data many such pairs occur only a handful of times, leading to severe data sparsity. The authors therefore replace the per-pair relevance table with a regression model over query and document features: EM estimates the position bias, while the regression model estimates relevance.
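The EM iteration described above can be sketched as follows. This is a minimal illustration of EM for the position bias model, not the paper’s implementation: for brevity the relevance “regression” is reduced to a per-(query, document) table, whereas the paper fits a regression model on query-document features in the M-step. All function and variable names are assumptions.

```python
from collections import defaultdict

def em_position_bias(logs, n_positions, n_iters=50):
    """logs: list of (qd_key, position, clicked) impressions, positions 0-based.
    Returns (theta, gamma): examination probability per position and
    relevance per (query, document) key."""
    theta = [0.5] * n_positions
    gamma = defaultdict(lambda: 0.5)
    for _ in range(n_iters):
        # E-step: posterior of the hidden examination (E) and relevance (R)
        # variables for every logged impression.
        exam_num = [0.0] * n_positions
        exam_cnt = [0.0] * n_positions
        rel_num = defaultdict(float)
        rel_cnt = defaultdict(float)
        for qd, k, clicked in logs:
            t, g = theta[k], gamma[qd]
            if clicked:
                p_exam = p_rel = 1.0                 # a click implies E = 1 and R = 1
            else:
                no_click = 1.0 - t * g
                p_exam = t * (1.0 - g) / no_click    # examined but not relevant
                p_rel = (1.0 - t) * g / no_click     # relevant but not examined
            exam_num[k] += p_exam
            exam_cnt[k] += 1.0
            rel_num[qd] += p_rel
            rel_cnt[qd] += 1.0
        # M-step: re-estimate theta per position; the paper would instead fit
        # a regression model to these per-pair relevance statistics here.
        theta = [exam_num[k] / max(exam_cnt[k], 1e-9) for k in range(n_positions)]
        for qd in rel_num:
            gamma[qd] = rel_num[qd] / rel_cnt[qd]
    return theta, gamma
```

On synthetic logs where one document has a click-through rate of 0.8 at position 1 and 0.4 at position 2, the estimated products theta[k] * gamma converge to those rates; the individual factors are only identified up to a common scale, which is why some form of anchoring or randomization is still useful in practice.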

Experimental Results of the Method #

This article uses search data from Google’s email and file storage services, drawn from two weeks of logs in April 2017. The data set contains approximately four million queries, with about five results per query. The authors validated the proposed method on this data set and found that it estimates position bias more effectively. A ranking model trained with the estimated bias performed 1% to 2% better than models that did not account for bias.

Summary #

Today I talked with you about a paper from the Google team at WSDM 2018. This paper shows how to estimate the position bias of documents and use it to train more effective ranking algorithms.

Let’s review the key points together: first, we briefly introduced the authors of the paper; second, we described in detail the problem it aims to solve and its main contributions; third, we walked through the core of the proposed method.

Finally, I will leave you with a question to ponder: when estimating position bias, are there any requirements on the randomness of the data?

References

  1. Thorsten Joachims, Adith Swaminathan, and Tobias Schnabel. Unbiased Learning-to-Rank with Biased Feedback. Proceedings of the Tenth ACM International Conference on Web Search and Data Mining (WSDM ’17). ACM, New York, NY, USA, 781–789, 2017.