how can the search engine learn from user interactions?
query suggestions
goal: find related queries in the query log, based on
- common substring
- co-occurrence in session
- term clustering
- clicks
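the co-occurrence idea above can be sketched as follows — a minimal toy, where the session log format and the example queries are assumptions, not the lecture's data:

```python
from collections import defaultdict

def suggest_by_cooccurrence(sessions, query, top_k=3):
    """Suggest queries that frequently co-occur with `query` in the
    same search session (real systems combine this with substrings,
    clicks, and term clustering)."""
    cooc = defaultdict(int)
    for session in sessions:
        unique = set(session)
        if query in unique:
            for other in unique - {query}:
                cooc[other] += 1
    # sort candidate queries by co-occurrence count, descending
    return [q for q, _ in sorted(cooc.items(), key=lambda kv: -kv[1])[:top_k]]

sessions = [
    ["cheap flights", "cheap flights amsterdam", "klm"],
    ["cheap flights", "budget airlines"],
    ["cheap flights", "budget airlines", "ryanair"],
]
print(suggest_by_cooccurrence(sessions, "cheap flights"))
```

"budget airlines" appears in two sessions with the input query, so it ranks first.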
how can we use log data for evaluation?
use clicking and browsing behaviour in addition to queries:
- click-through rate: the number of clicks a document attracts, relative to how often it is shown
- dwell time: time spent on a document
- scrolling behaviour: how users interact with the page
- stopping information: does the user abandon the search engine after a click?
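the first two signals can be computed directly from a log — a minimal sketch, assuming a record format of (query, doc_id, clicked, dwell_seconds):

```python
# assumed toy log: one record per impression of a document
log = [
    ("ir course", "d1", True, 40.0),
    ("ir course", "d1", False, 0.0),
    ("ir course", "d1", True, 5.0),
    ("ir course", "d2", False, 0.0),
]

def ctr(log, doc_id):
    """Click-through rate: clicks divided by impressions."""
    shown = [r for r in log if r[1] == doc_id]
    clicks = [r for r in shown if r[2]]
    return len(clicks) / len(shown)

def mean_dwell(log, doc_id):
    """Average time spent on a document, over clicked impressions."""
    dwells = [r[3] for r in log if r[1] == doc_id and r[2]]
    return sum(dwells) / len(dwells) if dwells else 0.0

print(ctr(log, "d1"))         # 2 clicks / 3 impressions
print(mean_dwell(log, "d1"))  # (40 + 5) / 2
```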
what are the limitations of query logs?
learning from interaction data
implicit feedback, needed if we don’t have explicit relevance assessments
assumption: when the user clicks on a result, it is relevant to them
3 limitations of implicit feedback
noisy: a non-relevant document might be clicked or a relevant document might not be clicked
biased: clicks for reasons other than relevance
- position bias: higher ranked documents get more attention
- selection bias: only interactions on retrieved documents
- presentation bias: results that are presented differently will be treated differently
what is the interpretation of a non-click? => either the document didn’t seem relevant or the user did not see the document
probabilistic model of user clicks
P(clicked(d) | relevance(d), position(d)) = P(clicked(d) | relevance(d), observed(d)) * P(observed(d) | position(d))
(examination hypothesis: a click requires that the document was observed, which depends on its position, and judged relevant)
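the factorization can be illustrated with a small simulation — the propensity and relevance numbers below are made-up assumptions:

```python
import random

random.seed(0)

# examination hypothesis: a document is clicked iff it is observed
# (which depends on its position) AND judged relevant.
observe_prob = [0.9, 0.6, 0.3]   # assumed P(observed | position)
relevance    = [0.2, 0.8, 0.8]   # assumed P(clicked | observed)

def simulate_clicks(n=100_000):
    clicks = [0, 0, 0]
    for _ in range(n):
        for pos in range(3):
            # a click happens only when both events occur
            if random.random() < observe_prob[pos] and random.random() < relevance[pos]:
                clicks[pos] += 1
    return [c / n for c in clicks]

print(simulate_clicks())  # approx [0.18, 0.48, 0.24] = observe * relevance
```

note how the most-clicked document (position 2) is not the most-examined one: relevance and position bias mix in the raw click counts.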
how to measure the effect of position bias?
Idea: changing the position of a document doesn’t change its relevance, so all changes in click behaviour come from the position bias
intervention in the ranking:
1. swap two documents in the ranking
2. present the modified ranking to some users (A/B test)
3. record the clicks on the document in both original and modified rankings
4. estimate the probability of a document being observed from the difference in clicks between the two rankings
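the steps above can be sketched as a simulation — the ground-truth propensities and the document's relevance are assumed values, chosen only to show that relevance cancels out in the click-rate ratio:

```python
import random

random.seed(1)

# swap intervention: relevance is fixed, so any change in a document's
# click rate after moving it must come from position (observation) bias.
observe = {1: 0.9, 2: 0.45}  # assumed P(observed | position)
rel_d = 0.7                  # assumed P(click | observed) for document d

def click_rate(position, n=200_000):
    clicks = sum(
        1 for _ in range(n)
        if random.random() < observe[position] and random.random() < rel_d
    )
    return clicks / n

r1 = click_rate(1)  # d at position 1 (original ranking)
r2 = click_rate(2)  # d at position 2 (swapped ranking, shown in the A/B test)
# relevance cancels in the ratio, leaving the propensity ratio:
print(r2 / r1)  # approx observe[2] / observe[1] = 0.5
```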
how to correct for position bias?
Inverse Propensity Scoring (IPS) estimators can remove bias
Main idea: weight clicks by the inverse of their observation probability => clicks near the top get a low weight, clicks near the bottom get a high weight
formula on slide 20, lecture 11
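without reproducing the slide's formula, the weighting idea can be sketched as follows — a toy estimator, with propensities assumed known (e.g. from a swap experiment):

```python
# each click is weighted by the inverse of its observation propensity,
# so clicks at positions that are rarely observed count more.
propensity = {1: 0.9, 2: 0.6, 3: 0.3}  # assumed P(observed | position)

# observed clicks for one ranking: (doc_id, position)
clicks = [("d1", 1), ("d3", 3)]

def ips_estimate(clicks, propensity):
    """IPS-style weighted click count: in expectation this corrects
    for the fact that lower positions are observed less often."""
    return sum(1.0 / propensity[pos] for _, pos in clicks)

print(ips_estimate(clicks, propensity))  # 1/0.9 + 1/0.3 ≈ 4.44
```

the click at position 3 contributes three times as much as an always-observed click would, compensating for the two unobserved sessions it statistically represents.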
simulation of interaction
session simulation:
- simulate queries
- simulate clicks
- simulate user satisfaction
requires a model of the range of user behaviour:
- users do not always behave deterministically
- might make non-optimal choices
- models need to contain noise
click models
How do users examine the result list and where do they click?
cascade assumption: user examines result list from top to bottom
Dependent Click Model (DCM)
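a minimal DCM-style simulation under the cascade assumption — attractiveness and continuation parameters are made-up assumptions; in DCM the user scans top-down, clicks a result with its attractiveness probability, and after a click continues with a rank-dependent probability:

```python
import random

random.seed(2)

attractiveness = [0.7, 0.5, 0.4]  # assumed P(click | examined) per rank
lambda_ = [0.6, 0.4, 0.0]         # assumed P(continue | clicked at rank r)

def simulate_session():
    """Simulate one session: examine results top to bottom (cascade),
    stop when a click satisfies the user."""
    clicks = []
    for rank, attr in enumerate(attractiveness):
        if random.random() < attr:
            clicks.append(rank)
            if random.random() >= lambda_[rank]:
                break  # satisfied after this click: stop examining
    return clicks

print(simulate_session())
```

the rank-dependent continuation probability is what makes the model "dependent": whether lower results are examined depends on the clicks above them.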
advantages of simulation of interaction
disadvantages of simulation of interaction
query expansion
expand the query with additional similar terms: easy to experiment with in a live search engine because no changes to the index are required
document expansion
Doc2Query
document expansion: train a sequence-to-sequence model that, given a text from a corpus, produces queries for which that document might be relevant
- train on relevant pairs of documents-queries
- use model to predict relevant queries for docs
- append predicted queries to documents
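the append step can be sketched as follows — `predict_queries` is a hypothetical stub standing in for the trained sequence-to-sequence model, and its outputs are made up:

```python
def predict_queries(doc_text):
    """Hypothetical stand-in for a trained seq2seq model that maps a
    document to queries it might be relevant for (stub output)."""
    return ["what is bm25", "bm25 ranking formula"]

def expand_document(doc_text, predict=predict_queries):
    # append the predicted queries so lexical search can match query
    # vocabulary that the original text lacks
    return doc_text + " " + " ".join(predict(doc_text))

doc = "BM25 is a bag-of-words ranking function used by search engines."
print(expand_document(doc))
```

note the expansion happens at indexing time, so unlike query expansion it requires rebuilding the index.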
conversational search: different methods
retrieval-based: select best response from a collection of responses
generation-based: generate response in natural language
hybrid: retrieve information, then generate response
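the retrieval-based method can be sketched with a toy scorer — the canned responses and the overlap-count scoring are assumptions standing in for a real ranker:

```python
# select the best canned response for a user utterance by scoring
# word overlap (a stand-in for a real retrieval model)
responses = [
    "Our opening hours are 9 to 5 on weekdays.",
    "You can reset your password from the account page.",
    "Shipping usually takes 3 to 5 business days.",
]

def tokenize(text):
    return set(text.lower().replace(".", " ").replace("?", " ").split())

def select_response(utterance, responses):
    query = tokenize(utterance)
    # pick the response sharing the most terms with the utterance
    return max(responses, key=lambda r: len(query & tokenize(r)))

print(select_response("how do I reset my password?", responses))
```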
pros of retrieval-based methods
cons of retrieval-based methods
pros of generation-based methods
cons of generation-based methods
how to evaluate conversational search methods?
retrieval-based methods:
- Precision@n
- Mean Reciprocal Rank (MRR)
- Normalized Discounted Cumulative Gain (NDCG)
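two of these metrics are simple enough to compute by hand — a sketch over a ranked list of binary relevance labels (1 = relevant):

```python
def precision_at_n(ranking, n):
    """Fraction of the top-n results that are relevant."""
    return sum(ranking[:n]) / n

def reciprocal_rank(ranking):
    """1 / rank of the first relevant result, 0 if none is relevant;
    MRR is the mean of this over a set of queries."""
    for i, rel in enumerate(ranking, start=1):
        if rel:
            return 1.0 / i
    return 0.0

ranking = [0, 1, 1, 0, 1]
print(precision_at_n(ranking, 3))  # 2 of top 3 relevant -> 2/3
print(reciprocal_rank(ranking))    # first relevant at rank 2 -> 0.5
```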
generation-based methods (measure word overlap):
- BLEU
- ROUGE
- METEOR
challenges in conversational search