ultra.learning_algorithm package¶
Submodules¶
ultra.learning_algorithm.base_algorithm module¶
The basic class that contains all the API needed for the implementation of an unbiased learning to rank algorithm.
-
class
ultra.learning_algorithm.base_algorithm.
BaseAlgorithm
(data_set, exp_settings)¶ Bases:
abc.ABC
The basic class that contains all the API needed for the implementation of an unbiased learning to rank algorithm.
-
PADDING_SCORE
= -100000¶
-
abstract
__init__
(data_set, exp_settings)¶ Create the model.
- Parameters
data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.
-
create_input_feed
(input_feed, list_size)¶ Create the input from input_feed to run the model
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
list_size – (int) The top number of documents to consider in the input docids.
-
create_model
(feature_size)¶ Initialize the ranking model.
- Returns
The ranking model that will be used to compute the ranking score.
-
create_summary
(scalar_name, summarize_name, value, is_training)¶ Summarize the result of an operation
- Parameters
scalar_name – (str) A string used as the name for the result of the operation to add to Tensorboard.
summarize_name – (str) A string used as the name for the summarization of the operation.
value – The value of the result of the operation.
is_training – (bool) Whether the model is in training mode rather than evaluation mode.
- Returns
The ranking model that will be used to compute the ranking score.
-
get_ranking_scores
(model, input_id_list, **kwargs)¶ Compute ranking scores with the given inputs.
- Parameters
model – (BaseRankingModel) The model that is used to compute the ranking score.
input_id_list – (list<torch.Tensor>) A list of tensors containing document ids. Each tensor must have a shape of [None].
is_training – (bool) A flag indicating whether the model is running in training mode.
- Returns
A tensor with the same shape as input_docids.
-
l2_loss
(input)¶
-
opt_step
(opt, params)¶ Perform an optimization step
- Parameters
opt – Optimization Function to use
params – Model’s parameters
- Returns
The ranking model that will be used to compute the ranking score.
-
pairwise_cross_entropy_loss
(pos_scores, neg_scores, propensity_weights=None)¶ Computes pairwise softmax loss without propensity weighting.
- Parameters
pos_scores – (torch.Tensor) A tensor with shape [batch_size, 1]. Each value is the ranking score of a positive example.
neg_scores – (torch.Tensor) A tensor with shape [batch_size, 1]. Each value is the ranking score of a negative example.
propensity_weights – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.
- Returns
(torch.Tensor) A single value tensor containing the loss.
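Example (illustrative sketch, not the library code): for one positive/negative pair, a pairwise softmax cross entropy can be read as the cross entropy over the two-element list [pos, neg] with target [1, 0]. The function name `pairwise_softmax_loss` and the scalar `weight` argument are assumptions made for this example, and plain Python floats stand in for tensors.

```python
import math


def pairwise_softmax_loss(pos_score, neg_score, weight=1.0):
    """Softmax cross entropy over [pos, neg] with target [1, 0] (hypothetical helper)."""
    # -log(exp(pos) / (exp(pos) + exp(neg))) == log(1 + exp(neg - pos));
    # the two branches evaluate it without overflowing exp().
    diff = neg_score - pos_score
    if diff > 0:
        loss = diff + math.log1p(math.exp(-diff))
    else:
        loss = math.log1p(math.exp(diff))
    # An optional propensity weight simply rescales the pair's loss.
    return weight * loss
```

The loss shrinks as the positive score moves above the negative score, and equal scores give log 2.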
-
pairwise_loss_on_list
(output, labels, propensity_weights=None)¶ Computes pairwise entropy loss.
- Parameters
output – (torch.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.
labels – (torch.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.
propensity_weights – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.
- Returns
(torch.Tensor) A single value tensor containing the loss.
-
ranking_model
(model, list_size)¶ Construct ranking model with the given list size.
- Parameters
model – (BaseRankingModel) The model that is used to compute the ranking score.
list_size – (int) The top number of documents to consider in the input docids.
scope – (string) The name of the variable scope.
- Returns
A tensor with the same shape as input_docids.
-
remove_padding_for_metric_eval
(input_id_list, model_output)¶
-
sigmoid_loss_on_list
(output, labels, propensity_weights=None)¶ Computes pointwise sigmoid loss without propensity weighting.
- Parameters
output – (torch.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.
labels – (torch.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.
propensity_weights – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.
- Returns
(torch.Tensor) A single value tensor containing the loss.
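Example (illustrative sketch, not the library code): a pointwise sigmoid loss treats every document as an independent binary classification problem. The sketch below handles a single list with plain Python floats; the function name, the mean reduction, and the per-document `weights` argument are assumptions made for this example.

```python
import math


def sigmoid_loss_on_list_sketch(scores, labels, weights=None):
    """Mean pointwise sigmoid cross entropy over one list (hypothetical helper)."""
    if weights is None:
        weights = [1.0] * len(scores)
    total = 0.0
    for x, y, w in zip(scores, labels, weights):
        z = 1.0 if y >= 1 else 0.0  # binarize: a label >= 1 means relevant
        # Numerically stable form of -z*log(sigmoid(x)) - (1-z)*log(1-sigmoid(x)).
        total += w * (max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x))))
    return total / len(scores)
```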
-
softmax_loss
(output, labels, propensity_weights=None)¶ Computes listwise softmax loss without propensity weighting.
- Parameters
output – (torch.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.
labels – (torch.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.
propensity_weights – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.
- Returns
(torch.Tensor) A single value tensor containing the loss.
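Example (illustrative sketch, not the library code): a listwise softmax loss compares the softmax of the scores in a list with a target distribution built from the labels. The sketch below handles a single list with plain Python floats; the function name, the way the weighted labels are normalized into a distribution, and the zero loss for lists without relevant documents are assumptions made for this example.

```python
import math


def listwise_softmax_loss(scores, labels, weights=None):
    """Cross entropy between softmax(scores) and normalized weighted labels."""
    if weights is None:
        weights = [1.0] * len(scores)
    weighted = [l * w for l, w in zip(labels, weights)]
    total = sum(weighted)
    if total == 0:
        return 0.0  # no relevant document in the list, nothing to learn from
    # log-softmax with the usual max-shift for numerical stability
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    log_probs = [s - log_z for s in scores]
    return -sum(t / total * lp for t, lp in zip(weighted, log_probs))
```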
-
abstract
train
(input_feed)¶ Run a step of the model feeding the given inputs for training.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
-
abstract
validation
(input_feed)¶ Run a step of the model feeding the given inputs for the validation process.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
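Example (illustrative sketch): the abstract train and validation methods define the contract that every algorithm in this package fulfils, namely returning a (loss, outputs, summary) triple. The stand-in below mimics that contract with the standard `abc` module so that it runs without the library; `SketchAlgorithm`, `ConstantScorer`, and the `"docs"` key of the input feed are hypothetical names, not part of ULTRA.

```python
from abc import ABC, abstractmethod


class SketchAlgorithm(ABC):
    """Hypothetical stand-in for BaseAlgorithm; not the library class."""

    @abstractmethod
    def train(self, input_feed):
        """Return (loss, outputs, summary) for one training step."""

    @abstractmethod
    def validation(self, input_feed):
        """Return (loss, outputs, summary) for one validation step."""


class ConstantScorer(SketchAlgorithm):
    """Toy subclass that scores every document with the same constant."""

    def train(self, input_feed):
        # No outputs are returned when the step performs a backward pass.
        return 0.0, None, {"loss": 0.0}

    def validation(self, input_feed):
        # "docs" is an assumed key holding one list of document ids per query.
        scores = [[0.5] * len(docs) for docs in input_feed["docs"]]
        return None, scores, {}
```

Because both methods are abstract, instantiating the base class directly raises a TypeError; only subclasses that implement both can be created.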
-
-
ultra.learning_algorithm.base_algorithm.
softmax_cross_entropy_with_logits
(logits, labels)¶ Computes softmax cross entropy between logits and labels.
- Parameters
logits – A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.
labels – A tensor of the same shape as logits. A value >= 1 means a relevant example.
- Returns
A single value tensor containing the loss.
ultra.learning_algorithm.dbgd module¶
Training and testing the Dueling Bandit Gradient Descent (DBGD) algorithm for unbiased learning to rank.
See the following paper for more information on the Dueling Bandit Gradient Descent (DBGD) algorithm.
Yisong Yue and Thorsten Joachims. 2009. Interactively optimizing information retrieval systems as a dueling bandits problem. In ICML. 1201–1208.
-
class
ultra.learning_algorithm.dbgd.
DBGD
(data_set, exp_settings)¶ Bases:
ultra.learning_algorithm.base_algorithm.BaseAlgorithm
The Dueling Bandit Gradient Descent (DBGD) algorithm for unbiased learning to rank.
This class implements the Dueling Bandit Gradient Descent (DBGD) algorithm based on the input layer feed. See the following paper for more information on the algorithm.
Yisong Yue and Thorsten Joachims. 2009. Interactively optimizing information retrieval systems as a dueling bandits problem. In ICML. 1201–1208.
-
__init__
(data_set, exp_settings)¶ Create the model.
- Parameters
data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.
-
click_simulation_winners
(input_feed, rank_scores)¶
-
compute_gradient
(final_winners, noisy_params)¶
-
create_new_output_list
(noisy_params)¶
-
create_noisy_param
()¶
-
train
(input_feed)¶ Run a step of the model feeding the given inputs for the training process.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
-
validation
(input_feed, is_online_simulation=False)¶ Run a step of the model feeding the given inputs for the validation process.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
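Example (illustrative sketch, not the library code): the update rule described by Yue and Joachims perturbs the current parameters along a random unit direction, compares the perturbed ranker with the current one, and moves a small step along that direction only if the perturbed ranker wins. The function names, the `candidate_wins` callback, and the default step sizes below are assumptions made for this example; the projection back onto a feasible parameter set is omitted.

```python
import math
import random


def sample_unit_direction(dim, rng):
    """Draw a direction uniformly from the unit sphere by normalizing Gaussian noise."""
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def dbgd_step(weights, candidate_wins, rng, delta=1.0, gamma=0.01):
    """One DBGD iteration: explore with step delta, exploit with step gamma."""
    direction = sample_unit_direction(len(weights), rng)
    candidate = [w + delta * u for w, u in zip(weights, direction)]
    # candidate_wins is a hypothetical callback standing in for the comparison
    # on user clicks; it returns True if the candidate ranker wins the duel.
    if candidate_wins(weights, candidate):
        return [w + gamma * u for w, u in zip(weights, direction)]
    return list(weights)
```

Passing a seeded `random.Random` instance as `rng` keeps the sampled directions reproducible.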
ultra.learning_algorithm.dbgd_interleave module¶
ultra.learning_algorithm.dla module¶
Training and testing the dual learning algorithm for unbiased learning to rank.
See the following paper for more information on the dual learning algorithm.
Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, W. Bruce Croft. 2018. Unbiased Learning to Rank with Unbiased Propensity Estimation. In Proceedings of SIGIR '18.
-
class
ultra.learning_algorithm.dla.
DLA
(data_set, exp_settings)¶ Bases:
ultra.learning_algorithm.base_algorithm.BaseAlgorithm
The Dual Learning Algorithm for unbiased learning to rank.
This class implements the Dual Learning Algorithm (DLA) based on the input layer feed. See the following paper for more information on the algorithm.
Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, W. Bruce Croft. 2018. Unbiased Learning to Rank with Unbiased Propensity Estimation. In Proceedings of SIGIR '18.
-
__init__
(data_set, exp_settings)¶ Create the model.
- Parameters
data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.
-
clip_grad_value
(parameters, clip_value_min, clip_value_max)¶ Clips gradient of an iterable of parameters at specified value.
Gradients are modified in-place.
- Parameters
parameters (Iterable[Tensor] or Tensor) – an iterable of Tensors or a single Tensor that will have gradients clipped
clip_value_min (float or int) – minimum allowed value of the gradients.
clip_value_max (float or int) – maximum allowed value of the gradients. The gradients are clipped in the range \(\left[\text{clip\_value\_min}, \text{clip\_value\_max}\right]\)
- Return type
None
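Example (illustrative sketch, not the library code): clipping by value clamps every gradient entry into a closed interval and modifies the gradients in place. The sketch below applies the same idea to a plain Python list; the name `clip_values` is an assumption made for this example.

```python
def clip_values(grads, clip_value_min, clip_value_max):
    """Clamp each entry of grads into [clip_value_min, clip_value_max], in place."""
    for i, g in enumerate(grads):
        grads[i] = min(max(g, clip_value_min), clip_value_max)
```

For instance, clipping [-5.0, 0.3, 7.0] to the range [-1.0, 1.0] leaves the middle entry untouched and moves the other two to the boundaries.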
-
get_normalized_weights
(propensity)¶ Computes normalized propensity weights from the estimated propensities.
- Parameters
propensity – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.
- Returns
(torch.Tensor) A tensor containing the propensity weights.
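Example (illustrative sketch, not the library code): one way to turn per-position propensities into weights is to take their inverse and normalize by the first position, so that the top position always receives weight 1. That normalization, the function name, and the use of a plain Python list for a single ranked list are assumptions made for this example.

```python
def normalized_inverse_propensity(propensity):
    """Map per-position propensities p_1..p_n to weights p_1 / p_i."""
    first = propensity[0]
    return [first / p for p in propensity]
```

Positions that are less likely to be examined receive larger weights, which compensates for the clicks they are expected to miss.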
-
separate_gradient_update
()¶
-
train
(input_feed)¶ Run a step of the model feeding the given inputs.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
-
validation
(input_feed, is_online_simulation=False)¶ Run a step of the model feeding the given inputs for the validation process.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
-
class
ultra.learning_algorithm.dla.
DenoisingNet
(input_vec_size)¶ Bases:
torch.nn.modules.module.Module
-
__init__
(input_vec_size)¶ Initializes internal Module state, shared by both nn.Module and ScriptModule.
-
forward
(input_list)¶ Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
-
training
: bool¶
-
-
ultra.learning_algorithm.dla.
sigmoid_prob
(logits)¶
ultra.learning_algorithm.ipw_rank module¶
Training and testing the inverse propensity weighting algorithm for unbiased learning to rank.
See the following papers for more information on the inverse propensity weighting algorithm.
Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork. 2016. Learning to Rank with Selection Bias in Personal Search. In Proceedings of SIGIR '16.
Thorsten Joachims, Adith Swaminathan, Tobias Schnabel. 2017. Unbiased Learning-to-Rank with Biased Feedback. In Proceedings of WSDM '17.
-
class
ultra.learning_algorithm.ipw_rank.
IPWrank
(data_set, exp_settings)¶ Bases:
ultra.learning_algorithm.base_algorithm.BaseAlgorithm
The Inverse Propensity Weighting algorithm for unbiased learning to rank.
This class implements the training and testing of the Inverse Propensity Weighting algorithm for unbiased learning to rank. See the following papers for more information on the algorithm.
Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork. 2016. Learning to Rank with Selection Bias in Personal Search. In Proceedings of SIGIR '16.
Thorsten Joachims, Adith Swaminathan, Tobias Schnabel. 2017. Unbiased Learning-to-Rank with Biased Feedback. In Proceedings of WSDM '17.
-
__init__
(data_set, exp_settings)¶ Create the model.
- Parameters
data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.
-
train
(input_feed)¶ Run a step of the model feeding the given inputs for the training process.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
-
validation
(input_feed, is_online_simulation=False)¶ Run a step of the model feeding the given inputs for the validation process.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
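Example (illustrative sketch, not the library code): inverse propensity weighting divides the loss of each clicked document by the probability that its position was examined, which makes the expected loss independent of position bias. The function name and the plain-list inputs below are assumptions made for this example.

```python
def ipw_loss(per_doc_losses, clicks, propensities):
    """Sum the losses of clicked documents, each weighted by 1 / propensity."""
    total = 0.0
    for loss, click, p in zip(per_doc_losses, clicks, propensities):
        if click:
            total += loss / p
    return total
```

A click at a rarely examined position (small propensity) therefore counts more than a click at the top of the list.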
-
ultra.learning_algorithm.ipw_rank.
selu
(x)¶
ultra.learning_algorithm.pairwise_debias module¶
Training and testing the Pairwise Debiasing algorithm for unbiased learning to rank.
See the following paper for more information on the Pairwise Debiasing algorithm.
Hu, Ziniu, Yang Wang, Qu Peng, and Hang Li. “Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm.” In The World Wide Web Conference, pp. 2830-2836. ACM, 2019.
-
class
ultra.learning_algorithm.pairwise_debias.
PairDebias
(data_set, exp_settings)¶ Bases:
ultra.learning_algorithm.base_algorithm.BaseAlgorithm
The Pairwise Debiasing algorithm for unbiased learning to rank.
This class implements the Pairwise Debiasing algorithm based on the input layer feed. See the following paper for more information on the algorithm.
Hu, Ziniu, Yang Wang, Qu Peng, and Hang Li. “Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm.” In The World Wide Web Conference, pp. 2830-2836. ACM, 2019.
-
__init__
(data_set, exp_settings)¶ Create the model.
- Parameters
data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.
-
train
(input_feed)¶ Run a step of the model feeding the given inputs for the training process.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
-
validation
(input_feed, is_online_simulation=False)¶ Run a step of the model feeding the given inputs for the validation process.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
-
ultra.learning_algorithm.pairwise_debias.
get_bernoulli_sample
(probs)¶ Conduct Bernoulli sampling according to a specific probability distribution.
- Parameters
probs – (torch.Tensor) A tensor in which each element denotes a probability of 1 in a Bernoulli distribution.
- Returns
A Tensor of binary samples (0 or 1) with the same shape as probs.
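Example (illustrative sketch, not the library code): Bernoulli sampling draws one uniform random number per element and returns 1 when it falls below that element's probability. The sketch below uses a plain Python list instead of a tensor; the function name and the optional `rng` argument are assumptions made for this example.

```python
import random


def bernoulli_sample(probs, rng=random):
    """Return a 0/1 sample for each probability in probs."""
    # rng.random() lies in [0, 1), so p = 0 never fires and p = 1 always fires.
    return [1 if rng.random() < p else 0 for p in probs]
```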
ultra.learning_algorithm.pdgd module¶
Training and testing the Pairwise Differentiable Gradient Descent (PDGD) algorithm for unbiased learning to rank.
See the following paper for more information on the Pairwise Differentiable Gradient Descent (PDGD) algorithm.
Oosterhuis, Harrie, and Maarten de Rijke. “Differentiable unbiased online learning to rank.” In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 1293-1302. ACM, 2018.
-
class
ultra.learning_algorithm.pdgd.
PDGD
(data_set, exp_settings, forward_only=False)¶ Bases:
ultra.learning_algorithm.base_algorithm.BaseAlgorithm
The Pairwise Differentiable Gradient Descent (PDGD) algorithm for unbiased learning to rank.
This class implements the Pairwise Differentiable Gradient Descent (PDGD) algorithm based on the input layer feed. See the following paper for more information on the algorithm.
Oosterhuis, Harrie, and Maarten de Rijke. “Differentiable unbiased online learning to rank.” In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 1293-1302. ACM, 2018.
-
__init__
(data_set, exp_settings, forward_only=False)¶ Create the model.
- Parameters
data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.
-
train
(input_feed)¶ Run a step of the model feeding the given inputs for the training process.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
-
validation
(input_feed, is_online_simulation=False)¶ Run a step of the model feeding the given inputs for the validation process.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
ultra.learning_algorithm.regression_EM module¶
Training and testing the regression-based EM algorithm for unbiased learning to rank.
See the following paper for more information on the regression-based EM algorithm.
Wang, Xuanhui, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. “Position bias estimation for unbiased learning to rank in personal search.” In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 610-618. ACM, 2018.
-
class
ultra.learning_algorithm.regression_EM.
RegressionEM
(data_set, exp_settings)¶ Bases:
ultra.learning_algorithm.base_algorithm.BaseAlgorithm
The regression-based EM algorithm for unbiased learning to rank.
This class implements the regression-based EM algorithm based on the input layer feed. See the following paper for more information.
Wang, Xuanhui, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. “Position bias estimation for unbiased learning to rank in personal search.” In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 610-618. ACM, 2018.
In particular, we use the online EM algorithm for the parameter estimations:
Cappé, Olivier, and Eric Moulines. “Online expectation–maximization algorithm for latent data models.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 71.3 (2009): 593-613.
-
__init__
(data_set, exp_settings)¶ Create the model.
- Parameters
data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.
-
train
(input_feed)¶ Run a step of the model feeding the given inputs for the training process.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
-
validation
(input_feed, is_online_simulation=False)¶ Run a step of the model feeding the given inputs for the validation process.
- Parameters
input_feed – (dictionary) A dictionary containing all the input feed data.
- Returns
A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
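Example (illustrative sketch, not the library code): under a position-based click model, where a click happens exactly when a document is both examined and relevant, the E-step for an unclicked document distributes the probability mass over the three remaining (examined, relevant) combinations. The function name and the dictionary keys below are assumptions made for this example.

```python
def e_step_no_click(theta, gamma):
    """Posteriors over (examined, relevant) for an unclicked document.

    theta: probability that the position is examined.
    gamma: probability that the document is relevant.
    """
    denom = 1.0 - theta * gamma  # probability of observing no click
    return {
        "examined_not_relevant": theta * (1.0 - gamma) / denom,
        "relevant_not_examined": (1.0 - theta) * gamma / denom,
        "neither": (1.0 - theta) * (1.0 - gamma) / denom,
    }
```

For a clicked document no inference is needed, since a click implies that the document was both examined and relevant.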
-
ultra.learning_algorithm.regression_EM.
get_bernoulli_sample
(probs)¶ Conduct Bernoulli sampling according to a specific probability distribution.
- Parameters
probs – (torch.Tensor) A tensor in which each element denotes a probability of 1 in a Bernoulli distribution.
- Returns
A Tensor of binary samples (0 or 1) with the same shape as probs.