ultra.learning_algorithm package¶

Submodules¶

ultra.learning_algorithm.base_algorithm module¶

The basic class that contains all the API needed for the implementation of an unbiased learning to rank algorithm.

class ultra.learning_algorithm.base_algorithm.BaseAlgorithm(data_set, exp_settings)¶

Bases: abc.ABC

The basic class that contains all the API needed for the implementation of an unbiased learning to rank algorithm.

PADDING_SCORE = -100000¶

abstract __init__(data_set, exp_settings)¶

Create the model.

Parameters

data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.

create_input_feed(input_feed, list_size)¶

Create the input from input_feed to run the model

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.
list_size – (int) The top number of documents to consider in the input docids.

create_model(feature_size)¶

Initialize the ranking model.

Returns: The ranking model that will be used to compute the ranking score.

create_summary(scalar_name, summarize_name, value, is_training)¶

Summarize the result of an operation

Parameters

scalar_name – (str) A string used as the name for the result of the operation to add to Tensorboard.
summarize_name – (str) A string used as the name for the summarization of the operation.
value – The value of the result of the operation.
is_training – (Boolean) if the model is in training mode or eval mode.

Returns: The ranking model that will be used to compute the ranking score.

get_ranking_scores(model, input_id_list, **kwargs)¶

Compute ranking scores with the given inputs.

Parameters

model – (BaseRankingModel) The model that is used to compute the ranking score.
input_id_list – (list<torch.Tensor>) A list of tensors containing document ids. Each tensor must have a shape of [None].
is_training – (bool) A flag indicating whether the model is running in training mode.

Returns

A tensor with the same shape as input_docids.

l2_loss(input)¶

opt_step(opt, params)¶

Perform an optimization step

Parameters

opt – Optimization Function to use
params – Model’s parameters

Returns: The ranking model that will be used to compute the ranking score.

pairwise_cross_entropy_loss(pos_scores, neg_scores, propensity_weights=None)¶

Computes pairwise softmax loss without propensity weighting.

Parameters

pos_scores – (torch.Tensor) A tensor with shape [batch_size, 1]. Each value is the ranking score of a positive example.
neg_scores – (torch.Tensor) A tensor with shape [batch_size, 1]. Each value is the ranking score of a negative example.
propensity_weights – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.

Returns

(torch.Tensor) A single value tensor containing the loss.
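As a rough illustration of the pairwise softmax loss described above, the sketch below computes -log softmax([pos, neg])[0] for each pair, which simplifies to softplus(neg - pos). It uses plain Python lists instead of torch tensors to stay dependency-free. The function name and the mean reduction are assumptions made for illustration, not the library's implementation.

```python
import math


def pairwise_softmax_loss(pos_scores, neg_scores, propensity_weights=None):
    """Weighted mean of -log softmax([pos, neg])[0] over all pairs."""
    if propensity_weights is None:
        propensity_weights = [1.0] * len(pos_scores)
    total = 0.0
    for pos, neg, weight in zip(pos_scores, neg_scores, propensity_weights):
        # -log(e^pos / (e^pos + e^neg)) simplifies to softplus(neg - pos).
        diff = neg - pos
        # Numerically stable softplus: max(x, 0) + log(1 + e^-|x|).
        total += weight * (max(diff, 0.0) + math.log1p(math.exp(-abs(diff))))
    # Mean reduction is an assumption of this sketch.
    return total / len(pos_scores)
```

The loss shrinks as the positive score rises above the negative score, and each pair's contribution is scaled by its weight.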

pairwise_loss_on_list(output, labels, propensity_weights=None)¶

Computes pairwise entropy loss.

Parameters

output – (torch.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.
labels – (torch.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.
propensity_weights – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.

Returns

(torch.Tensor) A single value tensor containing the loss.
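A list-level pairwise loss can be sketched by enumerating every ordered pair in which the first document has a higher label than the second. The version below handles a single list with plain Python lists. The function name, the sum reduction, and the omission of propensity weights are simplifying assumptions, not the library's behavior.

```python
import math


def pairwise_loss_on_single_list(scores, labels):
    """Sum softplus(s_j - s_i) over all pairs where label_i > label_j."""
    total = 0.0
    for s_i, y_i in zip(scores, labels):
        for s_j, y_j in zip(scores, labels):
            if y_i > y_j:
                # Penalize the less relevant document j outscoring document i.
                diff = s_j - s_i
                total += max(diff, 0.0) + math.log1p(math.exp(-abs(diff)))
    return total
```

Pairs with equal labels contribute nothing, so a list whose labels are all the same yields a loss of zero.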

ranking_model(model, list_size)¶

Construct ranking model with the given list size.

Parameters

model – (BaseRankingModel) The model that is used to compute the ranking score.
list_size – (int) The top number of documents to consider in the input docids.
scope – (string) The name of the variable scope.

Returns

A tensor with the same shape as input_docids.

remove_padding_for_metric_eval(input_id_list, model_output)¶

sigmoid_loss_on_list(output, labels, propensity_weights=None)¶

Computes pointwise sigmoid loss without propensity weighting.

Parameters

output – (torch.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.
labels – (torch.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.
propensity_weights – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.

Returns

(torch.Tensor) A single value tensor containing the loss.
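A pointwise sigmoid loss treats each document as an independent binary classification. The sketch below binarizes labels (a value >= 1 counts as relevant, as stated above) and applies the numerically stable form of binary cross-entropy on logits. The function name, the flat list inputs, and the mean reduction are assumptions for illustration.

```python
import math


def sigmoid_loss_on_flat_list(scores, labels, propensity_weights=None):
    """Weighted mean binary cross-entropy between sigmoid(score) and label."""
    if propensity_weights is None:
        propensity_weights = [1.0] * len(scores)
    total = 0.0
    for x, y, weight in zip(scores, labels, propensity_weights):
        z = 1.0 if y >= 1 else 0.0  # a label >= 1 means relevant
        # Stable form of -z*log(sigmoid(x)) - (1-z)*log(1 - sigmoid(x)).
        total += weight * (max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x))))
    return total / len(scores)
```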

softmax_loss(output, labels, propensity_weights=None)¶

Computes listwise softmax loss without propensity weighting.

Parameters

output – (torch.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.
labels – (torch.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.
propensity_weights – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.

Returns

(torch.Tensor) A single value tensor containing the loss.
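One common way to build a listwise softmax loss is to normalize the labels of a list into a target distribution and take the cross-entropy against the softmax of the scores. The sketch below does this for a single list using plain Python. The function names and the choice to return 0.0 when a list has no relevant document are assumptions, not the library's implementation.

```python
import math


def log_softmax(scores):
    """Log-softmax with the usual max-shift for numerical stability."""
    shift = max(scores)
    log_z = shift + math.log(sum(math.exp(s - shift) for s in scores))
    return [s - log_z for s in scores]


def listwise_softmax_loss(scores, labels):
    """Cross-entropy between normalized labels and softmax(scores)."""
    label_sum = sum(labels)
    if label_sum == 0:
        return 0.0  # assumption: lists without relevant documents are skipped
    log_probs = log_softmax(scores)
    return -sum((y / label_sum) * lp for y, lp in zip(labels, log_probs))
```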

abstract train(input_feed)¶

Run a step of the model feeding the given inputs for training.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

abstract validation(input_feed)¶

Run a step of the model feeding the given inputs for validation.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
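The contract of the two abstract methods can be illustrated with a toy class hierarchy. Everything in the sketch below is hypothetical: `SketchAlgorithm` merely mimics the abstract-method pattern, and the `"labels"` key is not the library's input feed format. It only shows that a subclass must implement both methods and return a (loss, outputs, summary) triple.

```python
from abc import ABC, abstractmethod


class SketchAlgorithm(ABC):
    """Hypothetical stand-in for an abstract learning-algorithm base class."""

    @abstractmethod
    def train(self, input_feed):
        """Return (loss, outputs, summary) for one training step."""

    @abstractmethod
    def validation(self, input_feed):
        """Return (loss, outputs, summary) for one validation step."""


class MeanSquaredAlgorithm(SketchAlgorithm):
    """Toy subclass: scores every document 0.5 and reports squared error."""

    def _step(self, input_feed):
        labels = input_feed["labels"]  # hypothetical key, for illustration only
        outputs = [0.5] * len(labels)
        loss = sum((o - y) ** 2 for o, y in zip(outputs, labels)) / len(labels)
        return loss, outputs, {"loss": loss}

    def train(self, input_feed):
        loss, _, summary = self._step(input_feed)
        return loss, None, summary  # outputs are None on a training step

    def validation(self, input_feed):
        return self._step(input_feed)
```

Instantiating `SketchAlgorithm` directly raises `TypeError`, which is how `abc.ABC` enforces that subclasses supply both methods.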

ultra.learning_algorithm.base_algorithm.softmax_cross_entropy_with_logits(logits, labels)¶

Computes softmax cross entropy between logits and labels.

Parameters

logits – A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.
labels – A tensor of the same shape as logits. A value >= 1 means a relevant example.

Returns

A single value tensor containing the loss.

ultra.learning_algorithm.dbgd module¶

Training and testing the Dueling Bandit Gradient Descent (DBGD) algorithm for unbiased learning to rank.

See the following paper for more information on the Dueling Bandit Gradient Descent (DBGD) algorithm.

Yisong Yue and Thorsten Joachims. 2009. Interactively optimizing information retrieval systems as a dueling bandits problem. In ICML. 1201–1208.

class ultra.learning_algorithm.dbgd.DBGD(data_set, exp_settings)¶

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Dueling Bandit Gradient Descent (DBGD) algorithm for unbiased learning to rank.

This class implements the Dueling Bandit Gradient Descent (DBGD) algorithm based on the input layer feed. See the following paper for more information on the algorithm.

Yisong Yue and Thorsten Joachims. 2009. Interactively optimizing information retrieval systems as a dueling bandits problem. In ICML. 1201–1208.
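The core update of DBGD, as described in the cited paper, can be sketched as follows: perturb the current weights along a random unit direction, compare the current and candidate rankers, and move a small step in that direction only if the candidate wins. The names `dbgd_step` and `candidate_wins`, and the default step sizes, are hypothetical. The callback stands in for the click-based comparison and is not part of this package's API.

```python
import math
import random


def sample_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere."""
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dbgd_step(weights, candidate_wins, delta=1.0, gamma=0.1, rng=None):
    """One DBGD update; returns the new weight vector."""
    rng = rng or random.Random()
    direction = sample_unit_vector(len(weights), rng)
    # Exploratory candidate: current weights plus delta along the direction.
    candidate = [w + delta * d for w, d in zip(weights, direction)]
    if candidate_wins(weights, candidate):
        # Move a (smaller) step gamma toward the winning candidate.
        return [w + gamma * d for w, d in zip(weights, direction)]
    return list(weights)
```

Usage: `dbgd_step(w, compare)` where `compare(current, candidate)` returns True when user feedback prefers the candidate ranker.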

__init__(data_set, exp_settings)¶

Create the model.

Parameters

data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.

click_simulation_winners(input_feed, rank_scores)¶

compute_gradient(final_winners, noisy_params)¶

create_new_output_list(noisy_params)¶

create_noisy_param()¶

train(input_feed)¶

Run a step of the model feeding the given inputs for training.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

validation(input_feed, is_online_simulation=False)¶

Run a step of the model feeding the given inputs for validation.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

ultra.learning_algorithm.dbgd_interleave module¶

ultra.learning_algorithm.dla module¶

Training and testing the dual learning algorithm for unbiased learning to rank.

See the following paper for more information on the dual learning algorithm.

Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, W. Bruce Croft. 2018. Unbiased Learning to Rank with Unbiased Propensity Estimation. In Proceedings of SIGIR ’18.

class ultra.learning_algorithm.dla.DLA(data_set, exp_settings)¶

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Dual Learning Algorithm for unbiased learning to rank.

This class implements the Dual Learning Algorithm (DLA) based on the input layer feed. See the following paper for more information on the algorithm.

Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, W. Bruce Croft. 2018. Unbiased Learning to Rank with Unbiased Propensity Estimation. In Proceedings of SIGIR ’18.

__init__(data_set, exp_settings)¶

Create the model.

Parameters

data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.

clip_grad_value(parameters, clip_value_min, clip_value_max)¶

Clips gradient of an iterable of parameters at specified value.

Gradients are modified in-place.

Parameters

parameters (Iterable[Tensor] or Tensor) – an iterable of Tensors or a single Tensor that will have gradients normalized
clip_value (float or int) – maximum allowed value of the gradients. The gradients are clipped in the range \(\left[\text{-clip\_value}, \text{clip\_value}\right]\)

Return type

None
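Value clipping with separate lower and upper bounds amounts to clamping each gradient entry into the interval [clip_value_min, clip_value_max]. The sketch below shows that operation on a plain Python list, modified in place to mirror the in-place behavior noted above. The function name is hypothetical, and operating on a flat list rather than on parameter gradients is a simplification.

```python
def clip_values_(grads, clip_value_min, clip_value_max):
    """Clamp every entry of grads into [clip_value_min, clip_value_max] in place."""
    for i, g in enumerate(grads):
        grads[i] = min(max(g, clip_value_min), clip_value_max)
```

For example, clipping `[-5.0, 0.5, 3.0]` to the range [-1.0, 1.0] leaves the list as `[-1.0, 0.5, 1.0]`.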

get_normalized_weights(propensity)¶

Computes listwise softmax loss with propensity weighting.

Parameters: propensity – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.
Returns: (torch.Tensor) A tensor containing the propensity weights.
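A common way to turn estimated propensities into weights is to take inverse propensities and normalize them by the first position of each list, so that the top position always has weight 1. The sketch below shows that normalization on nested Python lists. Whether `get_normalized_weights` does exactly this is an assumption; the function name here is hypothetical.

```python
def normalized_inverse_propensity(propensity):
    """For each list, return propensity[0] / propensity[i] at every position i."""
    # Assumes every propensity is strictly positive.
    return [[row[0] / p for p in row] for row in propensity]
```

Positions that are examined less often than the first one receive weights greater than 1.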

separate_gradient_update()¶

train(input_feed)¶

Run a step of the model feeding the given inputs.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

validation(input_feed, is_online_simulation=False)¶

Run a step of the model feeding the given inputs for validation.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

class ultra.learning_algorithm.dla.DenoisingNet(input_vec_size)¶

Bases: torch.nn.modules.module.Module

__init__(input_vec_size)¶

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input_list)¶

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool¶

ultra.learning_algorithm.dla.sigmoid_prob(logits)¶

ultra.learning_algorithm.ipw_rank module¶

Training and testing the inverse propensity weighting algorithm for unbiased learning to rank.

See the following paper for more information on the inverse propensity weighting algorithm.

Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork. 2016. Learning to Rank with Selection Bias in Personal Search. In Proceedings of SIGIR ’16.

Thorsten Joachims, Adith Swaminathan, Tobias Schnabel. 2017. Unbiased Learning-to-Rank with Biased Feedback. In Proceedings of WSDM ’17.

class ultra.learning_algorithm.ipw_rank.IPWrank(data_set, exp_settings)¶

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Inverse Propensity Weighting algorithm for unbiased learning to rank.

This class implements the training and testing of the Inverse Propensity Weighting algorithm for unbiased learning to rank. See the following paper for more information on the algorithm.

Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork. 2016. Learning to Rank with Selection Bias in Personal Search. In Proceedings of SIGIR ’16.
Thorsten Joachims, Adith Swaminathan, Tobias Schnabel. 2017. Unbiased Learning-to-Rank with Biased Feedback. In Proceedings of WSDM ’17.
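The idea behind inverse propensity weighting can be checked with a small calculation. Under a position-based click model, assumed here for illustration, a document is clicked with probability relevance × propensity, so dividing each click by its propensity gives an estimate whose expectation equals the relevance. The function names below are hypothetical.

```python
def ipw_estimate(click, propensity):
    """Inverse-propensity-weighted value of a single click observation."""
    return click / propensity


def expected_ipw_estimate(relevance, propensity):
    """Expectation of the IPW estimate under a position-based click model."""
    p_click = relevance * propensity  # assumption: P(click) = relevance * propensity
    return (p_click * ipw_estimate(1, propensity)
            + (1.0 - p_click) * ipw_estimate(0, propensity))
```

With relevance 0.8 and propensity 0.25 the click probability is only 0.2, yet the expected weighted click recovers 0.8.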

__init__(data_set, exp_settings)¶

Create the model.

Parameters

data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.

train(input_feed)¶

Run a step of the model feeding the given inputs for training.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

validation(input_feed, is_online_simulation=False)¶

Run a step of the model feeding the given inputs for validation.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

ultra.learning_algorithm.ipw_rank.selu(x)¶

ultra.learning_algorithm.navie_algorithm module¶

The naive algorithm that directly trains ranking models with clicks.

class ultra.learning_algorithm.navie_algorithm.NavieAlgorithm(data_set, exp_settings, forward_only=False)¶

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The naive algorithm that directly trains ranking models with input labels.

__init__(data_set, exp_settings, forward_only=False)¶

Create the model.

Parameters

data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.
forward_only – Set true to conduct prediction only, false to conduct training.

step(session, input_feed, forward_only)¶

Run a step of the model feeding the given inputs.

Parameters

session – (tf.Session) tensorflow session to use.
input_feed – (dictionary) A dictionary containing all the input feed data.
forward_only – whether to do the backward step (False) or only forward (True).

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

ultra.learning_algorithm.pairwise_debias module¶

Training and testing the Pairwise Debiasing algorithm for unbiased learning to rank.

See the following paper for more information on the Pairwise Debiasing algorithm.

Hu, Ziniu, Yang Wang, Qu Peng, and Hang Li. “Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm.” In The World Wide Web Conference, pp. 2830-2836. ACM, 2019.

class ultra.learning_algorithm.pairwise_debias.PairDebias(data_set, exp_settings)¶

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Pairwise Debiasing algorithm for unbiased learning to rank.

This class implements the Pairwise Debiasing algorithm based on the input layer feed. See the following paper for more information on the algorithm.

Hu, Ziniu, Yang Wang, Qu Peng, and Hang Li. “Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm.” In The World Wide Web Conference, pp. 2830-2836. ACM, 2019.

__init__(data_set, exp_settings)¶

Create the model.

Parameters

data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.

train(input_feed)¶

Run a step of the model feeding the given inputs for training.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

validation(input_feed, is_online_simulation=False)¶

Run a step of the model feeding the given inputs for validation.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

ultra.learning_algorithm.pairwise_debias.get_bernoulli_sample(probs)¶

Conduct Bernoulli sampling according to a specific probability distribution.

Parameters: probs – (torch.Tensor) A tensor in which each element denotes a probability of 1 in a Bernoulli distribution.
Returns: A Tensor of binary samples (0 or 1) with the same shape as probs.
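Elementwise Bernoulli sampling can be sketched by comparing each probability against an independent uniform draw. The version below works on a flat Python list and takes an optional random generator for reproducibility. The function name and the `rng` parameter are illustrative assumptions, not the library's signature.

```python
import random


def bernoulli_sample(probs, rng=None):
    """Return 1 with probability p and 0 otherwise, for each p in probs."""
    rng = rng or random.Random()
    # rng.random() lies in [0, 1), so p = 0 never fires and p = 1 always does.
    return [1 if rng.random() < p else 0 for p in probs]
```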

ultra.learning_algorithm.pdgd module¶

Training and testing the Pairwise Differentiable Gradient Descent (PDGD) algorithm for unbiased learning to rank.

See the following paper for more information on the Pairwise Differentiable Gradient Descent (PDGD) algorithm.

Oosterhuis, Harrie, and Maarten de Rijke. “Differentiable unbiased online learning to rank.” In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 1293-1302. ACM, 2018.

class ultra.learning_algorithm.pdgd.PDGD(data_set, exp_settings, forward_only=False)¶

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Pairwise Differentiable Gradient Descent (PDGD) algorithm for unbiased learning to rank.

This class implements the Pairwise Differentiable Gradient Descent (PDGD) algorithm based on the input layer feed. See the following paper for more information on the algorithm.

Oosterhuis, Harrie, and Maarten de Rijke. “Differentiable unbiased online learning to rank.” In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 1293-1302. ACM, 2018.

__init__(data_set, exp_settings, forward_only=False)¶

Create the model.

Parameters

data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.

train(input_feed)¶

Run a step of the model feeding the given inputs for training.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

validation(input_feed, is_online_simulation=False)¶

Run a step of the model feeding the given inputs for validation.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

ultra.learning_algorithm.regression_EM module¶

Training and testing the regression-based EM algorithm for unbiased learning to rank.

See the following paper for more information on the regression-based EM algorithm.

Wang, Xuanhui, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. “Position bias estimation for unbiased learning to rank in personal search.” In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 610-618. ACM, 2018.

class ultra.learning_algorithm.regression_EM.RegressionEM(data_set, exp_settings)¶

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The regression-based EM algorithm for unbiased learning to rank.

This class implements the regression-based EM algorithm based on the input layer feed. See the following paper for more information.

Wang, Xuanhui, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. “Position bias estimation for unbiased learning to rank in personal search.” In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 610-618. ACM, 2018.

In particular, we use the online EM algorithm for the parameter estimations:

Cappé, Olivier, and Eric Moulines. “Online expectation–maximization algorithm for latent data models.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 71.3 (2009): 593-613.
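The E-step of the regression-based EM algorithm in the cited Wang et al. paper can be sketched for a single impression. Under the position-based model, a click implies that the document was both examined and relevant. For a non-click, the posteriors follow from Bayes' rule with examination probability theta and relevance probability gamma. The function name and scalar interface are hypothetical; this is not the class's actual implementation.

```python
def e_step_posteriors(click, theta, gamma):
    """Return (P(examined | click), P(relevant | click)) for one document.

    theta: prior examination probability at the document's position.
    gamma: prior relevance probability of the document.
    """
    if click:
        # A click can only happen if the document was examined and relevant.
        return 1.0, 1.0
    no_click = 1.0 - theta * gamma  # P(click = 0)
    p_examined = theta * (1.0 - gamma) / no_click
    p_relevant = (1.0 - theta) * gamma / no_click
    return p_examined, p_relevant
```

In the paper, these posteriors then serve as targets when the examination and relevance estimates are refit in the M-step.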

__init__(data_set, exp_settings)¶

Create the model.

Parameters

data_set – (Raw_data) The dataset used to build the input layer.
exp_settings – (dictionary) The dictionary containing the model settings.

train(input_feed)¶

Run a step of the model feeding the given inputs for training.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

validation(input_feed, is_online_simulation=False)¶

Run a step of the model feeding the given inputs for validation.

Parameters: input_feed – (dictionary) A dictionary containing all the input feed data.
Returns: A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

ultra.learning_algorithm.regression_EM.get_bernoulli_sample(probs)¶

Conduct Bernoulli sampling according to a specific probability distribution.

Parameters: probs – (torch.Tensor) A tensor in which each element denotes a probability of 1 in a Bernoulli distribution.
Returns: A Tensor of binary samples (0 or 1) with the same shape as probs.

Module contents¶

ultra.learning_algorithm.list_available()¶

Return type: list