ultra.learning_algorithm package

Submodules

ultra.learning_algorithm.base_algorithm module

The basic class that contains all the API needed for the implementation of an unbiased learning to rank algorithm.

class ultra.learning_algorithm.base_algorithm.BaseAlgorithm(data_set, exp_settings)

Bases: abc.ABC

The basic class that contains all the API needed for the implementation of an unbiased learning to rank algorithm.

PADDING_SCORE = -100000
abstract __init__(data_set, exp_settings)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

create_input_feed(input_feed, list_size)

Create the input from input_feed to run the model

Parameters
  • input_feed – (dictionary) A dictionary containing all the input feed data.

  • list_size – (int) The number of top-ranked documents to consider in the input docids.

create_model(feature_size)

Initialize the ranking model.

Returns

The ranking model that will be used to compute the ranking score.

create_summary(scalar_name, summarize_name, value, is_training)

Summarize the result of an operation

Parameters
  • scalar_name – (str) A string used as the name for the result of the operation to add to Tensorboard.

  • summarize_name – (str) A string used as the name for the summarization of the operation.

  • value – The value of the result of the operation.

  • is_training – (Boolean) Whether the model is in training mode rather than eval mode.

Returns

The ranking model that will be used to compute the ranking score.

get_ranking_scores(model, input_id_list, **kwargs)

Compute ranking scores with the given inputs.

Parameters
  • model – (BaseRankingModel) The model that is used to compute the ranking score.

  • input_id_list – (list<torch.Tensor>) A list of tensors containing document ids. Each tensor must have a shape of [None].

  • is_training – (bool) A flag indicating whether the model is running in training mode.

Returns

A tensor with the same shape as input_docids.

l2_loss(input)
opt_step(opt, params)

Perform an optimization step

Parameters
  • opt – Optimization Function to use

  • params – Model’s parameters

Returns

The ranking model that will be used to compute the ranking score.

pairwise_cross_entropy_loss(pos_scores, neg_scores, propensity_weights=None)

Computes pairwise softmax loss, with optional propensity weighting.

Parameters
  • pos_scores – (torch.Tensor) A tensor with shape [batch_size, 1]. Each value is the ranking score of a positive example.

  • neg_scores – (torch.Tensor) A tensor with shape [batch_size, 1]. Each value is the ranking score of a negative example.

  • propensity_weights – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.

Returns

(torch.Tensor) A single value tensor containing the loss.
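Example

A minimal sketch of one common reading of this loss, written in plain Python instead of PyTorch so that it runs standalone. It assumes the loss of each pair is the softmax cross entropy over the two scores with the positive one as the target, that is log(1 + exp(neg - pos)), scaled by an optional weight. The function name and the sum reduction are illustrative assumptions, not the library's implementation.

```python
import math


def pairwise_cross_entropy_sketch(pos_scores, neg_scores, propensity_weights=None):
    """Sum of log(1 + exp(neg - pos)) over pairs, optionally weighted."""
    if propensity_weights is None:
        propensity_weights = [1.0] * len(pos_scores)
    total = 0.0
    for pos, neg, weight in zip(pos_scores, neg_scores, propensity_weights):
        diff = neg - pos
        # Numerically stable softplus: log(1 + exp(diff)).
        pair_loss = max(diff, 0.0) + math.log1p(math.exp(-abs(diff)))
        total += weight * pair_loss
    return total
```

With equal scores each pair contributes log(2); the loss shrinks as the positive score moves above the negative one.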

pairwise_loss_on_list(output, labels, propensity_weights=None)

Computes pairwise entropy loss.

Parameters
  • output – (torch.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (torch.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • propensity_weights – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.

Returns

(torch.Tensor) A single value tensor containing the loss.

ranking_model(model, list_size)

Construct ranking model with the given list size.

Parameters
  • model – (BaseRankingModel) The model that is used to compute the ranking score.

  • list_size – (int) The number of top-ranked documents to consider in the input docids.

  • scope – (string) The name of the variable scope.

Returns

A tensor with the same shape as input_docids.

remove_padding_for_metric_eval(input_id_list, model_output)
sigmoid_loss_on_list(output, labels, propensity_weights=None)

Computes pointwise sigmoid loss, with optional propensity weighting.

Parameters
  • output – (torch.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (torch.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • propensity_weights – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.

Returns

(torch.Tensor) A single value tensor containing the loss.
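Example

A minimal sketch of a pointwise sigmoid (logistic) loss over a batch of lists, in plain Python with nested lists standing in for tensors. The binarization of labels at 1, the per-element weighting, and the reduction (sum over each list, mean over the batch) are assumptions made for illustration; the function name is hypothetical.

```python
import math


def sigmoid_loss_on_list_sketch(output, labels, propensity_weights=None):
    """Pointwise sigmoid loss over [batch_size, list_size] nested lists."""
    batch_losses = []
    for i, (scores, row_labels) in enumerate(zip(output, labels)):
        row_loss = 0.0
        for j, (score, label) in enumerate(zip(scores, row_labels)):
            # A label >= 1 is treated as relevant (target 1), anything else as 0.
            target = 1.0 if label >= 1 else 0.0
            weight = 1.0 if propensity_weights is None else propensity_weights[i][j]
            # Stable binary cross entropy with logits:
            # max(s, 0) - s * t + log(1 + exp(-|s|)).
            element = max(score, 0.0) - score * target + math.log1p(math.exp(-abs(score)))
            row_loss += weight * element
        batch_losses.append(row_loss)
    # Mean over the batch of the per-list sums.
    return sum(batch_losses) / len(batch_losses)
```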

softmax_loss(output, labels, propensity_weights=None)

Computes listwise softmax loss, with optional propensity weighting.

Parameters
  • output – (torch.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (torch.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • propensity_weights – (torch.Tensor) A tensor of the same shape as output containing the weight of each element.

Returns

(torch.Tensor) A single value tensor containing the loss.

abstract train(input_feed)

Run a step of the model feeding the given inputs for training.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

abstract validation(input_feed)

Run a step of the model feeding the given inputs for the validation process.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.
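Example

A minimal sketch of how a concrete algorithm might satisfy the abstract interface documented above. To stay runnable without the package installed, it uses a hypothetical stand-in base class (BaseAlgorithmSketch) that mirrors only the three abstract methods; the subclass ConstantScorer, the "docids" key, and the dictionary used as a summary are likewise assumptions for illustration.

```python
from abc import ABC, abstractmethod


class BaseAlgorithmSketch(ABC):
    """Hypothetical stand-in that mirrors only the documented abstract methods."""

    @abstractmethod
    def __init__(self, data_set, exp_settings):
        ...

    @abstractmethod
    def train(self, input_feed):
        ...

    @abstractmethod
    def validation(self, input_feed):
        ...


class ConstantScorer(BaseAlgorithmSketch):
    """Toy subclass: scores every document 0.0 and reports a zero loss."""

    def __init__(self, data_set, exp_settings):
        self.data_set = data_set
        self.exp_settings = exp_settings

    def train(self, input_feed):
        # Documented contract: (loss, outputs, summary); outputs is None when training.
        return 0.0, None, {"loss": 0.0}

    def validation(self, input_feed):
        # One score per document id found under the hypothetical "docids" key.
        outputs = [0.0 for _ in input_feed.get("docids", [])]
        return 0.0, outputs, {"loss": 0.0}
```

Instantiating the base class directly raises TypeError because its abstract methods are not implemented; a subclass must override all three before it can be created.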

ultra.learning_algorithm.base_algorithm.softmax_cross_entropy_with_logits(logits, labels)

Computes softmax cross entropy between logits and labels.

Parameters
  • logits – A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – A tensor of the same shape as logits. A value >= 1 means a relevant example.

Returns

A single value tensor containing the loss.
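Example

A minimal sketch of softmax cross entropy for a single list of scores, in plain Python. It assumes the usual definition, the negative sum of labels times log-softmax of the logits, and uses the log-sum-exp trick for stability. The function name is hypothetical, and how the library reduces over a batch is not shown here.

```python
import math


def softmax_cross_entropy_sketch(logits, labels):
    """Return -sum(labels * log_softmax(logits)) for one list of scores."""
    max_logit = max(logits)
    # Log-sum-exp with the maximum subtracted for numerical stability.
    log_norm = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return -sum(label * (x - log_norm) for x, label in zip(logits, labels))
```

Two equal logits with a one-hot label give log(2); very large logits do not overflow because the maximum is subtracted before exponentiation.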

ultra.learning_algorithm.dbgd module

Training and testing the Dueling Bandit Gradient Descent (DBGD) algorithm for unbiased learning to rank.

See the following paper for more information on the Dueling Bandit Gradient Descent (DBGD) algorithm.

  • Yisong Yue and Thorsten Joachims. 2009. Interactively optimizing information retrieval systems as a dueling bandits problem. In ICML. 1201–1208.

class ultra.learning_algorithm.dbgd.DBGD(data_set, exp_settings)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Dueling Bandit Gradient Descent (DBGD) algorithm for unbiased learning to rank.

This class implements the Dueling Bandit Gradient Descent (DBGD) algorithm based on the input layer feed. See the following paper for more information on the algorithm.

  • Yisong Yue and Thorsten Joachims. 2009. Interactively optimizing information retrieval systems as a dueling bandits problem. In ICML. 1201–1208.
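Example

A minimal sketch of a single DBGD update as described in the cited paper: perturb the current weights along a random unit direction, compare the two rankers, and move a small step in that direction only if the perturbed ranker wins. The names dbgd_step, candidate_wins, delta, and gamma are assumptions for illustration; candidate_wins stands in for whatever click-based comparison decides the duel, and none of this is taken from the class's actual code.

```python
import math
import random


def dbgd_step(weights, candidate_wins, delta=1.0, gamma=0.1, rng=random):
    """One DBGD update: explore along a random unit direction, move only on a win."""
    # Sample a direction uniformly from the unit sphere via normalized Gaussians.
    direction = [rng.gauss(0.0, 1.0) for _ in weights]
    norm = math.sqrt(sum(d * d for d in direction)) or 1.0
    direction = [d / norm for d in direction]
    # Exploratory ranker: current weights shifted by delta along the direction.
    candidate = [w + delta * d for w, d in zip(weights, direction)]
    if candidate_wins(weights, candidate):
        # Exploit: take a (smaller) step of size gamma in the winning direction.
        return [w + gamma * d for w, d in zip(weights, direction)]
    return list(weights)
```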

__init__(data_set, exp_settings)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

click_simulation_winners(input_feed, rank_scores)
compute_gradient(final_winners, noisy_params)
create_new_output_list(noisy_params)
create_noisy_param()
train(input_feed)

Run a step of the model feeding the given inputs for the training process.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

validation(input_feed, is_online_simulation=False)

Run a step of the model feeding the given inputs for the validation process.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

ultra.learning_algorithm.dbgd_interleave module

ultra.learning_algorithm.dla module

Training and testing the dual learning algorithm for unbiased learning to rank.

See the following paper for more information on the dual learning algorithm.

  • Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, W. Bruce Croft. 2018. Unbiased Learning to Rank with Unbiased Propensity Estimation. In Proceedings of SIGIR ’18.

class ultra.learning_algorithm.dla.DLA(data_set, exp_settings)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Dual Learning Algorithm for unbiased learning to rank.

This class implements the Dual Learning Algorithm (DLA) based on the input layer feed. See the following paper for more information on the algorithm.

  • Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, W. Bruce Croft. 2018. Unbiased Learning to Rank with Unbiased Propensity Estimation. In Proceedings of SIGIR ’18.

__init__(data_set, exp_settings)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

clip_grad_value(parameters, clip_value_min, clip_value_max)

Clips gradient of an iterable of parameters at specified value.

Gradients are modified in-place.

Parameters
  • parameters (Iterable[Tensor] or Tensor) – an iterable of Tensors or a single Tensor that will have gradients normalized

  • clip_value (float or int) – maximum allowed value of the gradients. The gradients are clipped in the range \(\left[\text{-clip\_value}, \text{clip\_value}\right]\)

Return type

None
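Example

A minimal sketch of in-place value clipping, with plain Python lists standing in for gradient tensors. It assumes the two bounds in the signature delimit the allowed range [clip_value_min, clip_value_max]; the function name is hypothetical.

```python
def clip_values_in_place(grads, clip_value_min, clip_value_max):
    """Clamp every entry of each gradient list to [clip_value_min, clip_value_max]."""
    for grad in grads:
        for i, value in enumerate(grad):
            grad[i] = min(max(value, clip_value_min), clip_value_max)
```

Like the documented method, the sketch modifies its inputs and returns None.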

get_normalized_weights(propensity)

Computes listwise softmax loss with propensity weighting.

Parameters

propensity – (tf.Tensor) A tensor of the same shape as output containing the weight of each element.

Returns

(tf.Tensor) A tensor containing the propensity weights.

separate_gradient_update()
train(input_feed)

Run a step of the model feeding the given inputs.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

validation(input_feed, is_online_simulation=False)

Run a step of the model feeding the given inputs for the validation process.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

class ultra.learning_algorithm.dla.DenoisingNet(input_vec_size)

Bases: torch.nn.modules.module.Module

__init__(input_vec_size)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(input_list)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
ultra.learning_algorithm.dla.sigmoid_prob(logits)

ultra.learning_algorithm.ipw_rank module

Training and testing the inverse propensity weighting algorithm for unbiased learning to rank.

See the following paper for more information on the inverse propensity weighting algorithm.

  • Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork. 2016. Learning to Rank with Selection Bias in Personal Search. In Proceedings of SIGIR ’16.

  • Thorsten Joachims, Adith Swaminathan, Tobias Schnabel. 2017. Unbiased Learning-to-Rank with Biased Feedback. In Proceedings of WSDM ’17.

class ultra.learning_algorithm.ipw_rank.IPWrank(data_set, exp_settings)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Inverse Propensity Weighting algorithm for unbiased learning to rank.

This class implements the training and testing of the Inverse Propensity Weighting algorithm for unbiased learning to rank. See the following paper for more information on the algorithm.

  • Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork. 2016. Learning to Rank with Selection Bias in Personal Search. In Proceedings of SIGIR ’16.

  • Thorsten Joachims, Adith Swaminathan, Tobias Schnabel. 2017. Unbiased Learning-to-Rank with Biased Feedback. In Proceedings of WSDM ’17.
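Example

A minimal sketch of the inverse propensity weighting idea from the papers above: the loss of each clicked document is divided by the probability that its position was examined, so clicks at rarely examined positions count for more. The function name and its three list arguments are assumptions for illustration and do not reflect how this class reads its inputs.

```python
def ipw_weighted_loss(clicks, propensities, per_doc_losses):
    """Sum per-document losses over clicked documents, each weighted by 1 / propensity."""
    total = 0.0
    for click, propensity, loss in zip(clicks, propensities, per_doc_losses):
        if click:
            # Rarely examined positions (small propensity) receive a larger weight.
            total += loss / propensity
    return total
```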

__init__(data_set, exp_settings)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

train(input_feed)

Run a step of the model feeding the given inputs for the training process.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

validation(input_feed, is_online_simulation=False)

Run a step of the model feeding the given inputs for the validation process.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

ultra.learning_algorithm.ipw_rank.selu(x)

ultra.learning_algorithm.navie_algorithm module

The naive algorithm that directly trains ranking models with clicks.

class ultra.learning_algorithm.navie_algorithm.NavieAlgorithm(data_set, exp_settings, forward_only=False)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The naive algorithm that directly trains ranking models with input labels.

__init__(data_set, exp_settings, forward_only=False)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

  • forward_only – Set true to conduct prediction only, false to conduct training.

step(session, input_feed, forward_only)

Run a step of the model feeding the given inputs.

Parameters
  • session – (tf.Session) tensorflow session to use.

  • input_feed – (dictionary) A dictionary containing all the input feed data.

  • forward_only – whether to do the backward step (False) or only forward (True).

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

ultra.learning_algorithm.pairwise_debias module

Training and testing the Pairwise Debiasing algorithm for unbiased learning to rank.

See the following paper for more information on the Pairwise Debiasing algorithm.

  • Hu, Ziniu, Yang Wang, Qu Peng, and Hang Li. “Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm.” In The World Wide Web Conference, pp. 2830-2836. ACM, 2019.

class ultra.learning_algorithm.pairwise_debias.PairDebias(data_set, exp_settings)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Pairwise Debiasing algorithm for unbiased learning to rank.

This class implements the Pairwise Debiasing algorithm based on the input layer feed. See the following paper for more information on the algorithm.

  • Hu, Ziniu, Yang Wang, Qu Peng, and Hang Li. “Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm.” In The World Wide Web Conference, pp. 2830-2836. ACM, 2019.

__init__(data_set, exp_settings)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

train(input_feed)

Run a step of the model feeding the given inputs for the training process.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

validation(input_feed, is_online_simulation=False)

Run a step of the model feeding the given inputs for the validation process.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

ultra.learning_algorithm.pairwise_debias.get_bernoulli_sample(probs)

Conduct Bernoulli sampling according to a specific probability distribution.

Parameters

probs – (tf.Tensor) A tensor in which each element denotes a probability of 1 in a Bernoulli distribution.

Returns

A Tensor of binary samples (0 or 1) with the same shape as probs.
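Example

A minimal sketch of element-wise Bernoulli sampling, using a flat Python list in place of a tensor and the standard random module in place of a tensor library. The function name and the rng parameter are assumptions for illustration.

```python
import random


def get_bernoulli_sample_sketch(probs, rng=random):
    """Draw one 0/1 sample per probability in a flat list."""
    # rng.random() is uniform on [0, 1), so the sample is 1 with probability p.
    return [1 if rng.random() < p else 0 for p in probs]
```

A probability of 0.0 always yields 0 and a probability of 1.0 always yields 1; values in between are random.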

ultra.learning_algorithm.pdgd module

Training and testing the Pairwise Differentiable Gradient Descent (PDGD) algorithm for unbiased learning to rank.

See the following paper for more information on the Pairwise Differentiable Gradient Descent (PDGD) algorithm.

  • Oosterhuis, Harrie, and Maarten de Rijke. “Differentiable unbiased online learning to rank.” In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 1293-1302. ACM, 2018.

class ultra.learning_algorithm.pdgd.PDGD(data_set, exp_settings, forward_only=False)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Pairwise Differentiable Gradient Descent (PDGD) algorithm for unbiased learning to rank.

This class implements the Pairwise Differentiable Gradient Descent (PDGD) algorithm based on the input layer feed. See the following paper for more information on the algorithm.

  • Oosterhuis, Harrie, and Maarten de Rijke. “Differentiable unbiased online learning to rank.” In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 1293-1302. ACM, 2018.

__init__(data_set, exp_settings, forward_only=False)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

train(input_feed)

Run a step of the model feeding the given inputs for the training process.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

validation(input_feed, is_online_simulation=False)

Run a step of the model feeding the given inputs for the validation process.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a summary containing related information about the step.

ultra.learning_algorithm.regression_EM module

Training and testing the regression-based EM algorithm for unbiased learning to rank.

See the following paper for more information on the regression-based EM algorithm.

  • Wang, Xuanhui, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. “Position bias estimation for unbiased learning to rank in personal search.” In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 610-618. ACM, 2018.

class ultra.learning_algorithm.regression_EM.RegressionEM(data_set, exp_settings)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The regression-based EM algorithm for unbiased learning to rank.

This class implements the regression-based EM algorithm based on the input layer feed. See the following paper for more information.

  • Wang, Xuanhui, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. “Position bias estimation for unbiased learning to rank in personal search.” In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 610-618. ACM, 2018.

In particular, we use the online EM algorithm for the parameter estimations:

  • Cappé, Olivier, and Eric Moulines. “Online expectation–maximization algorithm for latent data models.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 71.3 (2009): 593-613.

__init__(data_set, exp_settings)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

train(input_feed)

Run a step of the model feeding the given inputs for the training process.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

validation(input_feed, is_online_simulation=False)

Run a step of the model feeding the given inputs for the validation process.

Parameters

input_feed – (dictionary) A dictionary containing all the input feed data.

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

ultra.learning_algorithm.regression_EM.get_bernoulli_sample(probs)

Conduct Bernoulli sampling according to a specific probability distribution.

Parameters

probs – (torch.Tensor) A tensor in which each element denotes a probability of 1 in a Bernoulli distribution.

Returns

A Tensor of binary samples (0 or 1) with the same shape as probs.

Module contents

ultra.learning_algorithm.list_available()
Return type

list