ultra.learning_algorithm package

Submodules

ultra.learning_algorithm.base_algorithm module

The base class that defines the API needed to implement an unbiased learning-to-rank algorithm.

class ultra.learning_algorithm.base_algorithm.BaseAlgorithm(data_set, exp_settings, forward_only=False)

Bases: abc.ABC

The base class that defines the API needed to implement an unbiased learning-to-rank algorithm.

PADDING_SCORE = -100000

abstract __init__(data_set, exp_settings, forward_only=False)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

  • forward_only – Set true to conduct prediction only, false to conduct training.

get_ranking_scores(input_id_list, is_training=False, scope=None, **kwargs)

Compute ranking scores with the given inputs.

Parameters
  • input_id_list – (list<tf.Tensor>) A list of tensors containing document ids. Each tensor must have a shape of [None].

  • is_training – (bool) A flag indicating whether the model is running in training mode.

  • scope – (string) The name of the variable scope.

Returns

A tensor with the same shape as the input document ids.

pairwise_cross_entropy_loss(pos_scores, neg_scores, name=None)

Computes pairwise softmax loss without propensity weighting.

Parameters
  • pos_scores – (tf.Tensor) A tensor with shape [batch_size, 1]. Each value is the ranking score of a positive example.

  • neg_scores – (tf.Tensor) A tensor with shape [batch_size, 1]. Each value is the ranking score of a negative example.

  • name – A string used as the name for this variable scope.

Returns

(tf.Tensor) A single value tensor containing the loss.
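
As a rough illustration of a pairwise softmax loss, the sketch below uses plain Python lists in place of tf.Tensor inputs. The function name pairwise_softmax_loss and the mean reduction over pairs are assumptions made for this example, not details confirmed by this reference.

```python
import math


def pairwise_softmax_loss(pos_scores, neg_scores):
    """Mean pairwise softmax cross-entropy over (positive, negative) pairs.

    For one pair the loss is -log(exp(p) / (exp(p) + exp(n))),
    which simplifies to log(1 + exp(n - p)).
    """
    if len(pos_scores) != len(neg_scores):
        raise ValueError("pos_scores and neg_scores must have the same length")
    losses = []
    for p, n in zip(pos_scores, neg_scores):
        d = n - p
        # Numerically stable softplus: log(1 + exp(d)).
        losses.append(max(d, 0.0) + math.log1p(math.exp(-abs(d))))
    # Assumption: reduce to a single value by averaging over pairs.
    return sum(losses) / len(losses)


tied = pairwise_softmax_loss([0.0], [0.0])       # equal scores
separated = pairwise_softmax_loss([2.0], [0.0])  # positive ranked higher
```

The loss shrinks as the positive score moves above the negative score, so separated is smaller than tied.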

ranking_model(list_size, scope=None)

Construct ranking model with the given list size.

Parameters
  • list_size – (int) The top number of documents to consider in the input docids.

  • scope – (string) The name of the variable scope.

Returns

A tensor with the same shape as the input docids.

remove_padding_for_metric_eval(input_id_list, model_output)

abstract step(session, input_feed, forward_only)

Run a step of the model feeding the given inputs.

Parameters
  • session – (tf.Session) tensorflow session to use.

  • input_feed – (dictionary) A dictionary containing all the input feed data.

  • forward_only – whether to do the backward step (False) or only forward (True).

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.
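
The contract of step can be sketched with a stand-in abstract class. Everything below is hypothetical: SketchAlgorithm, ConstantScorer, and the "docids" key are invented for illustration, and a plain dict takes the place of the tf.summary. Only the (loss, outputs, summary) return shape follows the description above.

```python
from abc import ABC, abstractmethod


class SketchAlgorithm(ABC):
    """Hypothetical stand-in mirroring the documented step() contract."""

    @abstractmethod
    def step(self, session, input_feed, forward_only):
        """Return a triple (loss, outputs, summary)."""


class ConstantScorer(SketchAlgorithm):
    """Toy subclass that scores every document 0.0 and reports a zero loss."""

    def step(self, session, input_feed, forward_only):
        loss = 0.0
        if forward_only:
            # Prediction only: return one score per input document.
            outputs = [[0.0] * len(docs) for docs in input_feed["docids"]]
        else:
            # Training step: outputs are None, as documented.
            outputs = None
        summary = {"loss": loss}  # placeholder for a tf.summary
        return loss, outputs, summary


algo = ConstantScorer()
feed = {"docids": [[3, 7, 9], [1, 4, 2]]}  # hypothetical feed key
train_loss, train_out, _ = algo.step(None, feed, forward_only=False)
eval_loss, eval_out, _ = algo.step(None, feed, forward_only=True)
```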

ultra.learning_algorithm.dbgd module

Training and testing the Dueling Bandit Gradient Descent (DBGD) algorithm for unbiased learning to rank.

See the following paper for more information on the Dueling Bandit Gradient Descent (DBGD) algorithm.

  • Yisong Yue and Thorsten Joachims. 2009. Interactively optimizing information retrieval systems as a dueling bandits problem. In ICML. 1201–1208.

class ultra.learning_algorithm.dbgd.DBGD(data_set, exp_settings, forward_only=False)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Dueling Bandit Gradient Descent (DBGD) algorithm for unbiased learning to rank.

This class implements the Dueling Bandit Gradient Descent (DBGD) algorithm based on the input layer feed. See the following paper for more information on the algorithm.

  • Yisong Yue and Thorsten Joachims. 2009. Interactively optimizing information retrieval systems as a dueling bandits problem. In ICML. 1201–1208.

__init__(data_set, exp_settings, forward_only=False)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

  • forward_only – Set true to conduct prediction only, false to conduct training.

step(session, input_feed, forward_only)

Run a step of the model feeding the given inputs.

Parameters
  • session – (tf.Session) tensorflow session to use.

  • input_feed – (dictionary) A dictionary containing all the input feed data.

  • forward_only – whether to do the backward step (False) or only forward (True).

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

ultra.learning_algorithm.dbgd_interleave module

ultra.learning_algorithm.dla module

Training and testing the dual learning algorithm for unbiased learning to rank.

See the following paper for more information on the dual learning algorithm.

  • Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, W. Bruce Croft. 2018. Unbiased Learning to Rank with Unbiased Propensity Estimation. In Proceedings of SIGIR ’18

class ultra.learning_algorithm.dla.DLA(data_set, exp_settings, forward_only=False)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Dual Learning Algorithm for unbiased learning to rank.

This class implements the Dual Learning Algorithm (DLA) based on the input layer feed. See the following paper for more information on the algorithm.

  • Qingyao Ai, Keping Bi, Cheng Luo, Jiafeng Guo, W. Bruce Croft. 2018. Unbiased Learning to Rank with Unbiased Propensity Estimation. In Proceedings of SIGIR ’18

DenoisingNet(list_size, forward_only=False, scope=None)

__init__(data_set, exp_settings, forward_only=False)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

  • forward_only – Set true to conduct prediction only, false to conduct training.

click_weighted_log_loss(output, labels, propensity_weights, name=None)

Computes pointwise sigmoid loss with propensity weighting.

Parameters
  • output – (tf.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (tf.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • propensity_weights – (tf.Tensor) A tensor of the same shape as output containing the weight of each element.

  • name – A string used as the name for this variable scope.

Returns

(tf.Tensor) A single value tensor containing the loss.
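
A pointwise sigmoid loss with propensity weighting can be sketched as follows, using nested Python lists in place of [batch_size, list_size] tensors. The name weighted_sigmoid_loss and the averaging over all entries are assumptions for this example. The actual reduction used by the class is not stated here.

```python
import math


def weighted_sigmoid_loss(output, labels, propensity_weights):
    """Propensity-weighted sigmoid cross-entropy, averaged over all entries."""
    total, count = 0.0, 0
    for scores, labs, weights in zip(output, labels, propensity_weights):
        for s, y, w in zip(scores, labs, weights):
            target = 1.0 if y >= 1 else 0.0  # a value >= 1 means relevant
            # Stable form of -[t*log(sigmoid(s)) + (1-t)*log(1-sigmoid(s))].
            ce = max(s, 0.0) - s * target + math.log1p(math.exp(-abs(s)))
            total += w * ce
            count += 1
    # Assumption: reduce to a single value by averaging over entries.
    return total / count


loss = weighted_sigmoid_loss(
    output=[[0.0, 0.0]],
    labels=[[1, 0]],
    propensity_weights=[[2.0, 1.0]],
)
```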

click_weighted_pairwise_loss(output, labels, propensity_weights, name=None)

Computes pairwise entropy loss with propensity weighting.

Parameters
  • output – (tf.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (tf.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • propensity_weights – (tf.Tensor) A tensor of the same shape as output containing the weight of each element.

  • name – A string used as the name for this variable scope.

Returns

  • (tf.Tensor) A single value tensor containing the loss.

  • (tf.Tensor) A tensor containing the propensity weights.

click_weighted_softmax_cross_entropy_loss(output, labels, propensity_weights, name=None)

Computes listwise softmax loss with propensity weighting.

Parameters
  • output – (tf.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (tf.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • propensity_weights – (tf.Tensor) A tensor of the same shape as output containing the weight of each element.

  • name – A string used as the name for this variable scope.

Returns

(tf.Tensor) A single value tensor containing the loss.
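
The idea of a listwise softmax loss with propensity weighting can be sketched in plain Python. The helper names log_softmax and weighted_softmax_loss are hypothetical, and normalizing by the total weighted label mass is an assumption made for this example.

```python
import math


def log_softmax(scores):
    """Log-softmax of one ranked list, computed stably."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def weighted_softmax_loss(output, labels, propensity_weights):
    """Listwise softmax cross-entropy where each label is scaled by its weight."""
    total_loss, total_weight = 0.0, 0.0
    for scores, labs, weights in zip(output, labels, propensity_weights):
        log_probs = log_softmax(scores)
        for lp, y, w in zip(log_probs, labs, weights):
            total_loss -= w * y * lp
            total_weight += w * y
    # Assumption: normalize by the total weighted label mass.
    return total_loss / total_weight if total_weight > 0 else 0.0


loss = weighted_softmax_loss(
    output=[[0.0, 0.0]],
    labels=[[1, 0]],
    propensity_weights=[[2.0, 1.0]],
)
```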

get_normalized_weights(propensity)

Computes normalized propensity weights from the given propensities.

Parameters

propensity – (tf.Tensor) A tensor of the same shape as output containing the weight of each element.

Returns

(tf.Tensor) A tensor containing the propensity weights.
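
One plausible normalization, shown purely as an assumption, expresses each weight relative to the first position in the list, so that w[i] = p[0] / p[i]. The function name normalized_inverse_propensity is hypothetical, and this page does not confirm that the method uses this exact scheme.

```python
def normalized_inverse_propensity(propensity):
    """Per-list weights w[i] = p[0] / p[i]; the first position gets weight 1.

    Assumption: propensities are positive and given per list position.
    """
    return [[row[0] / p for p in row] for row in propensity]


weights = normalized_inverse_propensity([[0.5, 0.25, 0.125]])
```

Under this scheme, positions that are examined less often receive proportionally larger weights.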

separate_gradient_update()

softmax_loss(output, labels, propensity=None, name=None)

Computes listwise softmax loss without propensity weighting.

Parameters
  • output – (tf.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (tf.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • propensity – Not used.

  • name – A string used as the name for this variable scope.

Returns

(tf.Tensor) A single value tensor containing the loss.

step(session, input_feed, forward_only)

Run a step of the model feeding the given inputs.

Parameters
  • session – (tf.Session) tensorflow session to use.

  • input_feed – (dictionary) A dictionary containing all the input feed data.

  • forward_only – whether to do the backward step (False) or only forward (True).

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

ultra.learning_algorithm.dla.sigmoid_prob(logits)

ultra.learning_algorithm.ipw_rank module

Training and testing the inverse propensity weighting algorithm for unbiased learning to rank.

See the following paper for more information on the inverse propensity weighting algorithm.

  • Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork. 2016. Learning to Rank with Selection Bias in Personal Search. In Proceedings of SIGIR ’16

  • Thorsten Joachims, Adith Swaminathan, Tobias Schnabel. 2017. Unbiased Learning-to-Rank with Biased Feedback. In Proceedings of WSDM ’17

class ultra.learning_algorithm.ipw_rank.IPWrank(data_set, exp_settings, forward_only=False)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Inverse Propensity Weighting algorithm for unbiased learning to rank.

This class implements the training and testing of the Inverse Propensity Weighting algorithm for unbiased learning to rank. See the following paper for more information on the algorithm.

  • Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork. 2016. Learning to Rank with Selection Bias in Personal Search. In Proceedings of SIGIR ’16

  • Thorsten Joachims, Adith Swaminathan, Tobias Schnabel. 2017. Unbiased Learning-to-Rank with Biased Feedback. In Proceedings of WSDM ’17

__init__(data_set, exp_settings, forward_only=False)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

  • forward_only – Set true to conduct prediction only, false to conduct training.

click_weighted_pairwise_loss(output, labels, propensity, name=None)

Computes pairwise entropy loss with propensity weighting.

Parameters
  • output – (tf.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (tf.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • propensity – (tf.Tensor) A tensor of the same shape as output containing the weight of each element.

  • name – A string used as the name for this variable scope.

Returns

(tf.Tensor) A single value tensor containing the loss.
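
A pairwise loss with propensity weighting can be sketched by summing a weighted pairwise softmax loss over every pair in which one document is labeled higher than the other. The sketch treats the third argument as the documented per-element weight. The name weighted_pairwise_loss, the choice to weight each pair by the higher-labeled document, and the division by batch size are all assumptions for this example.

```python
import math


def weighted_pairwise_loss(output, labels, weights):
    """Sum of weighted pairwise softmax losses, divided by the batch size."""
    total = 0.0
    for scores, labs, ws in zip(output, labels, weights):
        n = len(scores)
        for i in range(n):
            for j in range(n):
                if labs[i] > labs[j]:  # document i should rank above j
                    d = scores[j] - scores[i]
                    # Stable log(1 + exp(s_j - s_i)).
                    pair = max(d, 0.0) + math.log1p(math.exp(-abs(d)))
                    # Assumption: weight the pair by the preferred document.
                    total += ws[i] * pair
    # Assumption: normalize by the number of lists in the batch.
    return total / len(output)


loss = weighted_pairwise_loss(
    output=[[0.0, 0.0, 0.0]],
    labels=[[1, 0, 0]],
    weights=[[2.0, 1.0, 1.0]],
)
```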

click_weighted_softmax_loss(output, labels, propensity, name=None)

Computes listwise softmax loss with propensity weighting.

Parameters
  • output – (tf.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (tf.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • propensity – (tf.Tensor) A tensor of the same shape as output containing the weight of each element.

  • name – A string used as the name for this variable scope.

Returns

(tf.Tensor) A single value tensor containing the loss.

sigmoid_loss(output, labels, propensity, name=None)

Computes pointwise sigmoid loss without propensity weighting.

Parameters
  • output – (tf.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (tf.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • propensity – Not used.

  • name – A string used as the name for this variable scope.

Returns

(tf.Tensor) A single value tensor containing the loss.

softmax_loss(output, labels, propensity, name=None)

Computes listwise softmax loss without propensity weighting.

Parameters
  • output – (tf.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (tf.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • propensity – Not used.

  • name – A string used as the name for this variable scope.

Returns

(tf.Tensor) A single value tensor containing the loss.

step(session, input_feed, forward_only)

Run a step of the model feeding the given inputs.

Parameters
  • session – (tf.Session) tensorflow session to use.

  • input_feed – (dictionary) A dictionary containing all the input feed data.

  • forward_only – whether to do the backward step (False) or only forward (True).

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

ultra.learning_algorithm.ipw_rank.selu(x)

ultra.learning_algorithm.navie_algorithm module

The naive algorithm that directly trains ranking models with clicks.

class ultra.learning_algorithm.navie_algorithm.NavieAlgorithm(data_set, exp_settings, forward_only=False)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The naive algorithm that directly trains ranking models with input labels.

__init__(data_set, exp_settings, forward_only=False)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

  • forward_only – Set true to conduct prediction only, false to conduct training.

pairwise_loss(output, labels, name=None)

Computes pairwise entropy loss.

Parameters
  • output – (tf.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (tf.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • name – A string used as the name for this variable scope.

Returns

(tf.Tensor) A single value tensor containing the loss.

sigmoid_loss(output, labels, name=None)

Computes pointwise sigmoid loss without propensity weighting.

Parameters
  • output – (tf.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (tf.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • name – A string used as the name for this variable scope.

Returns

(tf.Tensor) A single value tensor containing the loss.

softmax_loss(output, labels, name=None)

Computes listwise softmax loss without propensity weighting.

Parameters
  • output – (tf.Tensor) A tensor with shape [batch_size, list_size]. Each value is the ranking score of the corresponding example.

  • labels – (tf.Tensor) A tensor of the same shape as output. A value >= 1 means a relevant example.

  • name – A string used as the name for this variable scope.

Returns

(tf.Tensor) A single value tensor containing the loss.

step(session, input_feed, forward_only)

Run a step of the model feeding the given inputs.

Parameters
  • session – (tf.Session) tensorflow session to use.

  • input_feed – (dictionary) A dictionary containing all the input feed data.

  • forward_only – whether to do the backward step (False) or only forward (True).

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

ultra.learning_algorithm.pairwise_debias module

Training and testing the Pairwise Debiasing algorithm for unbiased learning to rank.

See the following paper for more information on the Pairwise Debiasing algorithm.

  • Hu, Ziniu, Yang Wang, Qu Peng, and Hang Li. “Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm.” In The World Wide Web Conference, pp. 2830-2836. ACM, 2019.

class ultra.learning_algorithm.pairwise_debias.PairDebias(data_set, exp_settings, forward_only=False)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Pairwise Debiasing algorithm for unbiased learning to rank.

This class implements the Pairwise Debiasing algorithm based on the input layer feed. See the following paper for more information on the algorithm.

  • Hu, Ziniu, Yang Wang, Qu Peng, and Hang Li. “Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm.” In The World Wide Web Conference, pp. 2830-2836. ACM, 2019.

__init__(data_set, exp_settings, forward_only=False)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

  • forward_only – Set true to conduct prediction only, false to conduct training.

step(session, input_feed, forward_only)

Run a step of the model feeding the given inputs.

Parameters
  • session – (tf.Session) tensorflow session to use.

  • input_feed – (dictionary) A dictionary containing all the input feed data.

  • forward_only – whether to do the backward step (False) or only forward (True).

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

ultra.learning_algorithm.pairwise_debias.get_bernoulli_sample(probs)

Conduct Bernoulli sampling according to a specific probability distribution.

Parameters

probs – (tf.Tensor) A tensor in which each element denotes a probability of 1 in a Bernoulli distribution.

Returns

A Tensor of binary samples (0 or 1) with the same shape as probs.
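
Elementwise Bernoulli sampling can be sketched in plain Python by comparing each probability with a uniform random draw. The name bernoulli_sample and the optional rng argument are hypothetical, and nested lists stand in for the tf.Tensor input.

```python
import random


def bernoulli_sample(probs, rng=None):
    """Return 0/1 samples with the same nested shape as probs.

    Each entry is 1 with probability p, where p is the matching entry of probs.
    """
    if rng is None:
        rng = random
    # rng.random() lies in [0, 1), so p = 0 never yields 1 and p = 1 always does.
    return [[1 if rng.random() < p else 0 for p in row] for row in probs]


samples = bernoulli_sample([[0.0, 1.0], [0.3, 0.7]], rng=random.Random(0))
```

Passing a seeded random.Random instance makes the draws reproducible.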

ultra.learning_algorithm.pdgd module

Training and testing the Pairwise Differentiable Gradient Descent (PDGD) algorithm for unbiased learning to rank.

See the following paper for more information on the Pairwise Differentiable Gradient Descent (PDGD) algorithm.

  • Oosterhuis, Harrie, and Maarten de Rijke. “Differentiable unbiased online learning to rank.” In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 1293-1302. ACM, 2018.

class ultra.learning_algorithm.pdgd.PDGD(data_set, exp_settings, forward_only=False)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The Pairwise Differentiable Gradient Descent (PDGD) algorithm for unbiased learning to rank.

This class implements the Pairwise Differentiable Gradient Descent (PDGD) algorithm based on the input layer feed. See the following paper for more information on the algorithm.

  • Oosterhuis, Harrie, and Maarten de Rijke. “Differentiable unbiased online learning to rank.” In Proceedings of the 27th ACM International Conference on Information and Knowledge Management, pp. 1293-1302. ACM, 2018.

__init__(data_set, exp_settings, forward_only=False)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

  • forward_only – Set true to conduct prediction only, false to conduct training.

step(session, input_feed, forward_only)

Run a step of the model feeding the given inputs.

Parameters
  • session – (tf.Session) tensorflow session to use.

  • input_feed – (dictionary) A dictionary containing all the input feed data.

  • forward_only – whether to do the backward step (False) or only forward (True).

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

ultra.learning_algorithm.regression_EM module

Training and testing the regression-based EM algorithm for unbiased learning to rank.

See the following paper for more information on the regression-based EM algorithm.

  • Wang, Xuanhui, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. “Position bias estimation for unbiased learning to rank in personal search.” In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 610-618. ACM, 2018.

class ultra.learning_algorithm.regression_EM.RegressionEM(data_set, exp_settings, forward_only=False)

Bases: ultra.learning_algorithm.base_algorithm.BaseAlgorithm

The regression-based EM algorithm for unbiased learning to rank.

This class implements the regression-based EM algorithm based on the input layer feed. See the following paper for more information.

  • Wang, Xuanhui, Nadav Golbandi, Michael Bendersky, Donald Metzler, and Marc Najork. “Position bias estimation for unbiased learning to rank in personal search.” In Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, pp. 610-618. ACM, 2018.

In particular, we use the online EM algorithm for the parameter estimations:

  • Cappé, Olivier, and Eric Moulines. “Online expectation–maximization algorithm for latent data models.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 71.3 (2009): 593-613.

__init__(data_set, exp_settings, forward_only=False)

Create the model.

Parameters
  • data_set – (Raw_data) The dataset used to build the input layer.

  • exp_settings – (dictionary) The dictionary containing the model settings.

  • forward_only – Set true to conduct prediction only, false to conduct training.

step(session, input_feed, forward_only)

Run a step of the model feeding the given inputs.

Parameters
  • session – (tf.Session) tensorflow session to use.

  • input_feed – (dictionary) A dictionary containing all the input feed data.

  • forward_only – whether to do the backward step (False) or only forward (True).

Returns

A triple consisting of the loss, outputs (None if we do backward), and a tf.summary containing related information about the step.

ultra.learning_algorithm.regression_EM.get_bernoulli_sample(probs)

Conduct Bernoulli sampling according to a specific probability distribution.

Parameters

probs – (tf.Tensor) A tensor in which each element denotes a probability of 1 in a Bernoulli distribution.

Returns

A Tensor of binary samples (0 or 1) with the same shape as probs.

Module contents

ultra.learning_algorithm.list_available()

Return type

list