This is a living document that will be updated as questions are asked and answered.
The --csoaa_ldf and --wap_ldf modes use label dependent features, which allow you to specify a dynamic set of labels on each example. See here and the tutorials page here.
MTR stands for Multi Task Regression; more information can be found in the CB Bakeoff paper.
It is the default because, in an action dependent features setting, its update rule is usually more efficient than IPS/DR:
- In MTR, only the weights of the chosen action are updated, rather than updating all weights while assuming 0 reward for the non-chosen actions, as IPS does.
- In MTR, the propensity (i.e., the probability of the chosen action) is used directly as a weight in the regression cost formula, rather than only in the estimator of the loss (in the CB Bakeoff paper, compare eq. 6 for MTR versus eq. 5, where the estimator of the loss is eq. 3 for IPS and eq. 4 for DR); see the sketch after this list.
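As a sketch of that distinction, in notation paraphrasing the Bakeoff paper (with $x_t$ the context, $a_t$ the logged action, $c_t$ its observed cost, and $p_t$ its logged propensity; the exact symbols here are ours, not copied from the paper):

```latex
% MTR: importance-weighted regression on the chosen action only;
% the inverse propensity enters directly as a regression weight.
\hat{f}_{\mathrm{MTR}} = \arg\min_f \sum_t \frac{1}{p_t} \left( f(x_t, a_t) - c_t \right)^2

% IPS: the propensity only enters through the cost estimate below,
% which imputes cost 0 for every non-chosen action and is then
% regressed on for all actions.
\hat{c}_t(a) = \frac{c_t}{p_t} \, \mathbb{1}\{ a = a_t \}
```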
I have historical data without probabilities. Can I estimate the performance of contextual bandits offline?
When the propensities are not available, there are two options:
- No randomization was performed online, so each action was taken with probability 1. In this case, offline estimation of the performance of CB (or any other offline algorithm) cannot be done reliably, and it may also yield very biased, wrong estimates (e.g., the No Unknown Confounders assumption may not hold). One option is to start by implementing an A/B test in which you randomize your campaigns. There is a relevant discussion here.
- Randomization was performed but the propensity is not known. In this case, one could use offline experimentation estimators that do not use the propensity, such as the Direct Method (DM) estimator (sketched below). In our empirical experience this option is usually not very data efficient, since the DM estimator still needs a large offline dataset to provide small confidence intervals, and it is also prone to estimation errors, caused by bugs in the data collection pipeline, that are very difficult to spot. When possible, we suggest collecting data with pipelines that include logging the propensity, as done in Azure Personalizer.
See documentation here
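For concreteness, here is a sketch of the Direct Method estimator mentioned above: fit a reward regressor $\hat{r}(x, a)$ on the logged data, then score a target policy $\pi$ by the average predicted reward of the actions it would have chosen (the notation here is ours):

```latex
% Direct Method: no propensities required; the estimate is only as
% good as \hat{r}'s extrapolation to actions \pi selects but the
% logging policy rarely took.
\hat{V}_{\mathrm{DM}}(\pi) = \frac{1}{n} \sum_{t=1}^{n} \hat{r}\left( x_t, \pi(x_t) \right)
```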