Matching Label and Prediction Types Between Reductions
VW solves various machine learning problems using a reduction stack. The primary philosophy behind this is that we can reduce new problems and algorithms to problems we have already solved. The VW reduction stack consists of a sequence of learners, ending in a bottom learner that does not reduce the problem any further. Examples enter from the top of the stack, and predictions propagate back up starting from the bottom.
Each learner in VW has its own setup, learn, and predict functions. The `setup()` function is used to set various features within the learner and process command line arguments. The `learn()` function processes labeled examples to update the current model, and the `predict()` function outputs a prediction given the current model. More information on learners can be found on the page What is a learner?
When considering a reduction stack, it is important to understand how labels and predictions are propagated through the stack via the `learn()` and `predict()` functions. Unless a learner is at the bottom of the stack (simply called a bottom learner), it relies on its base immediately below it: a reduction learner will at some point call `base.learn()` inside its own `learn()` function, and `base.predict()` inside its own `predict()` function, in order to invoke these functions in the learner beneath it.
A learner will call `base.learn()` after it has processed (and potentially altered) the label that was input. Calling `base.learn()` therefore has the effect of giving this updated label to the base learner. On the other hand, a learner will call `base.predict()` before it processes (and potentially alters) a prediction, so it applies its logic to the prediction returned by the base. Thus `base.predict()` can be thought of as fetching the prediction from the base learner and providing it to the current learner.
Consider the following image which demonstrates the directionality of labels and predictions between a reduction learner and its base. Note that the base can be either a bottom learner or another reduction learner, but this is not illustrated in the diagram.
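This flow can also be sketched in code. Below is a minimal, hypothetical reduction in the style of VW's C++ reductions; the `my_reduction_data` struct and the specific label/prediction manipulations are placeholders, and the exact `learn`/`predict` signatures vary between VW versions.

```cpp
// Hypothetical reduction data; not part of VW.
struct my_reduction_data
{
  float scale = 1.f;
};

// learn(): process (and potentially alter) the incoming label first,
// then hand the example down via base.learn().
void learn(my_reduction_data& data, VW::LEARNER::single_learner& base, VW::example& ec)
{
  ec.l.simple.label *= data.scale;  // this learner's label logic
  base.learn(ec);                   // give the updated label to the base learner
}

// predict(): fetch the base prediction first via base.predict(),
// then apply this learner's logic to it on the way back up.
void predict(my_reduction_data& data, VW::LEARNER::single_learner& base, VW::example& ec)
{
  base.predict(ec);               // prediction comes back from the base
  ec.pred.scalar /= data.scale;   // this learner's prediction logic
}
```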
In an effort to organize how learners interact with one another, we have introduced the two functions `set_input_label_type()` and `set_output_label_type()`. They are member functions of learner builders and are called when creating a learner. The `set_input_label_type()` function specifies the label type that is passed directly into a learner (which comes from the learner above it), and `set_output_label_type()` specifies the label type that is passed into `base.learn()` (which will be sent to the learner below it).
Likewise, we also have `set_input_prediction_type()` and `set_output_prediction_type()`. Note that the direction of "input" and "output" is reversed here. The `set_input_prediction_type()` function specifies the prediction type that is returned by `base.predict()`, not the prediction type directly passed into the `predict()` function. This is because `predict()` acts on the prediction returned by `base.predict()`. `set_output_prediction_type()` specifies the prediction type that is present at the end of the `predict()` function, as this is the output prediction of this learner that will be passed back up to any learner above it.
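As a concrete sketch, an OAA-style reduction that consumes a multiclass label, hands simple (regression) labels to its base, reads back scalar predictions, and outputs a multiclass prediction might declare its types roughly as follows. The builder and enum names (`make_reduction_learner`, `VW::label_type_t`, `VW::prediction_type_t`) follow recent versions of VW's C++ API and may differ in yours.

```cpp
// Sketch only: type declarations for a hypothetical OAA-style reduction.
auto l = VW::LEARNER::make_reduction_learner(
             std::move(data), base, learn, predict, "my_oaa_like_reduction")
             .set_input_label_type(VW::label_type_t::multiclass)            // label given to this learner
             .set_output_label_type(VW::label_type_t::simple)               // label passed into base.learn()
             .set_input_prediction_type(VW::prediction_type_t::scalar)      // returned by base.predict()
             .set_output_prediction_type(VW::prediction_type_t::multiclass) // produced by this predict()
             .build();
```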
Given that the output label type is specified by what is passed into `base.learn()`, and the input prediction type is specified by what is returned from `base.predict()`, it is ill-defined what these types should be for a bottom learner. So as not to leave these values uninitialized, we have decided to set them to `NOLABEL` and `NOPRED` as placeholders.
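For example, a bottom learner such as a scalar regressor only has meaningful input label and output prediction types; the other two are the placeholders above. The `make_base_learner` call below is a sketch following recent VW C++ conventions, not a fixed API.

```cpp
// Sketch only: a bottom learner declares its input label type and output
// prediction type; its output label type and input prediction type are left
// as the NOLABEL / NOPRED placeholders since there is no learner below it.
auto bottom = VW::LEARNER::make_base_learner(
                  std::move(data), learn, predict, "my_bottom_learner",
                  VW::prediction_type_t::scalar,  // output prediction type
                  VW::label_type_t::simple)       // input label type
                  .build();
```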
The ultimate goal of setting each learner's input and output label and prediction types is to enforce that a reduction stack is set up correctly. Learner builders will perform a check at runtime for correctness.
Given the directionality of labels and predictions, the following four properties should always hold for a given reduction learner `R` (a sketch of this check appears after the two lists below):
- The learner above `R` should have the same output label type as the input label type of `R`
- The learner below `R` should have the same input label type as the output label type of `R`
- The learner above `R` should have the same input prediction type as the output prediction type of `R`
- The learner below `R` should have the same output prediction type as the input prediction type of `R`
And the following four properties should always hold for a given bottom learner `B`:
- The learner above `B` should have the same output label type as the input label type of `B`
- There is no learner below `B`, and the output label type of `B` is set to `NOLABEL`
- The learner above `B` should have the same input prediction type as the output prediction type of `B`
- There is no learner below `B`, and the input prediction type of `B` is set to `NOPRED`
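In essence, the runtime check walks adjacent pairs of learners in the stack and compares these fields. The following is a rough sketch of that logic, not VW's actual implementation; the `learner_types` struct and `check_stack` function are invented for illustration, with type names reduced to strings.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Invented for illustration: the four type fields each learner declares.
// In VW these are VW::label_type_t / VW::prediction_type_t enums.
struct learner_types
{
  std::string name;
  std::string input_label;        // label handed to this learner's learn()
  std::string output_label;       // label this learner passes into base.learn()
  std::string input_prediction;   // prediction returned by base.predict()
  std::string output_prediction;  // prediction produced by this learner's predict()
};

// Check every adjacent (upper, lower) pair, top of the stack first. The bottom
// learner's output_label / input_prediction are the NOLABEL / NOPRED
// placeholders, so they are never compared against a learner below.
void check_stack(const std::vector<learner_types>& stack_top_down)
{
  for (size_t i = 0; i + 1 < stack_top_down.size(); ++i)
  {
    const auto& upper = stack_top_down[i];
    const auto& lower = stack_top_down[i + 1];
    if (upper.output_label != lower.input_label)
    {
      throw std::runtime_error(upper.name + " -> " + lower.name + ": label type mismatch");
    }
    if (lower.output_prediction != upper.input_prediction)
    {
      throw std::runtime_error(lower.name + " -> " + upper.name + ": prediction type mismatch");
    }
  }
}
```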