Slates
Slates is a multi-slot algorithm that builds on the ideas of Conditional Contextual Bandit (CCB). See the linked paper for details.
It has the following constraints:
- There is a single global reward
- Slots have disjoint action sets
slates shared [global_cost] | <features>
slates action <slot_id> | <features>
slates slot [chosen_action_id:probability,action_id:probability...] | <features>
- slot_id is zero based and required
- chosen_action_id is zero based and refers to the index relative to actions in that slot
- If labeled, every slot must have chosen_action_id:probability pairs
- It is possible to supply the entire actions and probability lists as well for more complex counterfactual experiments
Note: because a single example can span multiple lines, it is important to leave an empty line between examples. If reading from a file, make sure the file ends with an empty line.
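The line layout above can be sketched as a small formatter. This is a minimal illustration in plain Python, not part of VW itself: the function name `format_slates_example` and its parameters are hypothetical, and feature strings are passed through unchanged.

```python
def format_slates_example(shared, actions, slots, global_cost=None, labels=None):
    """Build one multi-line slates example in the text format described above.

    shared:      feature string for the shared line
    actions:     list of (slot_id, feature string) pairs, one per action line
    slots:       list of feature strings, one per slot line
    global_cost: number for a labeled example, None for an unlabeled one
    labels:      per slot, a list of (action_id, probability) pairs (labeled only)
    """
    labeled = global_cost is not None
    if labeled and (labels is None or len(labels) != len(slots)):
        raise ValueError("a labeled example needs action:probability pairs for every slot")

    cost = f" {global_cost}" if labeled else ""
    lines = [f"slates shared{cost} | {shared}"]
    lines += [f"slates action {slot_id} | {features}" for slot_id, features in actions]
    for i, features in enumerate(slots):
        label = ""
        if labeled:
            label = " " + ",".join(f"{a}:{p}" for a, p in labels[i])
        lines.append(f"slates slot{label} | {features}")
    # The trailing empty line separates this example from the next one.
    return "\n".join(lines) + "\n\n"
```

Calling it with `global_cost=None` produces an unlabeled example; the returned string always ends with the empty line that the note above asks for.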
When providing a labeled example, the global cost must be supplied. You may provide just the top action and its probability for each slot:
slates shared 0.8 | <features>
slates action 0 | <features>
slates action 0 | <features>
slates action 0 | <features>
slates action 1 | <features>
slates action 1 | <features>
slates slot 1:0.8 | <features>
slates slot 0:0.6 | <features>
Or you can supply the entire list of probabilities:
slates shared 0.8 | <features>
slates action 0 | <features>
slates action 0 | <features>
slates action 0 | <features>
slates action 1 | <features>
slates action 1 | <features>
slates slot 1:0.8,0:0.1,2:0.1 | <features>
slates slot 0:0.6,1:0.4 | <features>
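Because each action id is relative to the actions of its own slot, it is easy to write an id that is out of range. The check below is a minimal sketch in plain Python; `check_slot_labels` is a hypothetical helper, and the rule that each probability lies in (0, 1] is an assumption added here rather than something stated on this page.

```python
from collections import Counter

def check_slot_labels(action_slot_ids, labels):
    """Return a list of problems; an empty list means the labels look consistent.

    action_slot_ids: slot_id of each action line, in order, e.g. [0, 0, 0, 1, 1]
    labels:          per slot, a list of (action_id, probability) pairs
    """
    per_slot = Counter(action_slot_ids)  # number of actions available to each slot
    problems = []
    for slot, pairs in enumerate(labels):
        n_actions = per_slot.get(slot, 0)
        if not pairs:
            problems.append(f"slot {slot}: no action:probability pair")
        for action_id, prob in pairs:
            # Action ids are zero based and relative to the actions in this slot.
            if not 0 <= action_id < n_actions:
                problems.append(f"slot {slot}: action id {action_id} outside 0..{n_actions - 1}")
            # Assumption: a logged probability must lie in (0, 1].
            if not 0.0 < prob <= 1.0:
                problems.append(f"slot {slot}: probability {prob} outside (0, 1]")
    return problems
```

For the example above, `check_slot_labels([0, 0, 0, 1, 1], [[(1, 0.8), (0, 0.1), (2, 0.1)], [(0, 0.6), (1, 0.4)]])` reports no problems, whereas an id of 2 in the second slot would be flagged because that slot only has two actions.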
For an unlabeled example, the global cost and the probability lists are omitted.
slates shared | <features>
slates action 0 | <features>
slates action 0 | <features>
slates action 0 | <features>
slates action 1 | <features>
slates action 1 | <features>
slates slot | <features>
slates slot | <features>
{
"_label_cost": 1,
"_outcomes":[{
"_a": // <list of integers, or single integer>
"_p": // <list of floats, or single float>
},...
],
"c":"..."
}
context:
- Context is similar to CCB, except for the following changes
{
<shared_features>,
"_multi":[<action>, ...],
"_slots":[<slot>, ...],
}
action:
- Action must specify which slot it belongs to
{
"_slot_id":<slot_id>,
...
}
slot:
- Slot has no special fields
- May not specify ids or included actions
{
...
}
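The structure described above can be assembled with Python's standard `json` module. This is a sketch under stated assumptions: `build_slates_json` and its parameters are hypothetical names, features are plain dictionaries, and the key order simply mirrors the examples on this page.

```python
import json

def build_slates_json(shared, actions, slots, cost=None, outcomes=None):
    """Serialize one slates example as a JSON string.

    shared:   dict of shared features
    actions:  list of (slot_id, feature dict) pairs
    slots:    list of feature dicts, one per slot (no special fields)
    cost:     global cost for a labeled example, None for an unlabeled one
    outcomes: per slot, an (action, probability) pair; each element may be
              a single value or a list, as in the format above
    """
    if cost is not None and (outcomes is None or len(outcomes) != len(slots)):
        raise ValueError("a labeled example needs one outcome per slot")

    context = dict(shared)
    # Every action states which slot it belongs to through "_slot_id".
    context["_multi"] = [{"_slot_id": slot_id, **features} for slot_id, features in actions]
    context["_slots"] = [dict(features) for features in slots]

    example = {}
    if cost is not None:
        example["_label_cost"] = cost
    example["c"] = context
    if cost is not None:
        example["_outcomes"] = [{"_a": a, "_p": p} for a, p in outcomes]
    return json.dumps(example)
```

Leaving `cost` as `None` yields an unlabeled example that contains only the `"c"` object.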
When providing a labeled example, the global cost must be supplied. You may provide just the top action and its probability for each slot:
{
"_label_cost": 1,
"c": {
"shared_feature": 1.0,
"_multi": [
{
"_slot_id": 0,
"feature": 1.0
},
{
"_slot_id": 0,
"feature": 1.0
},
{
"_slot_id": 0,
"feature": 1.0
},
{
"_slot_id": 1,
"feature": 1.0
},
{
"_slot_id": 1,
"feature": 1.0
}
],
"_slots": [
{
"feature": 1.0
},
{
"feature": 1.0
}
]
},
"_outcomes": [
{
"_a": 1,
"_p": 0.8
},
{
"_a": 0,
"_p": 0.6
}
]
}
Or you can supply the entire list of probabilities:
{
"_label_cost": 1,
"c": {
"shared_feature": 1.0,
"_multi": [
{
"_slot_id": 0,
"feature": 1.0
},
{
"_slot_id": 0,
"feature": 1.0
},
{
"_slot_id": 0,
"feature": 1.0
},
{
"_slot_id": 1,
"feature": 1.0
},
{
"_slot_id": 1,
"feature": 1.0
}
],
"_slots": [
{
"feature": 1.0
},
{
"feature": 1.0
}
]
},
"_outcomes": [
{
"_a": [1,0,2],
"_p": [0.8,0.1,0.1]
},
{
"_a": [0,1],
"_p": [0.6,0.4]
}
]
}
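The `_outcomes` entries carry the same information as the `action_id:probability` labels of the text format. The conversion below is a minimal sketch of that correspondence; `slot_label_to_outcome` is a hypothetical helper, not a VW function.

```python
def slot_label_to_outcome(label):
    """Convert a text-format slot label into a JSON "_outcomes" entry.

    "1:0.8"             -> {"_a": 1, "_p": 0.8}
    "1:0.8,0:0.1,2:0.1" -> {"_a": [1, 0, 2], "_p": [0.8, 0.1, 0.1]}
    """
    pairs = [item.split(":") for item in label.split(",")]
    ids = [int(action_id) for action_id, _ in pairs]
    probs = [float(prob) for _, prob in pairs]
    if len(pairs) == 1:
        # A single pair maps to a single integer and a single float.
        return {"_a": ids[0], "_p": probs[0]}
    return {"_a": ids, "_p": probs}
```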
For an unlabeled example, the global cost and the probability lists are omitted.
{
"c": {
"shared_feature": 1.0,
"_multi": [
{
"_slot_id": 0,
"feature": 1.0
},
{
"_slot_id": 0,
"feature": 1.0
},
{
"_slot_id": 0,
"feature": 1.0
},
{
"_slot_id": 1,
"feature": 1.0
},
{
"_slot_id": 1,
"feature": 1.0
}
],
"_slots": [
{
"feature": 1.0
},
{
"feature": 1.0
}
]
}
}