Namespaces
Vowpal Wabbit supports a concept called feature name spaces (aka namespaces).
Two identically named features in two separate namespaces are effectively different features.
Namespaces are useful for:
- Separating the data set (by row) into example subsets (think one model containing multiple, smaller, separate sub-models)
- Separating features (columns) into subsets
- Ignoring/dropping a subset of features belonging to a namespace, as a group, from consideration
- Crossing one namespace subset with another, in the same example, on the fly during runtime (dynamic feature-interaction generation)
Example 1: row-based namespaces: suppose you're tracking revenues for N different stores, based on features like year opened, median income in the local zip code, previous-week gross proceeds, day of week, etc. You can use the store name or town as a namespace for all the other features, segregating the data set into N separate sub-models, one per store. Naturally, when predicting, you should send the expected namespace with the example you want to predict.
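As a sketch (the store and feature names below are made up for illustration), such a data set could be encoded with one namespace per store:

```
1.2 |acme year_opened:1998 median_income:54000 prev_week_gross:10500 day_of_week:3
0.9 |bravo year_opened:2005 median_income:61000 prev_week_gross:8200 day_of_week:3
```

Because `year_opened` under `acme` and `year_opened` under `bravo` hash to different features, each store effectively gets its own sub-model inside a single vw model.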
Example 2: column-based namespaces: you have features about a product (price, category, ...) in one namespace P, and features about the buyers of the product (age, income, ...) in another namespace B. By passing `--quadratic PB` you can generate the interaction features between the two namespaces on the fly.
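A minimal sketch of what such an input line might look like (the feature names are illustrative, not from the original):

```
1 |P price:9.99 category=toys |B age:34 income:52000
```

Training with `vw --quadratic PB` then generates crossed features such as price × age and category × income at runtime, without materializing them in the data file.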
In essence, when you add a namespace to an example, all the features following that namespace get mapped to a new feature name. A namespace is introduced by dropping the initial space after the `|` character that introduces input features. If a space is present, there are no namespaces, only regular features (as in the first example in the table below). When the space is dropped and a name appears immediately after the `|` character, that first name becomes a namespace for all the feature names that follow it.
| Namespace | Original feature name | Effective feature name | Example syntax |
|---|---|---|---|
| (none) | bar | bar | `\| bar` |
| foo | bar | foo^bar | `\|foo bar` |
| foo | baz | foo^baz | `\|foo baz` |
The complete Input file format page covers the full syntax.
A few of the many `vw` command-line options act on namespaces.
Limitation: options that act on namespaces only use the first character of the namespace name. You cannot, for example, drop only one namespace starting with `a` if you have more than one namespace starting with `a`.
| Option | Meaning |
|---|---|
| `--keep c` | Keep only namespaces starting with the character `c` |
| `--ignore c` | Ignore namespaces starting with the character `c` |
| `--redefine a:=b` | Redefine namespaces starting with `b` as starting with `a` |
| `--quadratic ab` | Cross namespaces starting with `a` and `b` on the fly to generate 2-way interaction features |
| `--cubic abc` | Cross namespaces starting with `a`, `b`, and `c` on the fly to generate 3-way interaction features |
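For example, the options above can be combined on the command line like this (the data and model file names are placeholders):

```
# Drop every namespace starting with 'B' during training
vw --ignore B -d train.txt -f model.vw

# Train only on namespaces starting with 'P'
vw --keep P -d train.txt

# Generate 2-way interactions between namespaces starting with 'P' and 'B'
vw --quadratic PB -d train.txt
```

Remember that `--keep`, `--ignore`, and `--quadratic` all match on the first character of the namespace name only, per the limitation above.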
The complete Command line arguments page covers all possible `vw` options.
- vw command line arguments
- vw input format spec
- (Stack Overflow): Difference between name space and feature
- (Cross Validated): Finding the best features in interaction models
- (Stack Overflow): Can't use full name of name space in `--keep`
- (Stack Overflow): What hash function is used by vw?
- (Stack Overflow): How to use the `--keep` and `--ignore` options?
- Feature Hashing for Large Scale Multitask Learning, K. Weinberger, A. Dasgupta, J. Attenberg, J. Langford, A. Smola