Projects
- VW support for FlatBuff and/or Protobuf
- Parallelized parsing
- VW server mode revamp
- Contextual bandits data visualization with Jupyter notebooks
- Improve VW’s Python experience
- End-to-end local loop for reinforcement learning
- TensorWatch and TensorBoard integration
- ONNX operator set and model format for VW models
- Enable implementation of a VW reduction in Python
- Allow Python implementations of RLClientLib extensibility points
- Contextual bandit benchmark and competition
- Library of contextual bandit estimators
VW reads and writes several kinds of files: examples, caches, and models. This project involves adding support for a modern serialization framework such as FlatBuffers or Protobuf. This will enable easier interop, better stability, and potentially increased performance.
- Produce wiki page outlining design and usage
- Design schemas
- Load and save a model
- Load examples from a file. Start by keeping labels stored as a string.
- Utilities to inspect and convert to/from VW's file formats
- Produce wiki page outlining design and usage
- Load and save the cache file
- Schemas for structured label types
- Benchmark and optimize performance
- VW example text input format
- Code pointer for text example parsing
- Code pointer for model saving and loading
- Flatbuffers GitHub repo
- Protobuf GitHub repo
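To make the "load examples, keeping labels stored as a string" milestone more concrete, here is a minimal sketch of the kind of in-memory record a schema might describe, filled from a simplified subset of VW's text format. The `Example` class and `parse_text_example` function are hypothetical names for illustration, not part of VW, and the parser deliberately ignores namespace scaling values and other parts of the full format.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Example:
    # Label kept as an opaque string, as the first milestone suggests.
    label: str
    # Namespace name -> {feature name -> value}. "" is the anonymous namespace.
    namespaces: Dict[str, Dict[str, float]] = field(default_factory=dict)


def parse_text_example(line: str) -> Example:
    """Parse a simplified subset of VW's text format (an assumption-laden sketch)."""
    head, *sections = line.split("|")
    example = Example(label=head.strip())
    for section in sections:
        # A namespace name, when present, directly follows the bar with no space.
        has_name = bool(section) and not section[0].isspace()
        tokens = section.split()
        name = tokens.pop(0) if has_name else ""
        features = example.namespaces.setdefault(name, {})
        for token in tokens:
            feature, _, value = token.partition(":")
            # A feature without an explicit value defaults to 1.
            features[feature] = float(value) if value else 1.0
    return example
```

A record shaped like this maps naturally onto a schema with a string label field and a repeated namespace table, which is one possible starting point for the schema design task.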
Modern machines often utilize many threads to achieve performance. VW currently uses a single parse thread and a single learner thread, and parsing is often the bottleneck. Extending the parser to support many threads will allow us to better utilize resources.
- Produce wiki page outlining design and usage
- Extract parser to standalone component
- Spawn threads for parse jobs
- Ensure original ordering in datafile is preserved
- Lock free synchronization of threads
- Use all reduce to support multi threaded learning
- Separate I/O threads from parse threads
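The requirement that parallel parsing preserve the original ordering of the data file can be illustrated with a small sketch. It uses Python's standard thread pool rather than VW's C++ parser, and `parse_line` is a hypothetical stand-in for real parsing work. In CPython the global interpreter lock limits the speedup for CPU-bound parsing, so this only demonstrates the ordering guarantee, not the performance gain.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, Iterator, List, Tuple


def parse_line(line: str) -> Tuple[str, List[str]]:
    """Stand-in parser: split a line into (label, feature tokens)."""
    label, _, features = line.partition("|")
    return label.strip(), features.split()


def parse_in_order(
    lines: Iterable[str], workers: int = 4
) -> Iterator[Tuple[str, List[str]]]:
    """Parse lines on a thread pool, yielding results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of its inputs,
        # regardless of which worker finishes first.
        yield from pool.map(parse_line, lines)
```

The learner thread can consume this iterator exactly as it would a single-threaded parser, which is the property the ordering task asks for.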
VW currently has a daemon mode, which allows clients to send examples, train a model, and receive predictions. It uses raw sockets and a custom binary protocol. We want to provide a modern version of VW's server mode built on a modern RPC technology.
- Single model serving using gRPC with the following endpoints:
- Predict
- Learn
- Statistics (number of features, current loss, etc)
- Management (download current model, number of features)
- Packaging tools to create docker containers from VW params and model
- Wiki page describing how to use it
- Persistent model storage.
- Multiple models from a single daemon.
Build visualizations to help understand the behavior of Contextual Bandit policies and logs.
- Visualizations for:
- Action distribution
- Action/reward distribution by feature(s) or model used
- Model comparison
- Feature importance
- Produce a synthetic dataset that highlights the usefulness of the visualizations
- TBD
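As a rough illustration of the data behind the action distribution and action/reward plots, the sketch below aggregates a logged contextual bandit dataset. The log layout, a list of `(action, reward)` pairs, is an assumption made for this example and not a VW log format; a notebook could pass the resulting dictionaries to any plotting library.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

LogEntry = Tuple[int, float]  # (chosen action, observed reward), hypothetical layout


def action_distribution(log: List[LogEntry]) -> Dict[int, float]:
    """Fraction of logged events in which each action was chosen."""
    counts = Counter(action for action, _ in log)
    total = sum(counts.values())
    return {action: n / total for action, n in sorted(counts.items())}


def mean_reward_by_action(log: List[LogEntry]) -> Dict[int, float]:
    """Average observed reward for each action that appears in the log."""
    rewards: Dict[int, List[float]] = defaultdict(list)
    for action, reward in log:
        rewards[action].append(reward)
    return {action: sum(rs) / len(rs) for action, rs in sorted(rewards.items())}
```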
VW's Python integration can be improved in several areas to make it easier for users. Supporting Pandas as a first-class concept will make using VW in experimentation workflows much more streamlined. Implementing IPython HTML representations for some common types will improve the usability of these components.
- Implement _repr_html_ for examples, model and labels
- Access to progressive validation and other model statistics
- Pandas load and save from VW text format
- Simplify example lifecycle
- [IPython HTML representation](https://ipython.readthedocs.io/en/stable/config/integrating.html)
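The IPython rich display hook mentioned above works by defining a `_repr_html_` method on a class. The following sketch shows the idea on a hypothetical `ExampleView` wrapper; the class and its fields are assumptions for illustration and do not reflect VW's actual Python types.

```python
from html import escape
from typing import Dict


class ExampleView:
    """Hypothetical wrapper around an example's label and features."""

    def __init__(self, label: str, features: Dict[str, float]):
        self.label = label
        self.features = features

    def _repr_html_(self) -> str:
        # IPython and Jupyter call this method, when present, to render
        # the object as HTML instead of its plain-text repr.
        rows = "".join(
            f"<tr><td>{escape(name)}</td><td>{value:g}</td></tr>"
            for name, value in self.features.items()
        )
        return (
            f"<table><caption>label: {escape(self.label)}</caption>"
            f"<tr><th>feature</th><th>value</th></tr>{rows}</table>"
        )
```

In a notebook, leaving an `ExampleView` instance as the last expression of a cell would render it as a table. Feature names are escaped so that characters such as `<` do not break the markup.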
The reinforcement learning library has extension points that allow parts of the framework to be swapped out; however, there is currently no simple way to make it work end to end locally. Making RLClientLib support prediction, logging, joining and training locally will make for a great prototyping tool.
- In-memory joining and training
- Extend configuration to enable local mode
- Python and C# API support
- Checkpointing - load and save model
- Port some of our RLClientLib simulators to use the local loop
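In-memory joining, the step that pairs each logged decision with its later outcome, can be sketched in a few lines. The tuple layouts, the `join_events` name, and the choice to let a later outcome overwrite an earlier one for the same event are all assumptions for illustration, not RLClientLib's actual behavior.

```python
from typing import Dict, Iterable, List, Tuple

Decision = Tuple[str, int, float]  # (event id, chosen action, probability)
Outcome = Tuple[str, float]        # (event id, reward)
Joined = Tuple[str, int, float, float]


def join_events(
    decisions: Iterable[Decision],
    outcomes: Iterable[Outcome],
    default_reward: float = 0.0,
) -> List[Joined]:
    """Attach a reward to every decision, keeping decision order."""
    # Index outcomes by event id; a later outcome for the same id
    # replaces an earlier one in this sketch.
    rewards: Dict[str, float] = dict(outcomes)
    return [
        (event_id, action, prob, rewards.get(event_id, default_reward))
        for event_id, action, prob in decisions
    ]
```

Decisions with no observed outcome receive the default reward, so the joined list can be fed straight into a local training step.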
TensorBoard and TensorWatch are great tools for debugging and monitoring training, which makes them a natural fit for integration with VW and RLClientLib.
- Integrate VW training with TensorWatch all within a notebook
- Extend VW to output TensorBoard logs
- Extend RLClientLib to support TensorBoard and TensorWatch
- Add lazy logging mode to VW and RLClientLib
VW has its own runtime for running inference off of its own model files. However, ONNX is the emerging standard for defining models and supporting inference. This project enables VW models to interoperate with ONNX runtime.
- Define ONNX.vw operation set for the reductions needed for classification (CSOAA)
- Define shape of VW example in tensor format
- Converter tool from vw model to ONNX model
- Implement the new opset with ONNX runtime
- Sample app that runs inference
- Extend opset to Contextual Bandits
- Export ONNX model directly from VW
All reductions in VW are implemented in C++. However, allowing reductions to be written in Python would enable rapid prototyping and make it possible to take advantage of the Python ecosystem.
- Create interface that allows Python code to implement a base learner in VW
- Implement a simple gradient descent base learner using scikit-learn
- Allow for the Python implemented reduction to be used at a different level of the reduction stack
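To give a feel for what a Python-implemented base learner could look like, here is a minimal sketch of a linear model trained by stochastic gradient descent on squared loss. It uses plain Python instead of scikit-learn so that it stays self-contained, and the `predict`/`learn` method pair is an assumed interface, not the one VW exposes.

```python
from typing import Dict


class SquaredLossSGD:
    """Hypothetical base learner: linear model, SGD on squared loss."""

    def __init__(self, learning_rate: float = 0.1):
        self.learning_rate = learning_rate
        self.weights: Dict[str, float] = {}

    def predict(self, features: Dict[str, float]) -> float:
        return sum(
            self.weights.get(name, 0.0) * value
            for name, value in features.items()
        )

    def learn(self, features: Dict[str, float], label: float) -> float:
        """Update on one example; return the prediction made before the update."""
        prediction = self.predict(features)
        error = prediction - label
        # The gradient of 0.5 * (prediction - label)^2 with respect to
        # each weight is error * feature value.
        for name, value in features.items():
            step = self.learning_rate * error * value
            self.weights[name] = self.weights.get(name, 0.0) - step
        return prediction
```

Returning the pre-update prediction from `learn` mirrors how online learners report progressive validation loss: each example is scored before the model has seen its label.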
RLClientLib supports several points of extensibility, but these are only exposed in C++. When using RLClientLib from Python, it is important to be able to implement these as well.
- Support a custom model implementation in Python through the i_model interface
- Create an example of using these locally
- Support custom i_sender implementation for event logging
- Support i_data_transport for retrieving updated models
There exist many different contextual bandit algorithms, and a standard benchmark would make it easier to compare them. Use the contextual bandit bake-off paper as a base and build a set of standard CB benchmarks and supporting infrastructure to competitively evaluate CB algorithms. This is similar to the GLUE benchmark for NLP.
- Design CB experiments - start off with CB bakeoff paper
- Create infrastructure to obtain datasets
- Upload predictions to evaluate performance of algorithm
- Visualization, display results and compare to others
- Abstract what it means to be a CB algo to provide a more structured evaluation workflow
Estimators are used in off-policy evaluation. One common estimator is inverse propensity scoring (IPS); others include doubly robust (DR) and pseudo-inverse (PI). These estimators work better or worse in different settings. This project explores reference implementations of each and allows for comparison between them to aid in understanding. As a stretch goal, it involves using this common library of estimators in the existing counterfactual estimation module.
- Add implementation of DR, and DR in episodic settings
- Simulator interface that allows evaluation against logging policy and target policy
- Generate a random logging policy and target policy to use for evaluation
- Visualization of comparison
- Pseudo inverse
- Integrate into existing counterfactual evaluation framework
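The two most common estimators named above can be sketched compactly. This is an illustrative reference in their standard textbook form, not the project's implementation. The log layout and the `target_prob` and `reward_model` callables are assumptions for this example.

```python
from typing import Callable, Sequence, Tuple

# One logged event: (context, action taken, probability the logging policy
# assigned to that action, observed reward). Hypothetical layout.
Event = Tuple[object, int, float, float]
PolicyProb = Callable[[object, int], float]   # target policy's P(action | context)
RewardModel = Callable[[object, int], float]  # predicted reward for (context, action)


def ips(log: Sequence[Event], target_prob: PolicyProb) -> float:
    """Inverse propensity scoring estimate of the target policy's value."""
    return sum(r * target_prob(x, a) / p for x, a, p, r in log) / len(log)


def doubly_robust(
    log: Sequence[Event],
    target_prob: PolicyProb,
    reward_model: RewardModel,
    actions: Sequence[int],
) -> float:
    """Doubly robust estimate: model-based value plus an IPS-style correction."""
    total = 0.0
    for x, a, p, r in log:
        # Model-based estimate of the target policy's expected reward.
        direct = sum(target_prob(x, b) * reward_model(x, b) for b in actions)
        # Importance-weighted correction for the model's error on the logged action.
        correction = target_prob(x, a) / p * (r - reward_model(x, a))
        total += direct + correction
    return total / len(log)
```

With a reward model that always predicts zero, the doubly robust estimate reduces to IPS, which makes a convenient sanity check when comparing implementations.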