Daemon example
Note: This feature is currently supported only in Linux environments. Daemon mode, as currently implemented, has been deprecated on macOS since version 10.5. If you want to use it on macOS and run into problems, please let us know by opening an issue on GitHub.
Use case: You have a model and you want vw to efficiently serve predictions based on the model.
When you run vw in daemon mode, it loads the model into memory once at startup (using -i modelfile) and listens for requests on a TCP socket (using the --daemon option). If you run vw in test-only (-t) mode, it will only test (i.e. predict). Every example written to the socket should produce an immediate response on the same socket.
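The request/response exchange described above can be sketched as a small client. This is a minimal sketch, not part of vw: the function name predict_one and the 5-second timeout are assumptions made for illustration, and the default port is simply the one used later on this page.

```python
import socket


def predict_one(example: str, host: str = "localhost", port: int = 26542) -> str:
    """Send one example line to a vw daemon and return its one-line reply.

    predict_one is a hypothetical helper; host, port and timeout are assumptions.
    """
    with socket.create_connection((host, port), timeout=5.0) as sock:
        # A newline marks the end of an example.
        sock.sendall(example.rstrip("\n").encode("utf-8") + b"\n")
        # Wrap the socket in a file object so exactly one line can be read back.
        with sock.makefile("r", encoding="utf-8", newline="\n") as reader:
            return reader.readline().rstrip("\n")
```

With a daemon listening on the given port, predict_one(" abc-example| a b c") would return the prediction line for that example.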
In performance tests I (arielf) was able to sustain a throughput of about 50,000 requests+predictions per second on standard hardware on a simple (~20 feature) model.
Deployment note: since vw is already highly optimized for performance, the step of rewriting code to deploy a model in production is redundant. Using vw itself for deployment, with the same model that was produced during training, is the recommended way to deploy a model.
Here's a short howto:
Our little training set train.vw has only two examples. The features a, b, and c have a 0 label, and the features x, y, and z have a 1 label:
$ cat train.vw
0 example0| a b c
1 example1| x y z
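Lines in this format can also be generated programmatically. The sketch below is illustrative only: format_example is a hypothetical helper, and it relies on my understanding of the vw text format that a token touching the '|' is read as the tag, which is why a bare label is followed by a space.

```python
from typing import Iterable, Optional


def format_example(features: Iterable[str],
                   label: Optional[float] = None,
                   tag: Optional[str] = None) -> str:
    """Build one vw text-format line: '[label] [tag]| feature feature ...'.

    format_example is a hypothetical helper, not part of vw.
    """
    # Keep a space after the label so it is not mistaken for a tag (assumption
    # about the parser: the token directly before '|' is treated as the tag).
    head = "" if label is None else f"{label:g} "
    if tag is not None:
        # An unlabeled example still needs a leading space before its tag.
        head = (head or " ") + tag
    return f"{head}| " + " ".join(features)


print(format_example(["a", "b", "c"], label=0, tag="example0"))
print(format_example(["x", "y", "z"], label=1, tag="example1"))
```

Leaving out the label gives the unlabeled form used for prediction requests, for example format_example(["a", "b", "c"], tag="abc-example").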
Because the training set is tiny, we use 20 passes to achieve perfect convergence (0 progressive loss). We run a (default) squared-loss regression.
$ vw -c --passes 20 --holdout_off train.vw -f model.vw
final_regressor = model.vw
Num weight bits = 18
learning rate = 0.5
initial_t = 0
power_t = 0.5
decay_learning_rate = 1
using cache_file = train.vw.cache
num sources = 1
average   since     example  example  current  current  current
loss      last      counter  weight   label    predict  features
0.000000  0.000000        1      1.0   0.0000   0.0000         4
0.500000  1.000000        2      2.0   1.0000   0.0000         4
0.275101  0.050201        4      4.0   1.0000   0.7458         4
0.138850  0.002599        8      8.0   1.0000   0.9692         4
0.069436  0.000022       16     16.0   1.0000   0.9994         4
0.034718  0.000000       32     32.0   1.0000   1.0000         4
finished run
number of examples per pass = 2
passes used = 20
weighted example sum = 40
weighted label sum = 20
average loss = 0.0277744
best constant = 0.5
best constant's loss = 0.25
total feature number = 160
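The first two columns of the progress table are related: "average loss" is the running mean over all examples seen so far, while "since last" is the mean over the examples seen since the previous row. As a small check, the sketch below rebuilds the first column from the second. It assumes every example has weight 1, as in this run, so the example counter doubles as the weight; running_average is a name chosen for illustration.

```python
# (example counter, "since last" loss) pairs copied from the progress table.
rows = [(1, 0.000000), (2, 1.000000), (4, 0.050201),
        (8, 0.002599), (16, 0.000022), (32, 0.000000)]


def running_average(rows):
    """Rebuild the 'average loss' column from the 'since last' column.

    Assumes unit example weights, so counter differences equal weight sums.
    """
    total_loss, seen, averages = 0.0, 0, []
    for counter, since_last in rows:
        # Add the loss contributed by the examples seen since the previous row.
        total_loss += since_last * (counter - seen)
        seen = counter
        averages.append(total_loss / seen)
    return averages


for (counter, _), avg in zip(rows, running_average(rows)):
    print(f"{counter:>3d}  {avg:.6f}")
```

The printed averages agree with the table up to rounding of the six-digit inputs.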
Now that we have a model model.vw, we can start vw in daemon mode:
$ vw -i model.vw -t --daemon --quiet --port 26542
# Check that the daemon is running:
# By default we have one parent and 10 child processes
# The number of children (worker processes) can be changed with --num_children <N>
$ pgrep vw| wc -l
11
$ echo " abc-example| a b c" | netcat localhost 26542
0.000000 abc-example
$ echo " xyz-example| x y z" | netcat localhost 26542
1.000000 xyz-example
As expected, we got a prediction of 0 for a b c and a prediction of 1 for x y z.
The tags on the two examples are for illustrative purposes; they aren't mandatory.
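Because the tag is optional, a client that consumes these replies has to handle both "prediction tag" and a bare "prediction". A minimal sketch of such a parser follows; Prediction and parse_response are hypothetical names, and the sketch assumes the scalar prediction output shown on this page.

```python
from typing import NamedTuple, Optional


class Prediction(NamedTuple):
    value: float
    tag: Optional[str]


def parse_response(line: str) -> Prediction:
    """Split a reply such as '0.000000 abc-example' into value and tag."""
    parts = line.strip().split(maxsplit=1)
    if not parts:
        raise ValueError("empty response line")
    # The first token is the prediction; anything after it is the echoed tag.
    tag = parts[1] if len(parts) > 1 else None
    return Prediction(float(parts[0]), tag)


print(parse_response("1.000000 xyz-example"))
```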
There's no need to read the response after every request. Since network buffers are large, you may also send multiple examples and then read multiple responses. A newline character is both the input and the output record (example) separator.
$ echo '| a c
| b c
| y x
| z y x' | netcat localhost 26542
0.079382
0.079382
0.746049
1.000000
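The same send-many-then-read-many pattern can be sketched from a program instead of netcat. This is an illustrative sketch only: predict_many is a hypothetical helper, the timeout is an assumption, and the code assumes the batch is small enough to fit in the network buffers.

```python
import socket
from typing import List, Sequence


def predict_many(examples: Sequence[str], host: str = "localhost",
                 port: int = 26542) -> List[str]:
    """Send all examples first, then read one reply line per example.

    predict_many is a hypothetical helper; host, port and timeout are assumptions.
    """
    # One example per line: the newline is the record separator.
    payload = "".join(e.rstrip("\n") + "\n" for e in examples).encode("utf-8")
    with socket.create_connection((host, port), timeout=5.0) as sock:
        sock.sendall(payload)
        with sock.makefile("r", encoding="utf-8", newline="\n") as reader:
            # Replies arrive in the same order as the requests.
            return [reader.readline().rstrip("\n") for _ in examples]
```

For very large batches, interleaving writes and reads (or reading from a separate thread) avoids stalling once the socket buffers fill up.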
# This kills only the vw processes that listen on our port 26542
# On Ubuntu, 'pgrep' and 'pkill' are provided by the 'procps' package
$ pkill -9 -f 'vw.*--port 26542'
# Verify that it no longer runs:
$ pgrep vw | wc -l
0
Known issue: https://github.com/VowpalWabbit/vowpal_wabbit/issues/1597
TL;DR: The core issue is that daemon mode has no way to report errors.