DNNs can be deployed within a model serving framework such as TensorFlow Serving, a high-performance and highly scalable deployment architecture. All of the baseline implementations have exporters, though some customization may be required for user-defined models, for which we provide interfaces. The exporter transforms the model to include pre-processing and service-consumable output.
Currently we provide code to export and serve TensorFlow models when the backend is TensorFlow, and code to export ONNX models from PyTorch.
To export a model, use `export.py`. The important arguments to the exporter are:
- `--config`: configuration file for the trained model (see mead)
- `--task`: classify, tagger, seq2seq, or lm (the task the model was trained for, see baseline)
- `--model`: location of the saved model checkpoint
- `--output_dir`: directory where the exported model is saved
- `--backend`: if you overrode this parameter from the config in `mead-train`, pass the same value here
- `--is_remote`: if you set this to false, the model's vectorizers and metadata files are packaged in the same place as the exported model

Here is an example of exporting a saved model from training. Adjust for your model name:
mead-export --config config/sst2.json --model sst2-29858.zip
The last command will store the model at `./models/1` (note the version number). Please see `export.py` for more customization.
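If you want to sanity-check the export before serving it, you can load the SavedModel directly. Below is a minimal sketch, assuming the TensorFlow backend (as in the example above), a TensorFlow 2.x environment, and that `./models/1` is the directory `mead-export` just wrote:

```python
# Minimal sanity check of an exported SavedModel (assumes the TensorFlow
# backend and that ./models/1 is the directory mead-export just wrote).
import tensorflow as tf

loaded = tf.saved_model.load("./models/1")

# Print the serving signatures and their inputs so you can see what the
# exported graph expects; the names vary with the model and exporter settings.
for name, signature in loaded.signatures.items():
    print(name, signature.structured_input_signature)
```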
TODO: Update this section, it's quite old!
To serve the model, you must run TensorFlow Serving.
- Clone serving:
git clone --recursive https://github.com/tensorflow/serving
- Install bazel:
echo "deb [arch=amd64] http://storage.googleapis.com/bazel-apt stable jdk1.8" | sudo tee /etc/apt/sources.list.d/bazel.list
curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
sudo apt-get update && sudo apt-get install bazel
- Serve model using bazel (CPU version):
build:
cd serving
bazel build //tensorflow_serving/model_servers:tensorflow_model_server
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=8500 --model_name=test --model_config_file=/data/model-store/models.config
- Serve model using bazel (GPU version):
Add this to your `.bashrc`:
export TF_NEED_CUDA=1
build:
bazel build -c opt --config=cuda --spawn_strategy=standalone //tensorflow_serving/model_servers:tensorflow_model_server
- Start serving models:
bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=8500 --model_config_file=models.config
A sample models.config file looks like this:
model_config_list: {
config: {
name: "age",
base_path: "/data/model-store/age",
model_platform: "tensorflow"
}
}
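Once TensorFlow Serving is up, you can query it over gRPC. The sketch below assumes the `tensorflow-serving-api` and `grpcio` packages are installed; the model name matches the `models.config` above, but the input tensor name (`tokens`) is an assumption, so substitute the input names from your exported signature:

```python
# Sketch of a gRPC Predict call against TensorFlow Serving (assumes the
# tensorflow-serving-api and grpcio packages are installed; the input name
# "tokens" is an assumption -- check your exported signature for real names).
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "age"  # must match the name in models.config
request.inputs["tokens"].CopyFrom(
    tf.make_tensor_proto([["the", "quick", "brown", "fox"]], dtype=tf.string)
)

response = stub.Predict(request, timeout=10.0)
print(response.outputs)
```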
TODO: This is way out of date
You can alternatively test your model by running a Docker image pulled from Epigram AI: `docker pull epigramai/model-server:light-1.5`.
Once you have an exported model and the Docker image pulled, you can load it in a Docker container as follows:
docker run -it --name tf_classify -p 9000:9000 -v $(pwd)/models:/models/ epigramai/model-server:light --port=9000 --model_name=tf_classify --model_base_path=/models/
To export a PyTorch model to ONNX, use `mead-export` as usual. If you want to run the model locally with the api-examples, set the `--is_remote` parameter to false.
mead-export --config config/conll.json --model tagger-31405.zip --name conll-iobes --use_version false --zip true --is_remote False
The options above tell `mead-export` to ignore versioning and generate a fully self-contained ONNX bundle in zip format.
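Because the export is a single self-contained zip, you can check what ended up inside it with the standard library. A small sketch; the bundle path below is an example, so use the zip file `mead-export` actually produced:

```python
# List the contents of the exported ONNX bundle (the path is an example --
# point it at the zip file mead-export actually wrote).
import zipfile

with zipfile.ZipFile("models/conll-iobes.zip") as bundle:
    for member in bundle.namelist():
        print(member)
```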
The API examples `classify-text` and `tag-text` use the `baseline.Service` classes to load and run models that are either remote or local. They can run ONNX models if you pass `--backend onnx` to them. To do this, you need the `onnxruntime` package installed.
You can install it manually like this:
pip install onnxruntime
Or you can install it as a dependency of mead like this:
pip install mead-baseline[onnx]
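After installing, you can confirm that `onnxruntime` is able to load an exported model file and report its expected inputs. A minimal sketch; the `.onnx` path is an example and the input names depend on your model:

```python
# Quick check that onnxruntime can load the exported graph and list the
# inputs it expects (the path is an example -- point it at your export).
import onnxruntime as ort

session = ort.InferenceSession("models/conll-iobes/tagger-model.onnx")
for model_input in session.get_inputs():
    print(model_input.name, model_input.shape, model_input.type)
```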
To run the checkpoint we just exported:
(base) dpressel@dpressel-CORSAIR-ONE:~/dev/work/baseline/api-examples$ python tag_text.py --model ../mead/models/conll-iobes-11335.zip --text "Mr. Jones flew to New York ." --backend onnx
/tmp/43cd89440a7111ce9180faee69b1e75aac71c661/conll-iobes/tagger-model-31405.onnx
/tmp/43cd89440a7111ce9180faee69b1e75aac71c661/conll-iobes/vocabs-word-31405.json
/tmp/43cd89440a7111ce9180faee69b1e75aac71c661/conll-iobes/vocabs-char-31405.json
/tmp/43cd89440a7111ce9180faee69b1e75aac71c661/conll-iobes/vocabs-senna-31405.json
2020-05-08 13:45:41.131100225 [W:onnxruntime:, graph.cc:2413 CleanUnusedInitializers] Removing initializer 'embeddings.embeddings.2.proj.bias'. It is not used by any node and should be removed from the model.
2020-05-08 13:45:41.131122836 [W:onnxruntime:, graph.cc:2413 CleanUnusedInitializers] Removing initializer 'embeddings.embeddings.2.proj.weight'. It is not used by any node and should be removed from the model.
Mr. O
Jones S-PER
flew O
to O
New B-LOC
York E-LOC
. O
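The same thing can be done programmatically with the `baseline.Service` classes that the API examples wrap. The sketch below mirrors the `tag_text.py` invocation above; the keyword names follow that script's command-line flags and should be treated as assumptions if your version differs:

```python
# Programmatic equivalent of the tag_text.py call above. This is a sketch:
# the load() keywords mirror the script's flags and are assumptions -- check
# tag_text.py in api-examples for the exact interface in your version.
from baseline.services import TaggerService

tagger = TaggerService.load(
    "models/conll-iobes-11335.zip",  # the exported bundle from mead-export
    backend="onnx",
)

tokens = [["Mr.", "Jones", "flew", "to", "New", "York", "."]]
for sentence in tagger.predict(tokens):
    for token in sentence:
        print(token)
```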
To export for the Triton Inference Server, we tell `mead-export` to split the exported components into two sub-directories, one for the server and one for the client. During export, we also generate a sub-directory with the version information underneath each of the two areas:
mead-export --config config/conll.json --backend pytorch \
--model tagger-14444.zip \
--name conll-remote \
--use_version true
(base) dpressel@dpressel:~/dev/work/baseline/mead/models$ ls -R
.:
client server
./client:
conll-remote
./client/conll-remote:
1
./client/conll-remote/1:
model.assets vectorizers-14444.pkl vocabs-word-14444.json
tagger-model-14444.labels vocabs-char-14444.json
vectorizers-14444.json vocabs-senna-14444.json
./server:
conll-remote
./server/conll-remote:
1
./server/conll-remote/1:
model.onnx
We can launch the Triton server using Docker to test things out.
export VERSION=21.04
docker run --rm --network host -p8000:8000 -p8001:8001 -p8002:8002 -v/path/to/mead-baseline/mead/models/server:/models nvcr.io/nvidia/tritonserver:$VERSION-py3 tritonserver --model-repository=/models --strict-model-config=false
Triton will look for the `model.onnx` file under the server-side sub-directory and load it.
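Before pointing the API examples at the server, you can verify that Triton is live and actually loaded the model. A hedged sketch using the `tritonclient` package (installable with `pip install tritonclient[grpc]`); the model name matches the export above and 8001 is Triton's default gRPC port:

```python
# Check that the Triton server is live and that it loaded the exported model
# (assumes the tritonclient[grpc] package is installed).
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="0.0.0.0:8001")
print("server live:", client.is_server_live())
print("model ready:", client.is_model_ready("conll-remote"))
```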
Once the server is up and running, we can test it using the API examples as before, but this time we will give it a URL to hit:
python tag_text.py \
--model /path/to/mead-baseline/mead/models/client/conll-remote/1 \
--text "Baldwin ( 10-4 ) struck out four and did not walk a batter for Chicago , which won for only the fourth time in 15 games ." \
--backend onnx \
--remote 0.0.0.0:8001 \
--name conll-remote
The following MEAD Baseline models have been tested with ONNX:
Task | PyTorch | Encoder | Decoder | ORT Version |
---|---|---|---|---|
Tagger | 1.4+ | BiLSTM | Greedy | 1.2.0 |
Tagger | 1.4+ | BiLSTM | CRF | 1.2.0 |
Classifier | 1.4+ | CNN | N/A | 1.2.0 |
Classifier | 1.4+ | LSTM | N/A | 1.2.0 |
Classifier | 1.4+ | Fine-Tune BERT | N/A | 1.2.0 |
A few command-line arguments to `mead-export` can be used to control what the output paths look like. The first is `--output_dir`, which controls the root of the exported model. The argument `--project` creates a sub-directory for the model, and `--name` can be used to create a second one. `--is_remote` splits the export into `client` and `server` trees. When neither `--project` nor `--name` is given, a directory with the same name as the basename of `--output_dir` is created. `--model_version` is the last directory. Here are a few examples:
mead-export ... --output_dir path/to/models --is_remote true --model_version 3
-> path/to/models/client/models/3 and path/to/models/server/models/3
mead-export ... --output_dir path/to/models --is_remote false --model_version 19
-> path/to/models/19
mead-export ... --output_dir path/to/models --project proj --model_version 2
-> path/to/models/client/proj/2 and path/to/models/server/proj/2
mead-export ... --output_dir path/to/models --project example --name bob --model_version 1 --is_remote false
-> path/to/models/example/bob/1
`--model_version` can be set to `None`, in which case the exporter searches the export directory and uses a version number one greater than the largest one currently there.
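The rules above can be summarized in a small helper. This is purely illustrative (the function below is hypothetical and not part of `mead-export`); it just reproduces the directory layout described in the examples:

```python
# Illustrative only: a hypothetical helper that reproduces the output-path
# rules described above (it is not part of mead-export).
import os


def export_paths(output_dir, project=None, name=None, is_remote=True, model_version="1"):
    """Return the directories an export would land in under these rules."""
    parts = [p for p in (project, name) if p is not None]
    if is_remote and not parts:
        # With no --project/--name, the basename of --output_dir is reused.
        parts = [os.path.basename(output_dir)]
    leaf = os.path.join(*(parts + [str(model_version)]))
    if is_remote:
        return [os.path.join(output_dir, side, leaf) for side in ("client", "server")]
    return [os.path.join(output_dir, leaf)]


# Matches the first example above:
print(export_paths("path/to/models", is_remote=True, model_version=3))
# -> ['path/to/models/client/models/3', 'path/to/models/server/models/3']
```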