```
# file tree of the model repository
├── <model_repository>/
│   ├── <model_name_1>/     # the directory of a model, containing its parameters and configuration.
│   │   ├── config.pbtxt    # the model config; its parameters are explained below.
│   │   ├── <model_file>    # the model parameter file. You can simply replace it with your own
│   │   │                   # model and change ${default_model_filename} in config.pbtxt accordingly.
│   │   └── 1/              # this empty directory is required by tritonserver.
│   ├── <model_name_2>/     # ...
│   │   ├── config.pbtxt    # ...
│   │   ├── <model_file>    # ...
│   │   └── 1/              # ...
│   └── <model_name_vid>/   # more models...
```
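For instance, a repository containing a single Transformer model could be created like this (a minimal sketch; the directory name `transformer` and the file name `transformer.pb` are placeholders, not names required by tritonserver):

```
$ mkdir -p <model_repository>/transformer/1    # the empty version directory required by tritonserver
$ cp /path/to/your/exported/transformer.pb <model_repository>/transformer/
# then write <model_repository>/transformer/config.pbtxt as described below
```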
- The meaning of the parameters in config.pbtxt (more information can be found in Model Config of tritonbackend):
  - `${name}`: the name of the model, which should be the same as `<model_name_vid>`.
  - `${backend}`: fixed value `"lightseq"`, which is used to locate the dynamic link library of the tritonbackend, `libtriton_lightseq.so`.
  - `${default_model_filename}`: the name of the model file, which should be the same as `<model_file>`.
  - `${parameters - value - string_value}`: the type of the model, which must be supported by LightSeq. You can choose one of `Transformer`, `QuantTransformer`, `Bert`, `Gpt`, or `Moe`.
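Putting these together, a config.pbtxt for the hypothetical `transformer` model above might look like the following sketch (the parameter key `model_type` and all field values are illustrative assumptions; Example Of Triton Model Config is the authoritative reference and may declare additional fields such as inputs and outputs):

```
name: "transformer"                       # must match the model directory name
backend: "lightseq"                       # fixed value, selects libtriton_lightseq.so
default_model_filename: "transformer.pb"  # must match <model_file>
parameters {
  key: "model_type"                       # assumed key; check the linked example for the exact name
  value {
    string_value: "Transformer"           # one of the supported model types listed above
  }
}
```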
- You can see an example in Example Of Triton Model Config, and you can find more detailed information in Model Config Of Tritonserver.
- The model files needed by Example Of Triton Model Config can be found in Examples of exporting models for LightSeq inference. You can also export your own model; the steps are available in How to export your own model.
Get the tritonserver Docker image (see Tritonserver Quickstart):

```
$ sudo docker build -t <docker_image_name> - < <repository_root>/docker/tritonserver/Dockerfile

# Or you can simply pull the image we have compiled in advance;
# choose a suitable version by replacing `22.01-1` with <tag_name>.
$ sudo docker pull hexisyztem/tritonserver_lightseq:22.01-1
```
- We provide a Dockerfile because LightSeq needs a dynamic link library that is not included in nvcr.io/nvidia/tritonserver:22.01-py3. If necessary, you can set http_proxy/https_proxy to speed up compilation.
- The structure of the image's file tree is shown below:

```
# file tree of tritonserver in the docker image; users can ignore this part.
├── /opt/tritonserver/
│   ├── backends/                      # the directory where backends' dynamic link libraries
│   │   │                              # are stored by default.
│   │   ├── lightseq/                  # the directory of lightseq's tritonbackend.
│   │   │   └── libtriton_lightseq.so  # the dynamic link library of lightseq's tritonbackend.
│   │   └── <other_backends...>        # other directories, unnecessary for lightseq...
│   ├── lib/                           # ...
│   │   ├── libliblightseq.so          # lightseq's own dynamic link library, required by the backend.
│   │   └── libtritonserver.so         # ...
│   ├── bin/                           # ...
│   │   └── tritonserver               # the executable file of tritonserver.
│   └── <other_directories...>         # ...
```
Docker Commands:

```
$ sudo docker run --gpus=<num_of_gpus> --rm -p<http_port>:<http_port> -p<grpc_port>:<grpc_port> \
    -v<model_repository>:/models <docker_image_name> \
    tritonserver --model-repository=/models --http-port=<http_port> --grpc-port=<grpc_port>
```
- `<num_of_gpus>`: int, the number of GPUs needed by tritonserver.
- `<model_repository>`: str, the path of the model repository that you organized yourself.
- `<http_port>`: int, the port tritonserver listens on for HTTP requests; the client example below reads it from `HTTP_PORT`.
- `<grpc_port>`: int, the port tritonserver listens on for gRPC requests.
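For example, with one GPU, the prebuilt image, and Triton's conventional ports 8000/8001 (all values here are illustrative):

```
$ sudo docker run --gpus=1 --rm -p8000:8000 -p8001:8001 \
    -v/path/to/model_repository:/models hexisyztem/tritonserver_lightseq:22.01-1 \
    tritonserver --model-repository=/models --http-port=8000 --grpc-port=8001

# once the server is up, Triton's standard readiness endpoint should return HTTP 200
$ curl -v localhost:8000/v2/health/ready
```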
Install client requirements:

```
$ pip install tritonclient[all]
```
Run the client example:

```
$ export HTTP_PORT=<http_port> && python3 transformer_client_example.py
```
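If you would rather write your own client than use transformer_client_example.py, a minimal HTTP client sketch with tritonclient might look like the following (the model name `transformer`, the tensor names `source_ids`/`target_ids`, and the dtype are assumptions for illustration; read the real ones from your model's config.pbtxt or from the server's model metadata):

```python
import os

import numpy as np
import tritonclient.http as httpclient

# connect to the HTTP endpoint exposed by the docker run command above
url = "localhost:" + os.environ.get("HTTP_PORT", "8000")
client = httpclient.InferenceServerClient(url=url)

# NOTE: tensor name, shape, and dtype are illustrative assumptions;
# query the real ones with client.get_model_metadata("transformer")
src = np.array([[4, 15, 33, 2]], dtype=np.int32)  # a fake tokenized sentence
inputs = [httpclient.InferInput("source_ids", list(src.shape), "INT32")]
inputs[0].set_data_from_numpy(src)

outputs = [httpclient.InferRequestedOutput("target_ids")]
result = client.infer(model_name="transformer", inputs=inputs, outputs=outputs)
print(result.as_numpy("target_ids"))
```

The gRPC client (`tritonclient.grpc`) exposes the same calls against `<grpc_port>` if you prefer gRPC.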