From bd68fbe7a32ab0fe5075b84cc6909d5d92a3fcbe Mon Sep 17 00:00:00 2001
From: ZiniuYu
Date: Thu, 14 Jul 2022 17:46:03 +0800
Subject: [PATCH 1/3] docs: improve model support

---
 docs/user-guides/server.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/user-guides/server.md b/docs/user-guides/server.md
index 28e46c934..fb858aa03 100644
--- a/docs/user-guides/server.md
+++ b/docs/user-guides/server.md
@@ -61,7 +61,7 @@ The procedure and UI of ONNX and TensorRT runtime would look the same as Pytorch

 ## Model support

-OpenAI has released 9 models so far. `ViT-B/32` is used as the default model in all runtimes. Due to the limitations of some runtimes, not every runtime supports all nine models. Please also note that different models produce different output dimensions. This will affect your downstream applications: for example, switching from one model to another makes your embeddings incomparable, which breaks those applications. Below is a list of the models supported by each runtime and their corresponding sizes. We include the disk usage (in delta) and the peak RAM and VRAM usage (in delta) when running on a single Nvidia TITAN RTX GPU (24GB VRAM) using the default `minibatch_size=32` on the server with the PyTorch runtime and the default `batch_size=8` on the client.
+OpenAI has released 9 models so far. `ViT-B/32` is used as the default model in all runtimes. Due to the limitations of some runtimes, not every runtime supports all nine models. Please also note that different models produce different output dimensions. This will affect your downstream applications: for example, switching from one model to another makes your embeddings incomparable, which breaks those applications. Below is a list of the models supported by each runtime and their corresponding sizes. We include the disk usage (in delta) and the peak RAM and VRAM usage (in delta) when running on a single Nvidia TITAN RTX GPU (24GB VRAM, driver version 510.47.03, CUDA version 11.6) to encode a list of images with `batch_size=8` using the PyTorch runtime.

 | Model          | PyTorch | ONNX | TensorRT | Output Dimension | Disk Usage (MB) | Peak RAM Usage (GB) | Peak VRAM Usage (GB) |
 |----------------|---------|------|----------|------------------|-----------------|---------------------|----------------------|

From 83221b7511abd7ad16105683fa03d92ece45bca9 Mon Sep 17 00:00:00 2001
From: ZiniuYu
Date: Thu, 14 Jul 2022 17:53:29 +0800
Subject: [PATCH 2/3] docs: improve model support

---
 docs/user-guides/server.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/user-guides/server.md b/docs/user-guides/server.md
index fb858aa03..d1d8a2777 100644
--- a/docs/user-guides/server.md
+++ b/docs/user-guides/server.md
@@ -61,7 +61,7 @@ The procedure and UI of ONNX and TensorRT runtime would look the same as Pytorch

 ## Model support

-OpenAI has released 9 models so far. `ViT-B/32` is used as the default model in all runtimes. Due to the limitations of some runtimes, not every runtime supports all nine models. Please also note that different models produce different output dimensions. This will affect your downstream applications: for example, switching from one model to another makes your embeddings incomparable, which breaks those applications. Below is a list of the models supported by each runtime and their corresponding sizes. We include the disk usage (in delta) and the peak RAM and VRAM usage (in delta) when running on a single Nvidia TITAN RTX GPU (24GB VRAM, driver version 510.47.03, CUDA version 11.6) to encode a list of images with `batch_size=8` using the PyTorch runtime.
+OpenAI has released 9 models so far. `ViT-B/32` is used as the default model in all runtimes. Due to the limitations of some runtimes, not every runtime supports all nine models. Please also note that different models produce different output dimensions. This will affect your downstream applications: for example, switching from one model to another makes your embeddings incomparable, which breaks those applications. Below is a list of the models supported by each runtime and their corresponding sizes. We include the disk usage (in delta) and the peak RAM and VRAM usage (in delta) when running on a single Nvidia TITAN RTX GPU (24GB VRAM) to encode a list of images with `batch_size=8` using the PyTorch runtime.

 | Model          | PyTorch | ONNX | TensorRT | Output Dimension | Disk Usage (MB) | Peak RAM Usage (GB) | Peak VRAM Usage (GB) |
 |----------------|---------|------|----------|------------------|-----------------|---------------------|----------------------|

From 83980f2dbeacf795fe2dd541e6b01b752102dfe8 Mon Sep 17 00:00:00 2001
From: ZiniuYu
Date: Fri, 15 Jul 2022 17:53:15 +0800
Subject: [PATCH 3/3] docs: improve narratives

---
 docs/user-guides/server.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/user-guides/server.md b/docs/user-guides/server.md
index d1d8a2777..740ac43b3 100644
--- a/docs/user-guides/server.md
+++ b/docs/user-guides/server.md
@@ -61,7 +61,7 @@ The procedure and UI of ONNX and TensorRT runtime would look the same as Pytorch

 ## Model support

-OpenAI has released 9 models so far. `ViT-B/32` is used as the default model in all runtimes. Due to the limitations of some runtimes, not every runtime supports all nine models. Please also note that different models produce different output dimensions. This will affect your downstream applications: for example, switching from one model to another makes your embeddings incomparable, which breaks those applications. Below is a list of the models supported by each runtime and their corresponding sizes. We include the disk usage (in delta) and the peak RAM and VRAM usage (in delta) when running on a single Nvidia TITAN RTX GPU (24GB VRAM) to encode a list of images with `batch_size=8` using the PyTorch runtime.
+OpenAI has released 9 models so far. `ViT-B/32` is used as the default model in all runtimes. Due to the limitations of some runtimes, not every runtime supports all nine models. Please also note that different models produce different output dimensions. This will affect your downstream applications: for example, switching from one model to another makes your embeddings incomparable, which breaks those applications. Below is a list of the models supported by each runtime and their corresponding sizes. We include the disk usage (in delta) and the peak RAM and VRAM usage (in delta) when running on a single Nvidia TITAN RTX GPU (24GB VRAM) for a series of text and image encoding tasks with `batch_size=8` using the PyTorch runtime.

 | Model          | PyTorch | ONNX | TensorRT | Output Dimension | Disk Usage (MB) | Peak RAM Usage (GB) | Peak VRAM Usage (GB) |
 |----------------|---------|------|----------|------------------|-----------------|---------------------|----------------------|
@@ -73,7 +73,7 @@ OpenAI has released 9 models so far. `ViT-B/32` is used as the default model in
 | ViT-B/32       | ✅      | ✅   | ✅       | 512              | 351             | 3.20                | 1.40                 |
 | ViT-B/16       | ✅      | ✅   | ✅       | 512              | 354             | 3.20                | 1.44                 |
 | ViT-L/14       | ✅      | ✅   | ❌       | 768              | 933             | 3.66                | 2.04                 |
-| ViT-L/14-336px | ✅      | ✅   | ❌       | 768              | 934             | 3.74                | 2.23                 |
+| ViT-L/14@336px | ✅      | ✅   | ❌       | 768              | 934             | 3.74                | 2.23                 |

 ## YAML config
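The incompatibility between models that the patched paragraph warns about can be illustrated with a minimal sketch (NumPy, with random vectors standing in for real CLIP embeddings — the shapes 512 and 768 match the `ViT-B/32` and `ViT-L/14` rows of the table, everything else is hypothetical):

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity; only defined for vectors of equal dimension."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    if a.shape != b.shape:
        # Embeddings from models with different output dimensions
        # cannot be compared directly.
        raise ValueError(f"incomparable embeddings: {a.shape} vs {b.shape}")
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-ins: ViT-B/32 outputs 512-dim vectors, ViT-L/14 outputs 768-dim.
emb_b32 = np.random.default_rng(0).normal(size=512)
emb_l14 = np.random.default_rng(1).normal(size=768)

try:
    cosine_sim(emb_b32, emb_l14)
except ValueError as e:
    print(e)  # switching models mid-pipeline breaks downstream comparison
```

This is why re-indexing is needed after changing the served model: any stored embeddings keep the old dimensionality and cannot be scored against the new ones.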