Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same #873

Closed
TopTea1 opened this issue Dec 7, 2022 · 9 comments · Fixed by #882

TopTea1 commented Dec 7, 2022

Hi, I'm using the Docker image clip-server:master on a CUDA GPU. When I try to run the basic example:

from clip_client import Client

c = Client('grpc://0.0.0.0:51000')
r = c.encode([''])

print(r.shape) 

with this config:

jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      metas:
        py_modules:
          - clip_server.executors.clip_torch
      with:
        name: ViT-L-14-336::openai

I get this error:

jina.excepts.BadServer: request_id: "d1a940f074264c97b10891b885e4c8a8"
status {
  code: ERROR
  description: "RuntimeError(\'Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should 
be the same\')"
  exception {
    name: "RuntimeError"
    args: "Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same"
    stacks: "Traceback (most recent call last):\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/runtimes/worker/__init__.py\", line 222, in
process_data\n    result = await self._request_handler.handle(\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/runtimes/worker/request_handling.py\", line
291, in handle\n    return_data = await self._executor.__acall__(\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/executors/__init__.py\", line 354, in 
__acall__\n    return await self.__acall_endpoint__(__default_endpoint__, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/executors/__init__.py\", line 401, in 
__acall_endpoint__\n    return await exec_func(\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/executors/__init__.py\", line 366, in 
exec_func\n    return await func(self, tracing_context=tracing_context, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/executors/decorators.py\", line 173, in 
arg_wrapper\n    return await fn(executor_instance, *args, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/clip_server/executors/clip_torch.py\", line 180, in 
encode\n    self._model.encode_image(**batch_data)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/clip_server/model/openclip_model.py\", line 64, in 
encode_image\n    return self._model.encode_image(pixel_values)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/open_clip/model.py\", line 182, in encode_image\n    
features = self.visual(image)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py\", line 1190, in 
_call_impl\n    return forward_call(*input, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/open_clip/transformer.py\", line 304, in forward\n    
x = self.conv1(x)  # shape = [*, width, grid, grid]\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py\", line 1190, in 
_call_impl\n    return forward_call(*input, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py\", line 463, in forward\n    
return self._conv_forward(input, self.weight, self.bias)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/conv.py\", line 459, in 
_conv_forward\n    return F.conv2d(input, weight, bias, self.stride,\n"
    stacks: "RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be 
the same\n"
    executor: "CLIPEncoder"
  }
}
exec_endpoint: "/encode"
target_executor: ""

Do you have any ideas on how to solve this issue?

Thanks for your help

Member

ZiniuYu commented Dec 7, 2022

Hi @TopTea1, thanks for reporting this!

This is a known issue and we are fixing it.
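
For readers decoding the message: `torch.cuda.FloatTensor` is a float32 tensor on the GPU and `torch.cuda.HalfTensor` is a float16 tensor on the GPU, so the input reached the first convolution in full precision while the model weights were in half precision. The small helper below is only an illustrative sketch (the function name `explain_mismatch` and the lookup table are made up for this note, not part of clip_server or PyTorch); it pulls the two types out of the error text:

```python
import re

# Legacy tensor type names that appear in PyTorch error messages,
# mapped to the dtype they stand for.
TENSOR_TYPE_TO_DTYPE = {
    "HalfTensor": "float16",
    "FloatTensor": "float32",
    "DoubleTensor": "float64",
    "BFloat16Tensor": "bfloat16",
}

_PATTERN = re.compile(
    r"Input type \(torch\.(?P<in_dev>cuda\.)?(?P<in_type>\w+)\) "
    r"and weight type \(torch\.(?P<w_dev>cuda\.)?(?P<w_type>\w+)\)"
)


def explain_mismatch(message):
    """Return (input_device, input_dtype, weight_device, weight_dtype), or None."""
    match = _PATTERN.search(message)
    if match is None:
        return None

    def device(prefix):
        # A "cuda." prefix in the type name means the tensor lives on the GPU.
        return "cuda" if prefix else "cpu"

    return (
        device(match["in_dev"]),
        TENSOR_TYPE_TO_DTYPE.get(match["in_type"], match["in_type"]),
        device(match["w_dev"]),
        TENSOR_TYPE_TO_DTYPE.get(match["w_type"], match["w_type"]),
    )


message = (
    "Input type (torch.cuda.FloatTensor) and weight type "
    "(torch.cuda.HalfTensor) should be the same"
)
print(explain_mismatch(message))  # ('cuda', 'float32', 'cuda', 'float16')
```

The 0.8.1 traceback further down shows the bundled model casting the input first (`self.visual(image.type(self.dtype))`), while the master traceback calls `self.visual(image)` directly, which is consistent with the mismatch reported here. Whether that is exactly what #882 changed is not stated in this thread.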

Author

TopTea1 commented Dec 7, 2022

Thanks for your feedback

Member

ZiniuYu commented Dec 7, 2022

@TopTea1 You can use our pre-built Docker image to work around the error, like this:

jtype: Flow
with:
  port: 51000
executors:
  - name: clip_t
    uses: jinahub+docker://CLIPTorchEncoder/0.8.1
    uses_with:
      name: ViT-L-14-336::openai

Author

TopTea1 commented Dec 7, 2022

Thanks @ZiniuYu, but when I try version 0.8.1 from Docker Hub, I get this error:

jina.excepts.BadServer: request_id: "3732bde0aa46429da0be6f4638c50b08"
status {
  code: ERROR
  description: "RuntimeError(\'CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasGemmStridedBatchedExFix(
handle, opa, opb, m, n, k, (void*)(&falpha), a, CUDA_R_16F, lda, stridea, b, CUDA_R_16F, ldb, strideb, 
(void*)(&fbeta), c, CUDA_R_16F, ldc, stridec, num_batches, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`\')"
  exception {
    name: "RuntimeError"
    args: "CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasGemmStridedBatchedExFix( handle, opa, opb, 
m, n, k, (void*)(&falpha), a, CUDA_R_16F, lda, stridea, b, CUDA_R_16F, ldb, strideb, (void*)(&fbeta), c, 
CUDA_R_16F, ldc, stridec, num_batches, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`"
    stacks: "Traceback (most recent call last):\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/runtimes/worker/__init__.py\", line 219, in
process_data\n    result = await self._data_request_handler.handle(\n"
    stacks: "  File 
\"/usr/local/lib/python3.8/dist-packages/jina/serve/runtimes/request_handlers/data_request_handler.py\", line 228, 
in handle\n    return_data = await self._executor.__acall__(\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/executors/__init__.py\", line 329, in 
__acall__\n    return await self.__acall_endpoint__(__default_endpoint__, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/executors/__init__.py\", line 378, in 
__acall_endpoint__\n    return await exec_func(\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/executors/__init__.py\", line 339, in 
exec_func\n    return await func(self, tracing_context=tracing_context, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/executors/decorators.py\", line 153, in 
arg_wrapper\n    return await fn(executor_instance, *args, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/clip_server/executors/clip_torch.py\", line 140, in 
encode\n    self._model.encode_image(**batch_data)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/clip_server/model/openclip_model.py\", line 51, in 
encode_image\n    return self._model.encode_image(pixel_values)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/clip_server/model/model.py\", line 591, in 
encode_image\n    return self.visual(image.type(self.dtype))\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py\", line 1190, in 
_call_impl\n    return forward_call(*input, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/clip_server/model/model.py\", line 428, in forward\n  
x = self.transformer(x)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py\", line 1190, in 
_call_impl\n    return forward_call(*input, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/clip_server/model/model.py\", line 353, in forward\n  
x = r(x, attn_mask=attn_mask)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py\", line 1190, in 
_call_impl\n    return forward_call(*input, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/clip_server/model/model.py\", line 322, in forward\n  
x = x + self.ln_attn(self.attention(self.ln_1(x), attn_mask=attn_mask))\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/clip_server/model/model.py\", line 317, in attention\n
return self.attn(x, x, x, need_weights=False, attn_mask=attn_mask)[0]\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py\", line 1190, in 
_call_impl\n    return forward_call(*input, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/activation.py\", line 1167, in 
forward\n    attn_output, attn_output_weights = F.multi_head_attention_forward(\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py\", line 5160, in 
multi_head_attention_forward\n    attn_output_weights = torch.bmm(q_scaled, k.transpose(-2, -1))\n"
    stacks: "RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasGemmStridedBatchedExFix( 
handle, opa, opb, m, n, k, (void*)(&falpha), a, CUDA_R_16F, lda, stridea, b, CUDA_R_16F, ldb, strideb, 
(void*)(&fbeta), c, CUDA_R_16F, ldc, stridec, num_batches, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`\n"
    executor: "CLIPEncoder"
  }
}
exec_endpoint: "/encode"
target_executor: ""

Member

ZiniuYu commented Dec 8, 2022

Hi @TopTea1, can you try again with jinahub+docker://CLIPTorchEncoder/0.8.1-gpu?
What's the output of nvidia-smi?

Author

TopTea1 commented Dec 8, 2022

Hi @ZiniuYu, thanks, it's working with this version. I was using the wrong version in my last comment.

Member

ZiniuYu commented Dec 8, 2022

Glad to see it works 🍻
You can also give the main branch another try! The problem you ran into should be fixed now.

Author

TopTea1 commented Dec 8, 2022

I tested with the new master image and got this error:

jina.excepts.BadServer: request_id: "98fc908ccd2944fa8221d5fed3f420f6"
status {
  code: ERROR
  description: "RuntimeError(\'CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasGemmStridedBatchedExFix(
handle, opa, opb, m, n, k, (void*)(&falpha), a, CUDA_R_16F, lda, stridea, b, CUDA_R_16F, ldb, strideb, 
(void*)(&fbeta), c, CUDA_R_16F, ldc, stridec, num_batches, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`\')"
  exception {
    name: "RuntimeError"
    args: "CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasGemmStridedBatchedExFix( handle, opa, opb, 
m, n, k, (void*)(&falpha), a, CUDA_R_16F, lda, stridea, b, CUDA_R_16F, ldb, strideb, (void*)(&fbeta), c, 
CUDA_R_16F, ldc, stridec, num_batches, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`"
    stacks: "Traceback (most recent call last):\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/runtimes/worker/__init__.py\", line 222, in
process_data\n    result = await self._request_handler.handle(\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/runtimes/worker/request_handling.py\", line
291, in handle\n    return_data = await self._executor.__acall__(\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/executors/__init__.py\", line 354, in 
__acall__\n    return await self.__acall_endpoint__(__default_endpoint__, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/executors/__init__.py\", line 401, in 
__acall_endpoint__\n    return await exec_func(\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/executors/__init__.py\", line 366, in 
exec_func\n    return await func(self, tracing_context=tracing_context, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/jina/serve/executors/decorators.py\", line 173, in 
arg_wrapper\n    return await fn(executor_instance, *args, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/clip_server/executors/clip_torch.py\", line 194, in 
encode\n    self._model.encode_image(**batch_data)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/clip_server/model/openclip_model.py\", line 64, in 
encode_image\n    return self._model.encode_image(pixel_values)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/open_clip/model.py\", line 182, in encode_image\n    
features = self.visual(image)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py\", line 1190, in 
_call_impl\n    return forward_call(*input, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/clip_server/model/model.py\", line 88, in forward\n   
return super().forward(x)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/open_clip/transformer.py\", line 314, in forward\n    
x = self.transformer(x)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py\", line 1190, in 
_call_impl\n    return forward_call(*input, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/open_clip/transformer.py\", line 230, in forward\n    
x = r(x, attn_mask=attn_mask)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py\", line 1190, in 
_call_impl\n    return forward_call(*input, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/open_clip/transformer.py\", line 154, in forward\n    
x = x + self.ls_1(self.attention(self.ln_1(x), attn_mask=attn_mask))\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/open_clip/transformer.py\", line 151, in attention\n  
return self.attn(x, x, x, need_weights=False, attn_mask=attn_mask)[0]\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py\", line 1190, in 
_call_impl\n    return forward_call(*input, **kwargs)\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/modules/activation.py\", line 1167, in 
forward\n    attn_output, attn_output_weights = F.multi_head_attention_forward(\n"
    stacks: "  File \"/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py\", line 5160, in 
multi_head_attention_forward\n    attn_output_weights = torch.bmm(q_scaled, k.transpose(-2, -1))\n"
    stacks: "RuntimeError: CUDA error: CUBLAS_STATUS_INVALID_VALUE when calling `cublasGemmStridedBatchedExFix( 
handle, opa, opb, m, n, k, (void*)(&falpha), a, CUDA_R_16F, lda, stridea, b, CUDA_R_16F, ldb, strideb, 
(void*)(&fbeta), c, CUDA_R_16F, ldc, stridec, num_batches, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`\n"
    executor: "CLIPEncoder"
  }
}
exec_endpoint: "/encode"
target_executor: ""

Here is the output of nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  On   | 00000000:18:00.0 Off |                    0 |
| N/A   35C    P0    37W / 250W |   2125MiB / 32768MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE...  On   | 00000000:3B:00.0 Off |                    0 |
| N/A   30C    P0    26W / 250W |      4MiB / 32768MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-PCIE...  On   | 00000000:86:00.0 Off |                    0 |
| N/A   52C    P0    65W / 250W |   3515MiB / 16384MiB |     65%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    683662      C   python                           2121MiB |
+-----------------------------------------------------------------------------+

Author

TopTea1 commented Dec 9, 2022

To be more precise, I tested this Docker image (https://hub.docker.com/r/jinaai/clip-server) in both comments. In the first attempt (#873 (comment)) I used the image with the tag 0.8.1, and in the second (#873 (comment)) I used the tag master. The command and the config I used to start the container are:

cat clip_config.yml | CUDA_VISIBLE_DEVICES=1 docker run -i   -p 51000:51000 -v $HOME/jina/.cache:/home/cas/.cache --gpus all jinaai/clip-server:master -i

And in clip_config.yml:

jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      with:
        name: ViT-L-14-336::openai
      metas:
        py_modules:
          - clip_server.executors.clip_torch
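
A side note on the command above, offered as an assumption-laden sketch rather than a diagnosis: the `CUDA_VISIBLE_DEVICES=1` prefix sets the variable for the `docker` client process on the host, and it is not passed into the container unless forwarded explicitly, while `--gpus all` exposes every GPU listed by nvidia-smi. If the intent was to run on GPU 1 only, Docker's `--gpus device=<index>` form is one way to say so. The helper `docker_run_command` below is hypothetical and just assembles that command line; nothing in this thread establishes that GPU selection is related to the cuBLAS error.

```python
import shlex


def docker_run_command(gpu_index, image="jinaai/clip-server:master",
                       port=51000, cache_mount=None):
    """Build a `docker run` argument list that exposes a single GPU."""
    cmd = ["docker", "run", "-i", "-p", f"{port}:{port}"]
    if cache_mount:
        # Optional host:container volume for the model cache.
        cmd += ["-v", cache_mount]
    # `--gpus device=N` limits the container to GPU N instead of `--gpus all`.
    cmd += ["--gpus", f"device={gpu_index}", image, "-i"]
    return cmd


# Print the command for GPU 1; pipe clip_config.yml into it as in the original.
print(shlex.join(docker_run_command(1)))
```

The printed command can be used in place of the `docker run ... --gpus all ...` part of the pipeline, with `cat clip_config.yml |` kept in front of it.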
