
Error during inference LLaMA2 + LoRA: RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::Half != float #4432

Closed
WJMacro opened this issue Jun 24, 2024 · 2 comments
Labels: solved (This problem has been already solved)

Comments


WJMacro commented Jun 24, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

Traceback (most recent call last):                                                                                                                                          
  File "[My_env_dir]/lib/python3.9/threading.py", line 980, in _bootstrap_inner                                                                        
    self.run()                                                                                                                                                              
  File "[My_env_dir]/lib/python3.9/threading.py", line 917, in run                                                                                     
    self._target(*self._args, **self._kwargs)                                                                                                                               
  File "[My_env_dir]/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context                                            
    return func(*args, **kwargs)                                                                                                                                            
  File "[My_env_dir]/lib/python3.9/site-packages/transformers/generation/utils.py", line 1758, in generate 
    result = self._sample(
  File "[My_env_dir]/lib/python3.9/site-packages/transformers/generation/utils.py", line 2397, in _sample                                              
    outputs = self(                                                                                                                                                         
  File "[My_env_dir]/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl                                         
    return self._call_impl(*args, **kwargs)                                                                                                                                 
  File "[My_env_dir]/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl                                                 
    return forward_call(*args, **kwargs)                                                                                                                                    
  File "[My_env_dir]/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 1164, in forward                                   
    outputs = self.model(                                                                                                                                                   
  File "[My_env_dir]/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl                                         
    return self._call_impl(*args, **kwargs)                                                                                                                                 
  File "[My_env_dir]/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl                                                 
    return forward_call(*args, **kwargs)                                                                                                                                    
  File "[My_env_dir]/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 968, in forward                                    
    layer_outputs = decoder_layer(                                                                                                                                          
  File "[My_env_dir]/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl                                         
    return self._call_impl(*args, **kwargs)                                                                                                                                  
  File "[My_env_dir]/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl                                                 
    return forward_call(*args, **kwargs)                                                                                                                                    
  File "[My_env_dir]/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 713, in forward                                    
    hidden_states, self_attn_weights, present_key_value = self.self_attn(                                                                                                   
  File "[My_env_dir]/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl                                         
    return self._call_impl(*args, **kwargs)                                                                                                                                 
  File "[My_env_dir]/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl                                                 
    return forward_call(*args, **kwargs)                                                                                                                                    
  File "[My_env_dir]/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 616, in forward                                    
    key_states = self.k_proj(hidden_states)                                                                                                                                 
  File "[My_env_dir]/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl                                         
    return self._call_impl(*args, **kwargs)                                                                                                                                 
  File "[My_env_dir]/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl                                                 
    return forward_call(*args, **kwargs)                                                                                                                                    
  File "[My_env_dir]/lib/python3.9/site-packages/torch/nn/modules/linear.py", line 116, in forward                                                     
    return F.linear(input, self.weight, self.bias)                                                                                                                          
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::Half != float
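For reference, the error is raised by F.linear when its two operands have different dtypes: here one of the hidden states or the k_proj weight is fp16 while the other is fp32. A minimal sketch that reproduces the same class of failure (the exact wording of the error can vary by backend and torch version), assuming only that torch is installed:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 8, dtype=torch.float16)   # activations from an fp16 base model
w = torch.randn(4, 8, dtype=torch.float32)   # e.g. a projection weight left in fp32

# mixing dtypes in the underlying matmul raises a RuntimeError like the one above
F.linear(x, w)
```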

Reproduction

Inference config file:

model_name_or_path: llama2-7b-chat-hf 
template: llama2 
adapter_name_or_path: [LoRA path]
finetuning_type: lora

Run with LLaMA-Factory

llamafactory-cli chat config.yaml
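If the adapter (or the base model) ends up in fp32 while the rest of the stack runs in fp16, one way to confirm and work around the mismatch outside the CLI is to load the adapter manually and cast everything to a single dtype. A minimal sketch, assuming transformers, peft, and accelerate are installed; the model and adapter paths below are placeholders, not the exact ones from this report:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "llama2-7b-chat-hf",          # placeholder base model path
    torch_dtype=torch.float16,    # load the base weights in fp16
    device_map="auto",            # requires accelerate; drop if loading on a single device
)
model = PeftModel.from_pretrained(base, "path/to/lora_adapter")  # placeholder adapter path
model = model.half()              # cast any fp32 LoRA weights down to fp16 so every matmul sees one dtype
model.eval()

tokenizer = AutoTokenizer.from_pretrained("llama2-7b-chat-hf")
inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=16)[0]))
```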

Expected behavior

No response

Others

No response

github-actions bot added the "pending (This problem is yet to be addressed)" label on Jun 24, 2024

WJMacro commented Jun 24, 2024

Is this due to inconsistent precision of the LoRA parameters?
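One way to check this is to print the dtype of every LoRA tensor next to the projection weights it attaches to; a minimal sketch, assuming the model has been loaded with transformers + peft as in the snippet above:

```python
# a base weight in torch.float16 alongside lora_A/lora_B in torch.float32
# (or the reverse) would explain the c10::Half != float error
for name, param in model.named_parameters():
    if "lora_" in name or "k_proj" in name:
        print(name, tuple(param.shape), param.dtype)
```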

hiyouga added the "solved (This problem has been already solved)" label and removed the "pending (This problem is yet to be addressed)" label on Jun 24, 2024
Fighoture commented

Same issue here. How did you solve it?
