"RuntimeError: expected scalar type Float but found Half" when using 16-bit with Lora. #519
Comments
Did you
Yeah, I did.
Can you check if things work as expected after this commit?
I get the same error. I did a fresh install, and it still happens even after your last commit.
Quick fix for using lora: change line 23 and line 27 of modules/LoRA.py.
I'll give you a trophy if this works
Same error for me:

Command:

Error:
Diff:
diff --git a/modules/LoRA.py b/modules/LoRA.py
index aa68ad3..524545f 100644
--- a/modules/LoRA.py
+++ b/modules/LoRA.py
@@ -27,11 +27,11 @@ def add_lora_to_model(lora_name):
params['dtype'] = shared.model.dtype
if hasattr(shared.model, "hf_device_map"):
params['device_map'] = {"base_model.model."+k: v for k, v in shared.model.hf_device_map.items()}
- elif shared.args.load_in_8bit:
+ elif shared.args.load_in_8bit or shared.args.gptq_bits:
params['device_map'] = {'': 0}
shared.model = PeftModel.from_pretrained(shared.model, Path(f"loras/{lora_name}"), **params)
- if not shared.args.load_in_8bit and not shared.args.cpu:
+ if not (shared.args.load_in_8bit or shared.args.gptq_bits) and not shared.args.cpu:
shared.model.half()
if not hasattr(shared.model, "hf_device_map"):
shared.model.cuda()
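For readability, this is what that hunk of modules/LoRA.py looks like once the patch is applied (a sketch reconstructed from the diff above; shared, PeftModel, and Path are the webui's own imports):

# modules/LoRA.py after the patch, reconstructed from the diff above
params['dtype'] = shared.model.dtype
if hasattr(shared.model, "hf_device_map"):
    params['device_map'] = {"base_model.model." + k: v for k, v in shared.model.hf_device_map.items()}
elif shared.args.load_in_8bit or shared.args.gptq_bits:
    # 8-bit and GPTQ models need an explicit device map so PEFT keeps them on GPU 0
    params['device_map'] = {'': 0}
shared.model = PeftModel.from_pretrained(shared.model, Path(f"loras/{lora_name}"), **params)
if not (shared.args.load_in_8bit or shared.args.gptq_bits) and not shared.args.cpu:
    # cast base model and adapter to one 16-bit dtype so matmuls don't mix Float and Half
    shared.model.half()
    if not hasattr(shared.model, "hf_device_map"):
        shared.model.cuda()

The idea of the change: quantized (8-bit or GPTQ) models must not be cast with .half(), so both flags are now checked together in both places.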
Used to apply this fix for
This fix doesn't really work, see #332 (comment). People have been using this patch instead: https://github.com/johnsmith0031/alpaca_lora_4bit
This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.
Describe the bug

I have been using lora with --load-in-8bit, but I saw that LoRA is now supposed to work in 16-bit mode. However, I get "RuntimeError: expected scalar type Float but found Half" when I try to use it with --bf16.
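The error class itself is easy to reproduce outside the webui, since PyTorch refuses to multiply tensors of different dtypes. A minimal sketch (an illustration of the mismatch, not the webui's actual call stack):

import torch

a = torch.randn(2, 2)                    # float32, like adapter weights loaded without a dtype
b = torch.randn(2, 2, dtype=torch.half)  # float16, like weights from a model cast with .half()
torch.matmul(a, b)                       # RuntimeError: expected scalar type Float but found Half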
Is there an existing issue for this?
Reproduction
python server.py --listen --listen-port 8888 --bf16 --model llama-13b-hf --lora alpaca-lora-13b --cai-chat --verbose --extension simple_memory
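A quick way to confirm the mixed precision after the LoRA is attached (a hypothetical diagnostic, assuming a Python session where the webui's shared.model has been loaded):

import torch
from modules import shared  # the webui's own globals module

# more than one entry here means the base model and the adapter disagree,
# e.g. {torch.float16, torch.float32} right before the RuntimeError
print({p.dtype for p in shared.model.parameters()})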
Screenshot
No response
Logs
System Info