"RuntimeError: The size of tensor a (0) must match the size of tensor b (4096) at non-singleton dimension 1" (DPO + LoRA) #57
Comments
NOTE: This only occurs if I'm using the deepspeed accelerate config and set
So I think the solution to add If someone can confirm that, feel free to close this out. If not, lmk :)
I think the problem might be related to using deepspeed on my local DL rig with 2x3090s. Just switched to the multi-gpu.yaml file and the script ran with no problem.
Hi @ohmeow as discussed here I think indeed the issue is when trying to do the following:
I don't think we saw this issue in the original release of the code because we made a goof on the If you have enough vRAM then one should be able to work around this by setting I'm discussing this with the
The only way I was able to get training to proceed was by adding

    if is_adapter_model(model, model_args.model_revision):
        # load the model, merge the adapter weights and unload the adapter
        # Note: to run QLoRA, you will need to merge the base model separately as the merged model in 16bit
        logger.info(f"Merging peft adapters for {model_args.model_name_or_path=}")
        peft_config = PeftConfig.from_pretrained(model_args.model_name_or_path, revision=model_args.model_revision)
        model_kwargs = dict(
            revision=model_args.base_model_revision,
            trust_remote_code=model_args.trust_remote_code,
            use_flash_attention_2=model_args.use_flash_attention_2,
            torch_dtype=torch_dtype,
            use_cache=False if training_args.gradient_checkpointing else True,
            device_map=get_kbit_device_map(),
        )
        base_model = AutoModelForCausalLM.from_pretrained(peft_config.base_model_name_or_path, **model_kwargs)
        model = PeftModel.from_pretrained(base_model, model_args.model_name_or_path, revision=model_args.model_revision)
        model.eval()
        model = model.merge_and_unload()
        model_kwargs = None

    if model_args.use_peft is True:
        ref_model = None
        ref_model_kwargs = None
    else:
        ref_model = model
        ref_model_kwargs = model_kwargs

    accelerator.wait_for_everyone()

With this I can get everything running on my 2x3090s using the The deepspeed config works as well but for some reason fails when pushing the model to the hub. I imagine this has something to do with my machine and/or with using 3090s.
Can confirm that setting
Having the same issue here, but weirdly, the DPO script cannot run even with multi-gpu.yaml on my machine. Could you please share your multi-gpu.yaml file? In my understanding, multi-gpu.yaml is for data parallelism, so it should not have a problem with merging the QLoRA adapter.
So I'm attempting to run the DPO LoRA script and I'm getting this error:
... when the
model.merge_and_unload()
runs here:
Any ideas?
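
For anyone trying to read the error message itself: it has the form of an ordinary broadcasting failure, where one operand has size 0 in the dimension that is aligned with a 4096-wide dimension of the other. Below is a minimal pure-Python sketch of the trailing-dimension alignment rule that produces a message of this form. It is not PyTorch's implementation; `broadcast_shape` and both shapes are hypothetical, and the idea that the size-0 operand is a weight that has not been materialized on the current process is an assumption, not something confirmed in this thread.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast shape of shapes a and b, or raise on a mismatch.

    Hypothetical helper that mimics the usual elementwise broadcasting rule:
    shapes are aligned from the last dimension, and two sizes are compatible
    only if they are equal or one of them is 1.
    """
    ndim = max(len(a), len(b))
    result = []
    # Walk both shapes from the trailing dimension, padding the shorter with 1s.
    pairs = zip_longest(reversed(a), reversed(b), fillvalue=1)
    for offset, (size_a, size_b) in enumerate(pairs):
        dim = ndim - 1 - offset  # index in the broadcast result
        if size_a != size_b and size_a != 1 and size_b != 1:
            raise RuntimeError(
                f"The size of tensor a ({size_a}) must match the size of "
                f"tensor b ({size_b}) at non-singleton dimension {dim}"
            )
        result.append(size_b if size_a == 1 else size_a)
    return tuple(reversed(result))


# Assumed shapes, for illustration only.
placeholder_weight = (0,)       # a weight that holds no data on this process
adapter_update = (4096, 4096)   # an update for a 4096-wide linear layer

try:
    broadcast_shape(placeholder_weight, adapter_update)
except RuntimeError as err:
    message = str(err)
    print(message)
```

If that reading is right, the merge is adding an adapter update to a weight that is empty at merge time, which would fit the report that the error only shows up with the deepspeed accelerate config.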