llava - RuntimeError: Index put requires the source and destination dtypes match, got Half for the destination and Float for the source. #3807

Katehuuh · 2024-05-18T22:18:59Z

Reminder

I have read the README and searched the existing issues.

Reproduction

Training broke after updating to 2bec28e.

set CUDA_VISIBLE_DEVICES=0 && llamafactory-cli train --stage sft --do_train True --model_name_or_path llava-hf/llava-1.5-13b-hf --preprocessing_num_workers 16 --finetuning_type lora --quantization_bit 8 --template vicuna --rope_scaling linear --flash_attn fa2 --visual_inputs True --dataset_dir data --dataset pokemon_1k --cutoff_len 4096 --learning_rate 2e-05 --num_train_epochs 3.0 --max_samples 100000 --per_device_train_batch_size 1 --gradient_accumulation_steps 1 --lr_scheduler_type cosine --max_grad_norm 1.0 --logging_steps 5 --save_steps 1000 --warmup_steps 0 --optim adamw_torch --packing False --upcast_layernorm True --report_to none --output_dir saves\LLaVA1.5-13B-Chat\lora\LLaVA1.5-13B-Chat_pokemon --fp16 True --plot_loss True --lora_rank 256 --lora_alpha 512 --lora_dropout 0 --create_new_adapter True --lora_target all

:LLaMA-Factory\venv\lib\site-packages\bitsandbytes\autograd\_functions.py:322: UserWarning: MatMul8bitLt: inputs will be cast from torch.float32 to float16 during quantization
  warnings.warn(f"MatMul8bitLt: inputs will be cast from {A.dtype} to float16 during quantization")
Traceback (most recent call last):
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\username\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\LLaMA-Factory\venv\Scripts\llamafactory-cli.exe\__main__.py", line 7, in <module>
    sys.exit(main())
  File "C:\LLaMA-Factory\src\llamafactory\cli.py", line 65, in main
    run_exp()
  File "C:\LLaMA-Factory\src\llamafactory\train\tuner.py", line 34, in run_exp
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "C:\LLaMA-Factory\src\llamafactory\train\sft\workflow.py", line 73, in run_sft
    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "C:\LLaMA-Factory\venv\lib\site-packages\transformers\trainer.py", line 1859, in train
    return inner_training_loop(
  File "C:\LLaMA-Factory\venv\lib\site-packages\transformers\trainer.py", line 2203, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "C:\LLaMA-Factory\venv\lib\site-packages\transformers\trainer.py", line 3138, in training_step
    loss = self.compute_loss(model, inputs)
  File "C:\LLaMA-Factory\venv\lib\site-packages\transformers\trainer.py", line 3161, in compute_loss
    outputs = model(**inputs)
  File "C:\LLaMA-Factory\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\LLaMA-Factory\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\LLaMA-Factory\venv\lib\site-packages\accelerate\utils\operations.py", line 822, in forward
    return model_forward(*args, **kwargs)
  File "C:\LLaMA-Factory\venv\lib\site-packages\accelerate\utils\operations.py", line 810, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "C:\LLaMA-Factory\venv\lib\site-packages\torch\amp\autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
  File "C:\LLaMA-Factory\venv\lib\site-packages\peft\peft_model.py", line 1129, in forward
    return self.base_model(
  File "C:\LLaMA-Factory\venv\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "C:\LLaMA-Factory\venv\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\LLaMA-Factory\venv\lib\site-packages\peft\tuners\tuners_utils.py", line 161, in forward
    return self.model.forward(*args, **kwargs)
  File "C:\LLaMA-Factory\venv\lib\site-packages\accelerate\hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "C:\LLaMA-Factory\venv\lib\site-packages\transformers\models\llava\modeling_llava.py", line 438, in forward
    inputs_embeds, attention_mask, labels, position_ids = self._merge_input_ids_with_image_features(
  File "C:\LLaMA-Factory\venv\lib\site-packages\transformers\models\llava\modeling_llava.py", line 340, in _merge_input_ids_with_image_features
    final_embedding[image_to_overwrite] = image_features.contiguous().reshape(-1, embed_dim).to(target_device)
RuntimeError: Index put requires the source and destination dtypes match, got Half for the destination and Float for the source.

Expected behavior

No response

System Info

No response

Others

No response


Katehuuh · 2024-05-18T23:51:35Z

Also, please add preprocessing_num_workers to the WebUI.

hiyouga · 2024-05-19T04:26:59Z

Please provide the training config needed to reproduce this.

Katehuuh · 2024-05-19T08:19:28Z

It is already in the report above. Dataset: pokemon_1k. Command:

set CUDA_VISIBLE_DEVICES=0 && llamafactory-cli train --stage sft --do_train True --model_name_or_path llava-hf/llava-1.5-13b-hf --preprocessing_num_workers 16 --finetuning_type lora --quantization_bit 8 --template vicuna --rope_scaling linear --flash_attn fa2 --visual_inputs True --dataset_dir data --dataset pokemon_1k --cutoff_len 4096 --learning_rate 2e-05 --num_train_epochs 3.0 --max_samples 100000 --per_device_train_batch_size 1 --gradient_accumulation_steps 1 --lr_scheduler_type cosine --max_grad_norm 1.0 --logging_steps 5 --save_steps 1000 --warmup_steps 0 --optim adamw_torch --packing False --upcast_layernorm True --report_to none --output_dir saves\LLaVA1.5-13B-Chat\lora\LLaVA1.5-13B-Chat_pokemon --fp16 True --plot_loss True --lora_rank 256 --lora_alpha 512 --lora_dropout 0 --create_new_adapter True --lora_target all

hiyouga · 2024-05-19T09:07:14Z

This bug was introduced in b033232#diff-eda8bdc7f44e34692b3c3fa4827f6169b555c0c17c38bc9f031698ef245edb88L25
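
For readers hitting the same message, here is a minimal sketch of the dtype mismatch reported in the traceback and one common way to avoid it. It is an illustration only, not the change made in 1ebc890: the helper name `overwrite_rows` and the tiny tensor shapes are made up, and the variable names are borrowed from the failing line in modeling_llava.py purely for readability. The idea is that the destination buffer is half precision while the image features arrive as float32, so the source is cast to the destination dtype before the indexed assignment.

```python
import torch


def overwrite_rows(destination, mask, source):
    """Write `source` into `destination[mask]`, casting to the destination's dtype and device first.

    Hypothetical helper for illustration; not part of transformers or LLaMA-Factory.
    """
    destination[mask] = source.to(device=destination.device, dtype=destination.dtype)
    return destination


# Half-precision destination (as with --fp16) and float32 source features.
final_embedding = torch.zeros(4, 3, dtype=torch.float16)
image_to_overwrite = torch.tensor([True, False, True, False])
image_features = torch.ones(2, 3, dtype=torch.float32)

# Direct assignment with mismatched dtypes is what the traceback shows failing.
# The error is caught here so the script keeps running either way.
try:
    final_embedding[image_to_overwrite] = image_features
    mismatch_error = None
except RuntimeError as err:
    mismatch_error = str(err)
print("direct assignment error:", mismatch_error)

# Casting the source first makes the indexed assignment dtype-consistent.
result = overwrite_rows(final_embedding, image_to_overwrite, image_features)
print(result.dtype)
```

In the real model the mismatch presumably comes from part of the model running in float32 while the text embeddings stay in float16, so the cleaner fix is to keep those dtypes consistent upstream rather than patching the assignment.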

hiyouga closed this as completed in 1ebc890 May 19, 2024

hiyouga added the "solved" label (This problem has been already solved) May 19, 2024
