
Changing the template causes train_bash.py to FAIL, apparently related to CUDA management #3022

Closed
syGOAT opened this issue Mar 28, 2024 · 9 comments
Labels
solved This problem has been already solved

Comments

syGOAT commented Mar 28, 2024

Reminder

  • I have read the README and searched the existing issues.

Reproduction

My command is as follows:

accelerate launch --config_file /root/autodl-tmp/fhy/finetune/config.yaml src/train_bash.py     \
    --stage sft    \
    --do_train     \
    --flash_attn True     \
    --quantization_bit 4     \
    --model_name_or_path /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2     \
    --dataset filtered_sampled_data     \
    --template chatml     \
    --finetuning_type lora     \
    --lora_target q_proj,k_proj,v_proj,o_proj,gate_proj,up_proj,down_proj     \
    --output_dir /root/autodl-tmp/fhy/finetune3/mistral-instruct-2-lora-moretokens-chatml     \
    --overwrite_cache     \
    --per_device_train_batch_size 4     \
    --gradient_accumulation_steps 4     \
    --lr_scheduler_type cosine     \
    --logging_steps 10     \
    --save_steps 500     \
    --learning_rate 5e-5    \
    --num_train_epochs 1.5     \
    --plot_loss     \
    --bf16  \
    --overwrite_output_dir  \
    --lora_rank 16

The data loading stage appears to succeed; the terminal then prints the following error:

 0%|          | 0/1622 [00:00<?, ?it/s]../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [916,0,0], thread: [32,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [916,0,0], thread: [33,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [916,0,0], thread: [34,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [916,0,0], thread: [35,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [916,0,0], thread: [36,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [916,0,0], thread: [37,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
# dozens of similar assert lines omitted here
Traceback (most recent call last):
  File "/root/autodl-tmp/xxx/LLaMA-Factory/src/train_bash.py", line 14, in <module>
    main()
  File "/root/autodl-tmp/xxx/LLaMA-Factory/src/train_bash.py", line 5, in main
    run_exp()
  File "/root/autodl-tmp/xxx/LLaMA-Factory/src/llmtuner/train/tuner.py", line 32, in run_exp
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/root/autodl-tmp/xxx/LLaMA-Factory/src/llmtuner/train/sft/workflow.py", line 73, in run_sft
    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/transformers/trainer.py", line 1624, in train
    return inner_training_loop(
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/transformers/trainer.py", line 1961, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/transformers/trainer.py", line 2902, in training_step
    loss = self.compute_loss(model, inputs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/transformers/trainer.py", line 2925, in compute_loss
    outputs = model(**inputs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1523, in forward
    else self._run_ddp_forward(*inputs, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/parallel/distributed.py", line 1359, in _run_ddp_forward
    return self.module(*inputs, **kwargs)  # type: ignore[index]
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/accelerate/utils/operations.py", line 822, in forward
    return model_forward(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/accelerate/utils/operations.py", line 810, in __call__
    return convert_to_fp32(self.model_forward(*args, **kwargs))
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 16, in decorate_autocast
    return func(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/peft/peft_model.py", line 1091, in forward
    return self.base_model(
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 160, in forward
    return self.model.forward(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 1157, in forward
    outputs = self.model(
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 1004, in forward
    attention_mask = _prepare_4d_causal_attention_mask_for_sdpa(
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/transformers/modeling_attn_mask_utils.py", line 371, in _prepare_4d_causal_attention_mask_for_sdpa
    elif not is_tracing and torch.all(attention_mask == 1):
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [360,0,0], thread: [64,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [360,0,0], thread: [65,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [360,0,0], thread: [66,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [360,0,0], thread: [67,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [360,0,0], thread: [68,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
# dozens of similar assert lines omitted here
# identical traceback from the second process omitted


  0%|          | 0/1622 [00:00<?, ?it/s]
terminate called after throwing an instance of 'c10::Error'
  what():  CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

Exception raised from c10_cuda_check_implementation at ../c10/cuda/CUDAException.cpp:44 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f8717999d87 in /root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x7f871794a75f in /root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x118 (0x7f8717a6a8a8 in /root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)
frame #3: <unknown function> + 0x1d40e (0x7f8717a3540e in /root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)
frame #4: <unknown function> + 0x1f744 (0x7f8717a37744 in /root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)
frame #5: <unknown function> + 0x1fb6d (0x7f8717a37b6d in /root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)
frame #6: <unknown function> + 0x540210 (0x7f8761a67210 in /root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
frame #7: <unknown function> + 0x649bf (0x7f871797e9bf in /root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #8: c10::TensorImpl::~TensorImpl() + 0x21b (0x7f8717977c8b in /root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #9: c10::TensorImpl::~TensorImpl() + 0x9 (0x7f8717977e39 in /root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/lib/libc10.so)
frame #10: <unknown function> + 0x802b98 (0x7f8761d29b98 in /root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/lib/libtorch_python.so)
frame #11: /root/autodl-tmp/minicoda3/envs/llama_factory/bin/python() [0x4e0583]
frame #12: /root/autodl-tmp/minicoda3/envs/llama_factory/bin/python() [0x5c436c]
frame #13: /root/autodl-tmp/minicoda3/envs/llama_factory/bin/python() [0x5c3fa8]
frame #14: Py_FinalizeEx + 0x143 (0x5c2ad3 in /root/autodl-tmp/minicoda3/envs/llama_factory/bin/python)
frame #15: Py_RunMain + 0x109 (0x5b48f9 in /root/autodl-tmp/minicoda3/envs/llama_factory/bin/python)
frame #16: Py_BytesMain + 0x39 (0x584e49 in /root/autodl-tmp/minicoda3/envs/llama_factory/bin/python)
frame #17: __libc_start_main + 0xf3 (0x7f8774f8f083 in /lib/x86_64-linux-gnu/libc.so.6)
frame #18: /root/autodl-tmp/minicoda3/envs/llama_factory/bin/python() [0x584cfe]

# identical abort trace from the second process omitted

[2024-03-28 15:39:32,601] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: -6) local_rank: 0 (pid: 433075) of binary: /root/autodl-tmp/minicoda3/envs/llama_factory/bin/python
Traceback (most recent call last):
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 46, in main
    args.func(args)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1048, in launch_command
    multi_gpu_launcher(args)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/accelerate/commands/launch.py", line 702, in multi_gpu_launcher
    distrib_run.run(args)
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/run.py", line 803, in run
    elastic_launch(
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 135, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/autodl-tmp/minicoda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 268, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
=======================================================
src/train_bash.py FAILED
-------------------------------------------------------
Failures:
[1]:
  time      : 2024-03-28_15:39:32
  host      : autodl-container-8be147bec0-5b771465
  rank      : 1 (local_rank: 1)
  exitcode  : -6 (pid: 433076)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 433076
-------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-03-28_15:39:32
  host      : autodl-container-8be147bec0-5b771465
  rank      : 0 (local_rank: 0)
  exitcode  : -6 (pid: 433075)
  error_file: <N/A>
  traceback : Signal 6 (SIGABRT) received by PID 433075
=======================================================

This error looks related to CUDA management. Previously, with all other arguments unchanged, training worked fine with --template default, but after switching to chatml the error above appears.
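
This failure mode can be reproduced in isolation: the chatml template adds tokens such as <|im_start|> and <|im_end|> that are not in Mistral's 32000-entry vocabulary, so the embedding lookup receives an out-of-range index. A minimal sketch (hypothetical token ids, assuming a CUDA device is available):

import torch

# An embedding lookup with an index >= num_embeddings trips the same
# `srcIndex < srcSelectDimSize` device-side assert shown in the log above.
embed = torch.nn.Embedding(num_embeddings=32000, embedding_dim=8).cuda()
ok = torch.tensor([[1, 2, 3]], device="cuda")
print(embed(ok).shape)  # fine: every id < 32000

bad = torch.tensor([[32001]], device="cuda")  # e.g. a newly added chatml token id
embed(bad)  # surfaces as "RuntimeError: CUDA error: device-side assert triggered"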

Expected behavior

I expect training to proceed normally. As noted, with all other arguments unchanged, training worked with --template default; switching to chatml triggers the error above.
Here is my accelerate config:

compute_environment: LOCAL_MACHINE
debug: false
distributed_type: MULTI_GPU
downcast_bf16: 'no'
gpu_ids: 0,1
machine_rank: 0
main_training_function: main
mixed_precision: fp16
num_machines: 1
num_processes: 2
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false

System Info

  • transformers version: 4.38.2
  • Platform: Linux-5.15.0-86-generic-x86_64-with-glibc2.31
  • Python version: 3.10.13
  • Huggingface_hub version: 0.21.4
  • Safetensors version: 0.4.2
  • Accelerate version: 0.28.0
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.2.1+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: yes, multi-GPU

Others

No response

hiyouga (Owner) commented Mar 28, 2024

Use --resize_vocab to enlarge the vocabulary size.
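
Roughly speaking, the flag boils down to the standard transformers resize mechanism (a sketch of the general idea, not LLaMA-Factory's exact code):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

# register the template's extra tokens, then grow embed_tokens/lm_head
# so that the new ids become valid embedding indices
num_added = tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<|im_start|>", "<|im_end|>"]}
)
if num_added > 0:
    model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=64)
    # 32000 -> 32064 when padded to a multiple of 64, matching the shapes seen below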

hiyouga added the solved label Mar 28, 2024
hiyouga closed this as completed Mar 28, 2024
syGOAT (Author) commented Apr 1, 2024

@hiyouga Thanks for the reply! The problem during training is solved.
However, in a simple web deployment, even with --resize_vocab added, an error is still raised after entering text in the web UI and clicking submit.
My deployment command is as follows:

CUDA_VISIBLE_DEVICES=0 python src/web_demo.py \
    --model_name_or_path /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2 \
    --adapter_name_or_path /root/autodl-tmp/fhy/finetune3/mistral-instruct-2-lora-moretokens-chatml \
    --template chatml \
    --finetuning_type lora \
    --resize_vocab

Output and error:

[INFO|tokenization_utils_base.py:2082] 2024-04-01 09:51:17,835 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:2082] 2024-04-01 09:51:17,835 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2082] 2024-04-01 09:51:17,835 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2082] 2024-04-01 09:51:17,835 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2082] 2024-04-01 09:51:17,835 >> loading file tokenizer.json
[INFO|configuration_utils.py:724] 2024-04-01 09:51:17,890 >> loading configuration file /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2/config.json
[INFO|configuration_utils.py:789] 2024-04-01 09:51:17,891 >> Model config MistralConfig {
  "_name_or_path": "/root/autodl-tmp/models/Mistral-7B-Instruct-v0.2",
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 32768,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.39.1",
  "use_cache": true,
  "vocab_size": 32000
}

04/01/2024 09:51:17 - INFO - llmtuner.model.patcher - Using KV cache for faster generation.
[INFO|modeling_utils.py:3280] 2024-04-01 09:51:17,908 >> loading weights file /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2/model.safetensors.index.json
[INFO|modeling_utils.py:1417] 2024-04-01 09:51:17,908 >> Instantiating MistralForCausalLM model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:928] 2024-04-01 09:51:17,909 >> Generate config GenerationConfig {
  "bos_token_id": 1,
  "eos_token_id": 2
}

Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:02<00:00,  1.23it/s]
[INFO|modeling_utils.py:4024] 2024-04-01 09:51:20,778 >> All model checkpoint weights were used when initializing MistralForCausalLM.

[INFO|modeling_utils.py:4032] 2024-04-01 09:51:20,778 >> All the weights of MistralForCausalLM were initialized from the model checkpoint at /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use MistralForCausalLM for predictions without further training.
[INFO|configuration_utils.py:881] 2024-04-01 09:51:20,781 >> loading configuration file /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2/generation_config.json
[INFO|configuration_utils.py:928] 2024-04-01 09:51:20,781 >> Generate config GenerationConfig {
  "bos_token_id": 1,
  "eos_token_id": 2
}

04/01/2024 09:51:20 - INFO - llmtuner.model.adapter - Fine-tuning method: LoRA
04/01/2024 09:51:21 - INFO - llmtuner.model.adapter - Merged 1 adapter(s).
04/01/2024 09:51:21 - INFO - llmtuner.model.adapter - Loaded adapter(s): /root/autodl-tmp/fhy/finetune3/mistral-instruct-2-lora-moretokens-chatml-continue/checkpoint-1500
04/01/2024 09:51:21 - INFO - llmtuner.model.loader - all params: 7241732096
04/01/2024 09:51:21 - INFO - llmtuner.data.template - Replace eos token: <|im_end|>
04/01/2024 09:51:21 - WARNING - llmtuner.data.template - New tokens have been added, make sure `resize_vocab` is True.
04/01/2024 09:51:21 - INFO - llmtuner.data.template - Add pad token: <|im_end|>
04/01/2024 09:51:21 - INFO - llmtuner.data.template - Add <|im_start|> to stop words.
04/01/2024 09:51:21 - WARNING - llmtuner.data.template - New tokens have been added, make sure `resize_vocab` is True.
Running on local URL:  http://0.0.0.0:7860

To create a public link, set `share=True` in `launch()`.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [32,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [33,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [34,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [35,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [36,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [37,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [38,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [39,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [40,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [41,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [42,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [43,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [44,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [45,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [46,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [47,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [48,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [49,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [50,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [51,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [52,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [53,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [54,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [55,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [56,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [57,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [58,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [59,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [60,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [61,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [62,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
../aten/src/ATen/native/cuda/Indexing.cu:1290: indexSelectLargeIndex: block: [700,0,0], thread: [63,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
Exception in thread Thread-9 (generate):
Traceback (most recent call last):
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/transformers/generation/utils.py", line 1575, in generate
    result = self._sample(
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/transformers/generation/utils.py", line 2697, in _sample
    outputs = self(
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 1157, in forward
    outputs = self.model(
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 1004, in forward
    attention_mask = _prepare_4d_causal_attention_mask_for_sdpa(
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/transformers/modeling_attn_mask_utils.py", line 335, in _prepare_4d_causal_attention_mask_for_sdpa
    elif not is_tracing and torch.all(attention_mask == 1):
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

I am sure I added --resize_vocab to the command, yet it reports an error similar to the earlier one. Where is my problem?

syGOAT (Author) commented Apr 1, 2024

@hiyouga Hello, I also tried CUDA_VISIBLE_DEVICES=0 API_PORT=6006 python src/api_demo.py; even with --resize_vocab added, it raises a similar error.

hiyouga (Owner) commented Apr 1, 2024

You did not save the expanded vocabulary during training. Specify --resize_vocab True --additional_target embed_tokens,lm_head and retrain. There is no need to specify --resize_vocab at inference time.
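
In PEFT terms, --additional_target maps to modules_to_save: full copies of the resized embedding and output head are trained and saved alongside the LoRA weights. An illustrative sketch of the resulting adapter config (not LLaMA-Factory's exact code):

from peft import LoraConfig

lora_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    # full (non-LoRA) copies of these modules are trained and stored with the adapter
    modules_to_save=["embed_tokens", "lm_head"],
)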

syGOAT (Author) commented Apr 2, 2024

@hiyouga Thanks for the reply! However, after LoRA-training the model as you instructed, inference still fails. Details below.
My training command (with --resize_vocab True --additional_target embed_tokens,lm_head specified):

accelerate launch --config_file /root/autodl-tmp/fhy/finetune/config.yaml src/train_bash.py     \
    --stage sft    \
    --do_train     \
    --flash_attn True     \
    --quantization_bit 4     \
    --model_name_or_path /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2     \
    --dataset filtered_sampled_data     \
    --template chatml     \
    --finetuning_type lora     \
    --lora_target q_proj,k_proj,v_proj,o_proj,gate_proj,up_proj,down_proj     \
    --output_dir /root/autodl-tmp/fhy/finetune3/mistral-instruct-2-lora-moretokens-chatml     \
    --overwrite_cache     \
    --per_device_train_batch_size 4     \
    --gradient_accumulation_steps 4     \
    --lr_scheduler_type cosine     \
    --logging_steps 10     \
    --save_steps 500     \
    --learning_rate 5e-5    \
    --num_train_epochs 4    \
    --plot_loss     \
    --bf16  \
    --overwrite_output_dir  \
    --lora_rank 16  \
    --max_new_tokens 4096    \
    --top_p 1    \
    --num_beams 3   \
    --temperature 0   \
    --resize_vocab True \
    --additional_target embed_tokens,lm_head

Training completed successfully. I then used the following inference command:

CUDA_VISIBLE_DEVICES=0 python src/web_demo.py \
    --model_name_or_path /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2 \
    --adapter_name_or_path /root/autodl-tmp/fhy/finetune3/mistral-instruct-2-lora-moretokens-chatml \
    --template chatml \
    --finetuning_type lora

The following output and error appeared:

[INFO|tokenization_utils_base.py:2082] 2024-04-02 09:52:38,130 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:2082] 2024-04-02 09:52:38,130 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2082] 2024-04-02 09:52:38,130 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2082] 2024-04-02 09:52:38,130 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2082] 2024-04-02 09:52:38,130 >> loading file tokenizer.json
[INFO|configuration_utils.py:724] 2024-04-02 09:52:38,184 >> loading configuration file /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2/config.json
[INFO|configuration_utils.py:789] 2024-04-02 09:52:38,186 >> Model config MistralConfig {
  "_name_or_path": "/root/autodl-tmp/models/Mistral-7B-Instruct-v0.2",
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 32768,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.39.1",
  "use_cache": true,
  "vocab_size": 32000
}

04/02/2024 09:52:38 - INFO - llmtuner.model.patcher - Using KV cache for faster generation.
[INFO|modeling_utils.py:3280] 2024-04-02 09:52:38,203 >> loading weights file /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2/model.safetensors.index.json
[INFO|modeling_utils.py:1417] 2024-04-02 09:52:38,203 >> Instantiating MistralForCausalLM model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:928] 2024-04-02 09:52:38,204 >> Generate config GenerationConfig {
  "bos_token_id": 1,
  "eos_token_id": 2
}

Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:02<00:00,  1.24it/s]
[INFO|modeling_utils.py:4024] 2024-04-02 09:52:41,034 >> All model checkpoint weights were used when initializing MistralForCausalLM.

[INFO|modeling_utils.py:4032] 2024-04-02 09:52:41,034 >> All the weights of MistralForCausalLM were initialized from the model checkpoint at /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use MistralForCausalLM for predictions without further training.
[INFO|configuration_utils.py:881] 2024-04-02 09:52:41,037 >> loading configuration file /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2/generation_config.json
[INFO|configuration_utils.py:928] 2024-04-02 09:52:41,037 >> Generate config GenerationConfig {
  "bos_token_id": 1,
  "eos_token_id": 2
}

04/02/2024 09:52:41 - INFO - llmtuner.model.adapter - Fine-tuning method: LoRA
Traceback (most recent call last):
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/web_demo.py", line 11, in <module>
    main()
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/web_demo.py", line 5, in main
    demo = create_web_demo()
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/webui/interface.py", line 55, in create_web_demo
    engine = Engine(pure_chat=True)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/webui/engine.py", line 20, in __init__
    self.chatter = WebChatModel(self.manager, demo_mode, lazy_init=(not pure_chat))
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/webui/chatter.py", line 27, in __init__
    super().__init__()
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/chat/chat_model.py", line 23, in __init__
    self.engine: "BaseEngine" = HuggingfaceEngine(model_args, data_args, finetuning_args, generating_args)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/chat/hf_engine.py", line 33, in __init__
    self.model, self.tokenizer = load_model_and_tokenizer(
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/model/loader.py", line 149, in load_model_and_tokenizer
    model = load_model(tokenizer, model_args, finetuning_args, is_trainable, add_valuehead)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/model/loader.py", line 94, in load_model
    model = init_adapter(model, model_args, finetuning_args, is_trainable)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/model/adapter.py", line 110, in init_adapter
    model: "LoraModel" = PeftModel.from_pretrained(
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/peft/peft_model.py", line 356, in from_pretrained
    model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/peft/peft_model.py", line 730, in load_adapter
    load_result = set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 249, in set_peft_model_state_dict
    load_result = model.load_state_dict(peft_model_state_dict, strict=False)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
        size mismatch for base_model.model.model.embed_tokens.modules_to_save.default.weight: copying a param with shape torch.Size([32064, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
        size mismatch for base_model.model.lm_head.modules_to_save.default.weight: copying a param with shape torch.Size([32064, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).

So I tried editing the config.json of /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2, setting "vocab_size": 32064, and reran the inference command, but it produced the following output and error:

[INFO|tokenization_utils_base.py:2082] 2024-04-02 10:02:24,627 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:2082] 2024-04-02 10:02:24,627 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2082] 2024-04-02 10:02:24,627 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2082] 2024-04-02 10:02:24,627 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2082] 2024-04-02 10:02:24,627 >> loading file tokenizer.json
[INFO|configuration_utils.py:724] 2024-04-02 10:02:24,684 >> loading configuration file /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2/config.json
[INFO|configuration_utils.py:789] 2024-04-02 10:02:24,685 >> Model config MistralConfig {
  "_name_or_path": "/root/autodl-tmp/models/Mistral-7B-Instruct-v0.2",
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 32768,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.39.1",
  "use_cache": true,
  "vocab_size": 32064
}

04/02/2024 10:02:24 - INFO - llmtuner.model.patcher - Using KV cache for faster generation.
[INFO|modeling_utils.py:3280] 2024-04-02 10:02:24,702 >> loading weights file /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2/model.safetensors.index.json
[INFO|modeling_utils.py:1417] 2024-04-02 10:02:24,702 >> Instantiating MistralForCausalLM model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:928] 2024-04-02 10:02:24,703 >> Generate config GenerationConfig {
  "bos_token_id": 1,
  "eos_token_id": 2
}

Loading checkpoint shards:   0%|                                                                                                                                     | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/web_demo.py", line 11, in <module>
    main()
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/web_demo.py", line 5, in main
    demo = create_web_demo()
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/webui/interface.py", line 55, in create_web_demo
    engine = Engine(pure_chat=True)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/webui/engine.py", line 20, in __init__
    self.chatter = WebChatModel(self.manager, demo_mode, lazy_init=(not pure_chat))
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/webui/chatter.py", line 27, in __init__
    super().__init__()
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/chat/chat_model.py", line 23, in __init__
    self.engine: "BaseEngine" = HuggingfaceEngine(model_args, data_args, finetuning_args, generating_args)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/chat/hf_engine.py", line 33, in __init__
    self.model, self.tokenizer = load_model_and_tokenizer(
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/model/loader.py", line 149, in load_model_and_tokenizer
    model = load_model(tokenizer, model_args, finetuning_args, is_trainable, add_valuehead)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/model/loader.py", line 89, in load_model
    model = AutoModelForCausalLM.from_pretrained(model_args.model_name_or_path, config=config, **init_kwargs)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3531, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3958, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/transformers/modeling_utils.py", line 812, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 348, in set_module_tensor_to_device
    raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([32000, 4096]) in "weight" (which has shape torch.Size([32064, 4096])), this look incorrect.

How can I align the shapes? Looking forward to your reply!

hiyouga added a commit that referenced this issue Apr 2, 2024
hiyouga (Owner) commented Apr 2, 2024

Update the code, then also specify the --resize_vocab argument at inference time.
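
The size mismatch occurs because the adapter checkpoint stores 32064-row copies of embed_tokens and lm_head while the freshly loaded base model still has 32000 rows; the base model must be resized before the adapter is attached. Done by hand, the fix would look roughly like this (a workaround sketch, assuming the adapter was trained with the 32064-entry vocabulary):

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "/root/autodl-tmp/models/Mistral-7B-Instruct-v0.2", torch_dtype=torch.bfloat16
)
# grow the base model first so the checkpoint's [32064, 4096] tensors fit
base.resize_token_embeddings(32064)
model = PeftModel.from_pretrained(
    base, "/root/autodl-tmp/fhy/finetune3/mistral-instruct-2-lora-moretokens-chatml"
)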

hiyouga (Owner) commented Apr 3, 2024

Retraining should not be necessary. Which error are you seeing exactly?

syGOAT (Author) commented Apr 3, 2024

@hiyouga After updating the code with git pull, I ran inference with the following command:

CUDA_VISIBLE_DEVICES=1 python src/web_demo.py \
    --model_name_or_path /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2 \
    --adapter_name_or_path /root/autodl-tmp/fhy/finetune3/mistral-instruct-2-lora-moretokens-chatml \
    --template chatml \
    --finetuning_type lora \
    --resize_vocab True

Output and error:

[INFO|tokenization_utils_base.py:2082] 2024-04-03 17:45:46,508 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:2082] 2024-04-03 17:45:46,508 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2082] 2024-04-03 17:45:46,508 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2082] 2024-04-03 17:45:46,508 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2082] 2024-04-03 17:45:46,508 >> loading file tokenizer.json
[INFO|configuration_utils.py:724] 2024-04-03 17:45:46,562 >> loading configuration file /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2/config.json
[INFO|configuration_utils.py:789] 2024-04-03 17:45:46,563 >> Model config MistralConfig {
  "_name_or_path": "/root/autodl-tmp/models/Mistral-7B-Instruct-v0.2",
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 32768,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.39.1",
  "use_cache": true,
  "vocab_size": 32000
}

04/03/2024 17:45:46 - INFO - llmtuner.model.patcher - Using KV cache for faster generation.
[INFO|modeling_utils.py:3280] 2024-04-03 17:45:46,580 >> loading weights file /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2/model.safetensors.index.json
[INFO|modeling_utils.py:1417] 2024-04-03 17:45:46,580 >> Instantiating MistralForCausalLM model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:928] 2024-04-03 17:45:46,581 >> Generate config GenerationConfig {
  "bos_token_id": 1,
  "eos_token_id": 2
}

Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:02<00:00,  1.23it/s]
[INFO|modeling_utils.py:4024] 2024-04-03 17:45:49,458 >> All model checkpoint weights were used when initializing MistralForCausalLM.

[INFO|modeling_utils.py:4032] 2024-04-03 17:45:49,458 >> All the weights of MistralForCausalLM were initialized from the model checkpoint at /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2.
If your task is similar to the task the model of the checkpoint was trained on, you can already use MistralForCausalLM for predictions without further training.
[INFO|configuration_utils.py:881] 2024-04-03 17:45:49,461 >> loading configuration file /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2/generation_config.json
[INFO|configuration_utils.py:928] 2024-04-03 17:45:49,461 >> Generate config GenerationConfig {
  "bos_token_id": 1,
  "eos_token_id": 2
}

04/03/2024 17:45:49 - INFO - llmtuner.model.adapter - Fine-tuning method: LoRA
Traceback (most recent call last):
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/web_demo.py", line 9, in <module>
    main()
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/web_demo.py", line 5, in main
    create_web_demo().queue().launch(server_name="0.0.0.0", server_port=None, share=False, inbrowser=True)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/webui/interface.py", line 52, in create_web_demo
    engine = Engine(pure_chat=True)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/webui/engine.py", line 19, in __init__
    self.chatter = WebChatModel(self.manager, demo_mode, lazy_init=(not pure_chat))
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/webui/chatter.py", line 27, in __init__
    super().__init__()
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/chat/chat_model.py", line 23, in __init__
    self.engine: "BaseEngine" = HuggingfaceEngine(model_args, data_args, finetuning_args, generating_args)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/chat/hf_engine.py", line 33, in __init__
    self.model, self.tokenizer = load_model_and_tokenizer(
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/model/loader.py", line 148, in load_model_and_tokenizer
    model = load_model(tokenizer, model_args, finetuning_args, is_trainable, add_valuehead)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/model/loader.py", line 93, in load_model
    model = init_adapter(model, model_args, finetuning_args, is_trainable)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/model/adapter.py", line 110, in init_adapter
    model: "LoraModel" = PeftModel.from_pretrained(
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/peft/peft_model.py", line 356, in from_pretrained
    model.load_adapter(model_id, adapter_name, is_trainable=is_trainable, **kwargs)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/peft/peft_model.py", line 730, in load_adapter
    load_result = set_peft_model_state_dict(self, adapters_weights, adapter_name=adapter_name)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 249, in set_peft_model_state_dict
    load_result = model.load_state_dict(peft_model_state_dict, strict=False)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
        size mismatch for base_model.model.model.embed_tokens.modules_to_save.default.weight: copying a param with shape torch.Size([32064, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).
        size mismatch for base_model.model.lm_head.modules_to_save.default.weight: copying a param with shape torch.Size([32064, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).

After changing the model config to "vocab_size": 32064 and running the inference command again, the error is:

[INFO|tokenization_utils_base.py:2082] 2024-04-03 17:47:05,387 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:2082] 2024-04-03 17:47:05,388 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2082] 2024-04-03 17:47:05,388 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2082] 2024-04-03 17:47:05,388 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2082] 2024-04-03 17:47:05,388 >> loading file tokenizer.json
[INFO|configuration_utils.py:724] 2024-04-03 17:47:05,443 >> loading configuration file /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2/config.json
[INFO|configuration_utils.py:789] 2024-04-03 17:47:05,444 >> Model config MistralConfig {
  "_name_or_path": "/root/autodl-tmp/models/Mistral-7B-Instruct-v0.2",
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 14336,
  "max_position_embeddings": 32768,
  "model_type": "mistral",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 8,
  "rms_norm_eps": 1e-05,
  "rope_theta": 1000000.0,
  "sliding_window": null,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.39.1",
  "use_cache": true,
  "vocab_size": 32064
}

04/03/2024 17:47:05 - INFO - llmtuner.model.patcher - Using KV cache for faster generation.
[INFO|modeling_utils.py:3280] 2024-04-03 17:47:05,461 >> loading weights file /root/autodl-tmp/models/Mistral-7B-Instruct-v0.2/model.safetensors.index.json
[INFO|modeling_utils.py:1417] 2024-04-03 17:47:05,461 >> Instantiating MistralForCausalLM model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:928] 2024-04-03 17:47:05,462 >> Generate config GenerationConfig {
  "bos_token_id": 1,
  "eos_token_id": 2
}

Loading checkpoint shards:   0%|                                                                                               | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/web_demo.py", line 9, in <module>
    main()
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/web_demo.py", line 5, in main
    create_web_demo().queue().launch(server_name="0.0.0.0", server_port=None, share=False, inbrowser=True)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/webui/interface.py", line 52, in create_web_demo
    engine = Engine(pure_chat=True)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/webui/engine.py", line 19, in __init__
    self.chatter = WebChatModel(self.manager, demo_mode, lazy_init=(not pure_chat))
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/webui/chatter.py", line 27, in __init__
    super().__init__()
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/chat/chat_model.py", line 23, in __init__
    self.engine: "BaseEngine" = HuggingfaceEngine(model_args, data_args, finetuning_args, generating_args)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/chat/hf_engine.py", line 33, in __init__
    self.model, self.tokenizer = load_model_and_tokenizer(
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/model/loader.py", line 148, in load_model_and_tokenizer
    model = load_model(tokenizer, model_args, finetuning_args, is_trainable, add_valuehead)
  File "/root/autodl-tmp/fhy/LLaMA-Factory/src/llmtuner/model/loader.py", line 88, in load_model
    model = AutoModelForCausalLM.from_pretrained(model_args.model_name_or_path, config=config, **init_kwargs)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3531, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3958, in _load_pretrained_model
    new_error_msgs, offload_index, state_dict_index = _load_state_dict_into_meta_model(
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/transformers/modeling_utils.py", line 812, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/root/autodl-tmp/minicoda3/envs/lfactory/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 348, in set_module_tensor_to_device
    raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([32000, 4096]) in "weight" (which has shape torch.Size([32064, 4096])), this look incorrect.

This seems identical to the error before the code update?

hiyouga added a commit that referenced this issue Apr 3, 2024
tybalex added a commit to sanjay920/LLaMA-Factory that referenced this issue Apr 10, 2024
wwwbq commented Jun 24, 2024

Use --resize_vocab to enlarge the vocabulary size.

Hello, may I ask what this argument actually does? I ran into this issue's problem today as well: training previously worked fine on Qwen, but after switching to a Llama-family model it errored out, and adding this argument fixed it.
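
The difference likely lies in the tokenizers: Qwen's vocabulary already contains the chatml control tokens, while Llama-family tokenizers do not, so the template's new tokens require --resize_vocab there. A quick check (a sketch; the model names are illustrative):

from transformers import AutoTokenizer

for name in ("Qwen/Qwen1.5-7B-Chat", "mistralai/Mistral-7B-Instruct-v0.2"):
    tok = AutoTokenizer.from_pretrained(name)
    # True for Qwen, False for Mistral/Llama -> the latter needs --resize_vocab
    print(name, "<|im_start|>" in tok.get_vocab())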
