Qwen2-57B-A14-instruct training error #4820

Closed
jiezhangGt opened this issue Jul 15, 2024 · 3 comments
Labels
solved This problem has been already solved

Comments

@jiezhangGt

Reminder

  • I have read the README and searched the existing issues.

System Info

Training with other models works fine, but when switching to Qwen2-57B-A14-instruct for SFT, an error is raised.

Reproduction

```
Traceback (most recent call last):
  File "/mnt/pfs/nlp/zhangjie07/workspace/LLaMA-Factory/src/train.py", line 28, in <module>
    main()
  File "/mnt/pfs/nlp/zhangjie07/workspace/LLaMA-Factory/src/train.py", line 19, in main
    run_exp()
  File "/mnt/pfs/nlp/zhangjie07/workspace/LLaMA-Factory/src/llamafactory/train/tuner.py", line 50, in run_exp
    run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/mnt/pfs/nlp/zhangjie07/workspace/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 90, in run_sft
    train_result = trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1885, in train
    return inner_training_loop(
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 2291, in _inner_training_loop
    self._maybe_log_save_evaluate(tr_loss, grad_norm, model, trial, epoch, ignore_keys_for_eval)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 2721, in _maybe_log_save_evaluate
    metrics = self.evaluate(ignore_keys=ignore_keys_for_eval)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer_seq2seq.py", line 180, in evaluate
    return super().evaluate(eval_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3572, in evaluate
    output = eval_loop(
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3854, in evaluation_loop
    metrics = self.compute_metrics(EvalPrediction(predictions=all_preds, label_ids=all_labels))
  File "/mnt/pfs/nlp/zhangjie07/workspace/LLaMA-Factory/src/llamafactory/train/sft/metric.py", line 50, in compute_accuracy
    pred, label = preds[i, :-1], labels[i, 1:]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed
  1%|          | 20/3820 [1:06:10<209:33:28, 198.53s/it]
```
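For context, the failing line in metric.py assumes `preds` and `labels` are padded 2-D arrays. If the evaluation loop instead returns ragged sequences, NumPy stores them as a 1-D object array, and the 2-D indexing `preds[i, :-1]` raises exactly this IndexError. A minimal sketch reproducing the failure, with a row-wise workaround (this is an assumption about the cause for illustration, not necessarily what fd8cc49 changed):

```python
import numpy as np

# Equal-length sequences pad into a regular 2-D array: 2-D indexing works.
padded = np.array([[1, 2, 3], [4, 5, 6]])
print(padded[0, :-1])  # [1 2]

# Ragged sequences fall back to a 1-D object array of arrays.
ragged = np.array([np.array([1, 2, 3]), np.array([4, 5])], dtype=object)
try:
    ragged[0, :-1]  # two indices into a 1-dimensional array
except IndexError as err:
    # "too many indices for array: array is 1-dimensional, but 2 were indexed"
    print(err)

# Hypothetical workaround sketch: row-wise indexing works for both layouts.
for pred, label in zip(ragged, ragged):
    shifted_pred, shifted_label = pred[:-1], label[1:]
```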

Expected behavior

No response

Others

No response

@github-actions github-actions bot added the pending This problem is yet to be addressed label Jul 15, 2024
@oliverh32

Same issue here, for Mixtral 8x7B.

@hiyouga
Owner

hiyouga commented Jul 15, 2024

Could you try it again with fd8cc49?

@hiyouga hiyouga added solved This problem has been already solved and removed pending This problem is yet to be addressed labels Jul 15, 2024
@oliverh32

> Could you try it again with fd8cc49?

Thanks! It works.

xtchen96 pushed a commit to xtchen96/LLaMA-Factory that referenced this issue Jul 17, 2024