
A few questions about evaluation #1316

Closed
ntudy opened this issue Oct 30, 2023 · 8 comments
Labels
solved This problem has been already solved

Comments

@ntudy

ntudy commented Oct 30, 2023

  1. Does evaluation currently only support MCQ-based benchmarks? If I just want results on single-turn dialogue, what should I do?
  2. Can evaluation be run on a raw model, i.e. without LoRA and without any checkpoint?
@hiyouga
Owner

hiyouga commented Oct 30, 2023

  1. For single-turn dialogue, you can use the predict feature to evaluate ROUGE scores.
  2. Simply omit the checkpoint_dir argument.
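For reference, ROUGE-N measures n-gram overlap between a generated reply and a reference answer. A minimal, self-contained sketch of ROUGE-1 F1 over whitespace tokens (the project's actual metric code may tokenize differently, e.g. for Chinese text):

```python
from collections import Counter

def rouge_n(prediction: str, reference: str, n: int = 1) -> float:
    """ROUGE-N F1: n-gram overlap between a generated reply
    and a reference answer, using whitespace tokenization."""
    def ngrams(text):
        tokens = text.split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    pred, ref = ngrams(prediction), ngrams(reference)
    if not pred or not ref:
        return 0.0
    overlap = sum((pred & ref).values())       # clipped n-gram matches
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(rouge_n("the cat sat on the mat", "the cat lay on the mat"))  # ≈ 0.833
```

This is only an illustration of the metric that predict reports, not the repository's implementation.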

@hiyouga hiyouga added the solved This problem has been already solved label Oct 30, 2023
@ntudy
Author

ntudy commented Oct 30, 2023

Hello, I ran inference with the arguments below, but the replies are not even in English. Is one of the arguments set incorrectly?

```shell
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --model_name_or_path meta-llama/Llama-2-7b-chat-hf \
    --do_predict \
    --dataset alpaca_gpt4_en \
    --template llama2 \
    --finetuning_type none \
    --output_dir output/llama2/test \
    --per_device_eval_batch_size 8 \
    --max_samples 8 \
    --predict_with_generate
```

[screenshot: garbled, non-English model output]

I saw in another issue that you mentioned the inference batch size for llama2 should be set to 1. I don't understand why, and after trying it the output is still the same garbled text.

@hiyouga hiyouga reopened this Oct 31, 2023
@hiyouga
Owner

hiyouga commented Oct 31, 2023

The problem has been fixed; please update the code and try again. The eval batch size still needs to be set to 1.

```shell
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --model_name_or_path meta-llama/Llama-2-7b-chat-hf \
    --do_predict \
    --dataset alpaca_gpt4_en \
    --template llama2 \
    --output_dir out/debug/llama2/eval \
    --max_new_tokens 128 \
    --per_device_eval_batch_size 1 \
    --max_samples 30 \
    --predict_with_generate
```

@XuanRen4470

> The problem has been fixed; please update the code and try again. The eval batch size still needs to be set to 1.

Could you explain why it has to be set to 1? I'd rather not set it to 1 because it is too slow.

@hiyouga
Owner

hiyouga commented Nov 1, 2023

@XuanRen4470 It's an issue with the llama2 model itself; with a batch size greater than 1 it overflows.
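The thread does not spell out the mechanism, but the overflow reported for batched llama2 inference is commonly understood to be fp16 numerical overflow (padded batches can push activations past half precision's finite range, producing inf/NaN logits and garbled text). That root cause is an assumption here, not something confirmed above. A standard-library-only sketch of the fp16 range limit:

```python
import struct

def to_fp16(x: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision
    using the struct 'e' (binary16) format."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

# 65504 is the largest finite fp16 value; it survives the round trip.
print(to_fp16(65504.0))  # 65504.0

# Larger magnitudes cannot be packed as finite fp16 numbers.
try:
    to_fp16(70000.0)
except OverflowError:
    print("fp16 overflow")
```

With batch size 1 there is no padding, which is presumably why the problem disappears at the cost of throughput.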

@XuanRen4470

> @XuanRen4470 It's an issue with the llama2 model itself; with a batch size greater than 1 it overflows.

Could you explain this overflow in more detail? I need to estimate how much it affects performance, because batch size 1 is really too slow. Thank you 🙏

@slliao445

> `per_device_eval_batch_size`

Can this argument only be set on the command line? Is it available in the train_web UI?

@XuanRen4470

> `per_device_eval_batch_size`
>
> Can this argument only be set on the command line? Is it available in the train_web UI?

The command line should work; I haven't used train_web.

4 participants