
A few questions about evaluation #1316

Closed
ntudy opened this issue Oct 30, 2023 · 8 comments
Labels
solved This problem has been already solved

Comments

@ntudy

ntudy commented Oct 30, 2023

  1. Does evaluation currently only support MCQ-based benchmarks? If I just want results on single-turn dialogue, what should I do?
  2. Can evaluation be run on a raw model, i.e. without LoRA and without any checkpoint?
@hiyouga
Owner

hiyouga commented Oct 30, 2023

  1. For single-turn dialogue, you can use the predict feature to evaluate ROUGE scores.
  2. Simply omit the checkpoint_dir argument.
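For reference, ROUGE-N measures n-gram overlap between a generated reply and a reference answer. A minimal, self-contained sketch of ROUGE-1 F1 over whitespace tokens (the project's actual metric code may tokenize differently, e.g. for Chinese text):

```python
from collections import Counter

def rouge_n(prediction: str, reference: str, n: int = 1) -> float:
    """ROUGE-N F1: n-gram overlap between a generated reply
    and a reference answer, using whitespace tokenization."""
    def ngrams(text):
        tokens = text.split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    pred, ref = ngrams(prediction), ngrams(reference)
    if not pred or not ref:
        return 0.0
    overlap = sum((pred & ref).values())       # clipped n-gram matches
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(rouge_n("the cat sat on the mat", "the cat lay on the mat"))  # ≈ 0.833
```

This is only an illustration of the metric that predict reports, not the repository's implementation.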

@hiyouga hiyouga added the solved This problem has been already solved label Oct 30, 2023
@ntudy
Author

ntudy commented Oct 30, 2023

Hello, I ran inference with the arguments below, but the replies are not even in English. Is one of the arguments set incorrectly?

```shell
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --model_name_or_path meta-llama/Llama-2-7b-chat-hf \
    --do_predict \
    --dataset alpaca_gpt4_en \
    --template llama2 \
    --finetuning_type none \
    --output_dir output/llama2/test \
    --per_device_eval_batch_size 8 \
    --max_samples 8 \
    --predict_with_generate
```

[screenshot: garbled, non-English model output]

I saw in another issue that you mentioned the inference batch size for llama2 should be set to 1. I don't understand why, and after trying it the output is still the same garbled text.

@hiyouga hiyouga reopened this Oct 31, 2023
@hiyouga
Owner

hiyouga commented Oct 31, 2023

The problem has been fixed; please update the code and try again. The eval batch size still needs to be set to 1.

```shell
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
    --stage sft \
    --model_name_or_path meta-llama/Llama-2-7b-chat-hf \
    --do_predict \
    --dataset alpaca_gpt4_en \
    --template llama2 \
    --output_dir out/debug/llama2/eval \
    --max_new_tokens 128 \
    --per_device_eval_batch_size 1 \
    --max_samples 30 \
    --predict_with_generate
```

@XuanRen4470

> The problem has been fixed; please update the code and try again. The eval batch size still needs to be set to 1.

Could you explain why it has to be set to 1? I'd rather not set it to 1 because it is too slow.

@hiyouga
Owner

hiyouga commented Nov 1, 2023

@XuanRen4470 It's an issue with the llama2 model itself; with a batch size greater than 1 it overflows.
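The thread does not spell out the mechanism, but the overflow reported for batched llama2 inference is commonly understood to be fp16 numerical overflow (padded batches can push activations past half precision's finite range, producing inf/NaN logits and garbled text). That root cause is an assumption here, not something confirmed above. A standard-library-only sketch of the fp16 range limit:

```python
import struct

def to_fp16(x: float) -> float:
    """Round-trip a Python float through IEEE 754 half precision
    using the struct 'e' (binary16) format."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

# 65504 is the largest finite fp16 value; it survives the round trip.
print(to_fp16(65504.0))  # 65504.0

# Larger magnitudes cannot be packed as finite fp16 numbers.
try:
    to_fp16(70000.0)
except OverflowError:
    print("fp16 overflow")
```

With batch size 1 there is no padding, which is presumably why the problem disappears at the cost of throughput.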

@XuanRen4470

> @XuanRen4470 It's an issue with the llama2 model itself; with a batch size greater than 1 it overflows.

Could you explain this overflow in more detail? I need to estimate how much it affects performance, because batch size 1 is really too slow. Thank you 🙏

@slliao445

> `per_device_eval_batch_size`

Can this argument only be set on the command line? Is it available in the train_web UI?

@XuanRen4470

> `per_device_eval_batch_size`
>
> Can this argument only be set on the command line? Is it available in the train_web UI?

The command line should work; I haven't used train_web.

4 participants