When running MMLU with eval, one result is missing from results.json #3837

Closed
1 task done
BGFGB opened this issue May 21, 2024 · 1 comment

Labels
solved This problem has been already solved

Comments


BGFGB commented May 21, 2024

Reminder

  • I have read the README and searched the existing issues.

Reproduction

When running the MMLU evaluation, one entry is missing from the results in results.json.
Each subject in results.json has only four answers, as shown below:

  "abstract_algebra": {
    "0": "B",
    "1": "C",
    "2": "A",
    "3": "A"
  },
  "anatomy": {
    "0": "D",
    "1": "C",
    "2": "C",
    "3": "B"
  },
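One way to confirm the shortfall across every subject is to count the predictions per subject. The sketch below is only illustrative: `count_answers` is a hypothetical helper (not part of LLaMA-Factory), and it assumes results.json keeps the subject -> {index: answer} layout shown in the excerpt above. It runs on an inline copy of that excerpt rather than the real file.

```python
import json

# Inline copy of the excerpt above; in practice, load the real file with json.load().
RESULTS_EXCERPT = """
{
  "abstract_algebra": {"0": "B", "1": "C", "2": "A", "3": "A"},
  "anatomy": {"0": "D", "1": "C", "2": "C", "3": "B"}
}
"""


def count_answers(results):
    """Return {subject: number of predicted answers} (hypothetical helper)."""
    return {subject: len(answers) for subject, answers in results.items()}


counts = count_answers(json.loads(RESULTS_EXCERPT))
for subject, n in counts.items():
    print(f"{subject}: {n} answers")
```

Any subject whose count is one lower than the number of rows in its CSV file points to a dropped row.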

Each subject in the MMLU data should have five entries.
abstract_algebra looks like this:

Find all c in Z_3 such that Z_3[x]/(x^2 + c) is a field.,0,1,2,3,B
"Statement 1 | If aH is an element of a factor group, then |aH| divides |a|. Statement 2 | If H and K are subgroups of G then HK is a subgroup of G.","True, True","False, False","True, False","False, True",B
Statement 1 | Every element of a group generates a cyclic subgroup of the group. Statement 2 | The symmetric group S_10 has 10 elements.,"True, True","False, False","True, False","False, True",C
Statement 1| Every function from a finite set onto itself must be one to one. Statement 2 | Every subgroup of an abelian group is abelian.,"True, True","False, False","True, False","False, True",A
Find the characteristic of the ring 2Z.,0,3,12,30,A

anatomy looks like this:

What is the embryological origin of the hyoid bone?,The first pharyngeal arch,The first and second pharyngeal arches,The second pharyngeal arch,The second and third pharyngeal arches,D
Which of these branches of the trigeminal nerve contain somatic motor processes?,The supraorbital nerve,The infraorbital nerve,The mental nerve,None of the above,D
The pleura,have no sensory innervation.,are separated by a 2 mm space.,extend into the neck.,are composed of respiratory epithelium.,C
In Angle's Class II Div 2 occlusion there is,excess overbite of the upper lateral incisors.,negative overjet of the upper central incisors.,excess overjet of the upper lateral incisors.,excess overjet of the upper central incisors.,C
Which of the following is the body cavity that contains the pituitary gland?,Abdominal,Cranial,Pleural,Spinal,B

The command I ran:

python run/eval.py llama3_lora_eval.yaml

run/eval.py is just a thin wrapper around the run_eval() function, as shown below:

from llamafactory.eval.evaluator import run_eval
run_eval()

llama3_lora_eval.yaml is as follows:

### model
model_name_or_path: /opt/gfbai/models/Meta-Llama-3-8B-Instruct
adapter_name_or_path: saves/llama3-8b/lora/sft/checkpoint-54

### method
finetuning_type: lora

### dataset
task: mmlu
split: train
template: fewshot
lang: en
n_shot: 5

### output
save_dir: saves/llama3-8b/lora/eval

### eval
batch_size: 2
download_mode: force_redownload

Expected behavior

Output the complete set of answers, five per subject.

System Info

  • transformers version: 4.40.2
  • Platform: Linux-5.15.0-105-generic-x86_64-with-glibc2.35
  • Python version: 3.10.14
  • Huggingface_hub version: 0.23.0
  • Safetensors version: 0.4.3
  • Accelerate version: 0.30.1
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.3.0+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Others

The problem appears to be in the MMLU dataset script.
In the file evaluation/mmlu/mmlu.py:

def _generate_examples(self, filepath):
    df = pd.read_csv(filepath)
    df.columns = ["question", "A", "B", "C", "D", "answer"]

    for i, instance in enumerate(df.to_dict(orient="records")):
        yield i, instance

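Changing it to the following fixes the problem (the CSV files have no header row, so without header=None pandas consumes the first question as the column names):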

def _generate_examples(self, filepath):
    df = pd.read_csv(filepath, header=None)
    df.columns = ["question", "A", "B", "C", "D", "answer"]

    for i, instance in enumerate(df.to_dict(orient="records")):
        yield i, instance
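
The effect of `header=None` can be reproduced without the evaluation pipeline. The sketch below is a minimal illustration, not the project's code: `load_rows`, `CSV_TEXT`, and `COLUMNS` are names made up for this example, and the two rows are copied from the samples quoted earlier in this issue. It reads the same header-less CSV text once with the pandas default (`header="infer"`) and once with `header=None`.

```python
import io

import pandas as pd

# Two header-less rows in the same six-column layout as the MMLU CSV files.
CSV_TEXT = (
    "Find the characteristic of the ring 2Z.,0,3,12,30,A\n"
    "Which of the following is the body cavity that contains the pituitary gland?,"
    "Abdominal,Cranial,Pleural,Spinal,B\n"
)
COLUMNS = ["question", "A", "B", "C", "D", "answer"]


def load_rows(csv_text, header):
    """Parse csv_text with the given `header` setting and return row dicts."""
    df = pd.read_csv(io.StringIO(csv_text), header=header)
    df.columns = COLUMNS
    return df.to_dict(orient="records")


# Default behaviour: the first data row is consumed as the header.
default_rows = load_rows(CSV_TEXT, header="infer")
# With header=None every line is kept as data.
fixed_rows = load_rows(CSV_TEXT, header=None)

print(len(default_rows), len(fixed_rows))
```

With the default setting the first question is silently dropped, which matches the four-instead-of-five answers seen in results.json.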

hiyouga added the pending (This problem is yet to be addressed) label May 21, 2024

hiyouga (Owner) commented May 29, 2024

fixed

hiyouga added the solved (This problem has been already solved) label and removed the pending (This problem is yet to be addressed) label May 29, 2024