On line 63 of auto_grade_error_reasons.py, the code `to_be_graded_data = [data for data in eval_data if data['error_reason_correctness'] != 'N/A']` references `error_reason_correctness`, but when I run eval_open_source_models.py, the output JSON does not contain `error_reason_correctness`. Why? Is this a code bug or not?
Oh, thanks for pointing this out. The `error_reason_correctness` field is actually a manually labelled field produced by our annotators, who judged the correctness of the error reasons returned by the evaluated models. `auto_grade_error_reasons.py` is meant to use GPT-4 to replace that human effort. However, in the paper we used the human labels to verify the correctness of the GPT-4 labelling, which is why the script filters on this field. For your case, you can safely ignore this field, thanks : )
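If you want to run the grading script directly on output from eval_open_source_models.py, one minimal sketch (my own suggestion, not an official fix from the repo) is to make the filter on line 63 tolerant of the missing field by using `dict.get`, so entries without the annotation are kept instead of raising a `KeyError`. The file path below is illustrative:

```python
import json

# Load the eval output produced by eval_open_source_models.py
# (the filename here is just a placeholder).
with open('eval_results.json') as f:
    eval_data = json.load(f)

# Hypothetical tweak to line 63: dict.get treats a missing
# 'error_reason_correctness' key as "keep the entry", since that
# field only exists in the manually annotated data.
to_be_graded_data = [
    data for data in eval_data
    if data.get('error_reason_correctness', '') != 'N/A'
]
```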