-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
关于利用simpo在ultrafeedback_binarized 数据集上进行偏好对齐 #4085
Comments
@Meaquadddd 最后这个问题如何解决呢 |
@hiyouga ultrafeedback_binarized是研究中常用的对齐数据集,能否提供支持? |
使用 |
我最后自己手动把源码做了下改动 |
请问可以分享一下具体修改了哪些地方吗? 非常感谢!! |
现在应该直接在脚本里设置 |
Reminder
System Info
transformers
version: 4.41.1- distributed_type: DEEPSPEED
- mixed_precision: bf16
- use_cpu: False
- debug: True
- num_processes: 3
- machine_rank: 0
- num_machines: 1
- rdzv_backend: static
- same_network: True
- main_training_function: main
- enable_cpu_affinity: False
- deepspeed_config: {'gradient_accumulation_steps': 2, 'zero3_init_flag': False, 'zero_stage': 0}
- downcast_bf16: no
- tpu_use_cluster: False
- tpu_use_sudo: False
- tpu_env: []
Reproduction
我尝试运行下面这段代码,利用simpo在ultrafeedback_binarized 上进行偏好对齐
由于ultrafeedback_binarized在dataset_info.json 中没有定义,我按照相关的readme文档定义了
huggingface上的ultrafeedback_binarized 数据集格式是
运行上面的命令之后打印报错
Expected behavior
猜测是由于多出来的
字段导致数据集不规范?
Others
上面的代码将dataset 修改为 dpo_en_demo后方可正常运行
No response
The text was updated successfully, but these errors were encountered: