Reminder
System Info
transformers version: 4.41.1
Platform: Linux-5.15.0-107-generic-x86_64-with-glibc2.35
Python version: 3.10.12
Huggingface_hub version: 0.23.2
Safetensors version: 0.4.3
Accelerate version: 0.29.3
Accelerate config: - compute_environment: LOCAL_MACHINE
PyTorch version (GPU?): 2.2.1+cu121 (True)
Tensorflow version (GPU?): not installed (NA)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using GPU in script?: True
Using distributed or parallel set-up in script?: False
Reproduction
```
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train \
    --stage dpo \
    --pref_loss simpo \
    --simpo_gamma 1.0 \
    --do_train True \
    --model_name_or_path /home/ubuntu/date/llama_ckpts/llama_lx3_ckpts/BAdam_llama3_random_lr1e-6/checkpoint-9600 \
    --preprocessing_num_workers 16 \
    --finetuning_type full \
    --template default \
    --flash_attn auto \
    --dataset_dir data \
    --dataset ultrafeedback_binarized \
    --split train \
    --cutoff_len 2048 \
    --learning_rate 2e-7 \
    --num_train_epochs 1.0 \
    --max_samples 10000000 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 1 \
    --save_steps 100 \
    --warmup_ratio 0.1 \
    --optim adamw_torch \
    --packing False \
    --report_to none \
    --use_badam True \
    --output_dir saves/LLaMA3-8B/full/train_2024-06-05 \
    --pure_bf16 True \
    --plot_loss True \
    --badam_mode layer \
    --badam_switch_mode random \
    --badam_switch_interval 100 \
    --val_size 0.05 \
    --evaluation_strategy steps \
    --eval_steps 20
```
Same as #4085.
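For context on what `--pref_loss simpo` and `--simpo_gamma` control, below is a minimal sketch of the SimPO objective: a length-normalized log-probability margin with a fixed target margin gamma. The function and tensor names are illustrative assumptions, not LLaMA-Factory's actual implementation.

```python
import torch
import torch.nn.functional as F

def simpo_loss(chosen_logps, rejected_logps, chosen_lens, rejected_lens,
               beta=2.0, gamma=1.0):
    """Sketch of the SimPO objective (cf. --simpo_gamma above).

    chosen_logps / rejected_logps: summed token log-probs per sequence.
    chosen_lens / rejected_lens: response lengths in tokens.
    All four are 1-D tensors of shape (batch,); names are hypothetical.
    """
    # Implicit reward = average per-token log-prob, scaled by beta.
    chosen_reward = beta * chosen_logps / chosen_lens
    rejected_reward = beta * rejected_logps / rejected_lens
    # SimPO is reference-free: unlike DPO, there is no frozen reference-model term.
    logits = chosen_reward - rejected_reward - gamma
    return -F.logsigmoid(logits).mean()
```

Relative to DPO, the only moving parts are the length normalization and the fixed margin gamma; no reference model is needed.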
Expected behavior
ultrafeedback_binarized is a widely used alignment dataset in research; could support for it be added?
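In case it helps, here is a hedged sketch of pulling the dataset from the Hub and flattening it into a local prompt/chosen/rejected JSON file for pairwise training. The repo id, the `train_prefs` split name, the message-list column layout, and the `dataset_info.json` fields shown in the trailing comment are all assumptions to be checked against the current LLaMA-Factory schema, not a verified recipe.

```python
import json
from datasets import load_dataset

# Assumption: HuggingFaceH4/ultrafeedback_binarized exposes a "train_prefs" split
# whose "chosen"/"rejected" columns are chat-message lists ending with the
# assistant turn. Verify the actual schema before relying on this.
ds = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

records = []
for row in ds:
    records.append({
        "instruction": row["prompt"],
        "chosen": row["chosen"][-1]["content"],      # final assistant message
        "rejected": row["rejected"][-1]["content"],
    })

with open("data/ultrafeedback_binarized.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)

# Hypothetical data/dataset_info.json entry (field names are an assumption):
# "ultrafeedback_binarized": {
#     "file_name": "ultrafeedback_binarized.json",
#     "ranking": true,
#     "columns": {"prompt": "instruction", "chosen": "chosen", "rejected": "rejected"}
# }
```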
Others
No response