Dataset format #20

Open
lcykww opened this issue Nov 2, 2024 · 1 comment
lcykww commented Nov 2, 2024

RANK=8
deepspeed --num_gpus=8 --num_nodes=2 train.py \
    --base_model <LLAMA-2> --micro_batch_size 4 \
    --wandb_run_name mora_math_r8 --lora_target_modules q_proj,k_proj,v_proj,o_proj,gate_proj,down_proj,up_proj \
    --num_epochs 3 --deepspeed ds.config --wandb_project lora-math --lora_r $RANK --batch_size 128 \
    --data_path meta-math/MetaMath \
    --save_steps 3000 \
    --learning_rate 3e-4 --mora_type 6 \
    --logging_steps 5 --use_bf16 --use_16bit --use_mora

What format does the dataset need to be in for the command given in the README?

kongds (Owner) commented Nov 2, 2024

This script should automatically download the corresponding data from Hugging Face and preprocess it.
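
For anyone who wants to see roughly what "download and preprocess" could look like, here is a minimal sketch. It is not taken from train.py. The column names `query` and `response`, the prompt template, and the helper names `format_example` and `load_and_format` are all assumptions made for illustration, so check the dataset card and the script itself for the real ones. The only library call used is `datasets.load_dataset`, which fetches a dataset from the Hugging Face Hub by name.

```python
from typing import Dict


def format_example(
    record: Dict[str, str],
    query_key: str = "query",        # assumed column name, not confirmed by the repo
    response_key: str = "response",  # assumed column name, not confirmed by the repo
) -> Dict[str, str]:
    """Turn one raw record into a prompt/completion pair (hypothetical template)."""
    prompt = f"Question: {record[query_key].strip()}\nAnswer: "
    return {"prompt": prompt, "completion": record[response_key].strip()}


def load_and_format(data_path: str, split: str = "train"):
    """Download a Hub dataset by name and add prompt/completion columns."""
    # Imported lazily so format_example can be used without the library installed.
    from datasets import load_dataset

    raw = load_dataset(data_path, split=split)  # downloads and caches on first use
    return raw.map(format_example)              # adds the two new columns per row
```

Under these assumptions, calling `load_and_format` with the same value passed to `--data_path` would return a dataset with `prompt` and `completion` columns added, so no manual conversion step is needed before training.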
