After updating the code, re-running finetune.sh fails with TypeError: init_process_group() got multiple values for keyword argument 'backend' #112
Comments
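finetune.py hasn't been changed in a month, so this is an old problem. If you are on a single GPU, don't use torchrun; just run the script with python directly.
I'm now getting this error too when running bash finetune_others_continue.sh. The error is raised at the call on line 237 of finetune.py.
@dizhenx This has nothing to do with which script you run. Use torchrun for multiple GPUs (all of our scripts default to multi-GPU); on a single GPU, don't use it and run with python directly.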
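The advice above (python for a single GPU, torchrun for several) can be sketched as a small launcher helper. This is only an illustration: `build_launch_command` is a hypothetical name, not part of this repository, and the real scripts pass additional arguments that are not shown here.

```python
import sys


def build_launch_command(num_gpus, script="finetune.py", script_args=()):
    """Return the argv list used to start training (hypothetical helper).

    Single GPU: plain Python interpreter, no distributed launcher.
    Multiple GPUs: torchrun with one process per GPU.
    """
    if num_gpus < 1:
        raise ValueError("at least one GPU is required")
    if num_gpus == 1:
        # No torchrun on a single card, as suggested in this thread.
        return [sys.executable, script, *script_args]
    return ["torchrun", f"--nproc_per_node={num_gpus}", script, *script_args]


single_gpu_cmd = build_launch_command(1)
multi_gpu_cmd = build_launch_command(4)
print(single_gpu_cmd)
print(multi_gpu_cmd)
```

The returned list can be handed to `subprocess.run(...)` if you want to start the job from Python rather than from a shell script.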
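@Facico Do you have any experience with training parameter combinations for the A100? # optimized for RTX 4090. for larger GPUs, increase some of these? MICRO_BATCH_SIZE = 4 # this could actually be 5 but i like powers of 2
@Facico Also, how fast is fine-tuning? Roughly how long does it take to fine-tune vicuna13b on a 4090 with 100K samples? Is there any data I can use as a reference?
Does this framework support fine-tuning vicuna13b?
@wangrui6 Generally you only need to adjust CUTOFF_LEN to fit your hardware. I don't remember exactly, but I think a few hundred thousand samples took about 200h.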
After pulling the latest code from git yesterday, running finetune.sh again makes torchrun report an error.
![image](https://user-images.githubusercontent.com/127492258/234479075-13733298-a20a-453c-a694-517b2ddce3d9.png)
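For readers unfamiliar with this class of error, here is a minimal sketch of how Python produces a "got multiple values for keyword argument" message: the same keyword is supplied both explicitly and through `**` unpacking. `init_process_group_stub` is a hypothetical stand-in, not the real `torch.distributed.init_process_group`, and this does not claim to show where the duplicate `backend` comes from in the libraries involved.

```python
def init_process_group_stub(backend=None, **kwargs):
    """Hypothetical stand-in used only to reproduce the error message."""
    return backend


def call_with_duplicate_backend():
    # 'backend' is passed explicitly AND is also present in the unpacked dict.
    extra_kwargs = {"backend": "nccl"}
    try:
        init_process_group_stub(backend="nccl", **extra_kwargs)
    except TypeError as exc:
        return str(exc)
    return None


message = call_with_duplicate_backend()
print(message)
```

Seeing this message usually means two layers of code both try to set the same argument, which is consistent with the error appearing after a library or code update.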
[Initial environment]
A100 * 1
accelerate 0.18.0
bitsandbytes 0.37.2
transformers 4.29.0.dev0
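When reporting an environment like the one above, the installed versions can be read programmatically. This is a minimal sketch using the standard library's `importlib.metadata`; `installed_versions` is a hypothetical helper name, and the package list simply mirrors the three packages named in this issue.

```python
from importlib import metadata


def installed_versions(package_names):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in package_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(["accelerate", "bitsandbytes", "transformers"])
for name, version in report.items():
    print(name, version if version is not None else "not installed")
```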
[Modified environment v1]
Ran pip install transformers==4.28.1
Result: still fails
[Modified environment v2]
Ran pip install git+https://github.com/huggingface/transformers@ff20f9cf3615a8638023bc82925573cb9d0f3560
Result: still fails, with the following error: