
Parameters in pretrain_yuan_13B #11

Closed
ztysdu opened this issue Feb 26, 2022 · 3 comments

Comments

@ztysdu

ztysdu commented Feb 26, 2022

NNODES=?
GPUS_PER_NODE=1
MASTER_PORT=?
NODE_RANK=?
MASTER_ADDR=?
DISTRIBUTED_ARGS="--nproc_per_node $GPUS_PER_NODE --nnodes $NNODES --node_rank $NODE_RANK --master_addr $MASTER_ADDR --master_port $MASTER_PORT"
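Hello, we would like to rent multiple GPUs on Matpool (矩池云) for training. How should we set parameters such as MASTER_PORT, MASTER_ADDR, NNODES, and NODE_RANK? Is there any reference material? Thanks!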

@zhaoxudong01-ieisystem
Contributor

https://pytorch.org/docs/stable/distributed.html
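Please read the relevant PyTorch documentation and Section 2 of the Yuan 1.0 paper. The parameter settings depend on your specific hardware environment.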
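
For readers with the same question, here is a minimal sketch of how these variables are commonly filled in for the PyTorch distributed launcher. It assumes a single rented machine with 4 GPUs; the GPU count, `localhost`, and port `6000` are illustrative assumptions, not values from the Yuan 1.0 scripts, and `WORLD_SIZE` is a helper variable added here for clarity.

```shell
#!/bin/bash
# Hypothetical single-machine setup: one node with 4 GPUs (adjust to the rented instance).
GPUS_PER_NODE=4          # number of GPUs, i.e. processes, to start on this machine
NNODES=1                 # total number of machines taking part in the job
NODE_RANK=0              # index of this machine, from 0 to NNODES-1; the master is 0
MASTER_ADDR=localhost    # address of the rank-0 machine; for multi-node use its reachable IP
MASTER_PORT=6000         # any free TCP port on the rank-0 machine (6000 is arbitrary)

# Total number of training processes across all machines.
WORLD_SIZE=$((GPUS_PER_NODE * NNODES))

DISTRIBUTED_ARGS="--nproc_per_node $GPUS_PER_NODE --nnodes $NNODES --node_rank $NODE_RANK --master_addr $MASTER_ADDR --master_port $MASTER_PORT"

echo "world size: $WORLD_SIZE"
echo "launcher args: $DISTRIBUTED_ARGS"
```

For a multi-machine job, the same script would typically run on every machine with identical NNODES, MASTER_ADDR, and MASTER_PORT values, and only NODE_RANK would differ (0 on the master, then 1, 2, and so on). Whether the cloud instances can reach each other on the chosen port depends on the provider's network setup.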

@Shawn-IEITSystems
Owner

@joe483 Have we answered your question?

@Shawn-IEITSystems
Owner

Training has been completed, so we are closing this issue.
