NNODES=?
GPUS_PER_NODE=1
MASTER_PORT=?
NODE_RANK=?
MASTER_ADDR=?
DISTRIBUTED_ARGS="--nproc_per_node $GPUS_PER_NODE --nnodes $NNODES --node_rank $NODE_RANK --master_addr $MASTER_ADDR --master_port $MASTER_PORT"

Hello, we would like to rent multiple GPUs on Matpool (矩池云) for training. How should we set parameters such as MASTER_PORT, MASTER_ADDR, NNODES, and NODE_RANK? Is there any reference material? Thanks!
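https://pytorch.org/docs/stable/distributed.html Please read the relevant PyTorch documentation and Section 2 of the Yuan 1.0 paper. The right parameter values depend on your specific hardware environment.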
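As a rough illustration of what these variables usually mean, here is a minimal sketch for a hypothetical setup of two machines with one GPU each. The IP address `10.0.0.1` and port `29500` are placeholders, not values from this thread, and the sketch assumes the arguments are passed to `python -m torch.distributed.launch`; adapt it to whatever launcher your training script actually uses.

```shell
#!/usr/bin/env bash
# Hypothetical example: 2 machines, 1 GPU per machine. All values are placeholders.
NNODES=2              # total number of machines taking part in training
GPUS_PER_NODE=1       # number of processes (one per GPU) started on each machine
MASTER_ADDR=10.0.0.1  # assumed IP of the rank-0 machine; must be reachable from every node
MASTER_PORT=29500     # any free TCP port on the rank-0 machine; identical on every node
NODE_RANK=0           # 0 on the rank-0 machine, 1 on the second machine

# Total number of training processes across all machines.
WORLD_SIZE=$((NNODES * GPUS_PER_NODE))

DISTRIBUTED_ARGS="--nproc_per_node $GPUS_PER_NODE --nnodes $NNODES --node_rank $NODE_RANK --master_addr $MASTER_ADDR --master_port $MASTER_PORT"

# Print the command instead of running it, so the values can be checked first.
echo "world size: $WORLD_SIZE"
echo "python -m torch.distributed.launch $DISTRIBUTED_ARGS <training_script> <script_args>"
```

Run the same script on every machine, changing only NODE_RANK. For a single machine with several GPUs, the usual choice is NNODES=1, NODE_RANK=0, MASTER_ADDR=localhost, and GPUS_PER_NODE set to the number of GPUs on that machine.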
@joe483 Did we answer your question?
Training has been completed; closing this issue.