Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Core][Distributed] refactor pynccl to hold multiple communicators #4591
[Core][Distributed] refactor pynccl to hold multiple communicators #4591
Changes from 30 commits
9c6130a
1493243
cadcd02
7918798
5924038
813b047
b244e6c
7e15c98
e610f64
8480995
5ed6f07
8134287
e65e9ef
c8b6fc0
c7a2f0c
75a8d11
59c064e
16aeef1
4710fc3
c8542ec
b2d2661
67d1d9a
c86199c
0030a31
49f6d91
38b148b
d241480
d7209f1
7b55026
ee734b1
0516956
9f63bf8
e9aa766
0f64301
a64962e
d2f83ba
12f309b
68e448c
ad6f840
e2153b2
80aca94
c1b1cdb
c4e3b0f
70a7e26
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dumb question: Why don't we import these at the top of the module?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if the variable is used outside the module where it's defined, the variable name should not start with
_
. If you want to make this private, why don't we use getters likeget_tp_pynccl_communicator
instead?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this variable is
None
when the module gets initialized. Import these at the top of the module will getNone
. It will not be updated as the variable in thevllm.distributed.parallel_state
.I actually want to merge
parallel_state
andcommunication_op
.communication_op
needs to access many private variable inparallel_state
. That's another story though.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we use a get method then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a get method is doable. however, if the majority callers of these getter method come from
communication_op
, it's better to simply merge two files, imo.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe yes. But if we do not merge the files in this PR, I think using the getter method is a good idea. The current code has two problems: 1) It must be lazily imported for the reason you mentioned above, and 2) the variable name starts with
_
. I believe using the getter method is a simple solution to the two problems.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want this PR to be super complicated. I will merge the files in the next days, let's don't add new user-facing getter functions just for several days and remove them later 🤣
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I think in this case we should add the getter function even if there's a change that it can be deleted just after several days. This is because we don't have a concrete timeline for the next PR you are thinking of. It can be delayed or de-prioritized any way. I believe adding the getter function doesn't hurt us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added in 70a7e26 .