How to load unsharded pre-trained weights into hybrid parallel model? #3126
Hi @ShinoharaHare , could you please try |
I followed this example to build my own hybrid parallel model (TP + ZeRO + PP) for BLOOM, and it seems to work well so far.
But now I want to initialize the model with the pre-trained weights, which are unsharded. The problem is that I have no idea which slices of the unsharded weights correspond to the sharded weights.
I wonder whether there is a simple solution for this, or whether I need to convert the weights myself.
I would appreciate it if someone could help.
Edit:
I think this is similar to this issue #2770.
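To make the question concrete, here is a rough sketch of the slice arithmetic I have in mind. It assumes plain 1D tensor parallelism with an even, contiguous split, where rank `r` owns the `r`-th chunk of the split dimension. Whether the library actually lays out its shards this way is an assumption on my part. The helper names (`shard_bounds`, `shard_rows`, `shard_cols`) are made up for illustration and are not part of any library. Nested lists stand in for tensors so the snippet runs without dependencies; with real tensors the same slices would be applied by indexing.

```python
def shard_bounds(dim_size: int, world_size: int, rank: int) -> slice:
    """Return the contiguous slice of one dimension owned by `rank`.

    Assumption: the dimension is split evenly and in rank order.
    """
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside [0, {world_size})")
    if dim_size % world_size != 0:
        raise ValueError(f"{dim_size} is not divisible by {world_size}")
    chunk = dim_size // world_size
    return slice(rank * chunk, (rank + 1) * chunk)


def shard_rows(weight, world_size, rank):
    """Take this rank's rows, i.e. split dim 0 of an [out, in] weight."""
    return weight[shard_bounds(len(weight), world_size, rank)]


def shard_cols(weight, world_size, rank):
    """Take this rank's columns, i.e. split dim 1 of an [out, in] weight."""
    cols = shard_bounds(len(weight[0]), world_size, rank)
    return [row[cols] for row in weight]


# Toy 4x4 "unsharded" weight in [out_features, in_features] layout.
full = [[r * 4 + c for c in range(4)] for r in range(4)]

# With 2 tensor-parallel ranks, rank 1 would own the last two rows
# of a layer split on output features ...
rows_rank1 = shard_rows(full, world_size=2, rank=1)

# ... and rank 0 would own the first two columns of a layer split
# on input features.
cols_rank0 = shard_cols(full, world_size=2, rank=0)
```

One caveat I am unsure about: for fused weights such as a combined query/key/value projection, a contiguous chunk may not match what each rank expects, so those layers might need to be split per head rather than with a plain slice like this.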