Megatron train #6

germanjke · 2023-05-03T12:35:36Z

Hi!

I have a question about train_megatron.py.

It seems to use the parallel mechanisms from fairscale, and I don't see any source code from the Megatron library.

Is this your own custom Megatron built on top of fairscale?

germanjke · 2023-05-03T12:55:36Z

fairscale.nn.model_parallel is forked from [Megatron-LM](https://github.com/NVIDIA/Megatron-LM), Copyright 2020, NVIDIA CORPORATION, licensed under [Apache License](http://www.apache.org/licenses/LICENSE-2.0).

I found this in the fairscale repo. So you mean you just import the Megatron-LM code through fairscale.nn.model_parallel?

germanjke · 2023-05-03T13:09:28Z

from fairscale.nn.model_parallel.layers import ParallelEmbedding, RowParallelLinear, ColumnParallelLinear

So yes, we are using these layers from fairscale, but fairscale.nn.model_parallel is forked from Megatron-LM, so in effect we are using Megatron-LM. I think that's your logic.
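
For anyone who wants to confirm this locally, here is a minimal sketch that reports which installed package a module would be loaded from. The helper name `module_origin` is hypothetical, and `megatron` as the top-level package name for a separate Megatron-LM install is an assumption. If the layers resolve to a path inside fairscale and `megatron` resolves to nothing, the parallel layers are coming from fairscale's fork rather than from a standalone Megatron-LM install.

```python
import importlib.util


def module_origin(name):
    """Return the file a module would be loaded from, or None if unavailable."""
    try:
        # For dotted names, find_spec imports the parent packages first.
        spec = importlib.util.find_spec(name)
    except ImportError:
        # A parent package (e.g. `fairscale`) is missing or failed to import.
        return None
    return spec.origin if spec is not None else None


# `megatron` is an assumed package name for a standalone Megatron-LM install.
for name in ("fairscale.nn.model_parallel.layers", "megatron"):
    print(name, "->", module_origin(name))
```

The printed paths depend on the local environment, so no particular output is implied here.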
