
Why do you need 2 GPUs to run Qwen2.5 3B? #33

Open
rpking1107 opened this issue Feb 1, 2025 · 4 comments

Comments

@rpking1107

Sorry if this is a stupid question, but my 4090 has 24 GB of VRAM, which should be more than enough to run a 3B model, right? Can I use Qwen2.5 3B and change the parameter to GPU=1?

Thanks, guys

@Superskyyy

Training is different from inference; it needs far more VRAM. It's safe to say your 4090 is not enough.
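To see why training needs so much more memory than inference, here is a back-of-envelope sketch (my own rough model, not from this repo): inference only holds the bf16 weights, while mixed-precision AdamW training also holds bf16 gradients plus fp32 master weights and two fp32 optimizer moments, before counting activations or CUDA overhead.

```python
def vram_gb(n_params: float, mode: str) -> float:
    """Lower-bound VRAM estimate (GB) for model states only.

    Ignores activations, KV cache, and framework overhead, so real
    usage is higher. Assumes bf16 weights/gradients and a
    mixed-precision AdamW with fp32 master weights + two fp32 moments.
    """
    GB = 1024 ** 3
    if mode == "inference":
        per_param = 2                    # bf16 weights only
    elif mode == "train":
        per_param = 2 + 2 + 4 + 4 + 4    # weights + grads + master + m + v
    else:
        raise ValueError(f"unknown mode: {mode}")
    return n_params * per_param / GB

# Treating Qwen2.5 3B as roughly 3e9 parameters:
print(round(vram_gb(3e9, "inference"), 1))  # ~5.6 GB
print(round(vram_gb(3e9, "train"), 1))      # ~44.7 GB
```

Even this optimistic lower bound for full fine-tuning already exceeds a 24 GB card, which is why memory-saving tricks (smaller batches, gradient checkpointing, offloading) come up in the comments below.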

@zacksiri

zacksiri commented Feb 2, 2025

@rpking1107 I was able to get training running on my A4500 with 20 GB of VRAM; you just need to adjust the training parameters. It will take longer because of the smaller batch size, but it does work.
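One common way to "adjust the training parameters" without changing the optimization itself (a general technique, not necessarily what this repo's scripts expose) is to shrink the per-device batch so it fits in memory and raise gradient accumulation so the effective batch per optimizer step stays the same. The specific numbers below are hypothetical:

```python
def effective_batch(per_device_batch: int, grad_accum_steps: int,
                    num_gpus: int = 1) -> int:
    """Effective batch size per optimizer step.

    Lowering per_device_batch (to fit a smaller GPU) can be offset by
    raising grad_accum_steps, keeping the optimization trajectory
    comparable at the cost of more forward/backward passes per step.
    """
    return per_device_batch * grad_accum_steps * num_gpus

# A hypothetical 2-GPU recipe vs. a single-GPU variant
# with the same effective batch of 64:
two_gpu = effective_batch(8, 4, num_gpus=2)
one_gpu = effective_batch(4, 16, num_gpus=1)
print(two_gpu, one_gpu)  # 64 64
```

The trade-off zacksiri mentions falls out directly: the single-GPU variant runs 16 accumulation passes per step instead of 4, so wall-clock time per step grows even though the effective batch is unchanged.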

@carlos-aguayo

@zacksiri Can you share what parameters you used? I tried lowering the batch sizes and a few other settings and still ran out of memory.

@zacksiri

zacksiri commented Feb 2, 2025

@carlos-aguayo You can see my parameters here.

#5 (comment)

It works with Qwen2.5 1.5B. I'm not sure I have the best configuration or output yet; I need to try a few more experiments.

I'm experimenting with the 3B model, but I think I'm pushing my luck.
