Optimal hardware setup (GPUs) #2296
-
You need more VRAM per GPU to fit bigger batch sizes.
Adding another 8 GB GPU won't really help.
You need to fit 16 or 32 batched samples 3 times on each GPU, so aim for 24 GB for the biggest batch sizes (64, 128), although 12 GB or 11 GB might just cut it for you.
Only then can you think about adding more GPUs of the same size to make training even faster.
Try to limit the length of each sample (e.g. prefer 10 s over 20 s). Sample length is the biggest factor in how much memory a batch takes, so use it carefully.
Use automatic mixed precision (AMP) to enable bigger batch sizes and speed up training, but use it only while exploring different parameters. Once you are happy with your tests, run a final training without AMP to export your final model.
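For reference, here is a minimal sketch of how AMP can be enabled in a plain PyTorch training loop. The model, optimizer, and data below are illustrative placeholders, not this project's trainer; if your framework already exposes a mixed-precision option in its config, prefer that instead.

```python
import torch
import torch.nn as nn

# Toy stand-ins for illustration only; substitute your real model and data pipeline.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(80, 256), nn.ReLU(), nn.Linear(256, 80)).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.MSELoss()
use_amp = device == "cuda"  # AMP only pays off on CUDA

# GradScaler scales the loss so fp16 gradients do not underflow.
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

for step in range(10):  # stands in for iterating over your DataLoader
    inputs = torch.randn(32, 80, device=device)   # fake batch of 32 samples
    targets = torch.randn(32, 80, device=device)

    optimizer.zero_grad()

    # Forward pass runs in mixed precision: fp16 where safe, fp32 elsewhere.
    with torch.cuda.amp.autocast(enabled=use_amp):
        outputs = model(inputs)
        loss = criterion(outputs, targets)

    # Backward on the scaled loss, then unscale and step the optimizer.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```

Because half-precision activations take roughly half the memory, AMP usually lets you fit a noticeably larger batch on the same GPU, which is why it is suggested here for the exploration phase.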
-
I am trying to speed up my training process. Current setup: 1 RTX 2080 GPU with 8 GB VRAM (server with 12 CPUs and 32 GB RAM), and I am training on over 1k hours of data. Unfortunately, the process is very slow: it takes almost a day for a single epoch, and I can only use a batch size of 1 (over 4 GB of GPU RAM are used). My questions here are:
I would appreciate any advice or details from your experience.
Thank you in advance :))