Large model requires 4-10x time to process long file. Suggestions to improve time? #1747
-
Hey there! Thanks a lot for this transcription/translation model; it is proving very valuable for our business by helping us transcribe our important meetings.

What is the issue?
The large model (
Both of which are not in use outside of said transcriptions.

What is the expected behaviour?
The large model should perform at the same time-to-transcription ratio as documented.

Discussion
I hypothesize that the difference from the documented speeds is related to the hardware used; after all, neural networks are typically run on GPUs, not CPUs. Could this be the case? Alternatively, if not, is there some parameter or execution context that must be established for the program to access the machine's full capability (like a flag to use the GPU, or a permission the process needs in order to use the GPU)? Additionally, would chunking the video into multiple smaller audio files improve speed without degrading the quality of the output? Thanks a lot for your help!
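As a minimal sketch of the "flag to use the GPU" idea: with the openai-whisper package there is no special permission to grant, but you can check whether PyTorch can see a CUDA GPU and pass the device explicitly. The `pick_device` helper below is a hypothetical name of my own, not part of Whisper's API.

```python
def pick_device():
    """Return "cuda" if a CUDA-capable GPU is visible to PyTorch, else "cpu"."""
    try:
        import torch
        return "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        # PyTorch not installed; Whisper itself would not run in this case.
        return "cpu"

device = pick_device()
print(f"Whisper would run on: {device}")

# Usage sketch (assumes the openai-whisper package is installed):
#   import whisper
#   model = whisper.load_model("large", device=device)
#   result = model.transcribe("meeting_audio.mp3")
```

If `pick_device()` reports "cpu" on a machine that has a GPU, the likely culprit is a CPU-only PyTorch build rather than anything Whisper-specific.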
Replies: 1 comment 5 replies
-
The documentation doesn't give time-to-transcription ratios; it gives only relative speed. Thus, however fast the large model happens to run on your particular hardware, the medium model will be ~2x faster than that, the small model ~6x faster, and so on. Of course the large model will run faster on faster hardware, but the medium model will still run ~2x faster than it on that same hardware.

How fast does Whisper run on particular hardware? There are so many hardware choices that it would be prohibitively expensive to buy and test them all, but individual users have posted discussions here reporting how fast Whisper runs on their hardware. You can search this discussion board or the broader internet for Whisper benchmarks on different hardware. If you search the discussion board, you can also find discussions about speeding up Whisper on the same hardware.
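The relative-speed arithmetic above can be sketched as a small calculation: given how long the large model takes on *your* hardware, you can estimate the other models' runtimes. The ~2x and ~6x factors are the approximate relative speeds quoted in the reply (matching the model table in Whisper's README); the function name is my own.

```python
# Approximate speed of each model relative to the large model (large = 1.0).
RELATIVE_SPEED = {"large": 1.0, "medium": 2.0, "small": 6.0}

def estimated_runtime(large_minutes, model):
    """Estimated wall-clock minutes for `model`, given the large model's time."""
    return large_minutes / RELATIVE_SPEED[model]

# Example: if the large model takes 120 minutes on your machine,
# the medium model should take roughly 60 and the small model roughly 20.
for name in RELATIVE_SPEED:
    print(f"{name}: ~{estimated_runtime(120, name):.0f} min")
```

The point is that these estimates are anchored to a measurement on your own hardware, not to any absolute figure in the documentation.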