How to use Coqui STT for a speech-to-text server (in NodeJs) #1870
-
Some progress: with the new version of coquittsjs, I partially "solved" the multithreading need. I'd still like to renew my question 1: is the Coqui STT Model loaded once in memory (as a shared library)? Thanks
-
The model is not loaded as a shared library. When using the
-
Hi Reuben, the mmap keyword clarifies the point! As far as I understand, it behaves like non-persistent shared memory, and it works well for me. On my Linux machine, the first time a program loads the (huge) model into RAM, the loading latency is ~20 seconds. Thanks
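To compare the "first load" latency with later loads, a small timing helper can wrap whatever call creates the model. The sketch below is only an illustration: `measureLoad` and `fakeLoadModel` are hypothetical names, and the placeholder allocates a buffer instead of loading a real Coqui STT model. Note also that for a memory-mapped file, the resident set size (RSS) reported per process only reflects the pages touched so far, and those pages can be shared between processes, so the delta is a rough indication rather than an exact model size.

```javascript
// Measure wall-clock time and RSS growth around a synchronous "load" call.
// loadFn is any function that loads something into memory and returns it.
function measureLoad(loadFn) {
  const rssBefore = process.memoryUsage().rss;
  const start = process.hrtime.bigint();
  const result = loadFn();
  const millis = Number(process.hrtime.bigint() - start) / 1e6;
  const rssDeltaBytes = process.memoryUsage().rss - rssBefore;
  return { result, millis, rssDeltaBytes };
}

// Hypothetical placeholder for model loading: allocate and fill 32 MiB.
// In a real test this would be replaced by the call that creates the model.
const fakeLoadModel = () => Buffer.alloc(32 * 1024 * 1024, 1);

const stats = measureLoad(fakeLoadModel);
console.log(`load time: ${stats.millis.toFixed(1)} ms`);
console.log(`RSS delta: ${(stats.rssDeltaBytes / (1024 * 1024)).toFixed(1)} MiB`);
```

Running the same script twice in a row should show whether the second load benefits from pages already cached by the operating system.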
-
Yes, that's the reason I'm working on setting up a multithreaded architecture on top of the native APIs: My doubt now is whether multiple threads can concurrently access the model. I guess they can, simply because the model is memory-mapped (mmap), right?
That happens on my Linux laptop, which has a small amount of free RAM, and it's probably due to page swapping, but it's not an issue because it happens only the "first time". In steady state, model loading takes a few msecs!
-
Hi @reuben Now, if I "stress test" the above demo server with a simple bash script that sends curl requests every few hundred milliseconds, the server crashes with a segmentation fault. See logs. Sometimes, if I relax the delay between requests up to 500 msecs, the program crashes after a while, and in this case I also read this message in stderr:
Now I'm confused, because I don't understand what is happening. It seems that the server crashes if incoming requests are "close" in time (under 300 msecs). Note also that:
My question/help request is: Let me know if you need more info.
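For readers who want to reproduce this kind of test from Node instead of bash and curl, here is a minimal sketch of a client that fires requests at a fixed interval without waiting for earlier responses. It is not the script used above: `stressTest` and `stressDemo` are hypothetical names, and the demo targets a throwaway local server that just answers "ok" rather than the real STT server.

```javascript
const http = require('http');

// Start one GET request every delayMs milliseconds, `count` times in total,
// without waiting for earlier responses. Resolves with one entry per request:
// the HTTP status code, or the error code/message if the request failed.
function stressTest(url, count, delayMs) {
  const requests = [];
  for (let i = 0; i < count; i++) {
    requests.push(new Promise((resolve) => {
      setTimeout(() => {
        // agent: false uses a fresh connection for every request
        http.get(url, { agent: false }, (res) => {
          res.resume(); // discard the response body
          res.on('end', () => resolve(res.statusCode));
        }).on('error', (err) => resolve(err.code || err.message));
      }, i * delayMs);
    }));
  }
  return Promise.all(requests);
}

// Self-contained demo against a local placeholder server (not the STT server).
async function stressDemo() {
  const server = http.createServer((req, res) => res.end('ok'));
  await new Promise((resolve) => server.listen(0, '127.0.0.1', resolve));
  const { port } = server.address();
  const results = await stressTest(`http://127.0.0.1:${port}/`, 5, 20);
  await new Promise((resolve) => server.close(resolve));
  return results;
}

stressDemo().then((results) => console.log(results));
```

Pointing `stressTest` at the real server URL and lowering `delayMs` step by step gives a repeatable way to find the request spacing at which the crash starts to appear.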
-
Hi all!
I just published a very simple open-source project: https://github.com/solyarisoftware/CoquiSTTJs enabling NodeJs developers to use Coqui STT with a simplified API.
Now I want to set up a speech recognition SERVER architecture, using the Coqui STT engine to manage multiple concurrent user requests.
The problem:
Following some quick tests (using CPU, without a GPU), the STT decoder (like DeepSpeech) seems to me to be a single-threaded, long-running task that keeps a single CPU core at 100% for a while (speech-to-text of a 3-word English sentence has a latency of more than 1 sec on my laptop; for details, see this test).
To build a server, I instead need a multi-process / multi-thread architecture. My preferred approach, in NodeJs, would be to use NodeJs "worker threads", passing the loaded Model object from a main dispatcher thread to the workers (which could run the STT in a separate thread). However, I believe this won't work, because data passed to worker threads is copied ("by value") and I suppose the Model is a huge in-memory object.
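The "by value" behaviour can be observed without starting a worker, because `MessagePort.postMessage` uses the same cloning rules as worker messaging. The sketch below is only an illustration with small byte arrays: `sendThroughPort` is a hypothetical helper, and it says nothing about whether a native Model object can be sent at all. It shows that an ordinary typed array arrives as an independent copy, while one backed by a `SharedArrayBuffer` still refers to the same memory.

```javascript
const { MessageChannel, receiveMessageOnPort } = require('worker_threads');

// Post a value through a MessageChannel and synchronously read it back,
// applying the same cloning rules used for messages sent to worker threads.
function sendThroughPort(value) {
  const { port1, port2 } = new MessageChannel();
  port1.postMessage(value);
  const received = receiveMessageOnPort(port2).message;
  port1.close();
  port2.close();
  return received;
}

// Ordinary memory: the receiver gets a copy, so writes do not propagate back.
const plain = new Uint8Array(new ArrayBuffer(4));
const plainReceived = sendThroughPort(plain);
plainReceived[0] = 42;
console.log('plain original after write to received copy:', plain[0]);

// Shared memory: both views point at the same SharedArrayBuffer.
const shared = new Uint8Array(new SharedArrayBuffer(4));
const sharedReceived = sendThroughPort(shared);
sharedReceived[0] = 42;
console.log('shared original after write to received view:', shared[0]);
```

For a large payload, the copy in the first case costs both time and memory per message, which is the concern raised above.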
Questions:
1. It seems to me that the STT Model is loaded once in memory (using Linux) as a shared library. Is that correct? How can I see how much RAM a Model uses?
2. If true, the solution is probably to build a pool of worker processes, each one accessing the Model separately. Does that make sense?
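The pool idea in the second question can be sketched as follows. This is a minimal illustration under stated assumptions, not the CoquiSTTJs implementation: `WorkerPool`, `poolDemo` and `fakeTranscribe` are hypothetical names, the workers are threads created from an inline source string for brevity (the same dispatch logic could drive processes started with `child_process.fork`), and the placeholder function stands in for per-worker model loading and decoding.

```javascript
const { Worker } = require('worker_threads');

// Worker source. In a real server each worker would load its own model once
// at startup; here a placeholder function stands in for the decoder.
const workerSource = `
  const { parentPort } = require('worker_threads');
  const fakeTranscribe = (audioFile) => 'transcript of ' + audioFile;
  parentPort.on('message', ({ audioFile }) => {
    parentPort.postMessage({ text: fakeTranscribe(audioFile) });
  });
`;

class WorkerPool {
  constructor(size) {
    this.workers = [];
    this.idle = [];   // workers ready to take a job
    this.queue = [];  // jobs waiting for a free worker
    for (let i = 0; i < size; i++) {
      const worker = new Worker(workerSource, { eval: true });
      this.workers.push(worker);
      this.idle.push(worker);
    }
  }

  // Queue a job; the promise resolves with the worker's transcript.
  transcribe(audioFile) {
    return new Promise((resolve, reject) => {
      this.queue.push({ audioFile, resolve, reject });
      this.dispatch();
    });
  }

  // Hand queued jobs to idle workers, one job per worker at a time.
  dispatch() {
    while (this.idle.length > 0 && this.queue.length > 0) {
      const worker = this.idle.pop();
      const job = this.queue.shift();
      const onMessage = ({ text }) => {
        worker.off('error', onError);
        this.idle.push(worker);
        job.resolve(text);
        this.dispatch();
      };
      const onError = (err) => {
        worker.off('message', onMessage);
        job.reject(err);
      };
      worker.once('message', onMessage);
      worker.once('error', onError);
      worker.postMessage({ audioFile: job.audioFile });
    }
  }

  close() {
    return Promise.all(this.workers.map((worker) => worker.terminate()));
  }
}

async function poolDemo() {
  const pool = new WorkerPool(2);
  const files = ['a.wav', 'b.wav', 'c.wav'];
  const results = await Promise.all(files.map((f) => pool.transcribe(f)));
  await pool.close();
  return results;
}

poolDemo().then((results) => console.log(results));
```

Because each worker handles a single job at a time, the pool size caps how many decodings run concurrently, and extra requests wait in the queue instead of piling onto the decoder.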
See also this thread:
https://discourse.mozilla.org/t/how-to-use-deepspeech-for-a-text-to-speech-server-in-nodejs/79636/2
Thoughts? Suggestions?
Thanks!
Giorgio