Replies: 5 comments
-
>>> lissyx |
>>> Sushantmkarande |
-
>>> Sushantmkarande
[May 9, 2019, 6:02am]
I am trying to improve DeepSpeech model accuracy on an Indian English
dataset.
What should my data look like? Are there any requirements?

What I did:
First, I recorded 6 people's voices on sentences averaging 8-10 words,
about 30 sentences each. Note that every person read the same
sentences. I got reasonably good accuracy.
Then I tried again with 6 people and 40 sentences each, but this time
some of the sentences had only one word, which is a kind of keyword I
want to predict correctly. Accuracy did not improve like the first time.

So what requirements should we keep in mind while recording voice?
1. Should there be only sentences, not single words?
2. Does background noise affect accuracy?
3. Does fine-tuning an already fine-tuned model affect accuracy?
4. If I repeat those sentences with varied decibel levels in the WAV
   files (data augmentation), will it affect accuracy?
5. What do you suggest for data augmentation?
   1. Will adding some noise help?
   2. Will varying speed and pitch help?
Sorry for the long essay, but I did not know how to frame this question.
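
For reference, the augmentations asked about in questions 4 and 5 (varying loudness, adding noise, changing speed) could be sketched roughly as below. This is only an illustration using NumPy on a synthetic tone, not anything from DeepSpeech itself: the helper names `apply_gain_db`, `add_noise`, and `change_speed`, the 16 kHz sample rate, and the chosen gain, SNR, and rate values are all assumptions. The speed change here is plain resampling, so it shifts pitch together with speed. Whether any of these actually improves accuracy would have to be checked on a held-out set.

```python
import numpy as np

def apply_gain_db(samples, gain_db):
    """Scale amplitude by gain_db decibels (positive = louder)."""
    return np.clip(samples * 10.0 ** (gain_db / 20.0), -1.0, 1.0)

def add_noise(samples, snr_db, rng):
    """Add white Gaussian noise at the requested signal-to-noise ratio."""
    signal_power = np.mean(samples ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=samples.shape)
    return np.clip(samples + noise, -1.0, 1.0)

def change_speed(samples, rate):
    """Resample by linear interpolation; rate > 1 is faster and higher pitched."""
    new_length = int(round(len(samples) / rate))
    old_positions = np.arange(len(samples))
    new_positions = np.linspace(0, len(samples) - 1, new_length)
    return np.interp(new_positions, old_positions, samples)

# Synthetic stand-in for a 1-second mono recording (assumed 16 kHz, floats in [-1, 1]).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clip = 0.3 * np.sin(2 * np.pi * 220.0 * t)

rng = np.random.default_rng(0)
louder = apply_gain_db(clip, 6.0)   # +6 dB, roughly double the amplitude
noisy = add_noise(clip, 20.0, rng)  # noise 20 dB below the signal
faster = change_speed(clip, 1.1)    # 10% faster, so about 10% shorter
```

Each augmented copy would keep the transcript of the original recording, so one recorded sentence yields several training examples.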
[This is an archived TTS discussion thread from discourse.mozilla.org/t/fine-tuning-data-requirements]