Fine tuning data requirements #626

JRMeyer · 2021-03-08T03:11:19Z

JRMeyer
Mar 8, 2021
Maintainer

>>> Sushantmkarande
[May 9, 2019, 6:02am]

I am trying to improve DeepSpeech model accuracy on an Indian English
dataset. What should my data look like? Are there any requirements?

What I did:
First I recorded 6 people's voices on sentences averaging 8-10 words,
about 30 sentences each. Note that every person read the same
sentences. I got reasonably good accuracy.
Then I tried again with 6 people and 40 sentences each, but this time
some of the sentences had only one word, a kind of keyword that I want
to predict correctly. Accuracy did not improve like it did the first
time.

So what are some requirements we should keep in mind while recording
voice?
1. Should there be only sentences, not single words?
2. Does background noise affect accuracy?
3. Does fine-tuning an already fine-tuned model affect accuracy?
4. If I repeat those sentences with varied decibel levels in the WAV
   files (data augmentation), will it affect accuracy?
5. What do you suggest for data augmentation?
   1. Will adding some noise help?
   2. Will varying speed and pitch help?

Sorry for the long post, but I did not know how to frame this question.

[This is an archived TTS discussion thread from discourse.mozilla.org/t/fine-tuning-data-requirements]

JRMeyer · 2021-03-08T03:11:22Z

JRMeyer
Mar 8, 2021
Maintainer Author

>>> lissyx
[May 10, 2019, 8:27am]

First of all, I think your data augmentation for fine-tuning is just not
enough. There seem to be several problems here to address separately:

For the Indian accent, there's no better solution than having more than
a few dozen minutes of audio with it. You should look into the Common
Voice dataset and filter for the Indian accent; that should already be a
good basis. Contributing to Common Voice would of course help a lot.

For the noisy background, the only reliable solution is making the model
noise-robust, which we are working on but which is not yet ready. It's
done with data augmentation where we add noise; you can find more about
it on GitHub.
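As a rough illustration of the kind of augmentation discussed here (adding noise, varying the level, varying the speed), here is a minimal NumPy sketch. It is not DeepSpeech's own augmentation code: the helper names `mix_at_snr`, `apply_gain_db` and `change_speed` are hypothetical, and the audio is assumed to be already loaded as a 1-D array of float samples.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db, rng=None):
    """Add noise to speech so the result has roughly the requested SNR in dB."""
    if rng is None:
        rng = np.random.default_rng()
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    # Repeat the noise if it is shorter than the speech, then take a random crop.
    if len(noise) < len(speech):
        noise = np.tile(noise, int(np.ceil(len(speech) / len(noise))))
    start = rng.integers(0, len(noise) - len(speech) + 1)
    noise = noise[start:start + len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0 or noise_power == 0:
        return speech.copy()  # nothing meaningful to mix
    # Choose the scale so that 10*log10(speech_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise


def apply_gain_db(samples, gain_db):
    """Make the signal louder (positive dB) or quieter (negative dB)."""
    return np.asarray(samples, dtype=np.float64) * 10 ** (gain_db / 20)


def change_speed(samples, factor):
    """Naive resampling by linear interpolation.

    factor > 1 shortens the clip; this changes speed AND pitch together.
    """
    samples = np.asarray(samples, dtype=np.float64)
    new_len = max(1, int(round(len(samples) / factor)))
    positions = np.linspace(0, len(samples) - 1, new_len)
    return np.interp(positions, np.arange(len(samples)), samples)
```

Each augmented copy keeps the transcript of the original clip. Mixed or amplified signals may need clipping back to the valid sample range before being written out as WAV.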

For specific words, if you need them to be properly identified, the best
solution is to rebuild the language model and add your own words. A
better long-term solution is helping us add support for multiple
language models, which would allow better control over that and avoid
rebuilding the base language model from scratch.

[Archived Post]


JRMeyer · 2021-03-08T03:11:24Z

JRMeyer
Mar 8, 2021
Maintainer Author

>>> Sushantmkarande
[May 11, 2019, 5:20am]

This is a very good explanation for my question, although I have a few
questions.

Can I train again on the Mozilla corpus Indian accent data, even though
you might have already trained on it in pretraining?

Do you mean start training from scratch when you say build your own
language model? If that's the case, how should I do it? I don't have
enough data.


JRMeyer · 2021-03-08T03:11:27Z

JRMeyer
Mar 8, 2021
Maintainer Author

>>> Sushantmkarande
[May 11, 2019, 6:26am]

I just looked at the Common Voice dataset. There are a lot of NaN values
in the accent column of validated.tsv, so what is the solution? I could
only find 16000 Indian accent samples out of 490000 samples.
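For reference, a minimal sketch of how such a count could be produced with the Python standard library. The column name `accent` and the label `indian` are assumptions about this Common Voice release, so check the header of your own validated.tsv. Empty fields (which pandas shows as NaN) are clips whose speaker did not report an accent; they cannot be recovered from the metadata, so the sketch simply counts them as unlabeled.

```python
import csv
from collections import Counter


def _rows(tsv_path):
    """Yield one dict per clip from a Common Voice style TSV file."""
    with open(tsv_path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE: sentences may contain quote characters that are not CSV quoting.
        yield from csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)


def accent_counts(tsv_path, column="accent"):
    """Count clips per accent label; empty fields are reported as '<unlabeled>'."""
    counts = Counter()
    for row in _rows(tsv_path):
        label = (row.get(column) or "").strip()
        counts[label or "<unlabeled>"] += 1
    return counts


def filter_by_accent(tsv_path, accent, column="accent"):
    """Return the rows whose accent field equals the requested label."""
    wanted = accent.strip().lower()
    return [
        row for row in _rows(tsv_path)
        if (row.get(column) or "").strip().lower() == wanted
    ]
```

Calling `accent_counts("validated.tsv")` first shows which labels actually occur, and `filter_by_accent("validated.tsv", "indian")` then returns the matching rows, including the `path` of each clip.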

[Fine Tuning with limited data - Questions on Fine Tuning in General]


JRMeyer · 2021-03-08T03:11:30Z

JRMeyer
Mar 8, 2021
Maintainer Author

>>> lissyx
[May 11, 2019, 6:38am]

> Can I train again on the Mozilla corpus Indian accent data, even
> though you might have already trained on it in pretraining?

Not a good idea

> Do you mean start training from scratch when you say build your own
> language model? If that's the case, how should I do it? I don't have
> enough data.

No, just follow the docs in data/lm/README.md, and adjust the text
file to add your specific words.
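A minimal sketch of the "adjust the text file" step, under stated assumptions: the language model is built from a plain-text file with one sentence per line, and the text should be lowercase and limited to a-z, apostrophe and space. The helper names `normalize` and `extend_corpus` and any file names are hypothetical; the actual rebuild commands are the ones in data/lm/README.md and are not reproduced here.

```python
import re

# Assumption: anything outside a-z, apostrophe and space is not in the alphabet.
_NOT_ALLOWED = re.compile(r"[^a-z' ]+")


def normalize(sentence):
    """Lowercase, drop characters outside the assumed alphabet, squeeze spaces."""
    cleaned = _NOT_ALLOWED.sub(" ", sentence.lower())
    return " ".join(cleaned.split())


def extend_corpus(corpus_path, out_path, new_sentences):
    """Copy the original LM text and append the normalized extra sentences.

    Returns the number of sentences that were appended.
    """
    extra = [normalize(s) for s in new_sentences]
    extra = [s for s in extra if s]  # skip sentences that normalize to nothing
    with open(corpus_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            # Make sure the last original line ends with a newline before appending.
            dst.write(line if line.endswith("\n") else line + "\n")
        for sentence in extra:
            dst.write(sentence + "\n")
    return len(extra)
```

Full sentences that use the keywords in context are likely more useful here than bare keywords, because an n-gram language model learns word sequences rather than isolated words.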


JRMeyer · 2021-03-08T03:11:32Z

JRMeyer
Mar 8, 2021
Maintainer Author

>>> lissyx
[May 11, 2019, 6:39am]

> 16000 Indian accent samples

Then it's likely we have already trained on those in 0.4.1, so
unfortunately they won't be of much help to you :confused:

