Practical tests of hot word feature and default models accuracy #1718

JRMeyer · 2021-03-08T08:38:45Z

JRMeyer
Mar 8, 2021
Maintainer

>>> Clockworker
[January 14, 2021, 8:00pm]

Hi,

We are students currently working on our Bachelor's degrees, and during
our Software Testing classes our task was to test an open-source project. A
few months ago I asked here what could use some testing. I was glad
to receive a reply, and we started working. Trying our best, we have
analysed the practical usage of the hot-word feature and the accuracy of
the DeepSpeech default model, so that you can judge its practical quality
for many scenarios.

Full report:

deepspeech_test_report.pdf
(426,0 KB)

Short summary for 250 different tagged audio files:

scientific words was the most accurate. (95.6%)

there were more files for male speech in this combination or input
files may have been just a little bit harder for a model to
understand. (94.0%)

and 82.5% common speech) where lecture speech of non-accented
speaker is above 94.6%

in our data set for those that contained at least one of them), note
that if we used more proper nouns then this accuracy difference
could have been higher. It all depends on number of those words;
however, this serves as a proof that in fact the difference is real
and should be considered.

male common speech (87.3% female, 91.4% male).

males speaking in common voice and with scientific words accuracy
was: 86.4%, while the same tag combination but without scientific
words achieved 91.4% accuracy. That is a drop of 5 percentage points.
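The summary above does not say how accuracy was scored. As a minimal sketch, assuming word-level accuracy defined as 100 * (1 - word errors / reference words), which is one common choice and not necessarily the one used in the report, the numbers could be computed like this. The function names are made up for illustration.

```python
from typing import Iterable, Tuple


def word_errors(reference: str, hypothesis: str) -> int:
    """Word-level edit distance (substitutions, insertions, deletions)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(pairs: Iterable[Tuple[str, str]]) -> float:
    """Accuracy in percent over (reference, hypothesis) pairs (assumed metric)."""
    total_errors = 0
    total_words = 0
    for reference, hypothesis in pairs:
        total_errors += word_errors(reference, hypothesis)
        total_words += len(reference.split())
    if total_words == 0:
        raise ValueError("no reference words to score")
    return max(0.0, 100.0 * (1 - total_errors / total_words))


def percentage_point_drop(baseline: float, other: float) -> float:
    """Difference between two accuracies, in percentage points."""
    return round(baseline - other, 1)
```

For example, `percentage_point_drop(91.4, 86.4)` reproduces the drop quoted for male common-voice speech with and without scientific words.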

As for hot-words feature:

behavior. Probably because it doesn't appear in word detection
mechanism and is not modified.

everything else that comes after that word, because of letter
splitting bug. Example: 'okay google'.

but be careful, as this word may appear split in two ('another' -> 'an other')
or as a similar-sounding word ('gold' -> 'god').

add a very small priority and it could work.

the given hot-word cause no change if the audio doesn't include the
sound of that word.

hot-words were detected.
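The split-word effect mentioned above ('another' -> 'an other') is easy to miss when only checking whether the hot-word itself is in the transcript. Here is a small sketch of a check for it. The function name and the `max_parts` limit are made up for illustration and are not part of DeepSpeech.

```python
from typing import List, Tuple


def find_split_hot_word(transcript: str, hot_word: str,
                        max_parts: int = 3) -> List[Tuple[int, List[str]]]:
    """Find places where consecutive tokens join up to spell the hot-word.

    Returns (token_index, parts) for every run of 2..max_parts tokens whose
    concatenation equals the hot-word. An intact hot-word is not reported.
    """
    tokens = transcript.lower().split()
    target = hot_word.lower()
    hits = []
    for start in range(len(tokens)):
        for length in range(2, max_parts + 1):
            parts = tokens[start:start + length]
            if len(parts) == length and "".join(parts) == target:
                hits.append((start, parts))
    return hits
```

Running it on a transcript such as "give me an other one" with the hot-word "another" reports the split at token index 2, while a transcript that contains "another" intact reports nothing.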

Any opinions would be appreciated.

We are really glad that we've made it this far, and it was an honor for us
to support this great project in any way.

[How can I know what boost value to give for a particular hot
word

[This is an archived TTS discussion thread from discourse.mozilla.org/t/practical-tests-of-hot-word-feature-and-default-models-accuracy]

JRMeyer · 2021-03-08T08:38:48Z

JRMeyer
Mar 8, 2021
Maintainer Author

>>> othiele
[January 15, 2021, 8:29am]

Great work, thanks guys and all the best.

[Archived Post]


JRMeyer · 2021-03-08T08:38:50Z

JRMeyer
Mar 8, 2021
Maintainer Author

>>> othiele
[January 15, 2021, 1:28pm]

I read in the other post that you have code and considerable experience
with hot-word boosting. If you have a little bit of time, open a GitHub
repo and write a simple README with your findings and links to the code.
That way other people can find it more easily.

If you have a little bit more time, check the DS docs and add some
documentation on how this feature works, with examples. More and more
people are asking about it.

Again, thanks for your time, it is great to know more about how it
works.

([ slash kreid](
whether that's within the scope, but this is good material

[Archived Post]


JRMeyer · 2021-03-08T08:38:53Z

JRMeyer
Mar 8, 2021
Maintainer Author

>>> Clockworker
[January 15, 2021, 2:03pm]

You can count on me, I'll do it in my spare time

[Archived Post]


JRMeyer · 2021-03-08T08:38:56Z

JRMeyer
Mar 8, 2021
Maintainer Author

>>> Clockworker
[January 19, 2021, 4:30pm]

I've added the code for hot-word testing to a public repository on
GitHub. I could not edit the original post, so I am putting it here:

GitHub

### Ideefixze/deepspeech-hot-words-booster

Simple script for testing different boost values for hot-words in
Mozilla's STT: DeepSpeech - Ideefixze/deepspeech-hot-words-booster

Also, I'll soon edit the DS docs based on those findings. Not sure how
long it will take, as my exams are getting closer and closer
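For readers who just want the idea of a boost sweep, here is a rough sketch. It is not the code from the repository above. It assumes a model object with `addHotWord(word, boost)`, `clearHotWords()` and `stt(audio)` methods, as the DeepSpeech 0.9 Python bindings provide. `FakeModel` is a made-up stand-in so the sketch runs without a model file, and its threshold rule is a toy, not real DeepSpeech behaviour.

```python
from typing import Any, Dict, Iterable


def sweep_boosts(model: Any, audio: Any, hot_word: str,
                 boosts: Iterable[float]) -> Dict[float, str]:
    """Transcribe the same audio once per boost value and collect the results."""
    results = {}
    for boost in boosts:
        model.clearHotWords()            # start from a clean state each run
        model.addHotWord(hot_word, boost)
        results[boost] = model.stt(audio)
    model.clearHotWords()                # leave the model unmodified
    return results


class FakeModel:
    """Hypothetical stand-in that mimics only the method names used above."""

    def __init__(self) -> None:
        self._hot_words: Dict[str, float] = {}

    def addHotWord(self, word: str, boost: float) -> None:
        self._hot_words[word] = boost

    def clearHotWords(self) -> None:
        self._hot_words.clear()

    def stt(self, audio: Any) -> str:
        # Toy rule for illustration only: "gold" wins once boosted enough.
        return "gold" if self._hot_words.get("gold", 0.0) >= 5.0 else "god"


if __name__ == "__main__":
    for boost, text in sweep_boosts(FakeModel(), None, "gold", [0.0, 5.0, 10.0]).items():
        print(boost, text)
```

With a real model, `audio` would be the 16-bit sample buffer that `stt` expects, and comparing the transcripts across boost values shows where the hot-word starts to appear or starts to distort the rest of the sentence.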

[Archived Post]


JRMeyer · 2021-03-08T08:38:58Z

JRMeyer
Mar 8, 2021
Maintainer Author

>>> othiele
[January 19, 2021, 4:51pm]

Thanks a lot, we can link to it now if there are any questions about
them.

[Archived Post]
