Switch to Coqui STT 1.4.0 #163

Closed
wants to merge 52 commits into from


@wasertech wasertech commented May 20, 2022

This branch implements everything needed to train STT models for French using CommonVoice 9.0 with STT version 1.4.0.

Notes

Check out the models released from this branch: STT French v0.9.

I've added the import_cv_perso.sh importer script to download personal CV data and ease the process of fine-tuning from checkpoints. See this commit and this article on Discourse.

I've also added a custom Python script for lm_optimizer that captures the results of the optimization and saves them to disk, so we can reuse them during the testing and exporting steps.
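
To illustrate the idea of persisting optimizer results between steps, here is a minimal sketch. It is not the script from this branch: the function names, the JSON layout and the file name are hypothetical, and it only assumes that the optimization yields two values, `lm_alpha` and `lm_beta`, that later steps need to read back.

```python
import json
import tempfile
from pathlib import Path


def save_lm_params(path, lm_alpha, lm_beta):
    """Write the two language-model weights to a JSON file (hypothetical layout)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    payload = {"lm_alpha": float(lm_alpha), "lm_beta": float(lm_beta)}
    path.write_text(json.dumps(payload, indent=2))


def load_lm_params(path):
    """Read the weights back as a (lm_alpha, lm_beta) tuple of floats."""
    data = json.loads(Path(path).read_text())
    return float(data["lm_alpha"]), float(data["lm_beta"])


if __name__ == "__main__":
    # Placeholder values for the demo only; they are not results from this branch.
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "lm_params.json"  # hypothetical file name
        save_lm_params(target, 0.5, 1.5)
        alpha, beta = load_lm_params(target)
        print(f"lm_alpha={alpha} lm_beta={beta}")
```

A testing or exporting step could then call `load_lm_params` and pass the two values on, instead of re-running the optimization.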

train.sh has been split into train.sh, test.sh and export.sh. See this commit.
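
One practical effect of the split is that each stage can be run, or re-run, on its own. The sketch below shows a small runner that executes the three scripts in order and stops at the first failure. The runner itself, the `bash` invocation and the stop-on-failure policy are assumptions for illustration; only the script names come from this PR.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Script names as listed in this PR; how they are invoked is an assumption.
STAGES = ("train.sh", "test.sh", "export.sh")


def run_stages(
    stages: Sequence[str] = STAGES,
    runner: Optional[Callable[[str], int]] = None,
) -> List[str]:
    """Run each stage in order and return the stages that completed.

    `runner` takes a script name and returns its exit code. By default it
    shells out with bash; a fake runner can be injected for testing.
    """
    if runner is None:
        def runner(script: str) -> int:
            return subprocess.run(["bash", script]).returncode

    completed: List[str] = []
    for script in stages:
        if runner(script) != 0:
            break  # stop at the first failing stage
        completed.append(script)
    return completed


if __name__ == "__main__":
    # Dry run with a fake runner, so nothing is actually executed.
    print(run_stages(runner=lambda script: 0))
```

Passing a shorter `stages` sequence, for example `("test.sh", "export.sh")`, would skip training and reuse existing checkpoints.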


wasertech commented May 23, 2022

Managed to make this branch export a model from scratch. See the full logs here.

@wasertech wasertech mentioned this pull request May 25, 2022
@wasertech wasertech changed the title from "This branch passes the batch memory test" to "Switch to Coqui STT 1.4.0" May 25, 2022
@wasertech wasertech marked this pull request as ready for review May 30, 2022 18:18


wasertech commented Sep 4, 2022

STT 1.4.0 was released as stable! I've updated stt_branch accordingly. This branch, stt140-cv9, is now complete.

Full build logs and checks.

Version 10 of CV is out, so I'll probably make another branch for it (though I'll likely wait for more affordable energy before training cv-fr-10).

@wasertech wasertech mentioned this pull request Sep 6, 2022
Added link to french tutorial for fine-tuning
@wasertech

This branch made the mistake of deleting commonvoice-fr/DeepSpeech/ in order to create commonvoice-fr/STT/.
It is now obsolete thanks to #168.

@wasertech wasertech closed this Mar 5, 2023