This repository has been archived by the owner on Nov 28, 2022. It is now read-only.

Submission for team Devisa #7

Open
wants to merge 10 commits into
base: main
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "submit/Variant_Accent_Dialect/Devisa/source-code"]
path = submit/Variant_Accent_Dialect/Devisa/source-code
url = https://gitlab.com/prvInSpace/romansh-stt-project
53 changes: 53 additions & 0 deletions submit/Variant_Accent_Dialect/Devisa/README.md
@@ -0,0 +1,53 @@
# Devisa: STT-models for different Romansh dialects

In this folder you can find our submission for the competition. The source-code repository is available as a submodule here.

All of the models created for the project are available in TFLite format on the [project's GitLab package repository](https://gitlab.com/prvInSpace/romansh-stt-project/-/packages).
The KenLM language models created for each dialect and the custom splits of the Common Voice datasets are also available there.

## Important links:
* The main repository for the source code can be found at [GitLab](https://gitlab.com/prvInSpace/romansh-stt-project).
* The video submission can be found at [Google Drive](https://drive.google.com/file/d/17Tfj7nfZEhVOid7HqhnqZGwM_V9zLT4w/view?usp=sharing).
* The spreadsheet used to record the results can be found at [Google Spreadsheets](https://docs.google.com/spreadsheets/d/1TBw0GrosfgvsdqPXYzgkaN3ZsMQys8574L6bhlNh4rw/edit?usp=sharing).
* The main README file for the project can be found in the source-code folder.

## How to recreate the results

The source-code folder doesn't contain any of the Common Voice data, so you need to set up the environment before you can run anything. Running `make` should recreate the environment: it downloads the Common Voice datasets and performs all of the required preprocessing. (This was tested on a Linux system and seemed to work.)

After that, if you want to use the existing models and language models instead of retraining them, run `make download-checkpoints`. This should download the checkpoints from the GitLab package repository and unzip them. Alternatively, you can find TFLite versions on the [project's GitLab package repository](https://gitlab.com/prvInSpace/romansh-stt-project/-/packages).

If you want to train the models from scratch, you'll need the base model (English-German). This can be downloaded by running `make data/base`.
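
The two setup paths described above can be summarised in a small dry-run sketch. It only prints the `make` invocations and does not run them. The function name `setup_commands` and the mode names `pretrained` and `scratch` are illustrative and not part of the project; only the `make` targets are taken from this README.

```bash
# Prints (does not execute) the make invocations for one of the two setup paths.
# Hypothetical helper: only the make targets themselves come from this README.
setup_commands() (
    case "${1:-pretrained}" in
        pretrained) extra="make download-checkpoints" ;;  # reuse published checkpoints
        scratch)    extra="make data/base" ;;             # fetch the English-German base model
        *)          echo "unknown mode: $1" >&2; exit 1 ;;
    esac
    echo "make"      # environment setup, Common Voice download, preprocessing
    echo "$extra"
)

setup_commands pretrained
```

Printing the commands first makes it easy to check which targets would run before starting the downloads.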

Note that the application that splits the Common Voice dataset is non-deterministic.
As such, if you want to recreate the results, you can find the custom splits both in the `splits/` folder in the source-code directory and in the individual packages for RM-Sursilv and RM-Vallader on the [project's GitLab package repository](https://gitlab.com/prvInSpace/romansh-stt-project/-/packages). For your convenience, these will be copied into the correct folder when you run the `make` command.

The Docker environment can be started via the Makefile using a couple of flags. The most notable is `LANG`, which specifies the language code to target.
For example, to create an environment with both spoken and text data for rm-sursilv, run:
```bash
make train LANG=rm-sursilv
```
You can specify which language model to use by adding the `LM` flag when starting the Docker environment, and which Common Voice dataset to use by adding the `DATA` flag, as follows:
```bash
make train LANG=rm-sursilv DATA=rm-vallader LM=rm-puter
```
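
If you switch between many dialect combinations, the flag handling can be wrapped in a small helper. The following is a minimal sketch, not part of the project: `build_train_cmd` is a hypothetical name, and it only prints the command. It passes `DATA` and `LM` only when a value is given, and makes no assumption about what the Makefile does when they are omitted.

```bash
# Hypothetical helper: prints the make invocation for a given combination.
# Only the flag names LANG, DATA and LM come from this README.
build_train_cmd() (
    lang="$1"
    data="${2:-}"
    lm="${3:-}"
    cmd="make train LANG=${lang}"
    # Append the optional flags only when a value was supplied.
    if [ -n "$data" ]; then cmd="$cmd DATA=${data}"; fi
    if [ -n "$lm" ]; then cmd="$cmd LM=${lm}"; fi
    printf '%s\n' "$cmd"
)

build_train_cmd rm-sursilv
build_train_cmd rm-sursilv rm-vallader rm-puter
```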

Once in the environment, you can rerun the tests. To test without a language model, run:
```bash
bash /scripts/eval.bash
```

If you want to test it with a language model, please run this command instead:
```bash
bash /scripts/eval_scorer.bash
```
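
To compare the two evaluations side by side, both scripts can be run in sequence with their output saved to separate log files. This is a rough sketch under assumptions: `run_evals`, `SCRIPTS_DIR` and `LOG_DIR` are hypothetical names, and the only things taken from this README are the `/scripts` location and the two script names.

```bash
# Hypothetical wrapper: runs both evaluation scripts and keeps one log per run.
# SCRIPTS_DIR defaults to /scripts as used in this README; LOG_DIR is an assumption.
run_evals() (
    scripts_dir="${SCRIPTS_DIR:-/scripts}"
    log_dir="${LOG_DIR:-./eval-logs}"
    mkdir -p "$log_dir"
    for name in eval eval_scorer; do
        # Capture stdout and stderr; report a failure but continue with the next script.
        bash "$scripts_dir/$name.bash" > "$log_dir/$name.log" 2>&1 \
            || echo "$name.bash failed, see $log_dir/$name.log" >&2
    done
)

# Usage inside the Docker environment:
#   run_evals
```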

Models can also be trained using the training scripts in the `/scripts` folder inside of the Docker environment.
For example, to retrain the acoustic models, please run:
```bash
bash /scripts/train.bash
```

That should be everything.
If you just want to test the best performance for RM-Sursilv and RM-Vallader, then LANG should be the only required flag.
1 change: 1 addition & 0 deletions submit/Variant_Accent_Dialect/Devisa/source-code
Submodule source-code added at d9cbcb