This repository contains resources (dataset and notebooks) for reproducing the experiments in the paper *BembaSpeech: A Speech Recognition Corpus for the Bemba Language*.
Please consider citing as follows if you use part of the code or data in your work or project:
```bibtex
@InProceedings{sikasote-anastasopoulos:2022:LREC,
  author    = {Sikasote, Claytone and Anastasopoulos, Antonios},
  title     = {BembaSpeech: A Speech Recognition Corpus for the Bemba Language},
  booktitle = {Proceedings of the Language Resources and Evaluation Conference},
  month     = {June},
  year      = {2022},
  address   = {Marseille, France},
  publisher = {European Language Resources Association},
  pages     = {7277--7283},
  abstract  = {We present a preprocessed, ready-to-use automatic speech recognition corpus, BembaSpeech, consisting over 24 hours of read speech in the Bemba language, a written but low-resourced language spoken by over 30\% of the population in Zambia. To assess its usefulness for training and testing ASR systems for Bemba, we explored different approaches; supervised pre-training (training from scratch), cross-lingual transfer learning from a monolingual English pre-trained model using DeepSpeech on the portion of the dataset and fine-tuning large scale self-supervised Wav2Vec2.0 based multilingual pre-trained models on the complete BembaSpeech corpus. From our experiments, the 1 billion XLS-R parameter model gives the best results. The model achieves a word error rate (WER) of 32.91\%, results demonstrating that model capacity significantly improves performance and that multilingual pre-trained models transfers cross-lingual acoustic representation better than monolingual pre-trained English model on the BembaSpeech for the Bemba ASR. Lastly, results also show that the corpus can be used for building ASR systems for Bemba language.},
  url       = {https://aclanthology.org/2022.lrec-1.790}
}
```
In this project we used the DeepSpeech v0.8.2 release for our experiments. We refer the reader to Mozilla DeepSpeech for the latest updates.
The data used in this project is a 17-hour portion of the BembaSpeech corpus, consisting of audio files no longer than 10 seconds each, per the DeepSpeech input pipeline requirement (a filtering sketch follows the table below).
ID | Dataset | CSV file | No. of Utterances | Duration | Description |
---|---|---|---|---|---|
1 | training | train.csv | 10,200 | 14 hrs 20 min | Used for training |
2 | development | dev.csv | 1,437 | 2 hrs | Used for validation |
3 | testing | test.csv | 756 | 1 hr 18 min | Used for testing |
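Because DeepSpeech's input pipeline rejects long clips, only utterances of at most 10 seconds were kept. Below is a minimal sketch, not from the repository, of how such filtering could be done, assuming the WAV files sit in a hypothetical `audio/` directory:

```python
# Minimal sketch: keep only clips whose duration is at most 10 seconds.
# Assumes a hypothetical `audio/` directory of WAV files.
import os
import soundfile as sf

MAX_DURATION = 10.0  # seconds, per the DeepSpeech input constraint

kept, dropped = [], []
for name in os.listdir("audio"):
    if not name.endswith(".wav"):
        continue
    path = os.path.join("audio", name)
    info = sf.info(path)                      # reads only the file header
    duration = info.frames / info.samplerate  # length in seconds
    (kept if duration <= MAX_DURATION else dropped).append(path)

print(f"kept {len(kept)} clips, dropped {len(dropped)} clips")
```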
To create the language models for our experiments, we used two sets of Bemba text: the transcripts from the train and dev sets, denoted LM1, and a combination of those transcripts and the JW300 corpus, denoted LM2.
You can run and follow the lm.ipynb notebook, which walks through creating the different N-gram language models with the KenLM tool; a minimal sketch of the core steps follows.
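For reference, here is a minimal sketch of the KenLM steps, assuming KenLM's `lmplz` and `build_binary` binaries are on the PATH and that `bemba_text.txt` is a hypothetical file holding the LM training text (e.g. the LM1 transcripts):

```python
# Minimal sketch: build a 5-gram KenLM language model from raw text.
import subprocess

# 1. Estimate a 5-gram LM in ARPA format from the text file.
with open("bemba_text.txt") as text, open("lm.arpa", "w") as arpa:
    subprocess.run(["lmplz", "-o", "5"], stdin=text, stdout=arpa, check=True)

# 2. Convert the ARPA file to KenLM's compact binary format.
subprocess.run(["build_binary", "lm.arpa", "lm.binary"], check=True)
```

DeepSpeech v0.8.2 then packages the binary LM and its vocabulary into a `.scorer` file with its `generate_scorer_package` tool; see lm.ipynb for the exact packaging step.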
In the notebooks folder, you will find the notebooks used to train the DeepSpeech Bemba ASR models:
- lm.ipynb - used to create the N-gram language models
- baseline.ipynb - used to train the baseline model for our experiments
- ft_model.ipynb - used to fine-tune the DeepSpeech English pre-trained model without a language model
- ftune_5glm_trans.ipynb - used to fine-tune the DeepSpeech English pre-trained model with the 5-gram LM scorer built from the LM1 Bemba text (a sketch of such a run follows this list)
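For orientation, here is a minimal sketch of what a DeepSpeech v0.8.2 fine-tuning run along these lines looks like. The paths and hyperparameters are purely illustrative; consult the notebooks for the exact commands and settings used:

```python
# Minimal sketch: fine-tune DeepSpeech v0.8.2 from an English checkpoint.
# Assumes the DeepSpeech repo is checked out, an English checkpoint is
# unpacked in `ckpt_en/`, and `alphabet.txt` covers the Bemba character set.
import subprocess

subprocess.run([
    "python", "DeepSpeech.py",
    "--train_files", "train.csv",
    "--dev_files", "dev.csv",
    "--test_files", "test.csv",
    "--alphabet_config_path", "alphabet.txt",
    "--load_checkpoint_dir", "ckpt_en",      # start from the English model
    "--save_checkpoint_dir", "ckpt_bem",     # save fine-tuned checkpoints
    "--drop_source_layers", "1",             # re-initialise the output layer
    "--scorer_path", "bemba_5gram.scorer",   # external LM scorer (optional)
    "--epochs", "30",                        # illustrative values only
    "--learning_rate", "0.0001",
], check=True)
```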
You can download the models (both the acoustic model and the scorer) that achieved our best DeepSpeech result, a WER of 54.78%.
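A minimal sketch of running inference with the downloaded models, assuming the `deepspeech` pip package (v0.8.2) and hypothetical file names for the acoustic model and scorer:

```python
# Minimal sketch: transcribe one WAV file with the released models.
# `bemba.pbmm` and `bemba.scorer` are placeholder names for the downloads.
import wave
import numpy as np
import deepspeech

model = deepspeech.Model("bemba.pbmm")
model.enableExternalScorer("bemba.scorer")

with wave.open("utterance.wav", "rb") as wav:   # expects 16 kHz, 16-bit mono
    audio = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)

print(model.stt(audio))  # prints the Bemba transcription
```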
The code used to fine-tune the XLS-R models on BembaSpeech can be found HERE.
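For completeness, a minimal sketch, not the repository's code, of transcribing audio with a fine-tuned XLS-R checkpoint through Hugging Face `transformers`; the model identifier below is a placeholder:

```python
# Minimal sketch: greedy CTC decoding with a fine-tuned Wav2Vec2/XLS-R model.
import torch
import librosa
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

MODEL_ID = "path/to/finetuned-xlsr-bemba"  # placeholder identifier
processor = Wav2Vec2Processor.from_pretrained(MODEL_ID)
model = Wav2Vec2ForCTC.from_pretrained(MODEL_ID)

speech, _ = librosa.load("utterance.wav", sr=16000)  # model expects 16 kHz
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])  # prints the transcription
```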