You are correct. Building your own little scorer for your task with a pre-trained model is the fastest way to improve your results.
# Make a dummy scorer to find alpha and beta
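# (The ${...} variables below, e.g. LM_TOP_K, HOMEDIR, N_HIDDEN, TEST_BATCH_SIZE,
#  LM_ALPHA and LM_BETA, come from the surrounding environment; substitute your
#  own values and paths.)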
generate_lm.py \
--input_txt /mnt/extracted/sources_lm.txt \
--output_dir /mnt/lm/ \
--top_k ${LM_TOP_K} \
--kenlm_bins ${HOMEDIR}/kenlm/build/bin/ \
--arpa_order 4 \
--max_arpa_memory "85%" \
--arpa_prune "0|0|1" \
--binary_a_bits 255 \
--binary_q_bits 8 \
--binary_type trie
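# Package the LM and vocab into a scorer (alpha/beta are only placeholders here)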
generate_scorer_package \
--checkpoint /mnt/models/ \
--lm /mnt/lm/lm.binary \
--vocab /mnt/lm/vocab-${LM_TOP_K}.txt \
--package /mnt/lm/kenlm.scorer \
--default_alpha 0.0 \
--default_beta 0.0
# Find best values
lm_optimizer.py \
--show_progressbar true \
--train_cudnn true \
--alphabet_config_path /mnt/models/alphabet.txt \
--scorer_path /mnt/lm/kenlm.scorer \
--feature_cache /mnt/sources/feature_cache \
--test_files ${all_test_csv} \
--test_batch_size ${TEST_BATCH_SIZE} \
--n_hidden ${N_HIDDEN} \
--lm_alpha_max ${LM_ALPHA_MAX} \
--lm_beta_max ${LM_BETA_MAX} \
--n_trials ${LM_N_TRIALS} \
--checkpoint_dir /mnt/checkpoints/
# Repackage the scorer with the correct values
rm /mnt/lm/kenlm.scorer
generate_scorer_package \
--checkpoint /mnt/models/ \
--lm /mnt/lm/lm.binary \
--vocab /mnt/lm/vocab-${LM_TOP_K}.txt \
--package /mnt/lm/kenlm.scorer \
--default_alpha ${LM_ALPHA} \
--default_beta ${LM_BETA}
# Testing with the best values
python -m coqui_stt_training.evaluate \
--show_progressbar true \
--train_cudnn true \
${AMP_FLAG} \
--alphabet_config_path /mnt/models/alphabet.txt \
--scorer_path /mnt/lm/kenlm.scorer \
--test_files ${all_test_csv} \
--test_batch_size ${TEST_BATCH_SIZE} \
--n_hidden ${N_HIDDEN} \
--lm_alpha ${LM_ALPHA} \
--lm_beta ${LM_BETA} \
--checkpoint_dir /mnt/checkpoints/ \
--test_output_file /mnt/models/test_output.json
# Exporting the best models
/tflite-venv/bin/python -m coqui_stt_training.export \
--alphabet_config_path /mnt/models/alphabet.txt \
--scorer_path /mnt/lm/kenlm.scorer \
--feature_cache /mnt/sources/feature_cache \
--n_hidden ${N_HIDDEN} \
--beam_width ${BEAM_WIDTH} \
--lm_alpha ${LM_ALPHA} \
--lm_beta ${LM_BETA} \
--load_evaluate "best" \
${LOAD_CHECKPOINT_FROM} \
--export_dir /mnt/models/ \
--export_tflite true \
${ALL_METADATA_FLAGS} \
${METADATA_MODEL_NAME_FLAG}
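Once you have the final kenlm.scorer you can also sanity-check it outside the training container with the stt Python bindings. This is just a rough sketch; the model filename, the WAV path and the alpha/beta numbers are placeholders, so adjust them to whatever your export step produced and lm_optimizer.py reported:
import wave

import numpy as np
from stt import Model

# Load the exported TFLite model and attach the freshly packaged scorer
model = Model("/mnt/models/model.tflite")
model.enableExternalScorer("/mnt/lm/kenlm.scorer")
# The alpha/beta baked into the package can be overridden at runtime,
# handy for quick experiments without re-running generate_scorer_package
model.setScorerAlphaBeta(0.93, 1.18)  # placeholder values

with wave.open("test.wav", "rb") as w:  # expects 16 kHz, 16-bit mono audio
    audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

print(model.stt(audio))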
Hello everybody,
I have a conceptual question about the LM, a.k.a. the scorer, in Coqui.
A bit of background first: I've been working on open-source voice assistants for a long time now (e.g. SEPIA), and although ASR systems have come a long way (since Sphinx4 ^^), it is still necessary to build small, domain-specific custom LMs to get acceptable recognition quality.
Pre-trained models usually come with large LMs, and since it's often not practical or even possible to reuse the original training data, I have to replace the original LM completely instead of simply "augmenting" it with my own data.
So here is my question: when I noticed that the Coqui models perform quite OK even without a scorer, I started wondering whether it's possible to use a custom scorer, trained on just a few dozen of my own sentences, and apply it in a way that makes Coqui prefer those sentences without completely losing the original large vocabulary. Basically just shifting the weights a bit?
I have a feeling that the alpha and beta arguments for the scorer might be able to control this, but I can't find any explanation of what they actually do.
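My rough guess, based on how other CTC beam-search decoders combine an external LM with the acoustic model, is something like the following (pseudo-code, please correct me if I'm off):
def beam_score(acoustic_logp, lm_logp, word_count, alpha, beta):
    # alpha: how strongly the external LM/scorer can pull the beam search
    # beta: a per-word bonus so the LM doesn't simply favour shorter outputs
    return acoustic_logp + alpha * lm_logp + beta * word_count
If that's roughly right, tuning alpha down would be one way to let a tiny domain scorer nudge the result without completely overruling the acoustic model.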
[EDIT] The 'hot words' feature seems to be very similar, but it boosts only single words, not whole sentences.
Thanks in advance for any info or help 🙂
Florian