From b1c5249875cb244a04e59dcda71f585000b191c8 Mon Sep 17 00:00:00 2001
From: wasertech
Date: Sat, 4 Jun 2022 21:40:33 +0200
Subject: [PATCH] Update docs

---
 STT/CONTRIBUTING.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/STT/CONTRIBUTING.md b/STT/CONTRIBUTING.md
index 8db9e3d3..88c04d83 100644
--- a/STT/CONTRIBUTING.md
+++ b/STT/CONTRIBUTING.md
@@ -45,6 +45,7 @@ Some parameters for the model itself:
  - `duplicate_sentence_count` to control if Common Voice dataset might need to be
    regenerated with more duplicated allowed using Corpora Creator
    **USE WITH CAUTION**
+ - `enable_augments` to help the model generalise better on noisy data by augmenting the training data in various ways.
  - `cv_personal_first_url` to download only your own voice instead of all Common Voice dataset (first url)
  - `cv_personal_second_url` to download only your own voice instead of all Common Voice dataset (second url)
 
@@ -84,6 +85,8 @@ files, with proper `checkpoint` descriptor as TensorFlow produces.
 To use an existing checkpoint, just ensure the `docker run` includes a mount such as:
 `type=bind,src=PATH/TO/CHECKPOINTS,dst=/transfer-checkpoint`. Upon running, the checkpoints will be automatically used as starting point.
 
+Checkpoints typically use neither automatic mixed precision nor fully-connected layer normalization, and most use the standard hidden layer size (2048 unless specified otherwise). Keep those parameters unchanged when fine-tuning from them.
+
 ## Hardware
 
 Training successfull on: