Replies: 3 comments 1 reply
-
There are several cases where you may or may not want to load the LR, so this flag exists, I think.
-
It's nice to be able to decide from where to load the LR (the checkpoint or the flags).
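Roughly the choice being described, as a minimal sketch (the function and argument names here are hypothetical, not the project's actual code):

```python
def resolve_learning_rate(checkpoint_lr: float,
                          flag_lr: float,
                          force_initialize_learning_rate: bool) -> float:
    """Decide which LR to train with after restoring a checkpoint."""
    if force_initialize_learning_rate:
        # Ignore the LR stored in the checkpoint and start from the flag value.
        return flag_lr
    # Default: resume with whatever LR was saved in the checkpoint.
    return checkpoint_lr
```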
-
I think @JRMeyer suggests doing it automagically. If the base model has 0.001 and you use 0.001 again in TL, there will be no change. Isn't the LR a mandatory argument? The usual process is for low-resource languages, where you should drop the LR. Sometimes you do a hyperparameter search to find a better-converging one. So you usually change it...
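The "automagic" alternative would be to keep the checkpoint's LR only when the flag value is the same, a sketch under the same hypothetical names as above:

```python
def resolve_learning_rate_auto(checkpoint_lr: float, flag_lr: float) -> float:
    """Override the checkpoint LR only when the flag value actually differs."""
    if flag_lr != checkpoint_lr:
        # The user asked for a different LR (the usual transfer-learning case),
        # so honour it without needing an extra force flag.
        return flag_lr
    # Same value as the base model (e.g. 0.001 both times): nothing changes.
    return checkpoint_lr
```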
-
When transfer learning, it's usually the case that you set a new learning rate. Unfortunately, you have to both set the LR and pass `force_initialize_learning_rate`, or else you get the old LR and have no idea :( I don't see any reason we should keep this flag.
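For illustration, the pitfall with concrete numbers (the same hypothetical sketch as above, restated so this example stands alone; it is not the real training code):

```python
def restored_lr(checkpoint_lr, flag_lr, force_initialize_learning_rate=False):
    # Hypothetical logic: the flag LR only wins when re-initialisation is forced.
    return flag_lr if force_initialize_learning_rate else checkpoint_lr

# Base model was trained with 0.001; we want 0.0001 for transfer learning.
print(restored_lr(0.001, 0.0001))                                        # 0.001  -> silently keeps the old LR
print(restored_lr(0.001, 0.0001, force_initialize_learning_rate=True))   # 0.0001 -> what we actually wanted
```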