Replies: 1 comment
-
Speaking for my group, as soon as data augmentations were introduced, we stopped using feature caching. I can't remember whether there was a slow-down during training, but the accuracy gain from using augmentations was worth it.
-
Feature caching is a feature of our training code where pre-computed MFCC features for the training set are saved to disk in order, so that subsequent reads are faster. It was added to deal with a training hardware setup whose slow I/O was bottlenecking the GPUs.
Since then, especially after the introduction of the augmentation features, feature caching has become troublesome to support. We have documented the common pitfall of mixing augmentations with feature caching, but people still run into issues.
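To make the pitfall concrete, here is a minimal sketch (not our actual training code) of how a disk cache and augmentation can interact badly, assuming the cache stores computed features and replays them on later epochs. The names here (`compute_mfcc`, `augment`, `get_features`, the cache path) are all hypothetical:

```python
import os
import numpy as np

CACHE_PATH = "mfcc_cache.npy"  # hypothetical on-disk cache location

def compute_mfcc(audio, n_coeffs=13):
    # Stand-in for real MFCC extraction.
    return np.abs(np.fft.rfft(audio))[:n_coeffs]

def augment(audio, rng):
    # Stand-in augmentation: each epoch should see a different noisy variant.
    return audio + rng.normal(scale=0.01, size=audio.shape)

def get_features(audio, rng, use_cache=True):
    if use_cache and os.path.exists(CACHE_PATH):
        # Pitfall: the cache replays whatever was computed on the first epoch,
        # so the augmentation applied back then is frozen and never varies again.
        return np.load(CACHE_PATH)
    features = compute_mfcc(augment(audio, rng))
    if use_cache:
        np.save(CACHE_PATH, features)
    return features

# Every epoch after the first returns identical features, defeating augmentation.
rng = np.random.default_rng(0)
audio = np.sin(np.linspace(0, 1, 16000))
epoch1 = get_features(audio, rng)
epoch2 = get_features(audio, rng)
print(np.array_equal(epoch1, epoch2))  # True when caching is enabled
```

That staleness is presumably why some users simply stop using the cache once augmentations are enabled.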
I'm opening this discussion to gather feedback from users of the training code: do you rely on feature caching? If so, why? What is your hardware setup, and what benefit do you get from it? I'd love to better understand how others are using it so we can evaluate whether to keep this functionality.