>>> nganferneejoan
[April 10, 2019, 12:35pm]
Hello! My schoolmates and I are working on a group project using
DeepSpeech, and we have a few questions.
1. What preprocessing does DeepSpeech apply to audio files?
2. Will silence at the beginning and end of an audio file affect the
package's ability to do inference?
3. The main objective of our project is to recognize sung
notes (especially solfege). For example, someone sings 'do re mi fa
re do'. We want to get the exact syllables the person said,
regardless of the actual pitch they sang. (That is, if they say
'do' but sing the pitch 'la', we want 'do'.) Is this
Python package suitable for such a use case? If not, what are some
suggested packages for this kind of project?
Thanks a bunch!
[This is an archived TTS discussion thread from discourse.mozilla.org/t/preprocessing-silence-lyric-recognition]