Replies: 6 comments
>>> erdoc
[May 3, 2019, 8:37pm]
I'm trying to replicate reuben's small Python program
(https://hacks.mozilla.org/2018/09/speech-recognition-deepspeech/) in a
macOS Mojave terminal.
libsox is installed and I have access to the microphone from the
terminal, inside a Python 3.7.0 virtual environment.
The program runs without errors, but the output is only 'Transcription: ...
BLANK'. It appears model.finishStream(sctx) doesn't output anything.
I have confirmed the mic is working by changing the rec parameter -q to
-S and -V3. The paths to the model, LM and trie files are all correct
(DeepSpeech works when called from the command line and supplied an audio
file argument).
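One more way to rule out silent input is to measure the level of the raw bytes that rec writes to its pipe before they are handed to the model. The following is only a minimal sketch, assuming the pipe carries 16-bit signed little-endian mono PCM (as the SoX output description reports); `rms_int16` is a hypothetical helper name, not part of DeepSpeech or SoX.

```python
import array
import math
import sys


def rms_int16(raw: bytes) -> float:
    """Return the RMS level of raw 16-bit signed little-endian PCM.

    Hypothetical helper for debugging only. A value at or near 0.0
    means the chunk is effectively silence.
    """
    # Drop a trailing odd byte so the buffer holds whole samples.
    usable = len(raw) - (len(raw) % 2)
    samples = array.array("h")
    samples.frombytes(raw[:usable])
    # array uses native byte order; swap on big-endian hosts.
    if sys.byteorder == "big":
        samples.byteswap()
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

Calling this on each chunk read from the recorder's stdout and printing the result would show whether non-zero audio actually reaches the Python side. Full scale for 16-bit audio is 32767, so speech should give values well above zero.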
Lastly, this is the console output:
> python test.py --model models/output_graph.pbmm --alphabet models/alphabet.text --lm models/lm.binary --trie models/trie
> Initializing model...
> TensorFlow: v1.12.0-10-ge232881c5a
> DeepSpeech: v0.4.1-0-g0e40db6
> 2019-05-03 15:14:11.615995: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
> You can start speaking now. Press Control-C to stop recording.
> rec: SoX v
> rec WARN formats: can't set sample rate 16000; using 44100
> rec WARN formats: can't set 1 channels; using 2
>
> Input File : 'default' (coreaudio)
> Channels : 2
> Sample Rate : 44100
> Precision : 32-bit
> Sample Encoding: 32-bit Signed Integer PCM
> Endian Type : little
> Reverse Nibbles: no
> Reverse Bits : no
>
> Output File : '-' (raw)
> Channels : 1
> Sample Rate : 16000
> Precision : 16-bit
> Sample Encoding: 16-bit Signed Integer PCM
> Endian Type : little
> Reverse Nibbles: no
> Reverse Bits : no
> Comment : 'Processed by SoX'
>
> rec INFO sox: effects chain: input 44100Hz 2 channels
> rec INFO sox: effects chain: gain 44100Hz 2 channels
> rec INFO sox: effects chain: channels 44100Hz 1 channels
> rec INFO sox: effects chain: rate 16000Hz 1 channels
> rec INFO sox: effects chain: dither 16000Hz 1 channels
> rec INFO sox: effects chain: output 16000Hz 1 channels
> In:0.00% 00:00:08.61 [00:00:00.00] Out:137k [ | ] Clip:0 ^C
> Aborted.
> Transcription:
Thank you.
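As a sanity check on the log above, the amount of audio rec produced can be compared with what 8.61 seconds of 16 kHz mono should yield. This is a rough sketch, assuming the `Out:137k` counter in the SoX status line counts output samples; the function names are made up for illustration.

```python
def expected_samples(seconds: float, sample_rate: int = 16000, channels: int = 1) -> int:
    """Number of samples expected for a recording of the given length."""
    # round() avoids losing a sample to floating-point truncation.
    return round(seconds * sample_rate * channels)


def expected_bytes(seconds: float, sample_rate: int = 16000,
                   channels: int = 1, bytes_per_sample: int = 2) -> int:
    """Number of raw PCM bytes expected on the pipe (16-bit by default)."""
    return expected_samples(seconds, sample_rate, channels) * bytes_per_sample


print(expected_samples(8.61))  # 137760
print(expected_bytes(8.61))    # 275520
```

Under that assumption, roughly 137k samples matches the log, which suggests rec did emit audio at the expected rate and the issue lies further down the pipeline.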
[This is an archived TTS discussion thread from discourse.mozilla.org/t/streaming-api-on-mac-os-x]