Issue Transcribing Twilio Media Stream Using @aws-sdk/client-transcribe-streaming #4648
-
I'm attempting to transcribe audio coming from a Twilio Media Stream using @aws-sdk/client-transcribe-streaming. I am sending audio to AWS via StartStreamTranscriptionCommand and getting responses back, but they're all empty (e.g. { TranscriptEvent: { Transcript: { Results: [] } } }). The audio from Twilio arrives over a websocket connection in 20 ms chunks. It is base64-encoded, mono, 8-bit, 8 kHz mulaw. Here is the transform stream I'm writing to.
I'm using the wavefile lib to help with the mulaw-to-PCM conversion. The other pieces involved are pretty much copied and pasted from the code snippets at https://www.npmjs.com/package/@aws-sdk/client-transcribe-streaming. I'm not sure whether the audio encoding is the issue, whether it's something related to the transform stream, or whether 20 ms chunks are simply too small. I did see this mentioned on the best practices webpage: "PCM (only signed 16-bit little-endian audio formats, which does not include WAV)". I thought a WAV file was just a wrapper for PCM encoding. According to wavefile, fromMulaw() decodes 8-bit mu-Law as 16-bit linear PCM, which sounds like what I need. I'd really appreciate any help that you could provide to get this working. Thanks!
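For anyone comparing notes on the encoding step, here is a minimal sketch of the mulaw-to-PCM conversion described above, written without wavefile so that the byte layout is explicit. It is not the code from this post. The function names `muLawToPcm16` and `decodeTwilioPayload` are made up for illustration, and the decoder is the standard G.711 mu-law expansion. The output is raw signed 16-bit little-endian samples with no WAV header, which is the format the quoted best-practices line asks for.

```javascript
// Hypothetical helper: expand one 8-bit G.711 mu-law byte to a signed 16-bit sample.
function muLawToPcm16(muLawByte) {
  const u = ~muLawByte & 0xff;          // mu-law bytes are stored bit-inverted
  const sign = u & 0x80;                // top bit carries the sign
  const exponent = (u >> 4) & 0x07;     // 3-bit segment number
  const mantissa = u & 0x0f;            // 4-bit step within the segment
  // 0x84 (132) is the standard mu-law bias added during encoding.
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return sign ? -magnitude : magnitude;
}

// Hypothetical helper: turn one base64 mu-law payload into a Buffer of
// headerless signed 16-bit little-endian PCM.
function decodeTwilioPayload(base64Payload) {
  const muLaw = Buffer.from(base64Payload, "base64");
  const pcm = Buffer.alloc(muLaw.length * 2); // each mu-law byte becomes 2 PCM bytes
  for (let i = 0; i < muLaw.length; i++) {
    pcm.writeInt16LE(muLawToPcm16(muLaw[i]), i * 2);
  }
  return pcm;
}
```

A 20 ms chunk at 8 kHz is 160 mu-law bytes, so each decoded chunk should be 320 bytes of PCM. Checking that length is a quick way to spot a WAV header or a wrong bit depth slipping into the stream.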
Replies: 3 comments 1 reply
-
Quick update: I was able to get this working this morning with a slight variation on the code snippet above.
This is how I'm writing to the transform stream.
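For readers landing here from search, the following is a rough sketch of one way to feed decoded chunks into a stream and expose them in the `{ AudioEvent: { AudioChunk } }` shape that the SDK README uses for `AudioStream`. It is not the snippet from this comment. `handleTwilioMessage` and `toAudioEvents` are hypothetical names, and `decode` stands in for whatever mulaw-to-PCM function you use. The sketch assumes Twilio websocket messages are JSON, with `event: "media"` carrying a base64 `media.payload` and `event: "stop"` marking the end of the call.

```javascript
const { PassThrough } = require("node:stream");

// Hypothetical handler: route one Twilio websocket message into a stream.
// `decode` is assumed to map a base64 mu-law payload to a PCM Buffer.
function handleTwilioMessage(rawMessage, source, decode) {
  const msg = JSON.parse(rawMessage);
  if (msg.event === "media") {
    source.write(decode(msg.media.payload)); // queue the decoded audio
  } else if (msg.event === "stop") {
    source.end(); // lets the async generator below finish
  }
}

// Hypothetical adapter: wrap every chunk read from the stream in the
// event shape expected for the AudioStream parameter.
async function* toAudioEvents(source) {
  for await (const chunk of source) {
    yield { AudioEvent: { AudioChunk: chunk } };
  }
}

// Usage outline (not executed here):
//   const source = new PassThrough();
//   ws.on("message", (m) => handleTwilioMessage(m, source, decode));
//   ...then pass toAudioEvents(source) as AudioStream.
```

With this shape, the generator would be passed as `AudioStream` to `StartStreamTranscriptionCommand`, along with `MediaEncoding: "pcm"`, `MediaSampleRateHertz: 8000` and a `LanguageCode`. Those values are assumptions based on the 8 kHz source audio described in the question.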
-
Hello! Reopening this discussion to make it searchable.
-
Thanks for the help. Saved a lot of time.