-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[googlestt] Fix audio streaming reliability #14649
Conversation
Signed-off-by: Miguel Álvarez <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fix! A few minor questions from my side.
...hab.voice.googlestt/src/main/java/org/openhab/voice/googlestt/internal/GoogleSTTService.java
Outdated
Show resolved
Hide resolved
...hab.voice.googlestt/src/main/java/org/openhab/voice/googlestt/internal/GoogleSTTService.java
Show resolved
Hide resolved
...hab.voice.googlestt/src/main/java/org/openhab/voice/googlestt/internal/GoogleSTTService.java
Outdated
Show resolved
Hide resolved
@jlaur, when I was testing the binding in OH 4 I started to face also an error like the one reported here grpc/grpc-java#9682 after the first use, and it hangs the dialog execution :( so maybe there is something missing in the dialog processor logic. But I realize it was only when using the single utterance option. So I changed a little the implementation to avoid calling the onComplete method manually and let the sdk call it by itself, and that seems to solve the problem. In the meantime when I was getting out of ideas I updated the library and its dependencies. Now seems to work ok with OH 4 (I think now, much better than the others). Do you think is better to split some part in to a different PR or is it ok like this? |
It's okay, but perhaps you can rephrase the PR title to mention something user-relatable, i.e. what is fixed for the user? If you think the fix should be cherry-picked into 3.4.x, it would need to be split. |
Perfect, I'm going move the dependency update to a separate PR. I think the part about the onComplete method can stay in this one as I think is the correct way to use the library, is that ok? About the title I'm not totally sure, because could be something like "Fix audio streaming implementation"? because it was really broken, I think it should be something that tells you "If you already tried and it performs badly, give it another try, as it was broken". |
Signed-off-by: Miguel Álvarez <[email protected]>
264cf0e
to
467b87b
Compare
It seems to be a part of the fix, so yes. 🙂
If I understand this correctly, this add-on is a "Speech-to-Text" service, i.e. you will most likely use it to trigger something when saying something? If the audio streaming implementation had a severe bug, does it mean that it didn't work at all, or perhaps that only the first part of the sentence would be translated correctly, because the buffer would then not be filled up correctly? Or would it drop some of the audio stream randomly, but then again be able to stream and translate? I guess it all depends on the details, i.e. how this could be perceived by the user. I can help formulating it when I understand how this could possibly influence the users. |
Yes, I think it was randomly injecting noise into the phrase, because I don't think that loosing that audio part was causing so much trouble, maybe some wrong word in the middle of a sentence or a short connector missed, but as the audio used was encoded using two bytes per sample I think each time we dropped an odd number of bytes, we where sending mostly noise, until dropping another odd number of bytes. So causing a very random behavior when some times it was working almost perfect another times missing multiple consecutive words in phrase. |
Signed-off-by: Miguel Álvarez <[email protected]>
Signed-off-by: Miguel Álvarez <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
So maybe just "Fix audio streaming reliability", WDYT? |
* [googlestt] Fix drop bytes * fix unhandled cancelation error when using single utterance mode Signed-off-by: Miguel Álvarez <[email protected]>
* [googlestt] Fix drop bytes * fix unhandled cancelation error when using single utterance mode Signed-off-by: Miguel Álvarez <[email protected]>
* [googlestt] Fix drop bytes * fix unhandled cancelation error when using single utterance mode Signed-off-by: Miguel Álvarez <[email protected]>
There was a malfunction some times with this bundle and realize we were dropping the data chunk when we were not able to read the desired size.
This is a problem because those are bytes not samples so we were probably breaking the audio encoding when an odd number of bytes were dropped.
I copied the same solution used in the rustpotterks bundle to read until fill the buffer.