Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[googlestt] Fix audio streaming reliability #14649

Merged
merged 4 commits into from
Mar 23, 2023

Conversation

GiviMAD
Copy link
Member

@GiviMAD GiviMAD commented Mar 21, 2023

There was a malfunction some times with this bundle and realize we were dropping the data chunk when we were not able to read the desired size.

This is a problem because those are bytes not samples so we were probably breaking the audio encoding when an odd number of bytes were dropped.

I copied the same solution used in the rustpotterks bundle to read until fill the buffer.

Signed-off-by: Miguel Álvarez <[email protected]>
@jlaur jlaur added the bug An unexpected problem or unintended behavior of an add-on label Mar 22, 2023
Copy link
Contributor

@jlaur jlaur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! A few minor questions from my side.

@GiviMAD
Copy link
Member Author

GiviMAD commented Mar 22, 2023

@jlaur, when I was testing the binding in OH 4 I started to face also an error like the one reported here grpc/grpc-java#9682 after the first use, and it hangs the dialog execution :( so maybe there is something missing in the dialog processor logic.

But I realize it was only when using the single utterance option. So I changed a little the implementation to avoid calling the onComplete method manually and let the sdk call it by itself, and that seems to solve the problem. In the meantime when I was getting out of ideas I updated the library and its dependencies.

Now seems to work ok with OH 4 (I think now, much better than the others).

Do you think is better to split some part in to a different PR or is it ok like this?

@jlaur
Copy link
Contributor

jlaur commented Mar 22, 2023

Do you think is better to split some part in to a different PR or is it ok like this?

It's okay, but perhaps you can rephrase the PR title to mention something user-relatable, i.e. what is fixed for the user? If you think the fix should be cherry-picked into 3.4.x, it would need to be split.

@GiviMAD
Copy link
Member Author

GiviMAD commented Mar 22, 2023

Perfect, I'm going move the dependency update to a separate PR.

I think the part about the onComplete method can stay in this one as I think is the correct way to use the library, is that ok?

About the title I'm not totally sure, because could be something like "Fix audio streaming implementation"? because it was really broken, I think it should be something that tells you "If you already tried and it performs badly, give it another try, as it was broken".

@GiviMAD GiviMAD force-pushed the googlestt_stop_drop_bytes branch from 264cf0e to 467b87b Compare March 22, 2023 21:26
@jlaur
Copy link
Contributor

jlaur commented Mar 22, 2023

I think the part about the onComplete method can stay in this one as I think is the correct way to use the library, is that ok?

It seems to be a part of the fix, so yes. 🙂

About the title I'm not totally sure, because could be something like "Fix audio streaming implementation"? because it was really broken, I think it should be something that tells you "If you already tried and it performs badly, give it another try, as it was broken".

If I understand this correctly, this add-on is a "Speech-to-Text" service, i.e. you will most likely use it to trigger something when saying something? If the audio streaming implementation had a severe bug, does it mean that it didn't work at all, or perhaps that only the first part of the sentence would be translated correctly, because the buffer would then not be filled up correctly? Or would it drop some of the audio stream randomly, but then again be able to stream and translate? I guess it all depends on the details, i.e. how this could be perceived by the user. I can help formulating it when I understand how this could possibly influence the users.

@jlaur jlaur added the patch A PR that has been cherry-picked to a patch release branch label Mar 22, 2023
@GiviMAD
Copy link
Member Author

GiviMAD commented Mar 22, 2023

Yes, I think it was randomly injecting noise into the phrase, because I don't think that loosing that audio part was causing so much trouble, maybe some wrong word in the middle of a sentence or a short connector missed, but as the audio used was encoded using two bytes per sample I think each time we dropped an odd number of bytes, we where sending mostly noise, until dropping another odd number of bytes. So causing a very random behavior when some times it was working almost perfect another times missing multiple consecutive words in phrase.

GiviMAD added 2 commits March 22, 2023 22:55
Signed-off-by: Miguel Álvarez <[email protected]>
Signed-off-by: Miguel Álvarez <[email protected]>
Copy link
Contributor

@jlaur jlaur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jlaur jlaur merged commit a0f442e into openhab:main Mar 23, 2023
@jlaur jlaur added this to the 4.0 milestone Mar 23, 2023
@jlaur
Copy link
Contributor

jlaur commented Mar 23, 2023

Yes, I think it was randomly injecting noise into the phrase, because I don't think that loosing that audio part was causing so much trouble, maybe some wrong word in the middle of a sentence or a short connector missed, but as the audio used was encoded using two bytes per sample I think each time we dropped an odd number of bytes, we where sending mostly noise, until dropping another odd number of bytes. So causing a very random behavior when some times it was working almost perfect another times missing multiple consecutive words in phrase.

So maybe just "Fix audio streaming reliability", WDYT?

@GiviMAD GiviMAD changed the title [googlestt] Fix drop bytes [googlestt] Fix audio streaming reliability Mar 23, 2023
renescherer pushed a commit to renescherer/openhab-addons that referenced this pull request Mar 23, 2023
* [googlestt] Fix drop bytes
* fix unhandled cancelation error when using single utterance mode

Signed-off-by: Miguel Álvarez <[email protected]>
jlaur pushed a commit that referenced this pull request Mar 23, 2023
* [googlestt] Fix drop bytes
* fix unhandled cancelation error when using single utterance mode

Signed-off-by: Miguel Álvarez <[email protected]>
FordPrfkt pushed a commit to FordPrfkt/openhab-addons that referenced this pull request Apr 20, 2023
* [googlestt] Fix drop bytes
* fix unhandled cancelation error when using single utterance mode

Signed-off-by: Miguel Álvarez <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug An unexpected problem or unintended behavior of an add-on patch A PR that has been cherry-picked to a patch release branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants