SAI-34 [CoreService] Implement the API endpoint for Speech-to-Text #24

Open
wants to merge 16 commits into
base: master

@hphun9 hphun9 commented Aug 19, 2023

  • Implement the /transcribe endpoint to transcribe speech audio to text

@hphun9 hphun9 requested review from BinhPhamQuang, minhnld and tritct and removed request for BinhPhamQuang, minhnld and tritct August 19, 2023 13:36
Comment on lines +66 to +67
GOOGLE_CREDENTIAL_FILE: ${{ secrets.GOOGLE_CREDENTIAL_FILE }}
GOOGLE_PROJECT_ID: ${{ secrets.GOOGLE_PROJECT_ID }}
Contributor

Did you test this?
Do you have permission to add these secrets? I don't remember granting you that.

Author

Actually, I have only tested this locally. I don't have permission to add these secrets. Could you grant me permission so that I can test it?

detail="We have an error uploading files",
)

speech_to_text = Speech2Text(gcp_credentials)
Contributor

This is not how dependency injection works. Do you understand how it works?

Author

As far as I know, dependency injection is a pattern in which class A uses methods of class B, and those methods are called dependencies. I tried to use it the same way as some of the APIs you wrote earlier, but it felt a little strange while I was coding it. Can you give me some recommendations?
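
For reference, here is a minimal sketch of what constructor-style dependency injection could look like, in plain Python with no framework. The names SpeechClient, FakeClient and transcribe_endpoint are hypothetical and not taken from this PR; only Speech2Text appears in the diff, and its real constructor may differ. The point is that the handler receives an already-built service instead of creating one inside the request path.

```python
from dataclasses import dataclass
from typing import Protocol


class SpeechClient(Protocol):
    """Hypothetical interface: anything that turns audio bytes into text."""

    def recognize(self, audio: bytes) -> str: ...


@dataclass
class Speech2Text:
    # The client is handed in by the caller; the service never builds it.
    client: SpeechClient

    def transcribe(self, audio: bytes) -> str:
        return self.client.recognize(audio)


def transcribe_endpoint(audio: bytes, service: Speech2Text) -> dict:
    """Hypothetical handler: the service arrives as a parameter."""
    return {"transcript": service.transcribe(audio)}


class FakeClient:
    """Test double standing in for a real speech client (assumption)."""

    def recognize(self, audio: bytes) -> str:
        return f"{len(audio)} bytes received"


# Wiring happens once, at the edge of the application.
service = Speech2Text(client=FakeClient())
result = transcribe_endpoint(b"abc", service)
```

Because the dependency is passed in, a test can substitute FakeClient without touching GCP credentials. In a web framework the same wiring would normally go through that framework's own injection mechanism.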

Comment on lines 32 to 41
config = cloud_speech.RecognitionConfig(
    auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
    language_codes=[language_code],
    model="long",
    # Reference: https://cloud.google.com/speech-to-text/v2/docs/transcription-model
    features=cloud_speech.RecognitionFeatures(
        enable_automatic_punctuation=True,
    ),
)

Contributor

This config should be a global variable, or it should be injected from somewhere.

I get the feeling that you copied this code from an example without adapting it to the codebase conventions and tech stack. Is that the case?

Author

I don't quite follow what you mean, but I will inject this configuration instead of building it inline like this.
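
One possible shape for the injected configuration is sketched below. It uses a frozen dataclass as a stand-in for cloud_speech.RecognitionConfig so that it runs without the Google client library. TranscriptionSettings, DEFAULT_SETTINGS, settings_for and the "en-US" default are assumptions for illustration; the model and punctuation values mirror the snippet under review.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class TranscriptionSettings:
    """Hypothetical stand-in for the values passed to RecognitionConfig."""

    language_codes: Tuple[str, ...] = ("en-US",)  # assumed default
    model: str = "long"
    enable_automatic_punctuation: bool = True


# Built once at import time and shared, not rebuilt on every request.
DEFAULT_SETTINGS = TranscriptionSettings()


def settings_for(
    language_code: str, base: TranscriptionSettings = DEFAULT_SETTINGS
) -> TranscriptionSettings:
    """Derive per-request settings; the frozen base is never mutated."""
    return replace(base, language_codes=(language_code,))


class Transcriber:
    """Receives its settings from the caller instead of hard-coding them."""

    def __init__(self, settings: TranscriptionSettings = DEFAULT_SETTINGS) -> None:
        self.settings = settings


vietnamese = Transcriber(settings_for("vi-VN"))
```

Only the language code varies per request here, so the shared defaults stay in one place and a different model can be swapped in by passing another settings object.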

Comment on lines 42 to 44
recogniser_gcp = f"projects/{project_id}/locations/global/recognizers/_".format(
    project_id=project_id
)
Contributor

This should be a constant that is reused, not recomputed every time this function is called. As I said, you have to understand the code you write in order to notice anything abnormal about it.

Author

Got it. I will update it later.
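
A minimal sketch of how the recognizer path could be computed once and reused. RECOGNIZER_TEMPLATE, recognizer_path and the "demo-project" ID are hypothetical names, not part of this PR. Note also that the f-string in the snippet under review already substitutes project_id, so the trailing .format(...) call there is redundant; the sketch uses a plain template with a single .format call instead.

```python
from functools import lru_cache

# Path pattern taken from the snippet under review; "_" selects the
# default recognizer in the global location.
RECOGNIZER_TEMPLATE = "projects/{project_id}/locations/global/recognizers/_"


@lru_cache(maxsize=None)
def recognizer_path(project_id: str) -> str:
    """Build the path once per project ID; later calls hit the cache."""
    return RECOGNIZER_TEMPLATE.format(project_id=project_id)


# "demo-project" is a placeholder project ID for illustration only.
path = recognizer_path("demo-project")
```

If the project ID is fixed for the lifetime of the service, assigning the formatted string to a module-level constant at startup would be simpler still.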

@hphun9 hphun9 requested a review from minhnld September 7, 2023 03:21