When the community gets access to the alpha version of the data set, there may be sentences of the form:
"I am in room 2049."
This may be validly spoken in several ways:
"I am in room two thousand forty nine" "I am in room twenty forty nine." "I am in room two zero four nine."
A preprocessor, under the current API, would only have access to the sentence:
"I am in room 2049."
and be tasked with, among other things, converting the digits into words.
In this case, the text alone is insufficient to determine the correct way to convert the digits to words, so the current preprocessor API can't handle it. Only by listening to the audio can one correctly disambiguate these readings.
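To make the ambiguity concrete, here is a minimal sketch (the function name and word tables are invented for illustration, and hardcoded to cover just this one example) that enumerates the plausible readings of the digit string:

```python
def candidate_verbalizations(digits: str) -> list[str]:
    """Enumerate plausible spoken readings of a 4-digit number like '2049'."""
    ones = ["zero", "one", "two", "three", "four",
            "five", "six", "seven", "eight", "nine"]
    tens = {"2": "twenty", "4": "forty"}  # just enough for this example
    d = digits
    return [
        # cardinal reading: "two thousand forty nine"
        f"{ones[int(d[0])]} thousand {tens[d[2]]} {ones[int(d[3])]}",
        # pairwise reading: "twenty forty nine"
        f"{tens[d[0]]} {tens[d[2]]} {ones[int(d[3])]}",
        # digit-by-digit reading: "two zero four nine"
        " ".join(ones[int(c)] for c in d),
    ]

print(candidate_verbalizations("2049"))
# ['two thousand forty nine', 'twenty forty nine', 'two zero four nine']
```

Nothing in the sentence itself tells the preprocessor which of these three outputs matches what the speaker actually said.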
Now each audio clip is uniquely identified by the pair of the `user_id` of the speaker and the `sentence` itself. So if passed the `user_id` and `sentence` pair, a preprocessor can know which particular audio clip the sentence corresponds to.
Then the creator of the preprocessor can actually listen to that clip and correctly convert the sentence
"I am in room 2049."
to the actual audio spoken in the clip:
"I am in room two thousand forty nine" "I am in room twenty forty nine." "I am in room two zero four nine."
...
So the locale-specific preprocessor API should change to accept the pair of parameters `user_id` and `sentence`.
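Here is a minimal sketch of what that could look like (the function shape, the `user_42` id, and the lookup table are all hypothetical, for illustration only); the point is just that the `(user_id, sentence)` pair gives the preprocessor a handle on the specific clip:

```python
# Readings confirmed by listening to the clip identified by (user_id, sentence).
VERIFIED_READINGS: dict[tuple[str, str], str] = {
    ("user_42", "I am in room 2049."): "I am in room twenty forty nine.",
}

def preprocess(user_id: str, sentence: str) -> str:
    """Locale-specific preprocessing keyed by the (user_id, sentence) pair."""
    # Prefer a reading verified against the actual audio clip.
    verified = VERIFIED_READINGS.get((user_id, sentence))
    if verified is not None:
        return verified
    # Otherwise fall back to text-only normalization, which stays
    # ambiguous for digit strings like "2049".
    return sentence

print(preprocess("user_42", "I am in room 2049."))
# I am in room twenty forty nine.
```

With the current single-parameter API, the `VERIFIED_READINGS` lookup above would be impossible to key, since two clips with the same text could not be told apart.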