Whisper ASR is broken #52
Comments
Has the Whisper ASR response format changed? If so, the options are to submit a PR or to use the ASR Docker image versioned before the format change. |
I might be seeing the same issue:
I've replicated the above error on Whisper ASR versions 1.3.0 (latest) and also 1.2.4 and 1.1.1. It seems to be failing here. It looks like the code above expects
interface Segment {
  avg_logprob: number;
  compression_ratio: number;
  end: number;
  id: number;
  no_speech_prob: number;
  seek: number;
  start: number;
  temperature: number;
  text: string;
  tokens: number[];
}
I'm happy to revert to a previous version of the Docker image -- does anyone know what the latest functional version of |
Hey! Were you able to solve the problem? |
This is definitely a regression with v3. I haven't touched my ASR setup in months, and it broke completely today following Obsidian updates. Admittedly I'm using a WhisperX fork, which may be out of date, but I'm assuming a backward-compatible API. I think it's related to the timestamps functionality. AFAIK, there's no timestamp flag in the ASR endpoint, so this is probably causing the ASR to return an error response. I have the timestamp setting turned off, but the generated URL still includes the flag. The proper behavior here is to not include the parameter at all: @djmango looks like this is a bug on your end? |
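For anyone following along, here is a minimal sketch of the behavior described above: the flag is only appended when the setting is on, and is omitted entirely otherwise. The option names, the `/asr` path, the `task`/`output` values, and the `word_timestamps` parameter name are assumptions for illustration, not taken from the plugin's actual code.

```typescript
// Hypothetical settings shape; these field names are illustrative only.
interface AsrUrlOptions {
  baseUrl: string;
  language?: string;
  timestamps: boolean;
}

// Assumed parameter name, used here only to demonstrate the pattern.
const TIMESTAMP_PARAM = "word_timestamps";

function buildAsrUrl(opts: AsrUrlOptions): string {
  // Note: an absolute "/asr" path replaces any path prefix on baseUrl.
  const url = new URL("/asr", opts.baseUrl);
  url.searchParams.set("task", "transcribe");
  url.searchParams.set("output", "json");
  if (opts.language) {
    url.searchParams.set("language", opts.language);
  }
  // Add the flag only when enabled. When disabled, send nothing at all
  // rather than "=false", so servers that don't know the flag are unaffected.
  if (opts.timestamps) {
    url.searchParams.set(TIMESTAMP_PARAM, "true");
  }
  return url.toString();
}
```

With `timestamps: false` the generated URL carries no trace of the parameter, so a server that has never heard of it sees a request it already understands.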
Thanks for investigating, @dahifi. I will corroborate your findings and release an update later today accordingly. |
I found the line here: https://github.com/djmango/obsidian-transcription/pull/50/files#diff-0f4208f0163c212f445df35fc43b99ceb09432700a33b38b514883c99c7d6169R131 I'm going to grab lunch then would like to do the PR myself, kind ser. |
Gotcha, much appreciated |
Not supported by WhisperASR. Fixes djmango#52
WhisperASR expects the file to be a multipart form, e.g.
Whereas it looks like you're sending an octet-stream in the request body? I've had bad luck trying to get things working in my dev notebook. I went back through all the various commits that modified that function and finally wound up on the last contribution I did back in October, which should be v.3.1.6. For now I recommend that ASR users load up that version; I'm not sure I want to spend more time on it. DJ may want to reconsider supporting it, since Swiftlink has diverged from it so much now. Might just want to make a note in the README and leave it at that. |
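To illustrate the multipart point above, here is a rough sketch of sending the audio as form data instead of a raw octet-stream. It assumes a runtime with global `fetch`, `FormData`, and `Blob` (browsers, Node 18+), and it assumes the upload field is named `audio_file`; the function names are hypothetical and this is not the plugin's code.

```typescript
// Assumption: the webservice reads the upload from this form field.
const AUDIO_FIELD = "audio_file";

// Wrap the raw audio bytes in a multipart form with a file name.
function buildAsrForm(audio: ArrayBuffer, fileName: string): FormData {
  const form = new FormData();
  form.append(AUDIO_FIELD, new Blob([audio]), fileName);
  return form;
}

// POST the form to the ASR endpoint and return the parsed JSON body.
async function transcribe(
  asrUrl: string,
  audio: ArrayBuffer,
  fileName: string
): Promise<unknown> {
  // Do not set Content-Type by hand: fetch generates the
  // "multipart/form-data; boundary=..." header from the FormData body.
  const response = await fetch(asrUrl, {
    method: "POST",
    body: buildAsrForm(audio, fileName),
  });
  if (!response.ok) {
    throw new Error(`ASR request failed with status ${response.status}`);
  }
  return response.json();
}
```

If the HTTP client in use cannot take a `FormData` body, the multipart payload has to be assembled manually, which is where boundary and header mistakes tend to creep in.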
It looks like this issue does not occur when using |
Doing further investigation, it looks like
I'll open up an issue on https://github.com/ahmetoner/whisper-asr-webservice to try to standardize this, but for now, I'll just tweak the code to handle both so we can get this issue fixed. |
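A sketch of what "handle both" could look like. The thread does not preserve which two response shapes were involved, so this assumes, purely for illustration, that a segment arrives either as an object matching the `Segment` interface quoted earlier or as a positional array. The array field order below is an assumption, and `normalizeSegment` is a hypothetical helper, not the code from the released update.

```typescript
interface Segment {
  avg_logprob: number;
  compression_ratio: number;
  end: number;
  id: number;
  no_speech_prob: number;
  seek: number;
  start: number;
  temperature: number;
  text: string;
  tokens: number[];
}

// ASSUMED positional order for array-style segments (illustrative only).
const ARRAY_FIELDS = [
  "id", "seek", "start", "end", "text", "tokens",
  "temperature", "avg_logprob", "compression_ratio", "no_speech_prob",
] as const;

// Accept either shape and always return an object-style Segment.
function normalizeSegment(raw: unknown): Segment {
  if (Array.isArray(raw)) {
    if (raw.length < ARRAY_FIELDS.length) {
      throw new Error(`Segment array too short: ${raw.length} items`);
    }
    // Map each position onto its assumed field name.
    const out: Record<string, unknown> = {};
    ARRAY_FIELDS.forEach((key, index) => {
      out[key] = raw[index];
    });
    return out as unknown as Segment;
  }
  // Object-style segments pass through after a light sanity check.
  if (typeof raw === "object" && raw !== null && "text" in raw) {
    return raw as Segment;
  }
  throw new Error("Unrecognized segment format");
}
```

Normalizing at the boundary keeps the rest of the code working against a single `Segment` shape, whichever format the server returns.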
Update released |
This plugin no longer works with the Whisper ASR Webservice. It sends the audio file to the Docker container, but cannot parse the response properly.