Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uncaught exceptions in audio captcha generation #238

Open
anarcat opened this issue Dec 2, 2024 · 7 comments
Open

uncaught exceptions in audio captcha generation #238

anarcat opened this issue Dec 2, 2024 · 7 comments

Comments

@anarcat
Copy link

anarcat commented Dec 2, 2024

In our environment, we're getting spurious exceptions like this:

Oct 03 13:27:05 donate-01 donate[949]: /usr/bin/sox FAIL formats: can't open input file `/tmp/1a0e906c0bb38acf4526812068c1e497d90c1b4d_arbitrary.wav': No such file or directory
Oct 03 13:27:05 donate-01 donate[949]: [ERROR: django.request.log_response] Internal Server Error: /captcha/audio/1a0e906c0bb38acf4526812068c1e497d90c1b4d.wav
Oct 03 13:27:05 donate-01 donate[949]: Traceback (most recent call last):
Oct 03 13:27:05 donate-01 donate[949]:   File "/home/tordonate/venv/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
Oct 03 13:27:05 donate-01 donate[949]:     response = get_response(request)
Oct 03 13:27:05 donate-01 donate[949]:                ^^^^^^^^^^^^^^^^^^^^^
Oct 03 13:27:05 donate-01 donate[949]:   File "/home/tordonate/venv/lib/python3.11/site-packages/django/core/handlers/base.py", line 197, in _get_response
Oct 03 13:27:05 donate-01 donate[949]:     response = wrapped_callback(request, *callback_args, **callback_kwargs)
Oct 03 13:27:05 donate-01 donate[949]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Oct 03 13:27:05 donate-01 donate[949]:   File "/home/tordonate/venv/lib/python3.11/site-packages/captcha/views.py", line 197, in captcha_audio
Oct 03 13:27:05 donate-01 donate[949]:     os.remove(arbnoisepath)
Oct 03 13:27:05 donate-01 donate[949]: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/1a0e906c0bb38acf4526812068c1e497d90c1b4d_arbitrary.wav'
Oct 03 13:27:05 donate-01 donate[949]: 0.0.0.0 - "GET /captcha/audio/1a0e906c0bb38acf4526812068c1e497d90c1b4d.wav HTTP/1.1" 500 6628 45
Oct 03 13:27:05 donate-01 donate[949]: 0.0.0.0 - "GET /captcha/audio/1a0e906c0bb38acf4526812068c1e497d90c1b4d.wav HTTP/1.1" 200 33088 33
Oct 03 13:27:05 donate-01 donate[949]: 0.0.0.0 - "GET /captcha/image/a0f3b95e27cdadccd06aacb165004bf80ea4b97d/ HTTP/1.1" 200 6001 7
Oct 03 13:27:05 donate-01 donate[949]: 0.0.0.0 - "GET /captcha/audio/a0f3b95e27cdadccd06aacb165004bf80ea4b97d.wav HTTP/1.1" 206 32044 31
Oct 03 13:27:07 donate-01 donate[949]: 0.0.0.0 - "GET /captcha/audio/a0f3b95e27cdadccd06aacb165004bf80ea4b97d.wav HTTP/1.1" 200 32044 30
Oct 03 13:27:07 donate-01 donate[949]: /usr/bin/sox FAIL formats: can't open input file `/tmp/a0f3b95e27cdadccd06aacb165004bf80ea4b97d_arbitrary.wav': No such file or directory
Oct 03 13:27:07 donate-01 donate[949]: [ERROR: django.request.log_response] Internal Server Error: /captcha/audio/a0f3b95e27cdadccd06aacb165004bf80ea4b97d.wav
Oct 03 13:27:07 donate-01 donate[949]: Traceback (most recent call last):
Oct 03 13:27:07 donate-01 donate[949]:   File "/home/tordonate/venv/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
Oct 03 13:27:07 donate-01 donate[949]:     response = get_response(request)
Oct 03 13:27:07 donate-01 donate[949]:                ^^^^^^^^^^^^^^^^^^^^^
Oct 03 13:27:07 donate-01 donate[949]:   File "/home/tordonate/venv/lib/python3.11/site-packages/django/core/handlers/base.py", line 197, in _get_response
Oct 03 13:27:07 donate-01 donate[949]:     response = wrapped_callback(request, *callback_args, **callback_kwargs)
Oct 03 13:27:07 donate-01 donate[949]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Oct 03 13:27:07 donate-01 donate[949]:   File "/home/tordonate/venv/lib/python3.11/site-packages/captcha/views.py", line 197, in captcha_audio
Oct 03 13:27:07 donate-01 donate[949]:     os.remove(arbnoisepath)
Oct 03 13:27:07 donate-01 donate[949]: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/a0f3b95e27cdadccd06aacb165004bf80ea4b97d_arbitrary.wav'
Oct 03 13:27:07 donate-01 donate[949]: 0.0.0.0 - "GET /captcha/audio/a0f3b95e27cdadccd06aacb165004bf80ea4b97d.wav HTTP/1.1" 500 6628 24
[...]
Oct 03 14:31:41 donate-01 donate[949]: /usr/bin/sox FAIL formats: can't open input file `/tmp/ed9879ca11de0a7ad7f3d910793f888af26b4d3d_arbitrary.wav': WAVE: RIFF header not found
Oct 03 14:31:41 donate-01 donate[949]: [ERROR: django.request.log_response] Internal Server Error: /captcha/audio/ed9879ca11de0a7ad7f3d910793f888af26b4d3d.wav
Oct 03 14:31:41 donate-01 donate[949]: Traceback (most recent call last):
Oct 03 14:31:41 donate-01 donate[949]:   File "/home/tordonate/venv/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
Oct 03 14:31:41 donate-01 donate[949]:     response = get_response(request)
Oct 03 14:31:41 donate-01 donate[949]:                ^^^^^^^^^^^^^^^^^^^^^
Oct 03 14:31:41 donate-01 donate[949]:   File "/home/tordonate/venv/lib/python3.11/site-packages/django/core/handlers/base.py", line 197, in _get_response
Oct 03 14:31:41 donate-01 donate[949]:     response = wrapped_callback(request, *callback_args, **callback_kwargs)
Oct 03 14:31:41 donate-01 donate[949]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Oct 03 14:31:41 donate-01 donate[949]:   File "/home/tordonate/venv/lib/python3.11/site-packages/captcha/views.py", line 199, in captcha_audio
Oct 03 14:31:41 donate-01 donate[949]:     os.rename(mergedpath, path)
Oct 03 14:31:41 donate-01 donate[949]: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/ed9879ca11de0a7ad7f3d910793f888af26b4d3d_merged.wav' -> '/tmp/ed9879ca11de0a7ad7f3d910793f888af26b4d3d.wav'
Oct 03 14:31:41 donate-01 donate[949]: 0.0.0.0 - "GET /captcha/audio/ed9879ca11de0a7ad7f3d910793f888af26b4d3d.wav HTTP/1.1" 500 6628 33
Oct 03 14:31:41 donate-01 donate[949]: /usr/bin/sox FAIL formats: can't open input file `/tmp/ed9879ca11de0a7ad7f3d910793f888af26b4d3d.wav': No such file or directory
Oct 03 14:31:41 donate-01 donate[949]: [ERROR: django.request.log_response] Internal Server Error: /captcha/audio/ed9879ca11de0a7ad7f3d910793f888af26b4d3d.wav
Oct 03 14:31:41 donate-01 donate[949]: Traceback (most recent call last):
Oct 03 14:31:41 donate-01 donate[949]:   File "/home/tordonate/venv/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
Oct 03 14:31:41 donate-01 donate[949]:     response = get_response(request)
Oct 03 14:31:41 donate-01 donate[949]:                ^^^^^^^^^^^^^^^^^^^^^
Oct 03 14:31:41 donate-01 donate[949]:   File "/home/tordonate/venv/lib/python3.11/site-packages/django/core/handlers/base.py", line 197, in _get_response
Oct 03 14:31:41 donate-01 donate[949]:     response = wrapped_callback(request, *callback_args, **callback_kwargs)
Oct 03 14:31:41 donate-01 donate[949]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Oct 03 14:31:41 donate-01 donate[949]:   File "/home/tordonate/venv/lib/python3.11/site-packages/captcha/views.py", line 197, in captcha_audio
Oct 03 14:31:41 donate-01 donate[949]:     os.remove(arbnoisepath)
Oct 03 14:31:41 donate-01 donate[949]: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/ed9879ca11de0a7ad7f3d910793f888af26b4d3d_arbitrary.wav'
Oct 03 14:31:41 donate-01 donate[949]: 0.0.0.0 - "GET /captcha/audio/ed9879ca11de0a7ad7f3d910793f888af26b4d3d.wav HTTP/1.1" 500 6628 30

it's unclear why this is happening, we suspect it might be related with multithreading issues in the codebase and lack of locking. we also have issues with old captcha files not being cleaned up for example.

but exactly why this is happening is somewhat besides the point: such failures should at least be handled better than a 500 error with a backtrace. we should add exception handling in (e.g.) this code:

if settings.CAPTCHA_SOX_PATH:
arbnoisepath = str(
os.path.join(tempfile.gettempdir(), "%s_arbitrary.wav") % key
)
mergedpath = str(os.path.join(tempfile.gettempdir(), "%s_merged.wav") % key)
subprocess.call(
[
settings.CAPTCHA_SOX_PATH,
"-r",
"8000",
"-n",
arbnoisepath,
"synth",
"2",
"brownnoise",
"gain",
"-15",
]
)
subprocess.call(
[
settings.CAPTCHA_SOX_PATH,
"-m",
arbnoisepath,
path,
"-t",
"wavpcm",
"-b",
"16",
mergedpath,
]
)
os.remove(arbnoisepath)
os.remove(path)
os.rename(mergedpath, path)

so that we can somewhat recover more gracefully when that happens. if at least this was a custom exception from this module (instead of a fairly generic OSError/FileNotFoundError, we could handle it (or ignore it, at least for now) differently than other exceptions...

i'd be happy to propose a PR to at least do the latter if there's an opening here.

Note that this issue is tracked at https://gitlab.torproject.org/tpo/web/donate-neo/-/issues/142 where you can also see the entire source code for this project.

@mbi
Copy link
Owner

mbi commented Dec 3, 2024

Hello, thanks for opening this issue.

but exactly why this is happening is somewhat besides the point:

I suspect this is due to #196, i.e. sox producing wav files at different sample rates and then being unable to merge them. If you'd like to invest some time into investigating the real source of this and submitting a pull request, I'd really appreciate this.

so that we can somewhat recover more gracefully when that happens. if at least this was a custom exception from this module (instead of a fairly generic OSError/FileNotFoundError, we could handle it (or ignore it, at least for now) differently than other exceptions

I mean, I agree, but in the end I'm more concerned by the user experience, so I'd rather fix the underlying problem. In the end the user will get a 404 instead of a 500 when requesting the audio, and both suck in pretty much the same way.

mbi added a commit that referenced this issue Dec 3, 2024
@mbi
Copy link
Owner

mbi commented Dec 3, 2024

@anarcat I just pushed a fix for #196 and, if it is indeed related, could potentially also fix this issue.
Could you please test against HEAD and report back?

@anarcat
Copy link
Author

anarcat commented Dec 3, 2024

awesome! we're going to deploy this for testing soon, thanks, you can follow our progress in https://gitlab.torproject.org/tpo/web/donate-neo/-/issues/142 but i'll try to report back here specifically soon!

part of the challenge is this is intermittent, so it's hard to tell whether it's fixed... but i'll do my best!

@anarcat
Copy link
Author

anarcat commented Dec 16, 2024

@anarcat I just pushed a fix for #196 and, if it is indeed related, could potentially also fix this issue.
Could you please test against HEAD and report back?

we're currently running 0613733 and we're still getting FileNotFound exceptions, here's an example:

Dec 16 17:31:00 donate-01 donate[1311653]: Traceback (most recent call last):
Dec 16 17:31:00 donate-01 donate[1311653]:   File "/home/tordonate/venv/lib/python3.11/site-packages/django/core/handlers/exception.py", line 55, in inner
Dec 16 17:31:00 donate-01 donate[1311653]:     response = get_response(request)
Dec 16 17:31:00 donate-01 donate[1311653]:                ^^^^^^^^^^^^^^^^^^^^^
Dec 16 17:31:00 donate-01 donate[1311653]:   File "/home/tordonate/venv/lib/python3.11/site-packages/django/core/handlers/base.py", line 197, in _get_response
Dec 16 17:31:00 donate-01 donate[1311653]:     response = wrapped_callback(request, *callback_args, **callback_kwargs)
Dec 16 17:31:00 donate-01 donate[1311653]:                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Dec 16 17:31:00 donate-01 donate[1311653]:   File "/home/tordonate/venv/lib/python3.11/site-packages/captcha/views.py", line 213, in captcha_audio
Dec 16 17:31:00 donate-01 donate[1311653]:     os.remove(arbnoisepath)
Dec 16 17:31:00 donate-01 donate[1311653]: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/7ba478e90cc50f10d9c8aff4c658a571d245f942_arbitrary.wav'

update: fix copy-paste, previous version was from before the fix, but the error is mostly unchanged.

@mbi mbi closed this as completed in 1c92409 Dec 17, 2024
@mbi mbi reopened this Dec 17, 2024
@mbi
Copy link
Owner

mbi commented Dec 17, 2024

Right, I suppose that what is happening is that for some reason the URL to the audio file is being requested multiple times at a very short interval.

The way audio files are handled is

  1. Generate a text-to-speach wav file at /tmp/hash-of-captcha.wav
  2. Generate random noise file at /tmp/hash-of-captcha-noise.wav
  3. Merge the two files into /tmp/hash-of-captcha-noise-merged.wav
  4. Delete /tmp/hash-of-captcha-noise.wav
  5. Delete /tmp/hash-of-captcha.wav
  6. Rename /tmp/hash-of-captcha-noise-merged.wav to /tmp/hash-of-captcha.wav
  7. Read /tmp/hash-of-captcha.wav into a temporary file
  8. Delete /tmp/hash-of-captcha.wav
  9. Serve the temporary file in a RangedFileResponse

Now, your errors all happen at steps 4, 5 and 6, which probably means that a second process is serving another request to the same url. The second process generates the files (steps 1, 2, 3) but the first process deletes them (steps 4, 5, 6) before they can be deleted / renamed / read again.

The underlying issue is that each process should have its own temporary directory to avoid one process deleting the other's files, but that's not the case today.

As a workaround, 1c92409 adds a random postfix to the three filenames generated in the first three steps, so that each process has their own set of files.

Could you please test this commit?

@anarcat
Copy link
Author

anarcat commented Dec 17, 2024

Hi!

As a workaround, 1c92409 adds a random postfix to the three filenames generated in the first three steps, so that each process has their own set of files.

I don't want to be that guy, but random, in general, and randint in particular, seems like a Bad Idea. I would suggest using secrets.token_urlsafe instead, which has higher entropy, is not predictable, and will be less likely to result in a crash.

(now, i know, you'd need like half a billion parallel requests to hit that collision, but i always feel seeing random in codebases is code smell and just generally avoid it.)

Could you please test this commit?

will do!

The underlying issue is that each process should have its own temporary directory to avoid one process deleting the other's files, but that's not the case today.

note that we're using multithreading here, so having "processes" in their own directories wouldn't fix this.

in general though, it seems we're fixing this the wrong way: if the files already exist, why not just reuse them or wait for another (async?) process to complete the generation, instead of rebuilding the files?

@mbi
Copy link
Owner

mbi commented Dec 18, 2024

I don't want to be that guy, but random, in general, and randint in particular, seems like a Bad Idea. I would suggest using secrets.token_urlsafe instead, which has higher entropy, is not predictable, and will be less likely to result in a crash.

No that's a good point, fixed in b7abe2e

note that we're using multithreading here, so having "processes" in their own directories wouldn't fix this.

Yes I was thinking of using tempfile.TemporaryDirectory which ought to be threadsafe and also handles automatic deletion when used as a context manager. This way every thread should get their own copy.

in general though, it seems we're fixing this the wrong way: if the files already exist, why not just reuse them or wait for another (async?) process to complete the generation, instead of rebuilding the files?

Because repetition attacks. This is precisely why we're injecting noise into the generated wav file, otherwise one cleartext will generate the same output sound file when requested multiple times, which will give an attacker agency to create a rainbow table of file hashes. And also, it makes the architecture of the audio generation view utterly complicated for no good reason at all IMO. 🤷

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants