Python: Handle diagnostics writing for BuiltinModuleExtractable
#16940
So we have the info in the logs if the diagnostics processing fails
LGTM! I like the extra information being logged, and presumably the shift from unhandled exception to warning should allow extraction to continue (and not fail).
Looks good to me. 👍
I'm still somewhat surprised that this error is occurring at all, but I think this should fix it.
Hey @RasmusWL - chiming in because you'd commented you weren't sure how to test this out and couldn't reproduce locally. This is how we found the problem in the wild, and how I kind of diagnosed it as a hybrid between an environment problem and a python extractor problem. This client that we are rolling out CodeQL to is using AmazonLinux to build their runner images (yep, I know, that's where this becomes an environment issue), and AmazonLinux handles cryptography functions in a weird way because of FIPS compliance. I targeted the python
I wrote a small Python script which would try to create a blake2 hash, and ran it in Python 3.9 in AL2, AL2023, and Ubuntu (Ubuntu was running 3.10). Here's the script:

from hashlib import blake2b
h = blake2b()
h.update(b'Hello world')
print(h.hexdigest())

On AmazonLinux, it failed because blake2b (and blake2s) don't exist in hashlib. On Ubuntu, it works fine and produced a hex digest. According to the docs

In theory, running this same Python script through CodeQL's Python Extractor in an AmazonLinux container and on Ubuntu should be a good sanity check to test this. AmazonLinux is available from Docker Hub, so in theory you should be able to spin up an AL2 or AL2023 container locally and just run the CodeQL CLI there to reproduce the error.
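A slightly more defensive variant of the script above may be handy when comparing images, since it reports what is missing instead of stopping at the first ImportError. This is only a sketch: the helper name blake2_status is made up for illustration, and the ValueError branch is an assumption about how a restricted crypto build might reject an algorithm that is present but disabled.

```python
import hashlib


def blake2_status():
    """Report which BLAKE2 variants this interpreter's hashlib exposes.

    Hypothetical helper, not part of CodeQL or the extractor.
    """
    status = {}
    for name in ("blake2b", "blake2s"):
        # On the AmazonLinux images described above, the attribute
        # itself is reported to be absent from hashlib.
        constructor = getattr(hashlib, name, None)
        if constructor is None:
            status[name] = "missing"
            continue
        try:
            constructor(b"Hello world").hexdigest()
            status[name] = "ok"
        except ValueError as exc:
            # Assumption: a restricted build may expose the name but
            # refuse to construct the hash.
            status[name] = f"unusable: {exc}"
    return status


if __name__ == "__main__":
    print(blake2_status())
    # Everything the linked crypto backend claims to support.
    print(sorted(hashlib.algorithms_available))
```

Running it in each container and diffing the output should show whether blake2 is the only missing algorithm or one of several.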
Some of the internal tooling would not be too happy about this :D
Thanks @molson504x, that's really valuable context 👍 When the change from this PR is applied, we will still have the same underlying problem with

oh, I forgot to actually submit this comment 🙈
Thanks! Yeah, I'm not saying blake2 was the only one, just the one I happened to notice and used as my test. My theory is that AmazonLinux is using some kind of FIPS crypto library that doesn't include, among others, the blake2 algorithms, which would explain the pathing problem.
In the hopes this will fix a problem encountered in the wild (that I couldn't reproduce locally).
I couldn't find any tests that I could easily extend to test the new behavior either, so what I did was to locally alter BuiltinExtractor to always throw an exception. I observed this causes us to generate Exception: 'BuiltinModuleExtractable' object has no attribute 'path' in the logs.

With the commits from this PR, we now properly log the failure reason:

and generate diagnostics such as (EDIT: updated to not include location after 354394d)
I'm not 100% sure if the tooling will be happy with the fake "file" here, so want to confirm that before merging this PR.
EDIT: I forgot to mention I also tried inserting an exception in the internal_error_message function, to verify that error handling also works 👍
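For readers following along, the general shape of the fix can be sketched as below. This is not the extractor's actual code: the two classes are simplified stand-ins, and describe, report_failure, and the diagnostic dictionary layout are hypothetical names chosen for illustration. It also assumes a builtin module is identified by a name attribute rather than a path.

```python
import logging

logger = logging.getLogger("extractor_sketch")


class FileExtractable:
    """Stand-in for an extractable backed by a file on disk."""

    def __init__(self, path):
        self.path = path


class BuiltinModuleExtractable:
    """Stand-in for a builtin module: it has a name but no path."""

    def __init__(self, name):
        self.name = name


def describe(unit):
    """Return a description without assuming every unit has a .path."""
    path = getattr(unit, "path", None)
    if path is not None:
        return {"kind": "file", "id": path}
    name = getattr(unit, "name", None)
    if name is not None:
        return {"kind": "builtin module", "id": name}
    return {"kind": "unknown", "id": repr(unit)}


def report_failure(unit, error, write_diagnostic):
    """Log the failure, then try to write a diagnostic for it.

    Returns True if the diagnostic was written. Any error raised while
    writing is downgraded to a warning so the caller can carry on.
    """
    logger.warning("Failed to extract %r: %s", unit, error)
    try:
        write_diagnostic({"message": str(error), **describe(unit)})
    except Exception:
        logger.warning("Could not write diagnostics", exc_info=True)
        return False
    return True
```

The point of the pattern is that the original failure reason is logged before any diagnostics processing runs, so it survives even if that processing itself fails.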