Python: Open URL without Certificate Validation #3878

dilanbhalla · 2020-07-02T17:14:12Z

Query that detects uses of urlopen from the urllib.request and urllib2.request modules without appropriate certificate validation. A similar query already exists in Python for the requests module, but there is no coverage yet for these urllib modules, which are very commonly used. By specifying a cafile, capath, or context, a developer can ensure that the server certificate is verified; without these parameters, the call is vulnerable to man-in-the-middle attacks.
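For illustration, here is a minimal sketch of what supplying a context looks like on the Python side. The helper names make_verifying_context and open_verified are hypothetical and not part of this PR; the sketch only assumes the standard ssl and urllib.request modules.

```python
import ssl
import urllib.request


def make_verifying_context(cafile=None, capath=None):
    """Build a client-side SSL context that verifies the server certificate.

    With no arguments the system's default CA certificates are used;
    cafile/capath point at a custom CA bundle or directory instead.
    """
    # create_default_context() defaults to Purpose.SERVER_AUTH, which sets
    # verify_mode to CERT_REQUIRED and enables hostname checking.
    return ssl.create_default_context(cafile=cafile, capath=capath)


def open_verified(url, timeout=10.0):
    """Open url with an explicit verifying context (hypothetical helper)."""
    ctx = make_verifying_context()
    return urllib.request.urlopen(url, context=ctx, timeout=timeout)


if __name__ == "__main__":
    ctx = make_verifying_context()
    # Inspect the settings that make this context a verifying one.
    print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

How a call with none of these arguments behaves depends on the Python version, so passing an explicit context makes the intent visible in the code.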

python/ql/src/experimental/Security/CWE-295/UrlopenWithoutValidation.ql

dilanbhalla · 2020-07-10T09:18:44Z

Hi Rasmus, apologies in advance for the long follow-up, but I have a question unrelated to this query and am unsure how else to contact you (since our earlier discussion thread is now closed). This may seem a little silly, but the trace-command you showed me for Python doesn't work because the '$' is not recognized. My end goal is simply to use the CLI to build a Python database that includes some custom XML files I wrote, so I believe your method would work for Python (init, index the XML files, trace-command, finalize). Does your PR need to be merged before this trace-command will work? Or is it something simple that I may be doing wrong with the expression starting with '$'? Lastly, if including the XML is not possible at all, would you happen to know any other way to include custom data (maybe through something like a CSV) and reference it within a Python QL file? Thank you so much!

RasmusWL · 2020-07-10T09:35:03Z

Hi @dilanbhalla, I posted an answer on the relevant issue instead :)

dilanbhalla · 2020-07-12T21:07:33Z

Thanks for the update @RasmusWL, and apologies for the silly mistake. Also I just added/pushed the fixes you mentioned @intrigus-lgtm, so thank you too!

tausbn

I have added a few comments about how to rewrite this using API graphs, which will have to take place before it can be merged. Apart from that, I think the PR looks pretty solid. 👍

tausbn · 2021-04-08T18:37:16Z

python/ql/src/experimental/Security/CWE-295/UrlopenWithoutValidation.ql

+  urlopen = Value::named(["urllib2.request.urlopen", "urllib.request.urlopen"]) and
+  http_arg = urlopen.getArgumentForCall(call, 0) and
+  http_arg.pointsTo(http_string) and
+  http_string.getText().matches("https://%") and
+  (
+    not exists(Value verify |
+      verify = call.getArgByName(["cafile", "capath", "context"]).pointsTo()
+    )
+    or
+    empty = call.getArgByName(["cafile", "capath", "context"]).pointsTo() and
+    empty = Value::none_()
+  )


Our current approach to modelling libraries is to use API graphs.

In this case, I think it should be fairly simple to rewrite the code to use API graphs. The calls you're interested in can be found using

API::moduleImport(["urllib","urllib2"]).getMember("request").getMember("urlopen").getACall()

For the value that is being passed to the named argument, I would either see if the named argument is absent altogether (in which case the default is used), or if it is present and equal to API::builtin("None").getAUse(). That should mostly replicate the behaviour you have above.
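To make the condition described above concrete, here is a rough plain-Python approximation built on the standard ast module. It is not the CodeQL query and does no points-to or dataflow analysis: it only matches calls syntactically by the name urlopen and only recognises a literal None. The function and constant names are hypothetical.

```python
import ast

# Keyword arguments that the query treats as evidence of validation.
VALIDATION_KWARGS = {"cafile", "capath", "context"}


def _is_urlopen(func):
    # Matches `urlopen(...)` and `<anything>.urlopen(...)` by name only.
    if isinstance(func, ast.Name):
        return func.id == "urlopen"
    if isinstance(func, ast.Attribute):
        return func.attr == "urlopen"
    return False


def _is_none(node):
    # Only a literal None is recognised; aliases of None are not tracked.
    return isinstance(node, ast.Constant) and node.value is None


def find_unvalidated_urlopen(source):
    """Return line numbers of urlopen calls on https:// string literals
    where the validation keywords are absent or one of them is None."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and _is_urlopen(node.func)):
            continue
        if not node.args:
            continue
        first = node.args[0]
        is_https_literal = (
            isinstance(first, ast.Constant)
            and isinstance(first.value, str)
            and first.value.startswith("https://")
        )
        if not is_https_literal:
            continue
        supplied = [kw for kw in node.keywords if kw.arg in VALIDATION_KWARGS]
        # Mirrors the two branches above: no such argument at all,
        # or such an argument is present and equal to None.
        if not supplied or any(_is_none(kw.value) for kw in supplied):
            flagged.append(node.lineno)
    return sorted(flagged)
```

Calling find_unvalidated_urlopen on a module's source text returns the lines worth a closer look; anything that depends on how a value flows into the argument is out of reach for this sketch and is what the API graph version is meant to handle.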

dilanbhalla · 2021-04-08

Thanks for the advice, API graphs are a really cool feature! I added the changes you recommended and reformatted the code a bit (made sure to remember the autoformatting too 👍 ).

Oddly, one of my test cases was not caught when I used API::builtin("None").getAUse() instead of comparing the value of the argument to Value::none_(). I kept the latter method, which worked, and commented out the former. If you have any ideas why that might be the case, that would be super helpful; otherwise the method I am currently using works anyway (I will remove the comments once I get feedback on this decision). Thanks again, and let me know if there is anything else I need to fix!

tausbn · 2021-04-09

Ah, it's possible we have a bug in API::builtin where we fail to include things like None (since API::builtin is really a model of the builtins module, but None is kind of special). I'll see if I can make a quick fix for that.

tausbn · 2021-04-09

#5639 should fix the problem with None.

Incidentally, for this one it might be better to use getAnImmediateUse rather than getAUse. The difference here is that the latter will also take some (possibly unsound) dataflow into account, and this may lead to undesirable false negatives.

dilanbhalla · 2021-04-09

@tausbn new pull request is here: #5644. I used getAnImmediateUse instead and modified the test slightly to allow for this. I also kept API::builtin("None").getAUse() since you have opened a PR to fix this. 👍

dilanbhalla · 2021-04-08T21:40:07Z

Apologies, I believe I made a bit of a mistake. I tried to update my repo by pulling the upstream changes made since I last worked on this, but that seems to have complicated things. I don't want to request review unnecessarily from codeql-java and codeql-javascript, so should I just open a fresh PR and resubmit there?

dilanbhalla · 2021-04-08T21:51:12Z

Going to resubmit in a fresh PR.

urlopen without cert validation

f52d44d

dilanbhalla requested a review from a team as a code owner July 2, 2020 17:14

intrigus-lgtm reviewed Jul 2, 2020

Update python/ql/src/experimental/Security/CWE-295/UrlopenWithoutVali…

bdf2068

…dation.ql Co-authored-by: intrigus-lgtm <[email protected]>

RasmusWL added the Python label Jul 3, 2020

dilanbhalla added 2 commits July 7, 2020 11:58

removed precision tag and fixed syntax error in qhelp file

f5ae679

Merge branch 'urlopen' of https://github.com/dilanbhalla/codeql into …

ec9985d

…urlopen

RasmusWL mentioned this pull request Jul 10, 2020

Java: Include all XML files with codeql-cli #3887

Closed

added pr fixes

32de708

adityasharad changed the base branch from master to main August 14, 2020 18:33

tausbn requested changes Apr 8, 2021

dilanbhalla added 3 commits April 8, 2021 12:14

Merge branch 'master' of https://github.com/github/codeql into urlopen

dc89e91

Merge branch 'main' of https://github.com/github/codeql into urlopen

64dfcb5

pulling upstream changes, use apigraph in query

4cdd54c

github-actions bot added the documentation label Apr 8, 2021

dilanbhalla requested a review from tausbn April 8, 2021 21:30

use api graphs, update upstream changes

19f037b

dilanbhalla requested review from a team as code owners April 8, 2021 21:34

github-actions bot added Java JS labels Apr 8, 2021

dilanbhalla closed this Apr 8, 2021

tausbn mentioned this pull request Apr 9, 2021

Python: Add missing builtins to API::builtin #5639

Merged

dilanbhalla mentioned this pull request Apr 9, 2021

Python: Urlopen without Certificate Validation #5644

Closed
