Python: Flask & Django Constant Secret Key initialization #13561

am0o0 · 2023-06-25T10:39:22Z

This is part of an All for One, One for All query submission; I'm also going to submit an issue in github/securitylab for this pull request.
The major differences between this query and secret-finder tools that work with regexes are:

Not all secrets can be found by regex.
Most secrets are assigned in multiple steps and don't share a single name like SECRET_KEY.
Precision.

I tried my best to write sanitizers. I think this query needs modularization: separate queries for Django and Flask, with the common classes and predicates moved into a library file. I can do this this week if you agree, or in a future pull request.

am0o0 · 2023-06-26T20:18:49Z

Hi @RasmusWL, if you think I should write this query with a better structure, please let me know.
I've also tested this query against Apache Superset and Airflow and didn't observe any significant performance issues in these repositories.

RasmusWL

Overall, I like these queries, have thought about writing something like this myself 👍

Can you please submit the security lab bounty issue now as well? 🙏

However, I think the best practice is to load the SECRET_KEY from the environment (without a default value of course). This is for example the recommended practice in the Django security docs: https://docs.djangoproject.com/en/4.2/howto/deployment/checklist/#secret-key
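
A minimal sketch of the environment-based approach described above. The helper name `load_secret_key` is hypothetical and is not taken from the PR or from the Django docs. It reads the key from a mapping such as `os.environ` and fails fast when the variable is missing or empty, instead of falling back to a default.

```python
import os
from typing import Mapping


def load_secret_key(environ: Mapping[str, str] = os.environ) -> str:
    """Return SECRET_KEY from the environment, with no fallback value.

    `load_secret_key` is a hypothetical helper used only for illustration.
    """
    # Indexing (rather than .get with a default) raises KeyError when the
    # variable is absent, so a missing key cannot silently become a constant.
    key = environ["SECRET_KEY"]
    if not key:
        raise ValueError("SECRET_KEY is set but empty")
    return key
```

In a Django settings module this would be used as `SECRET_KEY = load_secret_key()`, so a misconfigured deployment stops at startup.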

python/ql/test/experimental/query-tests/Security/CWE-287-ConstantSecretKey/app_unsafe.py

RasmusWL · 2023-06-28T15:17:25Z

python/ql/test/experimental/query-tests/Security/CWE-287-ConstantSecretKey/app_safe.py

@@ -0,0 +1,22 @@
+from flask import Flask, session
+from flask_session import Session


Can you highlight for me how flask_session makes this safe? (preferably in a code comment so it doesn't get lost)

Hi, I decided not to include this sanitizer. I originally added it because this library was suggested in a blog post.
After spending more time on this, I found that using this lib is not guaranteed to be safe (or at least I'm not sure it is) when the SECRET_KEY itself is not safe.

RasmusWL · 2023-06-28T15:20:32Z

python/ql/src/experimental/Security/CWE-287-ConstantSecretKey/ConstantSecretKey.ql

+          // this can be ideal if we assume that best security practice is that
+          // we don't get SECRET_KEY from env and we always assign a secure generated random string to it
+          cn.getNumArgument() = 1


As already outlined, I don't think this is best security practice. Do you have any resources to back up your claims? 😊

Well, I think the best practice would be to get secret_key from the environment (without a default value) AND check whether it is None. I wanted to do this, but my query's performance is currently very poor because of the two global data-flow configurations.
Also, from the linked blog post we have:

We also included two other SECRET_KEYs we found, one in a deployment template, thisISaSECRET_1234, and another in the documentation YOUR_OWN_RANDOM_GENERATED_SECRET_KEY.

The chart in the blog post also shows the number of instances that didn't even change the documented or configured keys!

So in my opinion the best practice is to get the secret key from config and, when it is not generated automatically, check that it is neither empty nor the same as the default.
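
The check proposed above can be sketched as follows. This is only an illustration: `validate_secret_key` and `KNOWN_PLACEHOLDER_KEYS` are hypothetical names, and the two placeholder values are the ones quoted from the blog post earlier in this comment.

```python
# Placeholder keys quoted from the blog post above; a real project would list
# its own documented defaults here.
KNOWN_PLACEHOLDER_KEYS = frozenset(
    {"thisISaSECRET_1234", "YOUR_OWN_RANDOM_GENERATED_SECRET_KEY"}
)


def validate_secret_key(key):
    """Reject a missing, empty, or well-known placeholder secret key.

    `validate_secret_key` is a hypothetical name chosen for this sketch.
    """
    if not key:
        raise ValueError("SECRET_KEY is missing or empty")
    if key in KNOWN_PLACEHOLDER_KEYS:
        raise ValueError("SECRET_KEY still has a documented placeholder value")
    return key
```

A comparison like this is the kind of pattern the query could treat as safe, since flow only continues when the value differs from the known defaults.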

https://docs.djangoproject.com/en/4.2/howto/deployment/checklist/#secret-key

I think the Django docs are not a good fit for open source projects: in an open source project, developers and users can make mistakes when changing secret keys or documenting their configuration. Also, according to the docs there is a constant secure key, and in an open source project a constant key stays constant no matter how random it is.

There are also some libs that load env vars from a file, after which they can be accessed by os.getenv, ..., so a constant value is still possible.

@amammad Sorry for the late response, I completely missed this thread.

Storing secret keys in environment variables is a globally recognized good practice. Loading those variables from local files completely defeats the purpose of adding them to the OS environment at boot time. However, from a static analysis point of view, we don't have visibility into where the env vars are loaded from, and it is OK to assume that loading secrets from env vars is a safe approach.

Reporting issues/vulnerabilities for each case where secrets are loaded from env vars will lead to too many FPs. Please adjust the query to not report those issues.

@pwntester no worries!
What about the default constants passed to these env var methods? Should I consider these default values too?

The default values should be reported since they are potential hardcoded credentials

RasmusWL · 2023-06-28T15:22:43Z

python/ql/src/experimental/Security/CWE-287-ConstantSecretKey/ConstantSecretKey.ql

+        sink =
+          [
+            n.getReturn().getAMember().getSubscript(["SECRET_KEY", "JWT_SECRET_KEY"]).asSink(),
+            n.getReturn().getMember(["SECRET_KEY", "JWT_SECRET_KEY"]).asSink(),


As per https://flask.palletsprojects.com/en/2.3.x/api/#flask.Flask.secret_key

Suggested change

-            n.getReturn().getMember(["SECRET_KEY", "JWT_SECRET_KEY"]).asSink(),
+            n.getReturn().getMember(["secret_key", "JWT_SECRET_KEY"]).asSink(),

Although I'm not sure this will cover the value assigned in app.secret_key = <bad>, I think you would be better off using DataFlow::AttrWrite attr, and use the value that is being assigned as the sink 😊

RasmusWL · 2023-06-28T15:25:31Z

python/ql/src/experimental/Security/CWE-287-ConstantSecretKey/ConstantSecretKey.ql

+ * @name Initializing SECRET_KEY of Flask application with Constant value
+ * @description Initializing SECRET_KEY of Flask application with Constant value
+ * files can lead to Authentication bypass


Since query also covers Django, title should be updated (or query should have own file)

RasmusWL · 2023-06-28T15:32:28Z

python/ql/src/experimental/Security/CWE-287-ConstantSecretKey/ConstantSecretKey.ql

+      // this query checks for Django SecretKey too
+      if exists(API::moduleImport("django"))
+      then
+        exists(AssignStmt e | e.getTarget(0).toString() = "SECRET_KEY" |


No use of .toString() in production code.

You probably want something like .(Name).getId() = "SECRET_KEY"
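
The same idea, shown as an analogy with Python's own `ast` module rather than CodeQL: match on the node type and identifier, not on a printed form of the target. `constant_secret_key_assignments` is a hypothetical helper for this sketch.

```python
import ast


def constant_secret_key_assignments(source: str) -> list[int]:
    """Return line numbers of `SECRET_KEY = "<literal>"` assignments.

    Hypothetical helper for illustration; it inspects the AST node type and
    identifier instead of comparing a printed form of the target.
    """
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        value_is_str = isinstance(node.value, ast.Constant) and isinstance(
            node.value.value, str
        )
        for target in node.targets:
            # Structural check: the target must be a plain name whose
            # identifier is exactly SECRET_KEY.
            if (
                isinstance(target, ast.Name)
                and target.id == "SECRET_KEY"
                and value_is_str
            ):
                lines.append(node.lineno)
    return sorted(lines)
```

An attribute write such as `settings.SECRET_KEY = "y"` is an `ast.Attribute` target, so it is deliberately not matched here and would need its own case.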

RasmusWL · 2023-06-28T15:36:07Z

python/ql/src/experimental/Security/CWE-287-ConstantSecretKey/ConstantSecretKey.ql

+// *it seems that sanitizer have a lot of performance issues*
+// for case check whether SECRECT_KEY is empty or not
+predicate sanitizer(Expr sourceExpr) {
+  exists(DataFlow::Node source, DataFlow::Node sink, If i |
+    source.asExpr() = sourceExpr and
+    DataFlow::localFlow(source, sink)
+  |
+    not i.getASubExpression().getAChildNode*().(Compare) = sink.asExpr() and
+    not sink.getScope().getLocation().getFile().inStdlib() and
+    not source.getScope().getLocation().getFile().inStdlib() and
+    not i.getScope().getLocation().getFile().inStdlib()
+  )
+}


This predicate seems to select any expression that is not locally used in an if expression. If you only want to apply it to assignments of SECRET_KEY, start by being able to select these in a predicate, and then ONLY look for which of these can flow to an if statement.

However, configurations also have a predicate called isSanitizer, which you should be able to use for this (instead of restricting the sources).

As highlighted in the comment for [app_safe_2.py](https://github.com/github/codeql/pull/13561/files#diff-0427b14b91512fc56afbdbe6b40f05ef99a40f497ae599c01e9ca95fb1d2d3c6) I don't think the fact that we have an if statement makes this safe either.

I'm using DataFlow::ConfigSig and I can't find an isSanitizer predicate. I'm also not very familiar with sanitizers: do these predicates remove a node from the paths between source and sink?

I think a sanitizer is needed because the fix for one of the CVEs related to this query checks whether the SECRET_KEY value, after being read from an env variable or config value, is the same as the default constant.

Since you are using the newest version of the API (great!) the predicate is called isBarrier.
A barrier is a node that does not allow flow to continue (in fact, it is removed from the flow graph). The comparison with a known safe value would be a prototypical example of a barrier. In fact, we have in our libraries an implementation of "comparison with a constant" since that is a barrier in many queries.

You might be able to adapt the code by using just the first part, handling eq and noteq and replacing the string constant with the node for the default value.

In your case, you have a comparison with a known unsafe value, so if you adapt the code you should flip the true and false branches.

RasmusWL · 2023-06-28T15:36:58Z

python/ql/src/experimental/Security/CWE-287-ConstantSecretKey/ConstantSecretKey.ql

+    |
+      config.hasFlow(n1, _) and
+      n1.asExpr().isConstant() and
+      fileNamehelper = n1.asExpr().(StrConst).getS() and


Suggested change

-      fileNamehelper = n1.asExpr().(StrConst).getS() and
+      fileNamehelper = n1.asExpr().(StrConst).getText() and

RasmusWL · 2023-06-28T15:40:35Z

python/ql/src/experimental/Security/CWE-287-ConstantSecretKey/ConstantSecretKey.ql

+// using flask_session library is safe
+predicate flask_sessionSanitizer(DataFlow::Node source) {
+  not DataFlow::localFlow(source,
+    API::moduleImport("flask_session").getMember("Session").getACall().getArg(0))
+}


This predicate is defined the "wrong way around" for performance.

It holds for all dataflow nodes that don't flow to the first argument of flask_session.Session(), so it will hold for a large number of tuples. Instead, define it only for the nodes that do have flow, and then use it as not flask_sessionSanitizer(app) 👍
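
A rough analogy in plain Python sets, assuming hypothetical names throughout; it does not model how CodeQL actually evaluates predicates. The positive relation is sized by the handful of nodes that reach `Session(...)`, and the negation is applied only to the candidates of interest.

```python
def flows_to_session(flow_edges, session_args):
    """Positive relation: nodes with a flow edge into a Session(...) argument.

    Sized by the number of matching edges, which is typically small.
    """
    return {src for (src, dst) in flow_edges if dst in session_args}


def unsanitized(candidates, flow_edges, session_args):
    """Negate at the use site, over the few candidates we care about."""
    sanitized = flows_to_session(flow_edges, session_args)
    return [node for node in candidates if node not in sanitized]
```

Defining the negated set up front would instead enumerate every node in the program that does not reach `Session(...)`, which is nearly all of them.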

I always thought there was no difference between putting not before the DataFlow call and putting it before a predicate that calls it.
Now I understand :)

RasmusWL · 2023-06-28T15:56:21Z

python/ql/src/experimental/Security/CWE-287-ConstantSecretKey/ConstantSecretKey.ql

+/**
+ * Assignments like `SECRET_KEY = ConstantValue`
+ * which ConstantValue will be found by another DataFlow Configuration
+ * and `SECRET_KEY` location must be a argument of `from_object` or `from_pyfile` methods
+ * the argument/location value will be found by another Taint Tracking Configuration.
+ */
+class SecretKeyAssignStmt extends AssignStmt {
+  SecretKeyAssignStmt() {
+    exists(
+      string configFileName, string fileNamehelper, DataFlow::Node n1, FromObjectFileName config
+    |
+      config.hasFlow(n1, _) and
+      n1.asExpr().isConstant() and
+      fileNamehelper = n1.asExpr().(StrConst).getS() and
+      // because of `from_object` we want first part of `Config.AClassName` which `Config` is a python file name
+      configFileName = fileNamehelper.splitAt(".") and
+      // after spliting, don't look at %py% pattern
+      configFileName != "py"
+    |
+      this.getLocation().getFile().getShortName().matches("%" + configFileName + "%") and
+      this.getTarget(0).toString() = ["SECRET_KEY", "JWT_SECRET_KEY"]
+    ) and
+    not this.getScope().getLocation().getFile().inStdlib()
+  }
+}
+
+/**
+ * we have some file name that telling us the SECRET_KEY location
+ * which have determined by these two methods
+ * `app.config.from_pyfile("configFileName.py")` or `app.config.from_object("configFileName.ClassName")`
+ * this is a helper configuration that help us skip the SECRET_KEY variables that are not related to Flask.
+ */
+class FromObjectFileName extends TaintTracking::Configuration {
+  FromObjectFileName() { this = "FromObjectFileName" }
+
+  override predicate isSource(DataFlow::Node source) {
+    source.asExpr().isConstant() and
+    not source.getScope().getLocation().getFile().inStdlib()
+  }
+
+  override predicate isSink(DataFlow::Node sink) {
+    exists(API::Node n |
+      n = flaskInstance() and
+      flask_sessionSanitizer(n.getReturn().asSource())
+    |
+      sink =
+        n.getReturn()
+            .getMember("config")
+            .getMember(["from_object", "from_pyfile"])
+            .getACall()
+            .getArg(0)
+    ) and
+    not sink.getScope().getLocation().getFile().inStdlib()
+  }
+}


All this part could essentially be replaced by something like the following:

from API::Node app, API::CallNode cn, string filenameUsed
where
  app = API::moduleImport("flask").getMember("Flask").getASubclass*().getACall().getReturn() and
  // your sanitizer for Session
  cn = app.getMember("config").getMember(["from_object", "from_pyfile"]).getACall() and
  filenameUsed =
    cn.getParameter(0).getAValueReachingSink().asExpr().(StrConst).getText()
select cn, filenameUsed
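
For reference, the file-name heuristic in the diff above (`splitAt(".")` followed by `configFileName != "py"`, then a `%...%` match on the file's short name) behaves roughly like the Python sketch below. The function names are hypothetical, and the substring test only approximates `matches`, which treats `_` as a single-character wildcard.

```python
def config_name_candidates(argument: str) -> set[str]:
    """Split a from_object / from_pyfile argument the way the query does.

    Mirrors `fileNamehelper.splitAt(".")` followed by `configFileName != "py"`;
    `config_name_candidates` itself is a hypothetical name for this sketch.
    """
    return {part for part in argument.split(".") if part != "py"}


def may_be_config_file(short_name: str, argument: str) -> bool:
    """Approximate `getShortName().matches("%" + configFileName + "%")`."""
    return any(part in short_name for part in config_name_candidates(argument))
```

Note that every dot-separated part is a candidate, so for `"config.ProductionConfig"` the class name `ProductionConfig` is matched against file names too, not only the module name `config`.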

am0o0 · 2023-06-29T11:04:30Z

Hi @RasmusWL I opened another pull request because of my mistake :(

am0o0 · 2023-06-29T12:07:50Z

Also I'm commenting here that I moved this pull request to here

yoff · 2023-07-26T14:45:21Z

For me to accept this, I need you to look into having consistent results for all the uses of os.environ and os.getenv in the config.py file. -- Currently some are found by the query, and some are not.

Is this fixed now?

yoff · 2023-07-26T14:54:55Z

Everyone will be on vacation until end of next week, but it seems you are close :-)

am0o0 · 2023-07-26T15:39:32Z

Is this fixed now?

Yes

RasmusWL

OK to merge now, will have to look things over a bit when we promote this query, especially the sanitizer around flask_session.

This means tests can pass on any machine now 👍

RasmusWL

see comments on previous approving review

am0o0 · 2023-08-14T11:27:17Z

see comments on previous approving review

@RasmusWL I did variant analysis and there seem to be many false positives caused by test projects. Can we add a sanitizer that requires the path not to contain any test/example keyword?

RasmusWL · 2023-08-14T11:59:57Z

@RasmusWL I did variant analysis and there seem to be many false positives caused by test projects. Can we add a sanitizer that requires the path not to contain any test/example keyword?

@amammad sure, you can use the following piece of code if you want to

codeql/python/ql/src/Security/CWE-326/WeakCryptoKey.ql

Line 22 in 771e686

not origin.getScope().getScope*() instanceof TestScope

am0o0 · 2023-08-14T12:37:39Z

@amammad sure, you can use the following piece of code if you want to

codeql/python/ql/src/Security/CWE-326/WeakCryptoKey.ql

Line 22 in 771e686

not origin.getScope().getScope*() instanceof TestScope

@RasmusWL thanks, I tried this just now, but it doesn't seem to be effective. I've tested the following, and there seem to be far fewer false positives from example/test/demo code.

  predicate isBarrier(DataFlow::Node node) {
    node.getLocation().getFile().inStdlib() or
    node.getLocation().getFile().getAbsolutePath().matches(["%test%", "%demo%", "%example%"])
  }
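
A small Python sketch of what the path patterns above amount to, assuming they behave as plain case-sensitive substring checks. `in_test_like_path` is a hypothetical name, not part of the query.

```python
def in_test_like_path(path: str) -> bool:
    """Substring check in the spirit of matches(["%test%", "%demo%", "%example%"])."""
    return any(keyword in path for keyword in ("test", "demo", "example"))
```

Because this is a substring match on the whole absolute path, it also excludes unrelated locations such as `/srv/latest/app/config.py` ("latest" contains "test"), so some true positives may be dropped along with the test code.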

RasmusWL · 2023-08-14T13:11:39Z

@amammad it's your choice. In the future though, I would appreciate if you made such changes before making the PR 😊

am0o0 · 2023-08-14T13:49:54Z

@amammad it's your choice. In the future though, I would appreciate if you made such changes before making the PR 😊

@RasmusWL I do apologize about my noob behaviors :)

RasmusWL · 2023-08-16T13:06:25Z

Had to retrigger CI, so just merged in main 😊

am0o0 · 2023-08-17T08:20:42Z

@RasmusWL I'm so sorry I should have checked the tests. :(

RasmusWL

for future me: see comments on previous approving review

V1

e3e0307

am0o0 requested a review from a team as a code owner June 25, 2023 10:39

github-actions bot added documentation Python labels Jun 25, 2023

am0o0 changed the title ~~Python: Flask & Django Constant Key initialization~~ Python: Flask & Django Constant Secret Key initialization Jun 25, 2023

calumgrant assigned RasmusWL Jun 26, 2023

RasmusWL requested changes Jun 28, 2023

V2

7a17b99

am0o0 mentioned this pull request Jun 29, 2023

Python: Flask & Django Constant Secret Key initialization #13614

Closed

RasmusWL added the external-contribution label Jun 30, 2023

upgrade query to detect redash CVE too

816799c

am0o0 requested review from a team as code owners June 30, 2023 12:15

github-actions bot added C# JS C++ Java Go labels Jun 30, 2023

am0o0 added 2 commits July 25, 2023 00:11

fix a mistake :(

1e1d42f

remove saniter which was responsible for a defensive technique

591d81b

remove unused saniter

bee8e6f

RasmusWL previously approved these changes Aug 14, 2023

RasmusWL added 5 commits August 14, 2023 11:29

Python: Fix formatting

eeefdc5

Merge branch 'main' into amammad-python-WebAppsConstatntSecretKeys

0fba38c

Python: Only interested in StrConst

6e168ff

Python: Model os.getenv[b]

794d04e

Python: Remove flow through stdlib

1c3cc1f

This means tests can pass on any machine now 👍

RasmusWL dismissed their stale review via 1c3cc1f August 14, 2023 09:56

RasmusWL added the no-change-note-required This PR does not need a change note label Aug 14, 2023

RasmusWL previously approved these changes Aug 14, 2023

sanitize resutls exist in test/demo/example/sample directories

eb5529e

am0o0 dismissed RasmusWL’s stale review via eb5529e August 14, 2023 13:48

Merge branch 'main' into amammad-python-WebAppsConstatntSecretKeys

0443057

RasmusWL self-assigned this Aug 16, 2023

Python: Fix tests

24f9f13

RasmusWL approved these changes Aug 21, 2023

RasmusWL merged commit c8c69aa into github:main Aug 21, 2023
10 checks passed

am0o0 deleted the amammad-python-WebAppsConstatntSecretKeys branch September 14, 2024 11:14
