Dj db duplicate saves #52

DJensen94 · 2024-12-09T22:36:38Z

🗣 Description

💭 Motivation and context

🧪 Testing

✅ Pre-approval checklist

This PR has an informative and human-readable title.
Changes are limited to a single goal - eschew scope creep!
All future TODOs are captured in issues, which are referenced in code comments.
All relevant type-of-change labels have been added.
I have read the CONTRIBUTING document.
These code changes follow cisagov code standards.
All relevant repo and/or project documentation has been updated to reflect the changes in this PR.
Tests have been added and/or modified to cover the changes in this PR.
All new and existing tests pass.

✅ Pre-merge checklist

Revert dependencies to default branches.
Finalize version.

✅ Post-merge checklist

Create a release.

src/pe_reports/helpers/download_encrypt_excel.py

+# cisagov Libraries
+from pe_reports.data.config import db_password_key
+from pe_reports.data.db_query import connect_to_staging, get_orgs, get_orgs_pass
+


To fix the problem, we should remove the print statement that logs the sensitive PASSWORD variable. Instead of printing the password, we can log a message indicating that the password has been retrieved without revealing its value. This ensures that sensitive information is not exposed in the logs.

Remove the print(PASSWORD) statement on line 33.

Optionally, add a log message indicating that the password has been retrieved.

src/pe_source/data/pe_db/db_query_source.py

+    except json.decoder.JSONDecodeError as err:
+        LOGGER.error(err)
+
+


To fix the problem, we should avoid logging sensitive information directly. Instead, we can log a sanitized version of the data or avoid logging it altogether. In this case, we will sanitize the data variable before logging it by removing or masking sensitive information such as API keys.

Identify the lines where sensitive information is being logged.

Sanitize the data by removing or masking sensitive information before logging.

Ensure that the functionality of the code remains unchanged.
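The masking step described above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `db_query_source.py`: the helper names `mask_sensitive` and `log_payload`, the `SENSITIVE_KEYS` set, and the shape of the payload are all hypothetical.

```python
import json
import logging

LOGGER = logging.getLogger(__name__)

# Hypothetical set of key names to mask; adjust to the real payload.
SENSITIVE_KEYS = {"access_token", "api_key", "password"}


def mask_sensitive(value, mask="****"):
    """Return a copy of value with sensitive dict entries replaced by mask."""
    if isinstance(value, dict):
        return {
            key: mask
            if str(key).lower() in SENSITIVE_KEYS
            else mask_sensitive(val, mask)
            for key, val in value.items()
        }
    if isinstance(value, list):
        # Walk lists so that nested dicts are masked as well.
        return [mask_sensitive(item, mask) for item in value]
    return value


def log_payload(payload):
    """Log a JSON rendering of payload with sensitive values masked."""
    LOGGER.info(json.dumps(mask_sensitive(payload), default=str))
```

Because the helper builds a copy, the original object is left untouched, so the code that later saves the data to the database still sees the real values.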

src/pe_source/data/pshtt/cli.py

+    with smart_open(out_filename) as out_file:
+        json_content = utils.json_for(results)
+
+        out_file.write(json_content + "\n")


To fix the problem, we should ensure that the JSON content is encrypted before being written to the file. We can use the cryptography library to handle encryption and decryption. Specifically, we will:

Encrypt the JSON content before writing it to the file.

Decrypt the JSON content when reading it back from the file (if needed).

We will need to:

Import the necessary modules from the cryptography library.

Define functions to handle encryption and decryption.

Modify the to_json function to encrypt the JSON content before writing it to the file.

src/pe_source/data/pshtt/pshtt.py

+            logging.warning(
+                "%s: Not publicly trusted - not trusted by %s.",
+                endpoint.url,
+                ", ".join(public_not_trusted_names),


To fix the problem, we should avoid logging the names of the trust stores directly. Instead, we can log a generic message indicating that the certificate is not publicly trusted without revealing specific details. This approach maintains the functionality of informing about the trust status while protecting potentially sensitive information.

Replace the logging statement on line 889 to avoid logging the specific names of the trust stores.

Ensure that the new logging message still conveys the necessary information without exposing sensitive details.
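One way the generic message might look is sketched below. The function name `warn_if_not_publicly_trusted` and the logger name are hypothetical; in `pshtt.py` the call sits inline and uses `endpoint.url` and `public_not_trusted_names` directly.

```python
import logging

# Hypothetical logger name, used only for this sketch.
LOGGER = logging.getLogger("pshtt_sketch")


def warn_if_not_publicly_trusted(url, public_not_trusted_names):
    """Warn about an untrusted certificate without naming the trust stores.

    Returns True if a warning was logged, False otherwise.
    """
    if not public_not_trusted_names:
        return False
    # The trust store names are deliberately left out of the message.
    LOGGER.warning("%s: Not publicly trusted.", url)
    return True
```

The reader of the log still learns which endpoint failed the trust check; the list of stores stays available to the calling code for any non-logging use.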

src/pe_source/pshtt_wrapper.py

+
+            i += 1
+        LOGGER.info("%s: Completed running PSHTT", thread)
+


To fix the problem, we should avoid logging the entire results object directly. Instead, we can log a sanitized version of the results object that excludes any sensitive information. This can be achieved by creating a function that filters out sensitive fields from the results object before logging it.

Create a function to sanitize the results object by removing or masking sensitive fields.

Replace the direct logging of results with the sanitized version.

Ensure that the changes are made in the run_pshtt function where the logging occurs.
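The suggestion above filters out named sensitive fields. A possible variant, shown here only as a sketch, keeps an allow-list of fields that are known to be safe to log, so that any new field is excluded by default. The names `summarize_results`, `log_failed_results`, and `LOGGABLE_FIELDS`, and the field names in the allow-list, are assumptions; the real pshtt result keys may differ.

```python
import logging

LOGGER = logging.getLogger(__name__)

# Hypothetical allow-list; the real pshtt result fields may differ.
LOGGABLE_FIELDS = ("Domain", "Live", "Valid HTTPS")


def summarize_results(results, fields=LOGGABLE_FIELDS):
    """Keep only allow-listed fields from each result dict."""
    return [
        {name: result[name] for name in fields if name in result}
        for result in results
    ]


def log_failed_results(thread, results):
    """Log a failure using only the allow-listed fields of each result."""
    LOGGER.error("%s: failed result %s", thread, summarize_results(results))
```

With this shape, the `LOGGER.error` call in `run_pshtt` would pass the summarized list instead of the raw `results` object.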

edujosemena and others added 30 commits September 23, 2022 10:36

pre-commit issues

b0dff77

Merge branch 'develop' into EM-improve-dnstwist

0f477cc

merge latest develop

24196b9

fix linting

040b1de

Add report generator page to UI

34d1600

Add report generator page to UI to generate biweekly reports and bulletins for cybersix alerts and credential breaches

Replace logger init with logging.getLogger

cace04b

Remove spacy from views

f77b36a

add helpers folder to setup.py

53435e5

add helpers folder to package data to be able to reference helpers from app

Moved Bulletin folder

0ee5ef4

Moved bulletin folder to the correct helpers folder

add pdfkit to modules

ed5d8cc

add pdfkit to the modules in setup file

update logging call in config file

fe2dd83

Update the logging declaration in the config file in pe-source folder

Central formatting

05d43a6

Changes location and changed location, also used Central Logging

Merge branch 'EM-improve-dnstwist' of https://github.com/cisagov/pe-r…

5d97b8f

…eports into EM-improve-dnstwist

Update dnstwist.py

44d45bc

test

0b4ab72

test

Save data_schema.sql changes for a separate PR

65ea84f

renamed db_query

c77f345

renaming db_query to avoid errors

lock sqlalchemy

c03f1fa

Set port as an int in __init__

3099bca

Set URI port to float before int

7c9ae08

Set port to 5000 if empty

19e8db1

Fix celery import in __init__.py

971b981

Can't import same name

Specify importlib-metadata as 4.12.0

6a36379

set importlib-metadata to 4.8

4e88394

Address lgtm error: copy dark_web_date

ce72e84

Errors in Setup

1d28a06

error

a9b402e

errors

3df55ac

Errors

c59923f

Error

lgmt Errors

2977ce5

cduhn17 and others added 28 commits January 3, 2024 10:55

Update CODEOWNERS

bca5e7b

Updated most of the api code to match staging, still need to add a fe…

1c32952

…w other new endpoints

Added endpoints for NIST CVE and Xpanse scans

b4606ec

Updated all scan code to be in line with staging

36c52c1

Updated PE report generation, encryption, and mailing code

6f51c68

updated asm_sync, db backup, and pe_scorecard code to be in line with…

67bc318

… staging

also updating this __init__ file

2e19d18

Added go files

60fa0a2

Updated miscellaneous repo files to be in line with staging

55345a1

Updated the rest of the pe_reports_django_project folder

6750d41

Updated everything to pass all pre-commit and linting checks except b…

1fd8fe3

…andit

Bandit, markdownlint, and prettier pre-commit fixes added

3b1b192

updated os.popen calls and added nosec, added blank .env and database…

d307fad

….ini for pre-commit check

went back to gitignoring .env and database.ini

3868bf2

Fixed and updated /breachdetails and /breachcomp_credsbydate endpoint…

384b178

…s to be tasked

Resolve merge conflicts

51a512b

CODEOWNERS handle correction

61163ee

Added endpoint code for github issues 699,700,701,702,703,5,6,7,8,9,1…

b402cb0

…0,11,12,16,17,18

Various logging improvements, ASM summary tweaks, and scan fixes

9e8166a

Added adhoc investigation scripts and updated gitignore

8b5eb2b

Some additional adhoc investigation tweaks

1e91bea

Revamped ASM sync code and minor fixes for c6g alerts/topcve scans

20f09d8

Add mini data lake app and models

057fb57

Add mini data lake app and models and router to point models to the correct database

add sync_dmz_mdl script

e8699b9

add script that creates the empty datalake so that models can be migrated into it

Update shodan api calls to save to mdl

2de72eb

Update shodan api calls to save to mdl as well as pe database

Fixes django project, resolves numerous config issues and allows the …

9f21b3f

…app to run

Gets app running locally

e24e63b

update pe_source scripts to save to mdl

da733fe

update pe_source scripts to save to mdl

github-advanced-security bot found potential problems Dec 9, 2024

DJensen94 closed this Dec 9, 2024

@@ -32,3 +32,3 @@
             PASSWORD = db_password_key()
-            print(PASSWORD)
+            LOGGER.info("Database password has been retrieved.")

@@ -968,3 +968,4 @@
-                LOGGER.info(data)
+                sanitized_data = {k: (v if k != "access_token" else "****") for k, v in pshtt_dict.items()}
+                LOGGER.info(json.dumps(sanitized_data, default=str))
                 try:

@@ -32,2 +32,3 @@
             import pytablewriter
+            from cryptography.fernet import Fernet
@@ -63,6 +64,27 @@
-                    out_file.write(json_content + "\n")
+                    # Encrypt the JSON content before writing to the file
+                    key = generate_key()
+                    encrypted_content = encrypt_data(json_content, key)
+                    out_file.write(encrypted_content.decode() + "\n")
                     if out_file is not sys.stdout:
-                        logging.warning("Wrote results to %s.", out_filename)
+                        logging.warning("Wrote encrypted results to %s.", out_filename)
+            def generate_key():
+                """Generate a key for encryption."""
+                return Fernet.generate_key()
+            def encrypt_data(data, key):
+                """Encrypt the provided data using the provided key."""
+                fernet = Fernet(key)
+                return fernet.encrypt(data.encode())
+            def decrypt_data(data, key):
+                """Decrypt the provided data using the provided key."""
+                fernet = Fernet(key)
+                return fernet.decrypt(data).decode()

Package	Version	Security advisories
cryptography (pypi)	44.0.0	None

@@ -93,2 +93,10 @@
+            def sanitize_results(results):
+                """Sanitize the results object by removing or masking sensitive fields."""
+                sanitized_results = []
+                for result in results:
+                    sanitized_result = {key: value for key, value in result.items() if key not in ["sensitive_field1", "sensitive_field2"]}
+                    sanitized_results.append(sanitized_result)
+                return sanitized_results
             def run_pshtt(domains, thread):
@@ -123,3 +131,4 @@
                             LOGGER.error("%s: %s", thread, e)
-                            LOGGER.error("%s: failed result %s", thread, results)
+                            sanitized_results = sanitize_results(results)
+                            LOGGER.error("%s: failed result %s", thread, sanitized_results)

@@ -3 +3,3 @@
             wheel
+            cryptography==44.0.0
