[DRAFT] #3307: duplicate key bug for public contact #3362

zandercymatics · 2025-01-17T17:08:55Z

Ticket

Resolves #3307

Changes

Change 1
Change 2

Context for reviewers

As far as I can tell, this appears to be a race condition (notes). In both instances of this bug occurring, get_contact_in_keys was called twice with a very small time interval between. Right after this occurs - we get this fail condition. In addition, it seems like our app was undergoing heavy load at this same exact time (use the view surrounding documents button).

The check in _get_or_create_public_contact is very concrete in that we just return the record if one exists but does not modify it. This is fine sequentially, but it's vulnerable to a common "check-then-act" type of race condition. Consider this sequence:

Thread A checks if contact exists (returns empty queryset)
Thread B checks if contact exists (returns empty queryset)
Thread A creates contact
Thread B attempts to create contact => IntegrityError

In other words, it appears as though the only code path that can cause an integrity error is if the filter return an empty queryset at just the right time. I've included a unit test for this, and it reliably reproduces the bug without my fix and the test passes with it.

I've directly tested (modified code for) two alternative ideas:

The get_id on public_contact.py function returns the same id.
a. Sequentially, the probability of collision is really low here -- roughly 10^25. random.choices() isn't uniformly distributed though like uuid, but the probability space is so high that I find this option less convincing.
b. In parallel (threads), random is thread-safe. Pythons docs note concurrency issues when using a free-threaded build (so without the GIL), but we do not use that build.
The interaction between _get_or_create_public_contact and _add_missing_contacts_if_unknown creates duplicates because .save() is called twice on the same record.
a. It is worth noting that _add_missing_contacts_if_unknown calls save, but doesn't check for duplicates like get_or_create does.
b. This doesn't seem consistent with our log stack and is in line with what addAllDefaults does.
c. Sequentially, I couldn't find a code path for an id conflict to occur because this error happens within _get_or_create_public_contact (which is called before add_missing_contacts) AND we have a check for its existence before attempting a save.

Ultimately though, we are missing just enough logging to prevent a more exhaustive search here. I've tried to add it to places that I think are important.

Setup

This is fairly tricky to test as this bug is somewhat rare in the first place. I've included a test case that is able to reliably reproduce the issue though - and that might be a good place to test this.

To cause this error, do the following in _get_or_create_public_contact:
Comment out this code:

            try:
                with transaction.atomic():
                    public_contact.save(skip_epp_save=True)
                    logger.info(f"Created a new PublicContact: {public_contact}")
            except IntegrityError as err:
                logger.error(
                    f"_get_or_create_public_contact() => tried to create a duplicate public contact: {err}",
                    exc_info=True,
                )
                return PublicContact.objects.get(
                    registry_id=public_contact.registry_id,
                    contact_type=public_contact.contact_type,
                    domain=self,
                )

And replace it with this (or the old code):

public_contact.save(skip_epp_save=True)

Then run the included unit test. You should reliably see an integrity error as per the original issue description.

Code Review Verification Steps

As the original developer, I have

Satisfied acceptance criteria and met development standards

Met the acceptance criteria, or will meet them in a subsequent PR
Created/modified automated tests
Update documentation in READMEs and/or onboarding guide

Ensured code standards are met (Original Developer)

If any updated dependencies on Pipfile, also update dependencies in requirements.txt.
Interactions with external systems are wrapped in try/except
Error handling exists for unusual or missing values

Validated user-facing changes (if applicable)

Tag @dotgov-designers in this PR's Reviewers for design review. If code is not user-facing, delete design reviewer checklist
Verify new pages have been added to .pa11yci file so that they will be tested with our automated accessibility testing
Checked keyboard navigability
Tested general usability, landmarks, page header structure, and links with a screen reader (such as Voiceover or ANDI)

As a code reviewer, I have

Reviewed, tested, and left feedback about the changes

Pulled this branch locally and tested it
Verified code meets all checks above. Address any checks that are not satisfied
Reviewed this code and left comments. Indicate if comments must be addressed before code is merged
Checked that all code is adequately covered by tests
Verify migrations are valid and do not conflict with existing migrations

Validated user-facing changes as a developer

Note: Multiple code reviewers can share the checklists above, a second reviewer should not make a duplicate checklist. All checks should be checked before approving, even those labeled N/A.

New pages have been added to .pa11yci file so that they will be tested with our automated accessibility testing
Checked keyboard navigability
Meets all designs and user flows provided by design/product
Tested general usability, landmarks, page header structure, and links with a screen reader (such as Voiceover or ANDI)
(Rarely needed) Tested as both an analyst and applicant user

As a designer reviewer, I have

Verified that the changes match the design intention

Checked that the design translated visually
Checked behavior. Comment any found issues or broken flows.
Checked different states (empty, one, some, error)
Checked for landmarks, page heading structure, and links

Validated user-facing changes as a designer

Checked keyboard navigability
Tested general usability, landmarks, page header structure, and links with a screen reader (such as Voiceover or ANDI)
Tested with multiple browsers (check off which ones were used)
- Chrome
- Microsoft Edge
- FireFox
- Safari
(Rarely needed) Tested as both an analyst and applicant user

References

Code review best practices

Screenshots

github-actions · 2025-01-17T17:13:34Z

🥳 Successfully deployed to developer sandbox za.

github-actions · 2025-01-17T18:14:56Z

🥳 Successfully deployed to developer sandbox za.

zandercymatics added 9 commits January 13, 2025 15:06

Better table + simplify get_or_create function

828c47d

Update admin.py

c0c9817

Update domain.py

f86a065

Update domain.py

4649198

Update domain.py

511e961

Add test for race condition

1ece895

Test part 2

4de5860

Squash bug

cfcdd88

cleanup

a0d6e1e

zandercymatics requested a review from abroddrick as a code owner January 17, 2025 17:08

Remove unrelated changes

8c204d4

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[DRAFT] #3307: duplicate key bug for public contact #3362

[DRAFT] #3307: duplicate key bug for public contact #3362

zandercymatics commented Jan 17, 2025 •

edited

Loading

github-actions bot commented Jan 17, 2025

github-actions bot commented Jan 17, 2025

[DRAFT] #3307: duplicate key bug for public contact #3362

Are you sure you want to change the base?

[DRAFT] #3307: duplicate key bug for public contact #3362

Conversation

zandercymatics commented Jan 17, 2025 • edited Loading

Ticket

Changes

Context for reviewers

Setup

Code Review Verification Steps

As the original developer, I have

Satisfied acceptance criteria and met development standards

Ensured code standards are met (Original Developer)

Validated user-facing changes (if applicable)

As a code reviewer, I have

Reviewed, tested, and left feedback about the changes

Validated user-facing changes as a developer

As a designer reviewer, I have

Verified that the changes match the design intention

Validated user-facing changes as a designer

References

Screenshots

github-actions bot commented Jan 17, 2025

github-actions bot commented Jan 17, 2025

zandercymatics commented Jan 17, 2025 •

edited

Loading