[DRAFT] #3307: duplicate key bug for public contact #3362
+93
−8
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Ticket
Resolves #3307
Changes
Context for reviewers
As far as I can tell, this appears to be a race condition (notes). In both instances of this bug occurring,
get_contact_in_keys
was called twice with a very small time interval between. Right after this occurs - we get this fail condition. In addition, it seems like our app was undergoing heavy load at this same exact time (use the view surrounding documents button).The check in
_get_or_create_public_contact
is very concrete in that we just return the record if one exists but does not modify it. This is fine sequentially, but it's vulnerable to a common "check-then-act" type of race condition. Consider this sequence:In other words, it appears as though the only code path that can cause an integrity error is if the filter return an empty queryset at just the right time. I've included a unit test for this, and it reliably reproduces the bug without my fix and the test passes with it.
I've directly tested (modified code for) two alternative ideas:
get_id
on public_contact.py function returns the same id.a. Sequentially, the probability of collision is really low here -- roughly 10^25. random.choices() isn't uniformly distributed though like uuid, but the probability space is so high that I find this option less convincing.
b. In parallel (threads),
random
is thread-safe. Pythons docs note concurrency issues when using a free-threaded build (so without the GIL), but we do not use that build._get_or_create_public_contact
and_add_missing_contacts_if_unknown
creates duplicates because .save() is called twice on the same record.a. It is worth noting that
_add_missing_contacts_if_unknown
calls save, but doesn't check for duplicates like get_or_create does.b. This doesn't seem consistent with our log stack and is in line with what addAllDefaults does.
c. Sequentially, I couldn't find a code path for an id conflict to occur because this error happens within _get_or_create_public_contact (which is called before add_missing_contacts) AND we have a check for its existence before attempting a save.
Ultimately though, we are missing just enough logging to prevent a more exhaustive search here. I've tried to add it to places that I think are important.
Setup
This is fairly tricky to test as this bug is somewhat rare in the first place. I've included a test case that is able to reliably reproduce the issue though - and that might be a good place to test this.
To cause this error, do the following in
_get_or_create_public_contact
:Comment out this code:
And replace it with this (or the old code):
Then run the included unit test. You should reliably see an integrity error as per the original issue description.
Code Review Verification Steps
As the original developer, I have
Satisfied acceptance criteria and met development standards
Ensured code standards are met (Original Developer)
Validated user-facing changes (if applicable)
As a code reviewer, I have
Reviewed, tested, and left feedback about the changes
Validated user-facing changes as a developer
Note: Multiple code reviewers can share the checklists above, a second reviewer should not make a duplicate checklist. All checks should be checked before approving, even those labeled N/A.
As a designer reviewer, I have
Verified that the changes match the design intention
Validated user-facing changes as a designer
References
Screenshots