CORGI-552: Fix duplicate purl error due to small Syft / Brew NEVRA differences #560

Merged
juspence merged 5 commits into main from fix-maven-namespace
Aug 14, 2023

Conversation

juspence
Contributor

@juspence juspence commented Aug 4, 2023

@RedHatProductSecurity/corgi-devs This is honestly pretty terrible code, but there's no point in me staring at it any longer. I hope this will fix the errors we're seeing in our monitoring email, but I'm not confident the fix is right.

Maybe we should just deploy this and see if it helps, then clean it up if the errors go away. Or if there are still other edge cases, I can handle those in a follow-up PR.

@juspence juspence self-assigned this Aug 4, 2023
@juspence juspence force-pushed the fix-maven-namespace branch 6 times, most recently from 976fa9d to e2b888f Compare August 8, 2023 19:38
@juspence juspence requested a review from a team August 10, 2023 15:50
Contributor

@JimFuller-RedHat JimFuller-RedHat left a comment

Ya, it's hard to know how to handle all this.

FWIW

package-url/purl-spec#136

shows that case sensitivity heuristics will probably have to be derived from each package ecosystem ... I agree we should try it out in stage (and cross our fingers).

@juspence juspence force-pushed the fix-maven-namespace branch 5 times, most recently from 11cb89b to 007c82e Compare August 11, 2023 15:04
@juspence juspence force-pushed the fix-maven-namespace branch from 007c82e to 3842e04 Compare August 11, 2023 17:29
@juspence
Contributor Author

Re: above, case sensitivity isn't one of the issues I'm worried about. That's a fairly simple fix, and only seems to affect GitHub and PyPI components anyway (based on testing I did when we first saw this issue).

The bigger problem is purl creation in general. I know that Github and PyPI purls are always lowercased, and I know that PyPI purls have _ underscores converted to - dashes, but I don't know what I don't know. There might be other special rules that cause two different components to end up with the same purl.
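To make the collision concrete, here is a minimal sketch of the two rules mentioned above (lowercasing for GitHub and PyPI, underscores to dashes for PyPI). The helper names `normalize_name`, `build_purl`, and `find_collisions` are hypothetical and are not Corgi's actual code, the package names are illustrative only, and this is nowhere near the full purl spec:

```python
from typing import Dict, Iterable, List, Tuple


def normalize_name(purl_type: str, name: str) -> str:
    """Apply only the two rules described above (assumption: not the full purl spec)."""
    purl_type = purl_type.lower()
    if purl_type in ("github", "pypi"):
        # GitHub and PyPI purls are always lowercased.
        name = name.lower()
    if purl_type == "pypi":
        # PyPI purls have "_" underscores converted to "-" dashes.
        name = name.replace("_", "-")
    return name


def build_purl(purl_type: str, name: str, version: str) -> str:
    """Build a simplified purl with no namespace, qualifiers, or subpath."""
    return f"pkg:{purl_type.lower()}/{normalize_name(purl_type, name)}@{version}"


def find_collisions(
    components: Iterable[Tuple[str, str, str]]
) -> Dict[str, List[str]]:
    """Return purls that more than one distinct raw component name maps to."""
    seen: Dict[str, List[str]] = {}
    for purl_type, name, version in components:
        seen.setdefault(build_purl(purl_type, name, version), []).append(name)
    return {purl: names for purl, names in seen.items() if len(set(names)) > 1}


# Illustrative names only: two differently spelled components share one purl.
components = [
    ("pypi", "Flask_Cors", "3.0.10"),
    ("pypi", "flask-cors", "3.0.10"),
    ("pypi", "requests", "2.31.0"),
]
collisions = find_collisions(components)
```

If the purl column has a unique constraint, saving both spellings as separate components would be one way to end up with the IntegrityErrors in the monitoring email. Any rule I don't know about would need another branch in `normalize_name`.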

Most of the errors in our daily monitoring email are for PyPI components, so I deployed this to stage for testing to see if it would prevent errors for at least some of those. But it actually caused a deadlock in the DB, so I'll need to look more at this next week after I finish other tickets. Putting back into draft for now.

@juspence juspence marked this pull request as draft August 11, 2023 17:47
@juspence juspence force-pushed the fix-maven-namespace branch from 3e46abf to aa04e7f Compare August 14, 2023 14:09
@juspence
Contributor Author

juspence commented Aug 14, 2023

I tweaked this slightly and deployed again, then didn't see any more deadlock issues. Those might be a rare problem, or a one-off issue when deploying this change while other tasks were running, or a real bug in the code that's now fixed.

I'm going to merge this as-is since it does seem to help reduce the IntegrityErrors we're seeing. Will open follow-up PRs or back this out as needed.

@juspence juspence marked this pull request as ready for review August 14, 2023 17:30
@juspence juspence force-pushed the fix-maven-namespace branch from aa04e7f to d7891a9 Compare August 14, 2023 17:32
@juspence juspence merged commit f87e9f8 into main Aug 14, 2023
@juspence juspence deleted the fix-maven-namespace branch August 14, 2023 17:48
2 participants