Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: handle UID mismatch between incoming and existing resources #242

Merged
merged 2 commits into from
Dec 12, 2024

Conversation

chetan-rns
Copy link
Collaborator

What does this PR do / why we need it:

Currently, we don't handle a case where the incoming app/app project could have a different source UID than the resource on the cluster with the same name and namespace. This indicates that the existing resource in the agent was created from a different resource in principal which no longer exists. We annotate the resources in the agent with the source UID to track their source of truth.
For more details: #225 (comment)

  1. An app is present on both the principal and the agent
  2. Principal process stops
  3. User deletes the app
  4. Principal starts again and the user creates a new app
  5. The agent couldn't differentiate between the new and old apps.

Which issue(s) this PR fixes:

Fixes #225 (comment)

How to test changes / Special notes to the reviewer:

Checklist

  • Documentation update is required by this PR (and has been updated) OR no documentation update is required.

@codecov-commenter
Copy link

codecov-commenter commented Dec 3, 2024

Codecov Report

Attention: Patch coverage is 44.21053% with 53 lines in your changes missing coverage. Please review.

Project coverage is 48.48%. Comparing base (d950aaa) to head (f42c63e).

Files with missing lines Patch % Lines
agent/inbound.go 39.02% 16 Missing and 9 partials ⚠️
internal/manager/appproject/appproject.go 44.44% 14 Missing and 1 partial ⚠️
internal/manager/application/application.go 51.85% 11 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #242      +/-   ##
==========================================
+ Coverage   47.21%   48.48%   +1.26%     
==========================================
  Files          55       55              
  Lines        4882     4977      +95     
==========================================
+ Hits         2305     2413     +108     
+ Misses       2401     2365      -36     
- Partials      176      199      +23     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@jannfis jannfis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

I have some minor comments about error handling, otherwise this looks good to go.

agent/inbound.go Outdated
if !sourceUIDMatch {
logCtx.Debug("An app already exists with a different source UID. Deleting the existing app")
if err := a.deleteApplication(incomingApp); err != nil {
return err
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should give more context to the returned error here.

The caller will expect an application to be created, and probably reflect that in any error message logged, but could potentially receive an error from the delete operation.

For example, the following case would be confusing (admittedly, the chance for this happening is very slim, but it should ):

  1. A create event is sent for an application where sourceUIDMatch is false
  2. The application is deleted on the cluster by a third party
  3. a.deleteApplication fails because the resource already is deleted. The error would be "the requested resource could not be found"
  4. That error is passed back to the caller
  5. The error log for the caller would look similar to "Could not create app foo: requested resource not found"

Or think about RBAC - agent might have RBAC to create applications, but not delete applications, but the create call would return a permission denied. Troubleshooting that would be fun :)

So it would make more sense to augment the returned error by more information, e.g. fmt.Errorf("could not delete existing resource prior to creation: %w", err) or something similar.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Added more context to errors.

agent/inbound.go Outdated
if !sourceUIDMatch {
logCtx.Debug("Source UID mismatch between the incoming app and existing app. Deleting the existing app")
if err := a.deleteApplication(incomingApp); err != nil {
return err
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same about augmenting the error as above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

agent/inbound.go Outdated
if !sourceUIDMatch {
logCtx.Debug("An appProject already exists with a different source UID. Deleting the existing appProject")
if err := a.deleteAppProject(incomingAppProject); err != nil {
return err
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same about augmenting the error as above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

agent/inbound.go Outdated
if !sourceUIDMatch {
logCtx.Debug("Source UID mismatch between the incoming and existing appProject. Deleting the existing appProject")
if err := a.deleteAppProject(incomingAppProject); err != nil {
return err
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same about augmenting the error as above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

@@ -287,7 +334,7 @@ func (a *Agent) deleteAppProject(project *v1alpha1.AppProject) error {
// means that we're out-of-sync from the control plane.
//
// TODO(jannfis): Handle this situation properly instead of throwing an error.
if !a.appManager.IsManaged(project.Name) {
if !a.projectManager.IsManaged(project.Name) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@chetan-rns chetan-rns force-pushed the source-uid-annotation branch from 61b7ef0 to f42c63e Compare December 12, 2024 13:48
@chetan-rns chetan-rns requested a review from jannfis December 12, 2024 13:50
Copy link
Collaborator

@jannfis jannfis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jannfis jannfis merged commit 93e6716 into argoproj-labs:main Dec 12, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants