Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: update the existing selectResources related func to support new API #390

Merged
merged 1 commit into from
Jun 27, 2023

Conversation

zhiying-lin
Copy link
Contributor

@zhiying-lin zhiying-lin commented Jun 19, 2023

Description of your changes

Differentiate the error types

  • user error
  • api server error (from cache or api server)
  • unexpected behavior error
  • expected behavior error

I have:

  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

Need to add more UT in the future

Special notes for your reviewer

@zhiying-lin zhiying-lin force-pushed the crp-res branch 4 times, most recently from 8f61e55 to 61381b8 Compare June 26, 2023 03:14
@@ -33,25 +34,56 @@ var (
// There should be something wrong with the system and cannot be recovered by itself.
ErrUnexpectedBehavior = errors.New("unexpected behavior which cannot be handled by the controller")

// ErrExpectedBehavior indicates the current situation is expected, which can be recovered by itself after retries.
ErrExpectedBehavior = errors.New("expected behavior which can be recovered by itself")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% sure if I understand this part, and a little bit clarification would be greatly appreciated.

What would be if there's a resource version conflict (API error)? Is this an API error or an expected error? Similarly, what would be if an error is expected with sporadic occurrences and unexpected if repeated consistently?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one way to interpret this is we can alert on too many ExpectedBehaviorError which will be treated the same as "UnExpectedError" while APIServerError will clearly say that there is something wrong with the api-server. We will look more into cluster status.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now there are api server errors, user errors and the rest are expected/unexpected logic errors. They should have no overlap.

agree with Ryan, the alert could be set based on the occurrences or status code returned by the API server.

}
if !r.InformerManager.IsInformerSynced(gvr) {
return nil, fmt.Errorf("informer cache for %+v is not synced yet", restMapping.Resource)
return nil, controller.NewExpectedBehaviorError(fmt.Errorf("informer cache for %+v is not synced yet", restMapping.Resource))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an example of my confusion specified in an earlier comment. It could be, at this point, that the cache is simply not working fast enough and will get sync'd soon; or it could also be that there is a connection/RBAC issue that will never recover without intervention and needs attention.

@zhiying-lin zhiying-lin merged commit 6954b1d into Azure:main Jun 27, 2023
@zhiying-lin zhiying-lin deleted the crp-res branch June 27, 2023 05:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants