
Logging spam after upgrading to helm 0.29.0 and operator version 0.42.0 #892

Closed
bh-tt opened this issue Mar 5, 2024 · 3 comments
Labels
bug Something isn't working

Comments

Contributor

bh-tt commented Mar 5, 2024

Last night we upgraded to the new operator version (from 0.41.2), and today we woke up to approximately 70 million log messages from the vm operator. Something appears to be going wrong with a watch on Prometheus custom resources, as the log is filled with the following two messages:

{"level":"error","ts":"2024-03-05T08:26:50Z","msg":"k8s.io/[email protected]+incompatible/tools/cache/reflector.go:231: expected type *v1.ServiceMonitor, but watch event object had type <nil>","stacktrace":"k8s.io/klog/v2.(*loggingT).output\n\tk8s.io/klog/[email protected]/klog.go:895\nk8s.io/klog/v2.(*loggingT).printWithInfos\n\tk8s.io/klog/[email protected]/klog.go:723\nk8s.io/klog/v2.(*loggingT).printDepth\n\tk8s.io/klog/[email protected]/klog.go:705\nk8s.io/klog/v2.ErrorDepth\n\tk8s.io/klog/[email protected]/klog.go:1574\nk8s.io/apimachinery/pkg/util/runtime.logError\n\tk8s.io/[email protected]/pkg/util/runtime/runtime.go:115\nk8s.io/apimachinery/pkg/util/runtime.HandleError\n\tk8s.io/[email protected]/pkg/util/runtime/runtime.go:109\nk8s.io/client-go/tools/cache.watchHandler\n\tk8s.io/[email protected]+incompatible/tools/cache/reflector.go:726\nk8s.io/client-go/tools/cache.(*Reflector).watch\n\tk8s.io/[email protected]+incompatible/tools/cache/reflector.go:431\nk8s.io/client-go/tools/cache.(*Reflector).ListAndWatch\n\tk8s.io/[email protected]+incompatible/tools/cache/reflector.go:356\nk8s.io/client-go/tools/cache.(*Reflector).Run.func1\n\tk8s.io/[email protected]+incompatible/tools/cache/reflector.go:289\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\tk8s.io/[email protected]/pkg/util/wait/backoff.go:226\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\tk8s.io/[email protected]/pkg/util/wait/backoff.go:227\nk8s.io/client-go/tools/cache.(*Reflector).Run\n\tk8s.io/[email protected]+incompatible/tools/cache/reflector.go:288\nk8s.io/client-go/tools/cache.(*controller).Run.(*Group).StartWithChannel.func2\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:55\nk8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:72"}

and occasionally we also have:

{"level":"error","ts":"2024-03-05T08:26:50Z","msg":"k8s.io/[email protected]+incompatible/tools/cache/reflector.go:231: expected type *v1.Probe, but watch event object had type <nil>","stacktrace":"k8s.io/klog/v2.(*loggingT).output\n\tk8s.io
/klog/[email protected]/klog.go:895\nk8s.io/klog/v2.(*loggingT).printWithInfos\n\tk8s.io/klog/[email protected]/klog.go:723\nk8s.io/klog/v2.(*loggingT).printDepth\n\tk8s.io/klog/[email protected]/klog.go:705\nk8s.io/klog/v2.ErrorDepth\n\tk8s.io/klog/[email protected]
.1/klog.go:1574\nk8s.io/apimachinery/pkg/util/runtime.logError\n\tk8s.io/[email protected]/pkg/util/runtime/runtime.go:115\nk8s.io/apimachinery/pkg/util/runtime.HandleError\n\tk8s.io/[email protected]/pkg/util/runtime/runtime.go:109\nk
8s.io/client-go/tools/cache.watchHandler\n\tk8s.io/[email protected]+incompatible/tools/cache/reflector.go:726\nk8s.io/client-go/tools/cache.(*Reflector).watch\n\tk8s.io/[email protected]+incompatible/tools/cache/reflector.go:431\nk8s.io/cli
ent-go/tools/cache.(*Reflector).ListAndWatch\n\tk8s.io/[email protected]+incompatible/tools/cache/reflector.go:356\nk8s.io/client-go/tools/cache.(*Reflector).Run.func1\n\tk8s.io/[email protected]+incompatible/tools/cache/reflector.go:289\nk8
s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\tk8s.io/[email protected]/pkg/util/wait/backoff.go:226\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\tk8s.io/[email protected]/pkg/util/wait/backoff.go:227\nk8s.io/client-go/tool
s/cache.(*Reflector).Run\n\tk8s.io/[email protected]+incompatible/tools/cache/reflector.go:288\nk8s.io/client-go/tools/cache.(*controller).Run.(*Group).StartWithChannel.func2\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:55\nk8s.io/ap
imachinery/pkg/util/wait.(*Group).Start.func1\n\tk8s.io/[email protected]/pkg/util/wait/wait.go:72"}

For now we have rolled back to 0.41.2.
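
The stack traces point at client-go's reflector: watchHandler reports every watch event whose object does not match the expected type through runtime.HandleError and then simply continues, so a watch stream that keeps delivering broken events yields one error line per event. Below is a minimal, hypothetical Go sketch of that path (Pods and a fake watcher stand in for the ServiceMonitor/Probe resources in the logs; this is not the operator's code):

```go
// Hypothetical minimal sketch: a Reflector whose watch stream delivers an
// event with a nil object logs exactly
// "expected type *v1.Pod, but watch event object had type <nil>".
package main

import (
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/tools/cache"
)

func main() {
	fw := watch.NewFake()

	// ListerWatcher backed by a fake watcher; List returns an empty PodList.
	lw := &cache.ListWatch{
		ListFunc: func(options metav1.ListOptions) (runtime.Object, error) {
			return &corev1.PodList{}, nil
		},
		WatchFunc: func(options metav1.ListOptions) (watch.Interface, error) {
			return fw, nil
		},
	}

	store := cache.NewStore(cache.MetaNamespaceKeyFunc)
	r := cache.NewReflector(lw, &corev1.Pod{}, store, 0)

	stop := make(chan struct{})
	go r.Run(stop)

	// Simulate a broken watch stream: an event whose object is nil fails the
	// reflector's type check, so it reports the mismatch via
	// runtime.HandleError (klog) and moves on to the next event.
	go func() {
		time.Sleep(time.Second)
		fw.Action(watch.Added, nil)
	}()

	time.Sleep(3 * time.Second)
	close(stop)
}
```

Because such an event is only skipped, not treated as a reason to re-establish the watch, an unstable API-server connection can repeat this error at a very high rate, which matches the volume described above.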

f41gh7 added the bug (Something isn't working) label Mar 5, 2024
f41gh7 self-assigned this Mar 5, 2024
Collaborator

f41gh7 commented Mar 5, 2024

Thanks for reporting, we're going to fix it soon.

f41gh7 added a commit that referenced this issue Mar 5, 2024
previously it could misbehave and produce a lot of false positives if the connection with the kubernetes API server was unstable
now the watch is retried properly without noisy logs
#892
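
As a rough illustration of the behaviour the commit message describes, here is a hypothetical sketch of the general retry-without-log-spam pattern (not the operator's actual change): open the watch with bounded, backed-off retries and drop broken events quietly so the caller reconnects instead of logging every bad event. Services stand in for the prometheus-operator CRDs:

```go
package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/apimachinery/pkg/watch"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/util/retry"
)

// watchServices opens a watch with exponential backoff and returns on a
// broken event so the caller can re-establish the watch, instead of
// reporting every bad event through the global error handler.
func watchServices(ctx context.Context, cs kubernetes.Interface, ns string) error {
	backoff := wait.Backoff{Duration: time.Second, Factor: 2, Jitter: 0.1, Steps: 5}

	var w watch.Interface
	err := retry.OnError(backoff, func(error) bool { return true }, func() error {
		var werr error
		w, werr = cs.CoreV1().Services(ns).Watch(ctx, metav1.ListOptions{})
		return werr
	})
	if err != nil {
		return fmt.Errorf("watch could not be established after retries: %w", err)
	}
	defer w.Stop()

	for ev := range w.ResultChan() {
		if ev.Object == nil {
			// An unstable connection delivered a broken event: bail out once
			// and let the caller reconnect, rather than logging per event.
			return fmt.Errorf("watch stream broken, reconnecting")
		}
		fmt.Printf("event: %s\n", ev.Type)
	}
	return nil // server closed the watch normally
}

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(cfg)
	for {
		if err := watchServices(context.Background(), cs, "default"); err != nil {
			fmt.Println(err)
		}
		time.Sleep(time.Second)
	}
}
```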
Collaborator

f41gh7 commented Mar 6, 2024

This should be fixed in the v0.42.2 release.

f41gh7 closed this as completed Mar 6, 2024
Contributor Author

bh-tt commented Mar 6, 2024

It is, I tested it this morning.
