Grafana Agent Operator - Logs: Too many open files #1844
Comments
After 11 hours of CrashLoopBackOff, the pod is now running... Log of the config-reloader container of the grafana-agent-logs pod:
Hi Julien! 👋 This looks like an error coming from the config-reloader. Where are you running your cluster (is it a managed service from a Cloud provider, an on-prem installation, or just a local cluster on your laptop)? Could you check the relevant system-imposed limits on open files and see if they could be the cause?
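For anyone following along, one way to inspect those limits on the node is sketched below; these are generic Linux commands, not something prescribed in this thread:

```sh
# Per-process open-file limit for the current shell
ulimit -n

# System-wide open-file ceiling
cat /proc/sys/fs/file-max

# inotify limits, which "too many open files" from a file watcher often hits
sysctl fs.inotify.max_user_instances fs.inotify.max_user_watches
```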
Hello! All my clusters are single-node K3s-based (v1.23.5), running on cloud VMs (AWS, GCP, Scaleway).
Hey, apologies for taking so long to get back to you, the notification got lost in all the noise. I'm not sure what the cause is here, but I'd look into a possible K3s issue and the differences in default Linux parameters between the different cloud providers and distros. For example, could it be similar to the issue reported here?
This issue has been automatically marked as stale because it has not had any activity in the past 30 days.
A possible workaround for this problem would be this: https://kind.sigs.k8s.io/docs/user/known-issues/#pod-errors-due-to-too-many-open-files

I'll leave the issue open for now, as it might be a bug in the operator and needs more investigation.
I'm going to close this as won't fix, since this doesn't appear to be a code problem but rather an environment problem (e.g., increase the fs.inotify limits). If you go through that workaround and you're still running into issues, please open a new issue so we can track it; updates in closed issues may get missed.
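For reference, a minimal sketch of that workaround, assuming the limits being hit are the inotify ones described on the kind page linked above; the values are the ones suggested there, not something confirmed in this thread. Run on the K3s node itself:

```sh
# Raise the inotify limits for the current boot (values from the kind known-issues page)
sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512

# Persist the change across reboots
echo "fs.inotify.max_user_watches=524288" | sudo tee -a /etc/sysctl.d/99-inotify.conf
echo "fs.inotify.max_user_instances=512" | sudo tee -a /etc/sysctl.d/99-inotify.conf
sudo sysctl --system
```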
I have the same issue.
Fixed with:
Not sure how to reproduce the bug.
It happened the first time I tried the Grafana Agent Operator, setting up a PodLogs resource configured to collect logs from all pods in all namespaces of the K8s cluster.
With PodLogs scoped more narrowly, the bug still happens on 1 of my 6 clusters.
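For context, a minimal sketch of what such an all-namespaces PodLogs can look like, assuming the operator's monitoring.grafana.com/v1alpha1 API; the names and labels are illustrative placeholders, and this is not the exact manifest used in this report:

```yaml
apiVersion: monitoring.grafana.com/v1alpha1
kind: PodLogs
metadata:
  name: all-pods          # placeholder name
  namespace: default
  labels:
    instance: primary     # placeholder; must match the LogsInstance's podLogsSelector
spec:
  namespaceSelector:
    any: true              # collect from every namespace
  selector: {}             # empty label selector matches all pods
  pipelineStages:
    - cri: {}              # parse CRI-formatted container logs
```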
Below is the log of the config-reloader container of the grafana-agent-logs pod:
As a result, the grafana-agent container of the grafana-agent-logs pod keeps crashing (CrashLoopBackOff) because of:
Below are my PodLogs: