Grafana Agent Operator - Logs: Too many open files #1844

Closed
jallaix opened this issue Jul 1, 2022 · 8 comments
Labels
bug Something isn't working frozen-due-to-age Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed.

Comments

@jallaix

jallaix commented Jul 1, 2022

Not sure how to reproduce the bug.
It happened the first time I tried Grafana Agent Operator, with a PodLogs resource configured to collect logs from all pods in all namespaces of the K8s cluster.

With PodLogs resources of narrower scope, the bug happens on 1 of my 6 clusters.

Below is the log of the config-reloader container of the grafana-agent-logs pod:

add config file /var/lib/grafana-agent/config-in/agent.yml to watcher: create watcher: too many open files

As a result the grafana-agent container of the grafana-agent-logs pod keeps crashing (CrashLoopBackOff) because of:

error loading config file /var/lib/grafana-agent/config/agent.yml: error reading config file open /var/lib/grafana-agent/config/agent.yml: no such file or directory

Below are my PodLogs:

apiVersion: monitoring.grafana.com/v1alpha1
kind: PodLogs
metadata:
  labels:
    instance: primary
  name: system
  namespace: monitoring
spec:
  namespaceSelector:
    matchNames:
    - kube-system
    - external-secrets
  selector:
    matchLabels: {}
---
apiVersion: monitoring.grafana.com/v1alpha1
kind: PodLogs
metadata:
  labels:
    instance: primary
  name: split
  namespace: monitoring
spec:
  namespaceSelector:
    matchNames:
    - split
    - ingress-nginx
  selector:
    matchLabels: {}
@jallaix
Author

jallaix commented Jul 1, 2022

After 11 hours of CrashLoopBackOff, the pod is now running...

Log of the config-reloader container of the grafana-agent-logs pod:

started watching config file and directories for changes" cfg=/var/lib/grafana-agent/config-in/agent.yml out=/var/lib/grafana-agent/config/agent.yml dirs=

@tpaschalis
Member

tpaschalis commented Jul 4, 2022

Hi Julien! 👋 This looks like an error coming from fsnotify.

Where are you running your cluster (a managed service from a cloud provider, an on-prem installation, or a local cluster on your laptop)? Could you check the relevant system-imposed limits on files and see whether they could be the cause?
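
One way to check those limits on a Linux node is to read the inotify sysctls under /proc. This is a minimal sketch, assuming shell access to the node; the function name `show_inotify_limits` and its optional directory argument are illustrative and not part of any tool mentioned in this thread:

```shell
#!/bin/sh
# Print the current inotify limits, one "name = value" line per setting.
# The optional argument overrides the directory to read from (default:
# /proc/sys/fs/inotify); that parameter is an assumption added so the
# function can be tried against a scratch directory.
show_inotify_limits() {
    dir="${1:-/proc/sys/fs/inotify}"
    for name in max_user_instances max_user_watches max_queued_events; do
        if [ -r "$dir/$name" ]; then
            printf 'fs.inotify.%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf 'fs.inotify.%s = (unavailable)\n' "$name"
        fi
    done
}

show_inotify_limits
```

The inotify_init(2) man page lists EMFILE ("too many open files") for the case where the per-user limit on inotify instances is reached, so `max_user_instances` is probably the first value worth comparing between a working and a failing node.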

@jallaix
Author

jallaix commented Jul 4, 2022

Hello! All my clusters are single-node and K3s-based (v1.23.5), running on cloud VMs (AWS, GCP, Scaleway).
My problem occurred on a GCP e2-medium instance.

@tpaschalis
Member

Hey, apologies for taking so long to get back to you; the notification got lost in all the noise.

I'm not sure what's going on here, but I'd look into a possible K3s issue and into how the default Linux parameters differ between cloud providers and distros. For example, could it be similar to the issue reported here?

@rfratto rfratto added bug Something isn't working operator Grafana Agent Operator related labels Jul 20, 2022
@github-actions
Contributor

This issue has been automatically marked as stale because it has not had any activity in the past 30 days.
The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed in 7 days if there is no new activity.
Thank you for your contributions!

@github-actions github-actions bot added the stale Issue/PR mark as stale due lack of activity label Aug 20, 2022
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 27, 2022
@marctc marctc reopened this Sep 2, 2022
@marctc marctc added keepalive Never close from staleness and removed stale Issue/PR mark as stale due lack of activity labels Sep 2, 2022
@marctc
Contributor

marctc commented Sep 6, 2022

A possible workaround to fix this problem would be this: https://kind.sigs.k8s.io/docs/user/known-issues/#pod-errors-due-to-too-many-open-files

I'll leave the issue open for now as it might be a bug in the operator and needs more investigation.

@marctc marctc added area/signals and removed keepalive Never close from staleness operator Grafana Agent Operator related area/operator labels Oct 31, 2022
@rfratto
Member

rfratto commented Nov 3, 2022

I'm going to close this as won't fix since this doesn't appear to be a code problem, but rather an environment problem (e.g., increase fs.inotify limits).

If you go through that workaround and you're still running into issues, please open a new issue so we can track it; updates in closed issues may get missed.

@rfratto rfratto closed this as not planned Won't fix, can't repro, duplicate, stale Nov 3, 2022
@omidraha

I have the same issue.

kubectl logs -n fluent-bit loki-a2444106-logs-2rbmq

2024/01/18 23:20:21 error loading config file /var/lib/grafana-agent/config/agent.yml: error reading config file open /var/lib/grafana-agent/config/agent.yml: no such file or directory

Fixed with:

sudo sysctl fs.inotify.max_user_watches=524288
sudo sysctl fs.inotify.max_user_instances=512
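
Values set with `sysctl` on the command line last only until the next reboot. Below is a minimal sketch of persisting them, assuming a distribution that reads drop-in files from /etc/sysctl.d; the file name `99-inotify.conf` and the helper `write_inotify_conf` are illustrative choices, not something from this thread:

```shell
#!/bin/sh
# Write the two inotify settings from the commands above into a sysctl drop-in.
# $1 is the target directory (default: /etc/sysctl.d); the file name
# 99-inotify.conf is hypothetical and can be anything ending in .conf.
write_inotify_conf() {
    dir="${1:-/etc/sysctl.d}"
    cat > "$dir/99-inotify.conf" <<'EOF'
fs.inotify.max_user_watches = 524288
fs.inotify.max_user_instances = 512
EOF
}

# Typical use on a node (needs root), then reload all sysctl configuration:
#   write_inotify_conf && sysctl --system
```

The values simply mirror the ones reported as working above; appropriate numbers for a given node may differ.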

@github-actions github-actions bot added the frozen-due-to-age Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed. label Feb 21, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 21, 2024
5 participants