Logging may fill up /tmp mount until all zellij instances are closed #1410

Closed
raphCode opened this issue May 10, 2022 · 6 comments · Fixed by #1454 or #1548
Labels
config (Improvements to the configuration system), enhancement (New feature or request)

Comments

@raphCode
Contributor

raphCode commented May 10, 2022

I did something with zellij and it started spamming the logfile with empty ipc messages.
This filled the /tmp mount to capacity. Because /tmp is backed by RAM or swap, the log consumed those resources.

image

$ df -h /tmp
Filesystem      Size  Used Avail Use% Mounted on
tmpfs            12G   12G     0 100% /tmp
$ free -h
               total        used        free      shared  buff/cache   available
Mem:            23Gi       9,5Gi       498Mi        11Gi        13Gi       1,6Gi
Swap:          4,0Gi       2,4Gi       1,6Gi

I noticed because the build process fails when it can't write to /tmp.

The actual problem is that deleting the offending log file does not remedy the situation as long as any zellij instance is running: All zellij processes have an open file descriptor to that file and the kernel holds on to the file content as long as there are any references to it.

$ rm /tmp/zellij-1000/zellij-log/zellij.log
$ du -hs /tmp
6,7M	/tmp
$ echo "aaa" > /tmp/test
bash: echo: write error: No space left on device
$ lsof | grep zellij.log
zellij     7363                 raph    4w      REG               0,36 12567707648         60 /tmp/zellij-1000/zellij-log/zellij.log (deleted)
zellij     7363  7368 stdin_han raph    3w      REG               0,36 12567707648         60 /tmp/zellij-1000/zellij-log/zellij.log (deleted)
...

This effectively forces me to close my working session which is bad.
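
The behaviour described above is ordinary POSIX unlink semantics and can be reproduced without zellij. The following is a minimal sketch in Python (zellij itself is written in Rust); the file name `demo.log` and the payload size are made up for illustration, and the sketch assumes a POSIX system where an open file can be unlinked.

```python
import os
import shutil
import tempfile


def unlinked_but_open_demo(payload_size: int = 64 * 1024) -> dict:
    """Write a file, remove its name while a descriptor is still open,
    and report what the kernel still tracks for that descriptor."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "demo.log")  # hypothetical stand-in for zellij.log

    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        os.write(fd, b"x" * payload_size)
        os.unlink(path)  # same effect as `rm` on the log file

        # Query through the descriptor, since the name no longer exists.
        info = os.fstat(fd)

        # Appends through the old descriptor keep succeeding and keep growing the file.
        os.write(fd, b"still writable\n")
        size_after_write = os.fstat(fd).st_size

        return {
            "name_exists": os.path.exists(path),
            "link_count": info.st_nlink,  # typically 0 on Linux after unlink
            "size_after_unlink": info.st_size,
            "size_after_write": size_after_write,
        }
    finally:
        os.close(fd)  # only after the last close can the space be reclaimed
        shutil.rmtree(workdir, ignore_errors=True)


if __name__ == "__main__":
    print(unlinked_but_open_demo())
```

The name is gone, yet the size reported through the descriptor is unchanged and later writes still extend the file, which matches the `(deleted)` entries in the `lsof` output.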

Possible solutions:

  • every zellij instance / version / session uses a new logging file, which makes it possible to delete an offending file and close only that / those zellij process(es) to free up the space
  • special handling of repeated logging messages ("last message repeated x times") so they don't fill up so fast (possibly combine with previous point)
  • limit log filesize (deleting old entries when size grows)
  • logging somewhere else, like /var/log (potentially filling up disk space, ouch my SSD lifetime!)
  • log to systemd or equivalent?
  • since data is only appended to the log (assuming open(2) with O_APPEND), in theory it should be possible to free up space once the filename is deleted since no fd refers to the "early parts" of the file. But I don't think the kernel supports something like this.
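
As an illustration of the "last message repeated x times" idea from the list above, here is a minimal sketch in Python. The class name `RepeatCollapser` and its interface are hypothetical and are not part of zellij or of any logging crate. It only collapses consecutive identical lines.

```python
from typing import Callable, Optional


class RepeatCollapser:
    """Forward log lines to `sink`, collapsing runs of identical lines."""

    def __init__(self, sink: Callable[[str], None]) -> None:
        self._sink = sink
        self._last: Optional[str] = None
        self._repeats = 0

    def log(self, line: str) -> None:
        if line == self._last:
            # Same as the previous line: count it instead of writing it.
            self._repeats += 1
            return
        self.flush()
        self._sink(line)
        self._last = line

    def flush(self) -> None:
        """Emit the pending repeat summary, if any."""
        if self._repeats:
            self._sink(f"last message repeated {self._repeats} times")
            self._repeats = 0


if __name__ == "__main__":
    written = []
    collapser = RepeatCollapser(written.append)
    for line in ["start", "empty ipc message", "empty ipc message",
                 "empty ipc message", "stop"]:
        collapser.log(line)
    collapser.flush()
    print(written)
```

A run of any length costs two lines in the log (the message and one summary). The log can still grow without bound if two different messages alternate, so this would probably need to be combined with one of the other options.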
@a-kenji
Contributor

a-kenji commented May 11, 2022

My preferences here would be the following:

every zellij instance / version / session uses a new logging file, which makes it possible to delete an offending file and close only that / those zellij process(es) to free up the space

I think this is a very good idea.

limit log filesize (deleting old entries when size grows)

I think this can be configurable, along with an option to turn the log off entirely.
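
To make the two preferences concrete, here is a rough sketch in Python of what a per-session log file with a configurable size limit and an off switch could look like. Everything in it is an assumption for illustration: `LogSettings`, `session_log_path`, `open_session_log`, `is_over_limit` and the file naming scheme are hypothetical and do not reflect zellij's actual configuration or code.

```python
import os
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional, TextIO


@dataclass(frozen=True)
class LogSettings:
    """Hypothetical user-facing options, not zellij's config schema."""
    enabled: bool = True
    max_bytes: int = 1024 * 1024  # illustrative default only


def session_log_path(base_dir: Path, session_name: str, pid: int) -> Path:
    """One file per session and process, so a single file can be removed
    by closing only the process that holds it open."""
    safe = "".join(ch if ch.isalnum() or ch in "-_" else "_" for ch in session_name)
    return base_dir / f"zellij-{safe}-{pid}.log"


def open_session_log(settings: LogSettings, base_dir: Path, session_name: str,
                     pid: Optional[int] = None) -> Optional[TextIO]:
    """Return an append-mode file handle, or None when logging is turned off."""
    if not settings.enabled:
        return None
    pid = os.getpid() if pid is None else pid
    path = session_log_path(base_dir, session_name, pid)
    path.parent.mkdir(parents=True, exist_ok=True)
    return open(path, "a", encoding="utf-8")


def is_over_limit(path: Path, settings: LogSettings) -> bool:
    """True when the file has grown past the configured limit."""
    return path.exists() and path.stat().st_size > settings.max_bytes


if __name__ == "__main__":
    base = Path(tempfile.mkdtemp())
    handle = open_session_log(LogSettings(), base, "my session")
    if handle is not None:
        handle.write("hello\n")
        handle.close()
    print(sorted(p.name for p in base.iterdir()))
```

What to do once `is_over_limit` returns true (truncate, rotate, or drop old entries) is left open here, as it is in the discussion.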

a-kenji added the enhancement and config labels on May 11, 2022
@SpyrosRoum
Contributor

Hello, I was recently affected by this and it's certainly not fun, so I'd like to help get it moving.

@imsnif Would you like to help come up with/settle on an acceptable solution?

@imsnif
Member

imsnif commented Jun 1, 2022

I don't have a lot of context on this - maybe @raphCode can point you in the right direction?

@raphCode
Contributor Author

raphCode commented Jun 1, 2022

We can pick one of the options proposed in the first post; probably the best would be to limit the log size or to create a per-session / per-process logfile. I think we are using some logging crate, accessed by the log! and error! macros. Maybe you can find something useful in the crate's docs to mitigate the problem.

While this fixes the problem of /tmp filling up, the underlying issue seems to be that the zellij server process does not react well to a disappearing client process. I already described that problem in more detail here: #1419 (comment)

@raphCode
Contributor Author

raphCode commented Jun 6, 2022

/tmp may still fill up whenever log messages are repeated; #1454 fixes only one specific repeating message that we have observed a few times so far.

So in general, I think the issue is still valid and logging could be improved.

@raphCode
Contributor Author

I created a PR for rotating the log once it reaches 20 MB: #1548
Is everyone okay with that solution?

This should fix the unbounded log growth.
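
For readers unfamiliar with size-based rotation, the general technique can be sketched with Python's standard `logging.handlers.RotatingFileHandler`. This is not the code from #1548, which lives in zellij's Rust code base; the helper names `make_rotating_logger` and `close_logger`, the logger name, and the small limit used in the demo are assumptions chosen for illustration.

```python
import logging
import logging.handlers
import os
import tempfile


def make_rotating_logger(name: str, path: str, max_bytes: int,
                         backups: int = 1) -> logging.Logger:
    """Logger whose file is rolled over before it would exceed `max_bytes`.
    At most `backups` old files (path.1, path.2, ...) are kept."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep demo output out of the root logger
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def close_logger(logger: logging.Logger) -> None:
    """Close and detach all handlers so the files are released."""
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)


if __name__ == "__main__":
    log_dir = tempfile.mkdtemp()
    log_path = os.path.join(log_dir, "demo.log")
    # A real limit would be e.g. 20 * 1024 * 1024; a tiny one keeps the demo fast.
    demo = make_rotating_logger("rotation-demo", log_path, max_bytes=2048)
    for i in range(1000):
        demo.info("empty ipc message %d", i)
    close_logger(demo)
    for entry in sorted(os.listdir(log_dir)):
        print(entry, os.path.getsize(os.path.join(log_dir, entry)))
```

With one backup file, total usage stays at roughly twice the limit no matter how many messages are written, because the older backup is discarded on each rollover.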
