Also consider creation time for purge_files_after #119
Comments
Files initially created on some filesystems (ZFS) may have an access time of the Unix epoch. Addresses anatol#119
This is a stab at incompletely addressing this, by ignoring files with an access time set to the Unix epoch:

I also had a go at using the birth time, but it doesn't look like we are able to fake the birth time, so I'm not sure what to do about the tests without changing the signature of

This should work, but causes a test to fail as the fileToRemove is not removed due to being created too recently.

atime := t.AccessTime()
// btime stays at the zero time when the filesystem reports no birth time,
// so the check below then depends on atime alone.
var btime time.Time
if t.HasBirthTime() {
	btime = t.BirthTime()
}
// Remove only when both the access time and the birth time are too old.
if atime.Before(removeIfOlder) && btime.Before(removeIfOlder) {
	log.Printf("Removing stale file %v as its access time (%v) and birth time (if available: %v) are too old", path, atime, btime)
	if err := os.Remove(path); err != nil {
		log.Print(err)
	}
}
So What happens if you access the ZFS file? Do you see that the file access time actually changed?
I do not know why, but here are the repro steps. Yes, when I access it, it changes. When using pacoloco, a second client requesting the package will update the time. As will
In this case the problem seems to be in the way the file is created. Currently we set the file times only if the upstream repository has reported it with We should set birth and access times even if the HTTP header is absent. To test this idea @chennin could you please try this change? https://github.com/anatol/pacoloco/tree/issue-119
Hi, it does not, and when I tried commenting out the
So this looks a lot like a ZFS bug/PR openzfs/zfs#15773 that would need to make it into the Ubuntu kernel to fix it for me. I made a playground if anyone still wants to test (disclaimer: I don't know Go). Thanks for your time. For now I have a cron job that runs
And I confirmed this does NOT reproduce on Ubuntu 24.10 with zfs-2.2.6-1ubuntu1 and 6.11.0-9-generic.
OS: Ubuntu 24.04
Filesystem: ZFS. relatime=on, atime=on.
When pacoloco downloads a file, creation time is set, but access time is not set, whether it was downloaded at client request or as part of prefetch. Access time is defaulting to time 0. Then whenever purge_files_after runs, it deletes anything that hasn't been accessed in the meantime by a second client, wasting bandwidth.

The access time is the same whether I have relatime on or off, tested with new file creation. ext4 does set the access time upon file creation.
Please have purge_files_after consider creation time as well as access time, and not delete files created within the purge_files_after time.

As for workarounds: if I manually delete files with find, will pacoloco behave, or is it keeping its own database that that would mess up?

Unix epoch 0, 1969 in my time zone:
Example logs of a totally wasted download. It was downloaded and deleted while I was asleep, and the computer that would download this package will be turned on and updated tomorrow: