-
Notifications
You must be signed in to change notification settings - Fork 378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Missing IN_CREATE events #2
Comments
I've tried to replicate this with the following bash command: mkdir foo; touch foo/{bar,baz,corge} and it reliably picks up all the create events. Is this a fixed bug (there seems to be a little gymnastics in process_IN_CREATE to handle this case), or a flaw in my reproducing technique, or simply a difference in systems? |
I tried this: Stdout show: I'm using python-inotify-0.9.0-2.fc14.noarch |
Thanks guys for your updates on this issue. Indeed, I've just tried with a recent kernel (2.6.35) and given this limited example (but with 100 files) I can't reproduce this issue anymore. However I don't know if it is definitively fixed or if it is somehow mitigated. If someone knows more about it, please share! |
Hi, I'm using pyinotify (stock version on Ubuntu Maverick & Lucid servers) in a heavy parallel-changing environment (10k watches or more) where we need custom ACLs to be applyed in subdirs, based on special rules. There, monitoring a dir where a user tar -xzf anything > 100 files and subdirs (roughly) triggers the "bug": some events are missed. Untaring a pyudev tarball doesn't trigger it if the machine is not loaded, untaring nautilus or samba will trigger it in 99% of cases (my test case consists of untaring pyudev, then nautilus, then samba, then a linux kernel archive, to check ACLs are applyed on every file and every subdir created, and add them automatically to monitored files and dirs). But I believe this is not a pyinotify "real bug", because it relies on concurrent accesses in the kernel, the libc and every part of the multi-user system: if the inotify call to create a watch for a sub-dir arrives after file creation in the given subdir, then the IN_CREATE of the file is missed. I intimately feel this as a perfectly normal situation, due to the concurrent nature of the environment (and the fast nature of a untar operation on very small files, compared to multi-layer OO interpreted python code). But this more a feeling than a real proof. Still, it is a problem from the developer stand-point, in normal conditions (where we want all events, without any doubt ;-) ), and I wrote some kind of "rewalk_directory()" helper to catch missed files and dirs. My own implementation of the ThreadedNotifier and its helpers now have something that will "complement" the _eventq (directories and files are not walked twice, when possible). Perhaps the generic parts of my rewalk_() method could be integrated into the SysProcessEvent() ? I don't know. Note that my rewalk() code is higly (who said "uglily ?") dependant on other things in my app, so it's useless to use it verbatim, but I find it quite readable and hope you will get the idea, perhaps find it interesting. Project: http://dev.licorn.org/ Best regards from a fellow user, |
Thanks for your feedback!!, I'll try to find time to read your rewalk() method. Just out of curiosity when do you trigger the call to this method? |
The rewalk is trigerred on every dir only CREATE | MOVED_TO. It is internally recursive, but avoids re-triggers on already watched dirs, and already checked files (with a controlled expiration mechanism). The rewalk is synchronous, but all ACL applying are asynchronous (delegated to another thread), and pre-fill a list of expected ATTRIB events, to avoid loops. To monitor my watched dirs, I only take IN_CREATE | IN_ATTRIB | IN_MOVED_TO | IN_MOVED_FROM | IN_DELETE_SELF into account. regards, |
Oliver, I too am experiencing this exact sort of behavior. I might have to implement some sort of rewalk method too. |
I think the rewalk hack should be investigated indeed. I did not reply in depth to olivier because I must say I don't have the whole problem in mind right now :( so I didn't want to rush making comment that would be false, I need to find time to think about it. However, what olivier have implemented (at least from my understanding) seems very appealing and promising. I think rewalk() is called at a good time where the missed directories/files would likely be found and also because add_watch() is synchronous thus every iterated file/directory can be tested against the dictionary storing all watches for deciding if the file/directory was missed or not and call add_watch() if needed. |
I can confirm this behavior. I have been implementing some event listening code and have been noticing situations where copying a folder with a few subfolders is dropping events. Sometimes they hit...sometimes they don't (almost 50/50). I'm going to check into your rewalk logic and see if that solves my issue. |
Hey Olivier, I tried to look at your rewalk method, but the link is broken. Can you point me to where it is? I'd like to give working that into the pyinotify a go. Thanks, |
Karl, here's a code fragment that I'm using to process events that may be missed as a result of IN_CREATE|IN_ISDIR or IN_MOVED_TO|IN_ISDIR. It may be a little too implementation specific (self.event_create is a method I have defined somewhere else that would handle the file creation), but you should get the idea behind the logic:
Fred |
Thanks for the code example, Fred. I'll be working on a this patch this week, hopefully we can get something out of this that will satisfy all of or needs. |
Hi guys, This morning I took time to rethink this issue and re-read relevant parts of pyinotify's code. I did a small change that should (I hope) help mitigating this issue. Pyinotify now generates IN_CREATE events for all files contained in a newly created and watched directory. Please consult and test this commit. I did not apply this behavior to IN_MOVE_TO events because I think it would be wrong to generate IN_CREATE events on moved files. To test this commit I only used this tiny script, so I encourage you to test it more extensively in your projects.
|
That's a welcome responsive reply, Sebastien! I will document my tests this morning to share. Thanks! Karl |
@seb-m I definitely think this is the way to go, in terms of approach. I'll give it a go and share what I find. |
Just from the few tests I've run this morning so far this is looking great, Sebastien. I will provide my logs and test documentation after lunch for you to confirm. |
@karlgrz @wayoutmind great, if it confirms this is great news. |
Here's a link to my pre-patch log file and my post-patched log file: https://gist.github.com/904199 Here is the directory structure that I used in the test:
In my pre-patch log I am NOT logging the IN_CLOSE_WRITE event, but in the post-patch I am, just to show that the patch is working a little more clearly. Pre-patch, every single time I would START my script it would seem to catch all the events. As the script was running, though, it would eventually start to drop the /srv/content3/pll.jpg file very consistently. Post-patch, every event seems to be caught and accounted for correctly. This patch has fixed the problem I was having with dropped events. Thanks, Sebastien! |
@seb-m Your patch seems to have caught all IN_CLOSE events as a result of creating a directory. Here is my log output for a tar -xzvf on the latest Wordpress download. It caught everything, in a very deep and complicated directory hierarchy! As far as I can tell, this fixes my problem completely. Thanks again for the great work. |
@wayoutmind @karlgrz thanks for your testing, I think it has come time to release a new version of Pyinotify with these changes. I will prepare it soon. I think we should let this issue open at least for now, just to give peoples a heads-up on the current situation of this issue and for new comers informing them on the subtilities of the way how inotify works and how this issue is currently mitigated. |
@seb-m @wayoutmind no problems. Thanks for the responsiveness. This is working great for us. |
Hi, has a new version with this fix been released? Which version is it? |
Hi folks. Sorry for the delay, I was much too busy... I will test 0.9.2 as soon as possible and report about the results. @seb-m, thanks a lot for taking time to fix this issue. BTW, the links are back. Too late. Sorry again for delay. Best regards, |
Duplicated from Pyinotify Trac - Issue 8
Consider this case:
/wdir
: is a watched directory, where each new created directory is automatically watched/sdir
: is a directory with several small files (file1, file2,..., file10)Now,
/sdir
and its content are copied to/wdir
:Under these circumstances if we had started pyinotify like this:
We would have observed the following output (assuming files1-10 are copied in that order):
What happened is that there is a short laps of time between the moment
/sdir
is copied into/wdir
and the moment where pyinotify effectively starts watching it. And during that time where/wdir/sdir
is not yet watched the first files file1, file2, file3 are copied into this directory and mistakenly do not generateIN_CREATE
events as expected.(Thanks to Peter Smith for his initial report)
The text was updated successfully, but these errors were encountered: