-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: replace offline-mode filewatcher with polling #317
Conversation
This pull request has been linked to Shortcut Story #231466: Flaky filewatcher test. |
a01fb46
to
919bd49
Compare
4183a15
to
1ef2d1f
Compare
@@ -186,7 +186,7 @@ func TestFileUpdatedWithInvalidDataAndDoesNotBecomeValid(t *testing.T) { | |||
|
|||
writeMalformedArchive(p.filePath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: It's unclear to me whether these should be errors or warnings. In Eli's spike of this functionality, there is extra logic to determine if the file once existed and then was removed - which might not be considered an error.
This could be nice but it's more logic in the loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are using relay proxy is deleting a file a valid thing to do, and what is the expected outcome of it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's valid in the sense that you can do it and nothing bad will happen in that exact moment, since the environments are already loaded into memory.
But I'd say it's not an expected thing, and not recommended, since it means that if Relay crashes or restarts, it's going to fail.
am.loggers.Debug("Got retry signal") | ||
pendingRetry = false | ||
maybeReload() | ||
if fileMayHaveChanged(prevInfo, nextInfo) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, one thing to note, this fileMayhaveChanged
would be a bad match for use with a watcher. If you programmatically modify a file, then it gets several changes in a small period. This period is less than the timestamp precision, which is 1 second typically. So depending on how things go you get random states within that update cycle. Node server had this problem.
I do think there is a letter problem here. Which is 100ms is also faster, multiple updates within 1 second you could load the first change, and then subsequently thetimestamp remains the same, so you don't. But it isn't too bad because it also has to be the same size. So it would affect a change of variation from something like "a" to "b".
🤖 I have created a release *beep* *boop* --- ## [8.5.0](v8.4.2...v8.5.0) (2024-05-14) ### Features * replace offline-mode filewatcher with polling ([#317](#317)) ([7bea824](7bea824)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Requirements
I have validated my changes against all supported platform versions:
Related issues
We've encountered robustness issues with the current offline-mode file-watcher implementation in Docker images that use shared volumes. Additionally, we're seeing intermittent CI failures in the file-watcher tests.
Overall, I believe the complexity introduced by file-watcher isn't worth the payoff. It is supposed to unify all the different operating system methods of notifying that files have changed, but it falls short.
The vast majority of the latency when using offline mode would be downloading the actual archive from LaunchDarkly and the interval between those downloads - which might be minutes/hours/days.
Describe the solution you've provided
This PR replaces the file-watcher with a simple polling loop. Every interval, it calls
stat()
and determines if the offline archive needs to be reloaded.The default interval is 1 second, and the minimum configurable is 100ms. The minimum was chosen to protect the system in case of accidental configuration of an extremely short interval (like 0).
In practice, most users may raise the interval - for example, if they're fetching the archive every hour, there is no need to use a 1 second interval.
Describe alternatives you've considered
An alternative is to more rigorously test the existing file-watcher, document all of its failure modes, and overall invest more in that solution.