
Bookmark file is being used by another process #262

Closed
murlidakhare opened this issue Jun 23, 2022 · 30 comments · Fixed by #366
Labels
bug Something isn't working

Comments

@murlidakhare

We are frequently getting the error below:
Exception while emitting periodic batch from Serilog.Sinks.Http.Private.Network.HttpLogShipper: System.IO.IOException: The process cannot access the file 'F:\AuditLogs\LogBuffer.bookmark' because it is being used by another process.
at System.IO.FileStream.ValidateFileHandle(SafeFileHandle fileHandle)
at System.IO.FileStream.CreateFileOpenHandle(FileMode mode, FileShare share, FileOptions options)
at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
at Serilog.Sinks.Http.Private.Network.BookmarkFile..ctor(String bookmarkFileName)
at Serilog.Sinks.Http.Private.Network.HttpLogShipper.OnTick()

During the investigation, we found a post about setting bufferFileShared to true. However, this doesn't resolve the problem.
Could you please suggest a fix for this issue?
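For context, the sink is configured roughly like the sketch below (the endpoint URL is a placeholder; parameter names are taken from the serilog-sinks-http extension methods, so adjust to your actual setup):

```csharp
using Serilog;

// Minimal sketch of a durable HTTP sink configuration; the endpoint URL
// is hypothetical. bufferFileShared: true is the setting suggested in
// the post we found, and it did not resolve the bookmark file error.
var log = new LoggerConfiguration()
    .WriteTo.DurableHttpUsingFileSizeRolledBuffers(
        requestUri: "https://logs.example.com/ingest",
        bufferBaseFileName: @"F:\AuditLogs\LogBuffer",
        bufferFileShared: true)
    .CreateLogger();
```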

@murlidakhare murlidakhare added the bug Something isn't working label Jun 23, 2022
@github-actions
Contributor

Hi there and welcome to this repository!

A maintainer will be with you shortly, but first and foremost I would like to thank you for taking the time to report this issue. Quality is of the highest priority for us, and we would never release anything with known defects. We aim to do our best, but unfortunately you are here because you encountered something we didn't expect. Let's see if we can figure out what went wrong and provide a remedy for it.

@FantasticFiasco
Owner

I think we need to make your issue reproducible. Could you please create a GitHub repo with a small sample showcasing the issue? Please also provide the steps needed to get the exception.

@julichan

julichan commented Jul 8, 2022

Hello, this issue is rather problematic. I hoped to do the same and ended up with the same error. I noticed it even prevents storing any log for the application suffering from the problem.
While I don't have a shareable example, I reproduced it and can tell you how to reproduce it.
I think this will happen in any type of project, but I reproduced it using two .NET 4.8 applications hosted by IIS. So yes, two different applications are required. You may not reproduce it with two different threads of the same app, since the HTTP sinks would belong to the same process.
You also need to call SelfLog.Enable(); otherwise you won't be able to see the error, since it is the only way to obtain Serilog's own logs, but I guess you already know that.
Have both applications use the same buffer directory; otherwise there is no point in a shared buffer. Of course, that means both apps have the same target URL.
The first app will lock the bookmark file and the other application will be stuck with the error. The second app won't even be able to fill its buffers and will lose all data.
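For anyone else chasing this: the self-diagnostics mentioned above are turned on with a one-liner (a minimal sketch; routing to stderr is just one option):

```csharp
using System;
using Serilog.Debugging;

// Route Serilog's own diagnostics (including the bookmark IOException)
// somewhere visible; without this the failure is completely silent.
SelfLog.Enable(message => Console.Error.WriteLine(message));
```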

@FantasticFiasco
Owner

Hi @julichan! I'll see if I can reproduce the issue after the summer vacations. I hope you can work around the issue until then.

@julichan

julichan commented Jul 15, 2022

Thanks @FantasticFiasco. To help with the fix, I'd like to provide further information. I believed the RollingFileSink had the same problem as the HTTP sink, but here's what I found during the investigation:

  • The HTTP sink is using a dead version of the RollingFileSink. Please abandon the package Serilog.Sinks.RollingFile and use Serilog.Sinks.File, which has the most up-to-date rolling file sink.
  • The package Serilog.Sinks.File exists in two flavors: .NET Framework 4.5 and .NET Standard 1.3. For some reason the .NET Framework 4.5 flavor uses an atomic shared-file object, which works only across threads, while the .NET Standard flavor uses a mutex, which works across apps. You'll want to force the usage of the .NET Standard version. I think the net45 flavor has no reason to exist, or the reason escapes me.

On our side, we found a temporary alternative to the HTTP sink: the .NET Standard rolling file sink on net48 plus Filebeat, but this involves a lot of extra configuration in Filebeat and Logstash. We hope to hear from you soon.

Perhaps you can handle the bookmark file through the shared-file object too.

@julichan

julichan commented Jul 24, 2022

Hello, @FantasticFiasco,
I went on and investigated further:

  • I'd like to correct one statement: the .NET Framework 4.5 flavor of Serilog.Sinks.File should work across apps, so it doesn't matter whether you reference the .NET Framework 4.5 or the .NET Standard 1.3 flavor, but it needs to be tested, as there are flakiness reports for the .NET Framework one.
  • In the HTTP sink, the file BookmarkFile.cs, at line 31, opens the bookmark file with only FileShare.Read access, preventing this from working in IIS or in any multi-process context, but the next point should be the fix.
  • The PortableTimer is using a lock mechanism to execute the OnTick method. It should use an OS mutex or some other cross-app lock mechanism in order to work across apps. The constructor should contain a parameter saying whether the OnTick method must be executed one last time on dispose, to make sure the lock is taken into account when executing the HttpLogShipper OnTick method at dispose time. That point shows there is currently, potentially, a bug even in cross-thread mode at dispose time when the same buffer is used by multiple instances of the HTTP shipper.

If you don't mind, I could provide a merge request myself if I have time before you attempt to make a fix. My proposal is to fix the PortableTimer lock at dispose and to create an alternative PortableTimer that can be used instead when a cross-app option is enabled in the settings. Or, if you don't want a new option, I can just replace the PortableTimer lock with an OS mutex.
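To make the proposal concrete, here is a sketch of what a cross-process variant of the timer callback could look like, using a named OS mutex instead of a C# lock statement (the mutex name is hypothetical, and this is not the sink's actual code):

```csharp
using System;
using System.Threading;

// A named mutex is visible to every process on the machine, unlike an
// in-process lock statement. The name below is made up for this sketch.
using var mutex = new Mutex(initiallyOwned: false, name: @"Global\serilog-http-bookmark");

// Wait briefly instead of blocking the timer thread indefinitely.
if (mutex.WaitOne(TimeSpan.FromSeconds(1)))
{
    try
    {
        // ... open the bookmark file, ship the batch, update the bookmark ...
    }
    finally
    {
        mutex.ReleaseMutex();
    }
}
```

Note that WaitOne can also throw AbandonedMutexException when the owning process died while holding the mutex, which is exactly the recycling scenario, so a real implementation would need to handle that case.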

@FantasticFiasco
Owner

Thanks for the added information. I'll take a look at this issue when the summer vacations are over here in Sweden. I'll definitely want to make sure that we are using the correct internal dependencies to the file sink.

@julichan

julichan commented Sep 7, 2022

Hello @FantasticFiasco,
How are you? Just wanted to know if you had time to look at this. Just so you know, we couldn't wait for an update so we've used the RollingFileSink from the file sink repository which is working flawlessly alongside filebeat. However it would be so much better if we could use the Http Sink.
Any idea when you can look at this?

@FantasticFiasco
Owner

I'll try to reproduce this sometime this week or next.

One thing that confuses me is that it requires two different applications to reproduce the issue, is this correct? I don't think this sink has been built to support two different applications logging to the same buffer files. Although you mention that there is a problem here since the bookmark file is locked, another much greater problem is the fact that both applications will think that they are responsible for sending the log events over the network. That's one reason why you are successful using Filebeat: it's only running as one instance, and it has a clear responsibility for sending the log events. Having two separate applications sharing the same buffer files and bookmark file is like running two instances of Filebeat; one will surely fail, or you will end up with duplicate log events on the log server.

@julichan

julichan commented Sep 8, 2022

Hello @FantasticFiasco ,

Haha, I'm quite sure you can also reproduce it with just one application and two instances of the sink sharing the same buffer, although I haven't tested that.

You know, I think that Serilog shouldn't fail where log4net and NLog succeed. The HTTP sink shouldn't fail where the others succeed either. Plus, you should know that even for a single application, IIS will run multiple instances of it, so this should be supported, even if only for that case.

It doesn't matter if multiple sources send the logs; if the bookmark file is handled properly, the logs won't be sent as duplicates. :)

I'm trusting you on this one. I know you can do it. 👍 🥇

@FantasticFiasco
Owner

...you can also reproduce it with just one application and two instances of the sink sharing the same buffer, although I haven't tested that.

Again, I wouldn't know why you would do that, the sink hasn't been implemented with that use case in mind.

Regarding the comparison with log4net and NLog, I'm not in competition with either of them, nor do they dictate the requirements of this sink. Regarding usage in IIS, I'm no expert in IIS; most of my experience comes from working in cloud environments. Are you referring to worker threads in the app pool when you mention instances, or are instances something else?

@janichirag11

I think this could be due to multiple reasons, let me put them down one by one

  1. Do we have issues with multi-threading, as IIS creates multiple threads trying to access this file?
  2. Are there any special permissions that need to be provided to the folder for reading? If yes, what are they?
  3. Is this because we are not disposing of the file stream for the bookmark file, creating a locked bookmark file?

@julichan

julichan commented Jan 2, 2023

Hello and happy new year,
Visited the code and saw that at least the dependency on the file sink has been fixed. Cool. As you can guess, six months waiting for a fix is not an option for a company, so we've set up everything using Filebeat. I don't know if we'll ever go back. Not any time soon, anyway.

It's pretty easy to reproduce the bug, but it seems you still need help, so here is the case: IIS has a recycling process which works by creating a second instance of the web application (you can see two w3wp.exe at that time). In our case we notice that behaviour in prod only, as it is probably happening too fast in dev, but you can see the gist of it: two instances of an app mean two instances of the sink with identical settings, even if only for a short time, and there is no way around this.
You'll have an easier time reproducing this by starting two instances of the same app rather than trying to reproduce the IIS recycling.
There are two problems in this sink:

  • The bookmark file is exclusively locked: two executables won't work on it at the same time. You need a rolling lock on this instead, like the rolling file sink does.
  • The current handling of the bookmark file isn't right for this use case. It needs to be locked the entire time the HTTP operation is ongoing. And it also needs to be read before attempting to send anything, to avoid sending logs in duplicate.
    As Fiasco mentioned, that means multiple apps will make HTTP requests. I don't see a problem with that, but it will never happen simultaneously. That said, there are ways around it. One example would have one app send the data while the others monitor the bookmark file's modification date to know whether they should take over log sending. This could be an option. There are other ways.

Now this is way behind me, but I also remember reproducing this using multiple instances of the sink in the same app. Anyway, the sink was embedded in an async sink, in case that helps.

We also have reasons to have multiple apps sharing the same sink: multiple IIS web apps using the same code and configuration, with logs that have the same destination. Splitting the buffers is not an option, as it would require reworking a large monolith and large deployment code. We've tried workarounds, but no easy solution would allow resuming sending unsent logs after recycling or app shutdown... The HTTP sink has then been deemed unusable for us.

Now, you might still not consider this a use case. It's sad to get such an answer when all other sinks support this. I don't mean to stir up a fight, but Fiasco declined my offer to provide an MR, and now I have other priorities too...

@FantasticFiasco
Owner

@janichirag11, regarding your questions.

  1. Do we have issues with multi-threading, as IIS creates multiple threads trying to access this file?

Don't know. The test suite is not running in IIS, nor do I have any experience in hosting my applications there. But I would love a GitHub repository with a sample application showcasing the issue, with clear instructions on how to configure and host the application in IIS. With this level of detail I think I would be able to understand the use case a bit more.

  2. Are there any special permissions that need to be provided to the folder for reading? If yes, what are they?

Are we talking about the folder hosting the buffer files and the bookmark file? If yes, then the application process will need write permissions to this folder.

  3. Is this because we are not disposing of the file stream for the bookmark file, creating a locked bookmark file?

Maybe. The locking strategy selected for the buffer files is not the same as the one for the bookmark file. Serilog.Sinks.File, the downstream dependency responsible for writing log events to buffer files, has support for multithreading, while this codebase's implementation of the bookmark file hasn't. The reasoning behind this is that log events can be written from parallel threads, e.g. a web server handling parallel requests, while the bookmark file is only accessed from the log shipper that is responsible for sending log events over the network. The log shipper acts as a singleton in the codebase, and is not designed for a situation where multiple instances of log shippers share the workload of sending log events.

@julichan, regarding your comments.

As you can guess, six months waiting for a fix is not an option for a company...

My comment from 6 months ago ended with me asking you a very specific question on the subject of IIS. I specifically told you that "I'm no expert in IIS...". I didn't get an answer from you, thus my investment in this issue stopped.

Let me try to explain. I don't know whether you've published open source. I have a couple of repositories here on GitHub that have gotten to a level where developers are asking me questions, filing issues and proposing feature requests. In these engagements I will try my best to provide answers. But the problem is that there's always more of you than there is of me 😃 We have an uneven relationship here. We're engaging in this issue, and you've made an investment in this specific issue (thanks!). I'm engaging in all interactions across all my repositories, while still maintaining dependencies, and sporadically implementing new features. You will have to fight for my priority. The best way you do that is to optimize for the time I will have to invest. If you leave something open ended, if you force me to find my way to a reproducible sample, then your priority is downgraded. I don't think this is unfair, not to you and not to me. Especially when you, as you say, work for a company and get paid for doing this. Me? I don't get paid for providing this as open source. We don't share the same incentive for spending time on this. This is open source; you can fork my code and adapt it to fit your needs. The license gives you this right!

...set up everything using Filebeat. I don't know if we'll ever go back. Not any time soon, anyway.

I am happy for you, you've solved your problem and can continue with focusing on the important parts of your business! I don't mind this at all. There is no benefit for me, direct or indirect, from you using this codebase. You should use it only if you find it valuable. Again, this is open source, you are free to use it and there is no downside for either of us if you don't.

It's pretty easy to reproduce the bug, but it seems you still need help, so here is the case...

Yes, I definitely need help here 😄 Thanks for the explanation regarding recycling in IIS. But I would like to push back on the proposed solution. This sink has never been implemented to support the use case where multiple different applications share the same buffer files. I don't think this is something that we will opt to support. However, IIS is a major player in the .NET ecosystem, and if we have a problem with recycling processes it is something we might have to address. But the solution I would propose is one that handles the recycling problem, where we handle the fact that two processes try to access the same file within a short time. This solution seems much easier to tackle, using e.g. retries, than one where multiple processes share access to the buffer file over a long period. @janichirag11, would this be sufficient for you? If it is, I would still need help setting this up in IIS. A sample application and instructions on how to reproduce the problem in IIS would be the trigger for me to invest more time in this.
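A retry along those lines might look something like this sketch (attempt count and back-off are arbitrary; `OpenBookmarkWithRetry` is a hypothetical helper, not an existing API):

```csharp
using System;
using System.IO;
using System.Threading;

// Retry opening the bookmark file a few times to ride out the short
// window where the recycled worker process still holds the old handle.
static FileStream OpenBookmarkWithRetry(string path, int attempts = 5)
{
    for (var i = 1; ; i++)
    {
        try
        {
            return new FileStream(path, FileMode.OpenOrCreate,
                FileAccess.ReadWrite, FileShare.Read);
        }
        catch (IOException) when (i < attempts)
        {
            Thread.Sleep(TimeSpan.FromMilliseconds(200 * i)); // linear back-off
        }
    }
}

using var bookmark = OpenBookmarkWithRetry("LogBuffer.bookmark");
```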

@janichirag11

Multiple retries could be one solution. IMO, using some mutex or lock mechanism so the resource cannot be reopened would help solve the current problem. In the BookmarkFile class, if we can add a lock, it will help when another thread is trying to access the bookmark file: access will not be allowed, and the exceptions will be reduced.

This will give better handling of multi-threading errors and will still serve the purpose of a singleton.

In regards to creating a sample application, let me try that and will post once I am done with it.

Happy Coding :)

@FantasticFiasco
Owner

Thanks! Happy coding to you as well! 😄

@janichirag11

[SerilogFlow diagram]

Here is the possible flow we might be dealing with... Serilog is configured with a 2-second period, it takes more than 2 seconds for a batch to be processed on the consumer side, and since we do not have a locking mechanism, we get exceptions.

@FantasticFiasco
Owner

Ok, if the issue we're running into is that attempts to ship the log events overlap, I think we're in a good situation. I'll see if I can create a test for this.

@janichirag11

And then we should avoid throwing exceptions in this case.

@prasadpaul53

prasadpaul53 commented Feb 2, 2024

Hello,

We're encountering a persistent issue on our production environment where we receive a large volume of error messages stating "The process cannot access the LogBuffer.bookmark file because it is being used by another process" during app pool recycling.

Here's a summary of the problem:

Error:
Exception while emitting periodic batch from Serilog.Sinks.Http.Private.Durable.HttpLogShipper: System.IO.IOException: The process cannot access the file 'LogBuffer.bookmark' because it is being used by another process.
   at Microsoft.Win32.SafeHandles.SafeFileHandle.CreateFile(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
   at System.IO.Strategies.OSFileStreamStrategy..ctor(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
   at System.IO.Strategies.FileStreamHelpers.ChooseStrategyCore(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize)
   at System.IO.Strategies.FileStreamHelpers.ChooseStrategy(FileStream fileStream, String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options, Int64 preallocationSize)
   at System.IO.File.Open(String path, FileMode mode, FileAccess access, FileShare share)
   at Serilog.Sinks.Http.Private.Durable.BookmarkFile..ctor(String bookmarkFileName)
   at Serilog.Sinks.Http.Private.Durable.HttpLogShipper.OnTick()

Occurrence: Frequent during app pool recycling.

Serilog.Sinks.Http version: 8.0.0

Our investigation revealed:
The issue is caused by app pool overlapped recycling. When the app pool is recycled, the previous process briefly holds onto the LogBuffer.bookmark file, causing access conflicts with the new process.

This happens according to the below steps:

  1. Process Hold on LogBuffer.bookmark: A specific process within the app pool retains the LogBuffer.bookmark file during app pool recycling.

  2. App Pool Shutdown Timeout: The app pool has a pre-defined timeout period for processes to complete tasks and shutdown gracefully before the recycling process commences.

  3. Process Exceeds Timeout: In this case, the process holding the LogBuffer.bookmark file exceeds the set shutdown timeout and is forcefully terminated by the app pool recycling mechanism.

  4. Concurrent New Process Access: While the old process is forcefully terminated, a new process from the recycled app pool attempts to access the same LogBuffer.bookmark file.

  5. Access Conflict and Error: Since the file is still locked by the remnants of the terminated process, the new process encounters an access violation, resulting in the "The process cannot access the LogBuffer.bookmark file" error.

Could you please implement a graceful shutdown that stops the ongoing process using the LogBuffer.bookmark file?
Is there any other alternative?

@FantasticFiasco
Owner

Hi @prasadpaul53.

I've started to analyze this on branch test/locked-bookmark-file. I'm creating two log event producers sharing the same buffer and bookmark file. I'm using the durable file-size rolled sink (extension method DurableHttpUsingFileSizeRolledBuffers). The only configuration I apply is setting bufferFileShared: true.

My observations are as follows:

  • Since the sink is configured using bufferFileShared: true, the same buffer file will be shared among the producers. This is good, as this means that we won't drop any log events from either of the producers.
  • The bookmark file is locked while the log events from the buffer file are sent over the network. This means that one of the producers will be prevented from opening the bookmark file while the other producer is in the process of sending the log events.
  • Disposing HttpLogShipper seems to be slow; I was expecting this to be a much faster process. Let's look into this.
  • If app pool recycling turns out to be a common scenario, perhaps we need to catch that specific exception and write a friendlier error message to the SelfLog.

Reflections?

@prasadpaul53

Hello @FantasticFiasco,

Great that you have started the analysis.
Your observation regarding the locking of the bookmark file is absolutely correct.
My reflections are:

  • The default time to wait between checking for event batches is 2 seconds, which means the HttpLogShipper will attempt to send batches of log events to the configured HTTP endpoint every 2 seconds.
  • During the app pool recycle, since log events are being sent every 2 seconds, the old process is not able to release the lock on the bookmark file. So we get the exception whenever the new process tries to access the bookmark file.
  • If the previous attempt was unsuccessful due to a file lock, this frequent retry can put repeated pressure on the locked file, potentially prolonging the issue.
  • As you pointed out, you can look into faster disposal of the resources that use the bookmark file.
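On the configuration side, one mitigation we could try in the meantime is lengthening the shipping period, giving the old process more time to release the file before the new one ticks (a sketch; `period` is assumed to be the parameter name of the durable extension methods, and the URL is hypothetical):

```csharp
using System;
using Serilog;

// Sketch: raise the shipping interval from the default 2 s to reduce
// contention on the bookmark file during app pool recycling.
var log = new LoggerConfiguration()
    .WriteTo.DurableHttpUsingFileSizeRolledBuffers(
        requestUri: "https://logs.example.com/ingest",
        bufferBaseFileName: "LogBuffer",
        period: TimeSpan.FromSeconds(10))
    .CreateLogger();
```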

@FantasticFiasco
Owner

The problem at heart is that we want to use the sink in two different use cases. The first one, originally in mind when creating the sink, was that the instance would terminate, and all log events ought to be sent before we terminate. That's why this line exists.

The other use case, which is relevant in many of the comments in this thread, is that the instance will restart. In this context the priority is to terminate quickly, and log events can remain on file since they will be sent after the process restart.

@prasadpaul53 does this sound correct?

@prasadpaul53

Yes, it is correct. @FantasticFiasco

@FantasticFiasco
Owner

FantasticFiasco commented Mar 27, 2024

Release 9.0.0-beta.2 has now been published on NuGet; please verify that the release fixes your issues. If it does, please report back and I'll create a new official release.

Thanks for being persistent in your feedback.

@prasadpaul53

Thank you for the update on the release. We will verify the fixes and provide our feedback soon.

@FantasticFiasco
Owner

Just released this as an official version; you can find it on NuGet.

@paullpra

paullpra commented Dec 4, 2024

Hi,

We are encountering this issue where LogBuffer.bookmark is being accessed by another process. We are using version 9.0.0-beta.2 of serilog-sinks-http.

Error:

Exception while emitting periodic batch from Serilog.Sinks.Http.Private.Durable.HttpLogShipper: System.IO.IOException: The process cannot access the file 'F:\LogBuffer.bookmark' because it is being used by another process.
   at Microsoft.Win32.SafeHandles.SafeFileHandle.CreateFile(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options)
   at Microsoft.Win32.SafeHandles.SafeFileHandle.Open(String fullPath, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
   at System.IO.Strategies.OSFileStreamStrategy..ctor(String path, FileMode mode, FileAccess access, FileShare share, FileOptions options, Int64 preallocationSize, Nullable`1 unixCreateMode)
   at Serilog.Sinks.Http.Private.Durable.HttpLogShipper.OnTick()

We are receiving a significant number of these errors. Has there been any further improvement since version 9.0.0-beta.2 to address this scenario? Are there any configuration adjustments we should consider?

@FantasticFiasco
Owner

No, I don't think anything notable happened between 9.0.0-beta.2 and 9.0.0. Does your environment match the environment of the other contributors to this issue?

@paullpra

paullpra commented Dec 9, 2024

Hi,

Our environment aligns with the general description provided by other contributors. Here’s a generic outline of our configuration:

[screenshot of the configuration omitted]

Given this configuration and the nature of the error, could you suggest a fix to prevent the bookmark file from being locked by another process?
