-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Media cache : A background task to purge all periodically. #8751
Conversation
Including ms-cache and is-cache folders.
src/OrchardCore.Modules/OrchardCore.Media/Services/MediaCacheBackgroundTask.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's a good start but I think we may a better way to purge the cache, coz what I see is the deletion may take long time for a large cache also it might fail if the file is locked if the timeout is reach
True but you can adjust the repeat time too. I left a long delay because it generally never reach the timeout. It gives time to ImageSharp to do it's job. If you have images that needs to be resized and that it takes more than 25 seconds, these are huge files. |
As I said it's good start, waiting for other guys thoughts Thanks |
Made it a feature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a bit of an internal hack I shared with you @Skrypt so I'm not sure if it's the right approach to include in the full solution, but good to see the many improvements to my quick hack ;)
Might be better to have something that is UI driven first for ImageSharp, and then possibly a feature for ImageSharp as a background task, that could consume the same service, and a background task for the media cache (again consuming the existing services w/any tweaks or extensions that are needed to those.)
I think I already did the ui for the blob media cache, so we could lean on that for ImageSharp?
Here's it's mixed with blob storage and image sharp, where users might only be using image sharp, and not blob storage.
src/OrchardCore.Modules/OrchardCore.Media/Services/MediaCacheBackgroundTask.cs
Outdated
Show resolved
Hide resolved
Yes, maybe we should have a separate background task for each of these cache then. And the purge button could just trigger manually these background tasks instead. But right now, I made it dependent on OrchardCore.Media.Cache feature because of that. I think we need to see if we want to have this in OC first. But for not it's only some practical hacky bg task 😉 |
src/OrchardCore.Modules/OrchardCore.Media/Services/MediaCacheBackgroundTask.cs
Outdated
Show resolved
Hide resolved
src/OrchardCore.Modules/OrchardCore.Media/Services/MediaCacheBackgroundTask.cs
Show resolved
Hide resolved
@Skrypt Good start Yes, it is related to one of the last things i'm thinking about, to keep instances in sync in a distributed env. As i remember, if we don't purge all cached files periodically, we were also thinking about removing the ones that may have been invalidated e.g. when they have been updated to the blob by another node. Or, because there may be many cached files, maybe use an expiration time from when they were created until now, but need more thinkings. But not incompatible with a feature that purges all of them periodically, that would be by nature distributed if each node purge its own cached files, assuming that a given file system is not shared by multiple nodes, but we use try / catch blocks. |
Oh well guys thank @deanmarcussen he did most of the work I just had fun making it a little better. |
TODO : Make it purge only the current tenant ms-cache. We can't do anything about that for the is-cache as it is global. |
Reverting async modifications from previous PR.
So tonight, another night into the is-cache purging story. Got files in there that stayed locked up untill the app closed. So the way I fixed it is that I cleaned up the folders manually and restarted the app. I previously restarted the app without cleaning them and the lock was still there. So, I cleaned them manually and then I've been able to execute the background task without any issue (no logs). This one I will be having a hard time to repro. |
@Skrypt just an idea on the fly
Purging the is-cache maybe a separate feature or not, the idea being that only the default tenant will be allowed to do it. Then still a potential problem if multiple instances run on the same file system, but we now can configure a given background task to be atomic / exclusive in a multi instances env by using the redis lock feature. But anyway for now Lucene can't be used in such a context, so if we assume that when using Lucene we should not use multiple instances on the same file system, that's "okay" we don't need the distibuted redis lock. |
Allowing only the host tenant to clean the is-cache folder might cause some issues when we need to refresh a file that has been updated but has kept the same name. Here, from a specific tenant you will require to refresh these resized image files in the is-cache else it will still serve the old version of that file. Currently, what I'm doing is that I delete these files manually on the server is-cache and I'm cleaning my browser cache to force it requesting back the new file on the Azure Blob Storage. So basically, if the is-cache is global to all tenants then you can't clean all files of a specific tenant because you need to search for a specific file in those hashed folders hierarchy which makes no sense at all. So, either we make one is-cache folder per tenant or we keep the actual design and we allow to purge it entirely from any tenant. |
Yes, my idea on the fly was only based on what is currently done here if i understood correctly, if each tenant has a background task purging periodically all is-cache files indifferently, i was thinking that it could be done by only one tenant e.g. by a DefaultTenantOnly feature. But yes, maybe others things to do, a mix with auto purging periodically and / or based on the file created files for example, and some UI purging cmds that would need to be distributed if possible, looks not so hard to do to say to all instances to purge all files, this by sharing an identifier as we already do for other shared documents, but maybe not so simple to invalidate specific files as it seems that the hashed folders hierarchy would not be the same on each instances. But not sure about all of this, just ideas on the fly, would need to focus on it again more in details ;) But i think it is important to work on this so that we could find the best compromise Having a folder per tenant, if it's possible, looks like a good idea |
[assembly: Feature( | ||
Id = "OrchardCore.Media.Cache.BackgroundTask", | ||
Name = "Media Cache Background Task", | ||
Description = "Provides remote store cache and ImageSharp cache periodical purging ability.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Description = "Provides remote store cache and ImageSharp cache periodical purging ability.", | |
Description = "Provides a way to purge media cache regularly.", |
{ | ||
_logger.LogInformation("Media cache background task cleaning started"); | ||
|
||
var directoryInfo = new DirectoryInfo(Path.Combine(_webHostEnvironment.WebRootPath, "is-cache")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be using IMediaFileStoreCache
service here by calling GetDirectoryContents("is-cache")
instead of builder the DirectoryInfo manually?
src/OrchardCore.Modules/OrchardCore.Media/Services/MediaCacheBackgroundTask.cs
Show resolved
Hide resolved
Don't remember all details but just as a reminder about things that may be related.
|
RecursiveDeleteAsync(directoryInfo, false, cancellationToken); | ||
|
||
directoryInfo = new DirectoryInfo(GetMediaCachePath(_webHostEnvironment, DefaultMediaFileStoreCacheFileProvider.AssetsCachePath, _shellSettings)); | ||
RecursiveDeleteAsync(directoryInfo, false, cancellationToken); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the idea is to purge-all, you should use IMediaFileStoreCache
and call PurgeAll()
method which does the same thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have used that method if that would have worked all the time. The thing I'm trying to do here is to find out why it fails to delete some of these files because of a lock. I don't want to use another service that does that because I don't want to change the service itself. Once, we will have figured out the lock then we will update that service accordingly.
@jtkech I'm pretty busy right now with other things, I just wanted to give @MikeAlhayek the link to this PR as an example of how to purge the cache. Thinking about having a cache size limit may be a bad idea if there is a lot of small files in these cache folder. It may become very slow exponentially really quick. I think we need to limit the upload file size first and then, in the cache, we could purge the files based on some DateTime like you are suggesting. |
No problem take your time, will be happy to work with you on this one as it is an important remaining feature to do. Do you meant that some files get locked not only because of concurrent accesses but more like e.g. Lucene index files until some services are disposed? |
Yes, the ImageSharp cache is the culprit. |
Be really careful about this. I wrote it, it's terrible, and I would be very disapointed to see it merged into Orchard Core, with it's current implementation. The original intent was to be able to hold the cached files in a cdn, so that the server didn't need them anymore. Both expensive, and very bad for performance. Being able to delete the cached image sharp files from the admin but not from a background task would be my recomendation. (but for now only from the default tenant, due to the way imagesharp files are stored) This from @jtkech might make it more useful, but then you would still probably want the delete image sharp cache files from the admin as well, as there is a use case and need sometimes to do that. Just not a good idea to do it on a background task.
|
Oh yes forgot that currently it is global, not per tenant. So just to say that when #11890 will get merged, it will be per tenant, |
Okay, maybe it has been solved in the meantime, if I look at this comment #8751 (comment) where you are talking about this, around the same date @deanmarcussen fixed an issue on a not disposed stream SixLabors/ImageSharp.Web#162 (comment), and he was hoping that it also solves locking issues ;)
|
Maybe is the constant here. 😉 |
@Piedone This PR was kept as a draft because it had issues with the ImageSharp cache. Needs to be tested after merging latest main branch on top of it. Though, this backgrond task is a really tiedous patch for some other issues that comes from ImageSharp when it fails to resize images else it would not be necessary. I would rather keep this as a draft PR for people to use as a tool when needed. |
This might be interesting going forward, even if not this PR necessarily, but the idea. Would you like to instead create an issue? BTW I might do something similar under #15016 (comment) for Azure Blobs and ImageSharp. |
So, is this something you'd like to revisit any time soon @Skrypt or should we close? If you don't want to make this into an actual OC feature, but just keep as a reference for some, then perhaps a blog post, or a discussion on the discussion board would be better. However, for the original issue you had, with leftover ImageSharp files, rather, there should be a fix implemented in IS - if it's still required. |
Unless we find a way to fix the issue we can't really merge this in OC. I would just leave it as it is for now as the discussion here is valuable for me to remember where I was at. |
It's just that PRs are for things where the goal is to (eventually) merge them into OC. A draft PR is otherwise not really easy to discover if you're looking for a solution. So if you don't want to merge this in the foreseeable future, the mentioned discussion board or blog post are better ways to ensure that people can benefit from it. |
Related with #8715
Including ms-cache and is-cache folders.
I've made some updates to @deanmarcussen original work and wanted to share it here instead of Skype.
This could be something that we want in OC but with a configurable CRON value.already existingSomething that I mentioned in the issue is the fact that we require to check for locked files because of ImageSharp. Here a purge cache would require to pause the service while we are purging the is-cache folder. Maybe something to look into for later.
I keep this one as a draft because there is definitely surely a better way to handle purging the ImageSharp cache. The while loop with a timeout I made here could be an issue but I coded the bgtask so that it doesn't log anything from executing. It is efficient "at least locally", maybe a little too much, I did not test it's perf on Azure yet.
One issue we often have "or had" is that when we purge the current ms-cache ; it is not purging it's is-cache. So, if I'm updating an image without changing it's filename, in the ms-cache, it won't refresh the thumbnails in the is-cache folder. Small things like that.
So using a background task to clean every assets periodically will allow to refresh all these images (for now).
TODO :
Closes #15064