Assets sometimes fail to load from the asset cache. #2663

kwvanderlinde · 2021-05-19T22:49:33Z

Describe the bug
When adding many new tokens using assets from the resource library, it sometimes happens that the asset is not fully read and results in MapTools freezing.

To Reproduce
Unfortunately I was not able to reproduce this without 5e Framework since I'm not sure exactly what it's doing to load the assets.

Steps to reproduce the behavior:

Clean MapTool installation
Downloads Rod's 5e Framework
Install the image files from https://get.5e.tools/.
Open up the Creature List and clicking "Make Token" for several different creatures.
After MapTool freezes, force quit.
Load up the campaign again and try to add the same creature that resulted in a freeze. This time it will work.

Expected behavior
MapTool should load the asset without freezing.

MapTool Info

Version: 1.8.3, reproduced on latest develop (7664101).
Install: New

Desktop (please complete the following information):

OS: Linux Mint
Version 20

Additional context
I'm not sure if MapTool itself is freezing, or if the framework is somehow holding it up. I also don't know how to get the logs from a frozen MapTool process, so I reproduced the issue with the latest on develop and got these logs right when the freeze occurred:

14:49:11.392 (AssetManager.java:454) [pool-4-thread-1] ERROR net.rptools.maptool.model.AssetManager - MD5 for asset Camel corrupted
14:49:11.397 (ImageManager.java:428) [pool-8-thread-1] ERROR net.rptools.maptool.util.ImageManager - BackgroundImageLoader.run(Camel,png, d453500b4f82966f1d166b65acaa1c12): image not resolved
javax.imageio.IIOException: Error reading PNG image data
	at com.sun.imageio.plugins.png.PNGImageReader.readImage(PNGImageReader.java:1523) ~[?:?]
	at com.sun.imageio.plugins.png.PNGImageReader.read(PNGImageReader.java:1840) ~[?:?]
	at javax.imageio.ImageIO.read(ImageIO.java:1468) ~[?:?]
	at javax.imageio.ImageIO.read(ImageIO.java:1363) ~[?:?]
	at net.rptools.lib.image.ImageUtil.bytesToImage(ImageUtil.java:302) ~[main/:?]
	at net.rptools.maptool.util.ImageManager$BackgroundImageLoader.run(ImageManager.java:424) [main/:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:831) [?:?]
Caused by: java.io.EOFException: Unexpected end of ZLIB input stream
	at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:244) ~[?:?]
	at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158) ~[?:?]
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:244) ~[?:?]
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:284) ~[?:?]
	at java.io.BufferedInputStream.read(BufferedInputStream.java:343) ~[?:?]
	at java.io.DataInputStream.readFully(DataInputStream.java:201) ~[?:?]
	at com.sun.imageio.plugins.png.PNGImageReader.decodePass(PNGImageReader.java:1245) ~[?:?]
	at com.sun.imageio.plugins.png.PNGImageReader.decodeImage(PNGImageReader.java:1381) ~[?:?]
	at com.sun.imageio.plugins.png.PNGImageReader.readImage(PNGImageReader.java:1514) ~[?:?]
	... 8 more

I did some digging and added some additional logging to AssetManager, and what I found is that this call to FileUtils.readFileToByteArray() does not return all the data for the asset. This makes the MD5 to be incorrect and causes the inflation to fail.

The text was updated successfully, but these errors were encountered:

kwvanderlinde · 2021-05-19T23:30:59Z

I did some digging in the code, and I have a theory as to why this is happening. It looks like there is a race condition in AssetManager that allows a partially written asset files to be read back.

These are a few culprits here:

When creating the cached file, the file is written to on its own thread, and there doesn't appear to be any way of indicating that whether all the data has been written.
When reading the cached file, the file is considered to be in the cache if it exists and has a non-zero length.
When reading the cached file, the asset is used regardless of whether the MD5 matches. An error is logged, but otherwise this mismatch is not acted upon.

It seems to me that all three of these points could be made more robust, and fixing any one of them should fix the underlying issue. E.g.,

We could write the asset to a temporary file (e.g., <md5 hash>.tmp or something) and then rename it once the write has completed. That way, if the file exists with the right name, it should have all the data in it. If the write hasn't completed, or even if it failed part way through, the asset file won't exist and can't be read.
If we could know the expected length of the file, we could compare against that instead of comparing against zero. This would ensure that an asset is only considered to be in the cache if its file has the correct amount of data. This would also suggest that the write has actually completed (barring external factors such as tampering with the files, of course). Alternatively, the asset could be considered to be in the cache whenever its file exists, regardless of size, with corruption being handled purely by the MD5 check.
If the MD5 does not match the expectation, then the asset is not considered successfully loaded and the corrupted asset is not associated with the MD5 in the asset map. Perhaps the file could even be ejected from the cache in this case.

Phergus · 2021-06-15T13:58:57Z

I suspect what you have found here explains many of the cases where players end up with dreaded red X for tokens/maps.

It seems like your #1 suggestion will fix most cases and is a common way of avoiding these problems. Adding the MD5 check in #3 will allow us to identify corrupt images in the cache which would allow us to flush it out of the cache and, for player clients, request it again from the server.

Phergus · 2021-07-03T21:09:49Z

Testing. This appears to have created a problem with saving new images to the assetcache.

Do any of the following:

Create New Map and add map image that has not been loaded before and thus not in cache.
Import DungeonDraft UVTT file.
Drop new image as Token.

These all produce exceptions like this:

2021-07-03 15:03:17.292 ERROR net.rptools.maptool.model.AssetManager - Could not persist asset while writing image data
java.nio.file.FileSystemException: C:\Users\Phil\.maptool-rptools\assetcache\tmp.10391504194488045576 -> C:\Users\Phil\.maptool-rptools\assetcache\e3b4e8912980831edd7d0cc514c5ee17: The process cannot access the file because it is being used by another process
	at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:92) ~[?:?]
	at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) ~[?:?]
	at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309) ~[?:?]
	at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293) ~[?:?]
	at java.nio.file.Files.move(Files.java:1426) ~[?:?]
	at net.rptools.maptool.model.AssetManager$AssetWriteRenameOperation.commit(AssetManager.java:850) ~[main/:?]
	at net.rptools.maptool.model.AssetManager.lambda$putInPersistentCache$3(AssetManager.java:544) ~[main/:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
	at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
	at java.lang.Thread.run(Thread.java:831) [?:?]

Only the .info file is added to the assetcache. Fortunately the assets are saved with the campaign or exported map files.

kwvanderlinde · 2021-07-03T22:45:29Z

I have a feeling this may be a windows-specific thing, though I don't see why. I'll see if I can reproduce it.

Phergus · 2021-07-03T22:52:52Z

Could be given where the error occurs. The exception occurs 3 or 4 times depending on image size it seems.

kwvanderlinde · 2021-07-03T23:02:04Z

Ah, this may be as simple as an incorrect closing order for streams. Will test later and post a PR if that's the case.

Phergus · 2021-07-05T13:56:26Z

Re-tested. New assets are now being written to the asset cache and with no exceptions.

kwvanderlinde added the bug label May 19, 2021

Azhrei mentioned this issue Jun 29, 2021

Maptool hangs on corrupt asset loading. #2831

Closed

kwvanderlinde mentioned this issue Jun 30, 2021

Make the asset cache more robust, to avoid broken assets. #2835

Merged

Phergus assigned kwvanderlinde Jul 1, 2021

Phergus closed this as completed in #2835 Jul 1, 2021

Phergus reopened this Jul 3, 2021

kwvanderlinde mentioned this issue Jul 5, 2021

Proper closing order when writing assets #2843

Merged

Phergus added the tested This issue has been QA tested by someone other than the developer. label Jul 5, 2021

Phergus closed this as completed Jul 13, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Assets sometimes fail to load from the asset cache. #2663

Assets sometimes fail to load from the asset cache. #2663

kwvanderlinde commented May 19, 2021

kwvanderlinde commented May 19, 2021

Phergus commented Jun 15, 2021

Phergus commented Jul 3, 2021

kwvanderlinde commented Jul 3, 2021

Phergus commented Jul 3, 2021

kwvanderlinde commented Jul 3, 2021

Phergus commented Jul 5, 2021

Assets sometimes fail to load from the asset cache. #2663

Assets sometimes fail to load from the asset cache. #2663

Comments

kwvanderlinde commented May 19, 2021

kwvanderlinde commented May 19, 2021

Phergus commented Jun 15, 2021

Phergus commented Jul 3, 2021

kwvanderlinde commented Jul 3, 2021

Phergus commented Jul 3, 2021

kwvanderlinde commented Jul 3, 2021

Phergus commented Jul 5, 2021