-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WebAssembly build #3522
WebAssembly build #3522
Conversation
a55afb0
to
82c0159
Compare
This is a WebAssembly build of sharp usable in Node.js and Stackblitz environments. There is quite a lot going on here to make it work, so I won't list all the implementation details I had to make it in the PR description, but instead would ask you to refer to the comments which hopefully explains the individual hacks, and feel free to ask any questions in the review. The build script itself piggy-backs on the excellent work done by @kleisauke in [wasm-vips](https://github.com/kleisauke/wasm-vips) with some modifications to use it only for building libvips rather than custom bindings. One thing I'll emphasise is that this is a build made specifically for Node.js / Stackblitz with the intention of being 1:1 API-compatible - among other things, it means that Wasm instantiation is intentionally synchronous, and that it will use the native filesystem via Node.js raw filesystem. I had / have a separate branch with an almost working browser build of sharp, but that requires some API changes to be usable in a browser without locking the main thread and with accepting e.g. `File` object instead of using virtual filesystem, so for now keeping it out of scope. The concurrency is currently limited to a fixed-size threadpool. While I made it possible to create threads on-demand in recent versions of Emscripten, there are still some issues and bugs when trying to use it with internal libvips threadpool, so for now keeping a fixed-size threadpool is a safer and time-tested option. Among other things, this will run only one concurrent sharp operation at a time. Also, in Stackblitz environment libvips will be limited to only 2 threads to keep memory usage under control - this is done because libvips already always needs +3 extra threads, and async emnapi operation will need yet another +1 thread per async operation, so the number of Workers quickly adds up. Additionally, some formats and operations - namely, SVG, DZI and text operations - are currently unsupported just like they're in wasm-vips. Text (both on its own and in SVG) is notoriously difficult due to lots of questions around font loading (Local Font Access API, remote fonts, etc.), but other formats might come in time. For now, though, this Wasm build should already cover most common use-cases. Finally, for now I committed the build script together with prebuilt binaries as part of the PR, as it made testing easiest, but I'd ask the maintainers to integrate it properly into their "prebuilt addon" system somehow - I can't do that from my end, as I neither know how it works nor have access to the storage.
89a4650
to
65e504a
Compare
Intentionally skipping coverage here, as Wasm doesn't cover everything.
360198c
to
b9e340c
Compare
So there's one flaky test that I'm rarely seeing locally, but currently failed on CI, and whenever I saw it, it always showed number 52:
It seems to suggest there's a leak of 52 MB somewhere in the Wasm build (at least I haven't seen it failing on native builds yet), but no idea where :/ |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Ah ok I can reproduce this consistently locally when I disable |
Narrowed down to Interesting that both tests are specifically for failures. |
Went through tests in those suites by uncommenting one by one and looks like only these 3 tests cause the detected memory leak:
It's interesting that all three 1) use JPEG input, same tests for e.g. PNG are fine and 2) return an error. It increasingly looks like a legitimate leak in... mozjpeg? vips JPEG module?... but only when built on Wasm and only somewhere in error handling path. |
This comment was marked as outdated.
This comment was marked as outdated.
If I'm reading VIPS_LEAK logs correctly, it seems to confirm that JPEG image is the one leaked (although there's a bunch of other objects here):
@kleisauke This is probably a leak that affects wasm-vips itself too, rather than specific to this PR. |
a9009b8
to
b9e340c
Compare
Resetted the branch to exclude my |
This comment was marked as outdated.
This comment was marked as outdated.
Can't help but feel some sort of irony at the coincidence, because the last time I wrote an article on debugging memory leaks in Wasm, it also involved mozjpeg 😅 https://web.dev/webassembly-memory-debugging/ I don't think it's the same issue though - as far as I can tell, libvips correctly calls |
Wow, thank you very much Ingvar, this looks like a great start in getting sharp compiled to WASM. дякую! Over the next few days I'd like to focus on getting the new release out (major libvips upgrade, TypeScript definitions moving "in-house", etc.), then will come back and take a closer look at your PR. I will have questions :) |
No-no, just click that merge button 🙈 |
I'm curious - why start? Do you mean in terms of missing formats, or browser support or something else?
💗 |
Fantastic work @RReverser, really 👏 👏 👏 I can't wait for this to be integrated upstream. This will be a game changer for the Web and also a great case study for Wasm and the Node.js ecosystem. |
Okay @kleisauke and me found the reason for the leak (in case of wasm-vips, 2 separate reasons 😅). I committed the rebuilt JS+Wasm binaries to confirm it on the CI as well - which passed on wasm32 now (x64 seems a flake) - but I'll wait on him to update wasm-vips repo so that I could update the source hash in the build script. UPD: done. |
Nice to see emnapi is used in so great project! I'm a bit curious why emnapi needs embind, is there any problem without linking embind? |
Uhh good question 😅 I think it was necessary before? I'm not sure, but there was a reason I added it back when I started... Happy to remove now. |
Seems no longer necessary.
I doubt this install overhaul will be finished in 2023, so we need a temporary solution here. Ideally this would go into sharp repo, but publishing |
@kleisauke Thanks for the code snippet, that was helpful! I ran the Wasm build against all the Next.js tests and all of them pass. However, I found that AVIF was extremely slow. It actually performed worse than Squoosh which is also Wasm.
The one that stands out is @RReverser This makes me think there might be a bug. The code running in this case would be: const sharp = require(`sharp`);
const transformer = sharp(bufferInput)
transformer.resize(2048, undefined, { withoutEnlargement: true })
transformer.avif({ quality: 60, chromaSubsampling: '4:2:0' })
const result = await transformer.toBuffer() |
I don't think it's a bug. Sharp (or, rather, libvips) and Squoosh use different libraries for AVIF compression, as well as handle multithreading in very different ways, so differences are expected. You can bring it up with libvips if you can reproduce it natively too, but that's out of scope of this PR. |
Now that #3750 is in a good place, I've finally made some time to start to look at this PR and understand how best to integrate it. Thank you for your patience. The prebuilt libvips binaries provided for use with sharp are, where possible, a single shared library containing libvips and all of its dependencies statically linked. The prebuilt sharp binaries are a shared library with a dependency on the above shared library.
This is done partly to help clarify licensing. User code has a shared library dependency on Apache 2 sharp, which has a shared library dependency on LGPLv2+ libvips. Mixing sharp and libvips (and other LGPLv2+/v3) code in the same shared library could mean parts of sharp itself falling under the terms of LGPLv3, which might introduce a very small risk of GPL exposure that some would not be willing to take. What options are available, if any, for keeping these shared libraries separate in WASM? For example, could we publish |
Yay! Glad to see progress!
Yeah there is no official dynamic linking in Wasm (yet and likely not soon); there is this Emscripten's side module mechanism but it tends to be problematic both in terms of bugs and performance overhead (as now you need to have different Wasm modules, individually compiled by the engine, communicating to each other through separate pieces of JS), so it's normally recommended to avoid it in favour of static linking. If it's not too much of a concern, I'd rather recommend sticking to static linking as well. I assume the different license would only apply to users of the Wasm npm package if Wasm binaries aren't going to be actually checked into the repo? |
For the sake of setting clear expectations, I probably should also say that it's been a while and meanwhile I've switched and am working on other projects. I'd still want to see this landed and will try my best to respond here to any comments and minor suggestions, but can't promise time commitment for doing actual work on further integration. |
@RReverser Thanks Ingvar, I've spent the last couple of days immersing myself in the world of WebAssembly and have come out the other side with a much better understanding, or at least with enough experience to be dangerous ;) Your advice on linking is really useful, I agree static is the way to go. As a result the use of wasm will be opt-in e.g. "add package X to your dependencies" and will require suitable licensing notices. If you hadn't seen, the forthcoming libvips 8.15.0 will include support for SIMD via highway, which means (greater) SIMD support for WebAssembly too. @toyobayashi Loving your work on |
That's awesome, should be easy to make use of it when it's out. |
@lovell Thank you! Great to see sharp wasm succeed. Also great thanks to @RReverser for his many valuable suggestions and contributions to |
I've split this change into two separate (draft) PRs and updated to fit with the new installation package structure.
There will be a prerelease soon should people wish to help test these - please see #3750 for updates. |
This PR has landed as lovell/sharp-libvips@e219aea and a8f68ba and can now be tested via the instructions in #3750 (comment). It will be included as part of sharp v0.33.0. дякую @RReverser for all your work on this, 谢谢 @toyobayashi for emnapi and, as always, dankjewel @kleisauke for wasm-vips where this all began. For StackBlitz support please see stackblitz/webcontainer-core#1236 |
Amazing, thanks @lovell! |
This is huge! Awesome work everyone and thanks a lot again @RReverser for working on this with us. Thanks @lovell for getting this merged! 🙏 |
Wow this is very interesting, I came across this thread researching for the next itteration of our web based email designer. Were currently using Sharp server-side using an XHR request to sanity check and resize the dimensions of the image to a sensible size. One of the things on the list was client side processing for this. @RReverser it sounds like that's a separate project branch you have? being able to pass in a file object and manipulate in the browser would be perfect - are there any plans to push this at some point. May not be able to contribute with the code but can certainly help testing in a perfect use case |
I would also love a browser build of Sharp, being able to scale images and generate thumbnails on the client side before uploads would diminish stress over the remote architecture, cut costs, and increase speed! |
That really would be awesome - I have some applications where sharp processing on the server becomes the perf bottleneck with many users active, and distributing that work to their clients would fix that. |
Yeah I can probably rebase & push it somewhere now that this PR is done. Most has been done, but I won't be able to work on remaining issues further myself, maybe someone can fork and continue that way. That said, I wonder if for browser and Cloudflare Workers usecases something like https://github.com/jamsinclair/jSquash would be better suited? (no offence @lovell) It's a fork of Squoosh.app but as a library. It's a bit more modular, since it compiles those codecs individually, and, by avoiding libvips, it can also run in environments without Web Workers. |
@RReverser Thanks for the steer on JSquash I've had a good look at that and integrated the code but the performance is poor. I note someone else has recently raised that as an issue so it looks like the code around the codecs is some kind of bottle-neck. I've also experienced crashes on images that work on the core Squoosh app Unpicking the react front end from squoosh is an option but it does not support animated GIF. I know animated GIF in 2024!! but my use case is an email designer and email design is stuck in 1994 and animated GIF is still a "thing" and Sharp does a great job. I can take a look at forking the project if it's rebased, never done anything with WASM so it may be a learning curve :) |
This adds a WebAssembly build of sharp usable in Node.js and Stackblitz environments.
There is quite a lot going on here to make it work, so I won't list all the implementation details I had to make it in the PR description, but instead would ask you to refer to the comments which hopefully explains the individual hacks; and, of course, feel free to ask any questions in the review. Some things were already upstreamed to Emscripten and emnapi, and others probably will be over time, hopefully simplifying the porting process for everyone, but for now some hacks remain.
The build script itself piggy-backs on the excellent work done by @kleisauke in wasm-vips with some modifications to use it only for building libvips rather than custom bindings.
One thing I'll emphasise is that this is a build made specifically for Node.js / Stackblitz with the intention of being 1:1 API-compatible - among other things, it means that Wasm instantiation is intentionally synchronous, and that it will use the native filesystem via Node.js raw filesystem. I had / have a
separate branch with an almost working browser build of sharp, but that requires some API changes to be usable in a browser without locking the main thread and with accepting e.g.
File
object instead of using virtual filesystem, so for now keeping it out of scope.The concurrency is currently limited to a fixed-size threadpool. While I made it possible to create threads on-demand in recent versions of Emscripten, there are still some issues and bugs when trying to use it with internal libvips threadpool, so for now keeping a fixed-size threadpool is a safer and time-tested option. Among other things, this will run only one concurrent sharp operation at a time. Also, in Stackblitz environment libvips will be limited to only 2 threads to keep memory usage under control - this is done because libvips already always needs +3 extra threads, and async emnapi operation will need yet another +1 thread per async operation, so the number of Workers quickly adds up.
Additionally, some formats and operations - namely, SVG, DZI and text operations - are currently unsupported just like they're in wasm-vips. Text (both on its own and in SVG) is notoriously difficult due to lots of questions around font loading (Local Font Access API, remote fonts, etc.), but other formats might come in time. For now, though, this Wasm build should already cover most common use-cases. I did have to modify some tests to use more widely supported formats as inputs though, as well as sprinke some conditional skips for
npm test --arch=wasm32
to work.Finally, for now I committed the build script together with prebuilt binaries as part of the PR, as it made testing easiest, but I'd ask the maintainers to integrate it properly into their "prebuilt addon" system somehow - I can't do that from my end, as I neither know how it works nor have access to the storage.
Oh, and for the curious here are the native x64 vs Wasm benchmarks executed with various concurrency options (raw CSV here):
Chart
As you can see, Wasm performs at around native speed in some benchmarks, but is ~2x slower where SIMD is required - which is expected, given that libvips heavily relies on runtime SIMD code generation, which is not yet supported for Wasm.
Fixes #3336.