Alon Zakai edited this page Feb 24, 2014 · 119 revisions

LLVM Backend, aka "fastcomp"

Fastcomp is a new compiler core for emscripten, replacing much of the original compiler core in src/*.js. It replaces only the JS compiler itself, not the toolchain code, the library code (src/library*.js), or the JS optimizer code. (For context, the core compiler is a few thousand lines of code, to be replaced with a few thousand other lines, whereas everything that is not being replaced is far larger.)

Why is this change happening?

Fastcomp is much more streamlined than the original compiler - the original compiler supports dozens of various code generation modes (no typed arrays, typed arrays in various modes, asm.js vs non-asm.js, etc.). Fastcomp on the other hand is directly focused on asm.js code generation, which has proven to give the best results.

Fastcomp, as a C++ LLVM backend, is much faster than the original JS compiler, often 4x faster or more. It also requires much less memory and avoids unpredictable pathological compiler slowdowns that the old compiler had.

Fastcomp also generates better code - by being an LLVM backend, it can integrate more tightly with LLVM.

Are there downsides?

The main downside is that Emscripten can no longer use a stock build of LLVM, because we have changes that must be built with LLVM.

Note that you can actually still use a stock build, but only by making emscripten use the original compiler (see next section). This is not recommended: you miss out on the benefits of the backend (see above), and you run a code path that is less tested.

This will hopefully be a temporary issue because the new Emscripten backend might get upstreamed to LLVM eventually, in which case a stock build would contain it.

Status of Original Compiler (and how to disable fastcomp)

The original compiler is still present, and you may want to use it if you need a feature not present in fastcomp. There should be very few such features, as everything not deprecated or planned to be rewritten has already been ported. However, if you do need one of those features, you can use the old compiler, by building with

EMCC_FAST_COMPILER=0 emcc [..]

so that EMCC_FAST_COMPILER is set in the environment to 0. This will turn off fastcomp.
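
If you drive emcc from a build script rather than from a shell, the same variable can be set on the child process environment. The following is only a minimal Python sketch: the helper names env_without_fastcomp and build_with_old_compiler are made up for illustration and are not part of emscripten, and it assumes emcc is on your PATH.

```python
import os
import subprocess


def env_without_fastcomp(base_env=None):
    """Copy an environment and set EMCC_FAST_COMPILER=0 in the copy."""
    env = dict(os.environ if base_env is None else base_env)
    env["EMCC_FAST_COMPILER"] = "0"  # "0" turns fastcomp off, as described above
    return env


def build_with_old_compiler(emcc_args):
    """Run emcc with fastcomp disabled (assumes emcc is on PATH)."""
    return subprocess.call(["emcc"] + list(emcc_args), env=env_without_fastcomp())
```

Copying the environment instead of modifying os.environ keeps the setting local to that one emcc invocation, so other builds started by the same script still use fastcomp.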

You can check if fastcomp is on or off by looking at debug output. For example, run emcc -v tests/hello_world.c and if fastcomp is on, then among the output will be

DEBUG    root: emscript: llvm backend: ...
DEBUG    root:   emscript: llvm backend took

That shows both the command used to run the backend, and how much time it took. If fastcomp is off, on the other hand, the old compiler is used, and you will instead see

DEBUG    root: emscript: ll=>js
DEBUG    root:   emscript: scan took ...
...
DEBUG    root: emcc step "emscript (llvm=>js)" took ...

This shows that the old compiler (ll=>js) is called, as well as how much time each step takes, and the total time. Again, this is the output for the old compiler, so hopefully you will never see it :)
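
If you want to make this check automatically (for example in a build bot), you can search the captured debug text for the marker strings quoted above. This is a rough sketch, not something emscripten provides: detect_compiler is a hypothetical helper, and it assumes the debug lines keep the wording shown on this page.

```python
def detect_compiler(debug_output):
    """Classify captured emcc debug text as 'fastcomp', 'original', or 'unknown'."""
    # Marker substrings are copied from the debug lines quoted on this page.
    if "emscript: llvm backend" in debug_output:
        return "fastcomp"
    if "emscript: ll=>js" in debug_output:
        return "original"
    return "unknown"


sample = "DEBUG    root: emscript: llvm backend: ..."
print(detect_compiler(sample))  # prints: fastcomp
```

Returning "unknown" when neither marker is found avoids guessing if debug output was not enabled or the log format has changed.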

Some features not present in fastcomp are:

  • Embind does not work, as it is not asm.js compatible yet, and the new backend is asm.js specific. There are plans to experiment with other approaches to binding between the languages that should eventually fix this.
  • Various deprecated settings.js options (e.g. FORCE_ALIGNMENT, HEAP_INIT, etc.) have no effect. You should receive a compile-time error if you use a setting that is not yet supported, unless the check for that setting has been missed.
  • Linking of asm.js shared modules (note that normal static linking as used by almost all projects works fine, it is just specifically the options MAIN_MODULE and SIDE_MODULE that do not work). This is not deprecated, but may need to be partially reconsidered, so it has not been ported to fastcomp.

Using fastcomp

Fastcomp is an LLVM backend. It is not in upstream LLVM because it is far too new for that (but hopefully it eventually will be), so you need to use the emscripten fork of LLVM. You can either build it from source or get it as part of the emscripten SDK.

This means that if you use another build of LLVM - like an older one you built yourself, or one from your linux distro's repos, etc. - it will not contain fastcomp. Emscripten will give an error about this, briefly explain the issue, and link to this page, where the instructions for running without fastcomp appear in an earlier section. The downsides to you will then be

  • Not taking advantage of the benefits of fastcomp (faster compilation, better generated code).
  • Using the old compiler which is deprecated and consequently less tested.

Getting Fastcomp

To use fastcomp, you need both emscripten (see the Tutorial) and the emscripten LLVM code, either from the SDK or from source. Instructions for building from source are as follows:

  • Clone the repository: git clone git@github.com:kripken/emscripten-fastcomp, which is based off of PNaCl's LLVM fork.
  • cd emscripten-fastcomp/tools/
  • Clone the clang fork: git clone git@github.com:kripken/emscripten-fastcomp-clang clang - note: into a dir named "clang"! This is based off PNaCl's clang fork.
  • Build it:
  • cd .. to get back to the root of the llvm checkout
  • mkdir build and then cd build
  • ../configure --enable-optimized --disable-assertions --enable-targets=host,js
  • (Alternatively, you can use CMake instead of configure: cmake .. -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD="X86;JSBackend" -DLLVM_INCLUDE_EXAMPLES=OFF -DLLVM_INCLUDE_TESTS=OFF, replace X86 if you are on something else.)
  • make -j 4 (or whatever number of cores you want to use)
  • Set it up in ~/.emscripten: set LLVM_ROOT to the path of the llvm checkout plus something like /build/Release/bin (look for where the clang binary shows up under build/).
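
As an illustration of the last step, the ~/.emscripten file uses Python syntax, so the LLVM_ROOT entry might look roughly like the fragment below. The checkout location ~/emscripten-fastcomp is only an assumption for the example, and the subdirectory under build/ can differ depending on how you configured the build, so check where the clang binary actually ended up.

```python
# Fragment of ~/.emscripten (the file is evaluated as Python).
import os

# Assumed checkout location: adjust to wherever you cloned emscripten-fastcomp.
# The directory should be the one that contains the built clang binary.
LLVM_ROOT = os.path.expanduser('~/emscripten-fastcomp/build/Release/bin')
```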

It used to be necessary to turn fastcomp on, by doing

   EMCC_FAST_COMPILER=1 ./emcc -O2 tests/hello_world.cpp

But when fastcomp is on by default (as it currently is on the incoming branch), that is not needed.

Branches

To get a stable snapshot of emscripten development, use the master branch in all repos: emscripten, emscripten-fastcomp and emscripten-fastcomp-clang (that is, in emscripten, in emscripten's LLVM fork and in emscripten's clang fork). You can also use the incoming branch in all three if you want the very latest changes, but those branches can sometimes have breakage.

Make sure to use the SAME branch in all three. Changes might land in emscripten incoming for example that will not work on the master branches of the other two.
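
One way to guard against mixing branches is to ask git which branch each checkout is on and compare the answers. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the directories you pass in are your own checkout locations, and git is assumed to be on your PATH.

```python
import subprocess


def current_branch(repo_dir):
    """Ask git which branch is checked out in repo_dir."""
    out = subprocess.check_output(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"], cwd=repo_dir
    )
    return out.decode().strip()


def branches_match(branches):
    """True when every entry of a {repo name: branch} mapping names the same branch."""
    return len(set(branches.values())) == 1


def check_checkouts(repo_dirs):
    """repo_dirs maps a repo name to its checkout directory (paths are yours to supply)."""
    branches = {name: current_branch(path) for name, path in repo_dirs.items()}
    return branches_match(branches), branches
```

Calling check_checkouts with the three checkout directories (emscripten, emscripten-fastcomp and emscripten-fastcomp-clang) returns whether they agree, along with the branch found in each, which makes a mismatch easy to report.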

Compilation Notes

  • If you are building a large project, you will need a 64-bit build of llvm+clang, as compiling and optimizing can take more memory than a 32-bit build can use.

  • To build a 64-bit version using CMake and Visual Studio, pass the -G "Visual Studio 10 Win64" directive to cmake. Note: VS 11/12 don't work yet.

  • If you want to build with MinGW instead and have it in your path, replace the -G directive above with -G "MinGW Makefiles", and run mingw32-make to build (not tested yet).

Backend code structure

The backend is in the repo linked to above, and the code is in lib/Target/JSBackend/. The main file is JSBackend.cpp, but the other files in that directory are important too.

There is also the I64 simplification pass, which is currently in lib/Transforms/NaCl/ExpandI64.cpp, but which should probably move to another directory. There are also some other passes there to lower exceptions, setjmp, etc.