2022-7-30-Binaryen-Readme
Binaryen
Binaryen 是一个用于 WebAssembly 的编译器和工具链基础设施库,用 C++ 编写。 它旨在使 编译成 WebAssembly 简单、快速和高效:
简单: Binaryen 提供了个简单的 C API,也可以
使用 JavaScript 调用. 它接受来自 类 WebAssembly
格式 的输入,也接受一个通用的 control flow graph.快速: Binaryen 的内部 IR 使用紧凑的数据结构,专为完全并行的代码生成和优化而设计,使用所有可用的 CPU 内核。 Binaryen 的 IR 也非常容易和快速地编译成 WebAssembly,因为它本质上是 WebAssembly 的一个子集。
高效:
Binaryen 的优化器有许多可以改进代码大小和速度的关卡pass(参见后面的概述)。 这些优化旨在使 Binaryen 足够强大,可以单独用作编译器后端。 一个特定的关注领域是特定于 WebAssembly 的优化(通用编译器可能不会这样做),您可以将其视为 wasm minification,类似于 JavaScript、CSS 等的缩小版,所有这些都是具体的语言。
Compilers using Binaryen include:
AssemblyScript
它将 TypeScript 的变体编译为 WebAssemblywasm2js
支持将 WebAssembly 编译成 JSAsterius
which compiles Haskell to WebAssemblyGrain
which compiles Grain to WebAssembly
Binaryen also provides a set of toolchain utilities that can
- Parse and emit WebAssembly. In particular this lets you load
WebAssembly, optimize it using Binaryen, and re-emit it, thus implementing a
wasm-to-wasm optimizer in a single command. - Interpret WebAssembly as well as run the WebAssembly spec tests.
- Integrate with Emscripten in order to provide a
complete compiler toolchain from C and C++ to WebAssembly. - Polyfill WebAssembly by running it in the interpreter compiled to
JavaScript, if the browser does not yet have native support (useful for
testing).
Consult the contributing instructions if you’re interested in
participating.
Binaryen IR
Binaryen’s internal IR is designed to be
- Flexible and fast for optimization.
- As close as possible to WebAssembly so it is simple and fast to convert
it to and from WebAssembly.
There are a few differences between Binaryen IR and the WebAssembly language:
- Tree structure
- Binaryen IR is a tree, i.e., it has hierarchical structure,
for convenience of optimization. This differs from the WebAssembly binary
format which is a stack machine. - Consequently Binaryen’s text format allows only s-expressions.
WebAssembly’s official text format is primarily a linear instruction list
(with s-expression extensions). Binaryen can’t read the linear style, but
it can read a wasm text file if it contains only s-expressions. - Binaryen uses Stack IR to optimize “stacky” code (that can’t be
represented in structured form). - When stacky code must be represented in Binaryen IR, such as with
multivalue instructions and blocks, it is represented with tuple types that
do not exist in the WebAssembly language. In addition to multivalue
instructions, locals and globals can also have tuple types in Binaryen IR
but not in WebAssembly. Experiments show that better support for
multivalue could enable useful but small code size savings of 1-3%, so it
has not been worth changing the core IR structure to support it better. - Block input values (currently only supported in
catch
blocks in the
exception handling feature) are represented aspop
subexpressions.
- Binaryen IR is a tree, i.e., it has hierarchical structure,
- Types and unreachable code
- WebAssembly 将块/if/loop 类型限制为 none 和具体值类型(i32、i64、f32、f64)。Binaryen IR 有一个不可访问的类型,它允许 block/if/loop 接受它,允许不需要知道全局上下文的本地转换。因此,Binaryen 的默认文本输出不一定是有效的 wasm 文本。(要获得有效的 wasm 文本,您可以执行
--generate-stack-ir --print-stack-ir
打印 Stack IR 的操作,这保证对 wasm 解析器有效。) - Binaryen 在读取 WebAssembly 二进制文件时会忽略无法访问的代码。这意味着如果您读取包含无法访问代码的 wasm 文件,该代码将被丢弃,就好像它已被优化一样(通常这就是您想要的,优化的程序无论如何都没有无法访问的代码,但是如果您编写一个未优化的文件并且然后阅读它,它可能看起来不同)。这种行为的原因是 WebAssembly 中无法访问的代码具有在 Binaryen IR 中难以处理的极端情况(它可能非常非结构化,并且如前所述,Binaryen IR 比 WebAssembly 更具结构化)。请注意,Binaryen 确实支持 .wat 文本文件中无法访问的代码,因为正如我们所见,Binaryen 仅支持结构化的 s 表达式。
- WebAssembly 将块/if/loop 类型限制为 none 和具体值类型(i32、i64、f32、f64)。Binaryen IR 有一个不可访问的类型,它允许 block/if/loop 接受它,允许不需要知道全局上下文的本地转换。因此,Binaryen 的默认文本输出不一定是有效的 wasm 文本。(要获得有效的 wasm 文本,您可以执行
- Blocks
- Binaryen IR 只有一个节点包含可变长度的操作数列表:块blocks。另一方面,WebAssembly 允许循环中的列表、if 臂和函数的顶层。Binaryen 的 IR 对所有非块节点都有一个操作数;这个操作数当然可以是一个块。这个属性的动机是许多通道需要特殊的代码来迭代列表,因此使用带有列表的单个 IR 节点可以简化它们。
- As in wasm, blocks and loops may have names. Branch targets in the IR are
resolved by name (as opposed to nesting depth). This has 2 consequences:- Blocks without names may not be branch targets.
- Names are required to be unique. (Reading .wat files with duplicate names
is supported; the names are modified when the IR is constructed).
- As an optimization, a block that is the child of a loop (or if arm, or
function toplevel) and which has no branches targeting it will not be
emitted when generating wasm. Instead its list of operands will be directly
used in the containing node. Such a block is sometimes called an “implicit
block”.
- Reference Types
- The wasm text and binary formats require that a function whose address is
taken byref.func
must be either in the table, or declared via an(elem declare func $..)
. Binaryen will emit that data when necessary, but
it does not represent it in IR. That is, IR can be worked on without needing
to think about declaring function references.
As a result, you might notice that round-trip conversions (wasm => Binaryen IR
=> wasm) change code a little in some corner cases.
- When optimizing Binaryen uses an additional IR, Stack IR (see
src/wasm-stack.h
). Stack IR allows a bunch of optimizations that are
tailored for the stack machine form of WebAssembly’s binary format (but Stack
IR is less efficient for general optimizations than the main Binaryen IR). If
you have a wasm file that has been particularly well-optimized, a simple
round-trip conversion (just read and write, without optimization) may cause
more noticeable differences, as Binaryen fits it into Binaryen IR’s more
structured format. If you also optimize during the round-trip conversion then
Stack IR opts will be run and the final wasm will be better optimized.
Notes when working with Binaryen IR:
- 如上所述,Binaryen IR 具有树结构。因此,每个表达式都应该只有一个父节点 - 您不应该通过让节点在树中多次出现来“重用”节点。这个限制的动机是当我们优化时我们修改节点,所以如果它们在树中出现多次,一个地方的变化可能会错误地出现在另一个地方
- For similar reasons, nodes should not appear in more than one functions.
Intrinsics
Binaryen intrinsic functions look like calls to imports, e.g.,
1 | (import "binaryen-intrinsics" "foo" (func $foo)) |
Implementing them that way allows them to be read and written by other tools,
and it avoids confusing errors on a binary format error that could happen in
those tools if we had a custom binary format extension.
An intrinsic method may be optimized away by the optimizer. If it is not, it
must be lowered before shipping the wasm, as otherwise it will look like a
call to an import that does not exist (and VMs will show an error on not having
a proper value for that import). That final lowering is not done
automatically. A user of intrinsics must run the pass for that explicitly,
because the tools do not know when the user intends to finish optimizing, as the
user may have a pipeline of multiple optimization steps, or may be doing local
experimentation, or fuzzing/reducing, etc. Only the user knows when the final
optimization happens before the wasm is “final” and ready to be shipped. Note
that, in general, some additional optimizations may be possible after the final
lowering, and so a useful pattern is to optimize once normally with intrinsics,
then lower them away, then optimize after that, e.g.:
1 | wasm-opt input.wasm -o output.wasm -O --intrinsic-lowering -O |
Each intrinsic defines its semantics, which includes what the optimizer is
allowed to do with it and what the final lowering will turn it to. See
intrinsics.h
for the detailed definitions. A quick summary appears here:
call.without.effects
: Similar to acall_ref
in that it receives
parameters, and a reference to a function to call, and calls that function
with those parameters, except that the optimizer can assume the call has no
side effects, and may be able to optimize it out (if it does not have a
result that is used, generally).
Tools
该仓库包含了 bin/
目录下工具的代码:
- wasm-opt: 加载 WebAssembly 并在其上运行 Binaryen IR pass。
- wasm-as: 将文本格式(当前为 S-Expression 格式)的 WebAssembly 组装成二进制格式(通过 Binaryen IR)。
- wasm-dis: 将二进制格式的 WebAssembly 反汇编成文本格式(通过 Binaryen IR)。
- wasm2js: WebAssembly-to-JS 编译器。Emscripten 使用它来生成 JavaScript 作为 WebAssembly 的替代方案。
- wasm-reduce: WebAssembly 文件的测试用例缩减器。给定一个由于某种原因有趣的 wasm 文件(例如,它使特定的 VM 崩溃),wasm-reduce 可以找到一个具有相同属性的较小的 wasm 文件,这通常更容易调试。有关更多详细信息,请参阅 文档 。
- wasm-shell: 一个可以加载和解释 WebAssembly 代码的 shell。它还可以运行规范测试套件。
- wasm-emscripten-finalize: 获取由 llvm+lld 生成的 wasm 二进制文件,并对其执行特定的 emscripten pass。
- wasm-ctor-eval: 一个可以在编译时执行函数(或部分函数)的工具。
- binaryen.js: 一个独立的 JavaScript 库,公开了用于创建和优化 Wasm 模块的 Binaryen 方法。对于构建,请参阅 npm 上的 binaryen.js(或直接从 github、rawgit 或 unpkg 下载)。最低要求:Node.js v15.8 或 Chrome v75 或 Firefox v78。
Usage instructions for each are below.
Binaryen 优化
Binaryen 包含了
一些优化 passes,可以让 WebAssembly 变得体量更小而且更快速。 你可以使用 wasm-opt
来运行 Binaryen 优化器, 你也可以使用其他工具来运行优化器,比如wasm2js
和 wasm-metadce
.
- 默认的优化流水线是在
addDefaultFunctionOptimizationPasses
. - 你可以设置各种
pass 设置
以调整优化和收缩级别、是否忽略不太可能的陷阱、内联启发式、快速数学等。请参阅wasm-opt --help
如何设置它们和其他详细信息。
See each optimization pass for details of what it does, but here is a quick
overview of some of the relevant ones:
- CoalesceLocals - Key “register allocation” pass. 可以进行生命周期分析,然后重用局部变量以减少局部变量数量,并尽可能的减少局部变量的复制.
- CodeFolding - 通过合并避免重复代码(例如,如果两个if分支在其末端有一些共享指令)。
- CodePushing - “Pushes” code forward past branch operations, potentially
allowing the code to not be run if the branch is taken. - DeadArgumentElimination - LTO pass 旨在在始终使用相同的常量调用函数的情况下,删除方法参数。
- DeadCodeElimination
- Directize - 当表索引不变时,将间接调用转换为正常调用。
- DuplicateFunctionElimination - LTO pass.
- Inlining - LTO pass.
- LocalCSE - 简单的局部公共子表达式消除。
- LoopInvariantCodeMotion
- MemoryPacking - Key “optimize data segments” pass that combines segments,
removes unneeded parts, etc. - MergeBlocks - Merge a
block
to an outer one where possible, reducing
their number. - MergeLocals - When two locals have the same value in part of their
overlap, pick in a way to help CoalesceLocals do better later (split off from
CoalesceLocals to keep the latter simple). - MinifyImportsAndExports - Minifies them to “a”, “b”, etc.
- OptimizeAddedConstants - Optimize a load/store with an added constant into
a constant offset. - OptimizeInstructions - Key peephole optimization pass with a constantly
increasing list of patterns. - PickLoadSigns - Adjust whether a load is signed or unsigned in order to
avoid sign/unsign operations later. - Precompute - Calculates constant expressions at compile time, using the
built-in interpreter (which is guaranteed to be able to handle any constant
expression). - ReReloop - Transforms wasm structured control flow to a CFG and then goes
back to structured form using the Relooper algorithm, which may find more
optimal shapes. - RedundantSetElimination - Removes a
local.set
of a value that is already
present in a local. (Overlaps with CoalesceLocals; this achieves the specific
operation just mentioned without all the other work CoalesceLocals does, and
therefore is useful in other places in the optimization pipeline.) - RemoveUnsedBrs - Key “minor control flow optimizations” pass, including
jump threading and various transforms that can get rid of abr
orbr_table
(like turning ablock
with abr
in the middle into anif
when possible). - RemoveUnusedModuleElements - “Global DCE”, an LTO pass that removes
imports, functions, globals, etc., when they are not used. - ReorderFunctions - Put more-called functions first, potentially allowing
the LEB emitted to call them to be smaller (in a very large program). - ReorderLocals - Put more-used locals first, potentially allowing the LEB
emitted to use them to be smaller (in a very large function). After the
sorting, it also removes locals not used at all. - SimplifyGlobals - Optimizes globals in various ways, for example,
coalescing them, removing mutability from a global never modified, applying a
constant value from an immutable global, etc. - SimplifyLocals - Key “
local.get/set/tee
” optimization pass, doing things
like replacing a set and a get with moving the set’s value to the get (and
creating a tee) where possible. Also createsblock/if/loop
return values
instead of using a local to pass the value. - Vacuum - Key “remove silly unneeded code” pass, doing things like removing
anif
arm that has no contents, a drop of a constant value with no side
effects, ablock
with a single child, etc.
“LTO” in the above means an optimization is Link Time Optimization-like in that
it works across multiple functions, but in a sense Binaryen is always “LTO” as
it usually is run on the final linked wasm.
Advanced optimization techniques in the Binaryen optimizer include
SSAification,
Flat IR, and
Stack/Poppy IR.
Binaryen also contains various passes that do other things than optimizations,
like
legalization for JavaScript,
Asyncify,
etc.
Building
Binaryen uses git submodules (at time of writing just for gtest), so before you build you will have to initialize the submodules:
1 | git submodule init |
After that you can build with CMake:
1 | cmake . && make |
A C++17 compiler is required. Note that you can also use ninja
as your generator: cmake -G Ninja . && ninja
.
To avoid the gtest dependency, you can pass -DBUILD_TESTS=OFF
to cmake.
Binaryen.js can be built using Emscripten, which can be installed via the SDK).
1 | emcmake cmake . && emmake make binaryen_js |
Visual C++
Using the Microsoft Visual Studio Installer, install the “Visual C++ tools for CMake” component.
Generate the projects:
1
2
3mkdir build
cd build
"%VISUAL_STUDIO_ROOT%\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" ..Substitute VISUAL_STUDIO_ROOT with the path to your Visual Studio
installation. In case you are using the Visual Studio Build Tools, the path
will be “C:\Program Files (x86)\Microsoft Visual Studio\2017\BuildTools”.From the Developer Command Prompt, build the desired projects:
1
msbuild binaryen.vcxproj
CMake generates a project named “ALL_BUILD.vcxproj” for conveniently building all the projects.
Running
wasm-opt
Run
1 | bin/wasm-opt [.wasm or .wat file] [options] [passes, see --help] [--help] |
wasm优化器以 WebAssembly 文件作为输入,然后执行转换pass,(在转换过程之前或之后)输出执行结果。比如,
1 | bin/wasm-opt test/lit/passes/name-types.wast -all -S -o - |
这将输出测试套件中的一个测试用例。要对其进行转换,请尝试执行以下语句:
1 | bin/wasm-opt test/lit/passes/name-types.wast --name-types -all -S -o - |
name-types
pass 会确保每个类型都有一个名称并重命名异常长的类型名称。通过比较两个命令的输出,您可以看到转换导致的变化。
将您自己的转换过程添加到 shell 很容易,只需将 .cpp
文件添加到 src/passes
中,然后重新构建 shell。可以参考 name-types
pass 的代码.
Some more notes:
- See
bin/wasm-opt --help
for the full list of options and passes. - Passing
--debug
will emit some debugging info.
wasm2js
Run
1 | bin/wasm2js [input.wasm file] |
This will print out JavaScript to the console.
For example, try
1 | bin/wasm2js test/hello_world.wat |
That output contains
1 | function add(x, y) { |
as a translation of
1 | (func $add (; 0 ;) (type $0) (param $x i32) (param $y i32) (result i32) |
wasm2js’s output is in ES6 module format - basically, it converts a wasm
module into an ES6 module (to run on older browsers and Node.js versions
you can use Babel etc. to convert it to ES5). Let’s look at a full example
of calling that hello world wat; first, create the main JS file:
1 | // main.mjs |
The run this (note that you need a new enough Node.js with ES6 module
support):
1 | bin/wasm2js test/hello_world.wat -o hello_world.mjs |
Things keep to in mind with wasm2js’s output:
- You should run wasm2js with optimizations for release builds, using
-O
or another optimization level. That will optimize along the entire pipeline
(wasm and JS). It won’t do everything a JS minifer would, though, like
minify whitespace, so you should still run a normal JS minifer afterwards. - It is not possible to match WebAssembly semantics 100% precisely with fast
JavaScript code. For example, every load and store may trap, and to make
JavaScript do the same we’d need to add checks everywhere, which would be
large and slow. Instead, wasm2js assumes loads and stores do not trap, that
int/float conversions do not trap, and so forth. There may also be slight
differences in corner cases of conversions, like non-trapping float to int.
wasm-ctor-eval
wasm-ctor-eval
executes functions, or parts of them, at compile time.
After doing so it serializes the runtime state into the wasm, which is like
taking a “snapshot”. When the wasm is later loaded and run in a VM, it will
continue execution from that point, without re-doing the work that was already
executed.
For example, consider this small program:
1 | (module |
We can evaluate part of it at compile time like this:
1 | wasm-ctor-eval input.wat --ctors=main -S -o - |
This tells it that there is a single function that we want to execute (“ctor”
is short for “global constructor”, a name that comes from code that is executed
before a program’s entry point) and then to print it as text to stdout
. The
result is this:
1 | trying to eval main |
The logging shows us managing to eval part of main()
, but not all of it, as
expected: We can eval the first global.get
, but then we stop at the call to
the imported function (because we don’t know what that function will be when the
wasm is actually run in a VM later). Note how in the output wasm the global’s
value has been updated from 0 to 1, and that the first global.get
has been
removed: the wasm is now in a state that, when we run it in a VM, will seamlessly
continue to run from the point at which wasm-ctor-eval
stopped.
In this tiny example we just saved a small amount of work. How much work can be
saved depends on your program. (It can help to do pure computation up front, and
leave calls to imports to as late as possible.)
Note that wasm-ctor-eval
‘s name is related to global constructor functions,
as mentioned earlier, but there is no limitation on what you can execute here.
Any export from the wasm can be executed, if its contents are suitable. For
example, in Emscripten wasm-ctor-eval
is even run on main()
when possible.
Testing
1 | ./check.py |
(or python check.py
) will run wasm-shell
, wasm-opt
, etc. on the testcases in test/
, and verify their outputs.
The check.py
script supports some options:
1 | ./check.py [--interpreter=/path/to/interpreter] [TEST1] [TEST2].. |
- If an interpreter is provided, we run the output through it, checking for
parse errors. - If tests are provided, we run exactly those. If none are provided, we run
them all. To see what tests are available, run./check.py --list-suites
. - Some tests require
emcc
ornodejs
in the path. They will not run if the
tool cannot be found, and you’ll see a warning. - We have tests from upstream in
tests/spec
, in git submodules. Running./check.py
should update those.
Note that we are trying to gradually port the legacy wasm-opt tests to use lit
and filecheck
as we modify them.
For passes
tests that output wast, this can be done automatically with scripts/port_passes_tests_to_lit.py
and for non-passes
tests that output wast, see
https://github.com/WebAssembly/binaryen/pull/4779 for an example of how to do a simple manual port.
Setting up dependencies
1 | ./third_party/setup.py [mozjs|v8|wabt|all] |
(or python third_party/setup.py
) installs required dependencies like the SpiderMonkey JS shell, the V8 JS shell
and WABT in third_party/
. Other scripts automatically pick these up when installed.
Run pip3 install -r requirements-dev.txt
to get the requirements for the lit
tests. Note that you need to have the location pip
installs to in your $PATH
(on linux, ~/.local/bin
).
Fuzzing
1 | ./scripts/fuzz_opt.py [--binaryen-bin=build/bin] |
(or python scripts/fuzz_opt.py
) will run various fuzzing modes on random inputs with random passes until it finds
a possible bug. See the wiki page for all the details.
Design Principles
- Interned strings for names: It’s very convenient to have names on nodes,
instead of just numeric indices etc. To avoid most of the performance
difference between strings and numeric indices, all strings are interned,
which means there is a single copy of each string in memory, string
comparisons are just a pointer comparison, etc. - Allocate in arenas: Based on experience with other
optimizing/transformating toolchains, it’s not worth the overhead to
carefully track memory of individual nodes. Instead, we allocate all elements
of a module in an arena, and the entire arena can be freed when the module is
no longer needed.
FAQ
- Why the weird name for the project?
“Binaryen” is a combination of binary - since WebAssembly is a binary format
for the web - and Emscripten - with which it can integrate in order to
compile C and C++ all the way to WebAssembly, via asm.js. Binaryen began as
Emscripten’s WebAssembly processing library (wasm-emscripten
).
“Binaryen” is pronounced in the same manner as “Targaryen“: bi-NAIR-ee-in. Or something like that? Anyhow, however Targaryen is correctly pronounced, they should rhyme. Aside from pronunciation, the Targaryen house words, “Fire and Blood”, have also inspired Binaryen’s: “Code and Bugs.”
- Does it compile under Windows and/or Visual Studio?
Yes, it does. Here’s a step-by-step tutorial on how to compile it
under Windows 10 x64 with with CMake and Visual Studio 2015.
However, Visual Studio 2017 may now be required. Help would be appreciated on
Windows and OS X as most of the core devs are on Linux.
2022-7-30-Binaryen-Readme
install_url
to use ShareThis. Please set it in _config.yml
.