Static linking in Reason


Hi :wave:

This is a quick write-up of what we have so far with static linking and where we’re headed.

We’re working on the 0.6.8 release of esy, which will
be statically linked for Linux distros. The prebuilt binaries were having
trouble with different runtimes (glibc), and it became necessary to bundle
the dependencies so that they run reliably in different environments. Along the way,
we learnt some lessons. This post shares those lessons and, most
importantly, unveils our plans for the tooling so that you don’t have to
go through the same hardships all over again.

What is static linking?

Quick refresher for those unfamiliar: native binaries built with the
Reason toolchain can depend on libraries that are installed
separately. For instance, if you use postgres via
caqti, which depends on ocaml-postgres, you’ll need the postgres library,
libpq, either in the esy sandbox or system-wide. Listing all the dependencies with ldd ./path/to/your/binary will include an entry like this:

	libpq.so.5 => /usr/lib/libpq.so.5 (0x7f7b3996f000)

The target machine where the binary is expected to run needs a
matching version of libpq (libpq.so.5 here) installed, or your binary will fail to run.

Statically linking your binaries can avoid this by bundling the
libpq library into your binary itself.

	$ ldd path/to/my/binary
	statically linked

If this still sounds unfamiliar but you are familiar with JavaScript,
this is like using a bundler such as webpack to bundle all your npm
dependencies so that you don’t need them at runtime, thereby
letting you have a lean package.json. In the native world, getting a
dependency right is not just a matter of API compatibility, but of binary compatibility too.
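
As a rough sketch of how the two kinds of ldd output shown above could be told apart in a script, the helper below reads ldd's output on stdin and prints static or dynamic. The function name classify_ldd_output is made up for illustration, and the matched strings are assumptions: "statically linked" is taken from the output above, and "not a dynamic executable" is what glibc's ldd commonly prints for static binaries. Other libc implementations may word it differently.

```shell
# Hypothetical helper: classify text in the format printed by `ldd`.
# Reads stdin, prints "static" or "dynamic".
# Assumed usage: ldd path/to/my/binary 2>&1 | classify_ldd_output
classify_ldd_output() {
  # Both marker strings are assumptions about ldd's wording (see above).
  if grep -q -e 'statically linked' -e 'not a dynamic executable'; then
    echo static
  else
    echo dynamic
  fi
}
```

Anything that does not carry one of the two markers is reported as dynamic, so an unexpected ldd message errs on the side of "still has runtime dependencies".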

How to bundle your binaries with all the dependencies?

Not all platforms support static linking equally. Our best bet right
now is Linux with gcc -static. Static linking on macOS is poorly
understood. Windows has official recommendations on how to do it,
but Linux is by far the most reliable here.

To start with, we can use gcc -static to bundle all dynamic
dependencies. This means our build system has to pass -static through
to the C compiler via the ocamlopt flags (ocamloptFlags), for example
as -ccopt -static. Rudi Grinberg has a great post on
how to do this here.

Getting the first build statically linked requires, preferably,
musl libc, the correct C compiler options,
preferably the static+musl variant of the OCaml compiler distribution,
and an endless chase of linker messages when things don’t go your way.

Avoiding the common class of issues

I believe we can avoid a large pool of these errors by encoding all the
knowledge in tools. Template generators like spin,
modern-ocaml and mkocaml can provide docker
images out of the box. I’m building pesy (with help from you contributors,
of course), which helps in this regard to an extent.

pesy is a layer in front of the Dune build system that accepts build config
from the same package.json and generates Dune files on the fly. As
mentioned in rgrinberg’s blog post, static linking requires passing ocamlopt_flags in the dune
file, which often results in patches like these:

diff --git a/bin/dune b/bin/dune
index d94b6ec2..abd02e00 100644
--- a/bin/dune
+++ b/bin/dune
@@ -4,6 +4,7 @@
  (preprocess
   (pps lwt_ppx ppx_let ppx_deriving_yojson ppx_deriving.std))
  (flags
+  (-ccopt -static)
   (:standard
    (-w -39)
    "-open"
diff --git a/esy-build-package/bin/dune b/esy-build-package/bin/dune
index 1c0b8479..74301b9d 100644
--- a/esy-build-package/bin/dune
+++ b/esy-build-package/bin/dune
@@ -3,6 +3,7 @@
  (public_name esyBuildPackageCommand)
  (modules esyBuildPackageCommand)
  (flags
+  (-ccopt -static)
   (:standard
    (-w -39)))
  (preprocess

In fact, this is exactly how we do it at esy.

In some cases, a platform’s memory layout policies (ASLR) could mean
that PIC is enabled on some machines and not on
others. Additional compiler flags could be needed to ensure the build environment is uniform everywhere.
pesy can function as a tool that handles this for you, so that all you need to do
is set a JSON field to true when needed:

{
    "myBinary": {
        "imports": [
            "Tables = require('../tables')",
            "Morph = require('morph')",
            "Pbkdf = require('@opam/pbkdf')",
            "Middlewares = require('../middlewares')",
            "Str = require('str')"
        ],
        "static": true
    }
}
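
On the point about PIC differing between machines: whether position-independent executables (PIE) are produced by default depends on how the C compiler itself was built. As a hedged sketch (this is not something pesy is stated to do), the helper below reads the text printed by gcc -v on stdin and looks for the --enable-default-pie configure option, which distributions that default to PIE typically pass when building gcc. The function name default_pie and the labels it prints are hypothetical.

```shell
# Hypothetical helper: report whether a gcc build defaults to PIE.
# Reads the text of `gcc -v` on stdin.
# Assumed usage: gcc -v 2>&1 | default_pie
default_pie() {
  # -e lets the pattern start with dashes without being parsed as an option.
  if grep -q -e '--enable-default-pie'; then
    echo pie-by-default
  else
    echo no-default-pie
  fi
}
```

Running this on the build machine and inside the CI image is one way to spot the kind of mismatch described above before chasing linker errors.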

Because pesy also sets up the CI for you, docker images that are
tried and tested can be provided out of the box. Simpler build config
and out-of-the-box CI setup can avoid a lot of pain for common cases.

Full Reason/OCaml implementations

By far the most challenging aspect of static linking can be
library-specific compiler flags. Linker flags could be missing from
intermediate packages in the dependency tree, putting the burden on the application
developer to provide the right linker flags.

There is good news, albeit news that needs patience.

The OCaml compiler already builds statically by default. You might have
heard this a lot. It’s true. But here’s the caveat: ocamlopt can only
link the Reason/OCaml portion of the dependency tree statically by
default. The rest, the C libraries, are passed on to the system-wide
linker.
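
To see which C libraries a binary still expects from the system, the "name => path" lines that ldd prints can be filtered down to just the library names. This is a minimal sketch: the function name needed_libs is made up for illustration, and it assumes ldd's output follows the same layout as the libpq.so.5 line shown earlier.

```shell
# Hypothetical helper: list the shared libraries named in `ldd` output.
# Reads stdin and prints the first field of every "name => path" line.
# Assumed usage: ldd ./path/to/your/binary | needed_libs
needed_libs() {
  awk '$2 == "=>" { print $1 }'
}
```

Every name it prints is a library that was left for the dynamic loader to find on the target machine rather than being linked into the binary.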

This is the primary reason to prefer pure OCaml implementations to
bindings to C libraries. Static linking is much simpler with libraries
like pgx than with ocaml-postgres. Libraries written in pure Reason/OCaml
can also be compiled for other platforms easily. Bindings are really convenient and practical,
but pure implementations have their benefits too. The story of ocaml-tls is an inspiring
one in this regard.

Static linking can be much more convenient if there are more such implementations, and this requires active
community participation in the development of such libraries.

Hope this sheds some light on where we are with static linking in
native Reason/OCaml. Updates to the tooling will be shared from time to
time here. Feel free to reach out if you have trouble.

Happy hacking!