Why isn't the first answer "use a Bazel-provided toolchain instead of the system toolchains"? This article is totally mad.
Yeah, making the entire toolchain hermetic and versioned is one of the main benefits of Bazel.
You can have every developer cross-compile from a different OS/CPU platform.
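For example (a sketch; the platform label is hypothetical and depends on how your toolchains are registered):

    # Build the same target from any host OS/CPU by selecting the target platform.
    # //bazel/platforms:linux_x86_64 is a made-up label for illustration.
    bazel build //service:server --platforms=//bazel/platforms:linux_x86_64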
You make it sound simple. Bazel's docs are minimal and what exists is often wrong. For a C or C++ toolchain you might turn to a contrib project such as toolchains_llvm, but even those experts are still figuring it out, and they give you little guidance on how to assemble your sysroot, which is the actual issue in the article. And, the upstream distributions of LLVM are built under certain assumptions, so unless you know exactly what you are doing, you probably also want to build LLVM yourself instead of using their archives. If you figure out how to build LLVM in a completely consistent and correct way with respect to the STL and support libraries, you will be perhaps the 3rd or 4th person in history to achieve it.
Yes: https://github.com/cerisier/toolchains_llvm_bootstrapped
llvm_toolchain or gcc-toolchain or the uber one are all possibilities
Because those Bazel toolchains don’t come with a glibc? How could they?
This one does, we use it with great success: https://github.com/cerisier/toolchains_llvm_bootstrapped
What? Of course they can, how is that hard?
You need that glibc on the target system, and glibc doesn’t like static linking. How do you ship the built binary?
You include the target glibc as part of your toolchain
And, I always prefer to actually pick my glibc independently from the host os and ship it with the binary (generally in an OCI image). This way you can patch/update the host without breaking the service.
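One way to wire that up (a sketch, not a prescription; the /app/lib layout is an assumption) is to bundle the loader and libc in the image and point the binary at them:

    # Copy your chosen glibc (dynamic loader, libc.so.6, etc.) into the image under
    # /app/lib, then rewrite the ELF interpreter and rpath so the binary never
    # touches the host's libc.
    patchelf --set-interpreter /app/lib/ld-linux-x86-64.so.2 \
             --set-rpath /app/lib \
             ./service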
Right, so you need to end up creating a container to package the glibc with the binary. Which is not very different from having sysroots.
But what about any tools compiled from source and used during the build? Those can also suffer from these issues.
Yeah, ideally you could build your toolchain from source on any platform. But, in some cases that’s not possible
Does that result in working NSS?
I normally statically link as much as possible and avoid nss, but you can make that work as well, just include it along with glibc.
Is musl not an option? It's best to avoid glibc altogether if possible.
no, you just need a compatible sysroot/glibc. you pick an old version (e.g. RHEL8) or otherwise compile with multiple toolchains if you need newer glibc features
Agreed.
Whether you use Bazel or not, this is a well-known issue with an equally well-known solution. There’s no need for such a lengthy write-up: just use a consistent sysroot across your entire build environment.
If you can’t create your own sysroot image, you can simply download Chromium’s prebuilt one and configure your C++ compile rules correctly. Problem solved.
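For reference, Chromium fetches its sysroot with a helper script in the source tree (invocation sketched from memory, so double-check the flags):

    # from a chromium checkout; downloads the prebuilt Debian sysroot
    python3 build/linux/sysroot_scripts/install-sysroot.py --arch=amd64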
Yeah I built a custom sysroot for Redpanda (Bazel/C++/Distributed Kafka) using a really simple docker image: https://github.com/redpanda-data/redpanda/blob/dev/bazel/too...
We also have a Dockerfile for Clang/LLVM in that repo, so the whole thing is hermetic. It’s a bit of a shame Bazel doesn’t come with stronger options/defaults here, because I feel like I want to reproduce this same toolchain on every C++ project with Bazel
Thanks for sharing. As a non-google bazel user this is quite helpful.
Sounds like a jerb for Docker
> the service crashes with a mysterious error: version 'GLIBC_2.28' not found
Mysterious?
This is by the way why many binary Python packages use https://github.com/pypa/manylinux for builds: if you build on an old glibc, your library will still (generally) work with newer versions.
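Roughly (a sketch; the image tag and Python version are placeholders):

    # build the wheel inside the old-glibc manylinux image, then vendor shared
    # libs and fix the platform tag with auditwheel (preinstalled in the image)
    docker run --rm -v "$PWD":/io quay.io/pypa/manylinux2014_x86_64 sh -c '
        /opt/python/cp311-cp311/bin/pip wheel /io -w /io/dist &&
        auditwheel repair /io/dist/*.whl -w /io/dist'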
JFC the author has done everything wrong here.
A heterogeneous build cluster with non-hermetic builds and shared caching. The fact that this is only a glibc symbol versioning problem and not something far more severe is, well, a blessing.
At the bare fucking minimum, I would expect the CI builds to have a homogeneous configuration and their own cache, not writable from outside CI. If you’re lazy, just wipe the cache every time you upgrade the CI cluster. Maybe I’ve just been living too long in environments where we care about artifact provenance and I’m blind to the whims of people who don’t care about provenance.
I want to feel sympathetic, because I know Bazel is a pain in the ass to learn, but it sounds like the author was juggling knives in a windstorm and caught a few pointy ends.
> The CI system picks it up, gets a cache hit from the developer’s build, and produces a release artifact.
Why would you cache developer builds on CI?
This sandboxed LLVM toolchain completely solves that problem: https://github.com/cerisier/toolchains_llvm_bootstrapped
One of the niceties of Zig's build system is its batteries-included support for targeting an explicitly provided glibc version. No need for alternate sysroots or dedicated old-distro containers like you would with the traditional C/C++ compilers, all you have to do is append it to the target triplet, like so:
    zig cc -target x86_64-linux-gnu.2.17 file.c
It’s not hard in Bazel either, it’s just not batteries-included.
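The unbatteried version is roughly this (a sketch; the sysroot path is an assumption):

    # .bazelrc: point every compile and link action at one pinned sysroot
    build --copt=--sysroot=external/sysroot
    build --linkopt=--sysroot=external/sysroot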
It appears the Bazel community is getting closer to inventing containers.
You don’t need containers. Just hermetic builds. Ideally every byte read during the compilation process comes from a file under your control, that you version and you test. That includes the compiler, glibc, and all of your dependencies.
Ambient, implicit dependencies are the devil’s playthings.
I don’t think there are any real insights there.
Containers tend to be coarse-grained. For example, maybe you are writing a program, so you put the entire build environment in a container. If you have lots of programs with different dependencies, do you need lots of containers? Do you need to rebuild lots of containers when dependencies change? Bazel is much more fine-grained, and in Bazel, each individual build action gets its own build environment.
By default, that build environment includes the host (build system) glibc, compiler, system headers, system libraries, etc. If you want repeatable builds, you turn that off, and give Bazel the exact glibc, compiler, libraries, etc. to use for building your program.
You get the isolation of containers, but much faster builds and better caching.
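Concretely, "turning that off" looks something like this (a sketch; the toolchain label is an assumption, here following toolchains_llvm's naming):

    # .bazelrc: don't let Bazel autodetect a C++ toolchain from the host...
    build --repo_env=BAZEL_DO_NOT_DETECT_CPP_TOOLCHAIN=1
    build --incompatible_strict_action_env
    # ...and register a pinned, downloaded toolchain instead
    build --extra_toolchains=@llvm_toolchain//:cc-toolchain-x86_64-linux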
Bazel comes with ironclad sandboxing features, but people don't use them because build correctness is such a chore.
Bazel has a number of strategies, including containers.
But that’s only a minority of what it does.
Am I the first to say "Nix solves this"? Nix effectively ships a sysroot with every binary!
This happens in CIs. It's happened to me on GitHub Actions. The answer is always sysroots.
I'm in the middle of submitting PRs to multiple projects because they are compiling on ubuntu-latest and forcing a glibc 2.38 requirement. These are multiplatform projects where few or none of the devs use Linux.
The first project I was able to change their workflow to build inside a 20.04 container. The other project uses Tauri and it requires some very recent libraries, so I don't know if an older container will work.
Do you have any documentation or generic recommendations for solving these issues caused by blindly using GitHub Actions for all compilations?
> The first project I was able to change their workflow to build inside a 20.04 container.
This approach does _not_ work, because you can end up with the `node` binary that runs GitHub Actions itself being unable to run; this will certainly happen if you end up using a sufficiently old container.
> Do you have any documentation or generic recommendations for solving these issues caused by blindly using GitHub Actions for all compilations?
Install these pkgs in an `ubuntu-latest` image:
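Something along these lines (a reconstruction sketch; `debootstrap`, the flags, and the mirror are assumptions, so verify them for your distro):

    sudo apt-get install -y debootstrap
    # bootstrap a minimal Debian root filesystem to use as the sysroot
    sudo debootstrap --variant=minbase "$DEBIAN_RELEASE" ./sysroot http://deb.debian.org/debian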
where you replace `DEBIAN_RELEASE` with the release you want to target, and then configure your project's build to use that sysroot.
That's it.
If your project does not support sysroots, make it do so. In general compilers will support sysroots, so it's just a matter of making your build configuration facility support sysroots.