Static Bundle Object: Modernizing Static Linking

(medium.com)

34 points | by ingve 5 days ago

21 comments

  • stabbles 2 days ago

    I would be really happy if instead of a dumb tool like

        ar -r libexample.a f.o g.o
    
    it was like creating shared libraries

        ld --static -o libexample.a f.o g.o -L /somewhere/lib -ldep1 -ldep2 -rpath /somewhere/lib
    
    And that this would make the linker:

    1. check whether all "undefined" symbols can be resolved in dependencies mentioned in `-l`, otherwise fail

    2. store metadata of "used" libraries.

    So, it would create an archive `libexample.a` containing:

        f.o
        g.o
        METADATA
    
    where METADATA contains the search path `/somewhere/lib` and needed libraries `dep1` and `dep2`.

    So that ultimately when you compile and link `gcc foo.c -lexample`, the linker resolves the dependencies just like in shared linking.
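
    The two linker steps and the METADATA lookup described above can be sketched as a toy model. This is a hypothetical illustration, not an existing `ld` feature: `StaticLib`, `create_static_lib`, and `link_order` are names invented here, and symbol tables are reduced to plain sets.

```python
from dataclasses import dataclass, field


@dataclass
class StaticLib:
    """Toy stand-in for a .a archive that carries a METADATA member."""
    name: str
    defined: set                      # symbols defined by the member objects
    undefined: set                    # symbols the members still reference
    needed: list = field(default_factory=list)  # METADATA: recorded -l names


def create_static_lib(name, defined, undefined, needed, available):
    """Step 1: fail on unresolved symbols. Step 2: record the used libraries."""
    provided = set().union(*(available[dep].defined for dep in needed))
    missing = set(undefined) - provided
    if missing:
        raise ValueError(f"unresolved symbols in lib{name}.a: {sorted(missing)}")
    return StaticLib(name, set(defined), set(undefined), list(needed))


def link_order(root, available):
    """Expand `-l<root>` into a full library list, users before dependencies."""
    postorder, seen = [], set()

    def visit(lib_name):
        if lib_name in seen:
            return
        seen.add(lib_name)
        for dep in available[lib_name].needed:
            visit(dep)
        postorder.append(lib_name)

    visit(root)
    return postorder[::-1]  # reverse postorder = topological order


libs = {}
libs["dep1"] = create_static_lib("dep1", {"d1"}, set(), [], libs)
libs["dep2"] = create_static_lib("dep2", {"d2"}, {"d1"}, ["dep1"], libs)
libs["example"] = create_static_lib(
    "example", {"f", "g"}, {"d1", "d2"}, ["dep1", "dep2"], libs
)
print(link_order("example", libs))  # ['example', 'dep2', 'dep1']
```

    In this model, `-lexample` alone is enough: the recorded metadata pulls in `dep2` and `dep1` and places each library after the ones that use it.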

    • eyalitki 2 days ago

      OP here, it was also my opinion that the handling of static libraries should be significantly improved, and that ld should have a "--static-lib" flag to handle it properly. Sadly, the ELF committee prefers a more subtle approach, hence even the proposed build of the static bundle object is done on top of the existing "ld -r", and the output is still wrapped inside a .a archive for compatibility.

      I hope that once it is integrated into linkers, the adoption of this new format will help convince people that it deserves significantly better tooling.

  • tux3 2 days ago

    Very nice to see an update!

    I have to admit the pushback on ET_STAT is not completely unexpected. I can see why the gABI thread would argue that it can be solved without a new e_type. It would be a major change that requires updating all toolchains, on the consumer and producer side. It would certainly take over a decade for the support to percolate everywhere, including some of the mobile and embedded toolchains where I've had these static linking & symbol visibility issues before.

    The reason I still wouldn't mind ET_STAT is that I feel it does remove some complexity in the long run by not relying on things like ar archives, their hidden members with special meaning, and their different flavors. We will still be working with ELF many decades from now, so I still see some value in doing these slow migrations that simplify the final design. I would be happy if fifty or a hundred years from now ar archives become history, like ELF has done to the a.out format.

    That aside, there are also some interesting alternative ideas in the gABI thread (STB_LIBLOCAL is intriguing), but it seems like the path forward is through the toolchain first.

    I'd consider it a very nice result if third-party tools like armerge are deprecated by upstream support in the native toolchain (but I appreciate the mention in the article!)

    And thank you for continuing to work on this, really appreciate the work you've been doing there.

    • eyalitki 2 days ago

      Thanks, appreciate your feedback. Crossing my fingers that my PR for GNU ld will go as planned.

  • dfe a day ago

    It’s been a while but I’m pretty sure the Mach-O linker on macOS has exactly the feature you are looking for.

    Basically there is a linker flag to produce a .o while maintaining relocation info.

    This can then be fed into another linker later, but the important point is that internal symbols can be stripped, or even just remain internal through namespacing.

  • Panzerschrek 2 days ago

    Modernizing static libraries doesn't solve their main problem. They still contain compiled binary code, which the linker uses mostly as is. That may have been fine 40 years ago, but nowadays this limitation leads to resulting binaries with suboptimal performance. Something better should be used instead, like compiler-dependent libraries containing intermediate code, which may be further composed and optimized, as happens with LTO for non-static-library code.

    • stabbles 2 days ago

      Hm, `-ffat-lto-objects` exists, as do tools like `gcc-ar` and `llvm-ar`. Linker plugins can work with these objects.

  • wyldfire 2 days ago

    How do .rlibs work? Do those resemble these .sbos? .rlibs look like archives IIRC but maybe they're able to resolve relocations internal to them?

    EDIT: after some brief searching around, I believe .rlibs are little more than archives with rust-specific metadata and internal relocations are not resolved.

    • eyalitki 2 days ago

      There is nothing magical in resolving the local relocations. It is just that current static libraries (static archives of plain .o files) are produced directly using "ar" and don't even go through the linker... The changes to the linker needed to apply the relocation finalization are less than 50 lines of code on top of the existing "ld -r" that creates a relocatable object (which, despite its name, does not handle relocations).

      The key point in the proposal for a static-bundle-object is to properly handle static libraries as linked objects, instead of as a bunch of plain .o files.
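
      The difference between "a bunch of plain .o files" and "a linked object" can be sketched with a toy model. This is an illustration only: the `bundle` function and the dict layout are invented here, and the rule that hidden-visibility symbols become local in the bundle is an assumption about the proposal rather than something spelled out in this thread.

```python
def bundle(objects):
    """Merge objects like a partial link would, then localize hidden symbols.

    Each object is a dict: {"defs": {name: visibility}, "refs": {names}}.
    Assumption: symbols with "hidden" visibility stop being visible to
    code outside the bundle, while "default" ones stay exported.
    """
    defs = {}
    refs = set()
    for obj in objects:
        for name, visibility in obj["defs"].items():
            if name in defs:
                raise ValueError(f"duplicate definition of {name}")
            defs[name] = visibility
        refs |= obj["refs"]
    return {
        "exported": sorted(n for n, v in defs.items() if v == "default"),
        "local": sorted(n for n, v in defs.items() if v == "hidden"),
        # References satisfied inside the bundle no longer leak out.
        "undefined": sorted(refs - set(defs)),
    }


f_o = {"defs": {"f": "default", "helper": "hidden"}, "refs": {"helper", "d1"}}
g_o = {"defs": {"g": "default"}, "refs": {"helper"}}

print(bundle([f_o, g_o]))
# {'exported': ['f', 'g'], 'local': ['helper'], 'undefined': ['d1']}
```

      In a plain `ar` archive, `helper` would remain a global symbol that any program linking the archive can see or collide with; in the model above it is an internal detail of the bundle.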

      • bonzini 2 days ago

        Regarding --whole-archive, is it correct that it would be the default and you could opt-out of it with the function-sections/gc-sections combination?

        Are there cases in which function-sections doesn't work (GCs too much) but a hypothetical "file-sections" does? For example cases in which the code relies on side effects of global constructors, but those would be left out by function-sections?

        • eyalitki 2 days ago

          > Regarding --whole-archive, is it correct that it would be the default and you could opt-out of it with the function-sections/gc-sections combination?

          This is the current intention, as implemented in the up-to-date draft: https://github.com/bminor/binutils-gdb/commit/99aef6ba8db670.... Please note however that one would no longer need to specify the "--whole-archive" flag, hence resolving issues with potential duplicate placement of the static library in the linker's CLI.

          > Are there cases in which function-sections doesn't work (GCs too much) but a hypothetical "file-sections" does? For example cases in which the code relies on side effects of global constructors, but those would be left out by function-sections?

          Good question. In the first article in this series I discussed issues with global constructors and was pretty much waved away, being told that code should not be written this way. One of the members of the ELF committee did suggest an alternative for handling it yet pretty much mentioned that there are still missing pieces that require handling for their proposal to work (https://groups.google.com/g/generic-abi/c/sT25-xfX9yc/m/NRo0...).
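
          The global-constructor problem comes from how classic archive linking selects members: a member is pulled in only if it defines a symbol that is currently undefined. A toy model of that rule follows. The `select_members` function and the member names are hypothetical, and real linkers also depend on command-line order, which this sketch ignores.

```python
def select_members(undefined, archive, whole_archive=False):
    """Return the archive members a classic linker would pull in.

    `archive` maps member name -> {"defs": {symbols}, "refs": {symbols}}.
    """
    if whole_archive:
        return sorted(archive)  # every member is linked unconditionally
    undefined = set(undefined)
    defined = set()
    pulled = set()
    changed = True
    while changed:  # iterate until no new member gets pulled
        changed = False
        for name, member in archive.items():
            if name in pulled:
                continue
            if member["defs"] & (undefined - defined):
                pulled.add(name)
                defined |= member["defs"]
                undefined |= member["refs"]
                changed = True
    return sorted(pulled)


archive = {
    "api.o": {"defs": {"api_call"}, "refs": {"log"}},
    "log.o": {"defs": {"log"}, "refs": set()},
    # Only contains a constructor that registers a plugin at startup;
    # nothing else references its symbols.
    "plugin.o": {"defs": {"register_plugin_ctor"}, "refs": {"api_call"}},
}

print(select_members({"api_call"}, archive))
# ['api.o', 'log.o']
print(select_members({"api_call"}, archive, whole_archive=True))
# ['api.o', 'log.o', 'plugin.o']
```

          Without whole-archive semantics, `plugin.o` is silently dropped and its constructor never runs, which is the side-effect case raised in the question.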

          • bonzini 2 days ago

            Would you consider adding something like "file-sections" support to -r, preserving file boundaries as separate subsections? I have no idea how hard it would be.

            • eyalitki 2 days ago

              Correct me if I'm wrong, but wouldn't "file-sections" be identical to generating a static bundle object per original object file, and wrapping them all inside a .a archive?

              • bonzini 2 days ago

                If I didn't misread your proposal, file-sections would handle visibilities and resolve symbols across the entire sbo, not per-file.

                • eyalitki 2 days ago

                  OK, now I understand the gap. There is a technical limitation for relocation resolution when the relocation is against a different section. This means that for function sections we de facto have no relocation finalization, only conversion of symbols from "global" to "local".

                  Hence, for a "file-sections" flag, we would only resolve relocations within a given file, but will leave intact relocations that cross the file boundary. Accordingly, this means that "function-sections" is identical to generating a static bundle object per original object file, and bundling them all together inside a .a archive.

                  • bonzini 2 days ago

                    > Accordingly, this means that "function-sections" is identical to generating a static bundle object per original object file, and bundling them all together inside a .a archive.

                    Ah, how are local symbols resolved within the whole .a file? I thought it would be only within the individual object file.

  • M3Henry 2 days ago

    This is great, static linking has always been my preferred way to build.

  • lb90 2 days ago

    That would be awesome!

  • zoobab 2 days ago

    "Glibc static compilation is broken for political purposes"

    http://stalinux.wikidot.com/

  • noncoml 2 days ago

    Why was their build broken?