56 comments

  • nickysielicki 2 days ago

    Doesn’t surprise me at all that people who know what they’re doing are building their own images with nix for ML. Tens of millions of dollars have been wasted in the past 2 years by teams who are too stupid to upgrade from buggy software bundled into their “golden” docker container, or too slow to upgrade their drivers/kernels/toolkits. It’s such a shame. It’s not that hard.

    Edit: see also the horrors that exist when you mix nvidia software versions: https://developer.nvidia.com/blog/cuda-c-compiler-updates-im...
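
    For anyone wondering what "building their own images with nix" means concretely, a minimal sketch using nixpkgs' dockerTools looks something like this (the image name and package list are illustrative, not what any particular team ships):

      # image.nix - minimal sketch of a layered OCI image from pinned nixpkgs
      { pkgs ? import <nixpkgs> { } }:

      pkgs.dockerTools.buildLayeredImage {
        name = "ml-runtime";
        tag = "latest";
        # everything listed here ends up in the image, roughly one package per layer
        contents = with pkgs; [ bashInteractive coreutils python3 ];
        config.Cmd = [ "${pkgs.python3}/bin/python3" ];
      }

    Build it with `nix-build image.nix` and load the result with `docker load < result`.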

  • pluto_modadic 2 days ago

    New corollary: sometimes new tech gets built because you don't know how to correctly use existing tech.

    • dima55 2 days ago

      Are you referring to this Nix effort or to Docker? Because that largely applies to most usages of Docker.

      • nine_k 2 days ago

        Saying that Docker could be replaced by a simple script that does chroot + ufw + nsenter is like saying that Dropbox could be a simple script using rsync and cron. That is, technically not wrong, but it completely misses the UX / product perspective.

        • dima55 18 hours ago

          I'm saying that every use of docker I have ever seen has been as a laughably over-engineered schroot. Literally. It can probably do more stuff, but I haven't seen it.

        • tymscar 2 days ago

          Saying that nix is a simple script that does chroot + ufw + nsenter is missing the point even more

          • ranguna a day ago

            Glossing over the UX/DX criticism is also missing the point. So I guess everyone in this thread is just looking at their own little world.

  • amadio a day ago

    At CERN, software stacks are created centrally and software distribution for experiments is done with CVMFS (https://cernvm.cern.ch/fs/), an HTTP-based read-only FUSE filesystem.

    EESSI (https://eessi.io) has taken this model further by using CVMFS, Gentoo Prefix (https://prefix.gentoo.org), and EasyBuild to create full HPC environments for various architectures.

    CVMFS also has a docker driver to allow only used parts of a container image to be fetched on demand, which is very good for cases in which only a small part of a fat image is used in a job. EESSI has some documentation about it here: https://www.eessi.io/docs/tutorial/containers/

  • zenmac 2 days ago

    Great, can't wait for the systemd crew to come out with: Docker was Too Slow, So We Replaced It: Systemd in Production [asciinema]

    • mianos 2 days ago

      No joke, it's already there: systemd-nspawn can run OCI containers.

      • miladyincontrol 2 days ago

        Honestly I've been loving systemd-nspawn, with mkosi to build the containers (distroless ones too, where sensible). Works a treat for building VMs too.

        Scales wonderfully, and the fine-grained permissions and configuration are exactly what you'd hope for coming from systemd services. I appreciate that it leverages various Linux-isms like btrfs snapshots for faster read-only or ephemeral containers.

        People still, by and large, have this weird assumption that you can only do OS containers with nspawn; I'm never too sure where that idea came from.

        • dusanh 2 days ago

          Building VMs?

    • whateveracct 2 days ago

      funnily enough, I stopped using Docker and switched to NixOS-configured systemd services half a decade ago, and I've never looked back
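
      Roughly, that looks like the sketch below ("myapp" is a placeholder package, and the options shown are just the common ones, not a full setup):

        # NixOS module sketch: run an app as a plain systemd unit instead of a container
        { pkgs, ... }:
        {
          systemd.services.myapp = {
            wantedBy = [ "multi-user.target" ];
            after = [ "network.target" ];
            serviceConfig = {
              ExecStart = "${pkgs.myapp}/bin/myapp"; # placeholder package
              DynamicUser = true;                    # systemd sandboxing instead of a container
              StateDirectory = "myapp";
            };
          };
        }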

      • flyer23 a day ago

        "half a decade ago" is the nix way of saying "5 years ago" :P

    • hamandcheese 2 days ago

      What does systemd have to do with the video?

  • fouc 12 hours ago

    Given that he calls their containers "pods", does anyone think he's using `podman --rootfs` to bypass overlayfs and the 128 layer limit? Or is it just a coincidence they're called "pods"?

    Or he's using NixOS as the image OS and using nixos-containers (which use systemd-nspawn)
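
    If it's the latter, a declarative nixos-container is only a few lines of host config, something like this (the container name and service are purely illustrative):

      # host configuration.nix fragment: a declarative container backed by systemd-nspawn
      {
        containers.pod0 = {
          autoStart = true;
          ephemeral = true;        # discard container state on restart
          config = { pkgs, ... }: {
            services.nginx.enable = true;
            system.stateVersion = "24.05";
          };
        };
      }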

  • forrestthewoods 2 days ago

    Alternative: just produce relocatable builds that don’t require all of this unnecessary extra infrastructure

    • hamandcheese 2 days ago

      Please elaborate. How does one "just" do that?

      • 2 days ago
        [deleted]
      • forrestthewoods 2 days ago

        Deploying computer programs isn't that hard. What you actually need to run is pretty straightforward. Depend on glibc, copypaste all your other shared lib dependencies and plop them in RPATH. Pretend `/lib` is locked at initial install. Remove `/usr/lib` from the path and include everything.

        Docker was made because Linux sucks at running computer programs. Which is a very silly thing to be bad at. But here we are.

        What has happened in more recent years is that CMake sucks ass, so people have been abusing Docker and now Nix as a build system. Blech.

        The speaker does get it right at the end. A Bazel/Buck2 type solution is correct. An actual build system. They're abusing Nix and adding more layers to provide better caching. Sure, I guess.

        If you step back and look at the final output of what needs to be produced and deployed, it's not all that complicated. Especially if you get rid of the super complicated package dependency graph resolution step and simply vendor the versions of the packages you want. Which everyone should do, and a company like Anthropic should definitely do.

        • hamandcheese 2 days ago

          That you contrast Nix with Bazel, and liken it to Docker, tells me you don't have a great grasp of what Nix is. It is far more similar to Bazel than it is to Docker.

          > Depend on glibc, copypaste all your other shared lib dependencies and plop them in RPATH. Pretend `/lib` is locked at initial install. Remove `/usr/lib` from the path and include everything.

          You are not describing relocatable builds at all. You are describing... well, it kinda sounds like how Nix handles RPATH.
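
          For example, nixpkgs wires each shared-library dependency into the binary's RPATH as an absolute /nix/store path. A rough sketch of packaging a prebuilt binary that way (the file and package names here are hypothetical):

            { pkgs ? import <nixpkgs> { } }:

            # autoPatchelfHook rewrites the RPATH of the installed binary to point
            # at the exact store paths of the libraries listed in buildInputs
            pkgs.stdenv.mkDerivation {
              pname = "prebuilt-tool";         # hypothetical
              version = "0.1";
              src = ./prebuilt-tool.tar.gz;    # hypothetical tarball containing ./tool
              nativeBuildInputs = [ pkgs.autoPatchelfHook ];
              buildInputs = [ pkgs.zlib pkgs.stdenv.cc.cc.lib ];
              installPhase = ''
                install -Dm755 tool $out/bin/tool
              '';
            }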

          • forrestthewoods 2 days ago

            Maybe. Ask 10 devs "what is Nix" and you'll get 15 to 25 responses. Maybe more. Nix is a million different things.

            There are certainly things within Nix that I like. But on the whole I think it's approximately two orders of magnitude more complicated than is necessary to efficiently build and deploy software.

            • hamandcheese 2 days ago

              Nix is 4 or 5 different things. I agree that the term is unfortunately quite overloaded.

              > But on the whole I think it's approximately two orders of magnitude more complicated than is necessary to efficiently build and deploy software.

              This might be true, if you dramatically constrain the problem space to something like "only build and deploy static Go binaries". If you have that luxury, by all means, simplify your life!

              But in the general case, it is an inherently complex problem space, and tools that attempt to rise to the challenge, like Bazel or Nix, will be a lot more complex than a Dockerfile.

              • forrestthewoods 2 days ago

                My core hypothesis - which is maybe wrong - is that a good Bazel-like doesn't have to be that complex.

                I use Buck2 in my day job. For almost all projects it's an order of magnitude simpler than CMake. It's got a ton of cruft to support a decade's worth of targets that were badly made. But the overall shape is actually pretty darn good. I think Starlark (and NixLang) are huge mistakes. Thou shalt not mix data and logic. Thou shalt not ever ever ever force users to use a language that doesn't have a great debugger.

                Build systems aren't actually that complicated. It's usually super duper trivial for me to write down the EXACT command I want to execute. It's "just" a matter of inducing a wobbly rube goldberg machine that can't be debugged to invoke the god damn command I know I want. Grumble grumble.

        • literalAardvark 2 days ago

          > simply vendor the versions of the packages you want

          That's really not how frontier research works.

          The packages they want are nightlies with lots of necessary fresh fixes that their team probably even contributed to, and waiting for Red Hat to include them in the next distro is completely out of the question.

          • forrestthewoods 2 days ago

            Objectively false.

            There is a wealth of options between latest_master -> nightly -> ..... -> RedHat update.

            And there's only a very small number of specific libraries that you'd even want to consider for nightly. The majority of the repo should absolutely be pinned and infrequently updated.

            There have been so many supply-chain exploits covered on HN in 2025 that I'd consider it borderline malpractice to use packages less than a month old in almost all cases. Certainly by default.

        • edoceo 2 days ago

          > Docker was made because Linux sucks at running computer programs.

          I'm not sure that's why Docker was made.

          I'm pretty sure Linux is not-suck at running programs; it does run quite a lot of them. Might even be most of them? All those servers and phones and stuff.

          • forrestthewoods 2 days ago

            Nah. Docker was created to solve the "works on my machine" problem. Because Linux made the wrong choice of having a global pool of shared libraries. So Docker hacked around this by containerizing a full copy of the desired environment.

            What Docker has done is kinda right. Programs should include all of their dependencies! But it sucks that you have to fight really hard and go out of your way to work around Linux's original choices.

            Now you may be thinking: wait a second, Linux was right! It's better to have a single copy of shared libraries so they can be securely updated just once instead of forcing every program to release a new update! Theoretically, sure. But since everyone uses Docker, you have to (slowly and painfully) rebuild the image, so that supposed benefit is moot. Womp womp.

            Additional reading if you are so inclined: https://jangafx.com/insights/linux-binary-compatibility

            • seec a day ago

              I think the idea of shared libraries is also linked to a problem of the past: expensive storage (especially fast storage).

              Nowadays SSDs with decent random IO are quite cheap to the point where even low-end hosting has them, and getting spinning disks is a choice reserved for large media files. On the consumer side, we are below a hundred dollars for one TB, so the storage savings are not very relevant.

              But if you go back to when Linux was designed, storage was painfully expensive, and potentially saving even a few hundred megabytes was pretty good.

              But I do agree that shared libraries are generally a bad idea and something that will most likely create problems at some point. Self-contained software makes a lot more sense, generally speaking. And I definitely think that Docker is a dumb "solution" for software distribution, but the problem really starts with devs using way too many dependencies.

            • array_key_first 20 hours ago

              Linux didn't make that choice; some Linux distros did. And it's a trade-off: the alternative uses dramatically more disk space. Plus, with a package manager, sharing libraries becomes much, much easier.

              If you really, truly, hate that then just statically link stuff. That's always been allowed.

              Statically linked Linux binaries will run for decades. The idea that Linux binaries rot while Windows has some magic that keeps them alive is just made up.

              "But what about glibc???" Yeah that's made up. Glibc doesn't break ABI like people think they do. If you look, the glibc ABI is incredibly stable across decades. SOME highly specific gnu-only extensions have been broken. You're not using those, it's a made up problem.

            • edoceo 2 days ago

              I'm firmly in the SO camp.

              One thing I see is that folks making software for Linux target a specific distribution rather than generic Linux. It's because of the SO "problem".

              It's a benefit of the tight coupling of Windows or Mac (which I don't use).

              Fully agree that security updates in the Docker world have the same problem as static builds.

              Disclosure: I ship some stuff on Linux; for these problems we do static & docker. The demand side also seems to favour docker. I also prefer the docker method, for the compatibility reasons.

              • forrestthewoods 2 days ago

                Strong agree on the “target distro” or even “target specific env” versus Linux.

                I think I disagree on Windows being tightly coupled. Windows simply has a barren global library path. Programs merely include their extra DLLs adjacent to the exe. Very simple and reliable.

                Linux has added complexity in that glibc sucks and is horribly architected. GCC/Clang are bad and depend on the local environment waaaay too much. The Linux ecosystem is very much not built to support sharing cross-compiled binaries. It's so painful. :(

    • musicale 2 days ago

      Docker is overkill if all you really need is app packaging.

      Docker containers may not be portable anyway when the CUDA version used in the container has to match the kernel driver and GPU firmware, etc.

  • imiric 2 days ago

    Some people, when confronted with a problem, think "I know, I'll use Nix." Now they have two problems.

    • k__ 2 days ago

      Seems like anti-intellectualism is spreading on HN, too.

      • imiric 2 days ago

        Oh, please.

        The only anti-intellectualism is not accepting that every technology has tradeoffs.

    • HumanOstrich 2 days ago

      Yup, and there's a high correlation between people rewriting everything in Rust and converting everything else to Nix. It's like a complexity fetish.

      • nine_k 2 days ago

        It's removing complexity elsewhere, usually much more than it adds. Once you have invested in a relatively fixed bit of complexity, your other tasks become much easier to complete.

        Once you have invested in understanding Clifford algebra, the whole of classical electrodynamics turns from 30 equations into two.

        Once you have invested in writing a Fortran compiler, you can write numerical computations much more easily and compactly than in assembly.

        Once you have invested in learning Nix, your whole elaborate build infra, plus chef / puppet / saltstack suddenly shrinks to a few declarations.

        Etc.

        • imiric 2 days ago

          > Once you have invested in learning Nix, your whole elaborate build infra, plus chef / puppet / saltstack suddenly shrinks to a few declarations.

          Your analogy breaks down with Nix, since learning and using it is a hostile experience, unlike (I assume) your other examples.

          I have been using NixOS for about 5 years now on several machines, and I still don't know what I'm doing. Troubleshooting errors and implementing features is like climbing a mountain in the dark.

          The language syntax is alien. Most functionality is unintuitive. The errors are cryptic. The documentation ranges from poor to nonexistent. It tries to replace every package manager in existence. The community is unwelcoming.

          Guix addresses some of these issues, and at least it uses a sane language, but the momentum is, unfortunately, largely with Nix.

          Nix pioneered many important concepts that most operating systems and package managers should have. But the implementation and execution of those concepts leave a lot to be desired.

          So, sure, if you and your team have the patience to deal with all of its shortcomings, I can see how it can be useful. But personally, I would never propose using it in a professional setting, and would rather use established and more "complex" tooling most engineers are familiar with.

      • justinrubek a day ago

        I see it more as a simplicity fetish. I don't want to have to deal with the complexity of software that is not packaged via nix. It's a big headache to deal with; I need it to be simple and work.

      • umvi 2 days ago

        "your entire static website is running on GitHub pages? Sounds like legacy tech debt. I need to replace it with kubernetes pronto"

        • LaurensBER 2 days ago

          The challenge with some engineers is that if there's no user problem to solve, they'll happily solve some hypothetical problem.

          Having said that, my weekend project was "upgrading" my RSS reader to run HA on Kubernetes.