crates.io: Malicious crates faster_log and async_println

(blog.rust-lang.org)

16 points | by pjmlp a day ago

26 comments

  • vlovich123 a day ago

    I’ve seen lots of critiques of software registries, but no actual solutions for how to reliably secure your software supply chain, other than vendoring all your dependencies (which carries its own security challenges, like picking up vulnerability fixes in a timely manner) or not using any dependencies at all.

    What are actual things that crates.io or npm could do but aren’t to improve the security of the ecosystem?

    • pabs3 3 hours ago

      The only solution is open source: auditing the source code and building everything from source, without binaries. First build an OS without using any binaries, then build the rest of the stack, auditing the code at each stage before building it.

      Some solutions for that include Bootstrappable Builds (and StageX), Reproducible Builds and crev.

      https://bootstrappable.org/ https://stagex.tools/ https://reproducible-builds.org/ https://github.com/crev-dev/

    • viraptor a day ago

      The registries can't do much beyond enforcing better auth for uploading packages. Forced 2FA would help a lot.

      Almost anything else would just be a guess surfaced as information for the devs, and it would get in the way of legitimate edge cases. For example, what if you genuinely want to publish a malware example or a vulnerability reproducer? What if you want your own fork of another package because you carry extra patches?

      • vlovich123 20 hours ago

        These weren't account-takeover issues like the ones that have plagued npm lately. This is just a vanilla library that you hope someone adds as a dependency, and then you attack the users of whoever runs the code. 2FA does nothing here.

        • viraptor 20 hours ago

          I know, hence the second paragraph. General-purpose, public repos can't solve this problem if they want to remain open to everyone. It's on the developers to deal with that side of the problem.

      • maxbond 20 hours ago

        I can understand why there's a research interest in publishing malware, but I don't understand why there would be in publishing it to a language's official package repository. If you want to experiment with repositories hosting malware for some innocent reason, configure your package manager to use a self-hosted repository.

        • viraptor 19 hours ago

          Because it's general and public. Then again, how would you tell the difference, apart from the description? For example, https://www.npmjs.com/package/@celo/encrypted-backup is just a few lines away from a ransomware system. https://www.npmjs.com/package/web-vuln-scanner can be both offensive and defensive. It's mostly about how you use them, so there's little chance of any system detecting malice with certainty and no false positives.

          • maxbond 18 hours ago

            An offensive tool is one thing, but a piece of malware meant to act within the supply chain (either at build time or at runtime) is a different story. You tell the difference by reading the code and finding, e.g., a crypto stealer, like Socket did here.

            • viraptor 13 hours ago

              Reading the code doesn't scale. There aren't enough people ready to read all the published packages, and even if there were, that's still acting after the packages are published and potentially used. Also, as more people start looking at this, the malicious functionality will be hidden better and split into fragments across dependent crates. Think one crate providing directory walking, another the patterns to match (but commented as something genuine), another doing genuine network lookups, and another tying it all together in a non-obvious way in a macro that gets part of the behaviour initialised at runtime. We're only seeing the fairly trivial cases these days.

              • maxbond 13 hours ago

                I don't disagree, I just don't see how that contradicts anything I've said. I don't see why that would mean we should be okay with leaving a malicious package in the repository after we find out it's there, whether it's claimed to be research or not.

                We will struggle to read every release of every package and we won't catch every attack, though, I agree. If we were able to force adversaries to engage in sophisticated multi-pronged attacks instead of trivially malicious packages, that would be a win. It would make their operations more complex, time-consuming, and prone to failure.

    • prdonahue 21 hours ago

      We're taking a very different[1] approach at Chainguard.

      Essentially: building the world from GitHub repos on SLSA L2 hardened infra and delivering directly to our customers to bypass the registry threat vector, which is where the vast, vast majority of attacks occur (we'll be blogging about this soon with more data).

      [1] https://www.chainguard.dev/unchained/announcing-chainguard-l...

      • vlovich123 20 hours ago

        Doesn't really sound very different, and I don't see how it helps here. This attack is just a vanilla library that you hope someone adds as a dependency, and then you attack the users of whoever runs the code. I fail to see how Chainguard helps at all here (not to mention this is Rust, not whatever "build 3p packages" means in a JS world).

        • prdonahue 19 hours ago

          It's the same principle as a company blocking access to domains registered in the past 30 days. Doing so eliminates a huge percentage of phishing/malware, as these domains are typically identified and taken down or otherwise blocked within that window.

          In this particular case, the bogus libraries had been out there for months. But if, in addition to a delay, you mirror just the most common subset of packages with some opinionated selection criteria and build directly from source, you eliminate most of these attacks. (The same holds across language ecosystems, including the JS/npm world you mention.)
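
          A minimal sketch of the kind of minimum-age gate described above; the 14-day threshold, function names, and timestamps are illustrative assumptions, not Chainguard's actual implementation:

              // Only expose a mirrored crate version once it has been public for a minimum age.
              use std::time::{Duration, SystemTime};

              /// `published_at` would come from the upstream registry's metadata.
              fn old_enough(published_at: SystemTime, min_age: Duration) -> bool {
                  match SystemTime::now().duration_since(published_at) {
                      Ok(age) => age >= min_age,
                      Err(_) => false, // publish time is in the future: don't serve it yet
                  }
              }

              fn main() {
                  // Hypothetical policy: only mirror versions that have been public for 14 days.
                  let min_age = Duration::from_secs(14 * 24 * 60 * 60);
                  let published_at = SystemTime::now() - Duration::from_secs(3 * 24 * 60 * 60);
                  println!("serve this version: {}", old_enough(published_at, min_age));
              }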

          Is this 100% infallible? No, but security is a risk reduction game.

          • vlovich123 17 hours ago

            Ok. So basically the “in addition” means the techniques you’re highlighting aren’t enough, and you’re really arguing for manual curation of the registry, which obviates all the other techniques. Aside from the fact that it doesn’t scale, xzutils famously faced a directed attack that would have passed through manual curation too.

    • maxbond a day ago

      Maybe checking new packages for the following:

      - Substantially the same README as another package (a rough similarity check is sketched after this list)

      - A README that links to a GitHub repo which links back to a different package

      And additionally:

      - Training a local LLM on supply-chain malware as examples are captured, and scanning new releases with it. This wouldn't stop an xz-style attack but would probably catch crypto stealers some of the time.

      - Making a "messages portal" for maintainers and telling them never to click a link in an email to see a message from the repository (and never including a link in legitimate emails). You get an email saying you have a message, and you log in to read it.
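
      A minimal sketch of the README-similarity flag mentioned above, using Jaccard similarity over word sets; the threshold, names, and README strings are made up for illustration, and a real registry would likely use something more robust (shingling, MinHash):

          use std::collections::HashSet;

          /// Jaccard similarity between the word sets of two READMEs:
          /// 1.0 means identical vocabularies, 0.0 means nothing in common.
          fn readme_similarity(a: &str, b: &str) -> f64 {
              let words = |s: &str| -> HashSet<String> {
                  s.split_whitespace().map(|w| w.to_lowercase()).collect()
              };
              let (wa, wb) = (words(a), words(b));
              if wa.is_empty() && wb.is_empty() {
                  return 0.0;
              }
              wa.intersection(&wb).count() as f64 / wa.union(&wb).count() as f64
          }

          fn main() {
              let new_readme = "Fast structured logging for Rust with file rotation";
              let existing_readme = "Fast structured logging for Rust with async file rotation";
              // Hypothetical policy: flag for human review, never auto-delete,
              // so legitimate forks only get a second look rather than removal.
              if readme_similarity(new_readme, existing_readme) > 0.8 {
                  println!("flag for moderation: README nearly identical to an existing crate");
              }
          }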

      • Hackbraten a day ago

        Checking the README for similarity to other packages can cause false positives for benign, legitimate forks.

        • maxbond 21 hours ago

          Sure, I'm not saying those projects should be automatically deleted or anything, just that it's worth looking into. Maybe you put a message on the package's page notifying potential users and put it into a moderation queue. Maybe a volunteer takes a look at it, and if they find something, they hit the "report malware" button. Maybe you ask for confirmation if someone tries to add such a package on the command line.

          Just spitballing.

          • vlovich123 20 hours ago

            And maybe with a banner like "WARNING: This package appears similar to the more popular package X. Did you mean to use that instead?"

    • octoberfranklin 20 hours ago

      > What are actual things that crates.io or npm could do but aren’t to improve the security of the ecosystem?

      Go back to the distribution/maintainer model. It worked. But it requires that developers slow down the rate of (non-alpha/beta/rc) releases until it matches the maintainer capacity of major software distributions. This is bitter medicine, but it's the solution.

      Software distributions exist for a reason. They have maintainers, who are responsible for watching for stuff like this. Unmoderated language-specific registries have encouraged a massive degree of churn. This churn is incompatible with maintainer review, which is why a lot of distributions have basically given up on language-specific registries.

      • detaro 3 hours ago

        You could do "distributions" on package registries too: maintainers pooling up, intentionally keeping dependencies within the trusted set of maintainers as far as possible, including external ones only carefully and with pinned versions (registries that don't have namespaces are a bit annoying there), ...
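
        A hedged illustration of what that pinning could look like in a Cargo manifest, using Cargo's exact-version '=' requirement; the crate names and versions are made-up placeholders:

            [dependencies]
            # Crates maintained inside the trusted pool: normal version requirements.
            trusted-util = "1.4"

            # External crate admitted only after review: pin the exact reviewed
            # version so a later (possibly malicious) release isn't picked up silently.
            some-external-crate = "=0.7.3"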

        Or run a separate registry with reviewed packages; I'm sure we'll see that offered as a service soon.

      • vlovich123 17 hours ago

        > Software distributions exist for a reason. They have maintainers, who are responsible for watching for stuff like this.

        And still completely missed the xzutils compromise.

        And I’m 90% sure those distribution maintainers don’t watch for stuff like this, because they simply wouldn’t have the bandwidth to. I think they mostly just determine whether a software package is worth adding in the first place and whether it has problems building. For example, the base set of software available in Arch is quite limited, while the AUR is a choose-your-own-adventure.

        • octoberfranklin 13 hours ago

          > And still completely missed the xzutils compromise.

          There's no comparison.

          That was the culmination of a three-year effort -- almost certainly state-backed. Stuff like that happens maybe three times a decade, and makes headlines. Meanwhile supply chain attacks against language-specific package managers are a monthly or perhaps even weekly event.

          There's no comparison.

  • never_inline 3 hours ago

    Why does this not happen in Go? Maybe Go's design is practically better because you have to research the repository's reputation before adding it? Or is it because Go developers add fewer dependencies on average? Or is it because of namespacing (such things are rare in Java as well)?

  • jurschreuder 21 hours ago

    [flagged]

    • viraptor 20 hours ago

      You mean you take pleasure in the pain others experience? Enjoy gloating?