72 comments

  • mongol 17 hours ago

    Someone needs to be the first to take the hit, and apparently Ubuntu volunteered.

    • packetlost 16 hours ago

      They're doing it precisely so they can identify shortcomings and bugs. This is expected and good.

      • 3836293648 10 hours ago

        It's expected. Good is an entirely different issue. More permissively licenced core components is 100% a bad thing

        • Volundr 4 hours ago

          > More permissively licenced core components is 100% a bad thing

          I don't follow, can you explain why?

    • zahlman 16 hours ago

      Indeed. They knew there was risk associated with this, which is why they didn't just plop it into the LTS release. If it isn't working acceptably by the 26.04 release window, it'll just get reverted.

    • ASalazarMX 16 hours ago

      Seriously, topics like this get one of two kinds of comments:

      1. This is an inevitable problem that is being handled in a sensible manner by competent engineers.

      2. X company is stupid and their engineers are stupid; only someone as smart as I am would be capable of doing it right.

      It tells a lot about the mental maturity of each participant. Not a single comment is "Maybe I don't know enough about this to voice an informed opinion", although that's probably a good indicator.

      • Jonnax 11 hours ago

        A really good example is the comments on the article itself:

        https://www.phoronix.com/forums/forum/phoronix/latest-phoron...

        It seems like text-based forums using upvotes/likes or reactions encourage those who are less inquisitive and/or humble to take up a lot of the atmosphere.

        It got me thinking that the internet today has more people on it but fewer forums to engage with technical topics in depth.

      • majewsky 14 hours ago

        > Not a single comment is "Maybe I don't know enough about this to voice an informed opinion"

        Survivorship bias.

    • hu3 15 hours ago

      That's why I always run x.04 LTS Ubuntu editions and not x.10 for critical stuff.

  • sionisrecur 17 hours ago

    As expected! I wonder how many tools depend on bugs or edge cases in GNU Coreutils.

    • kpcyrd 11 hours ago

      Almost as if programming against loosely defined command-line interfaces is maybe not a very robust way of programming~

  • fn-mote 17 hours ago

    Can someone post details about why md5sum from the Rust Coreutils is producing different results from GNU Coreutils? The post does not claim this is a bug. (Surprisingly.)

    • treesknees 17 hours ago

      The end of the post links to the bug tracker pointing out it’s already being tracked.

      But it actually appears to be an issue with dd, not md5sum - https://github.com/VirtualBox/virtualbox/issues/226#issuecom...

      • zahlman 16 hours ago

        > Out of the box, script will pass bs= option to dd for it to be aware of how much to skip from the beginning of input data (and on later while loop iterations). This seem to have handled by dd either improperly or at least in a different way than it was in the past (with GNU core utils). However, once bs= is replaced with ibs=, all seems to go back to normal.

        The bs/ibs/obs options don't "skip" anything, but determine how much data to buffer in memory at a time while transferring. Regardless, it's hard to fathom how something this simple got messed up, especially considering that the suite supposedly has good test coverage and has been getting close to a full green bar.

        • hulitu 16 hours ago

          > However, once bs= is replaced with ibs=, all seems to go back to normal.

          so it is a bug. bs is one thing, ibs is another.

          • zahlman 16 hours ago

            > bs is one thing, ibs is another.

              bs=BYTES
                     read and write up to BYTES bytes at a time (default: 512);
                     overrides ibs and obs
            
            As described, the script should have worked as is, and the problem is in the handling of the dd options. (But I didn't verify the accuracy of the description.)
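
            A minimal sketch of that man-page point, assuming a scratch file from mktemp and a 4-byte block size chosen purely for illustration: on a regular file, `bs=4` and `ibs=4` should select the same bytes, because `skip=` and `count=` are counted in input blocks either way. It does not try to reproduce the failure in the VirtualBox script.

```shell
# Scratch input: four 4-byte blocks (hypothetical data, any bytes would do).
tmp=$(mktemp)
printf 'AAAABBBBCCCCDDDD' > "$tmp"

# bs=4 sets both the input and output block size to 4 bytes.
with_bs=$(dd if="$tmp" bs=4 skip=1 count=2 2>/dev/null)

# ibs=4 sets only the input block size; obs keeps its default of 512.
with_ibs=$(dd if="$tmp" ibs=4 skip=1 count=2 2>/dev/null)

# skip=1 drops the first input block and count=2 copies the next two.
echo "bs=4:  $with_bs"
echo "ibs=4: $with_ibs"

rm -f "$tmp"
```

            With GNU dd both lines show BBBBCCCC. If the two lines ever differ on some dd implementation, that would suggest the option handling is at fault rather than the calling script.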

            • lesuorac 15 hours ago

              Wonder if `\00` is handled differently between them. Not sure how to run the rust version but my md5sum seems to care how many null bytes there are.

                $ echo -e "\00" | md5sum
                8f7cbbbe0e8f898a6fa93056b3de9c9c  -

                $ echo -e "\00\00" | md5sum
                a4dd23550d4586aee3b15d27b5cec433  -

              • hiccuphippo 13 hours ago

                Kind of off-topic, but those commands also add a newline character to the md5sums, giving unexpected results. I was trying it in a php interpreter and getting different values.

                Add -n to echo to avoid the newline.

              • zahlman 15 hours ago

                > Wonder if `\00` is handled different between them.

                `dd` is for copying all the bytes of a source (unless you explicitly set a limit with the `count` option), regardless of whether they're zero. It's fundamentally not for null-terminated strings but arbitrary binary I/O. In fact, "copying from" /dev/zero is a common use case. It seems frankly implausible that the `dd` implementation is just stopping at a null byte; that would break a lot of tests and demonstrate a complete, fundamental misunderstanding of what's supposed to be implemented.

                > Not sure how to run the rust version but my md5sum seems to care how many null bytes there are.

                Yes, the md5 algorithm also fundamentally operates on and produces binary data; `md5sum` just converts the bytes to a hex dump at the end. The result you get is expected (edit: modulo hiccuphippo's correct observation), because the correct md5 sum changes with every byte of input, even if that byte has a zero value.
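
                A minimal sketch of both observations, using printf instead of echo so that no newline sneaks in unless asked for. The helper name hash_of_stdin is made up for this example.

```shell
# md5sum prints "<hash>  -" for stdin; keep only the hash field.
hash_of_stdin() { md5sum | cut -d ' ' -f 1; }

one_nul=$(printf '\000' | hash_of_stdin)       # exactly one NUL byte
two_nul=$(printf '\000\000' | hash_of_stdin)   # exactly two NUL bytes
nul_and_nl=$(printf '\000\n' | hash_of_stdin)  # one NUL plus a newline, like echo without -n

echo "one NUL:       $one_nul"
echo "two NULs:      $two_nul"
echo "NUL + newline: $nul_and_nl"
```

                All three hashes should differ, since every input byte (zero-valued or a stray newline) feeds into the digest.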

      • sionisrecur 16 hours ago

        And dd is also part of coreutils. So this is still a rust-coreutils issue, or an issue in gnu-coreutils that scripts rely on.

        • kps 16 hours ago

          Which has inadequate tests and gets it wrong? (‘All of them’ is an option.)

  • kpcyrd 11 hours ago

    Hopefully they got reported as bugs. I spent some time in June making sure a basic Arch Linux system can compile itself using uutils, and things mostly worked. The only build failures were arguably shell scripts depending on undefined behavior, like "calling install(1) with multiple -m arguments":

    https://github.com/uutils/coreutils/issues/8033

    The other one was an edge-case with dangling symlinks:

    https://github.com/uutils/coreutils/issues/8044

    Both got fixed promptly, after taking the time to report them properly. :)
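
    A minimal sketch of how a build script could sidestep the first issue: pass exactly one -m, so nothing depends on how an implementation resolves repeated -m options. The paths are hypothetical scratch files, and `stat -c` assumes a GNU-style stat.

```shell
# Hypothetical scratch paths, for illustration only.
src=$(mktemp)
dest_dir=$(mktemp -d)

# Exactly one -m, so the result does not depend on how repeats are resolved.
install -m 755 "$src" "$dest_dir/tool"

# Read back the permission bits of the installed file.
mode=$(stat -c '%a' "$dest_dir/tool")
echo "installed mode: $mode"

rm -rf "$src" "$dest_dir"
```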

  • b_e_n_t_o_n 17 hours ago

    Hmm, this plus the performance regressions makes me wonder if it's too soon to move to the rust version of Coreutils. And makes me wonder if this is gonna cause more pushback regarding the rust in the kernel movement.

    • mustache_kimono 16 hours ago

      > this is gonna cause more pushback regarding the rust in the kernel movement.

      Only among those that don't understand that, if this is a problem, then it is a Canonical problem, not a Rust problem.

      To give another example, Canonical includes ZFS in Ubuntu too. And, for a while, Canonical shipped a broken snapshot mechanism called zsys with Ubuntu too. Canonical ultimately ripped zsys out because it didn't work very well. zsys would choke on more than 4000 snapshots, etc. zsys was developed in Go, while other snapshot systems developed in Perl and Python did a little less and worked better.

      Now, is zsys a Go problem? Of course not. It wasn't ready because Canonical sometimes ships broken stuff.

      • zahlman 16 hours ago

        > Only among those that don't understand that, if this is a problem, then it is a Canonical problem, not a Rust problem.

        (This is hard to express in a careful way where I'm confident of not offending anyone. Please take me at my word that I'm not trying to take sides in this at all.)

        The dominant narrative behind this pushback, as far as I can tell, is nothing to do with the Rust language itself (aside perhaps from a few fringe people who see the adoption of Rust as some kind of signal of non-programming-related politics, and who are counter-signaling). Rather, the opposition is to re-implementing "working" software (including in the sense that nobody seems to have noticed any memory-handling faults all this time) for the sake of seemingly nebulous benefits (like compiler-checked memory safety).

        The Rust code will probably also be more maintainable by Rust developers than the C code currently is by C developers given the advantages of Rust's language design. (Unless it turns out that the C developers are just intrinsically better at programming and/or software engineering; I'm pretty skeptical of that.) But most long-time C users are probably not going to want to abandon their C expertise and learn Rust. And they seem to outnumber the new Rust developers by quite a bit, at least for now.

        • mustache_kimono 14 hours ago

          > Rather, the opposition is to re-implementing "working" software

          I understand the argument, and it sounds good as far as most things go, but it misses an important fact: In OSS, you can and should find your own bliss. If you want to learn Rust, as I did, you can do it by reimplementing uutils' sort and ls, and fixing bugs in cp and mv, as I did. That was my bliss. OSS doesn't need to be useful to anyone. OSS can be a learning exercise or it can be simply for love of the game.

          The fact that Canonical wants to ship it, right now, simply makes them a little silly. It doesn't say a thing about me, or Rust, or Rust culture.

          • zahlman 13 hours ago

            > That was my bliss. OSS doesn't need to be useful to anyone.

            If you can afford it, sure. Some would really prefer to at least be able to get some attention (and perhaps a paid job) this way.

            • mustache_kimono 13 hours ago

              > Some would really prefer to at least be able to get some attention (and perhaps a paid job) this way.

              Not that I agree, but people seem to be giving uutils lots of attention right now? A. HN front page vs. B. obscure JS framework? I'll take door "A"?

              I had someone contact me for a job simply because my Rust personal project had lots of stars on GitHub. You really don't know what people will find interesting.

        • inejge 14 hours ago

          > The dominant narrative behind this pushback, as far as I can tell, is nothing to do with the Rust language itself (aside perhaps from a few fringe people who see the adoption of Rust as some kind of signal of non-programming-related politics, and who are counter-signaling).

          Difficult to say with certainty, because it's easy to dress "political" resistance in respectable preference for stability. (Scare quotes because it's an amalgam in which politics is just a part.) Besides, TFA is Phoronix, whose commentariat is not known for subtlety on this topic.

          Replacing coreutils is risky because of the decades of refinement/stagnation (depending on your viewpoint) which will inevitably produce snags when component utilities interact in ways unforeseen by tests -- as has happened here. But without risk there's no reward. Of course, what's the reward here is subject to debate. IMO the self-evident advantage of a rewrite is that it's prima facie evidence of interest in using the language, which is significant if there's a dearth of maintainers for the originals. (The very vocal traditionalists are usually not in a hurry to contribute.)

        • 16 hours ago
          [deleted]

        • knowitnone3 14 hours ago

          so why create Wayland when we had X? why create another linux distro when there are so many already? why create C if we already had assembly? why create new model cars every year? why architect new homes every year? What you are proposing is we stop making changes or progress.

          • zahlman 13 hours ago

            I don't propose this; I explain the apparent reasons why others do.

          • bsder 6 hours ago

            > so why create Wayland when we had X

            Because X11 had a lot of issues that got papered over with half-baked extensions and weird interfaces to the kernel.

            The problem is that Wayland didn't feel like doing the work to make fundamental things like screen sharing, IMEs, copy-paste, and pointer warping actually ... you know ... work.

            The problem Wayland now has is that they're finally reaching something usable, but they took so long that the assumptions they made nearly 20 years ago are becoming as big a problem as the issues that were plaguing X11 when Wayland started. However, the sunk cost fallacy means that everybody is going to keep pounding on Wayland rather than throwing it out and talking to graphics cards directly.

            And client-rendered decorations were always just a mind-bogglingly stupid decision--but that's a Gnome problem rather than a Wayland issue.

      • pornel 15 hours ago

        This is more nuanced in Rust's case.

        Rust is trying to systemically improve safety and reliability of programs, so the degree to which it succeeds is Rust's problem.

        OTOH we also have people interpreting it as if Rust was supposed to miraculously prevent all bugs, and they take any bug in any Rust program as a proof by contradiction that Rust doesn't work.

        • shikon7 14 hours ago

          It might be a bit of bad publicity for those who want to rewrite as much as possible in Rust. While Rust is not to blame, it shows that just rewriting something in Rust doesn't magically make it better (as some Rust hype might suggest). Maybe Ubuntu was a bit too eager in adopting the Rust Coreutils, caring more about that hype than about stability.

          • b_e_n_t_o_n 14 hours ago

            > Rust is not to blame

            Isn't that an unfalsifiable statement until the coreutils get written in another language and can be compared?

            • mustache_kimono 14 hours ago

              > Isn't that an unfalsifiable statement

              Sounds pretty axiomatic: Rust is not to blame for someone else's choice to ship beta software?

        • carlmr 12 hours ago

          >OTOH we also have people interpreting it as if Rust was supposed to miraculously prevent all bugs, and they take any bug in any Rust program as a proof by contradiction that Rust doesn't work.

          Yeah, that's such a tired take. If anything this shows how good Rust's guarantees are. We had a bunch of non-experts rewrite a sizable number of tools that had 40 years of bugfixes applied. And Canonical just pulled the rewritten versions in all at once and there are mostly a few performance regressions on edge cases.

          I find this such a great confirmation of the Rust language design. I've seen a few rewrites in my career, and it rarely goes this smoothly.

        • mustache_kimono 14 hours ago

          > Rust is trying to systemically improve safety and reliability of programs, so the degree to which it succeeds is Rust's problem.

          GNU coreutils first shipped in what, the 1980s? It's so old that it would be very hard to find the first commit. Whereas uutils is still beta software which didn't ask to be representative of "Rust", at all. Moreover, GNU coreutils are still sometimes not compatible with their UNIX forebears. Even considering this first, more modest standard, it is ridiculous to hold this software to it, in particular.

          • collinfunk 13 hours ago

            You would not be able to find the first commit. The repositories for Fileutils, Shellutils, and Textutils do not exist, at least anywhere that I can find. They were merged as Coreutils in 2003 in a CVS repository. A few years later, it was migrated to git.

            If anyone has original Fileutils, Shellutils, or Textutils archives (released before the ones currently on GNU's ftp server), I would be interested in looking at them. I looked into this recently for a commit [1].

            [1] https://www.mail-archive.com/coreutils@gnu.org/msg12529.html

        • hulitu 13 hours ago

          > OTOH we also have people interpreting it as if Rust was supposed to miraculously prevent all bugs

          That is the narrative that rust fanboys promote. AFAIK rust could be useful for a particular kind of bug (memory safety). Rust programs can also have coding errors or other bugs.

          • carlmr 12 hours ago

            >That is the narrative that rust fanboys promote.

            Strawmanning is not a good look.

      • b_e_n_t_o_n 14 hours ago

        It's not about rust specifically, it's about replacing working software with rewrites and going from a code base written in a single language to one written in multiple.

    • AllegedAlec 16 hours ago

      People have forgotten the sacred motto "Don't Break Userspace"

      • steveklabnik 13 hours ago

        That slogan is about the kernel, not about user space programs.

    • SahAssar 15 hours ago

      > And makes me wonder if this is gonna cause more pushback regarding the rust in the kernel movement.

      Does this have anything at all to do with the kernel?

      • shikon7 14 hours ago

        No, and I expect kernel developers to be much more careful not to break anything with Rust rewrites.

  • dmitrygr 16 hours ago

    Hyrum's Law

       With a sufficient number of users of an API,
       it does not matter what you promise in the contract:
       all observable behaviors of your system
       will be depended on by somebody.

    • Avamander 13 hours ago

      One more reason to actually reimplement coreutils. Figuring out these undocumented promises also makes it easier to ensure correctness in general.

      • steveklabnik 12 hours ago

        You’re downvoted but you’re correct. The uutils folks have been submitting bugs and test cases upstream. It actively helps both projects.

    • lioeters 14 hours ago

      Every change breaks someone's workflow. https://xkcd.com/1172/

  • lousken 12 hours ago

    It's not an LTS release, so it's mostly fine, but they should still postpone the release until such bugs are fixed.

  • arp242 13 hours ago

    As an aside on the GNU *sum tools, I found they're quite slow. A few months ago I wrote a simple replacement in Go for UX reasons and somewhat to my surprise, the Go implementation of most hash algorithms seems about 2 to 4 times as fast when using a simple naïve "obvious" single-threaded implementation. It can be sped up even more by using more than one core. Go has assembly implementations for most hash functions. I didn't really look at the coreutils implementation but I'm guessing it's "just" in C.

    At any rate, small teething issues aside, long-term things should be better and faster.

    • collinfunk 12 hours ago

      What distribution do you use?

      GNU Coreutils uses the OpenSSL implementation of hashes by default, but some distributions have disabled it using './configure --with-openssl=no'. Debian used to do this, but now links to OpenSSL.

      • arp242 11 hours ago

        This is on Void. It doesn't have --with-openssl configure args in the package, although the binary also doesn't link to lib{ssl,crypto}. It probably gets auto-detected to "no"(?)

        • collinfunk 10 hours ago

          Yep, from a Void Linux container it appears that they do not link to libcrypto:

            $ ldd /usr/bin/cksum
             linux-vdso.so.1 (0x00007fb354763000)
             libc.so.6 => /usr/lib64/libc.so.6 (0x00007fb354549000)
             /lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fb354765000)
          
          For context, I am a committer to GNU Coreutils. We have used OpenSSL by default for a few years now [1]. Previously it was disabled by default because the OpenSSL license is not compatible with the GPL license [2]. This was resolved when they switched to the Apache License, Version 2.0 in OpenSSL 3.0.0.

          If the Void people wanted to enable OpenSSL, they would probably just need to add openssl (or whatever they package it as) to the package dependencies.

          [1] https://github.com/coreutils/coreutils/commit/0d77e1b7ea2840...

          [2] https://www.gnu.org/licenses/license-list.html#OpenSSL
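
          A minimal sketch that wraps the ldd check above so it can be repeated across several tools. The helper name links_libcrypto is made up for this example, and it assumes ldd is available and the tools are dynamically linked.

```shell
# Report whether a hashing tool is dynamically linked against libcrypto.
# Prints "yes", "no", or "missing" if the tool is not on PATH.
links_libcrypto() {
    tool_path=$(command -v "$1") || { echo "missing"; return; }
    if ldd "$tool_path" 2>/dev/null | grep -q 'libcrypto'; then
        echo "yes"
    else
        echo "no"
    fi
}

for tool in md5sum sha256sum cksum; do
    printf '%s: %s\n' "$tool" "$(links_libcrypto "$tool")"
done
```

          A "no" here only says the binary does not link libcrypto; it does not say why the build ended up that way.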

          • arp242 6 hours ago

            Cheers; I guess I should have checked the coreutils implementation; I kind of just assumed it has one implementation instead of being a compile option :embarrassed-emoji:

            I also have an Arch machine where it does link to libcrypto, and it seems roughly identical (or close enough that I don't care, this is a live server doing tons of $stuff so has big error bars):

              md5sum              1.58s user 0.31s system 98% cpu 1.908 total
              ~/verify -a md5     1.59s user 0.13s system 99% cpu 1.719 total
              
              sha256sum           0.71s user 0.12s system 99% cpu 0.840 total
              ~/verify -a sha256  0.74s user 0.12s system 99% cpu 0.862 total
            
            Still wish it could do multi-core though; one reason I looked into this is because I wanted to check 400G of files and had 15 cores doing nothing (I know GNU parallel exists, but I find it hard to use and am never quite sure I'm using it correctly, so it's faster to write my own little Go program – especially for verifying files).
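
            For comparison, a minimal sketch of the multi-core idea using only find and xargs rather than GNU parallel or a custom program. The scratch tree, the worker count (-P 4), and the batch size (-n 2) are all assumptions for illustration.

```shell
# Hypothetical scratch tree standing in for the real files to check.
dir=$(mktemp -d)
for i in 1 2 3 4 5 6 7 8; do
    printf 'file %s\n' "$i" > "$dir/f$i"
done
sums=$(mktemp)

# -print0/-0 keep odd filenames intact; -P 4 runs up to four sha256sum
# processes at once, each given at most two files (-n 2).
find "$dir" -type f -print0 | xargs -0 -P 4 -n 2 sha256sum > "$sums"

# Re-check every recorded hash; --quiet suppresses the per-file OK lines.
if sha256sum -c --quiet "$sums"; then
    status="verified"
else
    status="mismatch"
fi
count=$(wc -l < "$sums")
echo "$count files hashed, status: $status"

rm -rf "$dir" "$sums"
```

            Parallel writers can in principle interleave output on very large batches, so a sturdier version might have each worker write its own sums file.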

            • collinfunk 6 hours ago

              Interesting, there must be something wrong here. Here is a benchmark using the same commit and default options other than adjusting '--with-openssl=[yes|no]':

                $ dd if=/dev/random of=input bs=1000 count=$(($(echo 10G | numfmt --from=iec) / 1000))
                10737418+0 records in
                10737418+0 records out
                10737418000 bytes (11 GB, 10 GiB) copied, 86.3693 s, 124 MB/s
                $ time ./src/sha256sum-libcrypto input 
                b3e702bb55a109bc73d7ce03c6b4d260c8f2b7f404c8979480c68bc704b64255  input
              
                real 0m16.022s
                $ time ./src/sha256sum-nolibcrypto input 
                b3e702bb55a109bc73d7ce03c6b4d260c8f2b7f404c8979480c68bc704b64255  input
              
                real 0m39.339s
              
              Perhaps there is something wrong with the detection on your system? As in, you do not have this at the end of './configure':

                $ grep -F 'HAVE_OPENSSL_' lib/config.h
                #define HAVE_OPENSSL_MD5 1
                #define HAVE_OPENSSL_MD5_H 1
                #define HAVE_OPENSSL_SHA1 1
                #define HAVE_OPENSSL_SHA256 1
                #define HAVE_OPENSSL_SHA3 1
                #define HAVE_OPENSSL_SHA512 1
                #define HAVE_OPENSSL_SHA_H 1

              • arp242 3 hours ago

                Sorry, I meant "roughly identical [to my Go program]", not "roughly identical [to the version without OpenSSL]". The ~/verify binary is my little Go program that is ~4 times faster on my Void system, but is of roughly equal performance on the Arch system, to check that coreutils is not slower than Go (when using OpenSSL). Sorry, I probably didn't make that too clear.

  • 17 hours ago
    [deleted]

  • flykespice 16 hours ago

    [dead]

  • samdoesnothing 14 hours ago

    If they would have used a modern, safe programming language like rust those errors would be impossible to... oh wait...

    Never mind.

    • arp242 13 hours ago

      No one has ever claimed Rust prevents all bugs. This is such a tired strawman trope.

      • spacechild1 9 hours ago

        There are people who claim that "once it compiles, it works". I've seen this quite a few times here on HN. Just a random example: https://news.ycombinator.com/item?id=45046182

        • acdha 9 hours ago

          That’s a far more nuanced comment than you’re portraying it as, especially as it’s appearing for exactly this scenario: the new dd is working as designed, it’s not segfaulting or corrupting data, but its design isn’t identical to the GNU version and that logic error is the kind of thing Rust can’t prevent short of AGI.

        • arp242 8 hours ago

          > "once it compiles, it works"

          That is not a quote from that post. I am very much not pedantic about only using quotation marks for quotes as long as it reasonably accurately gets the gist right, but in this case it very much doesn't.

          You are leaving out the qualified language of "generally", which completely changes what was said. And worse, the post explicitly acknowledges that it doesn't solve all bugs in the next sentence.

          And even if you can dig deep and find someone using unqualified language somewhere, I'm willing to bet a lot of money that this is an oversight and when pressed they will immediately admit so (on account of this being an internet forum and not a scientific paper, and people are careless sometimes). "I like coffee" rarely means "I always like coffee, all the time, without exception".

      • tssva 11 hours ago

        "No one has ever" regarding human actions is quite a bold claim to make in relation to anything.

        • arp242 11 hours ago

          Prove me wrong. Find someone saying that Rust will prevent all bugs.

      • samdoesnothing 9 hours ago

        Oh come on, tons of rust evangelicals claim that if it compiles, it works.

        • acdha 9 hours ago

          Consider that “works without crashing” and “works the way I had in mind” are not the same thing. Rust makes it easier to avoid logic bugs but if you think bs= should do X and there should have been a spec saying to do Y, it’s not something a language can prevent.

        • arp242 7 hours ago

          Who? When?

          Writing bugs in Rust is trivial and happens all the time. "do_stuff(sysv[1], sysv[2])" is a bug if you reversed the sysv arguments by accident. You can easily create a more complex version of that with a few "if" conditionals and interaction with other flags.

          There are many such silly things people can – and do – trivially get wrong all the time. Most bugs I've written are of the "I'm a bloody idiot"-type. The only way to have a fool-proof compiler is to know intent.

          What people may say is something like "if it compiles, then it runs", which is generally true, but doesn't mean it does the right thing (i.e. is free of bugs).