Ubuntu 25.10's Rust Coreutils Is Causing Major Breakage for Some Executables
Mood: heated
Sentiment: negative
Category: other
Key topics: Ubuntu, Rust, Coreutils
Ubuntu 25.10's adoption of Rust Coreutils is causing breakage for some executables, sparking debate about the readiness of Rust Coreutils and its implications for the Linux ecosystem.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 17m
Peak period: 74 comments (Day 1)
Avg / period: 15.8
Based on 79 loaded comments
Key moments
- Story posted: Sep 26, 2025 at 12:04 PM EDT (2 months ago)
- First comment: Sep 26, 2025 at 12:21 PM EDT (17m after posting)
- Peak activity: 74 comments in Day 1 (hottest window of the conversation)
- Latest activity: Oct 6, 2025 at 4:02 PM EDT (about 2 months ago)
But it actually appears to be an issue with dd, not md5sum - https://github.com/VirtualBox/virtualbox/issues/226#issuecom...
The bs/ibs/obs options don't "skip" anything, but determine how much data to buffer in memory at a time while transferring. Regardless, it's hard to fathom how something this simple got messed up, especially considering that the suite supposedly has good test coverage and has been getting close to a full green bar.
So it is a bug. bs is one thing, ibs is another.

bs=BYTES
    read and write up to BYTES bytes at a time (default: 512);
    overrides ibs and obs
As described, the script should have worked as is, and the problem is in the handling of the dd options. (But I didn't verify the accuracy of the description.)

$ echo -e "\00" | md5sum
8f7cbbbe0e8f898a6fa93056b3de9c9c  -
$ echo -e "\00\00" | md5sum
a4dd23550d4586aee3b15d27b5cec433  -
`dd` is for copying all the bytes of a source (unless you explicitly set a limit with the `count` option), regardless of whether they're zero. It's fundamentally not for null-terminated strings but arbitrary binary I/O. In fact, "copying from" /dev/zero is a common use case. It seems frankly implausible that the `dd` implementation is just stopping at a null byte; that would break a lot of tests and demonstrate a complete, fundamental misunderstanding of what's supposed to be implemented.
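A minimal sketch of both points (status=none is a GNU dd option that merely silences the transfer summary): count, not bs, is what limits how much is copied, and null bytes flow through like any other byte:

$ dd if=/dev/zero bs=512 count=4 status=none | wc -c
2048
$ dd if=/dev/zero bs=512 count=4 status=none | md5sum   # a well-defined digest of 2048 null bytes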
> Not sure how to run the rust version but my md5sum seems to care how many null bytes there are.
Yes, the md5 algorithm also fundamentally operates on and produces binary data; `md5sum` just converts the bytes to a hex dump at the end. The result you get is expected (edit: modulo hiccuphippo's correct observation), because the correct md5 sum changes with every byte of input, even if that byte has a zero value.
Add -n to echo to avoid the new line.
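Concretely, the trailing newline is an extra input byte, so the two digests differ (outputs elided):

$ echo 'a' | md5sum     # hashes "a\n", two bytes
$ echo -n 'a' | md5sum  # hashes "a", one byte; a different digest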
Only among those who don't understand that, if this is a problem, then it is a Canonical problem, not a Rust problem.
To give another example, Canonical also includes ZFS in Ubuntu. And, for a while, it shipped a broken snapshot mechanism called zsys. Canonical ultimately ripped zsys out because it didn't work very well: it would choke on more than 4,000 snapshots, among other problems. zsys was developed in Go, while other snapshot systems, developed in Perl and Python, did a little less and worked better.
Now, is zsys a Go problem? Of course not. It wasn't ready, because Canonical sometimes ships broken stuff.
(This is hard to express in a careful way where I'm confident of not offending anyone. Please take me at my word that I'm not trying to take sides in this at all.)
The dominant narrative behind this pushback, as far as I can tell, is nothing to do with the Rust language itself (aside perhaps from a few fringe people who see the adoption of Rust as some kind of signal of non-programming-related politics, and who are counter-signaling). Rather, the opposition is to re-implementing "working" software (including in the sense that nobody seems to have noticed any memory-handling faults all this time) for the sake of seemingly nebulous benefits (like compiler-checked memory safety).
The Rust code will probably also be more maintainable by Rust developers than the C code currently is by C developers given the advantages of Rust's language design. (Unless it turns out that the C developers are just intrinsically better at programming and/or software engineering; I'm pretty skeptical of that.) But most long-time C users are probably not going to want to abandon their C expertise and learn Rust. And they seem to outnumber the new Rust developers by quite a bit, at least for now.
Difficult to say with certainty, because it's easy to dress "political" resistance in respectable preference for stability. (Scare quotes because it's an amalgam in which politics is just a part.) Besides, TFA is Phoronix, whose commentariat is not known for subtlety on this topic.
Replacing coreutils is risky because of the decades of refinement/stagnation (depending on your viewpoint) which will inevitably produce snags when component utilities interact in ways unforeseen by tests -- as has happened here. But without risk there's no reward. Of course, what's the reward here is subject to debate. IMO the self-evident advantage of a rewrite is that it's prima facie evidence of interest in using the language, which is significant if there's a dearth of maintainers for the originals. (The very vocal traditionalists are usually not in a hurry to contribute.)
I understand the argument, and it sounds good as far as most things go, but it misses an important fact: in OSS, you can and should find your own bliss. If you want to learn Rust, as I did, you can do it by reimplementing uutils' sort and ls, and fixing bugs in cp and mv, as I did. That was my bliss. OSS doesn't need to be useful to anyone. OSS can be a learning exercise, or it can be simply for love of the game.
The fact that Canonical wants to ship it, right now, simply makes them a little silly. It doesn't say a thing about me, or Rust, or Rust culture.
If you can afford it, sure. Some would really prefer to at least be able to get some attention (and perhaps a paid job) this way.
Not that I agree, but people seem to be giving uutils lots of attention right now? A. HN front page vs. B. obscure JS framework? I'll take door "A"?
I had someone contact me for a job simply because my Rust personal project had lots of stars on GitHub. You really don't know what people will find interesting.
Because X11 had a lot of issues that got papered over with half-baked extensions and weird interfaces to the kernel.
The problem is that Wayland didn't feel like doing the work to make fundamental things like screen sharing, IMEs, copy-paste, and pointer warping actually ... you know ... work.
The problem Wayland now has is that they're finally reaching something usable, but they took so long that the assumptions they made nearly 20 years ago are becoming as big a problem as the issues that were plaguing X11 when Wayland started. However, the sunk cost fallacy means that everybody is going to keep pounding on Wayland rather than throwing it out and talking to graphics cards directly.
And client-rendered decorations were always just a mind-bogglingly stupid decision, but that's a GNOME problem rather than a Wayland issue.
Rust is trying to systemically improve safety and reliability of programs, so the degree to which it succeeds is Rust's problem.
OTOH we also have people interpreting it as if Rust was supposed to miraculously prevent all bugs, and they take any bug in any Rust program as a proof by contradiction that Rust doesn't work.
Isn't that an unfalsifiable statement until the coreutils get written in another language and can be compared?
Sounds pretty axiomatic: Rust is not to blame for someone else's choice to ship beta software?
GNU coreutils first shipped in, what, the 1980s? It's so old that it would be very hard to find the first commit. uutils, by contrast, is still beta software that never asked to be representative of "Rust" at all. Moreover, GNU coreutils themselves are still sometimes not compatible with their UNIX forebears; even by that more modest standard of compatibility, it is ridiculous to hold this software, in particular, to it.
I don't mean to tell Rust uutils authors not to write a project they wanted, but I don't see why Canonical was so eager to switch, given that there are non-zero switching costs for others.
If anyone has original Fileutils, Shellutils, or Textutils archives (released before the ones currently on GNU's ftp server), I would be interested in looking at them. I looked into this recently for a commit [1].
[1] https://www.mail-archive.com/coreutils@gnu.org/msg12529.html
That is the narrative that Rust fanboys promote. AFAIK Rust could be useful for a particular kind of bug (memory safety). Rust programs can also have coding errors or other bugs.
Strawmanning is not a good look.
Yeah, that's such a tired take. If anything, this shows how good Rust's guarantees are. We had a bunch of non-experts rewrite a sizable number of tools that had 40 years of bugfixes applied, and Canonical just pulled the rewritten versions in all at once, yet there are mostly just a few performance regressions on edge cases.
I find this such a great confirmation of the Rust language design. I've seen a few rewrites in my career, and it rarely goes this smoothly.
Does this have anything at all to do with the kernel?
There are plenty of ecosystems where programs declare a specific library implementation they expect to call into (Rust, Python, Npm, Ruby, Perl, ...), often even constrained by versions. But also, if you depend on libcurl, you are only going to have to deal with multiple versions of the same implementation (which you can still constrain in e.g. pkg-config).
In shell scripting you have to deal with stuff like "in nc, EOF on stdin is undefined behavior and implementation specific".
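To make that concrete, a hedged sketch (example.com is a placeholder, and flag spellings vary by implementation, so check your local man page): the same one-liner needs different options depending on which nc is installed:

$ printf 'GET / HTTP/1.0\r\n\r\n' | nc -q 1 example.com 80   # traditional/Debian nc: quit 1s after stdin EOF
$ printf 'GET / HTTP/1.0\r\n\r\n' | nc -N example.com 80     # OpenBSD nc: shutdown(2) the socket on stdin EOF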
"Interim releases will introduce new capabilities from Canonical and upstream open source projects, they serve as a proving ground for these new capabilities." https://ubuntu.com/about/release-cycle
I don't follow, can you explain why?
1. This is an inevitable problem that is being handled in a sensible manner by competent engineers.
2. X company is stupid and their engineers are stupid; only someone as smart as I am would be capable of doing it right.
It says a lot about the mental maturity of each participant. Not a single comment is "Maybe I don't know enough about this to voice an informed opinion", although that's probably a good indicator.
Survivorship bias.
https://www.phoronix.com/forums/forum/phoronix/latest-phoron...
It seems like text-based forums using upvotes/likes or reactions encourage those who are less inquisitive and/or humble to take up a lot of the atmosphere.
It got me thinking that the internet today has more people on it but fewer forums to engage with technical topics in depth.
With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody.

Never mind.
That is not a quote from that post. I am very much not pedantic about only using quotation marks for quotes as long as it reasonably accurately gets the gist right, but in this case it very much doesn't.
You are leaving out the qualified language of "generally", which completely changes what was said. And worse, the post explicitly acknowledges that it doesn't solve all bugs in the next sentence.
And even if you can dig deep and find someone using unqualified language somewhere, I'm willing to bet a lot of money that this is an oversight and when pressed they will immediately admit so (on account of this being an internet forum and not a scientific paper, and people are careless sometimes). "I like coffee" rarely means "I always like coffee, all the time, without exception".
Writing bugs in Rust is trivial and happens all the time. "do_stuff(sysv[1], sysv[2])" is a bug if you reversed the sysv arguments by accident. You can easily create a more complex version of that with a few "if" conditionals and interaction with other flags.
There are many such silly things people can – and do – trivially get wrong all the time. Most bugs I've written are of the "I'm a bloody idiot"-type. The only way to have a fool-proof compiler is to know intent.
What people may say is something like "if it compiles, then it runs", which is generally true, but doesn't mean it does the right thing (i.e. is free of bugs).
At any rate, small teething issues aside, long-term things should be better and faster.
GNU Coreutils uses the OpenSSL implementation of hashes by default, but some distributions have disabled it using './configure --with-openssl=no'. Debian used to do this, but now links to OpenSSL.
$ ldd /usr/bin/cksum
linux-vdso.so.1 (0x00007fb354763000)
libc.so.6 => /usr/lib64/libc.so.6 (0x00007fb354549000)
/lib64/ld-linux-x86-64.so.2 => /usr/lib64/ld-linux-x86-64.so.2 (0x00007fb354765000)
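Note that no libcrypto line appears above, i.e. this build does not use OpenSSL. A quick, hedged way to check any given binary (assuming the OpenSSL library is packaged as libcrypto on your system):

$ ldd /usr/bin/md5sum | grep libcrypto || echo 'not linked against OpenSSL'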
For context, I am a committer to GNU Coreutils. We have used OpenSSL by default for a few years now [1]. Previously it was disabled by default because the OpenSSL license is not compatible with the GPL [2]. This was resolved when OpenSSL switched to the Apache License, Version 2.0 in OpenSSL 3.0.0.

If the Void people wanted to enable OpenSSL, they would probably just need to add openssl (or whatever they package it as) to the package dependencies.
[1] https://github.com/coreutils/coreutils/commit/0d77e1b7ea2840... [2] https://www.gnu.org/licenses/license-list.html#OpenSSL
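A hedged sketch of enabling it at build time (recent gnulib also accepts --with-openssl=auto; treat the exact values as an assumption and consult ./configure --help):

$ ./configure --with-openssl=yes
$ make && make check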
I also have an Arch machine where it does link to libcrypto, and it seems roughly identical (or close enough that I don't care, this is a live server doing tons of $stuff so has big error bars):
md5sum 1.58s user 0.31s system 98% cpu 1.908 total
~/verify -a md5 1.59s user 0.13s system 99% cpu 1.719 total
sha256sum 0.71s user 0.12s system 99% cpu 0.840 total
~/verify -a sha256 0.74s user 0.12s system 99% cpu 0.862 total
Still wish it could do multi-core though; one reason I looked into this is because I wanted to check 400G of files and had 15 cores doing nothing. (I know GNU parallel exists, but I find it hard to use and am never quite sure I'm using it correctly, so it's faster to write my own little Go program, especially for verifying files. See the xargs sketch after the timings below for one alternative.)

$ dd if=/dev/random of=input bs=1000 count=$(($(echo 10G | numfmt --from=iec) / 1000))
10737418+0 records in
10737418+0 records out
10737418000 bytes (11 GB, 10 GiB) copied, 86.3693 s, 124 MB/s
$ time ./src/sha256sum-libcrypto input
b3e702bb55a109bc73d7ce03c6b4d260c8f2b7f404c8979480c68bc704b64255 input
real 0m16.022s
$ time ./src/sha256sum-nolibcrypto input
b3e702bb55a109bc73d7ce03c6b4d260c8f2b7f404c8979480c68bc704b64255 input
real 0m39.339s
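On the multi-core wish a few comments up, a hedged sketch of fanning checksums across cores with find and xargs rather than GNU parallel (-P 15 matches the idle cores mentioned; the batch size -n 32 and the /data path are arbitrary placeholders):

$ find /data -type f -print0 | xargs -0 -n 32 -P 15 md5sum > checksums.md5
$ md5sum -c checksums.md5   # verification itself stays single-threaded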
Perhaps there is something wrong with the detection on your system? As in, you do not have this at the end of './configure':

$ grep -F 'HAVE_OPENSSL_' lib/config.h
#define HAVE_OPENSSL_MD5 1
#define HAVE_OPENSSL_MD5_H 1
#define HAVE_OPENSSL_SHA1 1
#define HAVE_OPENSSL_SHA256 1
#define HAVE_OPENSSL_SHA3 1
#define HAVE_OPENSSL_SHA512 1
#define HAVE_OPENSSL_SHA_H 1

I thought I remembered Go having pretty optimized assembly for crypto routines. But I have not used the language much. If you have your source code uploaded somewhere, I'd be interested to have a look to see what I am missing out on.
https://github.com/uutils/coreutils/issues/8033
The other one was an edge-case with dangling symlinks:
https://github.com/uutils/coreutils/issues/8044
Both got fixed promptly, after taking the time to report them properly. :)