RedoxFS Is the Default Filesystem of Redox OS, Inspired by ZFS
Posted 3 months ago · Active 3 months ago
doc.redox-os.org · Tech story · High profile
Key topics
Redox OS
RedoxFS
File Systems
The discussion revolves around RedoxFS, the ZFS-inspired default filesystem of Redox OS, and the community's concerns and questions about its design choices and potential issues.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 31m after posting
Peak period: 141 comments (Day 1)
Average per period: 26.7
Comment distribution: 160 data points (based on 160 loaded comments)
Key moments
- Story posted: Sep 25, 2025 at 5:25 PM EDT (3 months ago)
- First comment: Sep 25, 2025 at 5:56 PM EDT (31m after posting)
- Peak activity: 141 comments in Day 1 (hottest window of the conversation)
- Latest activity: Oct 8, 2025 at 10:13 PM EDT (3 months ago)
ID: 45379325 · Type: story · Last synced: 11/20/2025, 6:39:46 PM
It's much better to hope that OpenZFS decides to create a Redox OS implementation themselves than to try to make a clean-room ZFS implementation.
More importantly though, Linux or the Linux Foundation are unlikely to file a lawsuit without clear evidence of infringement, whereas Oracle by their nature will have filed lawsuits and a dozen motions if they catch even a whiff of possible infringement.
I wouldn't touch Oracle IP with a 50' fibreglass pole while wearing rubber boots.
Also, and again, the validity of a lawsuit isn't going to stop Oracle from filing it.
What I'm wondering is what about HAMMER2? It's under a copyfree license and it is developed for a microkernel operating system (DragonflyBSD). Seems like a natural fit.
[0] btrfs holds the distinction of being the only filesystem that has lost all of my data, and it managed to do it twice! Corrupt my drive once, shame on you. Corrupt my drive twice, can't corrupt my drive again.
[1] further explanation: The CDDL is basically "the GPL but it only applies to the files under the CDDL, rather than the whole project". So the code for ZFS would remain under the CDDL and it would have all the restrictions that come with that, but the rest of the code base can remain under MIT. This is why FreeBSD can have ZFS fully integrated whereas on Linux ZFS is an out-of-tree module.
Exact same drive? You might want to check that drive isn't silently corrupting data.
I still blame btrfs, something very similar happened to me.
I had a WD Green drive with a known flaw where it would just silently zero data on writes in some random situations. EXT4 worked fine on this drive for years (the filesystem was fine, my files had random zeroed sections). But btrfs just couldn't handle this situation and immediately got itself into an unrecoverable state; scrub and fsck just couldn't fix the issue.
In one way, I was better off. At least I now knew that drive had been silently corrupting data for years. But it destroyed my confidence in btrfs forever. Btrfs didn't actually lose any additional data for me, it was in RAID and the data was all still there, so it should have been able to recover itself.
But it simply couldn't. I had to manually use a hex editor to piece a few files back together (and restore many others from backup).
Even worse, when I talked to people on the #btrfs IRC channel, not only was nobody surprised that btrfs had borked itself due to bad hardware, but everyone recommended that a btrfs filesystem that had been borked could never be trusted. Instead, the only way to get a trustworthy, clean, and canonical btrfs filesystem was to delete it and start from scratch (this time without the stupid faulty drive).
Basically, btrfs appears to be not fit for purpose. The entire point of such a filesystem is that it should be able to run in adverse environments (like faulty hardware) and be tolerant to errors. It should always be possible to repair such a filesystem back to a canonical state.
Pretty sure all file systems and their developers are unsurprised by file system corruption occurring on bad hardware.
There are also drives that report a successful flush and FUA, but the expected (meta)data is not yet on stable media. That results in out-of-order writes. There's no consequence unless there's a badly timed crash or power failure; in that case there are out-of-order writes and possibly dropped writes (whatever was left in the write cache).
File system developers have told me that their designs do not account for drives miscommunicating that a flush/FUA succeeded when it hasn't. This is like operating under nobarrier some of the time.
Overwriting file systems keep their metadata at fixed locations, therefore quite a lot of assumptions can be made during repair about what should be there, inferring it from metadata in other locations.
Btrfs has no fixed locations for metadata. This leads to unique flexibility, and to repair difficulty. Flexible: being able to convert between different block group profiles (single, dup, and all the RAIDs), run on unequal-sized drives, and convert from any file system anybody wants to write the code for, because only the per-device super blocks have fixed locations. Everything else can be written anywhere else. But the repair utility can't make many assumptions. And if the story told by the metadata that is present isn't consistent, the repair necessarily must fail.
With Btrfs the first step is read-only rescue mount, which uses backup roots to find a valid root tree, and also the ability to ignore damaged trees. This read-only mount is often enough to extract important data that hasn't been (recently) backed up.
Since moving to Btrfs by default in Fedora almost 10 releases ago, we haven't seen more file system problems. One problem we do see more often is evidence of memory bitflips. This makes some sense because the file system metadata isn't nearly as big a target as data. And since both metadata and data are checksummed, Btrfs is more likely to detect such issues.
All I want is an fsck that I can trust.
I love that btrfs will actually alert me to bad hardware. But then I expect to be able to replace the hardware and run fsck (or scrub, or whatever) and get back to the best-case healthy state with minimal fuss. And by "healthy" I don't mean ready for me to extract data from, I mean ready for me to mount and continue using.
In my case, I had zero corrupted metadata, and a second copy of all data. fsck/scrub should have been able to fix everything with zero interaction.
If files/metadata are corrupted, fsck/scrub should provide tooling for how to deal with them. Delete them? Restore them anyway? Manual intervention? IMO, failure is not a valid option.
How does Fedora measure the rate of file system problems?
This isn't a problem integrating MIT code into a GPL project, because MIT's requirements are a subset of the GPL's requirements, so the combined project being under the GPL is no problem. (Going the other way by integrating GPL code into an MIT project is technically also possible, but it would convert that project to a GPL project, so most MIT projects would be resistant to this.)
This isn't a problem combining MIT and CDDL because both lack the GPL's virality. They can happily coexist in the same project, leaving each other alone.
(obligatory: I am not a lawyer)
The problem is, IIRC, that GPLv2 does not allow creating a combined work where part of it is covered by license that has stricter requirements than GPL.
This is the same reason why you can't submit GPLv3 code to Linux kernel, because GPLv3 falls under the same issue, and IIRC even on the same kind of clause (mandatory patent licensing)
That is part of the virality: the combined work is under the terms of the GPL and therefore cannot have additional restrictions placed on it. If the GPL wasn't viral then the GPL code and CDDL code would both be under their respective licenses and leave each other alone. The GPL decided to apply itself to the combined work which causes the problems.
This lack of required reciprocity and the virtuous-sounding "leave each other alone" is no virtue at all. At least it doesn't harm anyone else, which is great, but it's also shooting itself in the foot and a waste.
WinBtrfs [1], a reimplementation of btrfs from scratch for Windows systems, is licensed under the LGPL v3. Just because the reference implementation uses one license doesn't mean that others must use it too.
[1] https://github.com/maharmstone/btrfs
There certainly is a continuum. I've always wanted to build a microkernel-ish system on top of Linux that only has userspace options for block devices, file systems, and TCP/IP. It would be dog-slow but theoretically work.
[1] https://www.redox-os.org/
Curious about the details behind those compatibility problems.
The whole ARC thing for example, sidestepping the general block cache, feels like a major hack resulting from how it was brutally extracted from Solaris at the time...
The way zfs just doesn't "fit" was why I had hope for btrfs... ZFS is still great for a file server, but wouldn't use it on a general purpose machine.
For instance, using the `zfs` tool one not only configures file system properties but also controls NFS exports, which traditionally was done using /etc/exports.
The ZFS and ZPOOL tools provide access to multiple different subsystems in ways that make more sense to the end user, a lot like LVM and LUKS do on top of device mapper these days.
ZFS sidestepping conventional device and mount handling with the way it "imports"/"exports" ZFS pools, usually auto-mounting all datasets straight to your system root by default - if you have a dataset named "tank", it automounts to "/tank".
ZFS operation itself being an inexact science of thousands of per-dataset flags and tunables (again not set through the common path with mount flags), and unless you run something like TrueNAS that sets them for you it's probably best to pretend it's not there.
Common configurations with decent performance often involve complexities like L2ARC.
It's far too invasive and resource consuming for a general purpose machine, and does not provide notable benefit there. A dedicated file server won't care about the invasiveness and will benefit from the extra controls.
btrfs fell short by still not having its striped RAID game under control, only being "production grade" on single disks and mirrors for the longest time - probably still?
One huge problem with ZFS is that there is no zero copy due to the ARC wart. Eg, if you're doing sendfile() from a ZFS filesystem, every byte you send is copied into a network buffer. But if you're doing sendfile from a UFS filesystem, the pages are just loaned to the network.
This means that on the Netflix Open Connect CDN, where we serve close to the hardware limits of the system, we simply cannot use ZFS for video data due to ZFS basically doubling the memory bandwidth requirements. Switching from UFS to ZFS would essentially cut the maximum performance of our servers in half.
Better to just have the filesystem get out of the way and just focus on being good at raw I/O scheduling.
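For readers unfamiliar with the call in question, here is a minimal sketch of a FreeBSD-style sendfile() invocation (the path, descriptor names, and error handling are illustrative only, not Netflix's code); whether the file's pages are loaned to the socket or copied out of the ARC is decided entirely below this API:

```c
/*
 * Hypothetical sketch: push a whole file to a connected socket using the
 * FreeBSD sendfile(2) interface.  The copy-vs-loan behaviour described
 * above happens inside the kernel, not in this code.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <fcntl.h>
#include <unistd.h>

static int serve_file(int client_fd, const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    off_t sent = 0;
    /* nbytes == 0 means "send until EOF"; sent reports bytes transmitted. */
    int rc = sendfile(fd, client_fd, 0, 0, NULL, &sent, 0);

    close(fd);
    return rc;
}
```

On UFS the pages backing the file can be handed to the network stack as-is; on ZFS the same call ends up copying data out of the ARC, which is the doubling of memory bandwidth described above.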
I wonder if FreeBSD is going to get something io_uring-esque. That's one of the more interesting developments in kernel space...
FreeBSD has "fire and forget" behavior in the context of several common hot paths, so the need for io_uring is less urgent. Eg, sendfile is "fire and forget" from the application's perspective. If data is not resident, the network buffers are pre-allocated and staged on the socket buffer. When the io completes, the disk interrupt handler then flips the pre-staged buffers (whose pages now contain valid data) to "ready" and pokes the TCP state machine.
Similarly, FreeBSD has an OpenBSD-inspired splice which is fire-and-forget once two sockets are spliced.
But to be fair, we are approaching the point where spinning rust stops making sense for even the remaining use cases, and so designing new optimizations specifically for it might be a bit silly now.
0. https://genode.org/
Here is a talk about that porting effort:
https://m.youtube.com/watch?v=N624i4X1UDw
Note the binaries are not specific to the kernel, so anything built for Genode will work on Genode systems of compatible ISA irrespective of kernel.
I am surprised to hear 2008, I could swear they have been active far longer. Maybe I am conflating it with TUD:OS.
They are indeed quite active. Just see their backlog of release notes. They release 4 times a year, on the clock, and always document what they've done.
I agree, can’t say “dead” but it is a Google project so it’s like being born with a terminal condition.
Some new things that I can think of off the top of my head:
* More complete support for Linux emulation via Starnix.
* Support for system power management.
* Many internal platform improvements, including a completely overhauled logging system that uses shared memory rather than sockets.
Most project happenings are not that interesting to the average person because operating system improvements are generally boring, at least at the layers Fuchsia primarily focuses on. If you've worked in the OS space, though, a lot of what Fuchsia is doing is really cool.
Look at their other "Open Source" projects like Android to understand why they would want to ensure they avoid GPL code. It's all about control, and the appearance of open source through gaslighting with source-available code.
I admire these projects & the teams for their tenacity.
Four bells! Damn the torpedoes.
MIT is for education not cooperation.
It can mean both of the options might be sane (reasonable); one is just more reasonable. It might also mean both of the options are insane (unreasonable); one is just less so.
That doesn't mean that I wouldn't rather see some form of copyleft in place (like the MPLv2), or at least a licence with some kind of patent protection baked in (like the Apache 2.0); the X11/MIT licences are extremely weak against patent trolls.
That we have Linux as we have it today is the result of
- being under GPL
- having a large enough and diverse enough group of contributors to make re-licensing practically impossible
- no CLA, no Copyright assignment
Unfortunately, the license is seen as tainted by all businesses, and plenty of OSes are already seen as Linux alternatives in some spaces.
In others, Android is the only one being used, where the only thing left from GPL is the Linux kernel itself, and only because Fuchsia kind of went nowhere, besides some kitchen devices.
Anyone have an idea what this actually means and what problems they were having in practice?
[0] https://vermaden.wordpress.com/2022/03/25/zfs-compatibility/
[1] https://github.com/openzfsonwindows/openzfs
i.e. ZFS isn't just a file system. It's a volume manager, RAID, and file system rolled into one holistic system, versus for example LVM + MD + ext4.
And (again, I'm only speculating) in their microkernel design they want to have individual components running separately to layer together a complete solution.
No, ZFS is not "monolithic".
It's just that on the outside you have a well-integrated user interface that does not expose you to the SPA (block layer), ZIO (I/O layer, that one is a bit intersectional but still a component others call), DMU (object storage), and finally ZVOL (block storage emulated over DMU) and ZPL (POSIX-compatible filesystem on top of DMU) or Lustre-ZFS (Lustre metadata and object stores implemented on top of DMU). There are also a few utility components that are effectively libraries (AVL trees, a key-value data serialization library, etc.).
I read some blog posts back in the day about why they did this and it sounded a lot like those layers were more historical accidents or something.
You can turn it around and say that ZFS is a full stack filesystem (or vertically integrated if you will) and it should be pretty obvious that a rethink on that level can have big advantages.
32-bit inodes? Why?
Other systems had to go through pains to migrate to 64-bit. Why not skip that?
That said, while it's compatible with Linux via fuse, unless you're helping to build RedoxOS, I don't think there's any real expectation that you would try it.
Isn't writing a robust file system something that routinely takes on the order of decades? E.g. reiserfs, bcachefs, btrfs.
Not to rain on anyone's parade. The project looks cool. But if you're writing an OS, embarking on a custom ZFS-inspired file system seems like the ultimate yak shaving expedition.
In what sense?
The reason I ask is because I'm trying to tease out if you have architectural problems with the way the filesystem is designed, or if you simply think it's unreliable.
Building a filesystem with such a large feature set is a non-trivial task, and btrfs is one of very few that made it, so it is an absolute success story.
An internet community project to write an entire operating system from scratch using some newfangled programming language is literally the final boss of yak shaving. There is no reason to do it other than "it's fun", and of course writing a filesystem for it would be fun.
For me, the things that would make it just perfect would be more ergonomic Cap'n Proto support (eliminate a ton of fiddly code for on disk data structures), and dependent types.
I suspect the Linux stuff would be far more space- and time-efficient, but we won't know until projects like this mature more.
Now, the engineering effort required to rewrite or redo in Rust, that's a different story of course.
It's _so_ _much_ _better_. I can use async, maps, iterators, typesafe deserialization, and so on. All while not using any dynamic allocations.
With full support from Cargo for repeatable builds. It's night and day compared to the regular Arduino landscape of random libraries that are written in bad pseudo-object-oriented C++.
Calling it "optimized" is a stretch. A veeeery big one. The low-level code in some paths is highly optimized, but the overall kernel architecture still bears the scars of C.
The most popular data structure in kernel land is the linked list, aka the most inefficient structure for modern CPUs. It's so popular because it's the only data structure that is easy to use in C.
The most egregious example is the very core of Linux: the page struct. The kernel operates on the level of individual pages. And this is a problem in case you need _a_ _lot_ of pages.
For example, when you hibernate the machine, the hibernation code just has a loop that keeps grabbing swap-backed pages one by one and writing the memory to them. There is no easy way to ask: "give me a list of contiguous free page blocks". Mostly because these kinds of APIs are just awkward to express in C, so developers didn't bother.
There is a huge ongoing project to fix it (folios). It's been going for 5 years and counting.
Is this reasoning really true? A quick search reveals the availability of higher-level data structures like trees, flexible arrays, hashtables, and the like, so it's not as if the Linux kernel is lacking in data structures.
Linked lists have a few other advantages - simplicity and reference stability come to mind - but they might have other properties that make them useful for kernel development beyond how easy they are to create.
I don't understand that statement. Linked lists are no easier or harder to use than other data structures. In all cases you have to implement it once and use it anywhere you want?
Maybe you meant that linked lists are the only structure that can be implemented entirely in macros, as the kernel likes to do? But even that wouldn't be true.
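For concreteness, the pattern these comments are circling is the intrusive list: the link lives inside the object, and a container_of()-style macro recovers the enclosing struct, so one tiny definition serves every type that embeds it. A simplified, self-contained sketch (illustrative names, not the kernel's actual list.h API):

```c
/* Minimal sketch of an intrusive singly-linked list in the style the
 * kernel popularized; simplified for illustration. */
#include <stddef.h>
#include <stdio.h>

struct list_node {
    struct list_node *next;
};

/* Recover the enclosing struct from a pointer to its embedded node. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct request {
    int id;
    struct list_node link;   /* the list "lives inside" the object */
};

int main(void)
{
    struct request a = { .id = 1 }, b = { .id = 2 };
    struct list_node head = { NULL };

    /* Push both requests; no allocator or generic container needed. */
    a.link.next = head.next; head.next = &a.link;
    b.link.next = head.next; head.next = &b.link;

    for (struct list_node *n = head.next; n != NULL; n = n->next)
        printf("request %d\n", container_of(n, struct request, link)->id);

    return 0;
}
```

Whether that convenience justifies the pointer-chasing cost on modern CPUs is exactly the trade-off being debated above.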
We see this in the compiler frameworks that predated the C++ standard, and whose ideas lived on in Java and .NET afterwards.
Too many C++ libraries are written for a C++ audience when they could be just as ergonomic as in other high-level languages; being C++, there could be a two-level approach, but unfortunately that isn't how it rolls.
There need to be projects like that for any kind of innovation to happen.
Why leave it at that? One day it could be production-ready.
The filesystem is so deeply connected to the OS I bet there's a lot of horror around swapping those interfaces. On the contrary, I've never heard anything bad about DragonflyBSD's HAMMER. But it's basically assumed you're using DragonFlyBSD.
Would I keep a company's database on a new filesystem? No, nobody would know how to recover it from failed disk hardware.
This isn't really my area but a Rust OS using a ZFS-like filesystem seems like a lot of classic Linux maintainer triggers. What a funny little project this is. It's the first I've heard of Redox.
Edit: reminds me of The Tarpit chapter from The Mythical Man-Month.
> The fiercer the struggle, the more entangling the tar, and no beast is so strong or so skillful but that he ultimately sinks.
Filesystems, as complex as they are, aren't full of traps like encryption is. Still plenty of subtle traps, don't get me wrong: you have to be prepared for all kinds of edge cases like the power failing at exactly the wrong moment, hardware going flaky and yet you have to somehow retrieve the data since it's probably the only copy of someone's TPS report, that sort of thing. But at least you don't have millions of highly-motivated people deliberately trying to break your filesystem, the way you would if you rolled your own encryption.
The only connection is that writing custom encryption is a thing that smart people like to try their hand at, but its success is defined by the long tail of failure cases not by the cleverness of the happy path. I agree 100% with what rmunn said.
As I said, I'm not a filesystem person, but my sense is that filesystem difficulty is also dominated by the long tail of failure cases, and for similar reasons. Failure in encryption means you lose control of your data; failure in filesystems means you lose your data (or maybe you lose liveness/performance) [0]
But really I just meant it in the sense that it's a journey people often go down underestimating just how long it takes. So it's a sort of trap from the project management perspective.
> I'm confused about what 'rolling your own encryption' means at an abstraction level
It cuts through many abstractions. You should definitely not define your own crypto primitives. You also shouldn't define your own login flow. You shouldn't design a custom JWT system, etc. You probably shouldn't write your own crypto library unless there's not one in your language, in which case you should probably be wrapping a trusted library in C or C++, etc. The higher you go in abstraction, the more it's okay to design an alternative. But any abstraction can introduce a weakness so the risk is always there.
[0] Ordinarily you still have backups, which makes file system failures potentially less final than encryption failures. But what if the filesystem holding your backup root keys fails? Then the encryption wasn't a failure, but you've potentially crypto-shredded your entire infrastructure.
What choice of paint you throw into the tarpit makes zero difference.
If you're going to have to write it from scratch (since rust), might as well make your own.
If your goal is to gain wide adoption fast, that is a bad idea.
Yeah, to clarify, that was how I took it. I'm not concerned about Redox OS failing to get adoption. And I love that they're exploring new things and following their interests.
I was thinking more as someone with ADHD and widespread interests. It feels a bit like saying "I'm learning the oboe. But first I want to learn how to make an oboe." Both are cool goals, but if you do the second you may never get to the first in a meaningful way.
That may be okay; I was just intending to express surprise at a detour that could last a significant fraction of the development team's lives. Especially since, as the default file system, delays or breakages in it can delay or break the OS project as a whole.
* inotify implementation is insufficiently atomic, so event subscribers can fail to receive notifications under certain conditions.
* License/patent encumbrance prevents modern operating systems from implementing/distributing a common next-gen FS.
* ZFS has native encryption in theory, but it was bolted on later and has numerous buggy interactions with the powerful zfs send/recv.
* ZFS native encryption is architecturally incapable of protecting metadata like filenames.
There is lots of room for innovation, and with modern tools it shouldn't take decades to build a production-ready driver for any one ISA (presumably x86_64). If this project wants to pilot a superior option, I'm all for it!
* RedoxFS is MIT licensed
* RedoxFS supports all features with or without full disk encryption
* RedoxFS encryption includes all metadata
The people who just think OSes would be better with more Rust in them, but aren't looking to reinvent from first principles are in the Rust for Linux universe.
And you know what, that's fine. Linux started out as a hobby project with similar origins before it became a big serious OS.
Worst case, this doesn't work. Best case, this works amazingly well. I think there's some valid reason for optimism here given other hard things that Rust has been used for in the past few years.
[1] https://en.m.wikipedia.org/wiki/HAMMER_(file_system)
Rust people love doing that, let them. Let them introduce logic bugs across the codebase. Sounds fun.
https://gitlab.redox-os.org/redox-os/redoxfs/-/commit/bc2d90...
at the end of the page should read
`fusermount3 -u ./redox-img`
[0] https://developer.apple.com/support/downloads/Apple-File-Sys...
Sorry for commenting this here, Redox is using a private gitlab instance I have no access to.
This would be a significant problem with my use case in the very near future. I already have double-digit-TB files, and that doesn't look like much margin on top of that.
20 more comments available on Hacker News