
Kent does plenty of things right during the development process, but having a decade of experience contributing to OpenZFS/ZFSOnLinux, I can say that I will be very surprised if everything goes as well as people seem to think. There are many bugs in a complex code base, and a simple 10x increase in userbase will undoubtedly result in more reports of people hitting them.

That said, I think he is laying a good foundation (from what I have heard/read about his development process), but there are many times when the best of us have been confident in code that turned out to have problems and I doubt he is an exception to this.



> he is laying a good foundation (from what I have heard/read about his development process)

Could you please share some blog or resource to read a bit about his dev process myself? That stuff usually interests me, to improve my own process.


It is mostly from external observation. He seems to be the reason that bcachefs isn't being shipped in production today, as he is trying to work through all of his backlog rather than shipping it in a WIP state. This is very different from how developers of a certain filesystem in Linux's kernel source tree that I shall not name do things. If I recall correctly, a developer of that filesystem told me several years ago that if they did not ship the code for users to use it, they would not find bugs. That filesystem is not much less buggy today than it was back then. :/


btrfs


So bcachefs is built on the basics of bcache, which existed on its own as a block layer and ran in production for many years. This is layering that doesn't really exist in other filesystems. Not sure if that's what the parent meant.


That is not quite what I meant. Not shipping premature code that you know needs more work is something to be admired. It is something I wish more developers would do.

Anyway, it has been a while since I read anything about bcachefs, but what I have read struck me as being consistent with doing things well. For example, he is working on having an automated test suite in place before he ships it, which is a great thing to see:

https://lwn.net/Articles/934692/


> That said, I think he is laying a good foundation (from what I have heard/read about his development process), but there are many times when the best of us have been confident in code that turned out to have problems and I doubt he is an exception to this.

For mixed storages, bcachefs will be interesting.

For RAID1, another option is any filesystem over dm-integrity over mdadm. dm-integrity can protect against silent file corruption if used below the mdadm level: when a read finds data that doesn't match its checksum, dm-integrity returns EILSEQ to mdadm, which then recovers the data from the other mirror.

It's done with a cryptsetup step, and explained on https://gist.github.com/MawKKe/caa2bbf7edcc072129d73b61ae781...

Main advantages:

1. it's a "bring your own filesystem" (ex: XFS, EXT4): you add protection against bitrot to any filesystem

2. dm-integrity may be newer, but mdadm and XFS have a large user base, making them well tested.

3. you can test this approach by simulating bitflips on the underlying data device, reading from the /dev/mapper entry, reading files themselves, doing a scrub etc.

4. you can select other algorithms besides crc32: in the rare case where an error slips past crc32 (which is likely already applied at the hardware level), you gain an extra layer of safety
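For the curious, the layering described above looks roughly like the following. This is a hypothetical sketch (device names and partition choices are illustrative, not from the linked gist); it must be run as root and it destroys any data on the member devices:

```shell
# 1. Add a dm-integrity layer to each member disk.
#    crc32 is the default checksum; other algorithms can be chosen
#    with --integrity (e.g. --integrity sha256).
integritysetup format /dev/sda1
integritysetup open /dev/sda1 int-a
integritysetup format /dev/sdb1
integritysetup open /dev/sdb1 int-b

# 2. Build the RAID1 mirror on top of the integrity devices, so that an
#    EILSEQ from either member lets mdadm repair from the other copy.
mdadm --create /dev/md0 --level=1 --raid-devices=2 \
      /dev/mapper/int-a /dev/mapper/int-b

# 3. Bring your own filesystem on top of the mirror.
mkfs.xfs /dev/md0
mount /dev/md0 /mnt
```

A scrub (`echo check > /sys/block/md0/md/sync_action`) then exercises every sector, surfacing any checksum mismatches for repair.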


The level of simplicity of ZFS has no equivalent in Linux; the whole LVM stack is a fuckup from a usability perspective: half a dozen distinct commands working on different layers that, more often than not, are purely conceptual. ZFS has been around for almost 20 years now. It was production ready more than 10 years ago, when it was open-sourced. Keeping your systems 1 or 2 versions behind the mainline is good practice for every mission-critical piece of software.


> Having your systems 1 or 2 versions behind the mainline is good practice in every mission-critical piece of software.

I do that, and I've even personally experienced a few rare ZFS bugs that seem due to interactions between ZFS and Western Digital firmwares.

Still, I was caught unprepared: I have several backups not stored on ZFS, but all of them were made FROM a ZFS source, meaning they are now all suspicious since silent corruption has been possible since version 2.1.4, and maybe even longer than that.

ZFS is practical to use, but for now I think I'll keep a history of file checksums, the way it was done before filesystems offered bitrot protection.
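Keeping such a checksum history is just a manifest recorded at backup time and re-verified later; corruption then shows up regardless of what the filesystem reports. A minimal sketch (paths are illustrative):

```shell
# Create some sample "backup" files (stand-ins for real data).
mkdir -p /tmp/cksum-demo && cd /tmp/cksum-demo
printf 'hello\n' > a.txt
printf 'world\n' > b.txt

# At backup time: record a checksum manifest alongside the data.
sha256sum a.txt b.txt > manifest.sha256

# Later: verify. Any silent bit flip makes the matching line report FAILED
# and the command exit non-zero, independent of the filesystem underneath.
sha256sum --check manifest.sha256
```

Keeping dated copies of the manifest gives a history, so you can tell not just that a file changed, but roughly when.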


Still, that could happen with any other FS. Silent corruption is actually far more common than winning the lottery, so in that regard ZFS is a good step toward reducing those odds, even if it's not zero (as advertised on the package).

I'm also assuming those backups aren't actually ZFS streams (from zfs send|receive) which is a special case of "bugs biting you twice" :P


I just benefited from ZFS boot environments/snapshots during the update of FreeBSD from 13.2 to 14.0-RELEASE. It's a great tool to have at your disposal.
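For readers who haven't used boot environments: on a ZFS-on-root FreeBSD system, `bectl` clones the whole system dataset before an upgrade, so a bad update can be rolled back by booting the old environment. A hypothetical sketch (the environment name is illustrative; requires FreeBSD with ZFS on root):

```shell
# Snapshot the running system as a new boot environment before upgrading.
bectl create 13.2-pre-upgrade

# List environments; the Active column shows which one is in use now (N)
# and which will be used on reboot (R).
bectl list

# Perform the upgrade on the live system as usual, e.g.:
#   freebsd-update -r 14.0-RELEASE upgrade

# If 14.0 misbehaves, activate the old environment and reboot into it:
#   bectl activate 13.2-pre-upgrade && shutdown -r now
```

The clone is nearly free in space terms until the upgraded system diverges from it, which is what makes this practical to do before every update.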


As someone who just installed OpenZFS on Linux, I agree with you totally.

ZFS just works, with no need to worry about layering device mappers and LVM. The automount is also very nice.



