Two quotes from the paper that I think will motivate people to read it: "Rigorou...

jeffbee · on June 7, 2021

Ehhhh, doesn't align with my experience. I think FDB is actually really poorly tested. When I was evaluating it as a replacement for the metadata key-value store at a major, public web services company, we found that injecting faults into virtual NVMe devices on individual replicas would cause corrupt results to be returned to clients. We also found that it would just crash-loop on Linux systems with huge pages, because although someone from the project had written a huge-page-aware C++ allocator "for performance", evidently nobody had ever actually tried to use it, including the author.

It's also really, really weird that their non-scalable architecture hits a brick wall at 25 machines. Ignoring the correctness flaws, it only works if you can either design around that limit by sharding and never offer cross-shard transactions, or if you can assure yourself that your use case will never outgrow half a rack of equipment.

fnordpiglet · on June 7, 2021

Can you fix a point in time? Software evolves, and I think one point I saw is that it wasn’t well tested at first, and that changed once production workloads told them it needed to change.

rbranson · on June 7, 2021

Were there other distributed databases that did pass the fault injection testing?

jeffbee · on June 7, 2021

There weren't any, which is why that particular shop elected to roll their own distributed system on top of rocks.

In general I think people who think they want to do FoundationDB owe themselves a serious contemplation of the cost/benefit of using Cloud Spanner instead. Obviously you cannot do your own fault injection testing of Spanner, but it does have end-to-end checksums.

sigstoat · on June 7, 2021

> There weren't any, which is why that particular shop elected to roll their own distributed system on top of rocks.

that's nuts. rocks could've been added as a storage engine to fdb far more easily.

jeffbee · on June 7, 2021

For the record, I said the same thing. But it's a management problem because on the one hand you have a known open project with demonstrable flaws, and on the other you have your own in-house developers and you will tend to discount the bugs they haven't written yet.

But, also for the same record, thinking you can implement a reliable, globally-replicated key-value store on top of FoundationDB that is cheaper and better than Cloud Spanner may be evidence of the same cognitive bias.

sigstoat · on June 7, 2021

> But, also for the same record, thinking you can implement a reliable, globally-replicated key-value store on top of FoundationDB that is cheaper and better than Cloud Spanner may be evidence of the same cognitive bias.

man, good thing nobody made any claim like that.

ryanworl · on June 7, 2021

This is in progress right now.

https://github.com/apple/foundationdb/blob/e7d7b39f12afa8ea2...

bpicolo · on June 7, 2021

What were the strong contenders?

_vvhw · on June 7, 2021

Thanks for the quotes, I've been wanting to read this paper for some time. Great to see they went through the consensus literature and made a decision to go with Active Disk Paxos, instead of stopping short and not fully understanding the consensus they're building on. The consensus and replication protocol is such a huge part of building a distributed database.

fizwhiz · on June 7, 2021

> de novo Paxos implementation written in Flow

That's... brave. Flow is a DSL built on top of C++?

alistairw · on June 7, 2021

Yeah, it's their own language on top of C++ to help them test distributed systems with deterministic simulation.

Their talk about it from a while ago really blew me away at the time [0].

[0] https://www.youtube.com/watch?v=4fFDFbi3toc

sandinmyjoints · on June 7, 2021

What is the Flow referred to here?

oconnor663 · on June 7, 2021

It's an async/await framework for C++. I'm not sure what the best source on this is, but here's a discussion: https://forums.foundationdb.org/t/why-was-flow-developed/171...

My understanding is that FDB relies heavily on deterministic simulations for testing, and that their async/await model is a big part of how they make sure they cover different possible interleavings in a deterministic way.
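
To make the "deterministic interleavings" idea concrete, here's a toy sketch in Python. To be clear, this is not Flow and not how FDB's simulator is actually written; it just assumes a single-threaded scheduler that picks the next task from a seeded RNG, so every yield point is a possible task switch and a given seed always replays the same interleaving. The names `worker` and `run_simulation` are made up for illustration.

```python
import random


def worker(name, steps, log):
    # Each `yield` is a point where the scheduler may switch to another task,
    # loosely analogous to an await point in an async/await framework.
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield


def run_simulation(seed, steps=3):
    # All scheduling nondeterminism comes from this one seeded RNG,
    # so the same seed always produces the same interleaving.
    rng = random.Random(seed)
    log = []
    tasks = [worker("a", steps, log), worker("b", steps, log)]
    while tasks:
        task = rng.choice(tasks)  # seeded choice of which task runs next
        try:
            next(task)
        except StopIteration:
            tasks.remove(task)  # task finished, drop it from the run queue
    return log


if __name__ == "__main__":
    # Re-running a seed replays the exact same schedule.
    print(run_simulation(7) == run_simulation(7))
    # Different seeds explore different interleavings of the same two tasks.
    for seed in range(3):
        print(seed, run_simulation(seed))
```

The point of the design is that a failing run can be reproduced from nothing more than its seed, and sweeping over many seeds covers many different orderings without involving real threads or real clocks.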