TigerBeetle Is a Most Interesting Database
Key topics
The Hacker News community discusses TigerBeetle, a new database that claims to be highly performant and reliable, with some users praising its innovative approach and others raising concerns about its limitations and potential use cases.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
- First comment: 4m after posting
- Peak period: 116 comments in 0-3h
- Avg / period: 14.5 comments
- Based on 160 loaded comments
Key moments
- Story posted: Oct 1, 2025 at 7:33 AM EDT (3 months ago)
- First comment: Oct 1, 2025 at 7:37 AM EDT (4m after posting)
- Peak activity: 116 comments in 0-3h, the hottest window of the conversation
- Latest activity: Oct 3, 2025 at 5:00 AM EDT (3 months ago)
Yes, this is by design. SQL is a great general purpose query language for read-heavy variable-length string workloads, but TigerBeetle optimizes for write-heavy transaction processing workloads (essentially debit/credit with fixed-size integers) and specifically with power law contention, which kills SQL row locks.
I spoke about this specific design decision in depth at Systems Distributed this year:
1000x - https://m.youtube.com/watch?v=yKgfk8lTQuE
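To make the row-lock point concrete, here is a minimal sketch (an illustration of mine, not TigerBeetle code or the talk's example) of a double-entry transfer done the classic SQL way, where every transfer touching the same "house" account queues on the same row lock:

```python
# Minimal sketch: the classic SQL shape of a double-entry transfer.
# In Postgres/MySQL each UPDATE takes a row lock, so if most transfers
# debit or credit the same house account, they serialize on that one row.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
con.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 1_000_000), (2, 0)])

def transfer(con, debit_id, credit_id, amount):
    with con:  # one transaction: both balance updates commit or neither does
        con.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, debit_id))
        con.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, credit_id))

transfer(con, 1, 2, 100)  # debit the hot "house" account 1, credit account 2
```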
What's it like compared to MVCC?
I'll watch your talk properly at some point and see if it makes sense to me after that. :)
https://www.postgresql.org/docs/current/mvcc-intro.html
Asking because needing a lock for changing a row isn't the only approach that can be taken.
Transactions atomically process one or more Transfers, keeping Account balances correct. Accounts are also records, with core fields such as debits_posted and credits_posted.
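As a rough illustration of the fixed-size records being described, here is a sketch with a subset of the fields from TigerBeetle's public data model; representing every field as a plain Python int is a simplification of mine:

```python
# Sketch of the two record shapes (a subset of the documented fields;
# in TigerBeetle these are fixed-size binary records, not Python objects).
from dataclasses import dataclass

@dataclass
class Account:
    id: int
    debits_pending: int
    debits_posted: int
    credits_pending: int
    credits_posted: int
    ledger: int   # which asset/currency the account is denominated in
    code: int     # user-defined account type
    flags: int

@dataclass
class Transfer:
    id: int
    debit_account_id: int
    credit_account_id: int
    amount: int
    ledger: int
    code: int
    flags: int
```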
This gives a good idea of what TigerBeetle might be good for and what it might not be. For anything where latency/throughput and accuracy really, really matters, it could be worth the effort to make your problem fit.
Which databases? SQLite is the one I can think of, but it's designed for that use-case. Others start as single node but will replicate to other nodes, either as master-slave or master-master.
But yes. Postgres remains an amazing choice, especially with modern hardware, until you also have the money available to tackle said write throughput issue.
Or a central bank switch may only have 4 major banks.
Or a system like UPI may have 85% of transactions flowing through 2 hubs. Say 50% through PhonePe. And say 35% through Google Pay.
https://learn.microsoft.com/en-us/azure/azure-sql/database/h...
But honestly, if double-entry really becomes a thing, I foresee traditional DBMSs absorbing it just like they did with vector and object databases, capturing the long tail of the market.
a) what Tigerbeetle data looks like in practice? Assuming it doesn't look like a regular table
b) how you use it, if you can't write sql queries?
c) separately, curious what double-entry would look like for stocks, tickets etc. E.g. I'm a venue and I have 1000 tickets in inventory & deferred revenue.. each time I sell a ticket I turn that inventory to cash and the deferred into a performance liability? Or no entries at all until a ticket is sold? Something else?
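For what it's worth, one possible shape for the ticket case under standard deferred-revenue treatment (an assumption for illustration, not the thread's answer) looks like this:

```python
# Hypothetical journal entries for a single $50 ticket, expressed as
# (debit_account, credit_account, amount). Standard deferred-revenue
# treatment is assumed; the account names are made up for illustration.
entries_on_sale = [
    ("cash", "deferred_revenue", 50),            # customer pays; revenue not yet earned
]
entries_on_performance = [
    ("deferred_revenue", "ticket_revenue", 50),  # event happens; revenue is recognized
]
# The 1,000-ticket inventory itself is usually not a monetary entry at all;
# it can be tracked as a count, or on a separate unit-denominated ledger.
```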
Errr yes. Without much sweat really.
Just because something started ~30 years ago doesn't mean it hasn't updated with the times, and doesn't mean it was built on bad foundations.
Within a single machine, yeah, relational dbs still work like a charm.
Databases have not been bottlenecked on storage bandwidth in a long time but most databases are designed as if this was still the case. Optimizing for memory bandwidth, the current bottleneck, leads to substantially different architectures than are commonly deployed.
If you were to design an OLGP DBMS like Postgres today, it would look radically different. Same is true for OLTP.
But our customers need separation of concerns in their architecture. While they could put the cash in the general purpose filing cabinet, they actually want the separation of concerns between OLGP system of reference (string database, i.e. PG) in the control plane, and OLTP system of record (integer/counting database, i.e. TB) in the data plane.
But for anyone tempted by Oracle, do remember that the upfront, agreed licence costs are only a fraction of the true price:
You’ll need someone who actually knows Oracle - either already in place or willing to invest a serious amount of time learning it. Without that, you’re almost certainly better off choosing a simpler database that people are comfortable with.
There’s always a non-zero risk of an Oracle rep trying to “upsell” you. And by upsell, I mean threatening legal action unless you cough up for additional bits a new salesperson has suddenly decided you’re using. (A company I worked with sold Oracle licences and had a long, happy relationship with them - until one day they went after us over some development databases. Someone higher up at Oracle smoothed it over, but the whole experience was unnerving enough.)
Incidental and accidental complexity: I’ve worked with Windows from 3.1 through to Server 2008, with Linux from early Red Hat in 2001 through to the latest production distros, plus a fair share of weird and wonderful applications, everything from 1980s/90s radar acquisition running on 2010 operating systems through a wide range of in house, commercial and open source software and up to modern microservices — and none of it comes close to Oracle’s level of pain.
Edit: Installing Delphi 6 with 14 packages came close. It took me 3 days, since I had to find every package scattered across disks on shelves and in drawers, on ancient web pages, and posted as abandonware on SourceForge; but I guess I could learn to do that in a day if I had to do it twice a month. Oracle consistently took me 3 days, and that's if I did everything correctly on the first try and didn't have to start from scratch.
I especially remember one particular feature that was really useful and really easy to enable in Enterprise Manager, but that would cost you at least $10,000 at the next license review (probably more if you had licensed it for more cores, etc.).
What I wrote about above wasn't us changing something or using a new feature, but some sales guy on their side re-interpreting what our existing agreement meant. (I was not in the discussions; I just happened to work with the guys who dealt with it, and it was a long time ago, so I cannot be more specific.)
You can run a hell of a lot off of a small fleet of beefy servers fronted with a load balancer, all pointing to one DB cluster.
[0]: https://changelog.com/posts/monoliths-are-the-future
Without much sweat for general purpose workloads.
But transaction processing tends to have power law contention that kills SQL row locks (cf. Amdahl’s Law).
We put a contention calculator on our homepage to show the theoretical best case limits and they’re lower than one might think: https://tigerbeetle.com/#general-purpose-databases-have-an-o...
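A back-of-the-envelope version of that contention argument, using illustrative numbers of my own rather than the calculator's:

```python
# If a fraction `hot` of transactions must update the same row, and the row
# lock is held for `lock_hold_s` seconds per update (including any commit
# latency spent while holding it), that row caps total throughput no matter
# how many cores or connections are added.
def max_tps(hot: float, lock_hold_s: float) -> float:
    hot_row_updates_per_s = 1.0 / lock_hold_s
    return hot_row_updates_per_s / hot

print(max_tps(hot=0.20, lock_hold_s=0.0002))  # 0.2 ms hold            -> 25,000 TPS ceiling
print(max_tps(hot=0.20, lock_hold_s=0.001))   # 1 ms hold              ->  5,000 TPS ceiling
print(max_tps(hot=0.50, lock_hold_s=0.005))   # 5 ms (commit in lock)  ->    400 TPS ceiling
```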
In fact, large real-world systems are not limited to 100-1000 TPS, or even 10 kTPS as the calculator tries to suggest. That's not because Amdahl's law is wrong; the numbers you're plugging in are just wildly off, so the conclusions are equally nonsensical.
There might be some specific workloads where you saw those numbers, and your DB might be a good fit for that particular niche, but you shouldn't misrepresent general purpose workloads to try to prop up your DB. Claiming that SQL databases are limited to "100-1000 TPS" is unserious; it is not conducive to your cause.
> Without much sweat for general purpose workloads.
But writing that traditional SQL databases cannot go above these "100-1000 TPS" numbers due to Amdahl's law is going to raise some eyebrows.
I don't think that's controversial. Amdahl's law applies to all software; it's not a peculiar feature of SQL databases. The comment is well-contextualized, in my view, but reasonable minds may disagree.
TFA contextualizes this better:
> This gets even worse when you consider the problem of hot rows, where many transactions often need to touch the same set of “house accounts”.
I think this is very clear; I don't know why you're saying that TigerBeetle is trying to make a generic claim about general workloads.
The comment you're replying to explicitly states that this isn't true for general workloads.
It still holds up basically the whole of the internet.
In most cases, SQL is good enough for 90% of workloads.
Plus micropayments, of course :-)
But we couldn't get rid of it because it papered over something important.
config defined in YAML.
when I woke up today, I didn't really expect to be convinced that we live in a relatively good timeline, but...
The combustion engine's fundamental design is pretty damn good and that it had to be updated to handle unleaded gasoline isn't a knock (pun... intended?) against that design.
Sometimes I feel like we software engineers have the worst memory of any engineers.
Actually, load balancers are a great example. The number of times I’ve seen a team re-implementing HAProxy, but poorly, is entirely too high. The reasoning is always something along the lines of “we need this extremely specific feature.” OK, and have you even read the manual for HAProxy? Or if that feature truly doesn’t exist, did you consider implementing that separately?
Some months ago I was re-enlightened when Anders Hejlsberg (creator of C# and TypeScript) explained why they chose Go for reimplementing the TypeScript compiler, instead of using any of those languages or something like Rust.
The way they defined the problem, the tests they did, and how they justified their pick is how these kinds of decisions should be made if we want to call ourselves engineers.
https://youtu.be/Jlqzy02k6B8
The relational model has shown itself to be exactly the flexible and powerful model that Codd said it was, even in its relatively debased form in SQL.
In fact the potential of it as a universal and flexible data model that abstracts away storage still remains to be fully unlocked.
Now as for existing SQL databases, yes, many of them were built on foundational assumptions about the nature of memory and secondary storage that no longer hold true.
Many of us in this industry still have our heads stuck in the past world of spinny spinny magnetic disks on platters with heads and cylinders and grinding noises. Real world hardware has moved on. We have a couple orders of magnitude higher IOPS generally available than we did just 10-15 years ago.
So I'm excited about products like CedarDB and others (Umbra before it, etc) that are being built on a foundation from day one of managing with hybrid in-memory/disk.
Throwing out SQL is not really the recipe for "performance" and the lessons of separating the storage system from the data model is still key, since the 1960s.
I am willing to grant that a specialized transaction/ledger system like TigerBeetle might have its place as an optimization strategy in a specific industry/domain, but we should not make the mistake of characterizing this as a general problem with the relational model and data storage broadly.
We've discovered hacks to work around the limitations of SQL, so you can maintain performance with sufficient hackiness, but you have to give up "purity" to get there. Worse is better applies, I suppose, but if we were starting over today the design would be very different.
Throwing out SQL isn't strictly required, but like Rust isn't strictly required when you already have C, sometimes it is nice to take a step back and look at how you can improve upon things. Unfortunately, NoSQL turning into a story about "document databases" instead of "A better SQL" killed any momentum on that front.
SQL was the first serious attempt to translate this into a working system, and it's full of warts, but the underlying model it is reaching towards retains its elegance.
But most importantly the principle of not tying the structure of data to its storage model is absolutely key. Not least because we can iteratively and flexibly improve the storage model over time, and not be stuck with decisions that tie us to one particular structure which is the problem that 1960s "network" and "hierarchical" databases (and modern day "key value" or "NoSQL" databases) cause.
- Slow code writing.
- DST (deterministic simulation testing; sketched below)
- No dependencies
- Distributed by default in prod
- Clock fault tolerance with optimistic locking
- Jepsen claimed that FDB has more rigorous testing than they could do.
- New programming language, Flow, for testing.
You probably could solve the same problems with FDB, but TigerBeetle I imagine is more optimized for its use case (I would hope...).
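For anyone unfamiliar with the DST item above, here is a minimal sketch of the core idea (an illustration of mine, not FoundationDB's Flow simulator or TigerBeetle's VOPR): every source of nondeterminism is driven by one seeded PRNG, so a failing run can be replayed exactly from its seed.

```python
# Toy deterministic simulation: message sends, reordering, and drops all come
# from a single seeded PRNG, so the same seed reproduces the same history.
import random

def simulate(seed: int, steps: int = 1_000) -> int:
    rng = random.Random(seed)          # the only source of randomness
    in_flight, delivered = [], 0
    for _ in range(steps):
        if rng.random() < 0.5:         # "send" a new message
            in_flight.append(rng.randrange(1_000_000))
        elif in_flight:
            i = rng.randrange(len(in_flight))  # pick any in-flight message (reordering)
            in_flight.pop(i)                   # take it off the wire...
            if rng.random() < 0.9:             # ...delivered 90% of the time, dropped otherwise
                delivered += 1
    return delivered

assert simulate(42) == simulate(42)    # same seed, same history: failures replay exactly
```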
AFAIK the only reason FDB isn't massively popular is because no one has bothered to write good layers on top. I do know of a few folks writing SQS, DynamoDB, and SQLite layers.
It's still maintained by a sizable team at Apple. GH stats show that the activity is much lower now than it was 3 years ago, but there are about 10 people who contribute on a steady basis, which is honestly better than 99% of open source projects out there.
The only reason is Apple. They liked the product that was released in 2013 so much they bought the whole company, and all other FoundationDB users were abandoned and were forced to drop it.
Who would trust a database that can be yanked out from under you at any moment? Though a lot of products have license terms like this, only a handful were ever discontinued so abruptly. It's under the Apache license now, but the trust is not coming back.
Interesting that they didn't release it with an SQL client; is there no way to make it compatible? Even with extensions to SQL, I imagine it would be great for a lot of use cases.
Edit: ah, it's more of a key-value store.
I started writing this comment:
> It seems interesting, but considering what it's for, why aren't the hyperscalers using it?
And while writing it I started searching for FoundationDB and found this:
> https://github.com/apple/foundationdb
Ah, all right :-p
* We use Cloudflare Workers. TigerBeetle client app is not supported. It might work using Cloudflare Containers, but then the reason we use Cloudflare is for the Workers. --> https://github.com/tigerbeetle/tigerbeetle/issues/3177
* TigerBeetle doesn't support any auth. It means the containing server (e.g. a VPS) must restrict by IP. Problem is, serverless doesn't have fixed IP. --> https://github.com/tigerbeetle/tigerbeetle/issues/3073
* spawning 1000 workers all opening a connection to a db,
* solved by service/proxy in front of db,
* proxy knows how to reach db anyway, let's do private network and not care about auth
C'mon folks, the least you can do is put a guide for adding an auth proxy or auth layer on your site.
Particularly since you don't use HTTP (I can't easily tell from the docs, so I'm assuming), folks are going to be left wondering "well, how the hell do I add an auth proxy without HTTP?" and just put it on the open internet...
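For readers wondering what that proxy/service might look like without an HTTP-speaking database behind it, here is a minimal sketch of the pattern described above; the bearer-token handling and the forward_to_db placeholder are mine, not a TigerBeetle API:

```python
# Sketch: a small HTTP service terminates auth at the edge, and only it can
# reach the database over a private network using the native client.
# forward_to_db is a placeholder where the real client call would go.
import json, os
from http.server import BaseHTTPRequestHandler, HTTPServer

API_TOKEN = os.environ.get("API_TOKEN", "change-me")

def forward_to_db(transfer: dict) -> dict:
    # Placeholder: call the real database client here (private network only).
    return {"accepted": True, "transfer": transfer}

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.headers.get("Authorization") != f"Bearer {API_TOKEN}":
            self.send_response(401)
            self.end_headers()
            return
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        payload = json.dumps(forward_to_db(json.loads(body or b"{}"))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```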
TigerBeetle is our open source contribution. We want to make a technical contribution to the world. And we have priorities on the list of things we want to support, and support properly, with high quality, in time.
At the same time, it's important I think that we encourage ourselves and each other here, you and I, to show respect to projects that care about craftsmanship and doing things properly, so that we don't become entitled and take open source projects and maintainers in general for granted.
Let's keep it positive!
Wait, is it open source?? Since when? I always thought it was proprietary
Our view is that this kind of infrastructure is simply too valuable, too critical, not to be open source.
Since "interesting" is the very last thing that anyone sane wants in their accounting/financial/critical-stuff database.
Never understood why we turn those off. An assert failing in prod is an assert that I desperately want to know about.
(That "never understood" was rhetorical).
What's trivial for a very small list may be a no-go for gigabyte-sized lists.
It’s only O(n), but if I check that assertion in my binary search function then it might as well have been linear search.
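A small sketch of the cost being described, with Python's -O flag standing in for a release build:

```python
# The "input is sorted" precondition check is O(n), so leaving it enabled in a
# hot binary search turns O(log n) lookups into O(n) ones. Under `python -O`
# (our stand-in for a release build) __debug__ is False and asserts are stripped.
import bisect

def binary_search(sorted_items, target):
    if __debug__:  # O(n) check, intended for debug builds only
        assert all(a <= b for a, b in zip(sorted_items, sorted_items[1:])), "input must be sorted"
    i = bisect.bisect_left(sorted_items, target)  # the O(log n) search itself
    return i if i < len(sorted_items) and sorted_items[i] == target else -1

print(binary_search([1, 3, 5, 7, 9], 7))  # -> 3
```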
However, the performance impact has to be taken into account even when assertions are cleverly written, so in many cases they can only be fully turned on in debug builds.
It seems to be used for stuff like this, though I'm yet to really look into it properly.
https://cuelang.org
We captured these considerations in the internal docs here: https://github.com/tigerbeetle/tigerbeetle/blob/0.16.60/src/...
Yup.
But nowadays an extra comparison is no biggie, especially if the compiler can hint at which way it's likely to go.
Rather, I would say that the most interesting database is the fastest and safest.
Or as Jim Gray always put it: “correct and fast”.
cf. Durability and the Art of Consensus: https://www.youtube.com/watch?v=tRgvaqpQPwE
It's not novel. That's how hardware (ASIC) testing has been done forever. The novelty is applying it to software.
> TigerBeetle’s VOPR is the single largest DST cluster on the planet. It runs on 1,000 CPU cores
Only if you exclude hardware, otherwise basically every chip design company has a cluster bigger than this.
https://en.wikipedia.org/wiki/Optical_proximity_correction
We typically hear from companies that TigerBeetle is the easier part of their stack. But to be fair, they may have a cleaner architecture.
Did you contact our solutions team for assistance with your business logic modelling? If not, please get in touch! solutions@tigerbeetle.com
I think the paragraph about multi-node is a bit misleading. Contrary to what cloud native folk will tell you, a single beefy DB, well-tuned and with a connection pooler, can serve a dizzying amount of QPS just fine. At a former employer, during a maintenance period, I once accidentally had all traffic pointed to our single MySQL 8 RDS instance, instead of sharing it between its read replicas. That was somewhere around 80-90K QPS, and it didn’t care at all. It wasn’t even a giant instance - r6i.12xlarge - we just had a decent schema, mostly sane queries, and good tuning on both ProxySQL and MySQL. At peak, that writer and two .8xlarge read replicas handled 120K QPS without blinking.
A DB hosted on a server with node-local NVMe (you know, what used to be normal) will likely hit CPU limits before you saturate its I/O capabilities.
For redundancy, all RDBMS designed for networked activity have some form of failover / hot standby capability.
My other mild criticism is in the discussion on TigerBeetle’s consensus: yes, it seems quite clever and has no other dependencies, but it’s also not trying to deal with large rows. When you can fit 8,190 transactions into a 1 MiB packet that takes a single trip to be delivered, you can probably manage what would be impossible for a traditional RDBMS.
None of this should be taken as belittling their accomplishment; I remain extremely impressed by their product.
edit: from the horse's mouth is better: https://news.ycombinator.com/item?id=45437046
To be fair, it's been something like 10 years, IIRC.
The database in question was MySQL 8, running on plain old enterprise ssds (RAID 10)
The workload was processing transactions (financial payments)
The database schema was... let's call it questionable, with pretty much no normalization because "it's easier when we look at it for debugging", hence extremely long rows with countless updates to the same row throughout processing: roughly 250-500 writes per row per request/transaction, from what I recall. And the application was an unholy combination of a PHP+Java monolith, linked via RPC and transparent class sharing.
DB IO was _never_ the problem, no matter how high QPS got. I can't quote an exact number, but it was definitely a lot higher than what this claims (something like 40-50k on average "load" days, like pre-Christmas, etc.).
Not sure how they're getting this down to ~250 qps; it sounds completely implausible.
Heck, I can do single-row non-stop updates at >1k qps on my desktop on a single NVMe drive, and that's not even using RAID.
Contention is what can be parallelized, right?
So with roughly 100-200 requests/s you end up with 1-0.5 contention if I understood that right.
That moves me even further towards agarlands points, though - if I plug that into the equation, I end up with >50k qps.
The numbers used create a wildly distorted idea of real-world performance.
Oracle has lock free reservations on numeric columns: https://oracle-base.com/articles/23/lock-free-reservations-2...
62 more comments available on Hacker News