Event Sourcing, CQRS and Microservices: A Real FinTech Example
Mood: heated
Sentiment: mixed
Category: other
Key topics: The article discusses a FinTech project's architecture using Event Sourcing, CQRS, and Microservices, sparking debate among commenters about the necessity and complexity of these design choices.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 31m after posting
Peak period: 64 comments in Day 1
Avg / period: 22.7 comments
Based on 68 loaded comments
Key moments
- Story posted: Oct 18, 2025 at 11:56 AM EDT (about 1 month ago)
- First comment: Oct 18, 2025 at 12:27 PM EDT (31m after posting)
- Peak activity: 64 comments in Day 1, the hottest window of the conversation
- Latest activity: Oct 21, 2025 at 5:07 PM EDT (about 1 month ago)
There it is. My automatic response to any questions about event sourcing is “if you have to ask, you don’t need it.” This is one of those situations where the explosion in complexity somewhat makes sense: when you need legally enforced auditability.
Event sourcing is a really cool architecture that makes sense in theory, but the yak shaving needed to implement it is at least an order of magnitude more than for any other design.
If you peer underneath the covers of a lot of financial stuff, it's effectively double-entry accounting, which is a giant ledger (or ledgers) of events.
But you don't need to decide to use it. The people describing the requirements will tell you, insist on it, and threaten you if you don't do it.
“Event replay: if we want to adjust a past event, for example because it was incorrect, we can just do that and rebuild the app state.”
Consider the update_order_item_quantity event in a classic event-sourced system. It's not possible to guarantee that two waiters dispatching two such events at the same time, when the current quantity is 1, would not cause the quantity to become negative/invalid.
If the data store allowed for mutability and produced an event log, it's easy:
Instead of dispatching update_order_item_quantity, you would update the order document, specifying the current version. In the previous example the second request would fail since it specified a stale version_id. And you get the auditability benefits of a classic event-sourcing system as well, because you have versions and an event log.
This kind of architecture is trivial to implement with CouchDB and easier to maintain than Kafka. Pity it's impossible to find managed hosting for CouchDB outside of IBM.
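A minimal sketch of the versioned-update approach described above, assuming an in-memory document store; the function and document names are invented for illustration (a real setup would use something like CouchDB's _rev field or a database compare-and-set):

```python
# Minimal sketch of the versioned-update approach the commenter describes.
# The store, document shape, and function names are hypothetical.

class StaleVersionError(Exception):
    """Raised when a writer holds an outdated version of the document."""

# naive in-memory "document store": order_id -> (version, document)
store = {"order-1": (1, {"item": "coffee", "quantity": 1})}
event_log = []  # append-only log kept for auditability

def update_order_item_quantity(order_id, new_quantity, expected_version):
    version, doc = store[order_id]
    if version != expected_version:
        # the second waiter loses: their expected_version is stale
        raise StaleVersionError(f"expected v{expected_version}, store has v{version}")
    if new_quantity < 0:
        raise ValueError("quantity cannot go negative")
    new_doc = {**doc, "quantity": new_quantity}
    store[order_id] = (version + 1, new_doc)
    event_log.append({"order_id": order_id, "version": version + 1, "doc": new_doc})
    return version + 1

# Two waiters both read version 1 and try to decrement the last item:
update_order_item_quantity("order-1", 0, expected_version=1)      # succeeds, now v2
try:
    update_order_item_quantity("order-1", 0, expected_version=1)  # fails: stale
except StaleVersionError as e:
    print("rejected:", e)
```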
When you construct your own event system you are constructing a DB with your own primitives (deposit, withdraw, transfer, apply monthly interest...).
You have to figure out your transaction semantics. For example, how to reject invalid events.
Agreed. I just wish that, apart from the WAL, they also had versioning as a first-class feature and their update API required clients to pass the version they last saw, to prevent inconsistencies.
And DBs are not really CQRS because the events are artificial and don't have business data that people are interested in keeping.
Without preemptive defensive coding in your aggregates (whatever you call them) this can quickly blow up in your face.
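A small sketch of that defensive posture, assuming a toy account aggregate that replays its history and rejects commands which would produce an invalid event; the class, method, and event names are illustrative, not from the article:

```python
# Sketch of "defensive coding in your aggregates": commands are validated
# against current state before an event is emitted and appended.

class InvalidCommand(Exception):
    pass

class AccountAggregate:
    def __init__(self, events=()):
        self.balance = 0
        self.version = 0
        for e in events:          # rebuild state by replaying history
            self.apply(e)

    def apply(self, event):
        kind, amount = event
        if kind == "deposited":
            self.balance += amount
        elif kind == "withdrawn":
            self.balance -= amount
        self.version += 1

    # command handlers decide whether an event may be appended at all
    def handle_withdraw(self, amount):
        if amount <= 0:
            raise InvalidCommand("amount must be positive")
        if amount > self.balance:
            raise InvalidCommand("insufficient funds")  # reject the invalid event
        event = ("withdrawn", amount)
        self.apply(event)
        return event

acct = AccountAggregate([("deposited", 100)])
print(acct.handle_withdraw(30))   # ('withdrawn', 30)
# acct.handle_withdraw(500)       # would raise InvalidCommand
```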
There are two kinds of adjustments: an adjustment transaction (a one-off fix), or re-interpreting what happened (systemic). The event sourcing pattern is useful in both situations.
Sometimes you need to replay events to have a correct report because your interpretation at the time was incorrect or it needs to change for whatever reason (external).
Auditing isn't about not changing anything, but being able to trace back and explain how you arrived at the result. You can have as many "versions" as you want of the final state, though.
One - bake whatever happens into your system permanently, like 99% of all apps, and disallow corrections.
Two - keep the events around so that you can check and re-check your corrections before you check in new code or data.
Like buggy data that crashes the system.
If you have the old events there, you can "measure twice, cut once", in the sense that you can keep re-running your old events and compare them to the new events under unit-test conditions, and be absolutely sure that your history re-writing won't break anything else.
It's not for just doing a refund or something.
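One hedged way to picture that "measure twice, cut once" workflow: replay the stored events through both the old and the corrected projection in a test and assert that the difference is exactly the bug you meant to fix. The events and projection functions below are invented for illustration:

```python
# Replay the same stored events through the old and the corrected projection
# and diff the results before shipping the history-rewriting fix.

events = [
    ("deposited", 100),
    ("withdrawn", 40),
    ("deposited", 25),
]

def project_v1(events):
    # old, buggy interpretation: withdrawals were ignored
    return sum(a for kind, a in events if kind == "deposited")

def project_v2(events):
    # corrected interpretation
    balance = 0
    for kind, amount in events:
        balance += amount if kind == "deposited" else -amount
    return balance

def test_rewrite_only_changes_what_we_expect():
    old, new = project_v1(events), project_v2(events)
    assert new == 85         # the value we believe is correct
    assert old - new == 40   # the difference is exactly the known bug, nothing else

test_rewrite_only_changes_what_we_expect()
```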
How would it work if they had to support intra-system transfers? So one user's balance should be withdrawn and another should get a deposit? That's not possible to do atomically with event sourcing, right?
For inter-system consistency, you’d probably need a reconciliation mechanism or some kind of two-phase commit.
But if I'm a downstream consumer consuming the event log and computing the state from it, and for some reason I receive only the first event, wouldn't the computed state be invalid and not represent the real state of the accounts?
As for the inter-system transfers, it's not possible in the general case. You might not have a network cable between the two systems. And if you do, you run into CAP, two-generals, etc.
ACID is out of the question because the two parties don't share a DB. And if they did, some tech lead would turn down the acidity level for performance reasons.
The best you can do is intelligently interpret the information that has been given to you. That's all ledgers are: a list of things that some system knows. 'UPDATE CustomerBalance...' (CRUD) is not a fact, but 'Customer paid...' is a fact, and that's all event-sourcing is.
In this case it’s XTransactionStarted, XTransactionDepositConfirmed, and XTransactionCreditConfirmed or something along those lines. External interactions tend to follow that kind of pattern where it tracks success/failure in the domain events.
The command side of CQRS tends to be the services that guarantee ordered events either via the backing database or with master-slave topology.
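A rough sketch of that event pattern, assuming a small process manager that folds XTransactionStarted / DepositConfirmed / CreditConfirmed style events into a transfer state; the state names and event shapes are illustrative. It also shows how a downstream consumer that has only seen the first events can tell the transfer is still in flight rather than computing a wrong "final" state:

```python
# A transfer is not one atomic write but a small state machine driven by events.
from enum import Enum

class TransferState(Enum):
    STARTED = "started"
    DEBITED = "debited"        # source account withdrawn
    COMPLETED = "completed"    # destination account credited
    FAILED = "failed"

class TransferProcess:
    """Process manager that folds events into the current transfer state."""

    def __init__(self, transfer_id):
        self.transfer_id = transfer_id
        self.state = None

    def on_event(self, event):
        kind = event["type"]
        if kind == "XTransactionStarted":
            self.state = TransferState.STARTED
        elif kind == "XTransactionDepositConfirmed":
            self.state = TransferState.DEBITED
        elif kind == "XTransactionCreditConfirmed":
            self.state = TransferState.COMPLETED
        elif kind == "XTransactionFailed":
            self.state = TransferState.FAILED
        return self.state

p = TransferProcess("tx-42")
for e in [{"type": "XTransactionStarted"},
          {"type": "XTransactionDepositConfirmed"},
          {"type": "XTransactionCreditConfirmed"}]:
    p.on_event(e)
print(p.state)   # COMPLETED; a reader that has only seen the first two events
                 # knows the transfer is still in flight, not that money vanished
```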
Event sourcing is a terrible idea that may be useful for some incredibly niche scenario.
> I am a Software Architect, Ex-Founder & AI enthusiast with over 8 years in the IT.
A double-entry ledger is a combination of a process and a view that was mistaken for a data model centuries ago, and that mistake became embedded.
Fundamentally, you’re dealing with a sequence of events. The double-entry ledger is a particular result of processing those events - a view. There are many other useful views.
This is well understood in academic accounting. See e.g. https://en.wikipedia.org/wiki/Resources,_Events,_Agents for an alternative system that doesn’t make the same mistake.
* Even if you have ACID, it's not sufficient for distributed systems. Its guarantees will keep one node consistent with itself. There is no transactionality between the customer's app and your DB.
Maybe you're one bank with all the customers, but as soon as you want to talk to other banks, are you really going to share the one ACID instance? Who's the DBA?
Currently the state of fintech is 90% of devs being in denial that they're in a distributed system.
> proven double-entry ledger approach.
Yes. In that language, the ledger is the list of events. If today's devs were around 300 years ago they'd be calling for 'balances' instead of 'ledgers' because they're simpler.
What specific audit requirements existed beyond point-in-time balance queries? The author dismisses alternatives as "less business-focused" but doesn't justify why temporal tables or structured audit logs couldn't satisfy the actual compliance need.
The performance issues were predictable: 2-5 seconds for balance calculations, requiring complex snapshot strategies to get down to 50-200ms. This entire complexity could have been avoided with a traditional audit trail approach.
The business context analogy to accounting ledgers is telling - but accounting systems don't replay every transaction to calculate current balances. They use running totals with audit trails, which is exactly what temporal tables provide.
Event Sourcing is elegant from a technical perspective, but here it's solving a problem that simpler, proven approaches handle just fine. The regulatory requirement was for historical balance visibility, not event replay capabilities.
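For concreteness, a minimal sketch of the "running totals with audit trails" alternative this commenter is arguing for, assuming a SQLite-style schema with invented table and column names: the current balance is updated in place and every change also writes an audit row in the same transaction.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE balances (account_id TEXT PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE balance_audit (
        account_id    TEXT NOT NULL,
        delta         INTEGER NOT NULL,
        balance_after INTEGER NOT NULL,
        changed_at    TEXT NOT NULL DEFAULT (datetime('now'))
    );
    INSERT INTO balances VALUES ('acct-1', 0);
""")

def apply_change(account_id, delta):
    with db:  # one transaction: running total and audit row stay in sync
        (balance,) = db.execute(
            "SELECT balance FROM balances WHERE account_id = ?", (account_id,)
        ).fetchone()
        new_balance = balance + delta
        db.execute("UPDATE balances SET balance = ? WHERE account_id = ?",
                   (new_balance, account_id))
        db.execute(
            "INSERT INTO balance_audit (account_id, delta, balance_after) VALUES (?, ?, ?)",
            (account_id, delta, new_balance))

apply_change("acct-1", 100)
apply_change("acct-1", -30)
# Current balance is a single-row read, no replay needed:
print(db.execute("SELECT balance FROM balances WHERE account_id='acct-1'").fetchone())
# Historical visibility comes from the audit table:
print(db.execute("SELECT delta, balance_after FROM balance_audit").fetchall())
```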
If the requirement is, "Show the balance _as it was_ at that point in time", this system doesn't fulfil it. They even say so in the article: if something is wrong, throw away the state and re-run the events. That's necessarily different behaviour. To do this requirement, you actually have to audit every enquiry and say what you thought the result was, including the various errors/miscalculations.
If the requirement is, "Show the balance as it should have been at that point in time", then it's fine.
In the author's case, they separate writes and reads into different DBs. The read-optimized DB has aggregated balances stored, not events. This is not materially different, and the trade-offs regarding staleness of data will be mostly the same.
Related: https://vvvvalvalval.github.io/posts/2018-11-12-datomic-even...
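A minimal sketch of that read side, assuming a projector that consumes the event log and maintains a denormalized balances table; event shapes, offsets, and names are invented here, and the read model is allowed to lag slightly behind the log:

```python
balances_read_model = {}   # account_id -> current balance (the "read-optimized DB")
last_applied_offset = -1   # projector position in the log; reads may be slightly stale

def project(offset, event):
    global last_applied_offset
    if offset <= last_applied_offset:
        return                      # already applied; replays must be idempotent
    acct = event["account_id"]
    delta = event["amount"] if event["type"] == "deposited" else -event["amount"]
    balances_read_model[acct] = balances_read_model.get(acct, 0) + delta
    last_applied_offset = offset

event_log = [
    {"type": "deposited", "account_id": "a1", "amount": 100},
    {"type": "withdrawn", "account_id": "a1", "amount": 25},
]
for offset, e in enumerate(event_log):
    project(offset, e)

print(balances_read_model["a1"])   # 75 -- a plain lookup, no event replay at query time
```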
Having built systems that process billions of events and displayed results, triggered notifications, etc in real time (not RTOS level, I'm talking 1 or 2 seconds of latency) you absolutely need to separate reads and writes. And if you can trust db replication to be fast and reliable, you can indeed skip distributed locks and stay on the right side of the CAP theorem.
Event sourcing is how every write-ahead log works, which in turn powers basically every DB.
Is the concern on this thread that they preoptimized? I thought they walked through their decision making process pretty clearly.
I think your point about write-ahead logging etc is a good one. If you need a decent transactional system, you're probably using a system with some kind of WAL. If you're event sourcing and putting events into something which already implements a WAL, you need to give your head a wobble - why is the same thing being implemented twice? There can be great reasons, but I've seen (a few times) people using a perfectly fine transactional DB of some kind to implement an event store, effectively throwing away all the guarantees of the system underneath.
1) "Kafka is resume-driven-development" is a meme.
2) Devs are in denial about being in a distributed system, and think that single-threaded thinking (in proximity to a DB that calls itself ACID) leads to correct results in a distributed setting.
As in, fixing things during a scale-up phase, when the business has been working for a while and the original improvised systems are breaking, but you can't stop the business to repair them.
Currently undergoing a similar project and would really appreciate any resources thrown my way, both purely technical and for interfacing with accounting people, with no hybrid roles to bridge the domain gap.
Mapping the two domains is the main issue: how much the new system should reflect the accounting movement of money versus the current engineering model, or something completely different in between.
The only mutable thing here would be the end date of said subscription, at which point the company no longer requires amount M from the customer, and the customer no longer receives Y.
Then on the accounting side, every time subscription Y renews, said customer in account 750xx needs to have its balance lowered by amount M, only to get increased again when they pay.
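A hedged sketch of that renewal-then-payment flow written as balanced double-entry journal lines; account names are illustrative ("750xx" is the customer account mentioned above), and the debit/credit mapping follows the textbook receivable/revenue convention rather than any particular chart of accounts:

```python
from decimal import Decimal

M = Decimal("9.99")   # monthly amount for subscription Y

# Renewal: the customer now owes M.
renewal_entry = [
    ("debit",  "750xx customer receivable", M),
    ("credit", "subscription revenue",      M),
]

# Payment: cash comes in and the receivable is cleared again.
payment_entry = [
    ("debit",  "cash",                      M),
    ("credit", "750xx customer receivable", M),
]

def assert_balanced(entry):
    debits  = sum(amount for side, _, amount in entry if side == "debit")
    credits = sum(amount for side, _, amount in entry if side == "credit")
    assert debits == credits, "journal entry must balance"

for entry in (renewal_entry, payment_entry):
    assert_balanced(entry)
```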
The only way to bridge this gap is to have the engineers know what accounting needs, and let them build the right infrastructure. In this 2018 video I recently watched, https://www.youtube.com/watch?v=KH0l8QqhzYk, the speaker Rahul Pilani explains how Netflix organised their billing systems and how all the parts fit together. I'm not saying you should copy their infrastructure, but it doesn't hurt to look, at a higher level, at how the business operates and what its accounting requirements are.
Think, for example, about orders where a customer bought three items and later cancelled one: the order value mutates as it is updated, and at most we have a copy of the previous order state before the price was updated (in some cases not even that).
If you think that's not a good model for financial processes, well, so do I; it's the legacy we're supposed to manage. Moving from that type of non-ideal system to something more solid is what I'm researching.
Here's a scenario: you've partnered with a credit card provider. They charge some money each month per card, and you pass that onto your customers who use the cards.
One day the partner sends you a 'card-cancelled' message. Have you built your system to accept that message unconditionally? Or did your engineers put in defensive code ("fail fast", assertions, status checks, db constraints) so that your system can reject that message?
Because that's how we've built our system at work. Our engineers are proud of something called "data integrity" that our almost-ACID (READ_COMMITTED) DB supposedly has. We won't move to events (listening to what actually happened) because we'd be giving up on pretending that our DB guarantees somehow correspond to what's going on in the real world.
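To make the contrast concrete, here is a hedged sketch of the two postures for that "card-cancelled" message: reject it when it violates local invariants, or record it as a fact and flag the mismatch for reconciliation. All names and structures are invented for illustration:

```python
cards = {"card-7": {"status": "active"}}
event_log = []
reconciliation_queue = []

def handle_partner_message_defensively(msg):
    card = cards.get(msg["card_id"])
    if card is None or card["status"] != "active":
        raise ValueError("rejecting message: card not in a cancellable state")
    card["status"] = "cancelled"

def handle_partner_message_as_fact(msg):
    # The partner already cancelled the card; that happened whether we like it or not.
    event_log.append({"type": "card-cancelled", "card_id": msg["card_id"]})
    card = cards.get(msg["card_id"])
    if card is None or card["status"] != "active":
        # Our local view disagrees with the real world: investigate, don't drop the fact.
        reconciliation_queue.append(msg)
    else:
        card["status"] = "cancelled"

handle_partner_message_as_fact({"card_id": "card-9"})   # unknown card: fact kept, flagged
print(len(event_log), len(reconciliation_queue))        # 1 1
```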
1) Learned the basic concepts of double-entry bookkeeping. 2) Told ChatGPT about my business domain and requested an example Chart of Accounts (CoA) tailored to it.
Feel free to reach out to me, I’d love to exchange ideas.