Do the Simplest Thing That Could Possibly Work
Original: Do the simplest thing that could possibly work
Key topics
The quest for simplicity in tech is a hot topic, with many developers lamenting the tendency to overcomplicate solutions. As one commenter quipped, "Everything should be made as simple as possible, but not simpler," highlighting the delicate balance between simplicity and functionality. The discussion reveals a common pain point: managers or higher-ups getting enamored with new, shiny technologies, leading to "resume-driven development" and unnecessary complexity. Commenters shared war stories of being tasked with implementing unnecessarily complex solutions, with some fortunate enough to have guidelines or autonomy to resist the trend.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 31m after posting
Peak period: 121 comments in 0-12h
Avg / period: 26.7
Based on 160 loaded comments
Key moments
- Story posted: Aug 29, 2025 at 3:05 PM EDT (4 months ago)
- First comment: Aug 29, 2025 at 3:36 PM EDT (31m after posting)
- Peak activity: 121 comments in 0-12h (hottest window of the conversation)
- Latest activity: Sep 5, 2025 at 9:40 AM EDT (4 months ago)
As someone who has strived for this from early on, the problem the article overlooks is not knowing some of these various technologies everyone is talking about, because I never felt I needed them. Am I missing something I need, and am just ignorant of it, or is that just needless complexity that a lot of people fall for?
I don’t want to test these things out to learn them in actual projects, as I’d be adding needless complexity to systems for my own selfish ends of learning these things. I worked with someone who did this and it was a nightmare. However, without a real project, I find it’s hard to really learn something well and find the sharp edges.
Yeah, let me shoehorn that fishing trip into my schedule without a charge number, along with the one from last week...
That is what my boss asks us to do =p
Though there was a time when he wanted me to onboard my simple little internal website to a big complicated CICD system, just so we could see how it worked and if it would be useful for other stuff. It wouldn’t have been useful for anything else, and I already had a script that would deploy updates to my site that was simple, fast, and reliable. I simply ignored every request to look into that.
Other times I could tell him his idea wouldn’t work, and he would say “ok” and walk away. That was that. This accounted for about 30% of what he came to me with.
Ignorance plays a big role. If you don't perceive, e.g. a race condition happening, then it's much simpler to avoid complicated things like locking and synchronisation.
If you have the belief that your code will never be modified after you commit it, then it's much simpler to not write modifiable code.
If you believe there's no chance of failure, then it's simpler to not catch or think about exceptions.
The simplest thing is global variables, single-letter variable names, string-handling without consideration for escaping, etc.
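To make the race-condition point concrete, here is a minimal sketch (a hypothetical shared counter; the unsynchronised version is the "simplest thing", the locked version is what you only write once you perceive the race):

```python
import threading

counter = 0                      # the "simplest thing": a bare global
lock = threading.Lock()

def naive_increment(n):
    # No synchronisation: read-modify-write, so threads can interleave.
    global counter
    for _ in range(n):
        counter += 1

def locked_increment(n):
    # The version you only bother writing once you perceive the race.
    global counter
    for _ in range(n):
        with lock:
            counter += 1

threads = [threading.Thread(target=naive_increment, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # may print less than 400000: the naive version can lose updates
```

Swapping in locked_increment makes the total deterministic, at the cost of exactly the kind of ceremony that looks "unnecessary" until you know the race exists.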
It’s more about adding additional tools to the stack. I will fight hard not to add in additional layers of complexity that require more infrastructure and maintenance to manage. I want to eliminate as many points of failure as possible, and don’t want the stack to be so complex that other people can’t understand how it all fits together. If I win the lotto, or simply go on vacation, I want whoever has to take it over to be able to understand and support it.
The thing I’ll be working on next week has a lot of potential race conditions, and I need to find a simple solution that avoids them without creating a support and maintainability burden on myself and the future team. Building a database would probably be the easy solution, but that’s one more dependency and thing to maintain, and also means I need to build a front end for people to access it. If I can do it without a database, that would be ideal.
Eventually you might start adding more things to it because of needs you haven't anticipated; do it.
If you find yourself building the tool that does "the whole thing" but worse, then now you know that you could actually use the tool that does "the whole thing".
Did you waste time not using the tool right from the start? That's almost a philosophical question: now you know what you need, you had the chance to avoid it if it turned out you didn't, and maybe 9 times out of 10 you will be right.
“Just because it works doesn’t mean it isn’t broken” is an aphorism that seems to click for people who are also handy in the physical world, but which many software developers think doesn’t sound right. Every handyman has at some time used a busted tool to make a repair. They know they should get a new one, and many will make an excuse to do so at the next opportunity (hardware store trip, or sale). Maybe 8 out of ten.
In software it’s probably more like 1 out of ten who will do the equivalent effort.
On a recent project I fixed our deployment and our hotfix process and it fundamentally changed the scope of epics the team would tackle. Up to that point we were violating the first principle of Continuous: if it’s painful, do it until it isn’t. So we would barely deploy more often than we were contractually (both in the legal and internal cultural sense) obligated to do, and that meant people were very conservative about refactoring code that could lead to regressions, because the turnaround time on a failing feature toggle was a fixed tempo. You could turn a toggle on to analyze the impact but then you had to wait until the next deployment to test your fixes. Excruciating with a high deviation for estimates.
With a hotfix process that actually worked, people would make two or three times as many iterations, to the point we had to start coordinating to keep people from tripping over each other. And as a consequence old nasty tech debt was being fixed in every epic instead of once a year. It was a profound change.
And as is often the case, as the author I saw more benefit than most. I scooped a two year two man effort to improve response time by myself in three months, making a raft of small changes instead of a giant architectural shift. About twenty percent of the things I tried got backed out because they didn’t improve speed and didn’t make the code cleaner either. I could do that because the tooling wasn’t broken.
> It's not enough for a program to work – it has to work for the right reasons
I guess that’s basically the same statement, from a different angle.
Until recently I would say such programs are extremely rare, but now AI makes this pretty easy. Want to do some complicated project-wide edit? I sometimes get AI to write me a one-off script to do it. I don't even need to read the script, just check the output and throw it away.
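A throwaway script of the kind being described might be as small as this sketch (a hypothetical project-wide identifier rename; check the diff, then delete the script):

```python
# One-off throwaway: rename `old_name` to `new_name` across every .py file.
import re
from pathlib import Path

for path in Path(".").rglob("*.py"):
    text = path.read_text(encoding="utf-8")
    updated = re.sub(r"\bold_name\b", "new_name", text)
    if updated != text:
        path.write_text(updated, encoding="utf-8")
        print(f"edited {path}")
```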
But I'm nitpicking, I do agree with it 99% of the time.
By the time you’ve done something five times, it’s probably part of your actual process, and you should start treating it as normal instead of exceptional. Even if admitting so feels like a failure.
So I staple something together that works for the exact situation, then start removing the footguns I’m likely to hit, then I start shopping it to other people I see eye to eye with, fix the footguns they run into. Then we start trying to make it into an actual project, and end game is for it to be a mandatory part of our process once the late adopters start to get onboard.
If they want to use those resources to prioritize quality, I'll prioritize quality. If they don't, and they just want me to hit some metric and tick a box, I'm happy to do that too.
You get what you measure. I'm happy to give my opinion on what they should measure, but I am not the one making that call.
In my second lead role, the CTO and the engineering manager thought I could walk on water, and so I had considerable leeway to change things I thought needed changing.
So one of the first things I did was collectively save the team about 40 hours of code-build-test time per week. Which is really underselling it because what I actually did was both build a CI pipeline at a time nobody knew what “CI” meant, and increase the number of cycles you could reliably get through without staying late from 4 to 5 cycles per day. A >20% improvement in iterations per day and a net reduction in errors. That was the job where I learned the dangers of pushing code after 3:30pm. Everyone rationalizes that the error they saw was a glitch or someone else’s bug, and they push and then come in to find the early birds are mad at them. So better to finish what we now call deep work early and do lighter stuff once you’re tired.
Edit: those changes also facilitated us scaling the team to over twice the size of any project I’d worked on before or for some time after, though the EM deserves equal credit for that feat.
Then they fired the EM and Peter Principled by far the worst manager I’ve ever worked for (fuck you Mike, everyone hated your guts), and all he wanted to know was why I was getting fewer features implemented. Because I’m making everyone else faster. Speaking of broken, the biggest performance bottleneck in the entire app was his fault. He didn’t follow the advice I gave him back when he was working in our query system. Discovering it took hiring an Oracle DB contractor (those are always exorbitant). Fixing it after it shipped was a giant pain (as to why I didn’t catch his corner cutting, I was tagged in by another lead who was triple booked, and when I tagged back out he unfortunately didn’t follow up sufficiently on the things I prescribed).
Then the executives would be stunned that it was done so quickly. The prototype team would pass it off to another team and then move on to the next prototype.
The team that took over would open the project and discover that it was really a proof of concept, not a working site. It wouldn't include basic things like security, validation, error messages, or any of the hundred things that a real working product requires before you can put it online.
So the team that now owned it would often have to restart entirely, building it within the structures used by the rest of our products. The executives would be angry because they saw it "work" with their own eyes and thought the deployment team was just complicating things.
Having the house fall on Buster was <chef's kiss>: https://youtube.com/watch?v=FN2SKWSOdGM
Those are the worst because you don’t have "done" criteria you can reasonably write down. It’s whenever QA stops finding fakes in the code, plus a couple months for stragglers you might have missed.
Meanwhile all the people writing agentic LLM systems: “Hold my beer”
You aren't gonna need it
Alas, you do not have infinite money. But you can earn money by becoming this person for other people.
The catch 22 is most people aren't going to hire the guy who bills himself as the guy who does the simplest thing that could possibly work. It turns out the complexities actually are often there for good reason. It's much more valuable to pay someone who has the ability to trade simplicity off for other desirable things.
"It turns out the complexities actually are often there for good reason" - if they're necessary, then it gets folded into the "could possibly work" part.
The vast majority of complexities I've seen in my career did not have to be there. But then you run into Chesterton's Fence - if you're going to remove something you think is unnecessary complexity, you better be damn sure you're right.
The real question is how AI tooling is going to change this. Will the AI be smart enough to realize the unnecessary bits, or are you just going to layer increasingly more levels of crap on top? My bet is it's mostly the latter, for quite a long time.
Dev cycles will feel no different to anyone working on a legacy product, in that case.
Time and time again, amazingly complex machines just fail to perform better than a rubber band and bubble gum.
This stuff just cannot be reimplemented that simply and be expected to work.
The music was also quite good imo.
Same, or reliability-tiered separately. But in both aspects I more frequently see the resulting system end up more expensive and less reliable.
The situation is extremely frustrating, because I have to be careful not to insult anyone or create endless arguments, while trying to somehow salvage the project into something workable, or convince a team of junior/mid-level engineers to start over (the code is technically not salvageable, at all). Trying to convince people who don't know what they're doing that the same end result could be reproduced in 45 days and then the next 18 months of effort could be condensed into an additional 45 days is like trying to convince an octopus that there are satellites in orbit around the Earth.
Don't add passwords, just "password" is fine. Password policies add complexity.
For services that require passwords just create a shared spreadsheet for everyone.
/s
Yesterday I had a problem with my XLSX importer (which I wrote myself--don't ask why). It turned out that I had neglected to handle XML namespaces properly because Excel always exported files with a default namespace.
Then I got a file that added a namespace to all elements and my importer instantly broke.
For example, Excel always outputs <cell ...> whereas this file has <x:cell ...>.
The "simplest thing that could possibly work" was to remove the namespace prefix and just assume that we don't have conflicting names.
But I didn't feel right about doing that. Yes, it probably would have worked fine, but I worried that I was leaving a landmine for future me.
So instead I spent 4 hours re-writing all the parsing code to handle namespaces correctly.
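For illustration, the difference being described, sketched with Python's ElementTree (the namespace URI is the standard spreadsheetml one; the element names follow the commenter's simplified example rather than real worksheet markup):

```python
import xml.etree.ElementTree as ET

SSML = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Excel's usual output: the namespace is the default, so elements carry no prefix.
default_ns = f'<sheet xmlns="{SSML}"><cell r="A1"/></sheet>'
# The file that broke the importer: same namespace, but bound to a prefix.
prefixed_ns = f'<x:sheet xmlns:x="{SSML}"><x:cell r="A1"/></x:sheet>'

for doc in (default_ns, prefixed_ns):
    root = ET.fromstring(doc)
    # ElementTree normalises every tag to "{uri}localname", so one lookup
    # covers both encodings -- no prefix stripping required.
    cells = root.findall(f"{{{SSML}}}cell")
    print([c.get("r") for c in cells])   # ['A1'] in both cases
```

Matching on namespace URI plus local name is what makes both spellings of the same element land in the same code path.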
Whether or not you agree with my choice here, my point is that doing "the simplest thing that could possibly work" is not that easy. But it does get easier the more experience you have. Of course, by then, you probably don't need this advice.
This avoids the endless whack-a-mole that you get with a partial solution such as "assume namespaces are superfluous", when you almost certainly will eventually discover they weren't optional.
Or some other hapless person using your terrible code will discover it at 2am, sitting alone in the office building, desperately trying to do something mission critical, such as using a "simple" XML export tool to cut over ten thousand users from one Novell system to another so that the citizens of the state have a functioning government in the morning.
Ask me how I know that kind of "probably won't happen" thing will, actually, happen.
I think the author kind of mentions this: "Figuring out the simplest solution requires considering many different approaches. In other words, it requires doing engineering."
But the irony, in my opinion, is that experienced engineers don't need this advice (they are already "doing engineering"), but junior engineers can't use this advice because they don't have the experience to know what the "simplest thing" is.
Still, the advice is useful as a mantra: to remind us of things we already know but, in the heat of the moment, sometimes forget.
The simplest thing can be very difficult to do. It requires thought and understanding of the system, which is what he says at the very beginning. But I think most people read the headline and just started spewing personal grievances.
But an experienced engineer already knows this!
I just think it's ironic that this advice is useless to junior engineers but unneeded by senior engineers.
That's a good way of putting it. The advice essentially boils down to "do the right thing, don't do the wrong thing". Which is good (if common-sense) advice, but doesn't really help in practice with making decisions.
The best solution is the simplest.
The quickest? No, the simplest; sometimes that's longer.
So definitely not a complex solution? No, sometimes complexity is required; it's the simplest solution possible given your constraints.
Soo… basically, the advice is “pick the right solution”.
Sometimes that will be quick. Sometimes slow. Sometimes complex. Sometimes config, Sometimes distributed.
It depends.
But the correct solution will be the simplest one.
It's just: “solve your problems using good solutions, not bad ones”
…and that is indeed both good, and totally useless, advice.
We both read the article; you know as well as I do that the advice in it is to build simple, reliable systems that focus on actual problems, not imagined ones.
…but does not say how to do that; and offers no meaningful value for someone trying to pick the “right” thing in the entire solution space that is both sufficiently complex and scalable to solve the requirements, but not too scalable, or too complex.
There’s just some vague hand waving about over engineering things at Big Corp, where, ironically, scale is an issue that mandates a certain degree of complexity in many cases.
Here’s something that works better than meaningless generic advice: specific, detailed examples.
You will note the total lack of them in this article, and others like it.
Real articles with real advice are a mix of practical examples that illustrate the generic advice they’re giving.
You know why?
…because you can argue with a specific example. Generic advice with no examples is not falsifiable.
You can agree with the examples, or disagree with them; you can argue that examples support or do not support the generic advice. People can take the specific examples and adapt them as appropriate.
…but, generic advice on its own is just an opinion.
I can arbitrarily assert “100% code coverage is meaningless; there are hot paths that need heavy testing and irrelevant paths that do not require code coverage. 100% code coverage is a fool’s game that masks a lack of a deeper understanding of what you should be testing”; it may sound reasonable, it may not. That’s your opinion vs mine.
…but with some specific examples of where it is true, and perhaps, not true, you could specifically respond to it, and challenge it with counter examples.
(And indeed, you’ll see that specific examples turn up here in this comment thread as arguments against it; notably not picked up to be addressed by the OP in their hacker news feedback section)
For example, at work, the simplest solution across the whole organization was to adopt the most complex PostgreSQL deployment structure and backup solutions.
This sounds counter-intuitive at first. But this way, the company can invest ~3 full-time employees in running an HA, PITR-capable PostgreSQL cluster with properly archived backups that ~25 other development teams can rely on. This stack solves so many B2B problems of business continuity, security, backups, and availability.
And on the other hand, for the dev teams, PostgreSQL is suddenly very simple. Inject ~8 variables into a container and you can claim all of these good things for your application without ever thinking about them.
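A sketch of what that hand-off can look like from an application's side (the environment variable names are hypothetical; psycopg2 is just an example driver):

```python
import os
import psycopg2  # example driver; any PostgreSQL client works the same way

# The platform team injects these into the container; the application never
# needs to know about replicas, failover, PITR or backup schedules.
conn = psycopg2.connect(
    host=os.environ["DB_HOST"],
    port=os.environ.get("DB_PORT", "5432"),
    dbname=os.environ["DB_NAME"],
    user=os.environ["DB_USER"],
    password=os.environ["DB_PASSWORD"],
    sslmode=os.environ.get("DB_SSLMODE", "require"),
)
```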
If you had just used a compliant XML parser as intended, you might not even have noticed that different encodings of namespaces were even occurring in the files! It just "doesn't register" when you let the parser handle this for you, in the same sense that if you parse HTML (or XML) properly, then you won't notice all of the &amp; and &lt; encodings either. Or CDATA. Or Unicode escapes. Or anything else for that matter that you may not even be aware of.
You may be a few more steps away from making an XLSX importer work robustly. Did you read the spec? The container format supports splitting single documents into multiple (internal) files to support incremental saves of huge files. That can trip developers in the worst way, because you test with tiny files, but XLSX-handling custom code tends to be used to bulk import large files, which will occasionally use this splitting. You'll lose huge blocks of data in production, silently! That's not fun (or simple) to troubleshoot.
The fast, happy path is to start with something like System.IO.Packaging [2] which is the built-in .NET library for the Open Packaging Conventions (OPC) container format, which is the underlying container format of all Office Open XML (OOXML) formats. Use the built-in XML parser, which handles namespaces very well. Then the only annoyance is that OOXML formats have two groups of namespaces that they can use, the Microsoft ones and the Open "standardised" ones.
[1] Famously! https://stackoverflow.com/questions/8577060/why-is-it-such-a...
[2] https://learn.microsoft.com/en-us/dotnet/api/system.io.packa...
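A rough Python analogue of that approach, for readers outside .NET: treat the xlsx as the OPC/zip container it is and let a namespace-aware parser do the work (the file name and hard-coded part path are illustrative; real code should follow the package relationships instead):

```python
import zipfile
import xml.etree.ElementTree as ET

SSML = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

with zipfile.ZipFile("example.xlsx") as pkg:      # hypothetical file name
    # The container holds many parts, not one big XML document.
    print(pkg.namelist())                         # e.g. xl/workbook.xml, xl/worksheets/sheet1.xml, ...
    with pkg.open("xl/worksheets/sheet1.xml") as part:
        root = ET.parse(part).getroot()
        # Let the parser resolve namespaces instead of stripping prefixes.
        rows = root.findall(f".//{{{SSML}}}row")
        print(f"{len(rows)} rows in this part")
```

Listing the parts also makes the earlier warning visible: a workbook is a collection of internal files, which is exactly why custom importers that assume "one XML document" can silently drop data.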
Namespaces add a wrinkle, but it wasn't that hard to add. And I was able to add namespace aliasing in my API to handle the two separate "standard" namespaces that you're talking about.
But you're right about OPC/OOXML--those are massive specs and even the tiny slice that I'm handling has been error-prone. I haven't dealt with multiple internal files, so that's a future bug waiting for me. The good news is I'm building a nice library of test files for my regression tests!
It really isn't, and rolling your own parser is the diametric opposite of the "do the simplest thing" philosophy.
The XML v1.1 spec is 126 KB of text, and that doesn't even include XML Namespaces, which is a separate spec with 25 KB of text.
XML is only "simple" in the sense of being well-defined, which makes interoperability simple, in some sense. Contrast this with ill-defined or implementation-defined text formats, where it's decidedly not simple to write an interoperable parser.
As an end-user of XML, the simplest thing is to use an off-the-shelf XML parser, one that's had the bugs beaten out of it by millions of users.
There are very few programming languages out that don't have a convenient, full-featured XML parser library ready to use.
Wouldn't it have been less effort and simpler to replace the custom code with an existing XML parser? It appears that in your case the simplest thing would have been easy, though the aphorism doesn't promise "easy".
If using a library wasn't possible for you due to NIH-related business requirements and given the wide proliferation of XML libraries under a multitude of licenses, then your pain appears to have been organizationally self-inflicted. That's going to be hard to generalize to others in different organizations.
I totally agree with you that most people should not implement their own XML parser, much less an Excel importer. But I'm grateful to have the luxury of being allowed/able to do both.
The specific choice I made doesn't matter. What matters is the process of deciding trade-offs between one approach and another.
My point is that the OP advice of "do the simplest thing that could possibly work" doesn't help a junior engineer (who doesn't have the experience to evaluate the trade-off) but it's superfluous for a senior engineer (who already has well-developed instincts).
Still, your experience with those holding "senior" job titles involves greater median expertise than I have found in my experience.
Ignoring the namespace creates ongoing complexity that you have to be aware of. Your solution now just works and users can use namespaces if they want.
The author deals with this in the hacks section.
Of course plenty of times there'll be some abstractions that make the code easier to follow, even at the expense of logic locality. And other times where extra infrastructure is really necessary to improve reliability, or when your in-memory counter hack gets more requirements and replacing it with a dedicated rate limiter lets you delete all that complexity. And in those cases, by all means, add the abstractions or infrastructural pieces as needed.
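For concreteness, the kind of in-memory counter hack being contrasted with a dedicated rate limiter might be nothing more than this sketch (names and limits hypothetical):

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 60
LIMIT = 100
_hits = defaultdict(list)   # client_id -> timestamps of recent requests

def allow(client_id: str) -> bool:
    """Sliding-window limiter kept in process memory: fine for one instance,
    falls over as soon as you run replicas or need persistence."""
    now = time.monotonic()
    recent = [t for t in _hits[client_id] if now - t < WINDOW_SECONDS]
    recent.append(now)
    _hits[client_id] = recent
    return len(recent) <= LIMIT
```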
But in all such cases, I try to ask myself, if I need to hand off this project afterward, which approach is going to make things easiest to explain?
Note that my perception of this has changed over time. Long ago, I was very much in the camp of "simple" meaning: make everything as terse as possible, put everything in its own service, never write code when a piece of infrastructure could do it, decouple everything to the maximum extent, make everything config-based. I ironically remember imagining how delighted the new owners would be to receive such a well-factored thing that was almost no code at all; just abstraction upon abstraction upon event upon abstraction that fit together perfectly via some config file. Of course, the transition was a complete fail, as they didn't care enough to grok how all the pieces were designed to fit together, and within a month, they'd broken just about every abstraction I'd built into it, and it was a pain for anybody to work with.
Since then, I've kept things simpler, only using abstractions and extra infra where it'd be weird not to, and always thinking what's going to be the easiest thing to transition. And even though I'm not necessarily transitioning a ton of stuff, it's generally easier to ramp up teams or onboard new hires or debug problems when the code just does what it says. And it's nice because when a need for a new abstraction becomes apparent, you don't have to go back and undo the old one first.
But there's utility in talking about it. If you teach people that good engineers prepare for Google scale, they will lean towards that. If you teach that unnecessary complexity is painful and slows you down, they will lean towards that.
Maybe we need a Rosetta stone of different simple and complex ways to do common engineering stuff!
They were cognizant of the limitations that are touched on in this article. The example they gave was of coming to a closed door. The simplest thing might be to turn the handle. But if the door is locked, then the simplest thing might be to find the key. But if you know the key is lost, the simplest thing might be to break down the door, and so on. Finding the simplest thing is not always simple, as the article states
IIRC, they were aware that this approach would leave a patchwork of technical debt (a term coined by Cunningham), but the priority on getting code working overrode that concern at least in the short term. This article would have done well to at least touch on the technical debt aspect, IMHO.
http://www.extremeprogramming.org/rules/simple.html
It's interesting you gave that example. Before my first use of a wiki I was on a team that used Lotus Notes and did project organization in a team folder. I loved that Notes would highlight which documents had been updated since the last time I read them.
In the next project, that team used a wiki. It's simpler. But, the fact it didn't tell me which documents had been updated effectively made it useless. People typed new project designs into the wiki but no one saw them since they couldn't, at a glance, know which of the hundreds of pages had been updated since they last read them.
It was too simple
Here's the page for my local makerspace's wiki, which runs on mediawiki:
https://bloominglabs.org/Special:RecentChanges?hidebots=1&li...
> It was too simple
That happens when you ignore requirements, either because they were outright discarded or because they were never recognized. "The simplest thing possible" is understood to include: "to meet all requirements."
> Everything should be made as simple as possible, but not simpler.
And I found a similar quote from Aquinas
> If a thing can be done adequately by means of one, it is superfluous to do it by means of several; for we observe that nature does not employ two instruments where one suffices
(Aquinas, [BW], p. 129).
[0] https://blogs.oracle.com/javamagazine/post/interview-with-ke...
Sometimes better to not assume the worst of people by default. Very easy to not know where something comes from, misremember or come up with something in parallel.
At no time was anything like the "worst of people" anywhere on the radar.
It is possible the OP came to this conclusion without knowing about Ward Cunningham?
Now the problem with the headline, and with repeating it, is that when "just do a simple thing" becomes mandated from management (technical or not), there comes a certain stress about trying to keep it simple, and if you try running with it for complex problems you easily end up with those hacks that become innate knowledge that's hard to transfer, instead of a good design (that seemed complex upfront).
Conversely, I think a lot of "needless complexity" comes from badly planned projects where people, bitten by having to continuously add hacks to handle wild requirements, end up overdesigning something to catch them, only to end up with no more complexity in that area and then playing catch-up with the next area needing ugly hacks (and then they try to design for that area once it has stabilized, and the cycle repeats).
This is why as developers we do need to inject ourselves into meetings (however boring they are) where things that do land up on our desks are decided.
Unfortunately, simplicity is complicated. The median engineer in industry is not a reliable judge of which of two designs is less complex.
Further, "simplicity" as an argument has become something people can parrot. So now it's a knee-jerk fallback when a coworker challenges them about the approach they are taking. They quickly say "This is simpler" in response to a much longer, more sincere, and more correct argument. Ideally the team leader would help suss out what's going on, but increasingly the team lead is a less than competent manager, and simplicity is too complicated a topic for them to give a reliable signal. They prefer not to ruffle feathers and let whoever is doing the work make the call; the team bears the complexity.
What you really learn over time, and what's more useful, is to think along these lines: don't try to solve problems that don't exist yet.
This is a catchy, mantra-like headline, but useless. The article doesn't develop it properly either, in my opinion.
It is best to prepare for problems which don't exist yet. You don't need to solve them, but design with the expectation they may arise. Failure to do so leads to tech debt.
― Dijkstra
And then, there's people who do "resume-driven development" and push for more complexity in their workplace so that they can list real-life work experience for the next door to open. I know someone who made a tool that just installs the Java JDK + an IDE + DBeaver using Rust, so that he can claim that he used Rust in the previous company he worked for.
I generally think we're more obsessed with being perceived as engineers than with actually doing engineering.
"real mastery often involves learning when to do less, not more. The fight between an ambitious novice and an old master is a well-worn cliche in martial arts movies: the novice is a blur of motion, flipping and spinning. The master is mostly still. But somehow the novice’s attacks never seem to quite connect, and the master’s eventual attack is decisive".
I see people adding unnecessary complexity to things all the time, and I probably advocate for keeping things simple on a daily basis. Otherwise designers, product managers, customers, and architects will let their minds naturally add unnecessary complexity to solutions.
But also keep in mind the audience: the kinds of people who are tempted to use J2EE (at the time) with event sourcing and Semantic Web, etc.
This is really a counterbalance to that: let's not add sophistication and complexity by default. We really are better off when we bias towards the simpler solutions vs one that's overly complex. It's like what Dan McKinley was talking about with "Choose Boring Technology". And of course that's true (by and large), but many in our industry act like the opposite is the case - that you get rewarded for flexing how novel you can make something.
I've spent much of my career unwinding the bad ideas of overly clever devs. Sometimes that clever dev was me!
So yes ... it's an overly general statement that shouldn't need to be said, and yet it's still useful given the tendency of many to over-engineer and use unnecessarily sophisticated approaches when simpler ones would suffice.
Some generalizations are necessary to formalize the experience we have accumulated in the industry and teach newcomers.
The obvious problem is that, for some strange reason, lots of concepts and patterns that may be useful when applied carefully become a cult (think clean architecture and clean code), which eventually only makes the industry worse.
For example, clean architecture/ports and adapters/hexagonal/whatever, as I see it, is a very sane and pragmatic idea in general. But somehow, all battles are around how to name folders.
First of all, simplicity is the hardest thing there is. You have to first make something complex, and then strip away everything that isn't necessary. You won't even know how to do that properly until you've designed the thing multiple times and found all the flaws and things you actually need.
Second, you will often have wildly different contexts.
- Is this thing controlling nuclear reactors? Okay, so safety is paramount. That means it can be complex, even inefficient, as long as it's safe. It doesn't need to be simple. It would be great if it was, but it's not really necessary.
- Is the thing just a script to loop over some input and send an alert for a non-production thing? Then it doesn't really matter how you do it, just get it done and move on to the next thing.
- Is this a product for customers intended to solve a problem for them, and there's multiple competitors in the space, and they're all kind of bad? Okay, so simplicity might actually be a competitive advantage.
Third, "the simplest thing that could possibly work" leaves a lot of money on the table. Want to make a TV show that is "the simplest thing that could possibly work"? Get an iPhone and record 3 people in an empty room saying lines. Publish a new episode every week. That is technically a TV show - but it would probably not get many views. Critics saying that you have "the simplest show" is probably not gonna put money in your pocket.
You want a grand design principle that always applies? Here's one: "Design for what you need in the near future, get it done on time and under budget, and also if you have the time, try to make it work well."
I don't follow. I've made simple things many times without having to make a complex thing first.
You just described a podcast. It did work for many (obviously it failed for many as well). That's an excellent example of why one should start with the simplest thing that could possibly work. Probably better than the OP's examples.
The beauty of this approach is that you don't design anything you don't need. The requirements will change, and the design will change. If you didn't write much in the first place, it's easy.
An example is databases. People design their database schemas in incredibly simplistic ways, and then regret it later when the predictable stuff most people need doesn't work with the old schema, and you can't even just add columns, but you have to modify existing ones. Avoid the nightmare by making it reasonably extensible from the start. It may not be "the simplest thing that could possibly work", but it is often useful and doesn't cost you anything extra.
Just as much as people say "don't prematurely optimize", they should also say "don't prematurely make it total crap".
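As a hedged illustration of the schema point, using SQLite for brevity (the table and columns are purely hypothetical examples of cheap, up-front extensibility):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The "simplest thing": a natural key and nothing else, painful to evolve later.
-- CREATE TABLE users (email TEXT PRIMARY KEY, name TEXT);

-- Marginally less simple, much easier to extend without rewriting existing columns:
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,                     -- surrogate key: emails can change
    email      TEXT NOT NULL UNIQUE,
    name       TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT (datetime('now')), -- audit fields cost nothing now
    updated_at TEXT
);
""")
conn.execute("INSERT INTO users (email, name) VALUES (?, ?)", ("a@example.com", "Ada"))
print(conn.execute("SELECT id, email, created_at FROM users").fetchone())
```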
Anyone proclaiming simplicity just hasn't worked at scale. Even rewrites that have a decade-old code base to be inspired from often fail due to the sheer amount of things to consider.
A classic, Chesterton's Fence:
"There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.”"
What is far more likely is the proverbial "JS framework problem:" gah, this technology that I read about (or encounter) is too complex, I just want 1/10th that I understand from casually reading about it, so we should replace it with this simple thing. Oh, right, plus this one other thing that solves a problem. Oh, plus this other thing that solves this other problem. Gah, this thing is too complex!
I'm not saying that these jobs are bullshit in the same way that a VP of box-ticking is, just that it's not a conspiracy that a cathedral based on 'design-doc culture' might produce incentives that result in people who focus on maximising their performance on these fiscally rewarding dot points, rather than actualising their innate belief in performant and maintainable systems.
I work at a start-up so if my code doesn't run we don't get paid. This motivates me to write it well.
https://inthesetimes.com/article/capitalism-job-bullshit-dav...
It’s not the same as introducing complexity to keep yourself employed, but the result is the same and so is the cause - incentive structures aren’t aligned at most companies to solve problems simply and move on.
The complexity comes from the fact that at scale, the state space of any problem domain is thoroughly (maybe totally) explored very rapidly.
That’s a way bigger problem than system complexity, and pretty much any system complexity is usually the result of edge cases that need to be solved, rather than bad architecture, infrastructure, or organisational issues; those problems are only significant at smaller, inexperienced companies. By the time you are at post-scale (if the company survives that long), state space exploration in implementation (features, security, non-stop operations) is where the complexity is.
At the scale you are mentioning, even "simple" solutions must be very sophisticated and nuanced. How does this transformation happen naturally from an engineer at a startup where any mainstream language + Postgres covers all your needs, to someone who can build something at Google scale?
Let's disregard the grokking of system design interview books and assume that system design interviews do look at real skills instead of learning common buzzwords.
I built a hobby system for anonymously monitoring BitTorrent by scraping the DHT. In doing this, I learned how to build a little cluster, how to handle 30,000 writes a second (which I used Cassandra for; this was new to me at the time), and then how to build simple analytics on it to measure demand for different media.
Then my interview was just talking about this system, how the data flowed, where it can be improved, how is redundancy handled, the system consisted of about 10 different microservices so I pulled the code up for each one and I showed them.
Interested in astronomy? Build a system to track every star/comet. Interested in weather? Do SOTA predictions. Interested in geography? Process the open-source global gravity maps. Interested in trading? Build a data aggregator for a niche.
It doesn’t really matter whether whatever you build “is the best in the world or not” - the fact that you built something, practiced scaling it with whatever limited resources you have, were disciplined enough to take it to completion, and didn’t get stuck down some rabbit hole endlessly re-architecting stuff that doesn’t matter is what they’re looking for: good judgement, discipline, experience.
Also attitude is important, like really, really important - some cynical ranter is not going to get hired over the “that’s cool, I can do that!” person, even if the cynical ranter has greater engineering skills; genuine enthusiasm and genuine curiosity are infectious.
There are steps that most take. Start with caching. Then you learn about caching strategies because the cache gets slow. Then you shard the database and start managing multiple database connections and readers and writers. Then you run into memory, cpu, or i/o pressure. Maybe you start horizontally scaling. Connections and file descriptors have limits you learn about. Proxies might enter your lexicon. Monitoring, alerting, and testing all need improvement. And recently teams are getting harder to manage and projects are getting slower. Maybe deploying takes forever. So now we break up into different domains. Core backend, control panel, compliance, event processing, etc.
As the org grows and continues to change, more and more stakeholders appear. Security, API design, different cost modeling, product and design, and this web of stakeholders all have competing needs.
Go back to my opening stanza. Rinse and repeat.
Doing this exposes patterns and erroneous solutions. You work to find the least complex solution necessary to solve the known constraints. Simple is not easy (great talk, look it up). The learnings from these battle scars are what make a staff-level engineer, methinks. You gain stories and tools for delivering solutions that solve increasingly larger systems and organizations. I recently was the technical lead for a 40-team software project. I gained some more scars and learnings.
An expert is someone who has made and learned from many mistakes in a narrow field. Those learnings and lessons get passed down in good system design interview books, like Designing Data Intensive Applications.
Those are things that matter and can't be brushed away though.
What Conway's law describes is also optimization of the software to match the shape in which it can be developed and maintained with the least friction.
Same for infra: complexity induced by it shouldn't be simplified unless you also simplify/abstract the infra first.
If the software base is full of gotchas and unintended side effects, then the source of the problem is unclean separation of concerns and tight coupling. Of course, at some point refactoring just becomes an almost insurmountable task, and if the culture of the company does not change, more crap will be added before even one of your refactorings lands.
Believe me, it's possible to solve complex problems by clean separation of concerns and composability of simple components. It's very hard to do well, though, so lots of programmers don't even try. That's where you need strict ownership of seniors (who must also subscribe to this point of view).
Sometimes the problem is in the edges—the way the separate concerns interact—not in the nodes. This may arise, for example, where an operation/interaction between components isn't idempotent because the need for it to be never came up.
Again, wrong design. Like I said, it's very difficult to do well. Consider alternate architecture: one component adds the bulk data to request, the second component modifies it and adds other data, then the data is sent to transaction manager that commits or fails the operation, notifying both components of the result.
Now, if the first component is one k8s container already writing to the database and second is then trying to modify the database, rearchitecting that could be a major pain. So, I understand that it's difficult to do after the fact. Yet, if it's not done that way, the problem will just become bigger and bigger. In the long run, it would make more sense to rearchitect as soon as you see such a situation.
Do you know how you get such a system? You start with a simple system and, instead of redesigning it to reflect the complexity, you just keep the simple system working while extending it to shoehorn in the features it needs to meet the requirements.
We get this all the time, especially when junior developers join a team. Inexperienced developers are the first ones complaining about how things are too complex for what they do. More often than not that just reflects opinionated approaches to problem domains they are yet to understand. Because all problems are simple once you ignore all constraints and requirements.
Shoehorning things into working systems is something I have seen juniors do. I have also seen "seniors" do this, but in my view, they are still juniors with more years working on the same code base.
I have once heard it described as "n years of 1 year of experience". In other words, such a person never learns that the program design space must continuously be explored, and that recurrence of bugs in the same part of the code usually means that a different design is required. They never learn that the cause of the bug was not that particular change that caused the unintended side effect, but the fact that there is a side effect at all, which is a design bug all of its own.
I do agree, though, that TFA may be proposing sticking with simpler design for longer than advisable.
This isn't to say you should never try to refactor or improve things, but make sure that it's going to work for 100% of your use cases, that you're budgeted to finish what you start, and that it can be done iteratively with the result of each step being an improvement on the previous.
No one can predict how efficacious that attempt will be from the get-go. Eventually, often people find out that their assumptions were too naive or they don’t have enough budget to push it to completion.
Successful refactoring attempts start small and don’t try to change the universe in a single pass.
In most of these cases, a few days up front exploring edge cases would have identified the problems and likely would have red lighted the project before it started. It can make you feel like a party pooper when everyone is excited about the new approach, but I think it's important that a few people on the team are tasked with identifying these edge cases before greenlighting the project. Also, maybe productionize your easiest case first, just to get things going, but then do your hardest case second, to really see if the benefits are there, and designate a go/rollback decision point in your schedule.
Of course, such problems can come up in any project, but from what I've seen they tend to be more catastrophic in refactoring/rearchitecting projects. If nothing else, because while unforeseen difficulties can be hacked around for new feature launches, hacking around problems completely defeats the purpose of a refactoring project.
And that’s usually because the person or small group that began the refactor weren’t given the time and resources to do it: uninterested or unknowledgeable people hijacked and overcomplicated the process, others blocked it from happening, and the initial team was pulled in 10 different directions to fight other fires. What would have taken them a few weeks to complete successfully, with a little help and cooperation from others, instead dragged on for months and months; after expending tons of time and money on people mucking it up instead of fixing it, the refactor got abandoned, a million dollars was wasted, and the system as a whole was worse than it was before.
Simple stuff has tons of long-term advantages and benefits - it's easy to ramp up new folks on it compared to some over-abstracted, hypercomplex system built because some lead dev wanted to try new shiny stuff for their CV or out of boredom. It's easy to debug, migrate, evolve and just generally maintain, something pure devs often don't care much about unless they become more senior.
Complex optimizations are for sure required for extreme performance or the massive public web, but that's not the bulk of global IT work done out there.
I always felt software is like physics: Given a problem domain, you should use the simplest model of your domain that meets your requirements.
As in physics, your model will be wrong, but it should be useful. The smaller it is (in terms of information), the easier it is to expand if and when you need it.
227 more comments available on Hacker News