Testing Is Better Than Data Structures and Algorithms
Posted 3 months ago · Active 3 months ago
nedbatchelder.com · Tech · Story · High profile
Sentiment: heated / mixed
Debate: 85/100
Key topics
Software Testing
Data Structures and Algorithms
Software Development
The article argues that testing is more important than data structures and algorithms for in-the-trenches software engineers, sparking a heated debate among commenters about the importance of each.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 3h after posting
Peak period: 58 comments in the 6-12h window
Avg / period: 13.3
Comment distribution: 160 data points (based on 160 loaded comments)
Key moments
01. Story posted: Sep 22, 2025 at 12:21 PM EDT (3 months ago)
02. First comment: Sep 22, 2025 at 2:53 PM EDT (3h after posting)
03. Peak activity: 58 comments in the 6-12h window, the hottest stretch of the conversation
04. Latest activity: Sep 25, 2025 at 4:34 PM EDT (3 months ago)
ID: 45335635 · Type: story · Last synced: 11/20/2025, 8:47:02 PM
In-the-trenches experience (especially "good" or "doing it right" experience) can be hard to come by; and why not stand on the shoulders of giants when learning it the first time?
Property-Based Testing with PropEr, Erlang, and Elixir by Fred Hebert. While a book about a particular tool (PropEr) and pair of languages (Erlang and Elixir), it's a solid introduction to property-based testing. The techniques described transfer well to other PBT systems and other languages.
Test-Driven Development by Kent Beck.
https://www.fuzzingbook.org/ by Zeller et al. and https://www.debuggingbook.org/ by Andreas Zeller. The latter is technically about debugging, but it has some specific techniques that you can incorporate into how you test software. Like Delta Debugging, also described in a paper by Zeller et al. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=988....
I'm not sure of other books I can recommend; the rest of what I know is from learning on the job or studying specific tooling and techniques.
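Delta Debugging, mentioned above, is worth a concrete illustration. Here is a minimal sketch of the core ddmin idea in Python, assuming a hypothetical `triggers_bug` predicate for the code under test; it is a simplification of Zeller's algorithm, not the full version from the paper.

```python
def reduce_failing_input(data, triggers_bug):
    """Shrink `data` (a list) while `triggers_bug(data)` stays True.

    A simplified take on Zeller's delta debugging (ddmin): try removing
    chunks of decreasing size and keep any reduction that still fails.
    """
    chunk = len(data) // 2
    while chunk >= 1:
        i = 0
        while i < len(data):
            candidate = data[:i] + data[i + chunk:]   # drop one chunk
            if candidate and triggers_bug(candidate):
                data = candidate                      # keep the smaller failing input
            else:
                i += chunk                            # chunk was needed; move on
        chunk //= 2
    return data


if __name__ == "__main__":
    # Toy "bug": the program fails whenever both 'a' and 'z' appear in the input.
    crash = lambda s: "a" in s and "z" in s
    big_input = list("qwertyuiopasdfghjklzxcvbnm" * 3)
    print("".join(reduce_failing_input(big_input, crash)))   # something like "az"
```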
The Art of Software Testing, Second Edition, with Tom Badgett and Todd M. Thomas. New York: Wiley, 2004.
It is by Glenford Myers (and others).
https://en.m.wikipedia.org/wiki/Glenford_Myers
From the top of that page:
[ Glenford Myers (born December 12, 1946) is an American computer scientist, entrepreneur, and author. He founded two successful high-tech companies (RadiSys and IP Fabrics), authored eight textbooks in the computer sciences, and made important contributions in microprocessor architecture. He holds a number of patents, including the original patent on "register scoreboarding" in microprocessor chips.[1] He has a BS in electrical engineering from Clarkson University, an MS in computer science from Syracuse University, and a PhD in computer science from the Polytechnic Institute of New York University. ]
I got to read it early in my career, and applied some of it, when I could, in commercial software projects I was part of or led.
Very good book, IMO.
There is a nice small testing-related question at the start of the book that many people don't answer well or fully.
That turned out to be bullshit. Today, with computers many orders of magnitude faster, using randomly generated tests is a very cost-effective way of testing, compared to carefully handcrafted tests. Use extremely cheap machine cycles to save increasingly expensive human time.
I agree that random testing can be useful. For example, one kind of fuzzing is using tons of randomly generated test data against a program to try to find unexpected bugs.
But I think both kinds have their place.
Also, I think the author might have meant that random testing is bad when used with a small amount of test data, in which case I'd agree with him, because in that case an equally small amount of carefully crafted test data would be the better option, e.g. using some test data in each equivalence class of the input.
"In general, the least effective methodology of all is random-input testing—the process of testing a program by selecting, at random, some subset of all possible input values. In terms of the likelihood of detecting the most errors, a randomly selected collection of test cases has little chance of being an optimal, or even close to optimal, subset. Therefore, in this chapter, we want to develop a set of thought processes that enable you to select test data more intelligently."
You can immediately see the problem here. It's optimizing for number of tests run, not for the overall cost of creating and running the tests. It's an attitude suited to when running a program was an expensive thing using precious resources. It was very wrong in 2012 when this edition came out and even more wrong today.
Even better, it subsumes many other testing paradigms. For example, there was all sorts of talk about things like "pairwise testing": be sure to test all pairwise combinations of features. Well, randomly generated tests will do that automatically.
I view random testing as another example of the Bitter Lesson, that raw compute dominates manually curated knowledge.
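To make the random-vs-handcrafted point concrete, here is a minimal sketch of randomized testing in Python: thousands of generated inputs checked against a trivial oracle. `buggy_sort` is a hypothetical stand-in for code under test; property-based tools like PropEr or Hypothesis add shrinking and smarter generation on top of this basic loop.

```python
import random

def buggy_sort(xs):
    # Hypothetical code under test: forgets that equal elements exist.
    return sorted(set(xs))

def test_sort_randomly(trials=10_000, seed=42):
    rng = random.Random(seed)          # fixed seed keeps failures reproducible
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        got = buggy_sort(xs)
        expected = sorted(xs)          # the oracle: Python's built-in sort
        assert got == expected, f"failed on input {xs!r}: {got!r} != {expected!r}"

if __name__ == "__main__":
    # Cheap machine cycles, no handcrafted cases: the duplicate-dropping bug
    # is found within the first few random inputs.
    test_sort_randomly()
```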
To boil down the tests I like to see: structure them with "Given/when/then" statements. You don't need a framework for this; just make method calls with whatever unit test framework you are using. Keep the methods small, and don't do a whole lot of "then"s - split that into multiple tests. Structure your code so that you aren't testing too deep. Ideally, you don't need to stand up your entire environment to run a test. But do write some of those tests; they are important for catching issues that can hide between unit tests.
[1] https://cucumber.io/docs/bdd/
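A minimal sketch of the Given/When/Then shape described above, using plain helper functions and a pytest-style test rather than a BDD framework; `ShoppingCart` is a made-up example class, not something from the article or thread.

```python
# A made-up class under test, just to show the test shape.
class ShoppingCart:
    def __init__(self):
        self.items = {}

    def add(self, sku, qty=1):
        self.items[sku] = self.items.get(sku, 0) + qty

    def total_quantity(self):
        return sum(self.items.values())


# --- Given/When/Then as plain helper calls, no framework needed ---

def given_a_cart_with(sku, qty):
    cart = ShoppingCart()
    cart.add(sku, qty)
    return cart

def when_more_is_added(cart, sku, qty):
    cart.add(sku, qty)
    return cart

def test_adding_same_sku_accumulates_quantity():
    cart = given_a_cart_with("apple", 2)     # Given
    when_more_is_added(cart, "apple", 3)     # When
    assert cart.total_quantity() == 5        # Then: one assertion, one behavior
```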
As for learning, no, it's not. You should not spend as much time learning testing as you spend learning data structures.
> People should spend less time learning DSA, more time learning testing.
And you're reading that as "More total time should be spent on learning testing than the total time spent learning DSA". That's one reading; another is that people are studying DSA too much and testing too little. The ratio of total time can still be in favor of studying DSA more, but maybe instead of 10:1 it should be more like 8:1 or 5:1.
> esoteric things like Bloom filters, so you can find them later in the unlikely case you need them.
They are not esoteric, they are trivial and extremely useful in many cases.
> Less DSA, more testing.
Testing can't cover all the cases by definition, so why not property testing? Why not formal proofs?
Plus, these days, it's easy to delegate test-case writing to LLMs, while they literally cannot invent new useful algorithms and data structures.
I've not run into a case where I can apply a bloom filter. I keep looking because it always seems like it'd be useful. The problem I have is that a bloom filter has practically the reverse characteristics from what I want. It gives false positives and true negatives. I most often want true positives and false negatives.
That would be a simple cache in most instances.
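For anyone who hasn't met one, here is a minimal Bloom filter sketch in Python that shows the asymmetry described above: a "no" answer is definite, a "yes" answer may be a false positive. The hash construction and sizes are arbitrary illustrative choices.

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0                      # an int used as a bit array

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False -> definitely not present (no false negatives)
        # True  -> possibly present (false positives are allowed)
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


seen = BloomFilter()
seen.add("user:42")
print(seen.might_contain("user:42"))    # True (really present)
print(seen.might_contain("user:99"))    # usually False; occasionally a false positive
```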
What's more concerning is "engineers" incurious about how lower levels of the stack work, or aren't interested in learning breadth, depth, or new things.
If you focus on testing over data structures, you might end up testing something that you didn't need to test because you used the wrong data structures.
IMHO too often people don't consider big O because it works fine with their 10-row test case... and then it grinds to a halt when given a real problem.
The article is saying that it's more important to write tests than it is to learn how to write data structures. It specifically says you should learn which data structures you should use, but don't focus on knowing how to implement all them.
It calls out, specifically, that you should know that `sort` exists but you really don't need to know how to implement quicksort vs selection sort.
You don't have to go super deep on all the sort algorithms, sure. That's like saying that learning testing implies writing a mocking library
The reverse also happens frustratingly often. One could spend a lot of time obsessing over theoretical complexity only for it to amount to nothing. One might carefully choose a data structure and algorithm based on these theoretical properties and discover that in practice they get smoked by dumb contiguous arrays just because they fit in caches.
The sad fact is the sheer brute force of modern processors is often enough in the vast majority of cases so long as people avoid accidentally making things quadratic.
Sometimes people don't even do that and we get things such as the GTA5 dumpster fire.
https://news.ycombinator.com/item?id=26296339
There are a lot of times where I could spend mental energy writing “correct” code which trades off space for time etc. Sometimes, it’s worth it, sometimes not. But it’s better to spend an extra 30 seconds of CPU time running the code than an extra 10 minutes carefully crafting a function no one will see later, or that someone will see but is harder to understand. Simpler is better sometimes.
What Big O gives you is an ability to assess the tradeoffs. Computers are fast so a lot of times quadratic time doesn’t matter for small N. And you can always optimize later.
Not if the user can, say, farm 1,000,000 different rows 100 times over an hour and a half while gossiping with their office mates. I offer Excel as Exhibit A.
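A small illustration of the "accidentally quadratic" trap discussed above: the same deduplication loop against a list is O(n^2), while against a set it is roughly O(n). The sizes here are arbitrary, just enough to make the difference visible.

```python
import time

def dedupe_quadratic(items):
    seen = []                      # list membership test is O(n) each time
    out = []
    for x in items:
        if x not in seen:          # O(n) scan inside the loop -> O(n^2) overall
            seen.append(x)
            out.append(x)
    return out

def dedupe_linear(items):
    seen = set()                   # set membership test is O(1) on average
    out = []
    for x in items:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out

data = list(range(5_000)) * 2      # fine on a 10-row test case, painful at scale
for fn in (dedupe_quadratic, dedupe_linear):
    start = time.perf_counter()
    fn(data)
    print(fn.__name__, f"{time.perf_counter() - start:.3f}s")
```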
spend plenty of time studying data structures and algorithms as well as computer architecture. these are actually difficult things that take a long time to understand and will have a positive impact on your career.
study the underlying disciplines of your preferred domain.
in general, focus on more fundamental things and limit the amount of time you spend on stupid shit like frameworks, build systems, quirks of an editor or a programming language. all these things will find a way to steal your time _anyway_, and your time is extremely precious.
"testing" is not fundamental. there is no real skill to be learned there, it's just one of those things that will find a way to steal your time anyway so there is no point in focusing actively on it.
put it that way: you will NEVER get the extra time to study fundamental theory. you will ALWAYS be forced to spend time to write tests.
if you somehow find the time, spend it on things that are worth it.
that's an edgy take and a red flag
nobody goes to school to learn how to use git or how to write unit tests. it's not something that needs to be actively "learned", you'll just absorb it eventually because you can't escape it.
The more interesting and important things you will never "just absorb", you actually have to make a conscious effort to engage with them.
> "testing" is not fundamental.
and
> there is no real skill to be learned there
One of the biggest problems that has plagued software is failed projects. There have been a lot of them, and it has probably cost hundreds of billions of dollars.
I can guarantee not one of those projects failed because somebody had to take the time to look up the best data structure. But I'll bet a lot of them failed because they didn't follow smart testing practices and collapsed under their own weight of complexity, untestability and inflexibility.
I've seen projects fail for a multitude of reasons, by far the most common are boring political ones, like the leadership not understanding what it is that they want to build.
Hiring people who think bloom filters are "exotic" to work on a distributed system could certainly doom that project to failure regardless of how diligently tested it is.
I assure you that if you have enough competence to actually go through with designing and building a thing, you certainly have more than enough competence to test it. It is not a fundamental discipline that needs to be studied, much less at the expense of fundamental knowledge.
Edit: to reframe it a bit differently: you can always add more tests. you can't fix the problems you don't even know you have due to lack of thorough understanding of the problem domain.
[7 Software Failures Due To Lack Of Testing That Rocked The World](https://www.appsierra.com/blog/software-failures-due-to-lack...)
> Hiring people who think bloom filters are "exotic" to work on a distributed system could certainly doom that project to failure regardless of how diligently tested it is.
Citation needed.
> Edit: to reframe it a bit differently: you can always add more tests. you can't fix the problems you don't even know you have due to lack of thorough understanding of the problem domain.
The problem domain is never the data structure or algorithm.
Do you really think you can prove your point by showing me some sloplist of mildly high profile bugs? All these systems had extensive test suites and yet these problems happened anyway.
Bugs happen in extensively tested systems literally all the time, but by your own logic, any bug is "due to lack of testing". That's an unproductive line of reasoning because it is not possible or practical to test for every possible eventuality. This is why fields like formal verification exist.
>> Hiring people who think bloom filters are "exotic" to work on a distributed system could certainly doom that project to failure regardless of how diligently tested it is.
> Citation needed.
ever tried to build a distributed cache??
> The problem domain is never the data structure or algorithm.
The problem domain is literally always that. The way your data is organized and the way you work with it is directly affected by the exact problem you are solving.
Whatever I link to, you are just going to say it's AI, or inconclusive. There's a section on testing in The Mythical Man-Month, but I can't link it here. But I don't see anything on getting the "fundamental theory" wrong.
> All these systems had extensive test suites and yet these problems happened anyway.
They were obviously missing some important tests.
> ever tried to build a distributed cache??
Why would I if I could avoid it? And building that, rather than finding it somewhere, looks like a nudge towards project failure.
> The problem domain is literally always that. The way your data is organized and the way you work with it is directly affected by the exact problem you are solving.
That's just basic programming in the type system of your chosen language, not "fundamental theory" as you call it.
There is no point in linking anything further because your line of reasoning is flawed to begin with, due to two reasons.
Reason one, your argument amounts to: software has bugs, and every bug is there because there was no test that would prevent that specific kind of bug (and it would if they were doing testing "correctly"). This is a completely vacuous argument because all software has bugs and therefore no one is doing testing "correctly" to your satisfaction anyway, which makes the whole discussion moot.
Reason two, you seem to be assuming that I am somehow advocating for not doing software quality assurance or not writing any tests. I am not. I am arguing that it is not worth investing extra time into learning that discipline, because a) not fundamental; b) you will be forced to learn it anyway. Therefore, spend your precious extra time on more interesting and useful things.
> But I don't see anything on getting the "fundamental theory" wrong.
Typically projects that get the basics wrong don't live long enough to find themselves in an AI training corpus used to generate listicles.
> Why would I if I could avoid it?
a simple "no" would have sufficed to establish that your opinion on usefulness of bloom filters in distributed systems probably shouldn't be weighed very high.
> That's just basic programming in the type system of your chosen language, not "fundamental theory" as you call it.
The fundamental theory bit helps to choose the appropriate data organization for your use case and either implement it yourself or modify a pre-existing implementation, or convince yourself that a pre-existing implementation is sufficient.
Of course, "testing," is in the eye of the beholder.
Some folks are completely into TDD, and insist that you need to have 100% code coverage tests, before writing one line of application code, and some folks think that 100% code coverage unit tests, means that the system is fully tested.
I've learned that it's a bit more nuanced than this[0].
[0] https://littlegreenviper.com/testing-harness-vs-unit/
I think that a lot of people dislike testing because a lot of test suites can run for hours. In my case it is almost 6 hours from start to finish. However, as a software developer I have accumulated a lot of computers which are kind of good and I don't want to throw them out yet, but they are not really usable for current development - i.e. 8GB of RAM, 256GB SSD, i5 CPU from 2014. It would be a punishment to use that with Visual Studio today, but it is a perfect machine for compiling in the console, i.e. dotnet build or msbuild, and running tests via vstest, glued together with a PowerShell script. So this dedicated testing machine runs on changes overnight, and I will see whether it passed or not and, if not, fix the tests that did not pass.
This setup may feel clunky, but it allows me to make sweeping changes in a codebase and be confident enough that, if the tests pass, it will very likely work for the customer too. The most obvious example where the tests carried me has been moving to .NET 8 from .NET Framework 4.8. I went from a 90% failure rate on tests to all tests clear in like 3-4 iterations.
I would assume that Microsoft systems could do the same.
Otherwise yes, you can run tests in parallel in vstest. That's completely possible.
Quite familiar with the drill. Carry on...
Yet much of the safety critical code we rely on for critical infrastructure (nuclear reactors, aircraft, drones, etc) is not tested in-situ. It is tested via simulation, but there's minimal testing in the operating environment which can be quite complex. Instead the code follows carefully chosen design patterns, data structures and algorithms, to ensure that the code is hazard-free, fault-tolerant and capable of graceful degradation.
So, testing has its place, but testing is really no better than simulation. And in simulation, the outputs are only as good as the inputs. It cannot guarantee code safety and is not a substitute for good software design (read: structures and algorithms).
Having said that, fuzzing is a great way to find bugs in your code, and highly recommended for any software that exposes an API to other systems.
Tests give the freedom to refactor which results in better code.
>So, testing has its place, but testing is really no better than simulation
Testing IS simulation and simulation IS testing.
>And in simulation, the outputs are only as good as the inputs. It cannot guarantee code safety
Only juniors think that you can get guarantees of code safety. Seniors look for ways to de-risk code, knowing that you're always trending towards a minima.
One of the key skills in testing is defining good, realistic inputs.
1 - If you work on large scale software systems, especially infrastructure software of most types then you need to know and understand DSA and feel it in your bones.
2 - Most people work on crud apps or similar and don't really need to know this stuff. Many people in this camp don't realize that people working on 1 really do need to know this stuff.
What someone says on this topic says more about what things they have worked on in their life than anything else.
This is the crux of the debate. If you work on CRUD apps, you basically need to know hash maps and lists, but getting better at SQL and writing clean code is good. But there are many areas where writing the right code vs the wrong code really matters. I was writing something the other day where one small in-loop operation was the difference between a method running in milliseconds and minutes. Or choosing the right data structure can simplify a feature into 1/10th the code and make it run 100x better than the wrong one.
It's never "the other day"; it's 10x a day, every day.
So, OP is still correct.
It is a very basic flaw in their logical thinking ability.
I never cease to be amazed by the number of HN people who display this flaw via their comments.
Or rather, I have ceased to be amazed, because I have seen it so many times by now here, and got resigned to the fact that it's gonna continue.
- What you want to do is probably trivially O(kn).
- There isn't a <O(kn) algorithm, so try to reduce overhead in k.
- Cache results when they are O(1), don't when they are O(n).
- If you want to do something >O(kn), don't.
- If you really need to do something >O(kn), do it in SQL and then go do something else while it's running.
None of that requires any DSA knowledge beyond what you learn in the first weeks of CS101. Instead, what's useful is knowing how to profile to optimize k, knowing how SQL works, and being able to write high quality maintainable code. Any smart algorithms that have a large time complexity improvement will probably be practically difficult to create and test even if you are very comfortable with the underlying theoretical algorithm. And the storage required for an <O(n) algorithm is probably at least as expensive as the compute required for the naive O(n) algorithm.
My general impression is that for small-scale problems, a trustworthy and easy algorithm is fine, even if it's inefficient ($100 of compute < $1000 of labor). For large-scale problems, domain knowledge and data engineering trumps clever DSA skills. The space between small- and large-scale problems is generally either nonexistent or already has premade solutions. The only people who make those "premade solutions" obviously need to feel it in their bones the way you describe, but they're a very very small portion of the total software population, and are not the target audience of this article.
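Since the comment above leans on "knowing how to profile to optimize k", here is a minimal sketch using Python's standard cProfile; `process_rows` and `parse` are hypothetical hot functions, not anything from the thread.

```python
import cProfile
import pstats

def parse(row):
    return [int(field) for field in row.split(",")]

def process_rows(rows):
    # Hypothetical O(k*n) work: the algorithm is already linear in the rows,
    # so the wins come from shrinking the constant factor k.
    return sum(sum(parse(row)) for row in rows)

rows = [",".join(str(i + j) for j in range(10)) for i in range(50_000)]

profiler = cProfile.Profile()
profiler.enable()
process_rows(rows)
profiler.disable()

# Show where the time actually goes before reaching for a cleverer algorithm.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```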
As the GP said:
>>What someone says on this topic says more about what things they have worked on in their life than anything else.
This is right. And most of those people know a lot of their job is very far removed from many other software engineers. But the prevalence of the idea "you don't really use DSA in practice" does suggest many people building applications where DSA isn't as applicable seem to misunderstand the situation. It matters in some sense - it explains why interviews at google are the way they are, why universities teach what they teach, what one should do if they really like such things.
CRUD apps are the ones that become more complex, not less. The idea that a "CRUD app" is the poster child of simplicity is mega-misleading.
Building an ERP or similar will eat you alive in ways that building a complete OS from scratch, with all the features of Linux and more, will not. (Probably the only part that is as hard as "crud apps" is the drivers, and that is because there you see what kind of madness it is to interface with other people's code.)
(Alternatively you could just argue it's a false dichotomy)
In software development hiring, everyone tests for DSA whether it is useful or not in the actual job description.
When I went to college in the late 1990s, we were right on the verge of a major transition from DSAs being something every programmer would implement themselves to something that you just pick up out of your libraries. So it makes sense that we would have some pretty heavy-duty labs on implementing very basic data structures.
That said, I escaped into the dynamic programming world for the next 15 years or so, so I almost never actually did anything of significance with this. And now even in the static world, I almost never do anything with this stuff directly because it's all libraries for them now too. Even a lot of modern data structures work is just using associative maps and arrays together properly.
So I would agree that we could A: spend somewhat less time on this in the curriculum and B: tune it to more about how to use arrays and maps and less about how to bit bang efficient hash tables.
People always get frosty about trying to remove or even "tune down" the amount of time spent in a curriculum, but consider the number of things you want to add and consider that curricula are essentially zero-sum games; you can't add to them without removing something. If we phrase this in terms of "what else could we be teaching other than a fifth week on pointer-based data structures" I imagine it'll sound less horrifying to tweak this.
Not that it'll be tweaked, of course. But it'd be nice to imagine that I could live in a world where we could have reasonable discussions about what should be in them.
Anyways, now I’m a full-time lecturer teaching undergraduate CS courses (long story) and I’m actually shaping curriculum. As soon as I read this article I thought “I need to tell my data structures students to read this” because it echoes a lot of what I’ve been saying in class.
Case in point: right after two lectures covering the ArrayList versus LinkedList implementations of the Java List interface, I spent an entire lecture on JUnit and live-coded a performance test suite that produced actual data to back up our discussions of big-O complexity. The best part of all? They learned about JIT compilation in the JVM firsthand because it completely blew apart the expected test results.
I would love to hear that story if you're willing to tell it.
It sounds like you're a great lecturer, though, giving the students exactly the sort of stuff they need. I remember a university lecturer explaining to us that "JIT" just meant that Java loaded the class files when it needed them, rather than loading them all at the start, so your lesson sounds like a far cry from those days!
Not everyone works on web sites using well-optimized libraries; some people need to know about N and Nlog(N) vs N^2.
Every programmer should know enough to at least avoid accidentally making things quadratic.
https://news.ycombinator.com/item?id=26296339
It's often a case of "N won't be large here" and then later N does sometimes turn out to be large.
> We love those engineers: they write libraries we can use off the shelf so we don’t have to implement them ourselves.
The world needs to love "infrastructure developers" more. To me it seems only the killer app writing crowd is valued. Nobody really thinks about the work that goes into programming languages, libraries and tools. It's invisible work, taken for granted, often open source, not rarely unpaid.
> It wasn’t opening a textbook to find the famous algorithm that would solve my problem.
I had that exact experience. I'm working on my own programming language. After weeks of trying to figure something out by myself, someone told me to read Structure and Interpretation of Computer Programs. It literally had the exact algorithm I wanted.
https://eng.libretexts.org/Bookshelves/Computer_Science/Prog...
The explicit control evaluator. It's a register and stack machine which evaluates lisp expressions without transforming them into bytecode.
The paradox is that when the interpreter is fast enough then you delay JIT because it takes longer for the amortized cost to be justified. But that also means the reasons for that high amortization cost don’t get prioritized because they don’t really show up as a priority.
Eventually the evidence piles so high nobody can ignore it and the code gets rearranged to create a clearer task list. And when that list runs out they rearrange again because now that other part is 2x too slow.
Personally I’d love to see a JIT that was less just in time. Queuing functions for optimization that only get worked on when there are idle processors. So there’s a threshold where the JIT pre-empts, and one where it only offers best effort.
I want to implement a partial evaluator one day. That should go a long way to improving performance by precomputing and inlining things as much as possible.
Let's grab a simple use case: some basic CRUD HTTP API. Easy, you say, no need to know fancy stuff! Just test it and that's all.
You do your tests, all good, you can roll it out to production!
But sadly, in production, you have multiple users (what an idea...). Suddenly, your CRUD API has become a concurrent system. Suddenly, you have data corruption, because you never thought about any of that, and "your tests were green".
Algorithms are the backbone tools of programming. Knowing them help us, ignoring them burdens us.
While understanding algorithms and data structures is important, the only way you really know how well it works, and how well it's implemented is by thoroughly testing it. There are an infinite amount of clever algorithms out there with terrible implementations.
You need both.
For instance, take SQL queries: you ran them, and you have no issue. Is your code sane? Or is it because one query ran 10ms earlier and, thus, you avoided the issue?
I truly wonder if there are real-world tests around this; I bet there is only algorithm analysis and fuzzing.
Writing a non-trivial concurrent system based on your understanding of the 'algorithm' , without relying on testing is much harder.
> I truly wonder if there are real-world tests around this
Of course there are. There are many tools, methods, and test suites out there for concurrency testing, for almost any major language out there. Of course, understanding your algorithm, and the systems involved is required to write a proper test suite.
> For instance, take SQL queries: you ran them, and you have no issue. Is your code sane?
Take those queries and run them 1000x+ times concurrently in a loop. That will catch most common issues. If you want to go a step further you can build a setup to execute your queries in any desired order.
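A minimal sketch of the "run it 1000x+ concurrently, then assert an invariant" approach, in Python with a thread pool. The `transfer` function and its money-conservation invariant are illustrative only; flipping `use_lock` to False is the kind of race this style of test is meant to (eventually) catch, though such failures are not guaranteed to show up on any given run.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

balance = {"a": 1000, "b": 1000}
lock = threading.Lock()

def transfer(src, dst, amount, use_lock=True):
    # The invariant under test: the total amount of money never changes.
    if use_lock:
        with lock:
            balance[src] -= amount
            balance[dst] += amount
    else:
        # Racy version: read-modify-write without synchronization.
        s = balance[src]
        d = balance[dst]
        balance[src] = s - amount
        balance[dst] = d + amount

def test_concurrent_transfers():
    with ThreadPoolExecutor(max_workers=32) as pool:
        for _ in range(1000):
            pool.submit(transfer, "a", "b", 1)
            pool.submit(transfer, "b", "a", 1)
    assert balance["a"] + balance["b"] == 2000, balance

test_concurrent_transfers()
```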
And I’ve seen dozens of bugs caused by people assuming that transactions (with the default isolation level) protect against race conditions.
https://github.com/postgres/postgres/tree/master/src/test/is...
https://muratbuffalo.blogspot.com/2023/08/distributed-transa...
https://learn.microsoft.com/en-us/archive/msdn-magazine/2008...
https://go.dev/blog/synctest
https://learntla.com/core/concurrency.html
Pretty much anyone with high throughput is running a high throughput concurrent system, and very few companies have an extensive suite of concurrency tests unless you just mean load tests (that aren’t setup to catch race conditions).
The “reliable” part of that statement might be doing a lot of heavy lifting depending on what exactly you mean by that.
While not perfect, our e2e tests caught a couple bugs running them with concurrency.
Implementing a complex concurrent algorithm based on your understanding of it, without proper testing is called luck, and often called delusion.
Yes, if I write stuff with locks, I shall ensure that my code acquires and releases locks correctly
This is completely off-topic with respect to the original post.
Also, you cannot prove something by tests. Just because you found 100000 cases where your code works does not mean there is not a case where it does not (just as you cannot prove that unicorns do not exist) :)
That’s exactly it. For any non trivial program, there exists an infinite number of ways your program can be wrong and still pass all your tests.
Unless you can literally test every possible input and every bit of state this holds true.
Testing is not a perfect solution to catch all bugs. It's a relatively easy, efficient and reliable way to catch many common bugs though.
Perfect is the enemy of good, and absent academic fantasies of verified software testing is essential (even then, it's still essential, since you are unlikely to have verified every component of your system.)
Common bugs are not enough, uncommon bugs are just too expensive
It is very easy to write multithreaded code that is incorrect (buggy), but where the window of time for the incorrectness to manifest is only a few CPU instructions at a time, sprinkled occasionally throughout the flow of execution.
Such a bug is unlikely to be found by test cases in a short period of time, even if you have 1000 concurrent threads running. And yet it'll show up in production eventually if you keep running the code long enough. And of course, when it does show up, you won't be able to reproduce it.
That is, I think, what the parent commenter means by "luck".
This is similar to the problem you'll run into when testing code that explicitly uses randomness. If you have a program that calls rand(), and it works perfectly almost all the time but fails when rand() returns the specific number 12345678, and you don't know ahead of time to test that value, then your automated test suite is unlikely to ever catch the problem. And testing all possible return values of rand() is usually impractical.
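One common mitigation for the rand()-returns-12345678 problem described above is to make the randomness injectable, so a test can pin the exact value that exercises the suspect path. A minimal sketch, with `handle_value` and `run_once` as hypothetical functions under test.

```python
import random

def handle_value(value):
    # Hypothetical code under test with a narrow failure window.
    if value == 12345678:
        raise RuntimeError("rare edge case")
    return value * 2

def run_once(rng=random):
    # Accept the random source as a parameter instead of calling random directly,
    # so a test can substitute a deterministic one.
    return handle_value(rng.randint(0, 2**31 - 1))

class FixedRandom:
    def __init__(self, value):
        self.value = value
    def randint(self, lo, hi):
        return self.value

def test_rare_value_is_reachable():
    try:
        run_once(rng=FixedRandom(12345678))
    except RuntimeError:
        return                      # the edge case is exercised deterministically
    raise AssertionError("expected the rare-value path to fail")

test_rare_value_is_reachable()
```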
The "repeat the concurrent operations 1000 times" technique is adequate for a CRUD API, but it's woefully inadequate for a database engine or garbage collector.
Using your logic, why bother testing at all.
I have a hard time answering this for Postgres, which disappoints me because it sounds like it should be easy to answer: there could be an extension to EXPLAIN that would dry-run the query and list all the reachable error states.
Something akin to this that blew my mind recently was an IDE for a functional language that used typed holes, the programming equivalent of a semiconductor electron "vacancy", a quasi-particle with real properties that is actually just the lack of a particle. The concept was that if you delete (or haven't yet typed) some small segment out of an otherwise well-typed program, the compiler can figure out the type that the missing part must have. This can be used for rapid development because many such "holes" can only have one possible type.
This kind of mechanistic development with tool assistance is woefully under-developed.
https://en.wikipedia.org/wiki/Alloy_(specification_language)
[1] https://notes.eatonphil.com/2024-08-20-deterministic-simulat...
I like a more uniform distribution in my testing efforts. Start earlier, end later than most, and it’s experiences like this that inform that preference. And also production bugs in code with supposed 100% test coverage.
This is very very common among inexperienced devs and in immature organizations that think that more tests necessarily means better.
Which is something I've always agreed with, so, I never understand articles that seek to eschew an important part of releasing software because they believe their approach elsewhere is enough to overcome these intentionally suboptimal choices.
This is what we mean when we say that premature optimisation is the root of all evil.
“Intentionally suboptimal” is also a strange way of phrasing it, as it makes it sound a bit like you’re intentionally building something bad, as opposed to “only as good as it needs to be”.
> They said, “Look at the contrast—here’s Norvig’s Sudoku thing and then there’s this other guy, whose name I’ve forgotten, one of these test-driven design gurus. He starts off and he says, “Well, I’m going to do Sudoku and I’m going to have this class and first thing I’m going to do is write a bunch of tests.” But then he never got anywhere. He had five different blog posts and in each one he wrote a little bit more and wrote lots of tests but he never got anything working because he didn’t know how to solve the problem. I actually knew—from AI—that, well, there’s this field of constraint propagation—I know how that works. There’s this field of recursive search—I know how that works. And I could see, right from the start, you put these two together, and you could solve this Sudoku thing. He didn’t know that so he was sort of blundering in the dark even though all his code “worked” because he had all these test cases.
There's a blog post I read once and that I've since been unable to locate anywhere, even with AI deep research. It was a blow-by-blow record of an attempt to build a simple game --- checkers, maybe? I can't recall --- using pure and dogmatic test driven development. No changes at all without tests first. It was a disaster, and a hilarious one at that.
Ring a bell for anyone?
It is a story that reads like a fairy tale, but it is time to give the guy a break.
Though, in this particular case, he then went on to go back down the TDD Sudoku rabbit hole and, though he does seem to eventually write a program that works, the path to get there involved reading existing solutions and seems rather drain circly, which makes his post I linked seem a bit like making excuses. IDK. I don't really care beyond mild bemusement.
"I’ve found some Python code for Sudoku techniques. I do not like it. But it’ll be useful, I reckon, even though we aren’t likely to copy it."
A classic example of an invariant I can think of is the min-heap: node N's value is less than or equal to the values of its children - the heap property.
Five years from now, you might forget the operations and the nuanced design principles, but the invariants might stay well in your memory.
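The heap property mentioned above is easy to state as an executable invariant, which is exactly the kind of check randomized or property-based tests can hammer on. A minimal sketch using Python's array-backed heap layout and the standard heapq module.

```python
import heapq
import random

def is_min_heap(a):
    # Invariant: every node is <= both of its children (array layout:
    # children of index i live at 2*i + 1 and 2*i + 2).
    return all(
        a[i] <= a[child]
        for i in range(len(a))
        for child in (2 * i + 1, 2 * i + 2)
        if child < len(a)
    )

# Randomized check: push random values, pop a few, and the invariant must hold.
rng = random.Random(0)
heap = []
for _ in range(1000):
    heapq.heappush(heap, rng.randint(-100, 100))
    if heap and rng.random() < 0.3:
        heapq.heappop(heap)
    assert is_min_heap(heap)
print("heap invariant held for all operations")
```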
That's the same as the blog post: you need to know enough DSA to be able to understand how to look for the right solution if presented with a problem. But Batchelder's point is that, beyond that knowledge, learning testing as a skill will be more valuable to you than learning a whole bunch of individual DSA tricks.
The point of the article is that knowing how to test well is more useful than memorizing solutions to algo problems. You can always look those up.
When it ran for an hour, we celebrated. When it ran overnight, we celebrated. When it ran a week, we celebrated, and called that good enough.
Some good fraction were mismatches between the app (bot) state and the server state. A bot would be expecting a message and stall. The server thought it had said enough.
The app side used a lot of libraries, which it turns out are never as robust as advertised. They leak, race, are very particular about call order. Have no sense of humor if they're still connecting and a disconnect call is made, for instance.
The open source server components were fragile. In one instance, the database consistency library had an update where, for performance, a success message was returned before the operation upstream was complete. Which broke, utterly, the consistency promise that was the entire point of using that product.
A popular message library created a timer on each instantiation. Cancelled it, but in typical Java fashion didn't unlink it. So, leak. Tiny, but you do it enough times, even the biggest server instance runs out of memory.
We ran bots on Windows, Linux, even a Mac. Their network libraries had wildly different socket support. We'd run out of sockets! They got garbage collected after a time, but the timer could be enormous (minutes).
Our server used a message-distribution component to 'shard' messages. It had a hard limit on message dispatching per second. I had to aggregate the client app messages (we used UDP and a proprietary signaling protocol) to drop the message rate (ethernet packet rate) by an order of magnitude. Added a millisecond of latency, which was actually important and another problem.
Add the usual Java null pointers, order-dependent service termination rules (never documented), object lifetime surprises. It went on and on.
Each doubling of survival-time the issues got more arcane and more interesting. Sometimes took a new tool or technique to ferret out the problem.
To be honest, I was in hog heaven. Kept my brain plastic for a long time.
I've noticed this "DSA" acronym appearing overnight. I can't recall people using it this much (at all actually) even six months ago. Where did it come from? Why do we suddenly need a term to talk about the concept?
The main benefit of being familiar with how data structures and algorithms work is that you become familiar with their runtime characteristics and thus can know when to reach for them in a real problem.
The author is correct here. You'll almost never need to implement a B-Tree. What's important is knowing that B-Trees have log n insertion times with good memory locality making them faster than simple binary trees. Knowing how the B-Tree works could help you in tuning it correctly, but otherwise just knowing the insertion/lookup efficiencies is enough.
> Here is what I think in-the-trenches software engineers should know about data structures and algorithms: [...]
> If you want to prepare yourself for a career, and also stand out in job interviews, learn how to write tests: [...]
I feel like I keep writing these little context comments to fix the problem of clickbait titles or those lacking context. It helps to frame the rest of the comments which might be coming at it from different angles.
There is no dichotomy here: you need to know testing as well as data structures and algorithms.
However, the thrust of the article itself I largely agree with -- that it's less important to have such in-depth knowledge about data structures and algorithms that you can implement them from scratch and from memory. Nearly any modern language you'll program in includes a standard library robust enough that you'll almost never have to implement many of the most well-known data structures and algorithms yourself. The caveat: you still need to know enough about how they work to be capable of selecting which to use.
In the off-chance you do have to implement something yourself, there's no shortage of reference material available.
The generated code is in general 90% there.
This allows me to write many more tests than before to try to catch all scenarios.
16 more comments available on Hacker News