Back to Home11/14/2025, 7:10:10 AM

RegreSQL: Regression Testing for PostgreSQL Queries

150 points
37 comments

Mood

supportive

Sentiment

positive

Category

tech

Key topics

PostgreSQL

regression testing

database management

software testing

Debate intensity20/100

RegreSQL is a tool for regression testing PostgreSQL queries, helping ensure database queries work as expected.

Snapshot generated from the HN discussion

Discussion Activity

Very active discussion

First comment

1h

Peak period

33

Day 1

Avg / period

18.5

Comment distribution37 data points

Based on 37 loaded comments

Key moments

  1. 01Story posted

    11/14/2025, 7:10:10 AM

    5d ago

    Step 01
  2. 02First comment

    11/14/2025, 8:15:52 AM

    1h after posting

    Step 02
  3. 03Peak activity

    33 comments in Day 1

    Hottest window of the conversation

    Step 03
  4. 04Latest activity

    11/15/2025, 2:57:39 PM

    3d ago

    Step 04

Generating AI Summary...

Analyzing up to 500 comments to identify key contributors and discussion patterns

Discussion (37 comments)
Showing 37 comments
WilcoKruijer
5d ago
5 replies
It's pretty terrible how poorly developers test their database queries. This looks like a great step in the right direction. I think how the ORM story in RegreSQL develops is crucial. The SQLAlchemy integration looks interesting, but at the same time super specific. There are a million ways to generate SQL statements and ORMs are just one of them. A question that comes to mind is how will you handle interactive transactions? I'd say most complexity in queries comes from the back-and-forth between database and server. Is that out of scope?

Would also be fun if you could support PGLite [0], that's what I've been using to write "unit" tests connected to a "real" database.

[0] https://pglite.dev/

jillesvangurp
4d ago
1 reply
It's an area where people get conflicted between unit testing in isolation and integration testing. The compromise between those two approaches tend to be slow, flaky, and merely provide the illusion of code coverage which you shoot for with unit testing while not being quite realistic enough to push it towards the side of being a proper integration test. I've never liked tests like that and mostly consider them a typically gigantic waste of time. Code coverage is meaningless for integration tests.

I deal with a lot of complex querying logic with mostly Elasticsearch. My appproach is to either unit test or integration test and just skip everything in between. With queries what I care about is that they work under realistic scenarios against the same version of Elasticsearch that we use in production. Creating test fixtures is expensive. So tests for read only querying shares the same test fixtures. That speeds things up. I don't care about separately testing simple crud operations because most of my scenarios trigger plenty of those. A unit test for that has no value to me. Unit testing whether my API can send queries to some fake Elasticsearch, empty Elasticsearch, etc. has limited value to me. There's some limited value in knowing the queries I'm sending are syntactically correct. But you get that for free with a proper integration test not failing. The unit test is redundant if you have a proper integration test. And a unit test without a proper integration test provided little practical value.

What I actually do care about is all the complicated dashboard and other queries that make lot of assumptions about how data is structured, what fields are there, how they are indexed, whether they can be null, blank, or have invalid values, etc. work as intended. That kind of calls for an integration test. Anything trivial enough that a unit test would be good enough doesn't tend to need a lot of testing. Any scenario that touches enough of that stuff, kind of covers most of that.

I put a lot of effort in ensuring that my integration tests can run quickly, concurrently, and don't interact with each other (data randomization). That allows me to get away with not deleting a lot of data between tests and gets me a lot of realism for free because real users don't have an empty system completely to themselves. So having a lot of tests running against a busy system is integration testing gold. I have close to 300 full API integration tests running in around 30 seconds on my laptop. Close enough to unit testing performance that I run them many times per day.

The same approach applies to database testing. Probably more so because all the interesting bugs usually relate to constraints, transactionality, database locks, etc. If you have flaky tests because of that, it might actually be because your database layer has some issues with the notion of users not being polite enough to queue up one by one.

This is not for everyone, I realize. Do what works for you. I've butted heads with people over this more than a few times. But in my company (of which I'm the CTO), we unit test functions, small classes, regular expressions, parsing logic, etc. We integration test systems and APIs. Testing individual queries without the rest of the system is hard and pointless. Test the API that triggers the query. That 30 second performance for test runs is something I spent a lot of time on getting. It means we can do major changes without fear. If tests pass, our users should be fine.

rhodeon
3d ago
I agree with all of this and it's what I tend to follow too.

One question, since you seem to have more experience with this approach.

> Test the API that triggers the query.

When you test the API, do you also check the state of the database?

Let's say there's an endpoint which creates an entity, and that operation involves setting a bunch of columns that aren't returned in the API.

When testing, do you check only the expected API response, or also the database state to be sure the operation was fully correct?

I currently do the latter, but this feels a bit iffy as the database internals are being leaked in an otherwise "blackbox" test.

If you learn towards the first option, how then do you verify all the unreturned database fields are correctly set?

sgarland
4d ago
> It's pretty terrible how poorly developers test their database queries.

Yes. This becomes especially obvious when you rewrite ORM garbage for something complicated, and are told that they can’t accept it, because they’re not sure how to test it.

mrasong
5d ago
Just found out about pglite, this library is insanely cool. You can even run Postgres right in the browser.
jci
5d ago
My goto for this lately has been ephemeralpg [0] and pgTAP [1]. It’s been pretty great

[0] https://github.com/eradman/ephemeralpg [1] https://github.com/theory/pgtap

radimm
5d ago
OP here - I do agree some of the problems that come with SQL/ORM queries are pretty horrendous and that's exactly where I would like RegreSQL going. For now I can't promise the particular direction, but comments like this are the reason why I pushed myself to release it and keep it beyond my own playground. Thank you!
h1fra
5d ago
3 replies
Interesting. Perf regression can happen locally but they mostly happen in prod when data change in volume or in shape, can this run safely on a prod db?
null_deref
5d ago
Yeah my question exactly, another one from me would be will the best practice be to run it periodically?
radimm
5d ago
The primary direction is to make RegreSQL part of CI/CD pipelines. In theory in can be run against production DB, but I believe it needs much more work to provide real value there. Thank you for the comment!
jelder
4d ago
Underrated point.

Looking for performance issues on a machine with different baseline IO and CPU load, buffer state, query plans, cardinality, etc. is just theater and will lead to a false sense of security. RegreSQL is approaching a stateful problem as if it were stateless and deterministic. A linter like https://squawkhq.com is a good partial solution but only addresses DDL problems.

RegreSQL would be better served by focusing only on the aspects of correctness that tools like SQLx and sqlc fundamentally cannot address. This is a real need that too few tools try to address.

jumski
5d ago
2 replies
Looks really well thought out and I will be testing it for sure!

I'm wondering how I would be able to regression-test functions in my project (pgflow [0]) - it tracks a graph of state machines modeled in few tables. State is mutated only by calling few exposed SQL functions (task queue worker does it).

Given I can't enforce everything I need with check constraints and I try to avoid triggers if possible, I opted for only using exposed SQL API [1] for setting up state in my pgTAP tests.

It is imperative and harder to maintain, like scripts you described in the article, but really my only option, as I want to have maximum confidence level.

Does RegreSQL support some kind of init scripts or I would need to wire it myself and just run RegreSQL after the proper state is set? Would lose the "run once and get report on everything" benefit then :-(

[0] https://pgflow.dev/ [1] https://github.com/pgflow-dev/pgflow/blob/main/pkgs/core/sup...

mickeyp
4d ago
1 reply
IMO, you should not avoid triggers if it helps prevent invariants in your database. That is what they are especially good at preventing.

You can instruct postgres to raise exceptions using the same error code that constraints use: that way your clients do not need to know the difference.

jumski
4d ago
Good point! For the SQL functions I mentioned, I'm comfortable without triggers - all mutations go through functions (no direct table access), and only start_flow is user-fac ing.

That said, there ARE other places that would benefit from triggers (aggregate counts). I've avoided them because they're hot paths and I was worried about perf impact - relyi ng on pgTAP coverage instead.

Your defense-in-depth argument is solid though. I should revisit this and benchmark whether the safety is worth the perf cost. Something like RegreSQL would come in handy

radimm
4d ago
1 reply
At this point it supports initialization through the fixtures systems (like inline SQL or SQL files). At the moment they have fixed order, which might lead to some limitations, but I'm already thinking about some pre/post test setup hooks and full schema handling as well (for full schema reloads).

Plus I have whole set of other requirements where RegreSQL suddenly seems to be a good solution.

And without sounding cliche - Thank you for the comment! This is exactly why I forced myself to go public and get this level of feedback.

jumski
4d ago
1 reply
No cliche at all - I'm in the same boat, showing my stuff online was way out of my comfort zone!

I was postponing proper, dedicated performance testing for some time and would really love to up my game in that regard.

I'm very happy with pgTAP approach of running stuff in transaction and rolling them back after the test - how this works in RegreSQL?

Would love to provide feedback and test the hooks when you will be working on them. I'm mostly interested in performance testing and my use case would be to run them on CI and compare to previous metrics stored somewhere in order to fail CI when performance regressions are introduced.

Happy to connect, got contact info in my profile.

radimm
4d ago
1 reply
For now only fixtures support transaction as cleanup options, but that's a good point that tested queries might also modify the queries.

I will definitely reach out, just give me bit of time to mentally recover from the exposure and got some meet ups where I promised to deliver some presentations and they will consume a lot of my spare free time.

jumski
4d ago
no rush, take your time and enjoy the fame :-)
tinodb
4d ago
2 replies
Nice! However I would actually advocate for fixtures in application code. I’ve seen too much drift otherwise. And creating “scale” is also easy, just add a for loop :). No programming in yaml needed. As an added benefit you can use the same fixtures for your end to end tests!

So it would be nice if RegreSQL would support fixture hooks for those who like this route.

radimm
4d ago
It's not unreasonable view - noted, will add to my list. Thank you!
jitl
4d ago
To get prod scale out of a for loop I’m gonna need a few hours of iterations :-(
andy_ppp
4d ago
2 replies
This looks great, any plans for MySQL support (or a similar project), the legacy system I'm working on could really do with this!
radimm
4d ago
1 reply
I'm obviously biased. Adding MySQL support is not that difficult but maintenance is (and ultimately PostgreSQL is better way forward (half joking :))

With current feature set it's something I have already considered but still undecided.

esafak
4d ago
Are there interfaces that we can use to implement support for other databases?
evanelias
4d ago
Not quite the same workflow, but in Percona Toolkit there's a tool called pt-upgrade which can provide similar information. See the use-case doc section on "reference results to host" comparisons: https://docs.percona.com/percona-toolkit/pt-upgrade.html
ForHackernews
4d ago
1 reply
PgTAP is only mentioned offhandedly at the end of this article, but it's an excellent mature tool for unit-testing your database: https://pgtap.org/
radimm
4d ago
OP here - I'm going to follow up with the separate article on pgTap. But the goals of both tools is slightly different in my mind.
aranw
4d ago
1 reply
How well will this work with something like sqlc [0]? sqlc has some custom syntax around the sql files specific to the library

[0] https://sqlc.dev/

radimm
4d ago
For now the syntax is not fully compatible - but my goal is to add https://github.com/boringSQL/queries (library behind SQL files parsing) to better align on it. It's definitely on my radar
jitl
4d ago
2 replies
I don’t understand how you can test Postgres performance in CI or on a developer laptop. Until your tables are large and varied enough, Postgres can ignore indexes and prefer full table scans because it’s faster. Plans depend on statistics so your test data generator better output rows with the same distributions as you get in prod. Unless you have a large and representative load of concurrent queries, Postgres and filesystem caching can optimize around a single query shape, masking issues that will shit the bed in prod.

Example: I wrote a recursive query that provided a huge reduction in queries issued from our app for a traversal. Worked beautifully in local, and halved p95 in our dogfood environment. Yippee! In prod? The query always timed out after 60 seconds, even though it had the same query plan as dogfood env and staging. Sad trombone noises.

For query semantics regression testing, we just write tests in our normal test framework and run them in CI like any other test. Test data setup works like any other test, and we can call some of our millions of lines of app code to help out with repetitive tasks. To prevent cross-test leakage we wrap each test in a BEGIN..ROLLBACK, and transform inner use of transactions to save points in our db client layer. I’d like to add libeatmydata to speed things up further but haven’t done so yet.

radimm
4d ago
1 reply
Let me be clear - this is just baseline testing that's only valid for similar data patterns, same cardinality, and so on. Plus it's currently using EXPLAIN, so we're working with estimates, not actual execution data.

But from my experience, this is low-hanging fruit for catching regressions that could break your application without you noticing. The performance work you're describing is the real nitty-gritty database management - the deep optimization work that truly makes or breaks an app.

For that there's no silver bullet.

jitl
3d ago
1 reply
For regression testing I struggle to see how a new SQL only test runner improves over a project’s existing testing systems. I think any project using a database should have integration tests that combine testing application logic with their database access queries. Your system does some explain checking but to me it feels tenuous if it’s worth the complexity required to dump all the queries my app may perform to text / YAML and adding yet another test runner in CI.

I think one interesting angle would be to integrate your assertions directly into Postgres as a plugin. That way, you can spy on every query issued in any test framework, dump the text automatically, and prove additional assertions about the query plan index use perf etc, regardless of the user’s primary language. That would also get you out of needing to build 1000 different integrations for all the programming language / ORM combos in the universe. If you don’t want to do a Postgres plugin you could also do this as a Postgres wire protocol proxy.

radimm
3d ago
Hey, thank you for follow-up. Really appreciated. This is why I decided to go out and show RegreSQL.

You're absolutely right that in an ideal world, every project would have solid integration tests covering their database access queries. That's the goal everyone should aim for. Unfortunately there again and again cases where developers are afraid of SQL and databases and it's not uncommon to see very simple things sending applications to crawl.

As for the second point, that's a direction I would love to get. Whatever it can be directly, as part of open source or service is something to be decided - but tracking the trend data there would provide a lot of value indeed.

isaacvando
4d ago
1 reply
At work we have anonymized dumps of our production data that we use for local development. This is extremely helpful as it allows testing things locally that would be completely unrealistic to setup otherwise. Your whole dataset might not be able to fit on a dev machine, but I'd guess in most cases you could create a subset of it that would be large enough to be useful for local performance work.
jitl
4d ago
Do you use an off the shelf tool for anonymization or did you write something yourself?
mbvisti
5d ago
I have nothing to add but this looks cool! Will definitely check it out :)
StarlaAtNight
4d ago
Wonder if the YAML fixtures drew inspiration from dbt’s unit tests: https://docs.getdbt.com/docs/build/unit-tests#unit-testing-a...
ID: 45924619Type: storyLast synced: 11/16/2025, 9:42:57 PM

Want the full context?

Jump to the original sources

Read the primary article or dive into the live Hacker News thread when you're ready.