Broker-Side SQL Filtering with RabbitMQ Streams

Posted 3 months ago | Active 3 months ago

ansd

15 points

3 comments

rabbitmq.com | Tech | story


Key topics

RabbitMQ

SQL Filtering

Message Queuing

The RabbitMQ team has introduced broker-side SQL filtering with RabbitMQ Streams, improving developer experience and reducing network capacity costs, but also raising concerns about potential performance impacts.

Snapshot generated from the HN discussion

Discussion Activity

Light discussion

First comment

Peak period: 96-102h

Avg / period: 1.5

Key moments

01 Story posted: Sep 24, 2025 at 5:25 PM EDT (3 months ago)
02 First comment: Sep 28, 2025 at 2:00 PM EDT (4d after posting)
03 Peak activity: 2 comments in 96-102h (hottest window of the conversation)
04 Latest activity: Sep 28, 2025 at 7:46 PM EDT (3 months ago)


Discussion (3 comments)


4ndrewl

3 months ago

I guess there's some effect on the broker side wrt resources or efficiency, but I couldn't immediately see anything about this.

zbentley

3 months ago

One possible drawback of this kind of system is performance (or broker CPU) getting dragged down by crazy/bad filtering queries.

Normally, those issues are solved the usual way (monitor, identify, fix). It’s rarer to see systems that proactively detect/reject costly arbitrary queries when they’re issued, though.

Proactively detecting potentially bad SQL queries in RDBMSes relies on table statistics (can’t be known for streams) or query text/plan analysis heuristics (hairy, subjective/error prone).

But it just occurred to me: could RabbitMQ’s choice of Erlang enable the easy rejection of query plans above a certain cost?

Could the BEAM be easily made to reject a query plan (assuming the plan—or a worst-case version of it at least—can be compiled into a loopless/unrolled chunk of BEAM bytecode ahead of time) with a reduction count more than a user specified threshold?

That might be interesting, if possible. Most runtimes don’t have user-surfaced equivalents of reduction counts, so there might be some mechanical sympathy in RabbitMQ’s case.
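To make the "reject plans above a certain cost" idea concrete, here is a minimal Python sketch. It is not RabbitMQ's implementation and it does not use BEAM reduction counts: it swaps in a static worst-case step estimate over a small, hypothetical filter AST (`Compare`, `Like`, `And`, `Or`), and the per-node costs and the `max_field_len` bound are made-up assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical filter AST; these names are illustrative, not RabbitMQ's.
@dataclass(frozen=True)
class Compare:
    field: str
    op: str          # e.g. "=", "<", ">"
    value: object


@dataclass(frozen=True)
class Like:
    field: str
    pattern: str


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


Expr = Union[Compare, Like, And, Or]


class FilterTooExpensive(Exception):
    """Raised when a filter's worst-case cost exceeds the configured budget."""


def worst_case_cost(expr: Expr, max_field_len: int = 256) -> int:
    """Upper bound on evaluation steps per message, assuming no short-circuiting.

    The cost model is an assumption: 1 step per comparison or boolean node,
    and pattern_length * max_field_len steps for a naive LIKE match.
    """
    if isinstance(expr, Compare):
        return 1
    if isinstance(expr, Like):
        return len(expr.pattern) * max_field_len
    if isinstance(expr, (And, Or)):
        return (1
                + worst_case_cost(expr.left, max_field_len)
                + worst_case_cost(expr.right, max_field_len))
    raise TypeError(f"unknown expression node: {expr!r}")


def admit(expr: Expr, budget: int) -> int:
    """Return the estimated cost, or reject the filter before it ever runs."""
    cost = worst_case_cost(expr)
    if cost > budget:
        raise FilterTooExpensive(f"estimated cost {cost} exceeds budget {budget}")
    return cost
```

Because the expression tree is loop-free, the bound can be computed once when the consumer subscribes rather than discovered at runtime. A reduction-count approach on the BEAM would measure actual work instead of estimating it, which is the part this sketch does not attempt.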

zbentley

3 months ago

What an incredibly useful feature. Besides the obvious developer experience benefits, it’s huge for network-bound use cases: really heavily optimized uses of RabbitMQ (or less-optimized uses with really big message payloads) end up bottlenecked on, or paying lots of money for, broker network capacity, since a message’s bytes must cross the wire 2 or more times (publish, consume, maybe replication) for it to be processed. Moving filtering logic to the broker side helps a lot with that, but workloads should still use separate queues/topics/streams instead whenever they can, of course (I’m sure there will be some one-topic-for-everything abuses enabled by the combination of poor architectural foresight + SQL filtering, but such is life).
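The network argument can be put into rough numbers. The Python sketch below is a back-of-the-envelope model, not a measurement: the function name `wire_bytes`, the single-consumer assumption, and the `replication_copies` parameter are all hypothetical, and protocol framing overhead is ignored.

```python
def wire_bytes(messages: int,
               payload_bytes: int,
               match_fraction: float,
               broker_side_filter: bool,
               replication_copies: int = 0) -> int:
    """Estimate total payload bytes crossing the network for one consumer.

    Assumptions (illustrative only): every message is published once,
    replicated `replication_copies` times, and then delivered either in
    full (client-side filtering) or only when it matches (broker-side).
    """
    publish = messages * payload_bytes
    replication = publish * replication_copies
    # With broker-side filtering, only matching messages leave the broker.
    delivered = round(messages * match_fraction) if broker_side_filter else messages
    consume = delivered * payload_bytes
    return publish + replication + consume


# Compare the two modes for a consumer that wants a quarter of the stream.
without_filter = wire_bytes(1000, 100, 0.25, broker_side_filter=False)
with_filter = wire_bytes(1000, 100, 0.25, broker_side_filter=True)
saved = without_filter - with_filter
```

Only the consume leg shrinks in this model; publish and replication traffic are unchanged, which is why splitting workloads across separate streams can still be the better choice.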

I am confused, though: why does the bloom filter … er, filter still need to be manually specified by the consumer (filterValues in the example Java)?

As far as the broker filtering query evaluation logic is concerned, bloom-filter enabled fields are just indexes; why can’t the SQL-filter query planner automatically make use of them?

I’m probably missing something, but it seems like a very light query plan optimization pass would not be hard to implement here; there’s only one kind of index, and it can only be used with equality comparisons, so it doesn’t seem like the implementation complexity would be too bad versus needing a fully general SQL optimizing planner.
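A sketch of what such a light optimization pass might look like, in Python. Everything here is hypothetical: the AST node names (`Eq`, `Other`, `And`, `Or`), the idea that the bloom-filtered value maps to a named field, and the function `index_values` are assumptions for illustration, not RabbitMQ's planner or its filter semantics.

```python
from dataclasses import dataclass
from typing import Optional, Set, Union


# Hypothetical, minimal filter AST.
@dataclass(frozen=True)
class Eq:
    field: str
    value: str


@dataclass(frozen=True)
class Other:
    text: str  # any non-equality predicate, treated as opaque


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Or:
    left: "Expr"
    right: "Expr"


Expr = Union[Eq, Other, And, Or]


def index_values(expr: Expr, indexed_field: str) -> Optional[Set[str]]:
    """Derive the values the indexed field must take for `expr` to hold.

    Returns None when the expression does not constrain the indexed field,
    meaning the index cannot be used to skip anything.
    """
    if isinstance(expr, Eq):
        return {expr.value} if expr.field == indexed_field else None
    if isinstance(expr, And):
        left = index_values(expr.left, indexed_field)
        right = index_values(expr.right, indexed_field)
        if left is None:
            return right          # one constrained side is enough for AND
        if right is None:
            return left
        return left & right       # both sides must hold
    if isinstance(expr, Or):
        left = index_values(expr.left, indexed_field)
        right = index_values(expr.right, indexed_field)
        if left is None or right is None:
            return None           # an unconstrained branch defeats the index
        return left | right
    return None                   # opaque predicates never constrain the index
```

The derived set plays the role of the manually supplied filter values: under these assumptions, `region = 'emea' AND priority > 4` would yield `{"emea"}` without the consumer having to spell it out separately, while any `OR` with an unconstrained branch falls back to a full scan.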


ID: 45366149 | Type: story | Last synced: 11/20/2025, 6:27:41 PM
