Scaling Request Logging with Clickhouse, Kafka, and Vector
Posted 3 months ago · Active 2 months ago
geocod.io · Tech · story
supportive · positive
Debate: 40/100
Key topics
Clickhouse
Data Engineering
Scalability
Logging
The article discusses how Geocodio scaled their request logging using ClickHouse, Kafka, and Vector, and the discussion revolves around the technical details and alternatives to their implementation.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: 5d after posting
Peak period: 22 comments (Days 5-6)
Avg / period: 9.3
Comment distribution: 28 data points (based on 28 loaded comments)
Key moments
1. Story posted: Oct 8, 2025 at 5:56 AM EDT (3 months ago)
2. First comment: Oct 13, 2025 at 2:45 PM EDT (5d after posting)
3. Peak activity: 22 comments in Days 5-6 (the hottest window of the conversation)
4. Latest activity: Oct 27, 2025 at 9:21 AM EDT (2 months ago)
ID: 45514213 · Type: story · Last synced: 11/20/2025, 5:42:25 PM
I had a similar project back in August when I realised my DB's performance (Postgres) was blocking me from implementing features users commonly ask for (querying out to 30 days of historical uptime data).
I was already blown away by the performance (200ms to query what Postgres was doing in 500-600ms), but then I realized I hadn't put an index on the ClickHouse table. Now the query returns in 50-70ms, and that includes network time.
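For readers unfamiliar with ClickHouse indexing, here is a minimal sketch of adding a secondary index to an existing table; the table and column names (uptime_checks, checked_at) are hypothetical, not the commenter's actual schema, and the "main" index in ClickHouse is really the MergeTree ORDER BY key.

```sql
-- Hypothetical table/column names. In ClickHouse the primary index comes from
-- the MergeTree ORDER BY key; this adds a data-skipping index on top of it.
ALTER TABLE uptime_checks
    ADD INDEX idx_checked_at checked_at TYPE minmax GRANULARITY 4;

-- Build the new index for parts that already exist on disk.
ALTER TABLE uptime_checks MATERIALIZE INDEX idx_checked_at;
```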
Alternatively, you could've used async insert functionality built into ClickHouse: https://clickhouse.com/docs/optimize/asynchronous-inserts . All of these solutions are operationally simpler than Kafka + Vector, although obviously it's all tradeoffs.
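As a rough illustration of that approach (table and column names are made up), small client-side inserts can carry the async-insert settings and the server batches them itself:

```sql
-- A minimal sketch: the server buffers many small inserts and flushes them in batches.
-- request_logs and its columns are hypothetical.
-- wait_for_async_insert = 0 means fire-and-forget; set it to 1 to ack only after the flush.
INSERT INTO request_logs (ts, endpoint, status, duration_ms)
SETTINGS async_insert = 1, wait_for_async_insert = 0
VALUES (now(), '/v1/geocode', 200, 42);
```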
But I imagine the writeup glosses over myriad future concerns and does not entirely convey the pressure and stress of trying to solve such a high-scale problem.
Ultimately, going with a somewhat more complex solution that involves additional architecture but has been tried and tested by a 3rd party that you trust can sometimes be the more fitting end result. Assurance often weighs more than simplicity, I think.
Kafka and Redis are a "pick your poison" choice IMO; scaling and operating each of them comes with its own headaches.
https://www.onehouse.ai/blog/apache-spark-vs-clickhouse-vs-p...
Druid is real-time analytics, similar to ClickHouse. StarRocks is best at joins; ClickHouse is not good at joins.
This is less and less true as time goes on tbh. 25.9 introduced Join Reordering as well - https://clickhouse.com/blog/clickhouse-release-25-09
1a) If you're still ending up with too many files/parts, then fix your PARTITION BY and MergeTree primary key (see the sketch below).
2) Why are you writing to Kafka when Vector (vector.dev) already does buffering/batching?
3) If you insist on Kafka, the Kafka table engine (https://clickhouse.com/docs/engines/table-engines/integratio...) consumes directly from Kafka (or, since you're on ClickHouse Cloud, use ClickPipes), so what's the point of Vector here?
Your current solution is unnecessarily complex. I'm guessing the core problem is that your MergeTree primary key is wrong.
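To make the "partition by / primary key" point concrete, here is a hedged sketch of what a request-log MergeTree table could look like; the column names and the key choice are assumptions, not the schema from the article:

```sql
-- Assumed columns and key choice; not the article's actual schema.
CREATE TABLE request_logs
(
    ts          DateTime,
    api_key_id  UInt64,
    endpoint    LowCardinality(String),
    status      UInt16,
    duration_ms UInt32
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(ts)      -- coarse monthly partitions keep the part count manageable
ORDER BY (api_key_id, ts);     -- primary key aligned with "per customer, over time" queries
```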
From experience, the Kafka tables in ClickHouse are not stable at high volumes, and they are harder to debug when things go sideways. It is also easier to mutate your data before ingestion using Vector's VRL scripting language than with ClickHouse materialized views (SQL) when dealing with complex data that needs to be denormalized into a flat table.
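For context on the trade-off being debated here, a minimal sketch of the Kafka-engine-plus-materialized-view pattern; the broker, topic, and column names are hypothetical, and the target table is the request_logs sketch above:

```sql
-- The Kafka engine table is only a consumer; it stores nothing itself.
CREATE TABLE request_logs_queue
(
    ts          DateTime,
    api_key_id  UInt64,
    endpoint    String,
    status      UInt16,
    duration_ms UInt32
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list  = 'request-logs',
         kafka_group_name  = 'clickhouse-request-logs',
         kafka_format      = 'JSONEachRow';

-- A materialized view moves rows from the queue into the MergeTree table,
-- doing any flattening in SQL rather than in Vector's VRL.
CREATE MATERIALIZED VIEW request_logs_mv TO request_logs AS
SELECT ts, api_key_id, endpoint, status, duration_ms
FROM request_logs_queue;
```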
The one they're going to shut down as soon as this works? Yeah, great reason to make a permanent tech choice for a temporary need. Versus just keeping the MariaDB stuff exactly the same on the PHP side and writing to 2 destinations until cutover is achieved. Kafka is wholly unnecessary here. Vector is great tech but likely not needed. Kafka + Vector is absolutely the incorrect solution.
Their core problem is the destination table schema (which they did not provide) and a very poorly chosen primary key + partition.
I set up Vector to buffer Elasticsearch writes years ago, also for logs. It ran so well, without any problems, that I almost forgot about it.
For anyone who's curious.
Tbh this terrifies me! We don't just have to log the requests but also store the full emails for a few days, and they can be up to 50 MiB in total size.
But it will be exciting when we get there!
We recently added a MySQL/MariaDB CDC connector in ClickPipes on ClickHouse Cloud. This would have simplified your migration from MariaDB.
https://clickhouse.com/docs/integrations/clickpipes/mysql https://clickhouse.com/docs/integrations/clickpipes/mysql/so...
Happy to exchange notes about our journey too.
Cheers