ClickHouse Acquires LibreChat, Open-Source AI Chat Platform
Posted about 2 months ago · Active about 2 months ago
clickhouse.com · Tech · story
Sentiment: skeptical/mixed · Debate · 70/100
Key topics
ClickHouse
LibreChat
Open-Source
Acquisition
Artificial Intelligence
ClickHouse acquired LibreChat, an open-source AI chat platform, raising concerns among users about the project's future and potential changes to its open-source nature.
Snapshot generated from the HN discussion
Discussion Activity
Active discussion · First comment: 20m after posting
Peak period: 20 comments in 0-3h
Avg per period: 8
Comment distribution: 40 data points (based on 40 loaded comments)
Key moments
- Story posted: Nov 10, 2025 at 11:44 AM EST (about 2 months ago)
- First comment: Nov 10, 2025 at 12:04 PM EST (20m after posting)
- Peak activity: 20 comments in 0-3h (hottest window of the conversation)
- Latest activity: Nov 12, 2025 at 2:31 AM EST (about 2 months ago)
ID: 45877770 · Type: story · Last synced: 11/20/2025, 3:50:08 PM
- LibreChat remains 100% open-source under its existing MIT license
- Community-first development continues with the same transparency and openness
Of course, there is no binding commitment to any of that in the long term.
Within the next few years, they will likely introduce an additional enterprise edition, a SaaS offering, or a change in licensing terms.
But, we are also a database provider.
Ryadh mentions some examples below where we have joined forces, incorporated code into ClickHouse Cloud (our commercial offering), and OSS has grown.
Time will tell (I can't predict the future)... but I'm excited about the future of OSS LibreChat.
(disclaimer: I work at ClickHouse)
See Hashicorp and Elasticsearch for the same old story.
Luckily, these kinds of products are a dime a dozen, i.e., zero technical complexity, and there are so many similar projects already out there. Hell, you can even vibe-code this kind of project.
So, why this move?
Basically, we noticed that the existing "agentic" open-source ecosystem is primarily focused on developer tools and SDKs, as developers are the early adopters who build the foundation for emerging technologies. Current projects provide frameworks, orchestration, and integrations. The idea behind the Agentic Data Stack is a higher-level integration: a composable software stack for agentic analytics that users can set up quickly, with room for customization.
My favourite use-case: our sales and support folks systematically ask DWAINE (our dwh agent) to produce a report before important meetings with customers, something along the lines of: "I'm meeting with <customer_name> for a QBR, what do I need to know?". This will pull usage data, support interactions, billing, and many other dimensions, and you can guess that the quality of the conversation is greatly improved.
My colleague Dmitry wrote about it when we first deployed it: https://www.linkedin.com/pulse/bi-dead-change-my-mind-dmitry...
We have a similar experience where it's shocking how much users prefer the chat interface.
> The idea behind the Agentic Data Stack is a higher-level integration: a composable software stack for agentic analytics that users can set up quickly, with room for customization.
I agree with this. For those who have been programming with LLMs, the difference between something working and not working can be a single "sentence" conveying the required context. I strongly believe data enrichment will be one of the main ways we can make agents more effective and efficient. Data enrichment is the foundation for my personal assistant feature https://github.com/gitsense/chat/blob/main/packages/chat/wid...
Basically, instead of having agents blindly grep for things, you would provide them with analyzers that they can use to search. By making it dead simple for domain experts to extract 'business logic' from their codebase/data, we can solve a lot of problems much more efficiently. Since data is the key, I can see why ClickHouse would make this move: they probably want to become the storage for all business logic.
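The "analyzers instead of blind grep" idea above can be sketched as a small tool registry: a domain expert encodes the business logic once, and the agent picks a named analyzer (from its description) rather than searching raw files. This is a hypothetical illustration; the `Analyzer` class, registry, and data are invented and not from the linked project.

```python
# Hypothetical sketch: a registry of domain "analyzers" an agent can call
# instead of blindly grepping. All names and data here are illustrative.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Analyzer:
    name: str
    description: str           # surfaced to the LLM as tool documentation
    run: Callable[[str], list]

def find_invoices(status_query: str) -> list:
    # A domain expert encodes the business logic once; the agent reuses it.
    data = [
        {"id": "INV-1", "status": "overdue"},
        {"id": "INV-2", "status": "paid"},
    ]
    return [row for row in data if status_query in row["status"]]

REGISTRY = {
    a.name: a
    for a in [Analyzer("invoices", "Search invoices by status", find_invoices)]
}

# The agent selects an analyzer by name and passes a structured query,
# instead of grepping the codebase or raw data for matching strings.
result = REGISTRY["invoices"].run("overdue")
print(result)  # [{'id': 'INV-1', 'status': 'overdue'}]
```

The point of the sketch is the `description` field: it is the "sentence conveying the required context" that the comment above argues makes or breaks agent behavior.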
Note: I will be dropping a massive update to how my tool generates and analyzes metadata this week, so don't read too much into the demo or if you decide to play with it. I haven't really been promoting it because the flow hasn't been right, but it should be this week.
Open source means you have a license to freely use and commercialize it, not that it has no owners.
An OSS project has contributing members and they exert control over the project by working on the codebase and approving/rejecting contributions. If they act as an entity, you can buy their control over the project. The entity that owns the project can now change the license, “sponsor” the contributors by giving them salary, “fire” the contributors, etc.
From the post
> "LibreChat remains 100% open-source under its existing MIT license. Community-first development continues with the same transparency and openness. Expanded roadmap to bring an even more enterprise-ready analytics experience. This proven playbook is the same one that we applied when joining forces with PeerDB to provide our ClickPipes CDC capabilities, and HyperDX, which became the UX of our observability product, ClickStack."
Also, I work at ClickHouse, my email is super easy to figure out. Would love to alleviate concerns where I can.
At its simplest: the team who was building that rad thing called LibreChat now works at ClickHouse and builds that rad thing called LibreChat.
Even simpler: the LibreChat team works at ClickHouse, and they are now my colleagues.
More complex: acquisitions can take a variety of "forms"... the most important things in these scenarios (and here I speak without knowledge of the deal structure) are making sure the team is paid, that copyright/trademark stuff is worked out, that OSS plans are discussed, and that everyone is excited to work together.
I too have LibreChat deployed for my personal use and now the only question is how long until it will inevitably be enshittified/monetized.
Must feel even worse for volunteers who worked on the project but don't get any benefit from the acquisition.
It's a fair concern, and I understand where you are coming from. What I can say is that it's not our first rodeo incorporating another OSS product in our family. I tried to summarize it in the post:
> "This proven playbook is the same one that we applied when joining forces with PeerDB to provide our ClickPipes CDC capabilities, and HyperDX, which became the UX of our observability product, ClickStack."
If you research both instances above, the result is that these projects got more traction and adoption overall.
I hope this helps, and thank you for using LibreChat!
Rather, people have to ask questions of it, and interact with the data. Increasingly, that is via AI tooling.
We've had a long-standing demo at llm.clickhouse.com (LibreChat, Bedrock, Anthropic).
(disclaimer: work at ClickHouse)
Given all my own experiences on that front, it is terrifying if "increasingly" people are interacting with their data via AI tooling. In all the testing I've done, it can seem like magic ("Look, it just told us XXX piece of data and we just asked a simple question!"), but LLMs, even with copious amounts of context, are not good at understanding the business rules behind your data. And that goes for just about any company with more than "Pet Store"-level complexity (especially after years or decades of the data growing and changing).
Perhaps this has improved, but I use LLMs daily and nothing indicates to me that it's improved enough to make this worthwhile. I would assume any AI-only interface to data is either dealing with a laughably simple dataset/schema (or is super new), or lying to you constantly.
Our own experience running internal agents taught us that the best remediation comes from providing the LLMs with the maximum and most accurate context possible. Robust evaluations are also critical to measure accuracy, detect regressions, and improve. But there is no silver bullet.
SOTA LLMs are increasingly good at generating SQL and notoriously bad with math and numbers in general. Combining them with powerful querying capabilities bridges that gap and makes the overall experience a useful one.
IMO, we'll always have to deal with the stochastic nature of these models and with hallucinations, which calls for caution and requires raising awareness within the user base. What I found watching our users internally is that, while it's not magical, it lets users request data more often, which compounds into data-driven decision-making, assuming the users are trained to interpret the interactions.
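The division of labor described above (LLM writes SQL, database does the arithmetic) can be sketched minimally. Here `llm_generated_sql` is a stand-in for the model's output, not a real API call, and an in-memory SQLite database stands in for the warehouse:

```python
# Minimal sketch, under stated assumptions: the model only *names* the
# computation; the database executes it, so no numbers pass through the LLM.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?)",
                 [(10.0,), (20.0,), (30.0,), (40.0,)])

# In a real deployment this string would come back from the model.
llm_generated_sql = "SELECT SUM(amount), AVG(amount) FROM orders"

total, avg = conn.execute(llm_generated_sql).fetchone()
print(total, avg)  # 100.0 25.0
```

Because the arithmetic happens in the engine, the "LLMs are bad with numbers" weakness only matters if the generated SQL itself is wrong, which is exactly what the evaluations mentioned above are meant to catch.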
- Wouldn't it be cool to let my users chat with their data? ("How many new users signed up today/this event/this month/etc?" or "How much did we make yesterday?")
- An internal tool to use as a starting point for analytics dashboards
I still use LLMs to help write queries when it's something I know can be done but can't remember the syntax, but I scrapped the project that tried to accomplish both of the above goals due to too many mistakes.

Maybe my data is just too "dirty" (though honestly, I've never _not_ seen dirty data), and/or I should have cleaned up the deprecated columns in my tables that confused the models; even with strict instructions to ignore them, I should have filtered them out completely. I spent way too much time repeating myself, talking in all caps, and generally fighting with the SOTA models to get them to understand my data so that they could generate queries that actually worked (worked as in returned valid data, not just valid SQL).

I wasn't doing any training/fine-tuning (which may be the magic needed), but it felt like a dead end given current models. I'll also stress that I haven't re-tested those theories on newer models and my results are at least a year out of date (a lifetime in LLM/AI), but the fundamental issues I ran into didn't seem to be "on the cusp" of being solved or anything like that.
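The "filter them completely" lesson above can be sketched concretely: prune deprecated columns out of the schema description *before* it reaches the model, rather than asking the model to ignore them. The schema, table, and column names below are invented for illustration:

```python
# Hedged sketch of the mitigation described above: the model only ever sees
# the pruned schema, so it cannot be confused by deprecated columns.
DEPRECATED = {"users.legacy_id", "users.old_email"}

schema = {
    "users": ["id", "email", "legacy_id", "old_email", "created_at"],
}

def pruned_schema_prompt(schema: dict, deprecated: set) -> str:
    """Render a schema summary for the prompt, omitting deprecated columns."""
    lines = []
    for table, cols in schema.items():
        keep = [c for c in cols if f"{table}.{c}" not in deprecated]
        lines.append(f"{table}({', '.join(keep)})")
    return "\n".join(lines)

print(pruned_schema_prompt(schema, DEPRECATED))
# users(id, email, created_at)
```

Removing the columns from the context entirely sidesteps the instruction-following problem: there is nothing to "ignore" because the model never sees it.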
I wish you all the best of luck in improving on this kind of thing.
We published a public demo of the Agentic Data Stack, I'd love to hear your feedback https://clickhouse.com/blog/agenthouse-demo-clickhouse-llm-m...
Keep in mind that it's not fully "fair", since these public datasets are often documented on the internet and are therefore already present in the pre-training data of the underlying models (Claude Sonnet 4.5 in this case).
I hope the LibreChat dev got a nice payout. I've been self-hosting it for about a year.