Building an AI Agent Inside a 7-Year-Old Rails Monolith
Key topics
Diving into the world of AI integration, a developer's experiment with building an AI agent within a 7-year-old Rails monolith has sparked a lively discussion. At the heart of the debate is the concern about giving Large Language Models (LLMs) access to sensitive data, with some commenters suggesting that limiting the LLM to running function calls without exposing the return data could be a more secure approach. While some saw the implementation as a Rube Goldberg-esque overengineering, others appreciated the practicality and efficiency of the solution, such as building a RAG in just 50 lines of Ruby. The conversation highlights the trade-offs and creative problem-solving involved in integrating AI into legacy systems.
Snapshot generated from the HN discussion
Discussion Activity
- First comment: 2h after posting
- Peak period: 12 comments in the 6-9h window
- Avg per period: 6.2 comments
- Based on 56 loaded comments
Key moments
- Story posted: Dec 26, 2025 at 2:35 AM EST (8 days ago)
- First comment: Dec 26, 2025 at 4:07 AM EST (2h after posting)
- Peak activity: 12 comments in the 6-9h window, the hottest stretch of the conversation
- Latest activity: Dec 27, 2025 at 2:37 PM EST (6d ago)
This reads to me like they think that the response from the tool doesn’t go back to the LLM.
I’ve not worked with tools but my understanding is that they’re a way to allow the LLM to request additional data from the client. Once the client executes the requested function, that response data then goes to the LLM to be further processed into a final response.
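For a concrete picture of that round trip, here is a minimal sketch using the RubyLLM gem mentioned elsewhere in the thread; the `CustomerLookup` tool and the `Customer` model are hypothetical stand-ins, not code from the article:

```ruby
require "ruby_llm"

RubyLLM.configure do |config|
  config.openai_api_key = ENV["OPENAI_API_KEY"]
end

# A tool the LLM can ask the client to run against private data.
class CustomerLookup < RubyLLM::Tool
  description "Finds a customer record by name in our private database"
  param :name, desc: "Full or partial customer name"

  def execute(name:)
    customer = Customer.where("name ILIKE ?", "%#{name}%").first
    return { error: "not found" } unless customer

    # Whatever is returned here goes back to the LLM as the tool result.
    { name: customer.name, email: customer.email, plan: customer.plan }
  end
end

chat = RubyLLM.chat(model: "gpt-4o-mini")
chat.with_tool(CustomerLookup)

# The model requests the tool call, the gem runs #execute locally, and the
# return value is round-tripped to the model to compose the final response.
response = chat.ask("What's the email address for the customer named Jon Snow?")
puts response.content
```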
They're saying that a public LLM won't know the email address of Jon Snow, but they still want to be able to answer questions about their private SaaS data which DOES know that.
Then they describe building a typical tool-based LLM system where the model can run searches against private data and round-trip the results through the model to generate chat responses.
They're relying on the AI labs to keep their promises about not training in data from paying API customers. I think that's a safe bet, personally.
It’s also funny how these tools push people into patterns by accident. You’d never consider sending a customer’s details to a third party just so they can send them back, right? And there’s nothing stopping someone from working more directly with the tool call response themselves, but the libraries are set up so you lean into the LLM more than is required (I know you, more than anyone, appreciate that the value they add here is parsing the fuzzy instruction into a tool call, not the call itself).
I use hosted database providers and APIs like S3 all the time.
Sending customer details to a third party is fine if you trust them and have a financial relationship with them backed by legal agreements.
- Made a RAG in ~50 lines of Ruby (practical and efficient)
- Perform authorization on chunks in 2 lines of code (!!)
- Offload retrieval to Algolia. Since a RAG is essentially LLM + retriever, the retriever typically ends up being most of the work. So using an existing search tool (rather than setting up a dedicated vector db) could save a lot of time/hassle when building a RAG (see the sketch after this list).
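To make the shape of that concrete, here is a minimal sketch of the retriever half, with authorization applied to chunks before anything reaches the LLM. The index name, hit attributes, and `current_account.accessible_document_ids` are hypothetical, and the Algolia calls follow the v2 Ruby client, so adjust for your client version:

```ruby
require "algolia"

class ChunkRetriever
  def initialize
    client = Algolia::Search::Client.create(ENV["ALGOLIA_APP_ID"], ENV["ALGOLIA_API_KEY"])
    @index = client.init_index("help_chunks")
  end

  # Returns only the chunks this account is allowed to see.
  def call(query, current_account)
    hits = @index.search(query, { hitsPerPage: 20 })[:hits]

    # The "authorization on chunks in 2 lines" idea: filter retrieved chunks
    # against the account's permissions before they become LLM context.
    allowed = current_account.accessible_document_ids
    hits.select { |hit| allowed.include?(hit[:document_id]) }
  end
end
```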
Not all cool code is in new greenfield projects.
I probably wouldn't write a training system in Ruby (not because it's not doable, just because it's not a good use of time to rewrite stuff that is already available in the Python ecosystem)... but hooking a Ruby system up to LLMs for interaction is eminently doable with very little effort.
I am assuming your situation had some specific constraints that made it harder, but it would be nice to understand what they were - right now your comment describes a more complicated solution and I am curious why you needed it.
Surely a fuzzy search by name or some other field is a much better UI for this.
We build front ends for the API to make our applications easier to use. This is just another type of front end.
It’s not a good thing.
See also "instagram is spying on you through your microphone". It's not, but I've seen people argue that it's OK for people to believe that because it supports their general (accurate) sentiment that targeted ads are creepy.
(I have a ton more arguments if that's not convincing enough for you; I collect them here: https://simonwillison.net/tags/microphone-ads-conspiracy/ )
Seriously: the entire idea there is that there was a vast global conspiracy to secretly spy on people to target ads which was blown wide open by THIS deck: https://www.documentcloud.org/documents/25051283-cmg-pitch-d...
How do you prove they are not?
How can I trust a corporation that is tied to Facebook, which experimented with Cambridge Analytica?
The water figures are heavily overestimated, but the principle is true: using a supercomputer to do simple things uses more electricity, compute, and therefore water than doing them in a traditional way.
It's a tool. The main question should be: is it useful? In the case of AI, sometimes yes, sometimes no.
It's a tool, and using the wrong tool for the job is just wasteful. And, usually, overly complicated and frail. So it's only losses.
Does anyone have a comparison of the two, or any other libraries?
RubyLLM gives you a clean API for LLM calls and tool definitions — you're still writing prompts and managing conversations directly.
DSPy.rb treats prompts as functions with typed signatures. You define inputs/outputs and the framework handles prompt construction, JSON parsing, and structured extraction. Two articles that might help:
1. "Building Your First ReAct Agent" — shows how to build tool-using agents with type-safe tool definitions: https://oss.vicente.services/dspy.rb/blog/articles/react-age...
2. "Building Chat Agents with Ephemeral Memory" — demonstrates context engineering patterns (what the LLM sees vs. what you store), cost-based routing between models, and memory management: https://oss.vicente.services/dspy.rb/blog/articles/ephemeral...
The article's approach (RubyLLM + single tool) works great for simple cases. DSPy.rb shines when you need to decompose into multiple specialized modules with different concerns — e.g., separate signatures for classification vs. response generation, each optimized independently.
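For a sense of what "prompts as functions with typed signatures" looks like in practice, here is a hedged sketch in the style of DSPy.rb's docs; the signature and field names are illustrative rather than taken from the article:

```ruby
require "dspy"

# A typed signature: the framework builds the prompt and parses the
# structured output so you never hand-write JSON extraction.
class AnswerSupportQuestion < DSPy::Signature
  description "Answer a customer support question using only the provided context."

  input do
    const :question, String
    const :context, String
  end

  output do
    const :answer, String
  end
end

DSPy.configure do |c|
  c.lm = DSPy::LM.new("openai/gpt-4o-mini", api_key: ENV["OPENAI_API_KEY"])
end

answerer = DSPy::Predict.new(AnswerSupportQuestion)
result = answerer.call(
  question: "How do I reset my password?",
  context: "…chunks returned by the retriever go here…"
)
puts result.answer
```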
Would love to learn how dspy.rb is working for you!
Note that RubyLLM and DSPy.rb aren't mutually exclusive: the `dspy-ruby_llm` adapter gem gives us access to a TON of providers.
It may be the current "Zeitgeist", but I find the addiction to AI annoying. I am not denying that there are use cases to be had that can be net-positive, but there are also numerous bad examples of AI use. And these, IMO, are more prevalent than the positive ones overall.
If a problem is this widespread, a conference is arguably the best place to address it.
> but there are also numerous bad examples of AI use
which should be discussed publicly. I think we all have a lot to learn from each others' successes and failures, which is where coming together at a conference can really help.
I liked how well designed the monolith application seems to be from the brief description in the article.
Coincidentally, I installed Ruby last week for the first time in years and spent half an hour experimenting with the same nicely designed RubyLLM gem used in the article. While slop code can be written in any language, it seems like many Ruby devs have excellent style in general. Clojure is another language where I have noticed a propensity for great style.
As long as I am rambling, one more thing: a plug for monolith applications. I used to get a lot of pleasure from working as a single dev on monoliths in Java and Ruby, eschewing microservices; it is really great to share data and code in one huge, usually multithreaded, process.
Your single-tool approach is a solid starting point. As it grows, you might hit context window limits and find the prompt getting unwieldy: things like wondering why the prompt is choking on 1.5 MB of JSON from some other API.
When you look at systems like Codex CLI, they run at least four separate LLM subsystems: (1) the main agent prompt, (2) a summarizer model that watches the reasoning trace and produces user-facing updates like "Searching for test files...", (3) compaction, and (4) a reviewer agent. Each one only sees the context it needs, like functions with their own inputs and outputs. Total tokens stay similar, but signal density per prompt goes up.
DSPy.rb[0] enables this pattern in Ruby: define typed Signatures for each concern, compose them as Modules/Prompting Techniques (simple predictor, CoT, ReAct, CodeAct, your own, ...), and let each maintain its own memory scope. Three articles that show this:
- "Ephemeral Memory Chat"[1] — the Two-Struct pattern (rich storage vs. lean prompt context) plus cost-based routing between cheap and expensive models.
- "Evaluator Loops"[2] — decompose generation from evaluation: a cheap model drafts, a smarter model critiques, each with its own focused signature.
- "Workflow Router"[3] — route requests to the right model based on complexity, only escalate to expensive LLMs when needed.
And since you're already using RubyLLM, the dspy-ruby_llm adapter lets you keep your provider setup while gaining the decomposition benefits.
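As a rough, self-contained illustration of that decomposition (all of the names below are made up for the example, and the LM configuration is assumed to be set up as in DSPy.rb's docs): a cheap classification signature can gate whether the heavier answering signature runs at all, so each prompt only carries the context it needs.

```ruby
require "dspy"

# Assumed to be configured once at boot, e.g.:
# DSPy.configure { |c| c.lm = DSPy::LM.new("openai/gpt-4o-mini", api_key: ENV["OPENAI_API_KEY"]) }

class ClassifyIntent < DSPy::Signature
  description "Decide whether a message needs a data lookup or is small talk."

  input  { const :message, String }
  output { const :category, String } # "lookup" or "chat"
end

class DraftAnswer < DSPy::Signature
  description "Answer the question using only the provided context."

  input do
    const :question, String
    const :context, String
  end

  output { const :answer, String }
end

classifier = DSPy::Predict.new(ClassifyIntent)
answerer   = DSPy::Predict.new(DraftAnswer)

def handle(message, classifier, answerer)
  # The cheap step decides; the expensive, context-heavy step only runs when needed.
  if classifier.call(message: message).category == "lookup"
    answerer.call(question: message, context: "…retrieved chunks go here…").answer
  else
    "Happy to help! What would you like to know?"
  end
end
```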
Thanks for coming to my TED talk. Let me know if you need someone to bounce ideas off.
[0] https://github.com/vicentereig/dspy.rb
[1] https://oss.vicente.services/dspy.rb/blog/articles/ephemeral...
[2] https://oss.vicente.services/dspy.rb/blog/articles/evaluator...
[3] https://oss.vicente.services/dspy.rb/blog/articles/workflow-...
(edit: minor formatting)