VAC Memory System – SOTA RAG (80.1% LoCoMo) Built by a Handyman Using Claude CLI
The Problem: Standard vector search (RAG) fails in complex, multi-hop conversations, where semantic drift leads to hallucinations. The Solution: I built the VAC Memory System. It achieves deterministic accuracy by using a proprietary MCA-first Gate instead of relying solely on semantics. This "Gravity" logic protects core entities (such as dates and names) from being missed, which is essential for multi-user conversational agents.
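To make the "Gravity" idea concrete, here is a minimal toy sketch of an entity-protecting gate in Python. The real MCA logic is compiled and proprietary, so everything below (the regexes, gate_protect, the override rule) is an invented illustration of the concept, not the shipped algorithm:

    # Toy illustration only: the real MCA gate is proprietary and compiled.
    # Idea: entity overlap (dates, names) can override pure semantic rank.
    import re

    DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")            # ISO dates
    NAME_RE = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")      # crude proper names

    def gate_protect(query: str, ranked_docs: list[str], top_k: int = 15) -> list[str]:
        """Keep the semantic ranking, but force documents that share a
        core entity with the query back into the candidate set."""
        q_entities = set(DATE_RE.findall(query)) | set(NAME_RE.findall(query))
        kept = ranked_docs[:top_k]
        for doc in ranked_docs[top_k:]:
            d_entities = set(DATE_RE.findall(doc)) | set(NAME_RE.findall(doc))
            if q_entities & d_entities:  # entity match beats semantic distance
                kept.append(doc)
        return kept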
Key Metrics (Tested on 100 runs):
SOTA Accuracy: 80.1% Mean Accuracy on LoCoMo.
Cost: Ultra-low, <$0.10 per million tokens (using gpt-4o-mini).
Speed: Fast, 2.5 seconds per query.
This was built in 4.5 months, using Claude in the terminal, under extreme personal pressure. This repository is proof that focus and grit can still deliver breakthrough results.
The Ask (Read by VCs): I am looking for $100k in Pre-Seed funding to formalize the business, secure the IP, and hire the first engineer to build the API layer. I am looking for partners who back founders with deep product conviction.
Check the reproducible code and full results: https://github.com/vac-architector/VAC-Memory-System
Feedback is welcome!
I see two main concerns emerging, and I want to be completely transparent:
1. .so Files and IP Protection: The core MCA algorithm is compiled to protect the IP while I seek $100k Pre-Seed funding. This is the "lottery ticket" I need to cash to scale. I did not retrain any model; the system's SOTA performance comes entirely from the proprietary MCA-first Gate logic. Reproducibility is guaranteed: you can run the exact binary that produced the 80.1% SOTA results and verify all logs.
2. Overfit vs. Architectural Logic: All LLM and embedding components are off-the-shelf; the success is purely due to the VAC architecture. MCA is a general solution designed to combat semantic drift in multi-hop conversational memory. If I were overfitting by tuning boundaries, I would have 95%+ accuracy, not 80.1%. The remaining ~20% of failures are real limitations.
Call to Action: Next Benchmarks. I need your recommendations: what are the toughest long-term conversation benchmarks you know? What else should I test the MCA system on to truly prove its generalizability?
GitHub: https://github.com/vac-architector/VAC-Memory-System
I appreciate the honesty of the HN community and your help in validating my work.
The MCA algorithm does not perform the majority of document retrieval; it is merely a mechanism that complements FAISS and BM25 for greater coverage. My system uses a truly hybrid retriever:
FAISS (with BGE-large-en-v1.5, 1024D) handles the primary retrieval load, pulling 60%+ of the documents.
MCA acts as a specialized gate. The logic is: if two retrievers miss the ground truth (GT), the third one catches it. They complement each other—what FAISS misses, MCA finds.
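A rough sketch of how such a three-way union can be wired up, assuming sentence-transformers, faiss, and rank_bm25 are installed; the mca_candidates stand-in hides the proprietary gate, and the wiring itself is my assumed reconstruction, not the repo's code:

    # Sketch of the union retriever. BGE-large-en-v1.5 and FAISS are named in
    # the post; everything else here is an assumed minimal reconstruction.
    import faiss
    import numpy as np
    from rank_bm25 import BM25Okapi
    from sentence_transformers import SentenceTransformer

    docs = [
        "Alice moved to Berlin on 2023-05-01.",
        "Bob started a new job at a bakery last spring.",
        "Alice and Bob met at a conference in 2019.",
    ]

    # Dense leg: BGE-large-en-v1.5 (1024D) in a FAISS inner-product index.
    encoder = SentenceTransformer("BAAI/bge-large-en-v1.5")
    emb = encoder.encode(docs, normalize_embeddings=True)
    index = faiss.IndexFlatIP(emb.shape[1])
    index.add(np.asarray(emb, dtype="float32"))

    # Sparse leg: BM25 over whitespace tokens.
    bm25 = BM25Okapi([d.lower().split() for d in docs])

    def mca_candidates(query: str) -> set[int]:
        return set()  # stand-in for the proprietary MCA gate

    def retrieve(query: str, k: int = 2) -> set[int]:
        q = encoder.encode([query], normalize_embeddings=True)
        _, dense_ids = index.search(np.asarray(q, dtype="float32"), k)
        sparse_ids = np.argsort(bm25.get_scores(query.lower().split()))[::-1][:k]
        # Union: what one leg misses, another may catch.
        return set(dense_ids[0].tolist()) | set(sparse_ids.tolist()) | mca_candidates(query)

    print(retrieve("When did Alice move to Berlin?"))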
Pipeline Magic: Despite this aggressive union coverage (which often exceeds 85% of documents), the reranker plays an equally critical role. The ground truth (GT) doesn't always reach the top 15, and the final LLM doesn't always grasp the context even when it's present. All the magic is in the deterministic pipeline orchestration.
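The post does not name the reranker, so here is a generic cross-encoder rerank step as an assumed placeholder for that stage (the model string is my pick, not from the repo):

    # Assumed rerank stage; the actual reranker used is not disclosed.
    from sentence_transformers import CrossEncoder

    reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

    def rerank(query: str, candidates: list[str], top_n: int = 15) -> list[str]:
        scores = reranker.predict([(query, c) for c in candidates])
        order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
        return [candidates[i] for i in order[:top_n]]  # top 15 go to the LLM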
LLM Agnosticism: The LLM (gpt-4o-mini) is only involved in the final answer generation, which makes my system highly robust and LLM-agnostic. You can switch to a weaker or stronger generative model; the overall accuracy will not change by more than ±10 points.
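Because the generator only ever sees the final reranked context, swapping it is a one-line change. A minimal sketch using the OpenAI Python SDK (the prompt wording is mine, not from the repo):

    # Final, model-agnostic generation step (sketch).
    from openai import OpenAI

    client = OpenAI()

    def answer(query: str, context_docs: list[str], model: str = "gpt-4o-mini") -> str:
        context = "\n".join(context_docs)
        resp = client.chat.completions.create(
            model=model,  # swap for any weaker/stronger generator
            messages=[
                {"role": "system", "content": "Answer strictly from the context."},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
            ],
        )
        return resp.choices[0].message.content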