Not Hacker News Logo

Not

Hacker

News!

Home
Hiring
Products
Companies
Discussion
Q&A
Users
Not Hacker News Logo

Not

Hacker

News!

AI-observed conversations & context

Daily AI-observed summaries, trends, and audience signals pulled from Hacker News so you can see the conversation before it hits your feed.

LiveBeta

Explore

  • Home
  • Hiring
  • Products
  • Companies
  • Discussion
  • Q&A

Resources

  • Visit Hacker News
  • HN API
  • Modal cronjobs
  • Meta Llama

Briefings

Inbox recaps on the loudest debates & under-the-radar launches.

Connect

© 2025 Not Hacker News! — independent Hacker News companion.

Not affiliated with Hacker News or Y Combinator. We simply enrich the public API with analytics.

Not Hacker News Logo

Not

Hacker

News!

Home
Hiring
Products
Companies
Discussion
Q&A
Users
  1. Home
  2. /Discussion
  3. /I built an open-weights memory system that reaches 80.1% on the LoCoMo benchmark
  1. Home
  2. /Discussion
  3. /I built an open-weights memory system that reaches 80.1% on the LoCoMo benchmark
12h agoPosted Nov 26, 2025 at 8:38 AM EST

I Built an Open-Weights Memory System That Reaches 80.1% on the Locomo Benchmark

ViktorKuz
1 points
0 comments

Mood

informative

Sentiment

positive

Category

startup_launch

Key topics

Long-Term Memory Architectures
Retrieval Pipelines
Natural Language Processing
Information Retrieval
I’ve been experimenting with long-term memory architectures for agent systems and wanted to share some technical results that might be useful to others working on retrieval pipelines.

Benchmark: LoCoMo (10 runs × 10 conversation sets) Average accuracy: 80.1% Setup: full isolation across all 10 conv groups (no cross-contamination, no shared memory between runs)

Architecture (all open weights except answer generation)

1. Dense retrieval

BGE-large-en-v1.5 (1024d)

FAISS IndexFlatIP

Standard BGE instruction prompt: “Represent this sentence for searching relevant passages.”

2. Sparse retrieval

BM25 via classic inverted index

Helps with low-embedding-recall queries and keyword-heavy prompts

3. MCA (Multi-Component Aggregation) ranking A simple gravitational-style score combining:

keyword coverage

token importance

local frequency signal MCA acts as a first-pass filter to catch exact-match questions. Threshold: coverage ≥ 0.1 → keep top-30

4. Union strategy Instead of aggressively reducing the union, the system feeds 112–135 documents directly to a re-ranker. In practice this improved stability and prevented loss of rare but crucial documents.

5. Cross-Encoder reranking

bge-reranker-v2-m3

Processes the full union (rare for RAG pipelines, but worked best here)

Produces a final top-k used for answer generation

6. Answer generation

GPT-4o-mini, used only for the final synthesis step

No agent chain, no tool calls, no memory-dependent LLM logic

Performance

<3 seconds per query on a single RTX 4090

Deterministic output between runs

Reproducible test harness (10×10 protocol)

Why this worked

Three things seemed to matter most:

MCA-first filter to stabilize early recall

Not discarding the union before re-ranking

Proper dense embedding instruction, which massively affects BGE performance

Notes

LoCoMo remains one of the hardest public memory benchmarks: 5,880 multi-hop, temporal, negation-rich QA pairs derived from human–agent conversations. Would be interested to compare with others working on long-term retrieval, especially multi-stage ranking or cross-encoder heavy pipelines.

Github: https://github.com/vac-architector/VAC-Memory-System

Memory System: I built an open-weights memory system that reaches 80.1% on the LoCoMo benchmark

Snapshot generated from the HN discussion

Discussion Activity

No activity data yet

We're still syncing comments from Hacker News.

Generating AI Summary...

Analyzing up to 500 comments to identify key contributors and discussion patterns

Discussion (0 comments)

Discussion hasn't started yet.

ID: 46057283Type: storyLast synced: 11/26/2025, 1:40:08 PM

Want the full context?

Jump to the original sources

Read the primary article or dive into the live Hacker News thread when you're ready.

View on HN
Not Hacker News Logo

Not

Hacker

News!

AI-observed conversations & context

Daily AI-observed summaries, trends, and audience signals pulled from Hacker News so you can see the conversation before it hits your feed.

LiveBeta

Explore

  • Home
  • Hiring
  • Products
  • Companies
  • Discussion
  • Q&A

Resources

  • Visit Hacker News
  • HN API
  • Modal cronjobs
  • Meta Llama

Briefings

Inbox recaps on the loudest debates & under-the-radar launches.

Connect

© 2025 Not Hacker News! — independent Hacker News companion.

Not affiliated with Hacker News or Y Combinator. We simply enrich the public API with analytics.