Evaluate Your Own RAG: Why Best Practices Failed Us
Posted 2 months ago
huggingface.co · Tech · story
Key topics
RAG Systems
Information Retrieval
AI Applications
The post discusses the development and evaluation of RAG (Retrieval-Augmented Generation) systems, with a commenter sharing their own experience building a similar system for searching scientific PDFs.
Snapshot generated from the HN discussion
Discussion Activity
Light discussion (1 comment)
Key moments
- Story posted: Nov 5, 2025 at 8:47 AM EST (2 months ago)
- First comment: Nov 5, 2025 at 8:47 AM EST (0s after posting)
- Peak activity: 1 comment
- Latest activity: Nov 5, 2025 at 8:47 AM EST (2 months ago)
ID: 45822765 · Type: story · Last synced: 11/17/2025, 7:53:27 AM
Manual search wasn't cutting it. So we built a RAG system to give our team instant access to critical technical knowledge.
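At its core, a system like this is embed-and-search over chunked documents. Below is a minimal dense-retrieval sketch, assuming Titan V2 embeddings via Bedrock and a local Qdrant instance; the collection name, region, and payload layout are illustrative placeholders, not details from the post.

```python
import json

import boto3
from qdrant_client import QdrantClient

# Assumed setup: a Qdrant collection named "tech-docs" already populated with
# embedded chunks; the region and collection name are placeholders.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
qdrant = QdrantClient(url="http://localhost:6333")


def embed(text: str) -> list[float]:
    """Embed text with Amazon Titan Text Embeddings V2 (1024-dim by default)."""
    resp = bedrock.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(resp["body"].read())["embedding"]


def retrieve(question: str, k: int = 5) -> list[tuple]:
    """Dense-only retrieval: return the top-k chunks by vector similarity."""
    hits = qdrant.search(
        collection_name="tech-docs",
        query_vector=embed(question),
        limit=k,
    )
    return [(h.id, h.score, (h.payload or {}).get("text", "")) for h in hits]
```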
What worked:
- AWS Titan V2 crushed it: 69.2% hit rate vs. 57.7% for Qwen and 39.1% for Mistral (see the hit-rate sketch after this list)
- Chunk size? Barely mattered (2K to 40K: no significant difference)
- Qdrant: easy to use, solid performance, great for self-hosting
- Mistral OCR: unmatched, the only tool that parsed our equations correctly
- Naive chunking beat context-aware chunking (70.5% vs. 63.8%)
- Dense-only search outperformed hybrid search (69.2% vs. 63.5%)
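The hit-rate figures above come down to a simple check: for each question in a labelled eval set, is the expected chunk among the top-k retrieved? A minimal sketch, assuming a `retrieve(question, k)` function like the one above and an eval set of (question, expected chunk ID) pairs; the eval-set format is an assumption, not the post's exact harness.

```python
def hit_rate_at_k(eval_set, retrieve, k: int = 5) -> float:
    """Fraction of questions whose expected chunk appears in the top-k results.

    eval_set: iterable of (question, expected_chunk_id) pairs.
    retrieve: callable returning a ranked list of (chunk_id, score, text) tuples.
    """
    hits, total = 0, 0
    for question, expected_id in eval_set:
        retrieved_ids = [chunk_id for chunk_id, _, _ in retrieve(question, k=k)]
        hits += int(expected_id in retrieved_ids)
        total += 1
    return hits / total if total else 0.0


# Swapping the embedding model behind retrieve() and re-running this is how
# model-vs-model comparisons like the ones above can be reproduced.
```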
Hard lessons:
- OpenSearch from AWS is ridiculously expensive for no reason, yet AWS presents it as the default option
- Mistral Embed works well in English but not in French (see the per-language sketch below)
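The English-vs-French gap shows up as soon as each eval question is tagged with its language. A small extension of the hit-rate sketch, assuming the eval set carries a language tag; the triple format is an assumption for illustration.

```python
from collections import defaultdict


def hit_rate_by_language(eval_set, retrieve, k: int = 5) -> dict[str, float]:
    """Hit@k broken down by question language.

    eval_set: iterable of (question, expected_chunk_id, lang) triples,
    e.g. lang in {"en", "fr"}.
    """
    hits: dict[str, int] = defaultdict(int)
    totals: dict[str, int] = defaultdict(int)
    for question, expected_id, lang in eval_set:
        retrieved_ids = [chunk_id for chunk_id, _, _ in retrieve(question, k=k)]
        hits[lang] += int(expected_id in retrieved_ids)
        totals[lang] += 1
    return {lang: hits[lang] / totals[lang] for lang in totals}
```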