Ask HN: Scaling a local FAISS + LLM RAG system (356k chunks): architectural advice
Mood: informative
Sentiment: neutral
Category: Ask HN
Key topics: FAISS, RAG, local AI, scaling, architecture
Problems I’m running into:
Metadata pickle file loads entirely into RAM
No incremental indexing: every update forces a full FAISS index rebuild
Query performance degrades with concurrent use
Want to scale to 1M+ chunks but not sure FAISS + pickle is the right long-term architecture
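For the pickle problem specifically, a common pattern is to move chunk metadata out of a single in-memory pickle and into SQLite, keyed by the same int64 id you give FAISS, so lookups after a search touch only the rows you need. A minimal sketch (the table layout and helper names here are illustrative, not a specific library's API):

```python
import json
import sqlite3


def open_meta(path=":memory:"):
    """Open (or create) a metadata store keyed by the FAISS integer id."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        " id INTEGER PRIMARY KEY,"   # same int64 id passed to FAISS
        " source TEXT,"              # originating document
        " text TEXT,"                # the chunk text itself
        " extra TEXT)"               # JSON blob for anything else
    )
    return con


def put_chunk(con, chunk_id, source, text, extra=None):
    """Insert or update one chunk's metadata."""
    con.execute(
        "INSERT OR REPLACE INTO chunks VALUES (?, ?, ?, ?)",
        (chunk_id, source, text, json.dumps(extra or {})),
    )
    con.commit()


def get_chunks(con, ids):
    """Fetch metadata for the ids returned by index.search()."""
    qmarks = ",".join("?" * len(ids))
    rows = con.execute(
        f"SELECT id, source, text, extra FROM chunks WHERE id IN ({qmarks})",
        [int(i) for i in ids],
    ).fetchall()
    return {r[0]: {"source": r[1], "text": r[2], **json.loads(r[3])} for r in rows}
```

With a file-backed path instead of `:memory:`, nothing is resident in RAM except SQLite's page cache, and 356k (or 1M+) rows is well within what a single SQLite file handles comfortably.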
My questions for those who’ve scaled local or offline RAG systems:
How do you store metadata efficiently at this scale?
Is there a practical pattern for incremental FAISS updates?
Would a vector DB (Qdrant, Weaviate, Milvus) be a better fit for offline use?
Any lessons learned from running large FAISS indexes on consumer hardware?
Not looking for product feedback — just architectural guidance from people who’ve built similar systems.
Discussion activity
Light discussion: 2 comments loaded.
Story posted: Nov 25, 2025 at 6:42 AM EST
First comment: Nov 25, 2025 at 7:56 AM EST (1h after posting)
Peak activity: 1 comment in hour 2
Latest activity: Nov 25, 2025 at 10:45 AM EST