Valori – a Python-Native Vector Database I Built From Scratch
The idea came from my frustration with existing vector DBs that were either too heavy for experimentation or too opaque to modify. I wanted something simple, modular, and extensible — so I built it.
What it does:
- Lets you store, index, and search high-dimensional vectors
- Supports multiple indices (Flat, HNSW, IVF, LSH, Annoy)
- Has memory, disk, and hybrid storage backends
- Includes a full document processing pipeline (parsing, cleaning, chunking, embedding)
- Offers quantization, persistence, and plugin-based extensibility
All written in Python, integrated with NumPy, and production-tested with logging and monitoring built in.
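For readers unfamiliar with what a "Flat" index does, here is a minimal sketch of brute-force cosine-similarity search in plain NumPy. It is not Valori's API: the class name `FlatIndex` and its `add` and `search` methods are hypothetical names chosen for illustration, and the real library may structure this quite differently.

```python
import numpy as np


class FlatIndex:
    """Hypothetical brute-force index for illustration; not Valori's class."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self._vectors = np.empty((0, dim), dtype=np.float32)

    def add(self, vectors) -> None:
        # Accept a single vector or a batch and append it to the store.
        vectors = np.asarray(vectors, dtype=np.float32).reshape(-1, self.dim)
        self._vectors = np.vstack([self._vectors, vectors])

    def search(self, query, k: int = 5):
        """Return (indices, scores) of the k most cosine-similar vectors."""
        query = np.asarray(query, dtype=np.float32).reshape(self.dim)
        eps = 1e-12  # guards against division by zero for all-zero vectors
        # Normalize both sides so the dot product equals cosine similarity.
        norms = np.linalg.norm(self._vectors, axis=1, keepdims=True)
        db = self._vectors / (norms + eps)
        q = query / (np.linalg.norm(query) + eps)
        scores = db @ q
        k = min(k, len(scores))
        top = np.argsort(-scores)[:k]  # highest similarity first
        return top, scores[top]


# Usage: three stored vectors, query closest to the first and third.
index = FlatIndex(dim=3)
index.add([[1, 0, 0], [0, 1, 0], [0.9, 0.1, 0]])
ids, scores = index.search([1, 0, 0], k=2)
```

A flat index compares the query against every stored vector, so it is exact but scales linearly with collection size. Approximate structures such as HNSW or IVF trade some recall for speed on larger collections.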
Install:
pip install valori
GitHub: https://github.com/varshith-Git/valori
PyPI: https://pypi.org/project/valori
I’d love to hear your thoughts:
- What’s missing for you in current vector DBs?
- If you’ve built LLM or RAG systems, what do you wish a lightweight, pure Python DB like this handled better?
- Would you prefer tighter integrations (LangChain, Haystack, etc.) or a more “build-it-yourself” style?
Feedback, criticism, and collaboration ideas are all welcome. Varshith (varshith.gudur17@gmail.com)
The author presents Valori, a Python-native vector database built from scratch, and receives mixed feedback from the community regarding its design, performance, and potential applications.
Posted Nov 9, 2025 at 6:52 AM EST. First comment 1 minute after posting; peak activity of 4 comments in the 18-21h window; latest activity Nov 11, 2025 at 7:24 AM EST. Based on 11 loaded comments.
Nothing is better than SQLite as a library, and don't use high performance as your selling point for a Python product.
Where did you get the original mental model to begin building it?
The “vibe” part came from trying to make it feel like a system that could run in production, not just a toy. So yeah, it’s a little heavy, but it earned the vibe honestly.
Since you're asking for feedback:
- Perhaps some of the document-type-specific dependencies could be optional?
- Could there be LESS config surface?
- I noticed the GitHub CI action is showing a red cross (failing).
These days it's worth documenting how to use it with Astral's "uv", especially for anything that might pull in PyTorch dependency hell, which uv has mostly solved if used correctly!
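One common way to act on the optional-dependency suggestion above is to import heavy parsers lazily and fail with a clear message when they are missing. The sketch below is a generic pattern, not code from Valori: the helper name `require_optional`, the `extra` label, and the install hint are all hypothetical, and whether the project defines any extras is an assumption.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or raise an error explaining the fix.

    `extra` is a hypothetical extras label (for example "pdf"); the install
    hint in the message is illustrative and names no real package extra.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this document type. "
            f"Install the optional '{extra}' dependencies to enable it."
        ) from exc


# Usage: a module that exists loads normally; a missing one explains itself.
json_mod = require_optional("json", extra="json")  # stdlib, always present
try:
    require_optional("not_a_real_parser_lib", extra="pdf")
except ImportError as err:
    message = str(err)
```

With this pattern the base install stays small, and users only pay for a parser's dependencies when they actually process that document type.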
Nice work!
GitHub: https://github.com/varshith-Git/valori
https://valori-python-vector-db.lovable.app/