The Theoretical Limitations of Embedding-Based Retrieval
Key topics: Information Retrieval, Embedding-Based Retrieval, NLP
A research paper discusses the theoretical limitations of embedding-based retrieval, sparking discussion on the implications for information retrieval and potential alternatives like BM25.
Multi-vector models
Multi-vector models gain expressivity by using multiple vectors per sequence combined with the MaxSim operator [Khattab and Zaharia, 2020]. These models show promise on the LIMIT dataset, scoring well above the single-vector models despite using a smaller backbone (ModernBERT, Warner et al. [2024]). However, these models are not generally used for instruction-following or reasoning-based tasks, leaving open the question of how well multi-vector techniques will transfer to these more advanced tasks.
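As a minimal sketch (not the paper's implementation), the MaxSim scoring used by late-interaction models can be written as follows, assuming pre-computed, L2-normalized per-token embeddings held in NumPy arrays:

```python
import numpy as np

def maxsim_score(query_vecs: np.ndarray, doc_vecs: np.ndarray) -> float:
    """Late-interaction MaxSim: for each query token embedding, take the
    maximum similarity over all document token embeddings, then sum.

    query_vecs: (num_query_tokens, dim); doc_vecs: (num_doc_tokens, dim).
    Assumes rows are L2-normalized, so the dot product equals cosine similarity.
    """
    sim = query_vecs @ doc_vecs.T        # (Q, D) token-to-token similarities
    return float(sim.max(axis=1).sum())  # max over doc tokens, sum over query tokens
```

Because the score keeps one vector per token rather than pooling everything into a single vector, two documents can still be told apart even when their pooled embeddings would collide, which is one intuition for the stronger LIMIT results.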
Sparse models
Sparse models (both lexical and neural) can be thought of as single-vector models with very high dimensionality. This dimensionality helps BM25 avoid the problems of the neural embedding models, as seen in Figure 3. Since the dimensionality of their vectors is high, they can scale to many more combinations than their dense counterparts. However, it is less clear how to apply sparse models to instruction-following and reasoning-based tasks where there is no lexical or even paraphrase-like overlap. We leave this direction to future work.
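To illustrate the "single vector with very high dimensionality" view, here is a hedged sketch of BM25-style sparse scoring, where each document becomes a mapping from terms (the vocabulary dimensions) to weights; the `idf` table, `avg_len`, and the k1/b defaults are conventional assumptions, not values from the paper:

```python
from collections import Counter

def bm25_vector(doc_tokens, idf, avg_len, k1=1.5, b=0.75):
    """Map a tokenized document to a sparse vector of BM25 term weights.
    The implicit dimensionality is the whole vocabulary; only terms that
    actually occur in the document get non-zero entries.
    """
    tf = Counter(doc_tokens)
    norm = k1 * (1.0 - b + b * len(doc_tokens) / avg_len)
    return {t: idf.get(t, 0.0) * f * (k1 + 1.0) / (f + norm)
            for t, f in tf.items()}

def sparse_dot(query_vec, doc_vec):
    """Dot product in vocabulary space: only overlapping terms contribute,
    which is why purely lexical models struggle when a relevant document
    shares no words with the query.
    """
    return sum(w * doc_vec.get(t, 0.0) for t, w in query_vec.items())
```

The huge (vocabulary-sized) dimension is what lets these models represent many more top-k combinations than dense vectors, while the overlap-only scoring is exactly the limitation the paper flags for reasoning-based tasks.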
In other words, the paper suggests that both multi-vector (i.e., late-interaction) and sparse models hold promise.