Biscuit
github.com
Biscuit 277.09 MB
Trigram 86 MB
B-Tree 43 MB
Pretty much you exchange space for speed.
One suggestion is to index the end-of-string as a character as well; then you don't need negative offsets. But that turns the suffix search into a wildcard type of thing where you have to try all offsets, which is what the '%pat%' searches do already, so maybe it's OK.
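To make the end-of-string suggestion concrete, here is a minimal Python sketch of a positional character index with a sentinel. It is only an illustration of the idea in the comment above, not how Biscuit is implemented; the `END` sentinel, the function names, and the in-memory dict of sets are all assumptions made for the example.

```python
from collections import defaultdict

END = "\0"  # hypothetical end-of-string sentinel; assumed never to occur in the data


def build_index(rows):
    """Map (character, offset) -> set of row ids, with END stored at offset len(text)."""
    index = defaultdict(set)
    for row_id, text in enumerate(rows):
        for offset, ch in enumerate(text + END):
            index[(ch, offset)].add(row_id)
    return index


def match_at(index, pattern, start):
    """Row ids where `pattern` occurs starting exactly at offset `start`."""
    result = None
    for i, ch in enumerate(pattern):
        ids = index.get((ch, start + i), set())
        # Copy on first use so callers never receive a set owned by the index.
        result = set(ids) if result is None else result & ids
        if not result:
            return set()
    return result or set()


def prefix_search(index, prefix):
    # LIKE 'prefix%': the pattern is anchored at offset 0.
    return match_at(index, prefix, 0)


def suffix_search(index, suffix, max_len):
    # LIKE '%suffix': with no negative offsets, every start offset has to be tried,
    # and the sentinel is what pins the match to the end of the string.
    found = set()
    for start in range(max_len - len(suffix) + 1):
        found |= match_at(index, suffix + END, start)
    return found


rows = ["abcdef", "xxdef", "abc", "defabc"]
index = build_index(rows)
max_len = max(len(r) for r in rows)

print(sorted(prefix_search(index, "abc")))           # rows starting with "abc"
print(sorted(suffix_search(index, "def", max_len)))  # rows ending with "def"
```

The loop over `start` in `suffix_search` is the cost the comment points out: the suffix query now does the same "try every offset" work as a '%pat%' search, in exchange for not storing a second set of negative offsets.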
The interesting question in prod is always the other side of that trade: write amplification and index bloat. The docs are pretty up-front that write performance and concurrency haven’t been deeply characterized yet, and they even have a section on when you should stick with pg_trgm or plain B-trees instead. If they can show that Biscuit stays sane under a steady stream of updates on moderately long text fields, it’ll be a really compelling option for the common “poor man’s search” use case where you don’t want to drag in an external search engine but ILIKE '%foo%' is killing your box.
But if you really need to optimize LIKE instead of providing plain text search, sure.
Example: LIKE '%abc%def'
...
Step 2: Match first part as prefix
-- "abc" must start at position 0
Candidates = pos[a@0] ∩ pos[b@1] ∩ pos[c@2]
Is this a mistake, or is there some position magic that makes the position == 0 even after an arbitrary prefix?
Usually you're quickly steered towards fulltext search (tsvector) in Postgres if you want to do something like that. But depending on what kind of search you actually need, trigram indexes can be a better option. If you search less for natural language and more for specific keywords, the stemming in fulltext search can get in the way.
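For anyone unfamiliar with why trigrams suit keyword-style substring search, here is a rough Python sketch of the filter-then-recheck idea. It is a simplification: pg_trgm's real extraction rules differ (for example in how it pads and splits words), and the function names and sample rows here are made up for illustration.

```python
from collections import defaultdict


def trigrams(text):
    """All overlapping 3-character substrings of the lowercased text (no padding)."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_trigram_index(rows):
    """Map each trigram to the set of row ids that contain it."""
    index = defaultdict(set)
    for row_id, text in enumerate(rows):
        for gram in trigrams(text):
            index[gram].add(row_id)
    return index


def substring_search(rows, index, needle):
    """Case-insensitive substring search: filter by trigrams, then recheck candidates."""
    grams = trigrams(needle)
    if not grams:
        # Needles shorter than three characters yield no trigrams: fall back to a scan.
        candidates = set(range(len(rows)))
    else:
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # The trigram filter can produce false positives, so every candidate is rechecked.
    return sorted(i for i in candidates if needle.lower() in rows[i].lower())


rows = ["Foobario 451", "foo bar", "barfoo 4", "unrelated"]
index = build_trigram_index(rows)

print(substring_search(rows, index, "bar"))    # rows containing "bar"
print(substring_search(rows, index, "Foo 4"))  # rows containing "foo 4" as one substring
```

No stemming or dictionary is involved, which is why this style of index behaves predictably for identifiers and product names where tsvector would normalize the tokens.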
One piece of information that would be nice here is a comparison of the index size on disk for both index types.
"Foobario 451" with the string "Foo 4": is this too much complexity for trigrams? Would Biscuit work for this?