PGVector's Missing Features — Trieve

The Nugget

  • PGVector keeps infrastructure simple, but it lacks features that matter for search quality: handling of required and negated terms, explainable results, fast filtering and ordering, and support for multiple search modes.

Make it stick

  • 🔍 Negated words can be managed in queries, but Postgres lacks efficient indexing for this.
  • 🛠️ Explainability is essential; search results need highlights so users can see which terms matched and correct ineffective queries themselves.
  • Filtering and ordering in PGVector are slow; dedicated solutions like Trieve handle them more efficiently.
  • 📊 Semantic versus full-text search: Users expect keyword functionality, and its absence can lead to frustration.
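
One way to picture the negated-word point is a post-filter applied to the candidates a dense vector search returns. The sketch below is only an illustration under stated assumptions: the `+word` / `-word` query syntax and the function names `parse_query` and `filter_candidates` are hypothetical and are not part of PGVector, Postgres, or Trieve.

```python
import re


def parse_query(query):
    """Split a query into free text, required terms (+word), and negated terms (-word).

    The +/- prefix syntax is an assumption made for this sketch.
    """
    required, negated, free = [], [], []
    for token in query.split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:].lower())
        elif token.startswith("-") and len(token) > 1:
            negated.append(token[1:].lower())
        else:
            free.append(token)
    return " ".join(free), required, negated


def filter_candidates(candidates, required, negated):
    """Keep candidates that contain every required word and no negated word."""
    kept = []
    for text in candidates:
        words = set(re.findall(r"\w+", text.lower()))
        has_required = all(term in words for term in required)
        has_negated = any(term in words for term in negated)
        if has_required and not has_negated:
            kept.append(text)
    return kept


# The free text would go to the vector search; the term lists drive the filter.
free_text, required, negated = parse_query("Rob praises the +team -productivity")
candidates = [
    "Rob thanked the team for shipping on time.",
    "Rob praised the team for its productivity.",
]
print(filter_candidates(candidates, required, negated))
```

Filtering after retrieval is simple, but it can leave fewer results than requested when many candidates are discarded, which is one reason index-level support for these terms matters.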

Key insights

Key Features Required in Search Solutions

  1. Support for Required and Negated Words: Dense vector search in Postgres cannot reliably enforce that a term must or must not appear in the results. Careful query formulation can mitigate this, but it demands considerable knowledge from the user.
  2. Explainability with Highlights: Users benefit from highlighted keyword snippets, enabling them to identify and rectify poor queries. This is a crucial user experience aspect that PGVector lacks.
  3. Performance on Filters and Ordering: Filtered and ordered queries in PGVector can be slow and delay responses, while dedicated search solutions like Trieve handle them efficiently.
  4. Support for Sparse Vectors and Various Search Modes: PGVector limits users to semantic search only, creating a mismatch for users accustomed to keyword searches. Offering full-text search alongside semantic search can improve adoption and user satisfaction.
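
The highlighting idea in item 2 can be sketched in a few lines: wrap each query term that appears in a result so the user can see why it matched. This is a minimal illustration, not how Trieve or Postgres implements highlights; the function names `highlight` and `matched_terms` and the `<mark>` markers are assumptions chosen for the example.

```python
import re


def highlight(text, terms, before="<mark>", after="</mark>"):
    """Wrap whole-word, case-insensitive matches of any term in marker strings."""
    terms = [t for t in terms if t]  # ignore empty terms
    if not terms:
        return text
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: before + m.group(0) + after, text)


def matched_terms(text, terms):
    """Return the query terms that occur in the text, in query order."""
    words = set(re.findall(r"\w+", text.lower()))
    return [t for t in terms if t.lower() in words]


result = "Rob praised the team."
query_terms = ["team", "productivity"]
print(highlight(result, query_terms))
print(matched_terms(result, query_terms))
```

A result with no highlighted terms tells the user that the match was purely semantic, which is the cue they need to rephrase the query.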

Evaluating PGVector

  • Simplicity vs. Features: While PGVector is appreciated for its straightforward infrastructure, the lack of key functionality can hinder performance in practical applications.
  • Dedicated Solutions Recommended: Users who need advanced search capabilities are advised to choose a solution like Trieve, which addresses these missing features.

Key quotes

  • "Dense vector search is not perfect and will fail on queries like emails where Rob praises the team and does not mention productivity."
  • "Postgres pgvector does not come up with a way to highlight the keyword snippets being matched on."
  • "Its maintainers are working on this as you can see in this currently 83 comment long issue on Github."
  • "Defaulting to fulltext and adding semantic as an option gives your users the time to figure out how best to use the new dense vector semantic search workflow."
  • "Our goal with this post is to inform readers about the potential downside and pitfalls of PGVector."
This summary contains AI-generated information and may have important inaccuracies or omissions.