Architecture
Beyond the Vector DB
Elena Ro
Aug 30, 2025
Vector databases changed the industry, but similarity search alone lacks semantic nuance. We built a custom retrieval layer that combines sparse and dense retrieval with cross-encoder re-ranking and an entity graph, keeping the context window small and relevant enough to avoid the "lost in the middle" phenomenon.
The Problem with Cosine Similarity
"Apple" the fruit and "Apple" the company have high cosine similarity in generic embeddings. In enterprise search, this ambiguity is fatal. Relying solely on vectors leads to "vibes-based retrieval," where results feel right but are factually wrong. Standard embeddings struggle to capture exact keyword matches or specific entity relationships defined in corporate taxonomies.
Hybrid Search Architecture
We implemented a multi-stage retrieval pipeline. Stage 1 casts a wide net: BM25 for keyword precision plus dense embeddings for semantic reach. Stage 2 is the secret sauce: a cross-encoder that scores each query-document pair jointly rather than comparing precomputed vectors, re-ranking candidates on actual relevance instead of raw vector proximity. This keeps the context window filled with dense, highly relevant information.
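Below is a minimal sketch of such a two-stage pipeline, assuming rank_bm25 and sentence-transformers are available. The model names, fusion weight, and three-document corpus are illustrative placeholders, not our production configuration.

```python
# Two-stage hybrid retrieval sketch: BM25 + dense fusion, then cross-encoder re-ranking.
# Requires: pip install rank_bm25 sentence-transformers numpy
import numpy as np
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, CrossEncoder

documents = [
    "Apple reported record quarterly revenue driven by iPhone sales.",
    "Apple trees need full sun and well-drained soil to bear fruit.",
    "The subsidiary is located in Cupertino and reports to the CEO.",
]

# Stage 1a: sparse index (BM25) for keyword precision.
tokenized = [doc.lower().split() for doc in documents]
bm25 = BM25Okapi(tokenized)

# Stage 1b: dense embeddings for semantic reach.
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative model choice
doc_vecs = encoder.encode(documents, normalize_embeddings=True)

def stage1_candidates(query: str, alpha: float = 0.5, top_k: int = 10) -> list[int]:
    """Blend normalized BM25 and cosine scores to cast a wide net."""
    sparse = bm25.get_scores(query.lower().split())
    sparse = sparse / (sparse.max() + 1e-9)
    q_vec = encoder.encode([query], normalize_embeddings=True)[0]
    dense = doc_vecs @ q_vec  # cosine similarity, since vectors are normalized
    fused = alpha * sparse + (1 - alpha) * dense
    return list(np.argsort(fused)[::-1][:top_k])

# Stage 2: cross-encoder scores each (query, document) pair jointly.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # illustrative model choice

def retrieve(query: str, top_k: int = 3) -> list[str]:
    candidates = stage1_candidates(query)
    pairs = [(query, documents[i]) for i in candidates]
    scores = reranker.predict(pairs)
    ranked = [documents[i] for _, i in sorted(zip(scores, candidates), reverse=True)]
    return ranked[:top_k]

print(retrieve("Apple earnings report"))
```

The design choice that makes this tractable is that the expensive cross-encoder only ever sees the small candidate set from Stage 1, so re-ranking cost stays bounded regardless of corpus size.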
We also introduced a graph layer that maps entities before vectorization, allowing the system to understand relationships (CEO of, Located in, Subsidiary of) that vectors often miss entirely.
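Here is a sketch of what that graph layer might look like, using networkx. The entities and relations are illustrative, and the extraction step (NER plus linking against the corporate taxonomy) is stubbed out rather than shown.

```python
# Minimal entity-graph sketch: relations are stored as typed edges,
# and neighborhoods are used to tag chunks before vectorization.
# Entities and relations below are illustrative examples.
import networkx as nx

graph = nx.DiGraph()
graph.add_edge("Tim Cook", "Apple Inc.", relation="CEO of")
graph.add_edge("Apple Inc.", "Cupertino", relation="Located in")
graph.add_edge("Beats Electronics", "Apple Inc.", relation="Subsidiary of")

def related_entities(entity: str, hops: int = 1) -> set[str]:
    """Entities reachable within `hops` edges, ignoring edge direction."""
    if entity not in graph:
        return set()
    undirected = graph.to_undirected(as_view=True)
    reachable = nx.single_source_shortest_path_length(undirected, entity, cutoff=hops)
    return set(reachable) - {entity}

# Each chunk is tagged with the entities it mentions before embedding;
# at query time these tags boost chunks linked to the query's entities.
print(related_entities("Apple Inc."))  # {'Tim Cook', 'Cupertino', 'Beats Electronics'}
```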
Conclusion
Retrieval is the bottleneck of intelligence. By moving beyond simple vector similarity to a hybrid, graph-aware approach, we ensure that the model reasons over the right data, every time.


