The Ontological Index

As Large Language Models (LLMs) such as those behind ChatGPT continue to advance, vector-based search is rapidly moving from an auxiliary feature to a core capability in many platforms. Vector search is now found not only in specialised stores like Pinecone and Weaviate but also in search platforms such as Elasticsearch and databases like MongoDB. Notably, both Elasticsearch and MongoDB use an algorithm called Hierarchical Navigable Small World (HNSW) to deliver efficient vector search. HNSW is a graph-based algorithm: its power lies in its ability to index continuous embedding vectors as a discrete, layered graph.
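
To make the "layered graph" idea concrete, here is a minimal sketch of the descent that HNSW performs: start at the sparse top layer, walk greedily towards the query, then drop down a layer and repeat. It is a toy, not the real algorithm. The layers are built by brute force rather than by HNSW's incremental insertion, and the names `build_layers`, `greedy_search`, `k` and `m` are illustrative assumptions. The level formula (an exponentially decaying distribution scaled by 1/ln(m)) follows the one described in the HNSW paper.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def random_level(m_l, rng):
    """Draw a top layer for a point; higher layers are exponentially rarer."""
    return int(-math.log(1.0 - rng.random()) * m_l)

def build_layers(points, k=3, m=4, seed=0):
    """Brute-force toy index: layers[0] holds every point, upper layers a sparser sample.

    Assumption: each point is simply linked to its k nearest neighbours in a layer.
    Real HNSW builds these links incrementally and prunes them heuristically.
    """
    rng = random.Random(seed)
    m_l = 1.0 / math.log(m)
    levels = [random_level(m_l, rng) for _ in points]
    layers = []
    for layer in range(max(levels) + 1):
        members = [i for i, lvl in enumerate(levels) if lvl >= layer]
        graph = {}
        for i in members:
            others = sorted((j for j in members if j != i),
                            key=lambda j: dist(points[i], points[j]))
            graph[i] = others[:k]
        layers.append(graph)
    return layers

def greedy_search(points, layers, query):
    """Descend from the top layer, moving to any neighbour that is closer to the query."""
    entry = next(iter(layers[-1]))
    for graph in reversed(layers):
        improved = True
        while improved:
            improved = False
            for neighbour in graph[entry]:
                if dist(points[neighbour], query) < dist(points[entry], query):
                    entry = neighbour
                    improved = True
    return entry  # index of an approximate nearest neighbour

rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(50)]
layers = build_layers(points)
nearest = greedy_search(points, layers, (0.5, 0.5))
```

Because the walk is greedy, the result is approximate: it can stop at a local minimum rather than the true nearest neighbour. Production implementations reduce that risk by keeping a list of candidates at each layer instead of a single one.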

🔵 Discrete and Continuous Semantics
Fuzzy matching has traditionally been combined with discrete filters. In search, this is referred to as 'faceting' (think of searching for 'shiny black shoes' on eBay and then selecting a specific brand from a dropdown menu). This hybrid approach has proven effective and is being widely adopted for vector search as well. For example, one might first restrict documents by geographical origin or timeframe and then use vector-based search to gauge sentiment only within that subset.
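
The filter-then-rank pattern described above can be sketched in a few lines. This is an illustration under stated assumptions rather than how any particular platform implements it: the documents, the `country` and `year` facets, the two-dimensional vectors and the `filtered_search` function are all made up for the example, and the ranking is a brute-force cosine similarity rather than an HNSW lookup.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical documents: discrete facets plus a toy embedding.
docs = [
    {"id": "d1", "country": "UK", "year": 2023, "vec": (0.9, 0.1)},
    {"id": "d2", "country": "UK", "year": 2023, "vec": (0.2, 0.8)},
    {"id": "d3", "country": "FR", "year": 2023, "vec": (1.0, 0.0)},
    {"id": "d4", "country": "UK", "year": 2021, "vec": (0.8, 0.2)},
]

def filtered_search(docs, query_vec, top_k=2, **facets):
    """Apply the discrete facet filter first, then rank the survivors by similarity."""
    subset = [d for d in docs if all(d.get(k) == v for k, v in facets.items())]
    ranked = sorted(subset, key=lambda d: cosine(d["vec"], query_vec), reverse=True)
    return [d["id"] for d in ranked[:top_k]]

# Assumed query vector standing in for, say, a "positive sentiment" direction.
query = (1.0, 0.0)
print(filtered_search(docs, query, country="UK", year=2023))  # ['d1', 'd2']
print(filtered_search(docs, query))                           # ['d3', 'd1']
```

Note how the unfiltered query ranks `d3` first, while the faceted query never considers it: the discrete filter defines the candidate set and the vectors only order what remains.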

🔵 A Graph-Based Revolution
Traditional filtering is typically based on tabular (rows in a database) or tree-like (JSON documents) data formats. The landscape changes significantly when the data itself is structured as a graph. When employing HNSW in a graph-based setup, both continuous vectors and discrete facets become vertices in the same graph. This allows for more nuanced relationships and more efficient alignment between the two. Furthermore, the upper layers of HNSW represent a form of compression: each is a sparser summary of the layer below it. With your data in a graph, you can move beyond the classic HNSW node-degree compression algorithms and consider more semantic forms of compression that take domain-specific ontologies into account. This could prove very powerful.
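
As a rough sketch of what "facets and vectors in the same graph" might look like, the example below keeps documents (with embeddings) and ontology concepts as vertices of one structure. A query names a concept, the traversal follows broader-to-narrower edges to collect every document beneath it, and the vectors then rank that set. Everything here is an assumption made for illustration: the `FacetGraph` class, the concept names and the toy vectors are hypothetical, and the HNSW neighbour edges between documents are left out to keep the example short.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class FacetGraph:
    """Toy graph with document vertices and concept (facet) vertices."""

    def __init__(self):
        self.vectors = {}                 # document vertex -> embedding
        self.tagged = defaultdict(set)    # concept vertex -> documents tagged with it
        self.narrower = defaultdict(set)  # concept vertex -> narrower concepts

    def add_document(self, doc, vec, concepts):
        self.vectors[doc] = vec
        for concept in concepts:
            self.tagged[concept].add(doc)

    def add_narrower(self, broad, narrow):
        self.narrower[broad].add(narrow)

    def documents_under(self, concept):
        """Collect documents tagged with the concept or any narrower concept."""
        seen, stack, found = set(), [concept], set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            found |= self.tagged.get(current, set())
            stack.extend(self.narrower.get(current, ()))
        return found

    def search(self, concept, query_vec, top_k=3):
        """Use the ontology to pick candidates, then rank them by vector similarity."""
        candidates = self.documents_under(concept)
        ranked = sorted(candidates,
                        key=lambda d: cosine(self.vectors[d], query_vec),
                        reverse=True)
        return ranked[:top_k]

graph = FacetGraph()
graph.add_narrower("Footwear", "Shoes")
graph.add_narrower("Footwear", "Boots")
graph.add_document("shoe1", (0.9, 0.1), ["Shoes"])
graph.add_document("boot1", (0.7, 0.3), ["Boots"])
graph.add_document("hat1", (1.0, 0.0), ["Headwear"])

print(graph.search("Footwear", (1.0, 0.0)))  # ['shoe1', 'boot1']
print(graph.search("Shoes", (1.0, 0.0)))     # ['shoe1']
```

The point of the design is that the filter is no longer a flat attribute match: asking for "Footwear" reaches documents tagged only as "Shoes" or "Boots" because the ontology is part of the same graph that holds the vectors.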

🔵 Key Takeaways for Organisations
I posit that transitioning to graph-based data structures is the next logical step in the evolution of search and knowledge representation. Therefore, my advice to organisations looking to stay ahead in the data management and analytics game is to transition as much of their core data into a graph structure as quickly as possible.

⭕ Semantic Compression: https://www.knowledge-graph-guys.com/blog/semantic-compression

⭕ The Semantic Router: https://www.knowledge-graph-guys.com/blog/the-semantic-router
