An alternative approach to calculating SKG/relatedness over high-cardinality domains and fields (e.g., for sorting by “relatedness”) yields up to a 60x latency reduction for common high-cardinality queries. Adding a facet count cache yields up to a 450x latency reduction. These modifications should facilitate production deployment of sort-by-relatedness faceting in high-cardinality contexts.
The ability to execute complete graph queries over arbitrarily complex graphs of indexed tokens opens a number of new possibilities.
An overview of a candidate implementation that provides complete, performant, configurable graph query support over indexed token graphs in Lucene.
SpanNearQuery is arguably the essential component of graph queries in Lucene. Over time, enhancements to various Lucene components have increasingly invalidated some fundamental assumptions in the SpanNearQuery graph query implementation, leading to buggy and/or unpredictable query behavior.