Adjusts Seeded knn searches to clean up user and internal interfaces #14170
LGTM
this.delegate = knnQuery;
this.seedWeight = seedWeight;
minor/subjective: init order to match member declaration order
Suggested change:
- this.delegate = knnQuery;
- this.seedWeight = seedWeight;
+ this.seedWeight = seedWeight;
+ this.delegate = knnQuery;
/**
 * Iterator of valid entry points for the kNN search
 *
 * @return DocIdSetIterator of entry points, default is empty iterator
Suggested change:
-  * @return DocIdSetIterator of entry points, default is empty iterator
+  * @return DocIdSetIterator of entry points
/**
 * Number of valid entry points for the kNN search
 *
 * @return number of entry points, default is 0
Suggested change:
-  * @return number of entry points, default is 0
+  * @return number of entry points
private final int[] seedOrds;

static SeededHnswGraphSearcher fromEntryPoints(
    AbstractHnswGraphSearcher delegate, int numEps, DocIdSetIterator eps, HnswGraph graph)
maybe
Suggested change:
-     AbstractHnswGraphSearcher delegate, int numEps, DocIdSetIterator eps, HnswGraph graph)
+     AbstractHnswGraphSearcher delegate, int numEps, DocIdSetIterator eps, int graphSize)
 * @param parentBitSet The leaf parent bitset
 * @param searchStrategy The search strategy to use
Suggested change:
-  * @param parentBitSet The leaf parent bitset
-  * @param searchStrategy The search strategy to use
+  * @param searchStrategy The search strategy to use
+  * @param parentBitSet The leaf parent bitset
This is a bugfix and refactor for seeded knn searches.
First, since we are using collectors, we don't actually need unique queries for every input type. Consequently, I have collapsed the two individual seeded queries into a single query that delegates to a provided kNN query. The collector manager is then simply wrapped, so that the entry points can be provided.
Second, the interactions in the HNSW graph were not clear. Consequently, I did a minor refactor of the HNSW searcher to have a "SeededSearcher", which provides the entry points directly instead of searching the graph for them.
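To illustrate the second point, here is a minimal sketch of a searcher whose entry-point discovery can be overridden by a seeded variant. It does not use Lucene's classes: `GraphSearcher`, `SeededGraphSearcher`, `findEntryPoints`, and the toy one-dimensional graph are all hypothetical names made up for this example, and the greedy walk is only a stand-in for the real HNSW base-layer search.

```java
final class SeededSearcherSketch {

  /** Hypothetical toy searcher: greedy walk over a single-layer graph of 1-D values. */
  static class GraphSearcher {
    final float[] values;     // toy "vectors", one float per node
    final int[][] neighbors;  // adjacency list per node

    GraphSearcher(float[] values, int[][] neighbors) {
      this.values = values;
      this.neighbors = neighbors;
    }

    /** Stand-in for descending the upper layers to find where the base-layer search starts. */
    int[] findEntryPoints(float query) {
      return new int[] {0};
    }

    float distance(int node, float query) {
      return Math.abs(values[node] - query);
    }

    /** Walks greedily from every entry point and returns the closest node reached. */
    int search(float query) {
      int best = -1;
      for (int entryPoint : findEntryPoints(query)) {
        int current = entryPoint;
        while (true) {
          int next = current;
          for (int n : neighbors[current]) {
            if (distance(n, query) < distance(next, query)) {
              next = n;
            }
          }
          if (next == current) {
            break; // no neighbor is closer: local minimum
          }
          current = next;
        }
        if (best == -1 || distance(current, query) < distance(best, query)) {
          best = current;
        }
      }
      return best;
    }
  }

  /** Seeded variant: skips entry-point discovery and starts from caller-supplied ordinals. */
  static final class SeededGraphSearcher extends GraphSearcher {
    private final int[] seedOrds;

    SeededGraphSearcher(float[] values, int[][] neighbors, int[] seedOrds) {
      super(values, neighbors);
      this.seedOrds = seedOrds.clone();
    }

    @Override
    int[] findEntryPoints(float query) {
      return seedOrds;
    }
  }

  // Toy data: two disconnected chains, 0-1-2 and 3-4-5.
  static final float[] VALUES = {0f, 1f, 2f, 10f, 11f, 12f};
  static final int[][] NEIGHBORS = {{1}, {0, 2}, {1}, {4}, {3, 5}, {4}};
}
```

In this toy setup the unseeded searcher starts at node 0 and can never leave the first chain, while a searcher seeded with node 3 starts in the second chain. The only difference between the two is where the entry points come from, which is the separation the refactor is after.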
Third, instead of continually overloading collectors, I opted to add a new "searchstrategy" value to KnnCollector. This way various strategies can be executed with different options. I think Seeded could eventually be replaced with something.
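The first and third points can be sketched together: a wrapper leaves the delegate collector untouched and only swaps in a strategy object that carries the entry points, so the searcher inspects the strategy rather than needing a collector subclass per search mode. Everything below is a hypothetical illustration, not the API introduced by this PR: `Collector`, `SearchStrategy`, `SeededStrategy`, `SeededCollector`, and `entryPointsFor` are assumed names.

```java
final class SearchStrategySketch {

  /** Hypothetical marker for "how should this search be run". */
  interface SearchStrategy {}

  /** Default behavior: the searcher finds its own entry points. */
  static final class DefaultStrategy implements SearchStrategy {}

  /** Seeded behavior: the strategy carries the entry points. */
  static final class SeededStrategy implements SearchStrategy {
    final int[] entryPoints;

    SeededStrategy(int[] entryPoints) {
      this.entryPoints = entryPoints.clone();
    }
  }

  /** Hypothetical collector that exposes a strategy instead of being subclassed per mode. */
  interface Collector {
    int k();

    SearchStrategy strategy();
  }

  static final class TopKCollector implements Collector {
    private final int k;

    TopKCollector(int k) {
      this.k = k;
    }

    @Override
    public int k() {
      return k;
    }

    @Override
    public SearchStrategy strategy() {
      return new DefaultStrategy();
    }
  }

  /** Wraps any collector, delegating everything except the strategy. */
  static final class SeededCollector implements Collector {
    private final Collector delegate;
    private final SeededStrategy strategy;

    SeededCollector(Collector delegate, int[] entryPoints) {
      this.delegate = delegate;
      this.strategy = new SeededStrategy(entryPoints);
    }

    @Override
    public int k() {
      return delegate.k();
    }

    @Override
    public SearchStrategy strategy() {
      return strategy;
    }
  }

  /** What a searcher might do: read the strategy to decide where to start. */
  static int[] entryPointsFor(Collector collector) {
    SearchStrategy s = collector.strategy();
    if (s instanceof SeededStrategy) {
      return ((SeededStrategy) s).entryPoints;
    }
    return new int[0]; // empty means "discover entry points the usual way"
  }
}
```

Under this shape, adding another way to run a search means adding another `SearchStrategy` implementation, with no new collector overloads and no new query type.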