miojizzy

Modify

  1. Modify the function getNeighborsByHeuristic2.
    Popping all items from top_candidates one by one already yields an ordered sequence (farthest first). If queue_closest exists only to fetch the nearest remaining candidate, traversing this sequence in reverse order does the same job, so there is no need to build another priority queue with negated distances. Popping every item is necessary anyway, because the results are put back into top_candidates to be returned. The dist field in return_list is also removed, since it is not used.
  2. Add a simple benchmark test in /benchmark/cpp/. It builds two binaries, one using the local hnswlib and the other using hnswlib from the origin master branch; running them separately gives a performance comparison.
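
For readers who want to see the idea in item 1 in isolation, here is a minimal standalone sketch. It is not the code from this PR or from hnswlib: `selectNeighborsHeuristic`, `pointDistance`, and the 1-D `points` vector are hypothetical stand-ins for the index internals, and candidates are compared as plain pairs. It drains the max-heap into a vector, walks that vector in reverse to visit candidates nearest first, and stores only positions for the kept items.

```cpp
#include <cmath>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// (distance to query, point id). Default ordering makes top() the farthest.
using Candidate = std::pair<float, int>;
using MaxHeap = std::priority_queue<Candidate>;

// Hypothetical stand-in for the index's distance function (1-D points).
inline float pointDistance(const std::vector<float>& points, int a, int b) {
    return std::fabs(points[a] - points[b]);
}

// Keeps at most M candidates, each closer to the query than to any
// already-kept candidate, and writes the survivors back into top_candidates.
void selectNeighborsHeuristic(MaxHeap& top_candidates, std::size_t M,
                              const std::vector<float>& points) {
    if (top_candidates.size() < M) return;

    // Popping a max-heap yields candidates ordered from farthest to nearest.
    std::vector<Candidate> ordered;
    ordered.reserve(top_candidates.size());
    while (!top_candidates.empty()) {
        ordered.push_back(top_candidates.top());
        top_candidates.pop();
    }

    // Reverse traversal visits the nearest candidate first, so no second
    // priority queue with negated distances is needed.
    std::vector<std::size_t> kept;  // positions in `ordered`, no distances
    for (std::size_t i = ordered.size(); i-- > 0;) {
        if (kept.size() >= M) break;
        bool good = true;
        for (std::size_t k : kept) {
            const float d = pointDistance(points, ordered[k].second,
                                          ordered[i].second);
            if (d < ordered[i].first) {  // closer to a kept point than to the query
                good = false;
                break;
            }
        }
        if (good) kept.push_back(i);
    }

    // Put the survivors back so the caller sees them in top_candidates.
    for (std::size_t k : kept) top_candidates.push(ordered[k]);
}
```

With points at 1.0, 1.2, -1.5 and 4.0, a query at 0 and M = 2, the sketch keeps ids 0 and 2: id 1 is dropped because it lies closer to id 0 than to the query.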

Benefit

My performance test shows a 5%-10% reduction in index building time.
Because most of the time is spent on distance calculation, the larger the dataset, the smaller the relative time benefit.

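| Dataset size | Avg time (ns) | Git commit     | Relative time (%) |
|--------------|---------------|----------------|-------------------|
| 500          | 30833859      | b125ce8        | 88.2              |
| 500          | 34959886      | origin develop | 100               |
| 5000         | 912679244     | b125ce8        | 94.4              |
| 5000         | 966823741     | origin develop | 100               |
| 50000        | 16780000000   | b125ce8        | 95.7              |
| 50000        | 17518000000   | origin develop | 100               |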
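
As a rough illustration of how figures like these can be produced, the sketch below averages wall-clock time over several runs and expresses one build's time as a percentage of the baseline. It is an assumption about the method, not the benchmark added in /benchmark/cpp/; `averageNanos` and `relativePercent` are hypothetical helper names, and `runs` is assumed to be positive.

```cpp
#include <chrono>
#include <cstdint>

// Runs `work` `runs` times and returns the mean wall-clock time in nanoseconds.
template <typename Fn>
std::int64_t averageNanos(Fn&& work, int runs) {
    using Clock = std::chrono::steady_clock;
    std::int64_t total = 0;
    for (int i = 0; i < runs; ++i) {
        const auto start = Clock::now();
        work();
        const auto stop = Clock::now();
        total += std::chrono::duration_cast<std::chrono::nanoseconds>(
                     stop - start).count();
    }
    return total / runs;
}

// Candidate time as a percentage of the baseline time (baseline = 100).
double relativePercent(std::int64_t candidate_ns, std::int64_t baseline_ns) {
    return 100.0 * static_cast<double>(candidate_ns) /
           static_cast<double>(baseline_ns);
}
```

For the 500-item rows, `relativePercent(30833859, 34959886)` gives roughly 88.2, which matches the reported percentage.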

@jianshu93
Contributor

any data on the improvement?

@miojizzy
Author

any data on the improvement?

No actual data yet; it is still under testing and will be released after a while.
