Use k-d tree to optimize performance in visual function #19

TimoSci · 2019-12-15T23:03:14Z

The visual function, which finds the winner neuron for an arbitrary "specimen" input, used a brute-force search to find the nearest neighbor among the neuron codes.

A k-d tree search is usually faster than a brute force search, especially when doing a large number of queries over a large number of neurons.

The disadvantage of a k-d tree is that the initial construction is costly; once the tree is built, search queries are fast. That is why we can't use the k-d tree in the training algorithm: the neuron codes change at every training step, so the tree would have to be rebuilt each time.

But for functions that operate on an already-trained SOM, like mapToSOM, a k-d tree can be advantageous in most practical use cases.
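To make the trade-off concrete, here is a minimal sketch of both search strategies. It is written in plain Python rather than Julia so that it stays dependency-free; it is not the code from this PR, and all names (`brute_force_winner`, `build_kdtree`, `kdtree_winner`) are hypothetical. The tree is built once from fixed codes and then reused for every query, which is the situation after training.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def brute_force_winner(codes, x):
    """Index of the code closest to x, checking every code."""
    return min(range(len(codes)), key=lambda i: sq_dist(codes[i], x))

def build_kdtree(codes, idxs=None, depth=0):
    """Build a k-d tree over code indices; a node is (index, axis, left, right)."""
    if idxs is None:
        idxs = list(range(len(codes)))
    if not idxs:
        return None
    axis = depth % len(codes[0])          # cycle through the dimensions
    idxs = sorted(idxs, key=lambda i: codes[i][axis])
    mid = len(idxs) // 2                  # median splits the points in half
    return (idxs[mid], axis,
            build_kdtree(codes, idxs[:mid], depth + 1),
            build_kdtree(codes, idxs[mid + 1:], depth + 1))

def kdtree_winner(codes, node, x, best=None):
    """Return (index, squared distance) of the code nearest to x."""
    if node is None:
        return best
    i, axis, left, right = node
    d = sq_dist(codes[i], x)
    if best is None or d < best[1]:
        best = (i, d)
    diff = x[axis] - codes[i][axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = kdtree_winner(codes, near, x, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if diff * diff < best[1]:
        best = kdtree_winner(codes, far, x, best)
    return best

random.seed(0)
codes = [[random.random() for _ in range(3)] for _ in range(200)]    # "neuron codes"
queries = [[random.random() for _ in range(3)] for _ in range(50)]   # "specimens"
tree = build_kdtree(codes)  # costly step, done once for a trained map

# Both searches should find a winner at the same (minimal) distance.
same = all(
    kdtree_winner(codes, tree, q)[1] == sq_dist(codes[brute_force_winner(codes, q)], q)
    for q in queries
)
print(same)
```

The pruning test in `kdtree_winner` is where the speed-up comes from: whole subtrees are skipped when they cannot contain a closer code. If the codes changed, as they do during training, `build_kdtree` would have to run again before the next query.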

Example of a benchmark I have done

I have included an optimized version of the visual function. The old function is renamed visualGeneric, because it relies on multiple dispatch in findWinner to choose the search algorithm.

A test is also included to check that visual and visualGeneric return the same result.

PS. Caching the k-d tree in the SOM struct might also be a good idea.
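One way the caching idea could look is sketched below, again in plain Python and under loud assumptions: `TrainedMap`, `index`, `update_codes` and the `build_index` callback are hypothetical names, not part of the package, and the index builder used in the example is a trivial stand-in for a real k-d tree constructor. The point is only the pattern: build lazily on first use, reuse afterwards, and drop the cached index whenever the codes change.

```python
class TrainedMap:
    """Hypothetical stand-in for a SOM object that caches its search index."""

    def __init__(self, codes, build_index):
        self.codes = codes
        self._build_index = build_index  # e.g. a k-d tree constructor
        self._index = None               # nothing cached yet
        self.builds = 0                  # counts how often the index was built

    def index(self):
        """Return the cached index, building it on first use."""
        if self._index is None:
            self._index = self._build_index(self.codes)
            self.builds += 1
        return self._index

    def update_codes(self, codes):
        """Replace the codes (e.g. after more training) and invalidate the cache."""
        self.codes = codes
        self._index = None

# Stand-in builder: a real one would construct a k-d tree from the codes.
som = TrainedMap([[0.0, 0.0], [1.0, 1.0]], build_index=lambda c: tuple(map(tuple, c)))
som.index()
som.index()                      # second call reuses the cached index
builds_before = som.builds
som.update_codes([[2.0, 2.0]])   # cache is dropped here
som.index()                      # index is rebuilt once
builds_after = som.builds
print(builds_before, builds_after)
```

The invalidation step matters: a cached tree that outlives a change to the codes would silently return winners for the old map.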

andreasdominik · 2019-12-16T08:29:44Z

Indeed, it's a good idea to optimise the NN search (I ignored it because even the brute-force search is not the time-limiting step for SOMs with some 100s or 1000s of neurons).
However, I'll do some tests and include it in the next release.
(NearestNeighbors.jl is actively maintained; the additional dependency should be no disadvantage.)

TimoSci added 14 commits December 11, 2019 16:45

add benchmarking using pseudorandom data frames

9a2a5ac

add kd-tree nearest neighbor search for faster performance in some cases

2dab459

kd-tree not well suited for training because it needs to be reconstructed at every step

94d0bf6

performance optimization for winner neuron finder function

50c4dd2

remove distance monitoring used for testing

46adf90

clean up

08e012c

refactor using pipe

3de2aee

bring style in line

32a3250

remove file used for testing changes

0237dd5

dataset generator for benchmarking

014c600

refactor and dry up using multiple dispatch

8ab7b4d

remove benchmarking from contributions branch

ca4291b

more descriptive name for visual functions

767e19d

add test to check if visual returns same result as visualGeneric

6eb4115

TimoSci mentioned this pull request Dec 15, 2019

Use k-d tree to optimize performance in visual function #18

Closed
