-
Notifications
You must be signed in to change notification settings - Fork 2
giannis_2_28_2017
-
k-dimensional trees create a binary search tree of hyperplanes.It partitions the k-dimensional space by selecting a different axis at every node. Axis selection is a modulo operation between the tree's depth and number of dimensions. The median value from the elected axis is used to perform a split.
- Construction: O(knlogn) -> nlogn to presort as every dimension and for every dimension.
- Insert: O(logn)
- Remove: O(logn)
- Query: (m + n^(1-1/k))
-
A ball tree is a variant off-d trees. Their difference lies in the way data points is partitioned. Instead of hypercubes, ball trees use hypersphere.
In both cases, the tree can be queried to report the neighbors of a points based on a radius. When all points are queried an adjacency list is reported, which can be used to create a graph.
-
Until now I have not been able to find a way to use 3D meshes to find the leaflets.
-
I/O: Most papers that I found try to exploit as much as possible the local hard drives of the nodes. Klein et al. [1] at a letter in Oxford Bioinformatics, propose a new file format, called SFile. This file is used through HDFS to increase the read rate. Based on the letter, using this filetype they were able to sustain an 6 ~ 8GB/s when 128 HDDs were used.
[1]Max Klein, Rati Sharma, Chris H. Bohrer, Cameron M. Avelis, Elijah Roberts; Biospark: scalable analysis of large numerical datasets from biological simulations and experiments using Hadoop and Spark. Bioinformatics 2017; 33 (2): 303-305. doi: 10.1093/bioinformatics/btw614