Skip to content

Improve convex hull check#139

Open
mhungen wants to merge 4 commits intosenbox-org:masterfrom
mhungen:improve_convex_hull_check
Open

Improve convex hull check#139
mhungen wants to merge 4 commits intosenbox-org:masterfrom
mhungen:improve_convex_hull_check

Conversation

@mhungen
Copy link
Copy Markdown

@mhungen mhungen commented Nov 20, 2024

The current implementation of the convex hull check is inefficient due to the incorrect use of HashMap.
Instead of checking if there is a key O(1) in the set, it checks if the map contains a value O(n).
I think this is an implementation error as previous commits mention a HashSet, which makes sense to me.

/*
 * Second check : be sure input is within the approximated convex hull (see ATBD)
 */

if (bandMinMax != null && definitionGridMap != null) {
    int [] gridProjection = new int[definitionGridSize];
    String gridProjString="";
    for (int i = 0; i < gridProjection.length; i++) {
        double bandMin = bandMinMax[0][i];
        double bandMax = bandMinMax[1][i];
        gridProjection[i] = (int)Math.floor(10 * (input[i] - bandMin) / (bandMax - bandMin) + 1);
        gridProjString+=String.valueOf(gridProjection[i]);
    }
    boolean insideDefinitionDomain = false;
    if(definitionGridMap.containsValue(gridProjString)){  // << here is the problem
        insideDefinitionDomain = true;
    }
    if (!insideDefinitionDomain) {
        setInputOutOfRange(result);
        return;
    }

}

I'm also not a fan of converting each domain definition point to a string representation, which will be hashed again to get an int representation. i suggest combining the int array into a single int and using that as a hash instead.

I also added a comment describing how the check works, because it took me a while to figure out that DefinitionDomain_Grid contains all valid points and not the points of the hull boundary (even though ‘Domain’ should have given me the hint).

Previously, a HashMap.containsValue was used to check whether a point lies within the domain. this is very inefficient O(n), a HashSet is better suited for this task (as originally intended in c766f1c)
Previously, the numerical values of the domain definition were converted into a string representation, which is converted back into an int hashCode in the HashSet. which is inefficient.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant