@aaronquantexa

CL:

  • Add a check that the topTreeSize parameter is not greater than the input data size. If it is, an exception stating this is thrown.

The motivation for this change was issue #21, which raised how difficult this failure is to debug.

Before this change, the error would be:

requirement failed: Sampling fraction (1.002267573696145) must be on interval [0, 1]
java.lang.IllegalArgumentException: requirement failed: Sampling fraction (1.002267573696145) must be on interval [0, 1]

whereas now the error is thrown earlier, before any data transformations are done, and says:

org.apache.spark.SparkException: Invalid top tree size relative to size of data. Data to fit of size 441 was less than topTreeSize 442
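As a rough illustration of the fail-fast check, here is a minimal, dependency-free sketch in Python. It is not the code in this PR: the function names `sampling_fraction` and `validate_top_tree_size` are hypothetical, `ValueError` stands in for `org.apache.spark.SparkException`, and the assumption that the sampling fraction is topTreeSize divided by the data size is inferred only from the numbers quoted above (442 / 441 is roughly 1.00227).

```python
def sampling_fraction(top_tree_size: int, data_size: int) -> float:
    """Assumed formula: share of the data sampled to build the top tree."""
    return top_tree_size / data_size


def validate_top_tree_size(top_tree_size: int, data_size: int) -> None:
    """Fail fast, before any data transformation, if topTreeSize exceeds the data size."""
    if top_tree_size > data_size:
        # ValueError is a stand-in for org.apache.spark.SparkException.
        raise ValueError(
            "Invalid top tree size relative to size of data. "
            f"Data to fit of size {data_size} was less than topTreeSize {top_tree_size}"
        )


# A topTreeSize equal to the data size is still valid (fraction of exactly 1.0).
validate_top_tree_size(441, 441)
```

Under these assumptions, calling `validate_top_tree_size(442, 441)` raises immediately with the message shown above, instead of letting a fraction above 1 reach the sampling step.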

A previous commit included a test for this that matched the string in the error message, but I didn't see it as necessary.
