inbloom.error: internal initialization failed -- Parameter bounds? #4
Thanks for the report. I'll take a look in the upcoming week.
Following up: I noticed in the libbloom docs that it defaults to a 32-bit build. I then poked at the parameters and found that with an error rate of 10**-4, the largest number of elements it will take is 112022460. With an error rate of 10**-5, the limit is 89617968. Going to 10**-6 reduces the limit to exactly 5/6 of that, 74681640. That log-linear pattern continues through at least 10**-7. Ideally, each factor-of-ten reduction in the error rate should cost about 4.8 bits/element. Using that estimate, these maximal filters would take 112022460 * (4*4.8) bits, which is just a smidge above 2**31. I have no idea where to get an extra factor of two, but it looks like the 32-bit build may be at the root of it.
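The arithmetic above can be checked with a short sketch. It assumes the textbook Bloom filter sizing of -ln(p) / ln(2)**2 bits per element (about 4.8 bits per factor of ten), which may not match libbloom's internals exactly; the helper names are made up for illustration, and the limits are the ones reported above.

```python
import math

INT32_LIMIT = 2**31  # first value a signed 32-bit int cannot hold

def bits_per_element(error_rate):
    """Textbook optimal Bloom filter size per element (an assumption here)."""
    return -math.log(error_rate) / math.log(2) ** 2

def total_bits(entries, error_rate):
    """Estimated filter size in bits for the given parameters."""
    return entries * bits_per_element(error_rate)

# Largest element counts that initialized successfully, as reported above.
observed_limits = {1e-4: 112022460, 1e-5: 89617968, 1e-6: 74681640}

for error_rate, entries in sorted(observed_limits.items(), reverse=True):
    bits = total_bits(entries, error_rate)
    print(f"p={error_rate:g} entries={entries} "
          f"bits={bits:.0f} fraction_of_2**31={bits / INT32_LIMIT:.9f}")

# The rough estimate from the comment: 4 decades at 4.8 bits each.
rough_estimate = 112022460 * (4 * 4.8)
print(f"rough estimate: {rough_estimate:.0f} vs 2**31 = {INT32_LIMIT}")
```

Under that assumption, all three observed limits give a total bit count that lands essentially on 2**31, which fits a signed 32-bit size field rather than a separate limit on the element count.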
Hi, it's definitely an issue with libbloom: it doesn't support filters larger than 2 ** 31. This can be fixed by changing libbloom to use size_t instead of int for the entries, bits and bytes fields of the bloom struct. I have a patch that fixes this and will submit it upstream. Does this solution seem reasonable to you?
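To illustrate why the field type matters, here is a minimal sketch using Python's ctypes. It is not libbloom code: `c_int32` stands in for a 32-bit C `int`, `c_uint64` stands in for `size_t` on a 64-bit build, and the helper names are hypothetical. ctypes wraps out-of-range values silently, whereas the real C behavior depends on how the value is computed and converted.

```python
import ctypes

def store_as_int32(value):
    """Store value in a signed 32-bit field (stand-in for a C int)."""
    return ctypes.c_int32(value).value

def store_as_uint64(value):
    """Store value in an unsigned 64-bit field (stand-in for a 64-bit size_t)."""
    return ctypes.c_uint64(value).value

bit_count = 2**31  # smallest bit count that no longer fits in a signed 32-bit int

print("int32 field: ", store_as_int32(bit_count))
print("uint64 field:", store_as_uint64(bit_count))
```

The narrow field cannot represent the bit count, while the wider one stores it unchanged, which is the effect the size_t change is meant to have on 64-bit builds.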
Here's the commit for libbloom: EverythingMe/libbloom@87c929a. (I started a fork since the library changed a lot in the past month and will need some modifications to inbloom.)
Trying to initialize large filters results in memory errors, see below. The most likely explanation is that the filter is not intended to be used with these parameters. If so, what limits should be used in practice?
This output was produced on OS X with Python 2.7.9 and inbloom 0.2.2. I got identical results on Ubuntu 14 with Python 2.7.10, except that the error messages were uniformly "inbloom.error: internal initialization failed".