Hello,
I've been studying the source code to better understand the implementation of the sieving algorithms, and I have a quick question about a specific design choice.
I noticed that there are two slightly different constants used when calculating and checking for saturation.
In db_stats (within siever.pyx), the theoretical saturation metric appears to be calculated using the factor 2.0:
```python
return [(2 * tmp_histo[i] / (1+i*(1./self._core.size_of_histo))**(self.n/2.)) for i in range(self._core.size_of_histo)]
```

However, in the termination condition within Siever::gauss_sieve (in sieving.cpp), a slightly smaller constant, 1.99, is used for the check:

```cpp
for (unsigned int i=0 ; i < size_of_histo; ++i)
{
    cumul += histo[i];
    if (i>=imin && 1.99 * cumul > GBL_saturation_histo_bound[i])
    {
        return;
    }
}
```
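To make sure I am comparing the right things, here is a tiny self-contained mock-up of the two computations side by side. Everything in it (the histogram counts, the dimension n, the bound values, and the name saturation_bound standing in for GBL_saturation_histo_bound) is made up for illustration and is not the real Siever internals:

```python
# Toy mock-up of the two places the constant appears; all inputs below
# (histogram counts, n, bounds) are invented purely for illustration.
size_of_histo = 4
n = 50
histo = [10, 25, 60, 140]               # made-up per-bin counts
saturation_bound = [30, 80, 200, 500]   # stand-in for GBL_saturation_histo_bound
imin = 0

# 1) Reporting statistic, mirroring the db_stats expression (factor 2.0):
stats = [2.0 * histo[i] / (1 + i * (1.0 / size_of_histo)) ** (n / 2.0)
         for i in range(size_of_histo)]

# 2) Termination-style check, mirroring the gauss_sieve loop (factor 1.99):
cumul = 0
saturated = False
for i in range(size_of_histo):
    cumul += histo[i]
    if i >= imin and 1.99 * cumul > saturation_bound[i]:
        saturated = True
        break

print(stats)
print("saturated:", saturated)
```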
My hypothesis is that this is an intentional design choice to separate the theoretical calculation from the practical decision-making:

- The 2.0 is used for pure statistical reporting, providing a result that is true to the underlying theoretical model (Gaussian Heuristic).
- The 1.99 is used in the control-flow logic to provide a small margin of error, making the termination condition more robust against floating-point inaccuracies and ensuring the algorithm doesn't get stuck trying to reach a perfect, idealized value (see the quick arithmetic sketch after this list).
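To put a rough number on that margin (purely arithmetic, with a made-up bound value): the 1.99 check fires once cumul exceeds bound / 1.99, whereas a 2.0 check would fire once cumul exceeds bound / 2, a relative difference of about 0.5%.

```python
# Purely arithmetic illustration of the gap between the two constants;
# the bound value is invented and does not come from the actual code.
bound = 10_000.0

threshold_with_2_00 = bound / 2.0    # cumul needed before 2.00 * cumul > bound
threshold_with_1_99 = bound / 1.99   # cumul needed before 1.99 * cumul > bound

relative_gap = threshold_with_1_99 / threshold_with_2_00 - 1.0
print(f"{threshold_with_2_00:.2f} {threshold_with_1_99:.2f} {relative_gap:.3%}")
# prints: 5000.00 5025.13 0.503%
```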
Could you confirm if this understanding is correct? Or is there a more subtle reason for this specific choice of 1.99?