Question about saturation check constants (1.99 vs 2.0) #140

@blueWatermelonFri

Hello,

I've been studying the source code to better understand the implementation of the sieving algorithms, and I have a quick question about a specific design choice.

I noticed that there are two slightly different constants used when calculating and checking for saturation.

In db_stats (within siever.pyx), the theoretical saturation metric appears to be calculated using the factor 2.0:

return [(2 * tmp_histo[i] / (1+i*(1./self._core.size_of_histo))**(self.n/2.)) for i in range(self._core.size_of_histo)]
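
To make sure I am reading that line correctly, here is my own plain-Python paraphrase of it (a sketch only, not the actual implementation; tmp_histo, size_of_histo and n stand in for the corresponding self._core / self attributes):

    # My paraphrase of the db_stats expression above (sketch, not the real code).
    # tmp_histo[i]  : histogram count used for bin i
    # size_of_histo : number of histogram bins
    # n             : sieving dimension
    def saturation_ratios(tmp_histo, size_of_histo, n):
        ratios = []
        for i in range(size_of_histo):
            # (1 + i/size_of_histo)^(n/2) is the volume-style scaling that I read
            # as the Gaussian-heuristic expectation for the radius of bin i.
            expected = (1.0 + i * (1.0 / size_of_histo)) ** (n / 2.0)
            ratios.append(2.0 * tmp_histo[i] / expected)  # factor 2.0 on the reporting side
        return ratios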

However, in the termination condition within Siever::gauss_sieve (in sieving.cpp), a slightly smaller constant, 1.99, is used for the check:

        for (unsigned int i=0 ; i < size_of_histo; ++i)
        {
            cumul += histo[i];
            if (i>=imin && 1.99 * cumul > GBL_saturation_histo_bound[i])
            {
                return;
            }
        }
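
For my own reading, here is the same loop re-expressed in Python (an illustrative sketch with plain lists, not the actual code; histo, bound and imin mirror histo, GBL_saturation_histo_bound and imin above):

    # Illustrative rendering of the loop above; returns True where the original returns early.
    def saturation_reached(histo, bound, imin):
        cumul = 0
        for i in range(len(histo)):
            cumul += histo[i]
            # 1.99 * cumul > bound[i] is equivalent to cumul > bound[i] / 1.99,
            # which is marginally above bound[i] / 2.0.
            if i >= imin and 1.99 * cumul > bound[i]:
                return True
        return False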

My hypothesis is that this is an intentional design choice to separate the theoretical calculation from the practical decision-making:

The 2.0 is used for pure statistical reporting, providing a result that is true to the underlying theoretical model (Gaussian Heuristic).

The 1.99 is used in the control-flow logic to provide a small margin of error, making the termination condition more robust against floating-point inaccuracies and ensuring the algorithm doesn't get stuck trying to reach a perfect, idealized value.
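
To make the floating-point part of that hypothesis concrete, here is a toy example of my own (unrelated to the repository's actual numbers) showing how a comparison against the exact theoretical factor can hinge on rounding noise, while an offset constant does not:

    # Toy example only; the values have nothing to do with the real histogram bounds.
    cumul = 0.1 + 0.1 + 0.1        # 0.30000000000000004 in binary floating point
    bound = 0.6                    # idealized bound, mathematically equal to 2 * cumul

    print(2.0 * cumul > bound)     # True, but only because of the rounding error
    print(1.99 * cumul > bound)    # False, and small rounding noise cannot change that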

Could you confirm if this understanding is correct? Or is there a more subtle reason for this specific choice of 1.99?
