-
Notifications
You must be signed in to change notification settings - Fork 157
Description
Currently Advz
construction enforces that recovery_threshold
and multiplicity
must be powers of two:
Lines 176 to 185 in 7cd4f76
// TODO TEMPORARY: enforce power-of-2 chunk size | |
// Remove this restriction after we get KZG in eval form | |
// https://github.com/EspressoSystems/jellyfish/issues/339 | |
if chunk_size as usize != eval_domain.size() { | |
return Err(VidError::Argument(format!( | |
"recovery_threshold {} currently unsupported, round to {} instead", | |
chunk_size, | |
eval_domain.size() | |
))); | |
} |
Code comments link to #339 in several places but do not explain why this issue blocks us from allowing power-of-two args in ADVZ.
Why the restriction?
In ADVZ we encode payload data into polynomials in evaluation form so as to facilitate payload proofs (aka "namespace proofs") in payload_prover.rs
. We do not yet have KZG in eval form (#339) so instead we use FFT to compute polynomial coefficients from payload data, then use KZG in coefficient form for payload proofs. Ten months ago in internal discussion I said (paraphrasing):
A limitation of this approach is that payload recovery is possible only if
recovery_threshold * multiplicity
is a power of two. Why? Because FFT is applied to all polynomials, which effectively "rounds up" the degreed
tonext_power_of_2(d)
. Interpolating such a polynomial requiresnext_power_of_2(d)
points.
This is wrong. A degree-d
polynomial is padded with zero until its degree becomes next_power_of_2(d)
. Yes, you need next_power_of_2(d)
points to interpolate such a polynomial, but we already know that next_power_of_2(d) - d
of those points are zero, so we only need d
additional points to interpolate. Thus, the payload is recoverable under this approach for any desired degree d
.
10 months ago I also said "this limitation can be removed after we have a proper implementation of KZG in eval form." But of course that's incorrect---we do not need KZG in eval form in order to remove the power-of-two restriction.
Efficiency concerns
The FFT operates only on data sets whose length is a power of two. Thus, if we use it on a non-power-of-two data set the implementation must waste computation: we pad a size-d
data set with zeros until its size is next_power_of_2(d)
, do FFTs, then discard excess data back down to size d
. This is wasteful, but it's not a security issue.
Also, ADVZ dispersal uses FK23 algorithm to efficiently compute many KZG proofs. FK23 is a FFT-like algorithm that operates only on data sets whose length is a power of two, so we introduce a similar inefficiency in FK23 if we operate on non-power-of-two data sets. But again, this inefficiency does not introduce a security concern: it's safe to use non-power-of-two data sets.
TODO
- short term: explain in docs. future code should link to this issue in comments
- long term: remove the power-of-two restriction in such a way that payload recovery is possible for all payload data sizes.