Replies: 2 comments 2 replies
-
Very good observations! And you are completely right -- we compute the overflow such that the check in the And your remark really makes me think that we actually should compute two overflows in the By the way, empirically actually it is never worth to amortise multiplications before modular reductions. As every multiplication increases the number of limbs by two, then the final modular reduction becomes really expensive. Lets say we have 4-limb elements, then single mul 8 limbs, second mul (two 8 limb elements) 16 limbs. Then when modreducing we have to binary decompose 16 very saturated (close to the bitwidth of the scalar field) elements. We can get it better with table lookups, but it is still expensive. Due to that, I'm actually thinking about losing plain mul and replacing it with optimized modmul implementation. This would make the overflow computation irrelevant. But I haven't figured the optimized modmul out yet and it is waiting behind some other urgent work :). |
Beta Was this translation helpful? Give feedback.
-
Thanks for the quick response! I was not expecting one within a few hours. It does not seem like the constraint check in For There Let As the coefficient matrix on the left is a Vandermonde matrix with distinct rows, it has an inverse. So the above system has a unique solution for the We know one solution of the above linear system. It is given by for The maximum bitwidth of If we can ensure that this maximum bitwidth does not exceed So we can assume equality without the Let me know if I got something wrong in this argument. I had a (albeit) quick look at the arkworks-rs/nonnative code. The pre_mul_reduce function there does not seem to account for the bitwidth of the constraint check. For instance, the bitwidth of |
Beta Was this translation helpful? Give feedback.
-
I am unable to understand the calculation of
nextOverflow
in themulPreCond
method ofstd/math/emulated/field_ops.go
. Here is a link to the code.The
mulPreCond
function is as follows:The inputs
a
andb
are slices containing the limbs of the multiplication operands. The functionnbMultiplicationResLimbs(len(a.Limbs), len(b.Limbs))
returnslen(a.Limbs) + len(b.Limbs)-1
. This corresponds to the number of limbs in the product ofa
andb
. For example, ifa
andb
have 4 limbs each, the productf
will have 7 limbs as shown below.The number of terms in a product limb
f_i
can be at most the minimum of the number of limbs ina
andb
. So the maximum bitwidth of a product limb would bef.Params.BitsPerLimb()*2 + a.overflow + b.overflow + ceil(Log2(Min(len(a.limbs), len(b.limbs))
. The last term accounts for the number of carry bits.To get the overflow value, we can remove one of the
f.Params.BitsPerLimb()
. So the value fornextOverflow
would bef.Params.BitsPerLimb() + a.overflow + b.overflow + ceil(Log2(Min(len(a.limbs), len(b.limbs))
.I could not figure out the reasoning behind the
uint(math.Log2(float64(2*nbResLimbs-1)))
term in the currentnextOverflow
calculation. Is it something to do with the constraint check in themul
function?Beta Was this translation helpful? Give feedback.
All reactions