Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slowdown after switch to crypto-bigint #490

Open
fjarri opened this issue Mar 5, 2025 · 5 comments
Open

Slowdown after switch to crypto-bigint #490

fjarri opened this issue Mar 5, 2025 · 5 comments

Comments

@fjarri
Copy link
Contributor

fjarri commented Mar 5, 2025

Benchmarks slowed down several times after #394. What could be the reason?

Performance improvements for boxed uints (RustCrypto/crypto-bigint#777) and the corresponding changes in crypto-primes (entropyxyz/crypto-primes#78) don't really change the results much according to my tests.

sign test time is split ~50/50 between Montgomery exponentiation (with all the time spent in almost_montgomery_mul(), the lowest level function) and BoxedUint::inv_mod(); decrypt is almost exclusively exponentiation. Both of these are the calls in RSA itself; crypto-primes calls take negligible time.

So it seems that either these two functions are somehow much slower than num-bigint (possible, yet unlikely, and straightforward to test), or somehow #394 changed the algorithm to apply them more times than necessary.

@fjarri
Copy link
Contributor Author

fjarri commented Mar 5, 2025

So one difference that I see is that modpow() (and Montgomery multiplication in general) in num-bigint-dig was not, in fact, constant-time. But I would not expect just that account for such a huge difference.

@tarcieri
Copy link
Member

tarcieri commented Mar 5, 2025

Yeah, that's one difference: num_bigint::BigUint has a normalize function which automatically strips leading zeros which is called all over the place.

However, that could also be a clue to the slowdown, namely there may be places where fixed-but-dynamic precision BoxedUints are using a larger size/precision than necessary for a given key size, causing the number of iterations looping through limbs to be much larger than num-bigint (i.e. because they're larger than necessary).

That's my best guess anyway. I'm not sure to sleuth out exactly where that might be happening other than tediously going through line-by-line and comparing the implementations or otherwise looking for unnecessary leading zeros (via e.g. dbg! inspection)

@dignifiedquire
Copy link
Member

The difference comes from the fact that we now do a lot of operation over numbers twice as large as they need to be, at least that is what I remember when I last checked.

if each prime has x bits, we end up doing operations on numbers of the size of 2x bits in a bunch of places, when often times something much smaller would suffice.

@fjarri
Copy link
Contributor Author

fjarri commented Mar 6, 2025

Yep, I just reached that conclusion as well. There are several instances of exponentiation modulo p or q in rsa_decrypt() with the moduli kept in a BoxedUint the size of p * q. That still leaves the inversion, but it's probably the same problem there.

@fjarri
Copy link
Contributor Author

fjarri commented Mar 6, 2025

Btw if anyone wants to deal with this issue, feel free, I don't think I will have enough free time in the coming weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants