Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Corrected comment in add sv_expf_inline.h #77

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ClaudioMartino
Copy link

@ClaudioMartino ClaudioMartino commented Jan 3, 2025

In sv_expf_inline.h, lines 55 to 63:

/* y = exp(r) - 1 ~= r + C0 r^2 + C1 r^3 + C2 r^4 + C3 r^5 + C4 r^6.  */
svfloat32_t p12 = svmla_lane (sv_f32 (d->c1), r, lane_consts, 2);
svfloat32_t p34 = svmla_lane (sv_f32 (d->c3), r, lane_consts, 3);
svfloat32_t r2 = svmul_x (svptrue_b32 (), r, r);
svfloat32_t p14 = svmla_x (pg, p12, p34, r2);
svfloat32_t p0 = svmul_lane (r, lane_consts, 1);
svfloat32_t poly = svmla_x (pg, p0, r2, p14);
return svmla_x (pg, scale, scale, poly);

lane_consts contains ln2_lo, c0, c2, c4, (lane_consts = svld1rq (svptrue_b32 (), &d->ln2_lo)), hence:

  • p12 = svmla_lane (sv_f32 (d->c1), r, lane_consts, 2) is computing c1 + r c2
  • p34 = svmla_lane (sv_f32 (d->c3), r, lane_consts, 3) is computing c3 + r c4
  • p0 = svmul_lane (r, lane_consts, 1) is computing r c0

In p14 there is p12 + r^2 p34 = c1 + r c2 + r^2 (c3 + r c4) = c1 + r c2 + r^2 c3 + r^3 c4.
poly will contain the values for p0 + r^2 p14 = r c0 + r^2 (c1 + r c2 + r^2 c3 + r^3 c4) = r c0 + r^2 c1 + r^3 c2 + r^4 c3 + r^5 c4
Finally the returned value is scale + scale * poly = scale (1 + poly) = scale * exp(r).

Hence poly is a polynomial of degree 5, not 6 as written in the comment, and the coefficients have to be changed accordingly (c0 with r, c1 with r^2, etc.).
Since on line 42 the final result has been defined as exp(x) = 2^n (1 + poly(r)) and y is not used elsewhere, I would also change the y on line 55 with poly(r).

The error may come from exp.c where c0 is the coefficient of r^2:

/* y = exp(r) - 1 ~= r + C0 r^2 + C1 r^3 + C2 r^4 + C3 r^5.  */
svfloat64_t r2 = svmul_x (pg, r, r);
svfloat64_t p01 = svmla_x (pg, C (0), C (1), r);
svfloat64_t p23 = svmla_x (pg, C (2), C (3), r);
svfloat64_t p04 = svmla_x (pg, p01, p23, r2);
svfloat64_t y = svmla_x (pg, r, p04, r2);
svfloat64_t s = svexpa (u);
return svmla_x (pg, s, s, y)

The sve f32 exp function is using a polynomial of
degree 5, not 6 as written in the comment. I modified
the comment accordingly. Moreover, the term y is not
used anywhere in the code, while the result is defined
as a function of poly(r), hence I changed it as well.
@blapie
Copy link
Contributor

blapie commented Jan 3, 2025

Hello, Thanks for the contribution, as agreed we will follow the Arm-internal review process.

@ClaudioMartino
Copy link
Author

Ok, thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants