Skip to content

formater: PEP8 operator recommendations #18868

Closed as not planned
Closed as not planned
@DerWeh

Description

@DerWeh

Question

(Sorry, if this issue already popped up, but I didn't find anything.)
Thank you very much for this incredible formatter, which helps to end pointless discussions on stylistic choices.

I just have one problem, that is the spacing operators, which in fact reduces readability. And we all know:

Readability counts.

ruff consistently places a single space around binary operators. PEP8, on the other hand, suggests:

If operators with different priorities are used, consider adding whitespace around the operators with the lowest priority(ies). Use your own judgment; however, never use more than one space, and always have the same amount of whitespace on both sides of a binary operator:

# Correct:
i = i + 1
submitted += 1
x = x*2 - 1
hypot2 = x*x + y*y
c = (a+b) * (a-b)
# Wrong:
i=i+1
submitted +=1
x = x * 2 - 1
hypot2 = x * x + y * y
c = (a + b) * (a - b)

The use your own judgment is, of course, not really suitable for an automatic formatter, but I am confident, that we can do better than the current default. My suggestion: don't put spaces around the highest priority operator if there are operators with different priority in the same parenthesis or line if there are no parentheses.1
In my opinion, enforcing this with an automatic formatter would reduce misunderstandings, as the reader is less likely to make wrong assumptions on the operator priority.

Other alternatives are to always use parentheses or to always split the equation on multiple lines, but both options reduce readability due to visual clutter.

Currently, this issue is a showstopper for me to adopt ruff for numerical code. Of course, I can turn off formatting selectively for equations, but this is a rather error-prone solution that counters the purpose of automatic formatters.

🔽 Note: ruff puts spaces around `**` if there are parentheses or brackets involved, else not
3**2
(2 + 1) ** 2
3 ** (1 + 1)
sqrt(9) ** 2
3 ** sqrt(4)
np.finfo(np.float64).eps ** 0.25
tt[small] ** 2

The behavior seems a little odd to me. Is this intentional?


In the following, I provide some more code examples from my own code base
# mine
W = (0.25*U*U + 4*t*t)**0.5
E_0 = 0.5*U - W
gf_z = (0.5 + s*t/W) / (z - (E_0 + s*t))
gf_z += (0.5 - s*t/W) / (z - (U + s*t - E_0))
# ruff
W = (0.25 * U * U + 4 * t * t) ** 0.5
E_0 = 0.5 * U - W
gf_z = (0.5 + s * t / W) / (z - (E_0 + s * t))
gf_z += (0.5 - s * t / W) / (z - (U + s * t - E_0))

The formatted equation is very difficult to read... We have to introduce further parenthesis and reorder some expressions:

W = ((0.25 * U * U) + (4 * t * t)) ** 0.5
E_0 = (0.5 * U) - W
gf_z = (0.5 + (s * t / W)) / (z - (s * t + E_0))
gf_z += (0.5 - (s * t / W)) / (z - (s * t + U - E_0))

But it still seems too hard to read, we probably need to introduce some sub-expressions and split equations, but this in turn reduces readability...

# mine
denominator = 0.5 * np.pi / wn[-1]**2
# ruff
denominator = 0.5 * np.pi / wn[-1] ** 2

This example is not all too bad, we are used that ** has high priority and / is read like a fraction.

# mine
self_cpa = self_cpa_re - mu + 1j*self_cpa_im  # add contribution of mu
# ruff
self_cpa = self_cpa_re - mu + 1j * self_cpa_im  # add contribution of mu

We are trained to read complex numbers as re + 1j*im. re + 1j * im is harder when we read left-to-right, as we see the tokens in the order rere + 1jre + 1j * im. Inverting the order 1j * im + re is not a good option as we are conditioned to first get the real, then the imaginary part. re + (1j * im) is probably the best option, but adds visual clutter, which might be problematic for more complicated equations. Here it would be fine

self_cpa = (self_cpa_re - mu) + (1j * self_cpa_im)  # add contribution of mu
# mine
gf_z = delta_tt * (weight1*g_dft + weight2*dg_dft)
# ruff
gf_z = delta_tt * (weight1 * g_dft + weight2 * dg_dft)
# parenthesis
gf_z = delta_tt * ((weight1 * g_dft) + (weight2 * dg_dft))
# mine
return 0.25 * (np.pi**2 * z) * _u_ellipk(m)**2
# ruff
return 0.25 * (np.pi**2 * z) * _u_ellipk(m) ** 2
# mine
m = 0.5 - 0.5j*np.sqrt(eps_rel[finite]**-2 - 1)
Ksqr = _u_ellipk(m)**2
# ruff
m = 0.5 - 0.5j * np.sqrt(eps_rel[finite] ** -2 - 1)
Ksqr = _u_ellipk(m) ** 2
# mine
gf_d1 = (2. / half_bandwidth**2) * (1 - 1/sqrt)
# ruff
gf_d1 = (2.0 / half_bandwidth**2) * (1 - 1 / sqrt)
# mine 
gf_d2 = (2.0 / half_bandwidth**3) * z_rel * sqrt / (1 - z_rel**2)**2
# ruff
gf_d2 = (2.0 / half_bandwidth**3) * z_rel * sqrt / (1 - z_rel**2) ** 2
# mine
xi = mp.sqrt(1 - mp.sqrt(1 - z_sqr)) / mp.sqrt(1 + mp.sqrt(1 - 9*z_sqr))
k2 = 16 * xi**3 / ((1 - xi)**3 * (1 + 3*xi))
green = (1 - 9*xi**4) * (2 * mp.ellipk(k2) / mp.pi)**2 / ((1 - xi)**3 * (1 + 3*xi)) / z
# ruff
xi = mp.sqrt(1 - mp.sqrt(1 - z_sqr)) / mp.sqrt(1 + mp.sqrt(1 - 9 * z_sqr))
k2 = 16 * xi**3 / ((1 - xi) ** 3 * (1 + 3 * xi))
green = (1 - 9 * xi**4) * (2 * mp.ellipk(k2) / mp.pi) ** 2 / ((1 - xi) ** 3 * (1 + 3 * xi)) / z

Version

0.12.0

Footnotes

  1. One could think of augmenting this rule such that from a certain priority there are no spaces by default unless there is another higher priority operator, such that something like x**y has no spaces by default.

Metadata

Metadata

Assignees

No one assigned

    Labels

    questionAsking for support or clarification

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions