SPEC 12: Formatting mathematical expressions #326

Open: wants to merge 25 commits into base: main

Commits
ec37a94
SPEC 13: Recommended targets and naming conventions
tupui Jun 5, 2024
ff7d1ab
SPEC 12: Formatting mathematical expressions
tupui Jun 6, 2024
f46a93e
Attempt to make SPEC 12 complete and unambiguous
mdhaber Jul 14, 2024
5f4b20a
Apply suggestions from code review
tupui Jul 14, 2024
34aa825
Improvements per self-review
mdhaber Jul 15, 2024
63d45e6
Apply suggestions from code review
mdhaber Aug 11, 2024
f2a96a6
Apply suggestions from code review
mdhaber Aug 11, 2024
efa2ea8
Remove old rule 9
mdhaber Aug 11, 2024
8065ec6
Merge pull request #1 from mdhaber/spec_12
tupui Aug 11, 2024
a3d70f8
Update index.md
tupui Sep 7, 2024
6c1174e
Update index.md
tupui Sep 7, 2024
39d2fc2
Update index.md
tupui Sep 7, 2024
07e93f4
Run linter
stefanv Sep 13, 2024
92093c5
Fix spelling error
stefanv Sep 13, 2024
5e7a285
Update spec-0012/index.md
mdhaber Sep 29, 2024
503ce73
MAINT: adjustments per review
mdhaber Oct 12, 2024
e4f3c6f
[pre-commit.ci 🤖] Apply code format tools to PR
pre-commit-ci[bot] Oct 12, 2024
101fde4
Merge remote-tracking branch 'origin/main' into spec_12
stefanv Oct 12, 2024
246f73c
Merge branch 'main' into spec_12
mdhaber Nov 5, 2024
5b738a7
Apply suggestions from code review
mdhaber Nov 5, 2024
716808b
Merge branch 'main' into spec_12
mdhaber Nov 20, 2024
bed0e02
Apply suggestions from code review
mdhaber Nov 20, 2024
8b5b767
Update spec-0012/index.md
mdhaber Nov 20, 2024
1d01101
Update spec-0012/index.md
mdhaber Apr 25, 2025
083a300
Merge branch 'main' into spec_12
bsipocz Apr 25, 2025
201 changes: 201 additions & 0 deletions spec-0012/index.md
@@ -0,0 +1,201 @@
---
title: "SPEC 12 — Formatting mathematical expressions"
number: 12
date: 2024-06-06
author:
- "Pamphile Roy <[email protected]>"
- "Matt Haberland <[email protected]>"
discussion: https://discuss.scientific-python.org/t/spec-12-formatting-mathematical-expressions
endorsed-by:
---

## Description

[PEP 8](https://peps.python.org/pep-0008)
and other established styling documents either

- lack comprehensive guidelines about mathematical expressions, or
- provide simple rules that ignore the relationship between formatting and readability.

In practice, this leads to varying, even conflicting, mathematical expression
styles across the ecosystem. We seek to standardize the representation of
mathematical code for the same reason we standardize formatting of other code:
it brings consistency to the ecosystem and allows collaborators to focus on
more important aspects of their work.

## Implementation

> **Review comment:** Should any of these rules differ based on the expression type? E.g., all of the examples below use Name nodes (like `x`, `y`, etc.). What if the expression uses calls or subscripts or similar? Like `f() ** 2`?
>
> **Reply (Contributor):** We might want to add examples, but no - to keep things simple, I didn't consider changing the rules based on that.

These rules are intended to respect and
complement the [PEP 8 standards](https://peps.python.org/pep-0008), such as using
[implied line continuation](https://peps.python.org/pep-0008/#maximum-line-length)
and [breaking lines before binary operators](https://peps.python.org/pep-0008/#should-a-line-break-before-or-after-a-binary-operator)[^1].

0. Unless otherwise specified, rely on the implicit order of operations;
i.e., do not add extraneous parentheses. For example, prefer `u**v + y**z`
over `(u**v) + (y**z)`, and prefer `x + y + z` over `(x + y) + z`. A full
list of implicit operator priority levels is given by
[Operator Precedence](https://docs.python.org/3/reference/expressions.html#operator-precedence).
1. Always use the `**` operator and unary `+`, `-`, and `~` operators _without_
surrounding whitespace. For example, prefer `y = -x**4` over `y = - (x ** 4)`.
2. Always surround non-PEMDAS[^2] operators with whitespace, and always make the priority of
non-PEMDAS operators explicit. For example, prefer `(x == y) or (w == t)` over
`x==y or w==t`.[^3]
3. Always surround AS[^2] operators with whitespace.
4. Typically, surround MD[^2] operators with whitespace, except in the following situations.
- When there are lower-priority operators (namely AS) within the same compound
expression[^4]. For example, prefer `z = -x * y**t` over `z = -x*y**t`, but
prefer `z = w - x*y**t` over `z = w - x * y**t` due to the presence of the
lower-priority subtraction operator.
- When the division operation would be written mathematically as a fraction with a
horizontal bar. For example, prefer `z = t/v * x/y` over `z = t / v * x / y`
if this would be written mathematically as the product of two fractions,
e.g. $\frac{t}{v} \cdot \frac{x}{y}$.
5. Considering the previous rules, only `**`, `*`, `/`, and the unary `+`, `-`, and `~`
operators can appear in implicit subexpressions[^4] without spaces. In such expressions,

- Use at most one unary operator, and if used, ensure that it is the leftmost operator.
- Use at most one `**` operator, and if used, ensure that it is the rightmost operator.
- Use at most one `/` operator, and if used, ensure that it is the rightmost operator except for `**`.

To achieve these goals, simplification or the addition of parentheses may be required.
For example:

- The expressions `--x` and `-~x` would be implicit subexpressions without spaces
containing more than one unary operator. The former can be simplified to `+x` or
simply `x`, and the latter requires explicit parentheses, i.e. `-(~x)`.
- The expression `x**y**z` would be an implicit subexpression without spaces
containing more than one `**` operator. This code would be executed as `x**(y**z)`
following the implicit order, but the explicit parentheses should be included for
clarity.
- In the expression `t**v*x**y + z`, no spaces are used around the multiplication
operator due to the presence of the lower-priority addition operator. However,
this would lead to `t**v*x**y` being an implicit subexpression without spaces
containing more than one `**` operator. This code would be executed as
`(t**v)*(x**y) + z`, but the explicit parentheses should be included for clarity.
- In the expression `z + x**y/w`, no spaces are used around the division operator
due to the presence of the lower-priority addition operator. However, this would
lead to `x**y/w` being an implicit subexpression without spaces containing `**`
to the left of another operator. This code would be executed as `z + (x**y)/w`,
but the explicit parentheses should be included for clarity.

6. Simplify combinations of unary and binary `+` and `-` operators when possible.
For example,
- prefer `x + y` over `x + +y`,
- prefer `x + y` over `x - -y`,
- prefer `x - y` over `x - +y`, and
- prefer `x - y` over `x + -y`.
7. If required to satisfy other style requirements, include line breaks before
the outermost explicit subexpression possible. For example, if
`t + (w + (x + (y + z)))` must be broken, prefer
```python3
(t
 + (w + (x + (y + z))))
```
over
```python3
(t + (w + (x + (y
                + z))))
```
If there are multiple candidates, include the break at the first opportunity.
8. If line breaks must occur within a compound subexpression, the break should
be placed before the operator with lowest priority. For example, if
`(x + y*z)` must be broken, prefer
```python3
(x
 + y*z)
```
over
```python3
(x + y
 * z)
```
If there are multiple candidates, include the break at the first opportunity.
9. Any of the preceding rules may be broken if there is a clear reason to do so.
- _Conflict with other style rules_. For example, there is not supposed to be
whitespace surrounding the `**` operator, but one can imagine a chain of `**`
operations that exhausts the character limit of a line.
- _Domain knowledge_. For instance, in the expression
`t = (x + y) - z`, it may be important to emphasize that the addition should be
performed first for numerical reasons or because `(x + y)` is a conceptually
important quantity. In such cases, consider adding a comment, e.g.
```python3
t = (x + y) - z # perform `x + y` first for precision
```
or breaking the expressions into separate logical lines, e.g.
```python3
w = x + y
t = w - z
```
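
Rules 0 to 5 repeatedly state that a preferred spelling "would be executed as" some fully parenthesized form. As a minimal sketch (not part of the SPEC), the standard `ast` module can confirm that each pair parses to the same syntax tree, so the choice between them is purely a matter of formatting. The helper names `tree` and `same_tree` are hypothetical and introduced only for this illustration.

```python
import ast


def tree(source):
    """Dump the syntax tree of an expression, ignoring positions and spacing."""
    return ast.dump(ast.parse(source, mode="eval"))


def same_tree(first, second):
    """Return True if two expression strings parse to the same syntax tree."""
    return tree(first) == tree(second)


# (one spelling, another spelling) pairs taken from the rules above.
PAIRS = [
    ("u**v + y**z", "(u**v) + (y**z)"),              # rule 0
    ("x + y + z", "(x + y) + z"),                    # rule 0
    ("-x**4", "- (x ** 4)"),                         # rule 1
    ("(x == y) or (w == t)", "x==y or w==t"),        # rule 2
    ("w - x*y**t", "w - x * y**t"),                  # rule 4
    ("t/v * x/y", "t / v * x / y"),                  # rule 4
    ("-(~x)", "-~x"),                                # rule 5
    ("x**(y**z)", "x**y**z"),                        # rule 5
    ("(t**v)*(x**y) + z", "t**v*x**y + z"),          # rule 5
    ("z + (x**y)/w", "z + x**y/w"),                  # rule 5
]

for first, second in PAIRS:
    assert same_tree(first, second), (first, second)

# By contrast, regrouping a chain of `**` does change the tree.
assert not same_tree("x**y**z", "(x**y)**z")
```

The simplifications in rule 6 are different in kind: `x + +y` and `x + y` do not share a syntax tree, because the unary operator is a node of its own, so that rule changes the code rather than only its layout.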

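The domain-knowledge exception in rule 9 can be illustrated with floating-point arithmetic, where the grouping of additions and subtractions affects the result. The values below are assumptions chosen only to make the effect obvious; they do not come from the SPEC.

```python
# Illustrative values (assumed): two huge terms that cancel, plus a small one.
x, y, z = 1e100, -1e100, 1.0

# `x + y` cancels exactly to 0.0 first, so subtracting `z` is exact.
t_left = (x + y) - z

# `y - z` rounds back to -1e100 because `z` is far below the rounding
# granularity of `y`; the contribution of `z` is lost before `x` is added.
t_right = x + (y - z)

print(t_left)   # -1.0
print(t_right)  # 0.0
```

Note that `x + y - z` is already evaluated as `(x + y) - z`, so the parentheses in the rule 9 example add emphasis rather than change behavior; the comment or the separate assignment tells readers that the grouping is deliberate.
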
## Terminology

An "explicit" expression is a code expression enclosed within parentheses or
otherwise syntactically separated from other expressions (i.e. by code other
than operators, whitespace, literals, or variables). For example, in the list
comprehension:

```python3
[j for j in range(1, i + 1)]
```

The output expression `j` is one explicit expression and the input sequence
`range(1, i + 1)` is another.

A "subexpression" is a subset of an expression that is either explicit or could
be made explicit (i.e. with parentheses) without affecting the order of
operations. In the example above, `j` and `range(1, i + 1)` can also be
referred to as explicit subexpressions of the whole expression, and `1` and
`i + 1` are explicit subexpressions of the expression `range(1, i + 1)`. `i` and
`1` are "implicit" subexpressions of `i + 1`: they could be written as explicit
subexpressions `(i)` and `(1)` without affecting the order of operations, but they
are not explicit as written.

As another example, in `x + y*z`, `y*z` is a subexpression because it could be made
explicit as in `x + (y*z)` without changing the order of operations. However, `x + y`
would not be a subexpression because `(x + y)*z` would change the order of operations.
Note that `x + y*z` as a whole may also be referred to as a "subexpression" rather than
an "expression" even though `(x + y*z)` is not a proper subset of the whole.
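
The definition of a subexpression can be tested mechanically: wrap the candidate in parentheses and check whether the syntax tree is unchanged. The sketch below does this with plain text replacement, which is a deliberate simplification; `is_subexpression` is a hypothetical helper, and it only inspects the first textual occurrence of the candidate.

```python
import ast


def tree(source):
    """Dump the syntax tree of an expression, ignoring positions and spacing."""
    return ast.dump(ast.parse(source, mode="eval"))


def is_subexpression(expression, candidate):
    """Naive check: does parenthesizing `candidate` preserve the syntax tree?"""
    if candidate not in expression:
        return False
    # Make the first textual occurrence of the candidate explicit.
    explicit = expression.replace(candidate, f"({candidate})", 1)
    return tree(explicit) == tree(expression)


print(is_subexpression("x + y*z", "y*z"))      # True: `x + (y*z)` is unchanged
print(is_subexpression("x + y*z", "x + y"))    # False: `(x + y)*z` regroups
print(is_subexpression("x + y*z", "x + y*z"))  # True: the whole expression
print(is_subexpression("i + 1", "i"))          # True: `(i) + 1` is unchanged
```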

A "simple" expression is an expression involving only one operator priority level
without considering the operators within explicit subexpressions.
A "compound" expression is an expression involving more than one operator
priority level without considering the contents of explicit subexpressions.
For example,

- `x + y - z` is a simple expression because `+` and `-` have the
same priority level. There are no explicit subexpressions to be ignored.
- `x * (y + z)` is also a simple expression because there is only one operator
between `x` and the explicit subexpression `(y + z)`; we ignore the contents, and
especially the operator, within the explicit subexpression; conceptually, it may
be regarded as `(...)`.
- `x*y + z` is a compound expression; there are two operators with different
priority levels and no explicit subexpressions that can be ignored.
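
The simple/compound distinction can be approximated with the standard `tokenize` module by collecting the priority levels of operators that sit outside any brackets. This is a rough sketch under stated assumptions: the level labels (`"power"`, `"unary"`, `"MD"`, `"AS"`) and the function names are hypothetical, `//`, `%`, and `@` are grouped with `*` and `/` because Python assigns them the same precedence, and non-PEMDAS operators and keywords such as `not` are not handled.

```python
import io
import tokenize

# Illustrative labels for priority levels; not terms defined by this SPEC.
BINARY_LEVELS = {
    "**": "power",
    "*": "MD", "/": "MD", "//": "MD", "%": "MD", "@": "MD",
    "+": "AS", "-": "AS",
}
UNARY_OPERATORS = {"+", "-", "~"}
OPENERS = {"(", "[", "{"}
CLOSERS = {")", "]", "}"}
OPERAND_TOKENS = {tokenize.NAME, tokenize.NUMBER, tokenize.STRING}


def top_level_levels(source):
    """Collect PEMDAS priority levels used outside explicit subexpressions."""
    levels = set()
    depth = 0
    after_operand = False  # True right after a name, literal, or closing bracket
    for token in tokenize.generate_tokens(io.StringIO(source).readline):
        if token.type in OPERAND_TOKENS:
            after_operand = True
        elif token.type == tokenize.OP:
            text = token.string
            if text in OPENERS:
                depth += 1
                after_operand = False
            elif text in CLOSERS:
                depth -= 1
                after_operand = True
            else:
                if depth == 0:
                    if text in UNARY_OPERATORS and not after_operand:
                        levels.add("unary")
                    elif text in BINARY_LEVELS:
                        levels.add(BINARY_LEVELS[text])
                after_operand = False
    return levels


def is_compound(source):
    """Return True if more than one priority level appears at the top level."""
    return len(top_level_levels(source)) > 1


for expression in ("x + y - z", "x * (y + z)", "x*y + z"):
    kind = "compound" if is_compound(expression) else "simple"
    print(f"{expression!r} is {kind}")
```

Tracking bracket depth is what implements "without considering the contents of explicit subexpressions": the `+` inside `(y + z)` is skipped, so `x * (y + z)` reports a single level.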

[^1]:
Although examples do not show the use of hanging indent, any of the indentation styles
allowed by [PEP 8 Indentation](https://peps.python.org/pep-0008/#indentation) are permitted
by this SPEC.

[^2]:
The acronym PEMDAS commonly refers to "parentheses", "exponentiation", "multiplication",
"division", "addition", and "subtraction". Herein, we will consider these operators
to be "PEMDAS operators", and we will also include the unary `+`, `-`, and `~` in
this category for convenience. The order of operations of PEMDAS operators is typically
taught in primary school and reinforced throughout a programmer's training and
experience, so it is assumed that most programmers are comfortable relying on the
implicit order of operations of expressions involving a few PEMDAS operations. Implicit
order of operations becomes less obvious as the number of distinct operator priority
levels increases and when multiple non-PEMDAS operators are involved. Portions of this
acronym, namely MD and AS, will be used to refer to the corresponding operators.

[^3]:
There is a case for simply eliminating spaces to reinforce the implicit order
of operations, as in `x==y or w==t`. However, if this were the rule, following
the rule would require users to remember the full order of operations hierarchy
and apply it without mistakes. Use of explicit parentheses with non-PEMDAS
operators leads to simpler rules, is more explicit, and is not uncommon in
existing code.

[^4]:
For definitions of "explicit"/"implicit" and "simple"/"compound"
"expressions"/"subexpressions", see Terminology.