Thoughts on a new display format for numpydoc lint and pre-commit hook findings #606

Open
stefmolin opened this issue Feb 15, 2025 · 4 comments

Comments

@stefmolin
Contributor

The table output generated with tabulate can look like this when there is long content:

Image

The need to compile everything for the table to have consistent sizing also requires pre-commit to run serially. If we switch to a different format, we could have it run in parallel.

I was playing around with an alternate format just now and something like this would be a good start:

numpydoc/docscrape.py:302
in docscrape.NumpyDocString._parse_see_also.parse_item_name
  ES01 - No extended summary found
  PR01 - Parameters {'text'} not documented
  RT01 - No Returns section found
  SA01 - See Also section not found
  EX01 - No examples section found

numpydoc/docscrape.py:345
in docscrape.NumpyDocString._parse_index
  GL02 - Closing quotes should be placed in the line after the last
         text in the docstring (do not close the quotes in the same
         line as the text, or leave a blank line between the last
         text and the quotes)
  GL03 - Double line break found; please use only one blank line to
         separate sections or paragraphs, and do not leave blank
         lines at the end of docstrings
  SS01 - No summary found (a short summary in a single line should be
         present at the beginning of the docstring)
  ES01 - No extended summary found
  PR01 - Parameters {'content', 'section'} not documented
  RT01 - No Returns section found
  SA01 - See Also section not found
  EX01 - No examples section found
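
For illustration, the layout above could be produced with something like the following. This is only a minimal sketch: `format_findings` and its parameters are hypothetical names, not part of numpydoc, and the 70-character wrap width is an assumption chosen to roughly match the sample.

```python
import textwrap


def format_findings(path, line, obj_name, findings, width=70):
    """Render one object's findings in the grouped layout sketched above.

    `findings` is a list of (code, message) pairs. All names here are
    hypothetical; this is not numpydoc's actual implementation.
    """
    out = [f"{path}:{line}", f"in {obj_name}"]
    for code, message in findings:
        prefix = f"  {code} - "
        # Wrap long messages and align continuation lines under the
        # first character of the message text.
        out.append(
            textwrap.fill(
                message,
                width=width,
                initial_indent=prefix,
                subsequent_indent=" " * len(prefix),
            )
        )
    return "\n".join(out)


print(
    format_findings(
        "numpydoc/docscrape.py",
        345,
        "docscrape.NumpyDocString._parse_index",
        [
            ("ES01", "No extended summary found"),
            (
                "SS01",
                "No summary found (a short summary in a single line "
                "should be present at the beginning of the docstring)",
            ),
        ],
    )
)
```

Because each block depends only on its own findings, nothing has to be collected across files before printing.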

What do you think?

@stefanv
Contributor

stefanv commented Feb 15, 2025

Thanks for this suggestion @stefmolin!

Can you explain a bit more the requirements for pre-commit to run in parallel?

The new format looks good to me, but I'm a bit surprised that a tool wouldn't require filename:line rule description uniformly for each line.

@stefmolin
Contributor Author

The only reason not to run in parallel currently is that it leads to calling tabulate multiple times, which creates multiple tables, and since tabulate determines column widths based on the data, these tables may not line up either. Here's an example:

Image
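
To make the alignment problem concrete without the screenshot, here is a rough sketch. `column_widths` is a simplified, hypothetical stand-in for data-driven column sizing (it is not tabulate's code), and the sample rows are made up for illustration.

```python
def column_widths(rows):
    """Width of each column is the longest cell in that column.

    Simplified stand-in for data-driven table sizing; not tabulate itself.
    """
    return [max(len(str(cell)) for cell in col) for col in zip(*rows)]


# Two batches, as if the hook had been invoked once per group of files.
batch_a = [("a.py", "ES01", "No extended summary found")]
batch_b = [("pkg/long_module_name.py", "PR01", "Parameters {'text'} not documented")]

widths_a = column_widths(batch_a)
widths_b = column_widths(batch_b)
widths_all = column_widths(batch_a + batch_b)

# Each batch gets its own widths, so separately rendered tables differ;
# only a single call over all findings yields one consistent layout.
print(widths_a, widths_b, widths_all)
```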

Running serially allows us to call tabulate once with all of the findings. By default pre-commit will run a hook in parallel, but I had to configure this as serial due to the alignment issue:

require_serial: true


If we remove the need for alignment (e.g., switch to a non-tabular format), we can run in parallel because we don't need to collect and format everything together at the end.
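
A small sketch of why a line-oriented format is safe to split across runs: formatting each batch on its own gives the same lines as formatting everything together. The `format_line` and `format_batch` helpers and the sample findings are hypothetical, and the thread pool only stands in for pre-commit invoking the hook separately on different groups of files.

```python
from concurrent.futures import ThreadPoolExecutor


def format_line(finding):
    """Format one finding; the output depends on this finding alone."""
    path, line, code, message = finding
    return f"{path}:{line}: {code} {message}"


def format_batch(findings):
    return [format_line(f) for f in findings]


# Hypothetical findings, split the way separate hook runs might see them.
batches = [
    [("a.py", 1, "GL08", "The object does not have a docstring")],
    [
        ("b.py", 7, "RT01", "No Returns section found"),
        ("b.py", 7, "SA01", "See Also section not found"),
    ],
]

# Format each batch independently (pool.map preserves batch order).
with ThreadPoolExecutor() as pool:
    parallel = [line for lines in pool.map(format_batch, batches) for line in lines]

# Format everything in one call, as a serial run would.
serial = format_batch([f for batch in batches for f in batch])

print(parallel == serial)
```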

> I'm a bit surprised that a tool wouldn't require filename:line rule description uniformly for each line.

I'm not quite sure what you mean here. I'm open to suggestions on the format, but the descriptions are sometimes extremely long and there is no way it wouldn't wrap (multiple) lines.

@stefanv
Contributor

stefanv commented Feb 16, 2025

I see now: parallelization means one run per file (or a bunch of files). This format works just fine for that. The filename:line rule description format would also work and, while a bit more verbose, would allow for automated parsing by other tools.

@stefmolin
Contributor Author

So you would prefer this format?

numpydoc/numpydoc.py:388: GL08 The object does not have a docstring

numpydoc/numpydoc.py:399: GL01 Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)

numpydoc/numpydoc.py:399: GL02 Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)

numpydoc/numpydoc.py:399: GL03 Double line break found; please use only one blank line to separate sections or paragraphs, and do not leave blank lines at the end of docstrings

numpydoc/numpydoc.py:399: PR01 Parameters {'lines', 'content_old'} not documented

numpydoc/numpydoc.py:399: RT01 No Returns section found
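
As a rough idea of what automated parsing of this shape could look like, here is a minimal sketch. The regular expression and the `parse_finding` helper are assumptions written for this example, not an existing numpydoc or pre-commit API, and the pattern assumes rule codes are uppercase letters followed by digits, as in the samples above.

```python
import re

# path:line: CODE message
FINDING_RE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+): (?P<code>[A-Z]+\d+) (?P<message>.+)$"
)


def parse_finding(text):
    """Parse one output line into a dict, or return None if it does not match."""
    match = FINDING_RE.match(text)
    if match is None:
        return None
    return {
        "path": match["path"],
        "line": int(match["line"]),
        "code": match["code"],
        "message": match["message"],
    }


sample = "numpydoc/numpydoc.py:388: GL08 The object does not have a docstring"
print(parse_finding(sample))
```

Since every finding is a single self-describing line, a consumer can process output from several runs in any order.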
