Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stringing together everything in the command-line interface #39

Merged
merged 16 commits into from
Jan 12, 2023

Conversation

jherland
Copy link
Member

@jherland jherland commented Jan 5, 2023

This PR finally combines extract_imports, extract_dependencies and check into a functional command-line interface.

There are still a fair amount of sharp corners and TODOs, but this gets us in the ballpark of something that is useable and useful for end users.

  • main: Rephrase how we import the extract_imports module
  • Move handling of --code argument into extract_imports
  • main: Teach command-line interface to list dependencies
  • test_cmdline: Prepare for changing default action
  • main: Finally teach command-line interface to check imports vs deps
  • main: Flip default action from --list-imports to --check
  • test_cmdline: Rephrase test names

@jherland jherland mentioned this pull request Jan 5, 2023
2 tasks
@mknorps mknorps linked an issue Jan 5, 2023 that may be closed by this pull request
2 tasks
@jherland
Copy link
Member Author

jherland commented Jan 9, 2023

I kept this PR as a draft because I still had some unresolved thoughts around various aspects of it. Feel free to discuss whether these are things we need to resolve before merging this PR, or if they can wait until after the first prototype is shipped:

  • Reporting; For now the undeclared + unused deps are simply printed to stdout with a simple text heading. Do we want to use log messages (e.g. logger.error() for undeclared deps and logger.warning() for unused deps)? I think the answer is "no", because this is kind of our "main" output, and that does belong on stdout (log messages go to stderr).
  • Reporting: Currently no way to ask for only undeclared or only unused deps. The mechanism is all there in the code (the Action mechanism), but not sure how to best expose the command-line interface for filtering?
  • Reporting: More formats. For now there is only a boring human-readable list/text. Need e.g. JSON output controlled by more command-line options. Probably not in the first prototype?
  • Exit code. Currently, we return exit code 0 (i.e. success) even when there are undeclared/unused deps found. This should be fixed.
  • Configuration? At some point we probably want FawltyDeps to be configured via directives in pyproject.toml, rather than command-line flags. (This is certainly not needed in the first prototype.)
  • Import style. As we have started discussing elsewhere: Do we want:
    • from fawltydeps import extract_imports then below: extract_imports.parse_foo()
    • from fawltydeps.extract_imports import parse_foo, then below: parse_foo()
    • from fawltydeps.extract_imports import extract_imports, then below: extract_imports()`
    • something else?
    • My preference is - I think - for the first one, but I'm open to arguments.
  • There are some unresolved TODOs in the test cases. Might require fixes outside the command-line interface itself.
  • More test cases needed. For now, we test mostly happy-paths, and not all kinds of weird corner cases that violate our assumptions.
  • Update README.md with instructions on how to use the command-line interface. Will do this as part of this PR.

@jherland jherland marked this pull request as ready for review January 9, 2023 18:47
@jherland jherland requested review from mknorps and Nour-Mws January 9, 2023 18:47
@mknorps
Copy link
Collaborator

mknorps commented Jan 10, 2023

I will put my comment to your summary first, then will add a particular PR remarks.

  • Reporting; For now the undeclared + unused deps are simply printed to stdout with a simple text heading. Do we want to use log messages (e.g. logger.error() for undeclared deps and logger.warning() for unused deps)? I think the answer is "no", because this is kind of our "main" output, and that does belong on stdout (log messages go to stderr).

I agree.

  • Reporting: Currently no way to ask for only undeclared or only unused deps. The mechanism is all there in the code (the Action mechanism), but not sure how to best expose the command-line interface for filtering?

Is this obsolete? Now you wrote: Action.REPORT_UNDECLARED and Action.REPORT_UNUSED

  • Reporting: More formats. For now there is only a boring human-readable list/text. Need e.g. JSON output controlled by more command-line options. Probably not in the first prototype?

Not the first prototype but definitely a good idea! Worth a separate issue :)

  • Exit code. Currently, we return exit code 0 (i.e. success) even when there are undeclared/unused deps found. This should be fixed.

Yes, and I think this even in the first version. But not necessarily in this PR. Worth an issue good-first-issue.

  • Configuration? At some point we probably want FawltyDeps to be configured via directives in pyproject.toml, rather than command-line flags. (This is certainly not needed in the first prototype.)

You mean like default setting of command-line flags? Or something different?

  • Import style. As we have started discussing elsewhere: Do we want:

    • from fawltydeps import extract_imports then below: extract_imports.parse_foo()
    • from fawltydeps.extract_imports import parse_foo, then below: parse_foo()
    • from fawltydeps.extract_imports import extract_imports, then below: extract_imports()
    • something else?
    • My preference is - I think - for the first one, but I'm open to arguments.

All of them have some issues:

  • from fawltydeps import extract_imports then below: extract_imports.parse_foo() makes function call verbose, but you can have the same interface within each extract_*, but again it looks to me more like a call of an object method,
  • from fawltydeps.extract_imports import parse_foo, then below: parse_foo() makes you invent more names and you cannot have extract_* export the function of the same name, but the function call is straightforward
  • from fawltydeps.extract_imports import extract_imports, then below: extract_imports() does not make you invent more names, though it might be annoying to have the module and function of the same name (pprint...), the function call is straightforward

Another idea would be to keep the names of the main modules super simple, indicating only subject they are dealing with, like:

from fawltydeps.imports import extract_imports, then run it with: extract_imports()

Generally, I would like option 4 the most, then 3 ex equo with 1 (common interface argument).

  • There are some unresolved TODOs in the test cases. Might require fixes outside the command-line interface itself.

Do we transfer todos to issues?

  • More test cases needed. For now, we test mostly happy-paths, and not all kinds of weird corner cases that violate our assumptions.

We can make an issue for it. A lot of beginner-friendly issues may come from this PR :)

  • Update README.md with instructions on how to use the command-line interface. Will do this as part of this PR.

👍

@jherland
Copy link
Member Author

  • Reporting: Currently no way to ask for only undeclared or only unused deps. The mechanism is all there in the code (the Action mechanism), but not sure how to best expose the command-line interface for filtering?

Is this obsolete? Now you wrote: Action.REPORT_UNDECLARED and Action.REPORT_UNUSED

Still relevant. Even if I seperated things into Action.REPORT_UNDECLARED and Action.REPORT_UNUSED there no way yet to enable only one of those two from the command line.

  • Reporting: More formats. For now there is only a boring human-readable list/text. Need e.g. JSON output controlled by more command-line options. Probably not in the first prototype?

Not the first prototype but definitely a good idea! Worth a separate issue :)

Created: #52

  • Exit code. Currently, we return exit code 0 (i.e. success) even when there are undeclared/unused deps found. This should be fixed.

Yes, and I think this even in the first version. But not necessarily in this PR. Worth an issue good-first-issue.

Created: #53

  • Configuration? At some point we probably want FawltyDeps to be configured via directives in pyproject.toml, rather than command-line flags. (This is certainly not needed in the first prototype.)

You mean like default setting of command-line flags? Or something different?

Yes. Or at least - if we do it well - there should be a natural mapping between command-line flags and corresponding config directives in pyproject.toml. I haven't looked into this yet, but IMHO it would makes sense to create a [tool.fawltydeps] section in a project's pyproject.toml just like for e.g. [tool.pylint] or [tool.isort]... Nox is one example of how to integrate configuration via both config file and command-line arguments.

  • There are some unresolved TODOs in the test cases. Might require fixes outside the command-line interface itself.

Do we transfer todos to issues?

I'll make issues of any TODOs that I don't get around to solving in this PR.

@jherland
Copy link
Member Author

  • Configuration? At some point we probably want FawltyDeps to be configured via directives in pyproject.toml, rather than command-line flags. (This is certainly not needed in the first prototype.)

You mean like default setting of command-line flags? Or something different?

Yes. Or at least - if we do it well - there should be a natural mapping between command-line flags and corresponding config directives in pyproject.toml. I haven't looked into this yet, but IMHO it would makes sense to create a [tool.fawltydeps] section in a project's pyproject.toml just like for e.g. [tool.pylint] or [tool.isort]... Nox is one example of how to integrate configuration via both config file and command-line arguments.

Ah, what the heck: #54 😆

@Nour-Mws Nour-Mws added this to the First prototype milestone Jan 10, 2023
@Nour-Mws
Copy link
Collaborator

Since most of the questions raised above now have their own issues and are independent of this PR, I'll only address the remaining points:

  • Import style. As we have started discussing elsewhere: Do we want:

    • from fawltydeps import extract_imports then below: extract_imports.parse_foo()
    • from fawltydeps.extract_imports import parse_foo, then below: parse_foo()
    • from fawltydeps.extract_imports import extract_imports, then below: extract_imports()
    • something else?
    • My preference is - I think - for the first one, but I'm open to arguments.

My preference is definitely for the first one. I don't like importing functions (option 2). I also find duplicating the module name in a function name (option 3) does not appeal to my aesthetic taste.

For what it's worth, Google's Python style guide states the following:

Use import statements for packages and modules only, not for individual classes or functions.
...
2.2.2 Pros
The namespace management convention is simple. The source of each identifier is indicated in a consistent way; x.Obj says that object Obj is defined in module x.

2.2.3 Cons
Module names can still collide. Some module names are inconveniently long.
...

2.2.4 Decision
..

  • Use from x import y where x is the package prefix and y is the module name with no prefix.
  • Use from x import y as z if two modules named y are to be imported, if y conflicts with a top-level name defined in the current module, or if y is an inconveniently long name.

..

Copy link
Collaborator

@Nour-Mws Nour-Mws left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I enjoyed reading this PR a lot and picked up a bunch of new things :)
The commit history is also so clean and the commit messages helpful.
This was a pleasant experience. 10 out of 10, would recommend!
Will review again for approval once the todo items and the remaining discussion points have been done.

@Nour-Mws
Copy link
Collaborator

Reporting: Currently no way to ask for only undeclared or only unused deps. The mechanism is all there in the code (the Action mechanism), but not sure how to best expose the command-line interface for filtering?

As you said, the mechanism is all there in the action code, so this is more of a design issue. What options are you thinking of for exposing this on the CLI?
Also, this can totally go into its own issue and its own PR. This PR is self-contained and achieves a more than reasonable CLI for the first prototype. I don't see the reason we should block it on figuring out the best way to expose reporting options in the CLI.

@Nour-Mws
Copy link
Collaborator

Reporting; For now the undeclared + unused deps are simply printed to stdout with a simple text heading. Do we want to use log messages (e.g. logger.error() for undeclared deps and logger.warning() for unused deps)? I think the answer is "no", because this is kind of our "main" output, and that does belong on stdout (log messages go to stderr).

Agree. Our main output should go on stdout.

@jherland
Copy link
Member Author

Reporting: Currently no way to ask for only undeclared or only unused deps. The mechanism is all there in the code (the Action mechanism), but not sure how to best expose the command-line interface for filtering?

As you said, the mechanism is all there in the action code, so this is more of a design issue. What options are you thinking of for exposing this on the CLI? Also, this can totally go into its own issue and its own PR. This PR is self-contained and achieves a more than reasonable CLI for the first prototype. I don't see the reason we should block it on figuring out the best way to expose reporting options in the CLI.

Thinking maybe of adding two more action options: --check-undeclared and --check-unused. These are like --check, but they only enable one of the reports. They would be part of the mutually exclusive group, so you could only use one at a time. --check would remain the default action.

Copy link
Collaborator

@mknorps mknorps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good interface - simple but to the point. I like it and the simplicity of the code that goes behind :)

I put few comments in the code, and apart from that, I think we need to:

  • update README
  • consider moving example of projects to conftest for reusability

Side note - running fawltydeps on fawltydeps is excrutiatingly slow. I propose, we add .fawltydepsignore and do not let default search look into directories listed there (like .git, .venv, etc)

@Nour-Mws
Copy link
Collaborator

Nour-Mws commented Jan 11, 2023

Reporting: Currently no way to ask for only undeclared or only unused deps. The mechanism is all there in the code (the Action mechanism), but not sure how to best expose the command-line interface for filtering?

As you said, the mechanism is all there in the action code, so this is more of a design issue. What options are you thinking of for exposing this on the CLI? Also, this can totally go into its own issue and its own PR. This PR is self-contained and achieves a more than reasonable CLI for the first prototype. I don't see the reason we should block it on figuring out the best way to expose reporting options in the CLI.

Thinking maybe of adding two more action options: --check-undeclared and --check-unused. These are like --check, but they only enable one of the reports. They would be part of the mutually exclusive group, so you could only use one at a time. --check would remain the default action.

Sounds reasonable to me.

@Nour-Mws
Copy link
Collaborator

@jherland what is your plan for this PR? Are there any CLI changes or tests that you are planning on adding?

@mknorps mknorps mentioned this pull request Jan 11, 2023
@mknorps
Copy link
Collaborator

mknorps commented Jan 11, 2023

  • consider moving example of projects to conftest for reusability

This one I can do in #11.

@jherland jherland force-pushed the jherland/cmdline-with-deps branch from d19f511 to c3491fc Compare January 11, 2023 19:41
@jherland
Copy link
Member Author

Thinking maybe of adding two more action options: --check-undeclared and --check-unused. These are like --check, but they only enable one of the reports. They would be part of the mutually exclusive group, so you could only use one at a time. --check would remain the default action.

Sounds reasonable to me.

Added these in dccf5d3

@jherland
Copy link
Member Author

From @Nour-Mws:

@jherland what is your plan for this PR? Are there any CLI changes or tests that you are planning on adding?

I think I have added all the stuff I planned for now. Further work/TODOs can be solved in later PRs.

From @mknorps:

  • update README

First draft added in b55d7c5.

  • consider moving example of projects to conftest for reusability

I'll leave further reorganizing of test code for a later PR. There are some things I'd like to discuss about what belongs in conftest.py vs. the importance of keeping test data as close to the test case/code as possible.

Side note - running fawltydeps on fawltydeps is excrutiatingly slow. I propose, we add .fawltydepsignore and do not let default search look into directories listed there (like .git, .venv, etc)

Yes. We need some way to ignore files/dirs, and probably even a set of defautl ignores. I don't think we need a .fawltydepsignore file, though, I'd rather solve it via a fawltydeps.ignore directive in pyproject.toml (see #54).

Copy link
Collaborator

@mknorps mknorps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Readme instructions are clear and informative.
We can run it later by a person who did not touch this repo, to have user feedback.

Apart from some editorial suggestions, I think this PR is good to be 🚀 :)

Side note - running fawltydeps on fawltydeps is excrutiatingly slow. I propose, we add .fawltydepsignore and do not let default search look into directories listed there (like .git, .venv, etc)

Yes. We need some way to ignore files/dirs, and probably even a set of defautl ignores. I don't think we need a .fawltydepsignore file, though, I'd rather solve it via a fawltydeps.ignore directive in pyproject.toml (see #54).

You are right. pyproject.toml sounds more suited for the job and can be one source of such information.

@jherland jherland self-assigned this Jan 12, 2023
jherland and others added 16 commits January 12, 2023 21:11
We're about to import more parts of fawltydeps, in which the function
names parse_code(), parse_file(), and parse_dir() are no longer
necessarily unambiguous. Instead we want to qualify them with
"extract_imports.*" anywhere we use them.
It might seem strange to move what is arguably a command-line interface
detail into the extract_imports module, but this ends up accomplishing
two things:

  - Simplify the logic inside the main module. This module is about to
    get considerably more complicated with the addition of calls to the
    extract_dependencies and check modules.

  - Provide a convenience function in extract_imports that smartly
    handles whether we pass a file or a directory (or indeed "-" to
    signal reading from stdin). A similar function is about to be added
    to extract_dependencies, and these will mesh well with how we are
    designing the --code (and future --deps) argument in the
    command-line interface.
This adds the --deps option (similar to --code) to tell fawltydeps where
to look for declared dependencies.

It also introduces two new options to choose which action fawltydeps
should perform:

 --list-imports: List the extracted imports and exit
 --list-deps: List the extracted dependencies and exit

When none of these is given, we currently default to --list-imports.

The available actions will change soon, as will the default action.
We will soon introduce a new default action to the command-line
interface. Change the tests that currently depend on --list-imports
being enabled by default to explicitly pass --list-import instead.

Also update the test names to reflect this.
This adds the missing piece of the command-line puzzle: a --check option
that enables the extracted imports to be compared against the extracted
dependencies, and a report to be generated over undeclared and unused
dependencies.
When none of --check/--list-imports/--list-deps were given, we used to
default to --list-imports. We now instead default to --check, which was
always meant to be the main objective for fawltydeps!

Add a couple of tests to verify that we do the Right Thing when fewer
options are explicitly passed.
When all test cases are named "test_main__*" the "main" signifier is no
longer meaningful. Promote the relevant action option ("list_imports",
"list_deps", or "check") into this prime position instead.
perform_actions() has a lot of considitionals that depend on whether an
action is enabled (i.e. present in the given `actions` set) or not.

These tests were done using a fairly terse syntax:

    if actions & {<one or more actions>}:

which is not very readable unless one is already familiar with the
shorthand for set intersections in Python, plus the fact and
non-empty/empty containers evaluate to true/false in a boolean context.

Aid readability by adding an `is_enabled()` inner helper function, which
does exactly the same, but hopefully in a more readable way.
These perform each half of --check. They are mutually exclusive with
--check, --list-imports, and --list-deps.
The run_fawltydeps() helper runs our command-line application inside a
subprocess. When it fails, we typically want to look at its cpatured
stdout/stderr while debugging the test failure. This include the
captured stdout/stderr in the test output when run_fawltydeps() fails.
This cuts boilerplate when a test case created a sample project with:
 - A single Python script that consists of a few import statements
 - A single requirements.txt that lists a few declared dependencies

Instead of manually constructing the files with their needed contents
inside each test case, this helper takes a list of imports and a list of
declared deps, and then manufactures a code.py with the given imports
and a requirements.txt with the given declared deps.
There are a couple of more test cases in test_cmdline.py that manually
constructed their own Python projects. Use the newly-merged
write_tmp_files() fixture to cut a few lines of boilerplate from these
test cases.
@jherland jherland force-pushed the jherland/cmdline-with-deps branch from 39d333e to 24bc421 Compare January 12, 2023 20:13
@jherland jherland merged commit 3370c44 into main Jan 12, 2023
@jherland jherland deleted the jherland/cmdline-with-deps branch February 8, 2023 11:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Connect modules together
3 participants