adding first pass
jimboid committed Aug 23, 2024
1 parent fd787a6 commit 8115141
Showing 9 changed files with 598 additions and 0 deletions.
34 changes: 34 additions & 0 deletions code_quality/1_code_quality.md
@@ -0,0 +1,34 @@
# Code Quality 101

This course is an introduction to using tools such as Pylint, Black and isort to increase the quality of software that you may write in the future.


### Prerequisites

* Access to a computer that has Python installed
* Familiarity with any programming language (basic Python is beneficial)
* Basic familiarity with how to start command-line based tools


### Summary

As software engineers, or scientists who write scripts, we produce code for many purposes, from properly distributable software packages and tools to simple simulation setup protocols shared with colleagues. It is best practice to write code with quality, sharing and maintainability in mind from the outset, even if at the start you do not intend to share it.

In this course we will focus on tools for code quality in the Python and git toolchains, but you can find similar tools for other toolchains out in the wild.


## Code Quality - Why do we care?

Imagine you are a new, enthusiastic researcher. You have just landed a new research post (PhD or PDRA) and on your first day you are introduced to your project. Up until now, the project has been described as an exciting piece of science that extends previous work in a new, cutting-edge direction.

You are then given a compressed file archive and told "this is the software we have for doing this research, it has been developed by several folks over the years". You are probably thinking "ok - awesome, can't wait to get started". You go away and start looking through the code, and you find that it is very poorly structured, difficult to read and written in several different styles all smashed together. And here is the kicker: there is no version control either!

To stop this endless cycle of drama, software engineers have for many years put significant community effort into standardising the way we write software, so that such issues become a thing of the past.

One such standard from the Python community is PEP 8, the style guide for Python code. [PEPs](https://peps.python.org/) (Python Enhancement Proposals) are the Python community's documents for proposing and recording agreed ways of doing things, and PEP 8 sets out the conventions that Python developers are encouraged to follow. There are many useful PEPs, but in this workshop we will focus on [PEP 8](https://peps.python.org/pep-0008/).

Activity: Have a look over the two links above just to appreciate what information they contain.

The reason we use standards is that they lead to cleaner code that is easier to share. If we all learn to write and read code in a particular format, then contributing to other projects becomes much easier. It also means our code is a lot easier to maintain as it grows, both in size and in community.

This course is designed to give you an introduction to some of the types of tools that are available to us as developers and how to get started using them. Often the difficult bit is simply getting going rather than learning about the advanced features!
95 changes: 95 additions & 0 deletions code_quality/2_pylint.md
@@ -0,0 +1,95 @@
# Introducing Pylint

You can find a much more exhaustive look at Pylint in the [Documentation](https://pylint.readthedocs.io/en/stable/).

PEP 8 is a rather large document, and we are not going to get very far in improving our ways of working if we require all developers to memorise a standard like that! This is where Pylint comes in: we use it to check our code against a set of coding standards, and its default checks broadly follow PEP 8. Every time you work on Python code, you can run Pylint to check it before you commit it to version control, which means you don't have to hold the whole of PEP 8 in your head!


### Installing Pylint

Installing Pylint is straightforward; we can simply use pip, the Python package installer.

```bash
pip install pylint
```

If you are working in a conda environment, you can use pip to install Pylint into it. You must activate the environment before running "pip install pylint", otherwise it will not be installed into the correct path. This is particularly useful if you have many projects, some of which have different Python package versions and/or dependency requirements.
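
If you are unsure which environment pip has installed into, a quick check from Python can help. This is a minimal sketch using only the standard library; the function name describe_environment is our own and not part of Pylint or conda.

```python
import importlib.util
import sys


def describe_environment(package="pylint"):
    """Report the active interpreter, its environment and whether a package is importable."""
    return {
        # Full path of the Python executable that is currently running.
        "interpreter": sys.executable,
        # Root directory of the active environment (for example a conda environment).
        "prefix": sys.prefix,
        # True if the package can be imported by this interpreter.
        "installed": importlib.util.find_spec(package) is not None,
    }


info = describe_environment()
print(f"Interpreter : {info['interpreter']}")
print(f"Environment : {info['prefix']}")
print(f"pylint found: {info['installed']}")
```

If the environment path printed is not the one you expected, activate the correct environment and install again.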


### Getting started with Pylint

If the above installation went smoothly (and it really should have, since it is just a pip-installed package), you should be able to check that it is installed by running:

```bash
pylint --version
```

This is also a good way to see which specific version is installed. We can get some help with basic functionality on the command-line by running:

```bash
pylint --help
```

It is often more helpful to run the longer help output to get a quick idea of what you can do with pylint on the command-line without having to go into the documentation:


```bash
pylint --long-help
```

A useful part of the help output is towards the end of the help message. It shows the following, which gives you an idea of what kinds of issue the messages returned by Pylint refer to.

```bash
Output:
Using the default text output, the message format is :
MESSAGE_TYPE: LINE_NUM:[OBJECT:] MESSAGE
There are 5 kind of message types :
* (C) convention, for programming standard violation
* (R) refactor, for bad code smell
* (W) warning, for python specific problems
* (E) error, for probable bugs in the code
* (F) fatal, if an error occurred which prevented pylint from doing
further processing.
```

For example, if Pylint had this as one of its outputs, "C0114: Missing module docstring (missing-module-docstring)", the "C" tells you it is a convention-related issue, so it is likely to be a violation of the PEP 8 standard. If you don't yet know how to fix it, you can go to the PEP 8 document linked above and look up the correct way. The other letters indicate refactoring suggestions, warnings, probable bugs or fatal errors that should be fixed.

You should, however, consult the documentation linked above for very specific information about Pylint and for advanced topics not covered in this introduction.

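
To make the message layout concrete, here is a minimal sketch that splits one line of Pylint's text output into its parts and looks up the category from the first letter of the message ID. It assumes the default message layout of recent Pylint releases (path, line, column, message ID, text, symbolic name); the file name example.py and the helper parse_message are our own illustrations.

```python
import re

# Meaning of the first letter of a message ID, as listed in the Pylint help text.
CATEGORIES = {
    "C": "convention",
    "R": "refactor",
    "W": "warning",
    "E": "error",
    "F": "fatal",
}

# Matches lines such as:
# example.py:1:0: C0114: Missing module docstring (missing-module-docstring)
MESSAGE_PATTERN = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):(?P<column>\d+): "
    r"(?P<msg_id>[CRWEF]\d{4}): (?P<text>.*) \((?P<symbol>[\w-]+)\)$"
)


def parse_message(line):
    """Split one line of Pylint text output into its parts, or return None."""
    match = MESSAGE_PATTERN.match(line.strip())
    if match is None:
        return None
    parts = match.groupdict()
    parts["line"] = int(parts["line"])
    parts["column"] = int(parts["column"])
    # The first character of the message ID identifies the message type.
    parts["category"] = CATEGORIES[parts["msg_id"][0]]
    return parts


sample = "example.py:1:0: C0114: Missing module docstring (missing-module-docstring)"
result = parse_message(sample)
print(result["msg_id"], "is a", result["category"], "message on line", result["line"])
```

The symbolic name in brackets (here missing-module-docstring) is usually the easiest thing to search for in the Pylint documentation.
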
### Using Pylint on an example

Enough with the intro, let's take Pylint for a spin. We have provided a code example randomly discovered on the internet, since such examples are often badly written, and this one will not disappoint!

In your terminal, change into the examples directory for the code quality part of the workshop. For example, if you are already in the repository path in your terminal:

```bash
cd code_quality/examples
```

It is worth having a look at the example program to see what the code looks like, but you don't need to figure out exactly what it does for this tutorial. The cat command will print the contents of the file to your terminal:

```bash
cat 1_random_code.py
```

Now run Pylint on this file:

```bash
pylint 1_random_code.py
```

What do you see and what does it mean?

You will notice that a bunch of bad indentation warnings and a few other warnings are listed, and you will also see that Pylint grades the code out of 10. This one is pretty shocking! If you are not sure what these mean or how to fix them, consult the [PEP 8](https://peps.python.org/pep-0008/) standard, which will tell you specifically what the code should have looked like.

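
The score out of 10 can be reproduced by hand. The sketch below assumes the default scoring expression documented for recent Pylint releases (it is configurable through the evaluation option, and older releases did not clamp the result at zero). The function name pylint_score and the example counts are our own, not taken from the workshop files.

```python
def pylint_score(error, warning, refactor, convention, statements, fatal=0):
    """Approximate Pylint's default score from message counts (an assumption, see above)."""
    if statements <= 0:
        raise ValueError("the number of statements must be positive")
    if fatal:
        # A fatal message means Pylint could not analyse the code properly.
        return 0.0
    # Errors count five times as much as the other message types.
    penalty = (5 * error + warning + refactor + convention) / statements
    return max(0.0, 10.0 - penalty * 10)


# Hypothetical file: 16 statements, 1 warning and 3 convention messages.
print(pylint_score(error=0, warning=1, refactor=0, convention=3, statements=16))
```

Because the penalty is divided by the number of statements, a handful of messages hurts a short script far more than a large module.
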
Activity: Use the PEP 8 standard and Pylint together to fix this code. Can you get this to zero warnings and a code score of 10?

Activity: Can you find a random program on the internet or even one you have written yourself and apply Pylint to it?

That is all there is to using Pylint. As you can imagine, this was a small piece of code. Imagine you had just inherited thousands, tens of thousands or even a million lines of code. You aren't going to fix it all by hand any time soon, and this outstanding effort is what we call technical debt. Technical debt is the accumulated cost of the rework that builds up when code is written quickly rather than well. When deciding whether to pay it off, we weigh the effort required against the benefit, and for massive code bases, standardising badly written code by hand would be a big effort.

This is where the next tool comes in!
102 changes: 102 additions & 0 deletions code_quality/3_black.md
@@ -0,0 +1,102 @@
# Introducing Black

You can find a much more exhaustive look at Black in the [Documentation](https://black.readthedocs.io/en/stable/).

Wouldn't it be great if we could automate some of the changes needed to improve the quality of our code?

Black is a really useful utility that makes some of these changes automatically. It mainly sorts out the code formatting issues that would crop up as convention messages in Pylint.


### Installing Black

Again, as with Pylint, we can quickly and easily install Black with pip.

```bash
pip install black
```

Also as with Pylint, you can pip install Black into any conda environment that you may be using, should you have many projects with different dependencies.


### Getting started with Black

As with Pylint, you can check the package is installed correctly and find out what version you have by running:

```bash
black --version
```

You can also get some pointers on the command-line for how to use the tool by running:

```bash
black --help
```

Black is generally run in one of two ways.

The first way is to have it run through your code and only report what it would change, like this:

```bash
black --check --target-version=py35 <file-or-directory-to-check-goes-here>
```

The second way is to let it reformat your code, so that you get automated code quality improvements, like this:

```bash
black --target-version=py35 <file-or-directory-to-check-goes-here>
```

An important caution is needed here: Black is not always right! It can sometimes make changes that, although they meet a style consistent with PEP 8, look really bad to a human working on the source. You tend to see these kinds of artefacts when processing certain types of formatted lists or strings, where the formatting was designed to make them readable but not necessarily standard. This is still rare.
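
When hand-aligned formatting really is more readable, Black can be told to leave a region alone with its "fmt: off" and "fmt: on" comments. The sketch below is our own illustration (the ROTATION matrix and trace function are not from the workshop examples): without the comments, Black would be expected to strip the alignment spaces inside the matrix.

```python
# fmt: off
# Columns are aligned by hand so the matrix structure is easy to read.
ROTATION = [
    [ 0.0, -1.0,  0.0],
    [ 1.0,  0.0,  0.0],
    [ 0.0,  0.0,  1.0],
]
# fmt: on


def trace(matrix):
    """Return the sum of the diagonal elements of a square matrix."""
    return sum(matrix[i][i] for i in range(len(matrix)))


print(trace(ROTATION))
```

Use this sparingly: every protected region is a place where the automatic formatting no longer applies.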

It is important to still run a linter on the changes that Black makes, so in the case of Python, you would run Pylint after running Black.

### Example of using Black

Make sure your terminal is still in the examples directory. If you are in the repository main directory:

```bash
cd code_quality/examples
```

Once here, let's run Pylint again on the second example file. This is the same program you had in the Pylint example, with its original mistakes restored.

```bash
pylint 2_random_code.py
```

You will (hopefully) recognise the errors and warnings from the previous example. This time we are not going to fix them by hand. We are going to use Black to fix as many as possible as a first pass.

```bash
black --check --target-version=py35 2_random_code.py
```

Hopefully you will see Black's usual "Oh no!" message, which appears when it has found issues with your code and reports that it would reformat the file. This is not very useful on its own if you want to see what it would do beforehand. To see what it would change, use the --diff flag on the command-line:

```bash
black --check --diff --target-version=py35 2_random_code.py
```

You will see a typical diff, like the output of the Linux diff utility, with lines beginning with a "-" denoting lines that have been removed and lines beginning with a "+" denoting lines that have been added. In this case Black is dealing mostly with bad indentation and whitespace issues as per the PEP 8 standard, so you will see the lines changing in those ways.

Now, if you are happy with what you see, you can run the command again without the --check and --diff flags and it will change the file.
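
If you have not read a diff before, the following sketch produces a tiny one using Python's standard difflib module. It only mimics the layout of the output; the before and after snippets and the file names before.py and after.py are made up for illustration.

```python
import difflib

# Two versions of the same small function, one line of text per list entry.
before = ["def add(a,b):\n", "  return a+b\n"]
after = ["def add(a, b):\n", "    return a + b\n"]

# unified_diff yields header lines, then "-" lines (removed) and "+" lines (added).
diff = list(difflib.unified_diff(before, after, fromfile="before.py", tofile="after.py"))
print("".join(diff), end="")
```

The lines starting with "-" show the badly spaced original, and the lines starting with "+" show the reformatted replacement.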

```bash
black --target-version=py35 2_random_code.py
```

We don't always need to check and look at the diff; we are doing this for your benefit so you can see what the tool is doing to the code. In reality this is fully automatable, as Pylint is, and we only look deeper when issues arise, which is rare.

Now let's check that file with Pylint again, since we should always run Pylint after tools have modified code:
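
As a rough idea of what that automation could look like, here is a minimal sketch that runs Black and then Pylint on a target from Python. The helper names build_commands and run_checks are our own, and the sketch assumes both tools are installed in the active environment. Nothing is executed until you call run_checks yourself.

```python
import subprocess
import sys


def build_commands(target):
    """Return the formatter and linter commands to run, in order."""
    return [
        # Reformat first, so the linter reports only what Black could not fix.
        [sys.executable, "-m", "black", target],
        [sys.executable, "-m", "pylint", target],
    ]


def run_checks(target):
    """Run Black, then Pylint, and return the exit code of each step."""
    return [
        subprocess.run(command, check=False).returncode
        for command in build_commands(target)
    ]
```

Calling run_checks("2_random_code.py") from the examples directory would reformat the file and then lint it; a non-zero exit code from Pylint means it still has messages to report.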

```bash
pylint 2_random_code.py
```

What do you see now, compared to the very first time?

You will probably notice that it has fixed a whole bunch of things but a few minor things remain. Usually these will be things like missing docstrings, since Black cannot interpret what your code is supposed to do and add documentation (though this might change in the post-ChatGPT era!).

Activity: Can you find a random program on the internet or even one you have written yourself and apply Black to it?

The upshot is that you have a lot less work to do on minor issues than if you only used Pylint and manual fixing, so the cost of maintaining high-quality code is even lower.

101 changes: 101 additions & 0 deletions code_quality/4_isort.md
@@ -0,0 +1,101 @@
# Introduction to isort

You can find a much more exhaustive look at isort in the [Documentation](https://pycqa.github.io/isort/).

Imports are another area of code that is often overlooked, yet tidying them is an important part of automating the process of making your code better and more readable.

isort is a tool for dealing with the module imports part of a Python script. It sorts the imports alphabetically and separates different types of imports into groups of like types, which makes it very simple to keep track of them.
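
To give a feel for the effect, the sketch below shows a small set of imports before and after sorting. The "after" layout is roughly what we would expect from isort's default settings, not output captured from the tool, and it uses only standard library modules so that it runs anywhere (so there is a single group rather than several).

```python
# Before (unsorted, with imports from the same module on separate lines):
#
#     from os import path
#     import sys
#     from os import getcwd
#     import json
#
# After (plain imports first, then "from" imports, each sorted alphabetically
# and with the two "from os" lines merged into one):
import json
import sys
from os import getcwd, path

# Use every import so the example is a complete, runnable script.
summary = {"cwd_exists": path.isdir(getcwd()), "python": sys.version_info.major}
print(json.dumps(summary))
```

In a real project you would typically also see third-party and local imports, which isort places in their own groups separated by blank lines.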

### Installing isort

As with the other tools, we can simply install isort using pip like this:

```bash
pip install isort
```

You can also install it with pip into a conda environment as we have discussed with the other tools.


### Getting started with isort

As with Pylint and Black, you can check the package is installed correctly and find out what version you have by running:

```bash
isort --version
```

You can also get some pointers on the command-line for how to use the tool by running:

```bash
isort --help
```

It is possible to run isort in a number of ways and the usual basic syntax goes something like this:


```bash
isort <file/s or directory here>
```

Or for all files in the current directory you could just do:

```bash
isort .
```

As with Black, it is possible to have isort show a diff of what it will change before letting it do so, again using the --diff flag.

```bash
isort --diff <file/s or directory here>
```

When running isort automatically without checking the output (which is what you want), you can instruct it to only save changes if the resulting file contains no syntax errors. We do this by setting the --atomic flag.
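
The idea behind this kind of safety check can be sketched in a few lines of Python: only keep a rewritten file if it still parses. This illustrates the concept only and is not isort's actual implementation; the helper names is_valid_python and safe_rewrite are our own.

```python
import ast


def is_valid_python(source):
    """Return True if the source text parses as Python, False otherwise."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


def safe_rewrite(original, rewritten):
    """Keep the rewritten source only if it is still syntactically valid."""
    return rewritten if is_valid_python(rewritten) else original


good = safe_rewrite("import sys\nimport os\n", "import os\nimport sys\n")
bad = safe_rewrite("import sys\nimport os\n", "import os import sys\n")
print(repr(good))
print(repr(bad))
```

The first call accepts the sorted imports, while the second falls back to the original text because the rewritten version does not parse.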

```bash
isort --atomic <file/s or directory here>
```

### Example with isort

isort is a pretty simple utility that only acts on the imports in a Python application. This example comes from the isort documentation, and it illustrates perfectly how isort works!

Again, let's make sure you are in the correct directory. If you are in the repository root:

```bash
cd code_quality/examples
```

First, let's have a look at the contents of the file we are about to sort out:

```bash
cat 3_random_import.py
```

You will see that there is not much going on in this program, and if you tried to run it, it probably wouldn't do anything; it exists purely as an example for sorting. As you will see, this file is all over the place: imports of functions from the same package are on different lines, and there is no sorting or grouping of similar imports. This is bad practice because, if you were maintaining this package, it would take you a lot longer to identify which imports to change when your code changes. This gets a lot, lot worse for bigger projects.

So let's see what isort will actually change:

```bash
isort --diff 3_random_import.py
```

As you can see, as with Black, the diff shows which lines are removed (minus) and which are added (plus). Comparing the minus lines with the plus lines, you can see that the new imports are both better organised and generally shorter than in the original file.

To have isort run and change the file:

```bash
isort 3_random_import.py
```

Then if you have a look in the file again with cat:

```bash
cat 3_random_import.py
```

You will see it is much, much more organised and easier to read.

Activity: Can you find a random program on the internet or even one you have written yourself and apply isort to it?

