Cookiecutter template for Python

This repository contains a Cookiecutter template for Python projects. While it is designed with data science in mind, it also aims to cover most common uses of Python.

Getting started

First, make sure Cookiecutter is installed:

pip install -U cookiecutter

Then, create a project based on this template:

cookiecutter https://github.com/sdsc-innovation/cookiecutter-python.git

This will prompt you for:

  • a project name, used in README.md;
  • a project slug (in kebab-case), used for the top-level directory name;
  • a module name (in snake_case);
  • whether you would like to use Ruff to format and analyze your code;
  • whether you would like to use pytest to write unit tests.

The module name may also be a short generic identifier, such as lib or helper, if you do not plan to use it externally.
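As a rough illustration of how the two naming conventions relate, the sketch below derives a kebab-case slug and a snake_case module name from a project name. The helper names (`to_kebab_case`, `to_snake_case`, `is_valid_module_name`) are hypothetical and not part of the template; the template may compute its defaults differently.

```python
import keyword
import re


def to_kebab_case(project_name: str) -> str:
    """Lowercase the name and join its alphanumeric words with hyphens."""
    words = re.findall(r"[a-z0-9]+", project_name.lower())
    return "-".join(words)


def to_snake_case(project_slug: str) -> str:
    """Turn a kebab-case slug into a snake_case candidate module name."""
    return project_slug.replace("-", "_")


def is_valid_module_name(name: str) -> bool:
    """Check that the name is a lowercase, importable Python identifier."""
    return (
        name.isidentifier()
        and name == name.lower()
        and not keyword.iskeyword(name)
    )


project_slug = to_kebab_case("My Data Project")
module_name = to_snake_case(project_slug)
print(project_slug)  # my-data-project
print(module_name)   # my_data_project
print(is_valid_module_name(module_name))  # True
```

A short generic name such as lib also passes this check, whereas a name starting with a digit or containing a hyphen does not.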

The generated folder structure is straightforward:

  • code is stored as an installable module, in src;
  • unit tests have a dedicated folder in tests;
  • notebooks have their own folder;
  • and, by default, a data folder is added, as a suggestion.
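To double-check a freshly generated project against the layout described above, a small script along these lines could be used. The folder names `notebooks` and `data` are assumptions based on the description, and `missing_folders` is a hypothetical helper, not something the template provides.

```python
from pathlib import Path

# Assumed top-level folder names; adjust if your generated project differs.
EXPECTED_FOLDERS = ("src", "tests", "notebooks", "data")


def missing_folders(project_root, expected=EXPECTED_FOLDERS):
    """Return the expected folders that are absent from the project root."""
    root = Path(project_root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # An empty list means every expected folder exists in the current directory.
    print(missing_folders("."))
```

If unit tests were disabled during instantiation, the tests folder may legitimately be absent.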

You can now create an empty repository on GitHub or GitLab (i.e. without an initial README file), and initialize it locally:

cd your-project
git init --initial-branch=main
git remote add origin https://github.com/you/your-project.git
git add .
git commit -m "Initial commit"
git push --set-upstream origin main

Besides adding code, both in .py files and in notebooks, make sure that dependencies are properly configured. More information about version specifiers can be found in PEP 440.

  • requirements.txt should contain the exact (a.k.a. pinned) versions of the required dependencies during development. Note that pipreqs can be used to infer automatically which packages are used in your code (whereas pip freeze would list all installed packages). Also note that -e . is used to install your newly created module in editable mode.
  • environment.yml, by default, delegates to requirements.txt.
  • pyproject.toml lists the install dependencies of your module under the project.dependencies section, which is where Setuptools reads them. These should be the minimal versions of the dependencies required for regular usage. Therefore, this typically excludes any development tool, such as ruff or pytest.
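To make the difference between pinned and minimal versions concrete, here is a minimal sketch that turns pinned lines from a requirements.txt into candidate lower bounds for pyproject.toml, leaving out development tools. The function name `pins_to_minimums`, the `DEV_TOOLS` set, and the package versions shown are illustrative assumptions. Using the pin as the lower bound is only a starting point, since the true minimum supported version is often older.

```python
import re

# Matches simple "name==version" pins; other specifier forms are ignored.
PIN_PATTERN = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)==([^\s;#]+)")

# Development tools that should not become install dependencies (assumed list).
DEV_TOOLS = {"ruff", "pytest", "pre-commit", "mypy"}


def pins_to_minimums(requirements_text: str, dev_tools=DEV_TOOLS) -> list[str]:
    """Convert pinned requirements into '>=' specifiers, skipping dev tools."""
    minimums = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and pip options such as "-e ."
        if not line or line.startswith(("#", "-")):
            continue
        match = PIN_PATTERN.match(line)
        if match is None:
            continue
        name, version = match.groups()
        if name.lower() not in dev_tools:
            minimums.append(f"{name}>={version}")
    return minimums


# Illustrative content only; versions are examples, not recommendations.
example_requirements = """\
-e .
numpy==1.26.4
pandas==2.2.2  # pinned for development
ruff==0.4.4
pytest==8.2.0
"""

print(pins_to_minimums(example_requirements))
# ['numpy>=1.26.4', 'pandas>=2.2.2']
```

The resulting specifiers follow the PEP 440 syntax and can be reviewed by hand before being copied into pyproject.toml.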

If unit tests are enabled during template instantiation, CI/CD configuration files are provided both for GitHub and GitLab. Keep only the one that applies to your scenario.

Please refer to the generated README.md for more details, in particular to install dependencies and register pre-commit hooks.

Design choices and tools

  • Python 3.10 is chosen as a minimum, as 3.9 will reach end-of-life in 2025. We recommend 3.11, as some packages may not fully support 3.12 yet.
  • A pyproject.toml file is the recommended way to store configuration.
  • PyPA's Setuptools is used as the build backend, as it is historically the most common solution. However, other options, such as Hatch or Poetry, are discussed in the Python Packaging User Guide.
  • No setup.py is provided, as typical projects do not need access to low-level Setuptools configuration. Note that neither setup.py nor Setuptools is deprecated as a build backend; building C extensions will still require this file.
  • requirements.txt is used to define development dependencies. An environment.yml is also provided for Conda users. Installation dependencies should be defined in pyproject.toml.
  • The src layout is used, to enforce proper use of editable installation during development.
  • Ruff is used for code formatting and linting, using mostly the default configuration.
  • pytest is used for unit testing, as the standard unittest module tends to be more verbose.
  • mypy is suggested for static type checking.
  • .gitlab-ci.yml is pre-configured to run unit tests on GitLab CI/CD.
  • .github/workflows/pytest.yml is provided to run unit tests on GitHub Actions.
  • pre-commit hooks are configured to enforce Ruff, and also some quality-of-life built-in tools.
  • No default license file is provided, as this template does not necessarily target open source projects. By default, copyright applies; choose a license if you would like to open your project!
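To illustrate the verbosity argument made for pytest above, the sketch below writes the same checks in both styles. The `clamp` function is a hypothetical example; in a generated project it would live in the module under src and be imported from the files in tests.

```python
import unittest


def clamp(value, low, high):
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(value, high))


# pytest style: plain functions and bare assert statements.
def test_clamp_within_bounds():
    assert clamp(5, 0, 10) == 5


def test_clamp_limits():
    assert clamp(-3, 0, 10) == 0
    assert clamp(42, 0, 10) == 10


# unittest style: the same checks need a class and assertion methods.
class ClampTest(unittest.TestCase):
    def test_clamp_within_bounds(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_clamp_limits(self):
        self.assertEqual(clamp(-3, 0, 10), 0)
        self.assertEqual(clamp(42, 0, 10), 10)
```

pytest can collect and run both styles, so existing unittest-based tests do not need to be rewritten.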
