This is a template repository for building Python packages for data pipelines.
Use this template as a starting point for new projects. You only need to
replace mip_start with the name of your project in a few places:
- References in this README file.
- The name of the root, package, and testing directories (in PyCharm, right-click and then Refactor > Rename..., or just press SHIFT + F6).
- The name of the unit testing script. Be careful not to name it the same as its parent folder!
- References in pyproject.toml.
- Exceptions in .gitignore.
Make sure to keep the "test_" prefix when renaming the testing directory and the unit testing scripts.
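A quick way to confirm that no reference was missed is to scan the project tree for the old name. The following is a minimal sketch and not part of the template: the function name find_leftovers and the list of skipped directories are assumptions made for illustration.

```python
from pathlib import Path


def find_leftovers(root, old_name="mip_start", skip_dirs=(".git", ".venv", "__pycache__")):
    """Return (path, where) pairs for names or text lines that still contain `old_name`."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        # Ignore anything that lives under a skipped directory (hypothetical defaults).
        if any(part in skip_dirs for part in path.relative_to(root).parts):
            continue
        # A file or directory whose own name still carries the old name.
        if old_name in path.name:
            hits.append((path, "name"))
        if path.is_file():
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            for number, line in enumerate(text.splitlines(), start=1):
                if old_name in line:
                    hits.append((path, f"line {number}"))
    return hits
```

Calling find_leftovers(".") from the project's root folder after renaming should return an empty list. Note that test_mip_start is also reported, since its name contains mip_start.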
- docs: Hosts documentation (in addition to readme files and docstrings) of the project.
- mip_start: Contains the Python package that solves the problem. It contains scripts that define the input and the output data schemas, the solution engine, data manipulations, and other auxiliary modules.
- test_mip_start: Hosts test suites and test data sets used for testing the solution throughout the development process.
pyproject.toml is used to build the distribution files of the package (more information here).
- Clone the repository on your machine and navigate to its folder (same level as this readme)
- Create a Python virtual environment, e.g. pythonX.Y -m venv <venv_name>
  - Make sure to use a Python version X.Y compatible with pyproject.toml's requires-python
  - Replace <venv_name> by a name for your virtual environment (e.g. .venv)
- Activate the virtual environment
  - Linux/macOS: source <venv_name>/bin/activate
  - Windows (cmd): <venv_name>\Scripts\activate or .\<venv_name>\Scripts\activate
  - If necessary, call deactivate (Linux/macOS/Windows) to deactivate an already activated virtual environment
- Install dependencies: pip install -r requirements.txt
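Before creating the environment, it can help to check that the chosen interpreter satisfies requires-python. The sketch below is a simplified illustration, assuming the specifier only uses plain numeric versions with the operators >=, <=, ==, !=, > and < (for example >=3.9,<3.13). The helper name satisfies is hypothetical, and the code does not implement the full version specifier syntax.

```python
import operator
import sys

# Operators handled by this simplified sketch; two-character ones are listed
# first so that ">=" is matched before ">".
_OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def _pad(parts, width=3):
    """Pad a version tuple with zeros so that (3, 11) compares like (3, 11, 0)."""
    parts = tuple(parts)
    return parts + (0,) * (width - len(parts))


def satisfies(requires_python, version=None):
    """Return True if `version` meets every comma-separated clause."""
    if version is None:
        version = sys.version_info[:3]  # the running interpreter
    version = _pad(version)
    for clause in requires_python.split(","):
        clause = clause.strip()
        for symbol, compare in _OPERATORS.items():
            if clause.startswith(symbol):
                wanted = _pad(int(p) for p in clause[len(symbol):].strip().split("."))
                if not compare(version, wanted):
                    return False
                break
        else:
            # Anything else (e.g. "~=") is outside the scope of this sketch.
            raise ValueError(f"unsupported clause: {clause!r}")
    return True
```

For example, satisfies(">=3.9,<3.13") checks the running interpreter against that range, where the range itself is only an illustrative value and should be replaced by the one in pyproject.toml.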
For the scripts in this project to run properly, the Python interpreter must be able to locate modules/packages in the root folder of this repository (the parent folder of this readme file). This ensures that all import statements across the project resolve correctly.
If you're using PyCharm, you can skip this section because it handles that for you under the hood.
Otherwise, you need to manually add the project's root folder to the interpreter's path. With your virtual environment activated, and from the project's root folder, run:
- Linux/macOS: pwd > "$(python -c 'import site; print(site.getsitepackages()[0])')/path_to_root.pth"
- Windows (cmd):
  - Run python -c "import site; print(site.getsitepackages())". This will output some paths on the console; copy the first path that contains site-packages (without quotes).
  - Run cd > "<path>\path_to_root.pth" (make sure to use double quotes). Replace <path> by what you copied in the previous step.
This locates the appropriate site-packages directory under your virtual environment and creates a file path_to_root.pth with one line that points to the root folder. This ensures that any module/package in this repository can be found (i.e. imported) starting from the project's root folder.
To make sure it worked, you can run python -c "import sys; print(sys.path)" and confirm that the path to the root folder of your project shows up among the output.
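The same steps can also be done from Python on any platform. This is a minimal sketch, assuming the virtual environment is activated. The function name write_path_file is hypothetical, while site.getsitepackages() and the path_to_root.pth file name come from the commands above.

```python
import site
from pathlib import Path


def write_path_file(root=".", site_packages=None):
    """Write path_to_root.pth into site-packages with one line pointing at `root`."""
    if site_packages is None:
        # Same rule as the manual steps: the first path that contains "site-packages".
        site_packages = next(p for p in site.getsitepackages() if "site-packages" in p)
    pth_file = Path(site_packages) / "path_to_root.pth"
    # The file holds a single line: the absolute path of the project's root folder.
    pth_file.write_text(f"{Path(root).resolve()}\n", encoding="utf-8")
    return pth_file
```

Calling write_path_file() from the project's root folder and then starting a new interpreter should make the root folder show up in sys.path.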