The data dictionary consists at least of the following columns:
- Data Set: Used when mapping in combination with Field to rename the column to Name.
- Field: Column name of the data frame to map to Name.
- Name: Column name that is unique throughout the data dictionary.
- Description: Description of the column name. This can be used to provide additional information when displaying the data frame.
- Type: Type the column should be cast to.
- Format: Format to use when values need to be converted to a string representation. The format string has to be a Python format string such as {:.0f}%
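To make the Format column more concrete, here is a minimal sketch of how a Python format string of this kind behaves when applied with the built-in str.format. The second format string and the sample values are made up for illustration and are not taken from the package.

```python
# Example values as they might appear in the Format column (illustrative only).
percent_format = "{:.0f}%"   # no decimals, followed by a literal percent sign
money_format = "{:,.2f}"     # thousands separator and two decimals

# str.format substitutes the value into the placeholder.
formatted_percent = percent_format.format(42.4)
formatted_money = money_format.format(1234.5)

print(formatted_percent)  # 42%
print(formatted_money)    # 1,234.50
```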
The data dictionary can either be loaded from a CSV file (example data dictionary) or from a data frame.
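As a rough illustration of what such a dictionary describes, the sketch below builds one from CSV text with plain pandas and applies the Field to Name mapping and the Type cast by hand. It deliberately does not use the package's own API: apply_data_dict is a hypothetical helper, and the data set name, fields and values are invented for the example.

```python
import io

import pandas as pd

# A tiny data dictionary in CSV form; the rows are made up for illustration.
DICT_CSV = """Data Set,Field,Name,Description,Type,Format
players,pts,Total Points,Points scored so far,int,{:.0f}
players,sel,Selected Percent,Share of teams that picked the player,float,{:.1f}%
"""

data_dict = pd.read_csv(io.StringIO(DICT_CSV))


def apply_data_dict(df, data_dict, data_set):
    """Hypothetical helper (not the package API): rename and cast columns."""
    rows = data_dict[data_dict["Data Set"] == data_set]
    # Field -> Name gives the rename mapping; Name -> Type gives the casts.
    renames = dict(zip(rows["Field"], rows["Name"]))
    types = dict(zip(rows["Name"], rows["Type"]))
    result = df.rename(columns=renames)
    # Only cast columns that are actually present after renaming.
    present = {name: t for name, t in types.items() if name in result.columns}
    return result.astype(present)


raw = pd.DataFrame({"pts": [10.0, 25.0], "sel": [12.34, 56.78]})
mapped = apply_data_dict(raw, data_dict, "players")
print(mapped.dtypes)
```

The same dictionary could equally be constructed directly as a data frame instead of being read from CSV; only the six columns listed above matter for this sketch.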
You can install using the pip package manager by running:
pip install pandas-datadict
Alternatively, you can install directly from GitHub:
pip install https://github.com/177arc/pandas-datadict/archive/master.zip
Download the source code by cloning the repository or by pressing Download ZIP on this page. Then install it by navigating to the directory that contains setup.py and running:
python setup.py install
For usage guidance and to try the package interactively, see the Usage Jupyter Notebook.
For the code documentation, please visit the documentation on GitHub Pages.
- Fork the repository on GitHub.
- Run the tests with
python -m pytest tests/
to confirm they all pass on your system. If the tests fail, try to find out why. If you aren't able to do this yourself, don't hesitate to create an issue on GitHub, contact me on Discord or send an email to [email protected].
- Either create your feature and then write tests for it, or do this the other way around.
- Run all tests again with
python -m pytest tests/
to confirm that everything still passes, including your newly added test(s).
- Create a pull request for the main repository's master branch.