A simple collection of well-documented methods for performing data science on a wide range of data.
The toolbox contains methods for different tasks within different domains, following the naming convention datasciencetoolbox.<domain>.<task>.<method>
The most important domain is 'general', which contains generic implementations of methods for fundamental tasks (e.g. an implementation of a nearest-neighbor classifier).
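To make the nearest-neighbor example concrete, here is a minimal sketch of what such a generic method could look like. The function name `nearest_neighbor_predict`, its parameters, and the choice of Euclidean distance are assumptions for illustration only; they are not taken from the toolbox itself.

```python
import math
from collections import Counter


def nearest_neighbor_predict(train_points, train_labels, query, k=1):
    """Predict a label for `query` by majority vote among its k nearest points.

    Hypothetical illustration, not the toolbox's actual API.
    """
    if len(train_points) != len(train_labels):
        raise ValueError("train_points and train_labels must have equal length")
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")

    # Order training indices by Euclidean distance to the query point.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )

    # Count the labels of the k closest points; on a tie, the label that
    # was encountered first (the closer neighbor) wins.
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["a", "a", "b", "b"]
prediction = nearest_neighbor_predict(points, labels, (0.2, 0.4))
```

Keeping the method free of domain-specific assumptions (it only needs points, labels, and a distance) is what would let more specialized domains reuse it.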
"Domain name examples":
- General
- Time Series
- ...
Domains may use methods from other domains depending on how specialized they are. Most domains use something from the general domain.
"Task name examples":
- preprocessing
- modeling
- ...
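The naming convention can be illustrated with a small helper that assembles a dotted name from a domain, a task, and a method. This is only a sketch: the helper `qualified_name`, the rule of lowercasing names and joining words with underscores (so that "Time Series" becomes `time_series`), and the method name used in the example are all assumptions, not part of the toolbox.

```python
PACKAGE = "datasciencetoolbox"


def _to_identifier(name):
    """Lowercase a display name and join its words with underscores (assumed rule)."""
    ident = "_".join(name.strip().lower().split())
    if not ident.isidentifier():
        raise ValueError(f"cannot turn {name!r} into a module name")
    return ident


def qualified_name(domain, task, method):
    """Build datasciencetoolbox.<domain>.<task>.<method> as a dotted string."""
    parts = [_to_identifier(p) for p in (domain, task, method)]
    return ".".join([PACKAGE, *parts])


# "nearest_neighbor" is a hypothetical method name used only for illustration.
name = qualified_name("General", "modeling", "nearest_neighbor")
```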
Follow these guidelines when contributing to the data science toolbox.
Tests are located in the test directory. Run them directly with pytest:
pytest
or use tox to also test the installation of the package:
tox
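As a rough sketch of what a test in the test directory might look like, the file below uses plain `assert` statements in functions whose names start with `test_`, which pytest collects automatically. The file name and the function under test, `min_max_scale`, are hypothetical; it is defined inline here only so that the example is self-contained, whereas a real test would import the method from the package.

```python
# test/test_min_max_scale.py  (hypothetical file name)


def min_max_scale(values):
    """Rescale values linearly to the interval [0, 1] (hypothetical method)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # A constant input has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_scales_to_unit_interval():
    assert min_max_scale([2, 4, 6]) == [0.0, 0.5, 1.0]


def test_constant_input_maps_to_zero():
    assert min_max_scale([3, 3, 3]) == [0.0, 0.0, 0.0]
```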
...