Repo cleanup #22

Open
yann-mabsilico wants to merge 3 commits into oxpig:master from mabsilico:repo_cleanup

@yann-mabsilico

Context

I want to maintain a modified version of anarci while keeping the integration of future upstream changes as smooth as possible. This is made difficult by the extraneous files currently present in the repository, as well as by the less-than-ideal installation process.

What this PR does

Removes all non-essential files to reduce the repository to its bare minimum. Streamlines the installation process.

Details

muscle

The packaged muscle executable does not work on all systems. I'm running Debian (both stable and testing), and in both cases the packaged muscle executable immediately exits with a segmentation fault. Worse, in some contexts the broken muscle installed by anarci takes precedence over the working binary installed on the system, meaning that any call to muscle will then fail.
What the PR does: removes the packaged muscle binaries and their licence, removes the lines pertaining to their installation, and adds muscle as a requirement (as hmmer already is).
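
With muscle and hmmer both treated as external requirements, a quick pre-install check can confirm they are reachable. The following is a minimal sketch and not part of this PR; the executable names (`muscle`, and `hmmscan` as one of the programs shipped with HMMER) and the helper `missing_tools` are assumptions made for illustration.

```python
import shutil

# Executable names are assumptions: "muscle" for MUSCLE and "hmmscan" as one
# of the programs shipped with HMMER.
REQUIRED_TOOLS = ["muscle", "hmmscan"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing required executables: " + ", ".join(missing))
    else:
        print("All required executables found.")
```

`shutil.which("muscle")` also returns the path of the first match on `PATH`, which makes it easy to see which binary takes precedence when several are installed.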

The build/ directory

The directory and its files are created by distutils when setup.py is run. It serves no purpose in the repository.
What the PR does: removes the directory.

The build_pipeline/IMGT_sequence/, build_pipeline/curated_alignments/, build_pipeline/muscle_alignments/ and build_pipeline/HMMs/ directories

These are created by the installation process. Since the installation overwrites them anyway, they don't need to be in the repository.
What the PR does: removes all of them.

The lib/python/anarci/germlines.py file and lib/python/anarci/dat/ directory

lib/python/anarci/germlines.py is only used if the build part of setup.py fails, and lib/python/anarci/dat/ is never used.
Explanation: the installation part of setup.py copies the .py files to the appropriate location, but ignores the dat/ directory because the package_data option is commented out.
After that, the build part of setup.py overwrites both germlines.py and the dat/ directory.
What the PR does: removes both.
Note: with the modified installation (see "Installation process" below), there is actual value in keeping them, as the python module could then be installed without going through the build process.

.gitignore

I can only assume that the extraneous files ended up in the repository through the use of `git add .`. The PR adds a .gitignore file so that such a command (which is best avoided entirely) does not re-add them. It also keeps them out of the output of commands such as git status. The PR also adds an equivalent .hgignore for Mercurial users.

Installation process

The PR isolates the build part of the installation into the newly created build.sh script. If executed without arguments, it builds germlines.py and the dat/ directory and copies them to lib/python/anarci/. If executed with the clean argument, it removes all build-related files.
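
To make the effect of the clean argument concrete, here is a minimal Python sketch of the same idea. It is not the content of build.sh, which is a shell script. The list of paths is assembled from the directories and files discussed above, and treating exactly this list as what clean removes is an assumption. The names `GENERATED_PATHS` and `clean` are hypothetical.

```python
import shutil
from pathlib import Path

# Paths this PR describes as generated by the build. Treating exactly this
# list as what "clean" removes is an assumption, not read from build.sh.
GENERATED_PATHS = [
    "build",
    "build_pipeline/IMGT_sequence",
    "build_pipeline/curated_alignments",
    "build_pipeline/muscle_alignments",
    "build_pipeline/HMMs",
    "lib/python/anarci/germlines.py",
    "lib/python/anarci/dat",
]


def clean(repo_root):
    """Delete generated paths under repo_root and return those removed."""
    removed = []
    for relative in GENERATED_PATHS:
        target = Path(repo_root) / relative
        if target.is_dir():
            shutil.rmtree(target)  # generated directory
        elif target.exists():
            target.unlink()  # generated file
        else:
            continue  # nothing to remove for this entry
        removed.append(relative)
    return removed
```

Calling `clean(".")` from the repository root would remove whichever of these paths exist and leave everything else untouched.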
As a result, setup.py has been reduced and simply installs the ANARCI executable and the anarci python module (including the dat/ directory).
In addition, distutils has been replaced by setuptools to allow installation, and perhaps more importantly, uninstallation, via pip.
The INSTALL file has been modified to reflect these changes.
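
For readers unfamiliar with setuptools, a reduced setup.py along these lines might look like the sketch below. This is an illustration under stated assumptions, not the file from this PR: the version string, the script path `bin/ANARCI` and the `package_data` pattern are placeholders, and only the `lib/python/anarci` layout is taken from the paths mentioned above.

```python
# setup.py (illustrative sketch, not the file from this PR)
import sys

# Keyword arguments for setuptools.setup(). Apart from the lib/python layout,
# every value here is an assumption made for illustration.
SETUP_KWARGS = dict(
    name="anarci",
    version="0.0.0",                      # placeholder version
    package_dir={"": "lib/python"},       # packages live under lib/python/
    packages=["anarci"],
    package_data={"anarci": ["dat/*"]},   # hypothetical pattern for dat/
    scripts=["bin/ANARCI"],               # hypothetical path to the executable
)

# Only invoke setup() when a command (for example "install") was supplied,
# so the file can also be imported or run bare without side effects.
if __name__ == "__main__" and len(sys.argv) > 1:
    from setuptools import setup

    setup(**SETUP_KWARGS)
```

With a setuptools-based setup.py in place, `pip install .` run from the repository root installs the package, and `pip uninstall anarci` removes it again (assuming the distribution is named `anarci` as in the sketch).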

yann-mabsilico added 3 commits November 24, 2021 16:06
…nstallation and added it to the list of requirements.
… building (now in `build.sh`) and installation. Made it `pip` installable. Updated `INSTALL`.