Comparing changes

base repository: open2c/pairtools
base: v1.0.1
head repository: open2c/pairtools
compare: master

Commits on Oct 4, 2022

  1. 16ffd69

Commits on Oct 7, 2022

  1. Important fixes of splitting schema; dedup comment removed; pairs lines are always split after rstrip newline (#148)

    agalitsyna authored Oct 7, 2022
    c265e11
  2. version number update

    agalitsyna committed Oct 7, 2022
    ef696d4

Commits on Oct 12, 2022

  1. 3ba2423

Commits on Oct 25, 2022

  1. select regex upd

    Select based on regex instead of string substitutions. Robust if the column name is a substring of an existing one.
    agalitsyna authored Oct 25, 2022
    1e8d518
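
The motivation stated in the "select regex upd" commit can be illustrated with a small sketch. The column names `mapq1` and `mapq1_raw`, the `COLS[8]` replacement text, and the helper `substitute_column` are hypothetical and chosen only for illustration; the pattern actually used in `pairtools/lib/select.py` may differ.

```python
import re


def substitute_column(condition, column, replacement):
    """Replace `column` in `condition` only where it is a whole identifier."""
    # \b marks a word boundary, so "mapq1" does not match inside "mapq1_raw"
    pattern = r"\b" + re.escape(column) + r"\b"
    # A callable replacement avoids escape processing of `replacement`
    return re.sub(pattern, lambda _match: replacement, condition)


condition = "mapq1_raw >= 30 and mapq1 >= 30"

# Plain string substitution also rewrites the longer column name
naive = condition.replace("mapq1", "COLS[8]")
# Regex substitution leaves the longer column name untouched
robust = substitute_column(condition, "mapq1", "COLS[8]")

print(naive)   # COLS[8]_raw >= 30 and COLS[8] >= 30
print(robust)  # mapq1_raw >= 30 and COLS[8] >= 30
```

The plain `str.replace` call corrupts `mapq1_raw`, which is the substring problem the commit message describes.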

Commits on Nov 8, 2022

  1. 7c3d18e
  2. Modified scaling.bins_pairs_by_distance to fix error thrown by narrow distance ranges. (#152)

    * Fix IndexError when dist_range maximum is less than the largest distance present in pairs
    * upper limit set to np.iinfo(np.int64).max (not np.inf)
    itsameerkat authored Nov 8, 2022
    63db477
  3. changelog update

    agalitsyna committed Nov 8, 2022
    a99b7d8
  4. Python version for deploy set to 3.10, because we don't support python 3.11 (pysam does not support it).

    agalitsyna committed Nov 8, 2022
    caf38db
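
The two bullet points of the #152 commit can be sketched as follows. This is a minimal illustration, not the code of `scaling.bins_pairs_by_distance`: the function `bin_by_distance` and its arguments are hypothetical. It shows why an integer sentinel is used as the upper limit (appending `np.inf` would upcast the integer bin edges to floats) and how an open-ended last bin keeps distances above the requested range from producing an out-of-range index.

```python
import numpy as np


def bin_by_distance(dists, dist_bins):
    """Count distances per bin, with a final open-ended bin above the last edge."""
    dist_bins = np.asarray(dist_bins, dtype=np.int64)
    # Integer sentinel keeps the edges int64; np.inf would upcast them to float64
    edges = np.append(dist_bins, np.iinfo(np.int64).max)
    # Index of the bin whose left edge is <= distance; -1 means below the first edge
    idx = np.searchsorted(edges, dists, side="right") - 1
    # Guard the extreme case of a distance equal to the sentinel itself
    idx = np.minimum(idx, len(edges) - 2)
    counts = np.bincount(idx[idx >= 0], minlength=len(edges) - 1)
    return edges, counts


# 5000 is larger than the last requested edge (100) and lands in the overflow bin
edges, counts = bin_by_distance(np.array([5, 50, 5000]), dist_bins=[1, 10, 100])
print(edges.dtype, counts)  # int64 [1 1 1]
```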

Commits on Nov 16, 2022

  1. e0d0223
  2. Add new forgotten test file

    Phlya committed Nov 16, 2022
    3bfe8e3

Commits on Nov 29, 2022

  1. speed up indexing

    Phlya committed Nov 29, 2022
    ce81556

Commits on Nov 30, 2022

  1. 41f2137

Commits on Dec 1, 2022

  1. d6bf4af

Commits on Dec 2, 2022

  1. connected components in groups

    Phlya committed Dec 2, 2022
    9803e49
  2. c73bc34
  3. 9e51c9f
  4. fix flake8

    Phlya committed Dec 2, 2022
    8b21cc1
  5. small fixes

    Phlya committed Dec 2, 2022
    45eefd1
  6. Fix!

    Phlya committed Dec 2, 2022
    ea98bce
  7. Don't lose some pairs!

    Phlya committed Dec 2, 2022
    3d4d338

Commits on Dec 6, 2022

  1. c8f7eed
  2. sort by arbitrary columns

    Phlya committed Dec 6, 2022
    7a4d982
  3. 314e0b2
  4. fa5d165
  5. Fix and simplify

    Phlya committed Dec 6, 2022
    af709b0
  6. remove unnecessary print

    Phlya committed Dec 6, 2022
    16669ff

Commits on Dec 8, 2022

  1. Add dtypes to pairsam format, use them in sort

    Fliamer committed Dec 8, 2022
    810b49d
  2. ce888c1

Commits on Dec 13, 2022

  1. 7a313e8
  2. 806c9ba

Commits on Dec 19, 2022

  1. c37a6df

Commits on Dec 22, 2022

  1. 4a6a76e

Commits on Feb 4, 2023

  1. Stats docs (#171)

    * stats docs
    
    * np.int problem fix
    
    ---------
    
    Co-authored-by: Phlya <flyamer@gmail.com>
    agalitsyna and Phlya authored Feb 4, 2023
    0c94f7b
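
For context on the "np.int problem fix" bullet: `np.int` was an alias of the builtin `int` that NumPy deprecated in 1.20 and removed in 1.24, so code that references it fails with an `AttributeError` on recent NumPy. The commit listing does not show which replacement was chosen; the sketch below assumes an explicit `np.int64` dtype, and the variable names are hypothetical.

```python
import numpy as np

# Before (fails on NumPy >= 1.24): np.zeros(3, dtype=np.int)
counts = np.zeros(3, dtype=np.int64)  # explicit, platform-independent width
counts[1] += 2

# Where a plain Python integer is wanted, the builtin int is the drop-in replacement
total = int(counts.sum())
print(counts.dtype, total)  # int64 2
```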

Commits on Feb 16, 2023

  1. 3679207

Commits on Oct 19, 2023

  1. Update readthedocs.yml

    Fix docs
    Phlya authored Oct 19, 2023
    ee2ab25
  2. Update readthedocs.yml

    Phlya authored Oct 19, 2023
    f2191a8
  3. Update readthedocs.yml

    Phlya authored Oct 19, 2023
    f5a4fbb

Commits on Nov 20, 2023

  1. c5a1561
  2. Update CHANGES.md

    Phlya authored Nov 20, 2023
    c381d7f
  3. Bump version

    Phlya authored Nov 20, 2023
    2ada87c
  4. b6ea94c
  5. 59c90f7
  6. 162eaf7
  7. set docs language to 'en'

    Phlya authored Nov 20, 2023
    ffc07ea
  8. 96bafff
  9. 1735da4
  10. 462f57c
  11. 38e8632
  12. 3ffdade
Showing with 4,895 additions and 1,471 deletions.
  1. +20 −10 .flake8
  2. +88 −0 .github/workflows/python-build-wheels.yml
  3. +44 −25 .github/workflows/python-publish-test.yml
  4. +42 −25 .github/workflows/python-publish.yml
  5. +15 −9 .github/workflows/{python-package.yml → python-test.yml}
  6. +80 −0 CHANGES.md
  7. +5 −3 MANIFEST.in
  8. +20 −6 README.md
  9. +2 −3 doc/conf.py
  10. +2 −2 doc/{technotes.rst → designnotes.rst}
  11. +86 −22 doc/examples/benchmark/Snakefile
  12. +741 −276 doc/examples/benchmark/benchmark.ipynb
  13. +180 −105 doc/examples/benchmark/benchmarking_1mln.csv
  14. +603 −0 doc/examples/duplicate_distance.ipynb
  15. +694 −115 doc/examples/pairtools_phase_walkthrough.ipynb
  16. +5 −5 doc/examples/pairtools_restrict_walkthrough.ipynb
  17. +85 −41 doc/examples/scalings_example.ipynb
  18. +4 −2 doc/index.rst
  19. +22 −2 doc/installation.rst
  20. +14 −0 doc/parsing.rst
  21. +126 −0 doc/protocols_pipelines.rst
  22. +129 −0 doc/stats.rst
  23. +2 −2 pairtools/__init__.py
  24. +55 −51 pairtools/cli/dedup.py
  25. +1 −1 pairtools/cli/flip.py
  26. +1 −1 pairtools/cli/markasdup.py
  27. +1 −1 pairtools/cli/parse.py
  28. +2 −2 pairtools/cli/parse2.py
  29. +1 −1 pairtools/cli/phase.py
  30. +1 −1 pairtools/cli/restrict.py
  31. +10 −10 pairtools/cli/scaling.py
  32. +1 −1 pairtools/cli/select.py
  33. +107 −27 pairtools/cli/sort.py
  34. +2 −2 pairtools/cli/split.py
  35. +13 −3 pairtools/cli/stats.py
  36. +1 −1 pairtools/lib/__init__.py
  37. +280 −82 pairtools/lib/dedup.py
  38. +29 −1 pairtools/lib/fileio.py
  39. +3 −16 pairtools/lib/headerops.py
  40. +46 −0 pairtools/lib/pairsam_format.py
  41. +49 −0 pairtools/lib/pairsio.py
  42. +264 −243 pairtools/lib/parse.py
  43. +107 −60 pairtools/lib/scaling.py
  44. +7 −4 pairtools/lib/select.py
  45. +419 −190 pairtools/lib/stats.py
  46. +77 −0 pyproject.toml
  47. +28 −5 readthedocs.yml
  48. +0 −4 requirements-dev.txt
  49. +0 −8 requirements.txt
  50. +0 −14 requirements_doc.txt
  51. +23 −56 setup.py
  52. +17 −0 tests/data/mock.4dedup_diffcolnames.pairsam
  53. +2 −0 tests/data/mock.pairsam
  54. +11 −0 tests/data/mock.parse2-single-end.expand.sam
  55. +8 −0 tests/data/mock.parse2-single-end.sam
  56. +33 −22 tests/data/mock.parse2.sam
  57. +6 −0 tests/data/mock_empty.4dedup.pairsam
  58. +75 −0 tests/test_dedup.py
  59. +5 −5 tests/test_merge.py
  60. +135 −0 tests/test_parse2.py
  61. +3 −2 tests/test_scaling.py
  62. +3 −3 tests/test_sort.py
  63. +60 −1 tests/test_stats.py
30 changes: 20 additions & 10 deletions .flake8
@@ -5,14 +5,24 @@ exclude =

 max-line-length = 120
 ignore =
-    E203 # whitespace before ':'
-    E266 # too many leading '#' for block comment
-    E501 # line too long
-    W503 # line break before binary operator
+    # whitespace before ':'
+    E203
+    # too many leading '#' for block comment
+    E266
+    # line too long
+    E501
+    # line break before binary operator
+    W503
 select =
-    C # mccabe complexity
-    E # pycodestyle
-    F # pyflakes error
-    W # pyflakes warning
-    B # bugbear
-    B950 # line exceeds max-line-length + 10%
+    # mccabe complexity
+    C
+    # pycodestyle
+    E
+    # pyflakes error
+    F
+    # pyflakes warning
+    W
+    # bugbear
+    B
+    # line exceeds max-line-length + 10%
+    B950
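
The `.flake8` change moves every comment from the end of a code line onto its own line. The diff does not state the reason. A likely one is that INI-style parsing keeps an inline comment as part of the value, and flake8 6.0 and later validate each code and reject such tokens. The sketch below uses Python's standard `configparser` with its default settings to show the difference; the helper `read_ignore` and the shortened config snippets are hypothetical, and flake8's own option parsing is not reproduced here.

```python
import configparser

OLD_STYLE = """
[flake8]
ignore =
    E203 # whitespace before ':'
    E501 # line too long
"""

NEW_STYLE = """
[flake8]
ignore =
    # whitespace before ':'
    E203
    # line too long
    E501
"""


def read_ignore(text):
    """Return the non-empty lines of the multi-line `ignore` option."""
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    raw = parser.get("flake8", "ignore")
    return [line.strip() for line in raw.splitlines() if line.strip()]


# Inline comments survive as part of each value
print(read_ignore(OLD_STYLE))
# Full-line comments are dropped, leaving only the codes
print(read_ignore(NEW_STYLE))  # ['E203', 'E501']
```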
88 changes: 88 additions & 0 deletions .github/workflows/python-build-wheels.yml
@@ -0,0 +1,88 @@
+name: Build wheels
+
+on: [workflow_dispatch]
+
+jobs:
+  make_sdist:
+    name: Make SDist
+    runs-on: ubuntu-latest
+    steps:
+    - uses: actions/checkout@v4
+      with:
+        fetch-depth: 0 # Optional, use if you use setuptools_scm
+        submodules: true # Optional, use if you have submodules
+
+    - name: Install dependencies
+      run: python -m pip install cython numpy pysam
+
+    - name: Build SDist
+      run: pipx run build --sdist
+
+    - uses: actions/upload-artifact@v4
+      with:
+        name: cibw-sdist
+        path: dist/*.tar.gz
+
+  build_wheels:
+    name: Build wheels on ${{ matrix.os }}
+    runs-on: ${{ matrix.os }}
+    strategy:
+      matrix:
+        # macos-13 is an intel runner, macos-14 is apple silicon
+        os: [ubuntu-latest]
+        #, windows-latest, macos-13, macos-14]
+        python-version: [ "3.11" ] # "3.7", "3.8", "3.9", "3.10",
+
+    steps:
+    - uses: actions/checkout@v4
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v5
+      with:
+        python-version: ${{ matrix.python-version }}
+    # - name: Build wheels
+    #   uses: pypa/cibuildwheel@v2.21.0
+    #   # uses: pypa/cibuildwheel@v2.17.0
+    #   # env:
+    #   #   CIBW_SOME_OPTION: value
+    #   #   ...
+    #   # with:
+    #   #   package-dir: .
+    #   #   output-dir: wheelhouse
+    #   #   config-file: "{package}/pyproject.toml"
+
+    - name: Install cibuildwheel
+      run: python -m pip install cibuildwheel==2.22.0
+
+    - name: Build wheels
+      run: python -m cibuildwheel --output-dir dist
+      # to supply options, put them in 'env', like:
+      env:
+        #CIBW_BUILD_FRONTEND: "pip; args: --no-build-isolation"
+        CIBW_BUILD_FRONTEND: "build; args: --no-isolation"
+        CIBW_BEFORE_ALL: "yum install bzip2-devel xz-devel -y;"
+
+        # we have to recompile pysam so that repairwheel can later find various libraries (libssl, libnghttp2, etc)
+        #CIBW_BEFORE_ALL: "yum install bzip2-devel xz-devel openssl-devel openldap-devel krb5-devel libssh-devel libnghttp2-devel -y;"
+        CIBW_BEFORE_BUILD: "python -m pip install setuptools cython numpy pysam --no-binary pysam"
+
+        # skip building 32-bit wheels (i686)
+        CIBW_ARCHS_LINUX: "auto64"
+
+        # we could use 2_28 to download pysam's wheel instead of compiling it ;
+        # HOWEVER THAT DIDN'T WORK BECAUSE PYSAM DEPENDS ON LIBSSL, LIBNGHTTP2, ETC, WHICH CANNOT BE FOUND
+        # SO WE ARE BACK TO COMPILING PYSAM'S WHEEL (no-binary pysam)
+        # CIBW_MANYLINUX_X86_64_IMAGE: "manylinux_2_28"
+
+        ## skip building pypy and musllinux
+        CIBW_SKIP: pp* *musllinux*
+
+        #CIBW_REPAIR_WHEEL_COMMAND: 'auditwheel -v repair -w {dest_dir} {wheel}'
+
+        #PIP_NO_CACHE_DIR: "false"
+        #PIP_NO_BUILD_ISOLATION: "false"
+        #PIP_NO_BINARY: "pysam"
+
+    - uses: actions/upload-artifact@v4
+      with:
+        name: cibw-wheels-${{ matrix.os }}-${{ strategy.job-index }}
+        path: ./dist/*.whl
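
The wheel artifact name in the last step is built from `matrix.os` and `strategy.job-index`. With the single-entry matrix in this workflow it resolves to `cibw-wheels-ubuntu-latest-0`, which matches the `packages-dir` value used by the publish workflows. A rough sketch of that expansion follows. The helper `artifact_names` is hypothetical, and the job ordering assumed for matrices with more than one entry is an assumption rather than documented GitHub Actions behavior.

```python
from itertools import product


def artifact_names(os_list, python_versions, prefix="cibw-wheels"):
    """Expand the name template <prefix>-<os>-<job-index> over a build matrix."""
    names = []
    # Assumption: jobs are indexed from zero in the order the matrix is expanded
    for job_index, (os_name, _python) in enumerate(product(os_list, python_versions)):
        names.append(f"{prefix}-{os_name}-{job_index}")
    return names


# The matrix currently enabled in the workflow: one OS, one Python version
print(artifact_names(["ubuntu-latest"], ["3.11"]))  # ['cibw-wheels-ubuntu-latest-0']
```

If the commented-out operating systems or Python versions were enabled, more artifacts with different names would be produced, and the hard-coded `packages-dir` in the publish workflows would need to be revisited.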
69 changes: 44 additions & 25 deletions .github/workflows/python-publish-test.yml
@@ -1,32 +1,51 @@

# This workflows will upload a Python Package using Twine when a release is created
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries

name: Publish Python Package to Test PyPI

on:
release:
types: [prereleased]
# release:
# types: [published]
workflow_dispatch:

jobs:
deploy:

publish_all:
name: Publish wheels and sdist to Test PyPI

# if: github.event_name == 'release' && github.event.action == 'published'

environment: testpypi
permissions:
id-token: write
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v2
- name: Set up Python
uses: actions/setup-python@v2
with:
python-version: '3.x'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install setuptools wheel twine cython numpy pysam
- name: Build and publish
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
run: |
python setup.py sdist
twine upload --repository-url https://test.pypi.org/legacy/ dist/*
- uses: dawidd6/action-download-artifact@v7
with:
# Required, if the repo is private a Personal Access Token with `repo` scope is needed or GitHub token in a job where the permissions `action` scope set to `read`
#github_token: ${{secrets.GITHUB_TOKEN}}
# Optional, workflow file name or ID
# If not specified, will be inferred from run_id (if run_id is specified), or will be the current workflow
workflow: python-build-wheels.yml
# Optional, the status or conclusion of a completed workflow to search for
# Can be one of a workflow conclusion:
# "failure", "success", "neutral", "cancelled", "skipped", "timed_out", "action_required"
# Or a workflow status:
# "completed", "in_progress", "queued"
# Use the empty string ("") to ignore status or conclusion in the search
workflow_conclusion: success

- name: Publish sdist 📦 to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
packages-dir: cibw-sdist
repository-url: https://test.pypi.org/legacy/

- name: Publish wheels 📦 to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
packages-dir: cibw-wheels-ubuntu-latest-0
repository-url: https://test.pypi.org/legacy/







67 changes: 42 additions & 25 deletions .github/workflows/python-publish.yml
@@ -1,31 +1,48 @@
# This workflow will upload a Python Package using Twine when a release is created
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries

name: Upload Python Package
name: Publish Python Package to PyPI

on:
release:
types: [created]
# release:
# types: [published]
workflow_dispatch:

jobs:
deploy:

publish_all:
name: Publish wheels and sdist to PyPI

# if: github.event_name == 'release' && github.event.action == 'published'

environment: pypi
permissions:
id-token: write
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v2
- name: Set up Python
uses: actions/setup-python@v2
with:
python-version: '3.x'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install setuptools wheel twine cython pysam numpy
- name: Build and publish
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
run: |
python setup.py sdist
twine upload dist/*
- uses: dawidd6/action-download-artifact@v7
with:
# Required, if the repo is private a Personal Access Token with `repo` scope is needed or GitHub token in a job where the permissions `action` scope set to `read`
#github_token: ${{secrets.GITHUB_TOKEN}}
# Optional, workflow file name or ID
# If not specified, will be inferred from run_id (if run_id is specified), or will be the current workflow
workflow: python-build-wheels.yml
# Optional, the status or conclusion of a completed workflow to search for
# Can be one of a workflow conclusion:
# "failure", "success", "neutral", "cancelled", "skipped", "timed_out", "action_required"
# Or a workflow status:
# "completed", "in_progress", "queued"
# Use the empty string ("") to ignore status or conclusion in the search
workflow_conclusion: success

- name: Publish sdist 📦 to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
packages-dir: cibw-sdist

- name: Publish wheels 📦 to PyPI
uses: pypa/gh-action-pypi-publish@release/v1
with:
packages-dir: cibw-wheels-ubuntu-latest-0






24 changes: 15 additions & 9 deletions .github/workflows/{python-package.yml → python-test.yml}
@@ -1,17 +1,21 @@
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: Python package

on: push

name: Test build, lint and test
on:
push:
branches: [ master ]
tags:
- "v*" # Tag events matching v*, i.e. v1.0, v20.15.10
pull_request:
branches: [ master ]
jobs:
build:

runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.7", "3.8", "3.9", "3.10"]
python-version: ["3.9", "3.10", "3.11", "3.12"]

steps:
- uses: actions/checkout@v2
@@ -21,10 +25,9 @@ jobs:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip wheel setuptools
pip install numpy cython pysam
pip install -r requirements-dev.txt
pip install -e .
python -m pip install --upgrade pip wheel setuptools build
pip install cython pysam numpy
pip install -e .[test] --no-build-isolation -v -v
- name: Lint with flake8
run: |
# stop the build if there are Python syntax errors or undefined names
@@ -35,3 +38,6 @@ jobs:
run: |
pip install pytest
pytest