create_db() does not parse directives from GFF files starting in v0.11.0 #213
Comments
@dtdoering I'm unable to reproduce in a test -- see the new test at a4b443b and passing tests here. Are you able to reproduce this with the latest in the |
Hey @daler, getting back around to this now! I pasted your However, I must have done a poor job with my initial description of the issue because I am now only able to reproduce the issue again by supplying a At any rate, by supplying |
For the sake of completeness, here is the script:

    #!/usr/bin/env python
    import sys
    import os
    import tempfile
    from textwrap import dedent

    gffutils_git_path = os.path.join(os.environ.get('HOME'), 'sft', 'gffutils')
    sys.path.insert(1, gffutils_git_path)
    import gffutils

    print(f"gffutils version: {gffutils.__version__}")


    def test_issue_213():
        # GFF header directives seem to be not parsed when building a db from
        # a file, even though it seems to work fine from a string.
        data = dedent(
            """
            ##gff-version 3
            . . . . . . . .
            . . . . . . . .
            . . . . . . . .
            . . . . . . . .
            """
        )

        # Ensure directives are parsed from DataIterator
        it = gffutils.iterators.DataIterator(data, from_string=True)
        assert it.directives == ["gff-version 3"]

        # Ensure they're parsed into the db from a string
        db = gffutils.create_db(data, dbfn=":memory:", from_string=True, verbose=False, dialect={'fmt': 'gff3'})
        assert db.directives == ["gff-version 3"], db.directives

        # Ensure they're parsed into the db from a file
        tmp = tempfile.NamedTemporaryFile(delete=False).name
        with open(tmp, "w") as fout:
            fout.write(data + "\n")
        db = gffutils.create_db(tmp, ":memory:", dialect={'fmt': 'gff3'})
        assert db.directives == ["gff-version 3"], db.directives
        assert len(db.directives) == 1

        # Ensure they're parsed into the db from a file, and going to a file (to
        # exactly replicate example in #213)
        db = gffutils.create_db(tmp, dbfn='issue_213.db', force=True, dialect={'fmt': 'gff3'})
        assert db.directives == ["gff-version 3"], db.directives
        assert len(db.directives) == 1


    test_issue_213()

Then I run it with: ~/gffutils_debug.py
I am trying to read in a GFF that doesn't adhere to spec so I can use gffutils to fix it and then write out a corrected file. I've gotten everything working, except that the GFF header directive(s) don't appear in the output file. It appears they are not being parsed upon creation of the database.
Interestingly, this only happens when a FeatureDB is created from a file, not from e.g. a dedent()ed string as in the existing parser_test.py.
This behavior is exhibited in v0.11.1 and v0.11.0, but not v0.10.1 (thus, a workaround is to downgrade to v0.10.1).
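For anyone who cannot downgrade, one possible stopgap is to collect the directives directly from the input file and write them back out by hand. The sketch below is only an illustration using the standard library: `read_directives` and `write_gff` are hypothetical helpers, not part of gffutils, and they assume directives are lines beginning with `##`, stored without that prefix (the same shape as the `"gff-version 3"` strings checked in the test script in this thread).

```python
import os
import tempfile


def read_directives(path):
    """Return directive lines from a GFF file, without the leading "##".

    Hypothetical helper (not a gffutils function). Assumes any line that
    starts with "##" is a directive; a single "#" is an ordinary comment.
    """
    directives = []
    with open(path) as handle:
        for line in handle:
            if line.startswith("##"):
                directives.append(line[2:].strip())
    return directives


def write_gff(path, directives, feature_lines):
    """Write directives first, then the feature lines, one per line.

    Hypothetical helper: feature_lines is any iterable of strings, for
    example str(feature) for each feature pulled from a FeatureDB.
    """
    with open(path, "w") as out:
        for directive in directives:
            out.write(f"##{directive}\n")
        for line in feature_lines:
            out.write(line.rstrip("\n") + "\n")


if __name__ == "__main__":
    # Small round trip on a throwaway file; the feature line is made up.
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "test.gff")
        feature = "chr1\t.\tgene\t1\t10\t.\t+\t.\tID=g1"
        write_gff(src, ["gff-version 3"], [feature])
        print(read_directives(src))
```

This keeps the header handling independent of whichever gffutils version built the database, at the cost of reading the input file a second time.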
To reproduce:
Use gffutils > v0.10.1.
Create a test file test.gff with the following:
Example code:
Expected output:
Observed output:
System/environment info:
OS: GNU/Linux
Python: 3.8.5 (still happens with 3.8.13)
Conda environment: