Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

biscuit asm fails on generated epiread files #13

Open
ttriche opened this issue Feb 5, 2017 · 3 comments
Open

biscuit asm fails on generated epiread files #13

ttriche opened this issue Feb 5, 2017 · 3 comments

Comments

@ttriche
Copy link
Contributor

ttriche commented Feb 5, 2017

Compiled with debug symbols and run under gdb...

gdb -q --args ./biscuit asm epiread.txt
Reading symbols from ./biscuit...done.
(gdb) b main_asm
Breakpoint 1 at 0x453853: file src/asm_pairwise.c, line 91.
(gdb) r
Starting program: /home/tim/biscuit/biscuit asm epiread.txt
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Breakpoint 1, main_asm (argc=2, argv=0x7fffffffdf50) at src/asm_pairwise.c:91
91	int main_asm(int argc, char *argv[]) {
... 
(gdb) n
125	    assert(strlen(in->fields[4]) == 1);
(gdb) print in->fields[4]
$1 = 0x6f0600 "10468"
(gdb) n
biscuit: src/asm_pairwise.c:125: main_asm: Assertion `strlen(in->fields[4]) == 1' failed.

Program received signal SIGABRT, Aborted.

I'm having a hard time figuring out why it is asserting that a coordinate should be length 1...

epiread.txt

@ttriche
Copy link
Contributor Author

ttriche commented Feb 5, 2017

Hey I think I have an idea, maybe this is just off-by-one or off-by-two?

Here is what epiread.txt (above) looks like:

chr1	MG01HX08:278:H3VCCALXX:8:1104:27052:45892	1	-	10468	
CCCCTC	10341	CCATCACC

Unless something funny is happening, in wztsv.h, https://github.com/zwdzwd/utils/blob/3281eb6f70e431480a036d33ba587c70f6fc26c7/wztsv.h#L86, tsv_read should fill up the tsv struct "in" as follows:

(gdb) explore in
'in' is a pointer to a value of type 'tsv_t'
Continue exploring it as a pointer to a single value [y/n]: y

  The value of '*in' is of type 'tsv_t' which is a typedef of type 'struct tsv_t'
The value of '*in' is a struct/class of type 'struct tsv_t' with the following fields:

    fields = <Enter 0 to explore this field of type 'char **'>
         n = <Enter 1 to explore this field of type 'size_t'>
         m = <Enter 2 to explore this field of type 'size_t'>
        fp = <Enter 3 to explore this field of type 'gzFile'>
  finished = 0 .. (Value of type 'int')
  line_max = 10000 .. (Value of type 'int')
      line = <Enter 6 to explore this field of type 'char *'>
Enter the field number of choice: q 
(gdb) print in
$1 = (tsv_t *) 0x6e60a0
(gdb) print in->fields[0]
$2 = 0x6f0550 "chr1"
(gdb) print in->fields[1]
$3 = 0x6f0570 "MG01HX08:278:H3VCCALXX:8:1104:27052:45892"
(gdb) print in->fields[2]
$4 = 0x6f05b0 "1"
(gdb) print in->fields[3]
$5 = 0x6f0530 "-"
(gdb) print in->fields[4]
$6 = 0x6f0600 "10468"

So it looks like this is maybe an off-by-one error?

If I replace the code with

    assert(strlen(in->fields[2]) == 1);
    assert(strlen(in->fields[3]) == 1);

it looks like we are OK, although then it's going to corrupt the _snp / _cg code following. Let's see...

@ttriche
Copy link
Contributor Author

ttriche commented Feb 5, 2017

nb. It's not an issue of single-end vs paired-end epiread format:

gdb -q --args ./biscuit asm epiread.paired.txt 
Reading symbols from ./biscuit...done.
(gdb) b main_asm
Breakpoint 1 at 0x453853: file src/asm_pairwise.c, line 91.
(gdb) r
Starting program: /home/tim/Dropbox/biscuit/biscuit asm epiread.paired.txt
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
...
(gdb) print in->fields[3]
$1 = 0x6f0530 "CCCT"
(gdb) print in->fields[4]
$2 = 0x6f05e0 "."
(gdb) n
biscuit: src/asm_pairwise.c:124: main_asm: Assertion `strlen(in->fields[3]) == 1' failed.

epiread.paired.txt

@ttriche
Copy link
Contributor Author

ttriche commented Feb 5, 2017

followup: changing

assert(strlen(in->fields[3]) == 1);
assert(strlen(in->fields[4]) == 1);

to

assert(strlen(in->fields[2]) == 1);
assert(strlen(in->fields[3]) == 1);

allows things to proceed and eventually the program exits normally, although I haven't come across an ASM-possessing region in my osteosarcoma to test it out on. So maybe it is an off-by-one error?

I need a positive control before I'm willing to send in a patch :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant