Correct how the temp bed files are created #9

Open
davep wants to merge 3 commits into shenlab-sinai:master from davep:bugfix

davep commented Apr 11, 2019

This is slightly related to #4 and also to another observed issue, in which some of the resulting bed files can end up with "corrupt" content, though never in a consistent way.

There appears to be a problem with how sep_chrom_bed creates the bed files. The naming seems designed to try and avoid clashes but, unfortunately, stands a good chance of ensuring clashes in some situations.

The main problem is that, rather than using a conventional method of generating temp files, the code builds a file name from time, checks whether it exists, and, if it looks like it does, generates a new name with rand (the use of + rather than . there being the cause of #4). Unfortunately, when multiple processes are used, each process can end up generating the same sequence of candidate file names.

Note that Parallel::ForkManager is used to handle the parallel processes, and that its documentation carries a warning about using rand() in forked processes.

My thinking here is that more than one process could arrive at the same file name at the same time, and then both write to the same file, causing issues and apparent corruption.
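To make the concern concrete, here is a minimal sketch (not code from diffReps) of forked children repeating the same rand() values. It assumes a Unix-like system with a real fork and that the parent has already used rand() before forking; `child_rand_values` is a hypothetical helper written only for this illustration.

```perl
use strict;
use warnings;

# Flush STDOUT immediately so forked children do not re-emit buffered output.
$| = 1;

# Use rand() once in the parent so the generator is seeded before any fork.
rand();

# Hypothetical helper: fork $n children one after another and collect the
# first rand() value each child produces, passed back through a pipe.
sub child_rand_values {
    my ($n, $reseed) = @_;
    my @values;
    for (1 .. $n) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            close $reader;
            srand() if $reseed;    # give this child a fresh seed
            print {$writer} int(rand(1_000_000)), "\n";
            close $writer;
            exit 0;
        }
        close $writer;
        chomp(my $value = <$reader>);
        close $reader;
        waitpid($pid, 0);
        push @values, $value;
    }
    return @values;
}

# Children that inherit the parent's generator state report the same value.
print "inherited seed: @{[ child_rand_values(3, 0) ]}\n";
# Children that call srand() first are expected to report differing values.
print "reseeded:       @{[ child_rand_values(3, 1) ]}\n";
```

If a file name is derived from such a value, every child that inherits the state would try the same "random" name, which is the kind of clash described above.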

This pull request switches to using Perl's own tempfile function (from File::Temp) to generate the file names.
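As a rough illustration of the approach (a sketch, not the exact code in this pull request), File::Temp's `tempfile` picks a name and opens the file in one step. The helper name `write_temp_file` and the sample bed line are assumptions made for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $content to a freshly created temp file and
# return the file's name.
sub write_temp_file {
    my ($content) = @_;
    # tempfile() chooses the name and opens the file in a single step, so
    # the caller never has to check whether a name is already taken.
    my ($fh, $filename) = tempfile(UNLINK => 0);
    print {$fh} $content;
    close $fh or die "Cannot close $filename: $!";
    return $filename;
}

my $name = write_temp_file("chr1\t100\t200\n");
print "Wrote $name\n";
unlink $name or die "Cannot delete file $name: $!";
```

Because the file is created as part of choosing the name, there is no window between an "exists?" check and the open in which another process could claim the same name.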

davep added 3 commits April 11, 2019 10:22
This fixes a problem where sep_chrom_bed could easily create clashing file
names while trying to avoid exactly that. The problem is that, rather than
using tempfile, the code was using seconds since the Unix epoch and then, if
it looked like a clash was possible, adding a random number to the end of
it.

The problem here is that, mixed with a fork (which it would do), it would
create multiple forks that all followed the same random sequence. See:

https://metacpan.org/pod/Parallel::ForkManager#USING-RAND()-IN-FORKED-PROCESSES

for why. Long story short: the code that would try and ensure there wasn't a
filename clash would almost guarantee that there was.

So this change switches to using an actual temp file name generation
function to create the temporary bed file names.
- Separate the chromosome from the rest of the name.
- Have a slightly longer run of template characters.
- Add .bed as a suffix.

None of this should make the code work any differently, but it should make
the file names easier on the eye when looking at tmp.
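A template along these lines would produce names of that shape. This is a sketch only: the helper name `temp_bed_for`, the `diffreps-` prefix and the run of ten template characters are assumptions inferred from the file names quoted further down this thread, not a copy of the committed code.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: create a temp bed file whose name carries the
# chromosome, a run of random template characters, and a .bed suffix.
sub temp_bed_for {
    my ($chrom) = @_;
    my ($fh, $filename) = tempfile(
        "diffreps-$chrom-XXXXXXXXXX",    # the X's are replaced by File::Temp
        SUFFIX => '.bed',
        TMPDIR => 1,                     # place the file in the system temp dir
    );
    return ($fh, $filename);
}

my ($fh, $filename) = temp_bed_for('chr7');
close $fh or die "Cannot close $filename: $!";
print "$filename\n";
unlink $filename or die "Cannot delete file $filename: $!";
```

Keeping the chromosome as its own dash-separated field makes it easy to see at a glance which temp file belongs to which chromosome.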

hsuh001 commented Sep 24, 2020

Hello,

I'm trying to use the diffReps.pl script and have installed it using cpanm diffReps-1.55.3.tar.gz with all the dependencies.
But when I am running it, I keep getting the error messages:

"Cannot delete file .1600851433.95585.chr7.bed: No such file or directory".

And the result file diff.nb.txt only includes headers, with no other results.

So I followed your advice and replaced MyShortRead.pm with your modified one. Things got somewhat better: the result file diff.nb.txt now includes some results, but only a part, e.g. only for "chr1". Note that I still keep getting the error messages:

"Open bed file /tmp/diffreps-chr10-XHhRSQVv27.bed error: No such file or directory"
"Cannot delete file /tmp/diffreps-chr7-0NAoqUewWz.bed: No such file or directory"

Should I address this problem by specifying "TMPDIR" with a local path in line 268 of your modified MyShortRead.pm? Or do you have any other ideas?

Thanks for your time,
Jing Xiao
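For reference on the File::Temp options involved (a sketch, not a diagnosis of the errors above): `TMPDIR => 1` is a flag that selects the system temp directory, while a specific directory is given with `DIR`. The helper name `temp_bed_in`, the `./diffreps_tmp` path and the template are assumptions for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Path qw(make_path);

# Hypothetical helper: create a temp bed file under a caller-chosen directory.
sub temp_bed_in {
    my ($dir, $chrom) = @_;
    make_path($dir) unless -d $dir;
    return tempfile(
        "diffreps-$chrom-XXXXXXXXXX",
        SUFFIX => '.bed',
        DIR    => $dir,    # used in place of TMPDIR => 1
    );
}

my ($fh, $filename) = temp_bed_in('./diffreps_tmp', 'chr10');
close $fh or die "Cannot close $filename: $!";
print "$filename\n";
unlink $filename or die "Cannot delete file $filename: $!";
```

With `TMPDIR => 1` left in place, the directory comes from File::Spec's `tmpdir`, which honors the `TMPDIR` environment variable, so setting that variable before running the script is another way to redirect the files.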


davep commented Sep 24, 2020

@hsuh001 I'd suggest raising this as an issue for the author if I were you; I don't personally have any experience with this software and just happen to be a software developer who noticed a very particular issue with a bit of perl code.


hsuh001 commented Sep 24, 2020

@davep Hi, I have got this problem addressed and have run the diffReps.pl script successfully. Thanks for your reply.
