I am trialling Cogent on the test_human.fa data as shown on the Wiki. I did not use the hg38 reference genome for the coding genome reconstruction, because I do not have a reference genome for my own data and wanted to check that everything works without one.
python /path/Cogent/helper_scripts/tally_Cogent_contigs_per_family.py test_human/ hg38 human_output
The above works well. I passed hg38 as a placeholder (on purpose, I do not have the hg38 reference genome in my working directory), because I received errors when I left the genome argument out. The run completed and produced human_output.family_summary.txt:
gene_family input_size num_Cogent_contigs num_genome_contig genome_cov genome_acc genome_chimeric genome_contigs
test_human/1_0 9 4 0 0.00 0.00 False
test_human/0_0 10 1 0 0.00 0.00 False
However, when I include blastn in this step:
python /path/Cogent/helper_scripts/tally_Cogent_contigs_per_family.py test_human/ hg38 human_output --blastn in.fa.blastn
I receive the error:
Traceback (most recent call last):
  File "/apps/unit/RavasiU/cogent/8.0.0/Cogent/helper_scripts/tally_Cogent_contigs_per_family.py", line 62, in <module>
    main(args.cogent_dir, args.genome, args.output_prefix, args.genome2, args.blastn_filename)
  File "/apps/unit/RavasiU/cogent/8.0.0/Cogent/helper_scripts/tally_Cogent_contigs_per_family.py", line 41, in main
    sp.tally_for_a_Cogent_dir(d, writer1, writer2, genome1, genome2, blastn_filename)
  File "/hpcshare/appsunit/RavasiU/cogent/8.0.0/Cogent/helper_scripts/tally_Cogent_results.py", line 191, in tally_for_a_Cogent_dir
    best_of = read_blastn(os.path.join(dirname, blastn_filename), qlen_dict)
  File "/hpcshare/appsunit/RavasiU/cogent/8.0.0/Cogent/helper_scripts/tally_Cogent_results.py", line 44, in read_blastn
    if e < best_of[seqid]: best_of[seqid] = (e, name)
TypeError: '<' not supported between instances of 'float' and 'tuple'
The human_output.family_summary.txt file is then empty apart from the header line, which now includes the blastn columns:
gene_family input_size num_Cogent_contigs num_genome_contig genome_cov genome_acc genome_chimeric genome_contigs num_blastn blastn_best
I checked, and the in.fa.blastn file is in each of the cogent directories. Do you have any suggestions of what could be going wrong here?
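For what it's worth, the last frame of the traceback suggests that best_of[seqid] already holds an (e, name) tuple while e is a bare float, which Python 3 refuses to order against each other. Below is a minimal sketch of that pattern and one possible fix. It assumes the dictionary stores (e-value, name) tuples, as the assignment on that line implies; keep_best_hit and the sample hits are hypothetical names of mine, not Cogent code, and the real read_blastn may differ.

```python
# Hypothetical reduction of the failing comparison; not the actual Cogent source.

def keep_best_hit(hits):
    """Keep the lowest e-value hit per query.

    hits: iterable of (seqid, evalue, name) tuples.
    Returns a dict mapping seqid -> (evalue, name).
    """
    best_of = {}
    for seqid, e, name in hits:
        # Compare against the stored e-value (index 0), not the whole tuple.
        if seqid not in best_of or e < best_of[seqid][0]:
            best_of[seqid] = (e, name)
    return best_of


# Made-up sample hits, two of them for the same query.
hits = [("q1", 1e-5, "subjA"), ("q1", 1e-20, "subjB"), ("q2", 0.01, "subjC")]
best = keep_best_hit(hits)

# The comparison as written in the traceback fails under Python 3:
try:
    1e-20 < (1e-5, "subjA")
    raised = False
except TypeError:
    raised = True
```

If this reading is right, the error would only appear once a query has a second blastn hit, since the first hit for each query is stored without any comparison.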
Thank you!