!FILE-ERROR!: Unknown character found in sequence #6

@linzhi2013

Hi Patrick,

I found that the program has a problem when an alignment contains internal whitespace.

Say I have two files aln_1.fas and aln_2.fas in the same directory, from which I run the command:

perl /home/gmeng/soft/bin/FASconCAT-G_v1.05.pl  -s -p -p -n -l

the program stopped:
[image: screenshot of the error output]

The content of aln_1.fas file:

>JA12
---------- ---------- ---------- ---------- ---------- ----------
---------- ---------- ---------- ---------- ---------- ----------
---------- ---------- ---------- ---------- ---------- -------cga
ataacagaaa gaggtgttgg ggctggttgg actatttatc cccccttatc tggttcttta
tctattatag gggctattaa ttttatttct actatcatta atatgcgaat tataggggtg
>SA1
tctattatag gggctattaa ttttatttct actatcatta atatgcgaat tataggggtg
ataacagaaa gaggtgttgg ggctggttgg actatttatc cccccttatc tggttcttta
ataacagaaa gaggtgttgg ggctggttgg actatttatc cccccttatc tggttcttta
ataacagaaa gaggtgttgg ggctggttgg actatttatc cccccttatc tggttcttta
ataacagaaa gaggtgttgg ggctggttgg actatttatc cccccttatc tggttcttta

The content of aln_2.fas file:

>JA12
cctgctcaat gtaaatagcc gcagtactgt gctaaggtag cataatcact tgtttcctaa
aagaaaagat tacgacctcg atgttgaatt aattagtctt aaagcaaaaa ttaaagaaag
tctgttcgac ttataaataa tt
>SA1
ataacagaaa gaggtgttgg ggctggttgg actatttatc cccccttatc tggttcttta
cctgctcaat gtaaatagcc gcagtactgt gctaaggtag cataatcact tgtttcctaa
tctgttcgac ttataaataa tt

The code printing out the error message was:
[image: screenshot of the relevant code]

Therefore, it seems to be the above code that cannot handle whitespace inside an alignment.

If I remove the whitespace, for example,
aln_1.1.fas:

>JA12
------------------------------------------------------------
------------------------------------------------------------
---------------------------------------------------------cga
ataacagaaagaggtgttggggctggttggactatttatccccccttatctggttcttta
tctattataggggctattaattttatttctactatcattaatatgcgaattataggggtg
>SA1
tctattataggggctattaattttatttctactatcattaatatgcgaattataggggtg
ataacagaaagaggtgttggggctggttggactatttatccccccttatctggttcttta
ataacagaaagaggtgttggggctggttggactatttatccccccttatctggttcttta
ataacagaaagaggtgttggggctggttggactatttatccccccttatctggttcttta
ataacagaaagaggtgttggggctggttggactatttatccccccttatctggttcttta

aln_2.1.fas:

>JA12
cctgctcaatgtaaatagccgcagtactgtgctaaggtagcataatcacttgtttcctaa
aagaaaagattacgacctcgatgttgaattaattagtcttaaagcaaaaattaaagaaag
tctgttcgacttataaataatt
>SA1
ataacagaaagaggtgttggggctggttggactatttatccccccttatctggttcttta
cctgctcaatgtaaatagccgcagtactgtgctaaggtagcataatcacttgtttcctaa
tctgttcgacttataaataatt

then the program works fine.
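
The manual cleanup above can be sketched as a small script. This is only a workaround in Python, not part of FASconCAT-G; the function name `strip_sequence_whitespace` is hypothetical, and it assumes that header lines start with `>` and that every other non-empty line is sequence data.

```python
def strip_sequence_whitespace(fasta_text):
    """Return FASTA text with all whitespace removed from sequence lines.

    Hypothetical helper, not part of FASconCAT-G. Assumes header lines
    start with '>' and every other non-empty line is sequence data.
    """
    cleaned = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Header line: keep the name, drop trailing whitespace only.
            cleaned.append(line.rstrip())
        elif line.strip():
            # Sequence line: split on any whitespace and rejoin the blocks.
            cleaned.append("".join(line.split()))
        # Blank lines are skipped.
    return "\n".join(cleaned) + "\n"
```

For example, reading aln_1.fas, passing its content through this function, and writing the result to aln_1.1.fas should give the whitespace-free version shown above.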

Should the program remove whitespace from the sequences when it reads the alignments? As far as I know, whitespace is not treated as a special character like - or N in an alignment, right? If so, it should be safe to remove whitespace from the sequences.

Cheers
Guanliang
