Output with [X] symbol, wrong identification of data type #11

@Anna-pro

Hello Patrick,
I have protein alignments produced by MAFFT in .fas format:

>WP_002656499.1
----------MSK-FFER------DDDIVALA-------TPFLSSALCVIRSS-------
----------GASSISKFSKI----FSNHSALNSASGNTIHYGYILDS------------
--------EN--------GCKVDEVVVCLYRAPKSFTGQDAIEVMAHGSVIGIKKIIDLF
LKSGFRMAEPGEFT--LRAFLAKKIDLTKAEAIHEIIFAKT-------------------
-------NKTYSLA-VNKLSGALFVKIDAIKKSILNFLSAVSVYLDYEVDD---------
---------------------HEISIPFDL----ILSSKAELKKLINSYKVYEKIDNGVA
LVLAGSVNAGKSSLFNLFLKKDRSIVSSYPGTTRDYIEASFELDGI-LFNLFDTAGLRD-
---------ADNFVERLGIEKSNSLIKEASLVIYVIDV-----SSNLTKD----DFLFID
S-------------------------------NKSNSKILFVLNKIDLK-----------
-------INKSTEEFV-------------------------------------RSKVLNS
SNLIMISTKNLEGIDILYDKIRALISYERVEIGL--------------------------
------------------------------------------------------------
-------DDIIIS-SNRQMQLLEKAYAL--------------------------------
ILDLLSK-----------IDR-QVSYD---------MLAFDAYEIIN-------------
------CLGEITGE---------VSSED------VLDNMFK-------------------
---NFCL--GK-

After running FASconCAT (perl FASconCAT-G_v1.04.pl -s -n -l) I received FcC_supermatrix.nex, which contains the following:

#NEXUS

begin data;
dimensions ntax= 7893 nchar= 79315;
format datatype=dna interleave missing=-;
matrix
WP_013967880.1  XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX
WP_021687749.1  XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX
WP_002680262.1  XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX
disjointig_22_115  M------------------- -------------------- -------------------- -------------------- --------------------
WP_015712824.1  XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX
disjointig_54_381  XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX
disjointig_17_2181  XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX
disjointig_102_770  XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX
WP_044978940.1  XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX
disjointig_31_1360  XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX

First question: why is the data type in the format line identified as DNA? This is probably linked to my next question: why are the sequences mostly represented by the [X] symbol?
I would appreciate any help and advice on how I can fix this for further phylogenetic tree analysis. Thank you!

Best wishes,
Anna
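
For anyone trying to reproduce this, a quick independent check of the input files may help narrow things down. The sketch below is not part of FASconCAT and does not reflect how FASconCAT decides on a data type; it only parses FASTA text and guesses whether each sequence looks like DNA or protein. The function names, the set of characters treated as nucleotides, and the 0.9 threshold are all assumptions made for illustration.

```python
from collections import Counter

# Assumption: characters counted as nucleotide-like, and characters treated as gaps/missing.
DNA_CHARS = set("ACGTUN")
GAP_CHARS = set("-?.")


def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the sequence name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def guess_datatype(seq, threshold=0.9):
    """Return 'dna', 'protein', or 'empty' based on the non-gap characters.

    The threshold is an arbitrary illustrative cutoff, not a FASconCAT setting.
    """
    residues = [c for c in seq.upper() if c not in GAP_CHARS]
    if not residues:
        return "empty"
    counts = Counter(residues)
    dna_like = sum(n for c, n in counts.items() if c in DNA_CHARS)
    return "dna" if dna_like / len(residues) > threshold else "protein"


# First line of the alignment shown above, used as a small example input.
example = """>WP_002656499.1
----------MSK-FFER------DDDIVALA-------TPFLSSALCVIRSS-------
"""

for seq_name, seq in read_fasta(example).items():
    print(seq_name, guess_datatype(seq))
```

Running something like this over each .fas input would show whether any file contains sequences that look like DNA or consist only of gaps, which could then be compared against the rows of the supermatrix.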
