Deploy MSA database on my local server #66

Open
shiyu-wangbyte opened this issue Feb 16, 2025 · 5 comments

Comments

@shiyu-wangbyte

Hello, my input JSON file contains no MSA information, so I currently use the --use_msa_server flag.

Now I am interested in deploying the MSA database on my local server, so that I can run all calculations locally.

What should I do?

Thanks.

@zhangyuxuann
Collaborator

@shiyu-wangbyte Protenix is compatible with a local ColabFold setup. You can refer to https://github.com/bytedance/Protenix/blob/main/docs/colabfold_compatiable_msa.md

@shiyu-wangbyte
Author

OK, thank you

@YoelShoshan

@zhangyuxuann

For a complex (for example, two protein chains), what is the correct way to use scripts/colabfold_msa.py?

Should the user create one .fasta file per chain and call colabfold_msa.py on each, or is there another way?

I ask because when I provide a .fasta file with two chains, colabfold_msa.py processes only one and ignores the rest in its taxonomy postprocessing step.

I worked around this by running it separately for each chain, but I'm not sure whether that is equivalent to what Protenix expects.

@JinyuanSun

JinyuanSun commented Mar 3, 2025

@YoelShoshan Can you provide a copy of your current input FASTA file?

@YoelShoshan

@JinyuanSun

Sure! For example, 7vux_needs_msa.fasta:

>A|protein
MWNPPTFSPALLVVTEGDNATFTCSFSNTSESFVLNWYRMSPSNQTDKLAAFPEDRSQPGQDSRFRVTQLPNGRDFHMSVVRARRNDSGTYLCGAISLAPKAQIKESLRAELRVTERRAEVPTAHPSPSPENLYFQ
>B|protein
EVKLVESGGGLVQPGGSLRLSCAASGFAFSSYDMSWVRQAPGKRLEWVATISGGGRYTYYPDTVKGRFTISRDNAKNSHYLQMNSLRAEDTAVYFCASPYGGYFDVWGQGTLVTVSSASTKGPSVFPLAPCSRSTSESTAALGCLVKDYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTKTYTCNVDHKPSNTKVDKRVE
>C|protein
EIVLTQSPATLSLSPGERATLSCRASQSISNFLHWYQQKPGQAPRLLIKYASQSISGIPARFSGSGSGTDFTLTISSLEPEDFAVYFCQQSNSWPHTFGQGTKVEIKRTVAAPSVFIFPPSDEQLKSGTASVVCLLNNFYPREAKVQWKVDNALQSGNSQESVTEQDSKDSTYSLSSTLTLSKADYEKHKVYACEVTHQGLSSPVTKSFNRGEC
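
For anyone trying the per-chain workaround in the meantime, here is a minimal sketch that splits a multi-chain FASTA like the one above into one file per chain. The function names `split_fasta` and `write_per_chain` and the `<stem>_<chain>.fasta` naming are my own assumptions, not part of Protenix or ColabFold. It also assumes headers follow the `>A|protein` pattern shown above.

```python
from pathlib import Path


def split_fasta(fasta_text):
    """Parse FASTA text into an ordered list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped; join them back together.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def write_per_chain(fasta_text, out_dir, stem):
    """Write one single-record FASTA file per chain and return the paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for header, seq in split_fasta(fasta_text):
        # Assumption: the chain ID is the text before the first "|".
        chain_id = header.split("|")[0]
        path = out_dir / f"{stem}_{chain_id}.fasta"
        path.write_text(f">{header}\n{seq}\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    text = Path("7vux_needs_msa.fasta").read_text()
    for p in write_per_chain(text, "per_chain", "7vux"):
        print(p)
```

Each resulting file can then be passed to scripts/colabfold_msa.py separately. Whether the per-chain results are equivalent to what Protenix expects is still the open question in this thread.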
