Add createdb support for sequence db input #545

Open
wants to merge 5 commits into
base: master

Conversation

matchy233

According to its source code, createdb should accept MMseqs2 databases as one of its input sources, but the current implementation fails to handle database input. This PR fixes that.

MMseqsBase.cpp probably also needs to be updated with new usage instructions.

@@ -126,8 +129,11 @@ int createdb(int argc, const char **argv, const Command& command) {
}

KSeqWrapper* kseq = NULL;
std::string seq = ">";
Member
Can we move this into the if condition?

Author
No, we can't :( We need to refer to the address of this string later, and a variable defined inside an if {} block is not accessible from the outer scope.
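The scoping constraint described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `selectedLength`, `isDbInput`, and `current` are hypothetical names, and only the `seq` declaration mirrors the diff.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the PR): returns the length of the string
// that `current` points to, or 0 if no string was selected.
std::size_t selectedLength(bool isDbInput) {
    // Declared in the outer scope so that it outlives the if block below.
    std::string seq = ">";
    const std::string* current = nullptr;

    if (isDbInput) {
        seq += "entry";   // the string is modified inside the branch
        current = &seq;   // its address stays valid after the branch ends
    }

    // If `seq` were declared inside the if block, it would be destroyed at
    // the closing brace and `current` would be a dangling pointer here.
    return current != nullptr ? current->size() : 0;
}
```

Calling `selectedLength(true)` dereferences the pointer safely only because `seq` lives in the enclosing scope; moving the declaration into the branch would make the later dereference undefined behavior.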

@matchy233
Author

I can modify MMseqsBase.cpp later to add instructions for the fixed version of createdb, if needed.

@milot-mirdita
Member

This feature was meant for turning a bunch of FASTA files stored in the form of a DB (e.g., produced by tar2db) into a normal MMseqs2 sequence database. It is used for this purpose in the databases downloader workflow.

If you want to consume sequence DBs and produce new sequence DBs, I would suggest adding a check for the presence of a header DB and running your new code only in that case.
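One way the suggested check could look is sketched below. It rests on assumptions: that the header DB sits next to the sequence DB as `<db>_h` with an index file `<db>_h.index`, and that a plain readability test is enough. `fileReadable` and `hasHeaderDb` are hypothetical helpers written for this illustration, not existing MMseqs2 functions.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: true if the file at `path` can be opened for reading.
static bool fileReadable(const std::string& path) {
    std::ifstream in(path.c_str());
    return in.good();
}

// Hypothetical helper. Assumption: a sequence DB at `dbPath` keeps its
// headers in "<dbPath>_h" together with an index "<dbPath>_h.index".
static bool hasHeaderDb(const std::string& dbPath) {
    return fileReadable(dbPath + "_h") && fileReadable(dbPath + "_h.index");
}
```

With such a check, createdb could keep the existing behavior for DBs of FASTA files (no header DB present) and take the new sequence DB path only when `hasHeaderDb` returns true.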

@matchy233
Author

Thanks for the explanation! I'll modify the code to support the old behavior as well as the new one. Maybe we can also describe database input in the usage text, so that curious users (like me) don't get confused next time :P

matchy233 changed the title from "Fix createdb support for db input" to "Add createdb support for sequence db input" on Mar 17, 2022
3 participants