
This queue is for tickets about the Bio-LITE-Taxonomy-NCBI-Gi2taxid CPAN distribution.

Report information
The Basics
Id: 94565
Status: resolved
Priority: 0/
Queue: Bio-LITE-Taxonomy-NCBI-Gi2taxid

People
Owner: MOTIF [...] cpan.org
Requestors: CANTALUPO [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.10
Fixed in: (no value)



Subject: Version 10 is very slow
Hello,

Previously, I used version 6 of Gi2taxid (thank you for your wonderful module), and bin creation for the NCBI Protein and Nucleotide dmp files took only 30 minutes or so. Now I have version 10 installed (Mac OS X Server, 8 cores, 32 GB RAM), and it took over 12 hours to only partially finish the Protein database. I stopped the process since I thought something was wrong. At this rate, it will take days (if not a week) to finish the Nucleotide bin, since it is much larger.

Is this a bug, or is the algorithm just slow? Is there a way to speed it up?

Thank you for your help,
Paul
Sorry for the late response! This notification was sent to an email address I don't use anymore. I have updated my mail redirections.

I had to make this change because the /fast/ version for creating the bin dictionary was causing problems in 32-bit perls. I have now included another change that should fix the slowdown. Instead of "all on disk" or "all in memory", I'm using a mixed strategy: writing to memory and flushing to disk every 30 MB. With this new version (0.12), I'm able to create the nucleotide bin dict in 22 minutes on an average desktop computer.

If you try again, let me know if the fix also works for you.
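
For readers unfamiliar with the mixed strategy described above, here is a minimal sketch of the general idea in Python: buffer records in memory and flush them to disk whenever the buffer passes a size threshold. It is not the module's actual implementation. The function names `build_bin_dict` and `lookup`, the 4-byte little-endian record layout, the "gi<TAB>taxid" input format sorted by ascending GI, and the use of taxid 0 for missing GIs are all assumptions made for illustration.

```python
import io
import struct

RECORD = struct.Struct("<I")  # assumed layout: one 4-byte unsigned taxid per GI


def build_bin_dict(lines, out, flush_bytes=30 * 1024 * 1024):
    """Write taxids to `out`, buffering in memory and flushing periodically.

    Hypothetical assumptions: each line is "gi<TAB>taxid", lines are sorted
    by ascending GI, and GIs absent from the input are stored as taxid 0.
    Returns the number of flushes performed.
    """
    buf = bytearray()
    next_gi = 0
    flushes = 0
    for line in lines:
        gi_str, taxid_str = line.split()
        gi, taxid = int(gi_str), int(taxid_str)
        if gi < next_gi:
            raise ValueError("input must be sorted by ascending GI")
        # Pad the gap so that the record for GI n sits at byte offset 4 * n.
        buf += b"\x00" * (RECORD.size * (gi - next_gi))
        buf += RECORD.pack(taxid)
        next_gi = gi + 1
        # Flush once the in-memory buffer reaches the threshold.
        if len(buf) >= flush_bytes:
            out.write(buf)
            buf.clear()
            flushes += 1
    if buf:  # write whatever is left after the last line
        out.write(buf)
        flushes += 1
    return flushes


def lookup(data, gi):
    """Return the taxid stored for `gi` in the packed bytes `data`."""
    return RECORD.unpack_from(data, RECORD.size * gi)[0]


if __name__ == "__main__":
    out = io.BytesIO()
    # A tiny threshold (8 bytes) forces several flushes on this toy input.
    build_bin_dict(["1\t9606", "2\t10090", "5\t562"], out, flush_bytes=8)
    print(lookup(out.getvalue(), 5))
```

The threshold trades memory for fewer write calls: a very small buffer behaves like the "all on disk" approach, and a very large one like "all in memory". Note that this sketch pads gaps in one step, so a very large gap between consecutive GIs could briefly exceed the threshold.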
This should now be fixed in version 0.12.