Subject: makefile of Bio-LITE-Taxonomy-NCBI-Gi2taxid
Date: Wed, 3 Dec 2014 08:43:09 +0100
To: <bug-Bio-LITE-Taxonomy-NCBI-Gi2taxid [...] rt.cpan.org>
From: "Denis BAURAIN" <denis.baurain [...] ulg.ac.be>
Hi Miguel,
I've noticed a minor issue with the Makefile.PL of the current release (0.12) of your module Bio-LITE-Taxonomy-NCBI-Gi2taxid. You request:
'File::Tail' => 0.96
but this should read:
'File::Tail' => 0.096
Otherwise your module fails to install without forcing.
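A minimal sketch of why the two spellings behave differently, using the core version module. The installed File::Tail version used here (0.99.3) is an assumption for illustration, and the helper name satisfies is hypothetical; the installer's real prerequisite check may differ in detail. A version string with two dots is read as dotted-decimal, so 0.99.3 compares as v0.99.3, while the decimal 0.96 compares as v0.960.0 and the decimal 0.096 as v0.96.0.

```perl
use strict;
use warnings;
use version;

# Assumption: the installed File::Tail reports the dotted version 0.99.3.
my $installed = version->parse('0.99.3');

# Hypothetical helper: true when the installed version meets the
# requested minimum, compared with version.pm's overloaded operators.
sub satisfies {
    my ($requested) = @_;
    return $installed >= version->parse($requested);
}

# Compare the requirement as currently written with the proposed one.
printf "%-6s %s\n", $_, satisfies($_) ? 'satisfied' : 'NOT satisfied'
    for '0.96', '0.096';
```

Under that assumption, the requirement 0.96 is larger than anything in the 0.99.x series, whereas 0.096 is not.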
All the best,
Denis
--
Prof. Denis BAURAIN
Eukaryotic Phylogenomics
Department of Life Sciences
University of Liège
Sart Tilman, Bât. B22
B-4000 Liège, Belgium
At Fri, 30 May 2014 16:22:48 +0200, denis.baurain@ulg.ac.be wrote:
>Thank you, Miguel.
>This is great! :-)
>
>All the best,
>Denis
>
>At Fri, 30 May 2014 10:08:31 +0100, mp@ebi.ac.uk wrote:
>
>>Hi Denis,
>>
>>Thanks a lot for your email.
>>Your request makes a lot of sense. I have included your patch, added a
>>couple of tests and corrected the indentation :-)
>>
>>I have uploaded the new version to cpan (v0.09). It may take some time
>>to reach all the mirrors but it should be available during the morning.
>>
>>Please, let me know if you have any other problem with the module.
>>
>>Best regards,
>>
>>M;
>>
>>
>>On 29/05/14 22:05, Denis BAURAIN wrote:
>>> Dear Miguel,
>>>
>>> During the last few years, I have been developing an extensive suite
>>of Perl modules for phylogenomics. My modules do not use BioPerl but
>>rest on Bio::Phylo and on your Bio::LITE::Taxonomy distribution. Note
>>that you will not find them on CPAN yet because they are still in
>>active development.
>>>
>>> Anyway, I make extensive use of NCBI Taxonomy in my work and the fact
>>that Bio::LITE::Taxonomy does not handle synonyms drives me crazy
>>because NCBI often renames organisms: e.g., Canis familiaris does not
>>work (should be Canis lupus familiaris); neither does Xenopus tropicalis
>>(should be Xenopus (Silurana) tropicalis), etc.
>>>
>>> Therefore, I propose a very small patch to handle them in
>>your module. The idea is simply to include the synonyms as additional
>>keys for a given taxon id. This would increase the size of the taxon id
>>lookup by about 15% considering the current version of the NCBI Taxonomy
>>database:
>>>
>>> $ grep -c 'scientific name' taxdump/names.dmp
>>> 1160242
>>> $ grep -c 'synonym' taxdump/names.dmp
>>> 187443
>>>
>>> From my understanding of your module, this should not have adverse
>>effects since there will still be only one name per taxon id when going
>>from taxon id to name. It is only in the opposite direction that
>>synonyms will come into play.
>>>
>>> I attach a patch file. Here's the patch in context (sorry for the
>>funny indentation; it appears like this on my system):
>>>
>>> sub _name_nodes
>>> {
>>>   my ($self) = @_;
>>>   my $namesFile = $self->{namesFile};
>>>   my $nodesNames;
>>>   if ((UNIVERSAL::isa($namesFile, 'GLOB')) or (ref \$namesFile eq 'GLOB')) {
>>>     $nodesNames = $namesFile;
>>>   } else {
>>>     open $nodesNames, "<", $namesFile or croak $!;
>>>   }
>>>   while (<$nodesNames>){
>>>     chomp;
>>>     my ($taxId,$taxName,$comment) = _process_tax_name ($_);
>>>     if ($comment eq "scientific name"){
>>>       ${$self->{nodes}->{$taxId}}{name} = $taxName;
>>>       $self->{names}->{$taxName} = $taxId;
>>>     }
>>>     elsif ($comment eq "synonym") {
>>>       $self->{names}->{$taxName} = $taxId;
>>>     }
>>>   }
>>>   close $nodesNames;
>>> }
>>>
>>> Would you consider this change for your next release? This
>>would help me a lot. If you need help writing tests for this
>>feature, please let me know.
>>>
>>> Thank you very much for your time!
>>>
>>> Best regards,
>>> Denis
>>>
>>
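
The effect of the quoted patch can be sketched in isolation. This is only an illustration of the idea, not the module's code: the sample rows, the sub name build_lookups and the hash names are hypothetical, and the real module parses names.dmp through its own _process_tax_name. Synonyms become extra keys in the name-to-id lookup, while the id-to-name lookup keeps a single scientific name per taxon id.

```perl
use strict;
use warnings;

# Hypothetical sample rows: [taxon id, name, name class].
my @rows = (
    [9615, 'Canis lupus familiaris', 'scientific name'],
    [9615, 'Canis familiaris',       'synonym'],
    [9615, 'dog',                    'genbank common name'],
);

# Build both lookups following the logic of the patch:
# scientific names fill both directions, synonyms only name -> id.
sub build_lookups {
    my (@records) = @_;
    my (%name_of, %id_of);
    for my $rec (@records) {
        my ($tax_id, $name, $class) = @$rec;
        if ($class eq 'scientific name') {
            $name_of{$tax_id} = $name;
            $id_of{$name}     = $tax_id;
        }
        elsif ($class eq 'synonym') {
            $id_of{$name} = $tax_id;
        }
    }
    return (\%name_of, \%id_of);
}

my ($name_of, $id_of) = build_lookups(@rows);

# Name -> id: the scientific name and the synonym resolve, other classes do not.
print "$_ => ", $id_of->{$_} // 'undef', "\n"
    for 'Canis lupus familiaris', 'Canis familiaris', 'dog';

# Id -> name: still exactly one name per taxon id.
print "9615 => $name_of->{9615}\n";
```

Other name classes, such as common names, are deliberately left out, as in the patch.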