Subject: BioPerl, NCBI Eutilities genbank format
Date: Tue, 24 Mar 2009 16:02:15 -0500
To: <bug-bioperl [...] rt.cpan.org>
From: "Cathy Gresham" <gresham [...] cse.msstate.edu>
BioPerl version 1.6.0
Perl version 5.8.8
Linux Suse version 10 sp2
I noticed that the genbank format in BioPerl was not returning all of the
DBSOURCE entries, especially the xrefs (non-sequence databases), when collected like this:
my @db_links = ();
my $collection = $seq->annotation;
for my $dblink ( $collection->get_Annotations('dblink') ) {
    # my $temp_link = sprintf("%s:%s", $dblink->database, $dblink->primary_id);
    my $temp_link = $dblink->database . ":" . $dblink->primary_id;
    push( @db_links, $temp_link );
}
I looked in genbank.pm and saw that it only returns them if the DBSOURCE has
an initial line beginning with "swissprot:". The line I had began with "UniProtKB:".
I added the following lines:
idb:/usr/lib/perl5/site_perl/5.8.8/Bio/SeqIO # diff genbank.pm genbankKEEP.pm
463,465c463
< if (($dbsource =~ s/swissprot:\s+locus\s+(\S+)\,.+\n// ) ||
< ($dbsource =~ s/UniProtKB:\s+locus\s+(\S+)\,.+\n// ) ||
< ($dbsource =~ s/UniProt:\s+locus\s+(\S+)\,.+\n// )) {
---
> if( $dbsource =~ s/swissprot:\s+locus\s+(\S+)\,.+\n// ) {
It seems to be returning the values now.
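For illustration only, the three tests in the patch could probably be folded into one substitution with an alternation. The sketch below is standalone Perl and does not use BioPerl. The DBSOURCE text, locus name, and accession in it are made up, and it has not been tried against genbank.pm itself. One difference from the patch: the alternation matches whichever prefix appears first in the text, rather than trying swissprot first.

```perl
use strict;
use warnings;

# Hypothetical DBSOURCE text; the locus, accession, and xref are
# invented for this example and do not come from a real record.
my $dbsource = "UniProtKB: locus EXAMPLE_HUMAN, accession P00000;\n"
             . "xrefs (non-sequence databases): GeneID:0000\n";

# A single alternation covers the three prefixes from the patch.
# On a match, the whole first line is removed and the locus is captured.
my $locus;
if ( $dbsource =~ s/(?:swissprot|UniProtKB|UniProt):\s+locus\s+(\S+)\,.+\n// ) {
    $locus = $1;
}

print defined $locus ? "locus: $locus\n" : "no match\n";
print "remaining: $dbsource";
```

What is left in $dbsource after the substitution is the part that holds the xrefs.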
Cathy