Subject: Bio::SeqIO::fastq has a bug
Date: Sun, 14 Sep 2014 23:49:18 +0800
To: bug-bioperl [...] rt.cpan.org
From: Brook Nong <brooknong [...] gmail.com>
I found a bug when using the Bio::SeqIO module to read files in FASTQ
format. Most files were processed successfully, but not all.
Any file containing a record whose quality line starts with an '@'
fails to read. For example:
@Illumina_SRR125365.38 s_5_1_0001_qseq_37 length=76
CCGCCATTTCTTCAAATCTTTTCTTTTCTTTAGGAGTCATCAATTTCCATTTCTCTGCACATTTCTTTGAAAATTA
+Illumina_SRR125365.38 s_5_1_0001_qseq_37 length=76
@CCCCCCCCBCCCCCCCCCCCAACCCCCCCCC?CCCCCCCCCCCCCAACCCCCCCCCCCCCCCCCCCCB??<BC>#
The error output is shown below:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: Unknown symbol with ASCII value 62 outside of quality range
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/local/share/perl/5.14.2/Bio/Root/Root.pm:449
STACK: Bio::SeqIO::fastq::next_dataset /usr/local/share/perl/5.14.2/Bio/SeqIO/fastq.pm:132
STACK: Bio::SeqIO::fastq::next_seq /usr/local/share/perl/5.14.2/Bio/SeqIO/fastq.pm:51
STACK: pair_fix.pl:50
-----------------------------------------------------------
When I deleted these records, the files were read without errors again.
Distribution name and version: Bio::SeqIO, 1.006924
Perl version: perl 5, version 14, subversion 2 (v5.14.2) built for
x86_64-linux-gnu-thread-multi
Operating System vendor and version: 81~precise1-Ubuntu SMP Tue Jul 15
04:02:22 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
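
In case it helps with reproducing this, here is a minimal sketch of how the affected records might be located in a file. It is written in Python rather than BioPerl and is not how Bio::SeqIO::fastq works internally. It assumes every record is exactly four lines (no wrapped sequence or quality lines). The function name find_at_quality_records and the second record in the sample data are made up for illustration.

```python
import io
from itertools import islice


def find_at_quality_records(handle):
    """Yield (record_number, header) for records whose quality line starts with '@'.

    Assumption: strict four-line FASTQ records (header, sequence, '+' line, quality).
    """
    record_number = 0
    while True:
        # Read the next four lines as one record, regardless of their first character.
        lines = [line.rstrip("\r\n") for line in islice(handle, 4)]
        if not lines:
            break
        if len(lines) < 4:
            raise ValueError("truncated record at end of input")
        record_number += 1
        header, _sequence, separator, quality = lines
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"record {record_number} is not four-line FASTQ")
        if quality.startswith("@"):
            yield record_number, header


# The record from this report, followed by a made-up record for contrast.
example = io.StringIO(
    "@Illumina_SRR125365.38 s_5_1_0001_qseq_37 length=76\n"
    "CCGCCATTTCTTCAAATCTTTTCTTTTCTTTAGGAGTCATCAATTTCCATTTCTCTGCACATTTCTTTGAAAATTA\n"
    "+Illumina_SRR125365.38 s_5_1_0001_qseq_37 length=76\n"
    "@CCCCCCCCBCCCCCCCCCCCAACCCCCCCCC?CCCCCCCCCCCCCAACCCCCCCCCCCCCCCCCCCCB??<BC>#\n"
    "@hypothetical_read_2\n"
    "ACGT\n"
    "+\n"
    "CCCC\n"
)

for number, header in find_at_quality_records(example):
    print(number, header)
```

Because the records are split by line count rather than by looking for a leading '@', a quality line that begins with '@' is not mistaken for the start of a new record.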