CC: bug-Bio-SamTools [...] rt.cpan.org
Subject: Re: CQ tag bug
Date: Mon, 8 Feb 2010 11:30:16 -0500
To: Juan Lorenzo Rodriguez Flores <juan [...] ucsd.edu>
From: Lincoln Stein <lincoln.stein [...] gmail.com>
Hello,
Please apply the following patch to lib/Bio/DB/Sam.xs and let me know if it
fixes your problem:
Index: lib/Bio/DB/Sam.xs
===================================================================
--- lib/Bio/DB/Sam.xs (revision 22660)
+++ lib/Bio/DB/Sam.xs (working copy)
@@ -178,28 +178,6 @@
return 0;
}
-/* copied from bam_aux.c because "we need it" */
-/* no longer needed with 0.1.4
-uint8_t *bam_aux_get_core(bam1_t *b, const char tag[2])
-{
- uint8_t *s;
- int y = tag[0]<<8 | tag[1];
- s = bam1_aux(b);
- while (s < b->data + b->data_len) {
- int type, x = (int)s[0]<<8 | s[1];
- s += 2;
- if (x == y) return s;
- type = toupper(*s); ++s;
- if (type == 'C') ++s;
- else if (type == 'S') s += 2;
- else if (type == 'I' || type == 'F') s += 4;
- else if (type == 'D') s += 8;
- else if (type == 'Z' || type == 'H') { while (*s) putchar(*s++); ++s; }
- }
- return 0;
-}
-*/
-
MODULE = Bio::DB::Sam PACKAGE = Bio::DB::Tam PREFIX=tam_
Bio::DB::Tam
@@ -626,10 +604,11 @@
{
s = bam1_aux(b); /* s is a khash macro */
while (s < b->data + b->data_len) {
+ fprintf(stderr,"tag=%c%c\n",s[0],s[1]);
XPUSHs(sv_2mortal(newSVpv(s,2)));
s += 2;
type = *s++;
- if (type == 'A') { printf("A:%c", *s); ++s; }
+ if (type == 'A') { ++s; }
else if (type == 'C') { ++s; }
else if (type == 'c') { ++s; }
else if (type == 'S') { s += 2; }
@@ -637,7 +616,7 @@
else if (type == 'I') { s += 4; }
else if (type == 'i') { s += 4; }
else if (type == 'f') { s += 4; }
- else if (type == 'Z' || type == 'H') { while (*s) ++s; }
+ else if (type == 'Z' || type == 'H') { while (*s) ++(s); ++(s); }
}
}
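
For reference, here is a minimal standalone sketch of the aux-field walk that the last hunk corrects. It is not code from Bio::DB::Sam or samtools: the function name aux_tag_names and the skip_nul switch are hypothetical, introduced only to show the loop with and without the extra increment past the NUL that terminates a 'Z' or 'H' value. It covers only the type codes that appear in the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of Bio::DB::Sam or samtools).
 * Walks a raw BAM aux buffer and copies up to max_tags two-character
 * tag names into tags[]. Returns the number of tags encountered.
 * skip_nul = 1 mirrors the patched loop; skip_nul = 0 mirrors the
 * pre-patch loop, which stopped ON the NUL ending a 'Z'/'H' value. */
size_t aux_tag_names(const uint8_t *aux, size_t len, int skip_nul,
                     char tags[][3], size_t max_tags)
{
    size_t pos = 0;
    size_t n = 0;

    /* Each field needs at least 2 tag bytes plus 1 type byte. */
    while (pos + 3 <= len) {
        if (n < max_tags) {
            memcpy(tags[n], aux + pos, 2);
            tags[n][2] = '\0';
        }
        ++n;
        pos += 2;
        uint8_t type = aux[pos++];

        if (type == 'A' || type == 'c' || type == 'C') {
            pos += 1;
        } else if (type == 's' || type == 'S') {
            pos += 2;
        } else if (type == 'i' || type == 'I' || type == 'f') {
            pos += 4;
        } else if (type == 'Z' || type == 'H') {
            while (pos < len && aux[pos]) ++pos;  /* stop on the NUL  */
            if (skip_nul && pos < len) ++pos;     /* the fix: skip it */
        }
        /* Unknown type codes advance nothing, as in the patched loop. */
    }
    return n;
}
```

Given a buffer holding CS:Z:t312 followed by CQ:Z:8589 (each value NUL-terminated), calling this with skip_nul = 1 yields two tags, CS and CQ. With skip_nul = 0 the walk resumes on the NUL after the CS value, misreads the following bytes, and yields four "tags", the last two being Z8 and 89, which is the same kind of chopped-up output described in the report below.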
On Sun, Feb 7, 2010 at 3:17 PM, Juan Lorenzo Rodriguez Flores <juan@ucsd.edu> wrote:
> Dear Dr. Stein,
> I believe I have found a bug in the Alignment object.
>
> I am trying to extract the CQ (colorspace quality) string from a
> Bio::DB::Bam::Alignment object using the @values array. The CQ field
> is mangled and chopped into pieces.
>
> For example, the following code:
>
> my @tags=$aln->get_all_tags;
> foreach my $b (@tags) {
> print "$b\n";
> }
>
> Prints out the following "tags":
> AS
> NH
> IH
> HI
> CS
> C
> Z8
> 89
> 4*
> ,7
> 47
> ;7
> 57
> ;9
> 8:
> 86
> <>
> 5<
> 99
> 62
> >,
> 64
> ;7
>
> The first 5 are real tags; the rest are a chopped-up version of the
> quality string. The proper tags, including the full quality string, are:
>
> AS:i:1450
> NH:i:1
> IH:i:1
> HI:i:1
> CS:Z:t31200103321223210230002020132130130003000210112003
> CQ:Z:858964*&,7;479;75576;968:>86?<><5<199162#>,864;;73
>
>
> --
> Juan Lorenzo Rodriguez-Flores, Ph.D.
> POSTAL:
> Moores UCSD Cancer Center, # 0901
> 3855 Health Sciences Drive,
> La Jolla, CA 92093-0901
> OFFICE: Moores 3rd Floor, Rm 3352
> MAP: http://tinyurl.com/ltctyy
> BLOG: http://www.juansearch.com
> EMAIL: <juan@ucsd.edu>
>
--
Lincoln D. Stein
Director, Informatics and Biocomputing Platform
Ontario Institute for Cancer Research
101 College St., Suite 800
Toronto, ON, Canada M5G0A3
416 673-8514
Assistant: Renata Musa <Renata.Musa@oicr.on.ca>