Bug #76040 for DBD-SQLite: functions with state

Sun Mar 25 21:55:23 2012 mochan [...] fis.unam.mx - Ticket created

Subject:	functions with state
Date:	Sun, 25 Mar 2012 19:55:09 -0600
To:	bug-DBD-SQLite [...] rt.cpan.org
From:	Luis Mochan <mochan [...] fis.unam.mx>

I'm not sure this is a bug, but it certainly was problematic and took me a long time to sort out, and I couldn't find documentation on the following behavior. In order to obtain the first $N rows for each value of a given identifier id, I used statements like my %counters; $dbh->func("mycount", 1, sub {++$counters{$_[0]}}, "create_function"); my $results=$dbh->selectall_arrayref( "SELECT mycount(id) AS cid, id, moreInfo WHERE cid<=$N" ); My idea was to hold in $counters{firstid}, $conters{secondid},... the number of times that 'firstid´, 'secondid' ... had appeared during the search, using the hash %counters as an array of counters to keep the state during the 'select'. The problem is that I obtained less rows than expected. Curiously, changing the where clause to WHERE cid<=$N AND cid<=$N I obtained an even smaller number of rows, as if the first comparison differed from the second one. I found out that the reason for these odd (at least, surprising, IMHO) behaviors is that each time I use 'cid' the function sub {++$counters{$_[0]} } was called. I expected to use 'cid' as if it were a variable holding the result of the function, and didn't expect it to work as a macro expanssion. Is this a bug or was my expectation wrong? Is there a way of holding computed values as if they were variables without re-doing the computation? Best regards, Luis -- o W. Luis Mochán, | tel:(52)(777)329-1734 /<(*) Instituto de Ciencias Físicas, UNAM | fax:(52)(777)317-5388 `>/ /\ Apdo. Postal 48-3, 62251 | (*)/\/ \ Cuernavaca, Morelos, México | mochan@fis.unam.mx /\_/\__/ O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Mon Mar 26 00:19:15 2012 ishigaki [...] cpan.org - Correspondence added

Hi. 1) If your function does "aggregate" (like "count" or "sum"), you should use "create_aggregate" instead of "create_function". 2) Consult an appropriate SQL cookbook/tutorial to learn how to do "self join". You don't need to create a custom aggregate function; something like the following should do the trick for you: SELECT * FROM (SELECT ROWID, * FROM foo) AS a JOIN (SELECT ROWID, * FROM foo) AS b ON a.id = b.id AND a.ROWID >= b.ROWID GROUP BY a.ROWID HAVING count(*) <= 2 Hope this helps. On Mon Mar 26 10:55:23 2012, mochan@fis.unam.mx wrote: Show quoted text

> I'm not sure this is a bug, but it certainly was problematic and took > me a long time to sort out, and I couldn't find documentation on the > following behavior. In order to obtain the first $N rows for each > value of a given identifier id, I used statements like > > my %counters; > $dbh->func("mycount", 1, sub {++$counters{$_[0]}},

"create_function"); Show quoted text

> my $results=$dbh->selectall_arrayref( > "SELECT mycount(id) AS cid, id, moreInfo WHERE cid<=$N" > ); > > My idea was to hold in $counters{firstid}, $conters{secondid},... the > number of times that 'firstid´, 'secondid' ... had appeared during the > search, using the hash %counters as an array of counters to keep the > state during the 'select'. The problem is that I obtained less rows

than Show quoted text

> expected. Curiously, changing the where clause to > > WHERE cid<=$N AND cid<=$N > > I obtained an even smaller number of rows, as if the first comparison > differed from the second one. I found out that the reason > for these odd (at least, surprising, IMHO) behaviors is that each time > I use 'cid' the function > sub {++$counters{$_[0]} } > was called. I expected to use 'cid' as if it were a variable holding > the result of the function, and didn't expect it to work as a macro > expanssion. Is this a bug or was my expectation wrong? Is there a way > of holding computed values as if they were variables without re-doing > the computation? > > Best regards, > Luis >

Mon Mar 26 00:19:17 2012 The RT System itself - Status changed from 'new' to 'open'

Mon Mar 26 09:13:51 2012 mochan [...] fis.unam.mx - Correspondence added

Subject:	Re: [rt.cpan.org #76040] functions with state
Date:	Mon, 26 Mar 2012 07:13:27 -0600
To:	bug-DBD-SQLite [...] rt.cpan.org
From:	Luis Mochán <mochan [...] fis.unam.mx>

Thanks! My function does not aggregate, as I wanted not one but several values, but anyway, the self join seems to be a neat solution; I'll try it as soon as I reach my office. Thanks and best regards, Luis Kenichi Ishigaki via RT <bug-DBD-SQLite@rt.cpan.org> wrote: Show quoted text

><URL: https://rt.cpan.org/Ticket/Display.html?id=76040 > > >Hi. > >1) If your function does "aggregate" (like "count" or "sum"), you >should use "create_aggregate" instead of "create_function". > >2) Consult an appropriate SQL cookbook/tutorial to learn how to do >"self join". You don't need to create a custom aggregate function; >something like the following should do the trick for you: > > SELECT * > FROM (SELECT ROWID, * FROM foo) AS a > JOIN (SELECT ROWID, * FROM foo) AS b > ON a.id = b.id AND a.ROWID >= b.ROWID > GROUP BY a.ROWID HAVING count(*) <= 2 > >Hope this helps. > >On Mon Mar 26 10:55:23 2012, mochan@fis.unam.mx wrote:

>> I'm not sure this is a bug, but it certainly was problematic and took >> me a long time to sort out, and I couldn't find documentation on the >> following behavior. In order to obtain the first $N rows for each >> value of a given identifier id, I used statements like >> >> my %counters; >> $dbh->func("mycount", 1, sub {++$counters{$_[0]}},

>"create_function");

>> my $results=$dbh->selectall_arrayref( >> "SELECT mycount(id) AS cid, id, moreInfo WHERE cid<=$N" >> ); >> >> My idea was to hold in $counters{firstid}, $conters{secondid},... the >> number of times that 'firstid´, 'secondid' ... had appeared during

>the

>> search, using the hash %counters as an array of counters to keep the >> state during the 'select'. The problem is that I obtained less rows

>than

>> expected. Curiously, changing the where clause to >> >> WHERE cid<=$N AND cid<=$N >> >> I obtained an even smaller number of rows, as if the first comparison >> differed from the second one. I found out that the reason >> for these odd (at least, surprising, IMHO) behaviors is that each

>time

>> I use 'cid' the function >> sub {++$counters{$_[0]} } >> was called. I expected to use 'cid' as if it were a variable holding >> the result of the function, and didn't expect it to work as a macro >> expanssion. Is this a bug or was my expectation wrong? Is there a way >> of holding computed values as if they were variables without re-doing >> the computation? >> >> Best regards, >> Luis >>

-- Enviado vía señales de humo.

Mon Mar 26 10:14:41 2012 mochan [...] fis.unam.mx - Correspondence added

Subject:	Re: [rt.cpan.org #76040] functions with state
Date:	Mon, 26 Mar 2012 08:14:32 -0600
To:	Kenichi Ishigaki via RT <bug-DBD-SQLite [...] rt.cpan.org>
From:	Luis Mochan <mochan [...] fis.unam.mx>

The self-join seems to work as desired! Thanks again. Best regards, Luis On Mon, Mar 26, 2012 at 12:19:16AM -0400, Kenichi Ishigaki via RT wrote: Show quoted text

> <URL: https://rt.cpan.org/Ticket/Display.html?id=76040 > > > Hi. > > 1) If your function does "aggregate" (like "count" or "sum"), you > should use "create_aggregate" instead of "create_function". > > 2) Consult an appropriate SQL cookbook/tutorial to learn how to do > "self join". You don't need to create a custom aggregate function; > something like the following should do the trick for you: > > SELECT * > FROM (SELECT ROWID, * FROM foo) AS a > JOIN (SELECT ROWID, * FROM foo) AS b > ON a.id = b.id AND a.ROWID >= b.ROWID > GROUP BY a.ROWID HAVING count(*) <= 2 > > Hope this helps. > > On Mon Mar 26 10:55:23 2012, mochan@fis.unam.mx wrote:

> > I'm not sure this is a bug, but it certainly was problematic and took > > me a long time to sort out, and I couldn't find documentation on the > > following behavior. In order to obtain the first $N rows for each > > value of a given identifier id, I used statements like > > > > my %counters; > > $dbh->func("mycount", 1, sub {++$counters{$_[0]}},

> "create_function");

> > my $results=$dbh->selectall_arrayref( > > "SELECT mycount(id) AS cid, id, moreInfo WHERE cid<=$N" > > ); > > > > My idea was to hold in $counters{firstid}, $conters{secondid},... the > > number of times that 'firstid´, 'secondid' ... had appeared during the > > search, using the hash %counters as an array of counters to keep the > > state during the 'select'. The problem is that I obtained less rows

> than

> > expected. Curiously, changing the where clause to > > > > WHERE cid<=$N AND cid<=$N > > > > I obtained an even smaller number of rows, as if the first comparison > > differed from the second one. I found out that the reason > > for these odd (at least, surprising, IMHO) behaviors is that each time > > I use 'cid' the function > > sub {++$counters{$_[0]} } > > was called. I expected to use 'cid' as if it were a variable holding > > the result of the function, and didn't expect it to work as a macro > > expanssion. Is this a bug or was my expectation wrong? Is there a way > > of holding computed values as if they were variables without re-doing > > the computation? > > > > Best regards, > > Luis > >

> > >

-- o W. Luis Mochán, | tel:(52)(777)329-1734 /<(*) Instituto de Ciencias Físicas, UNAM | fax:(52)(777)317-5388 `>/ /\ Apdo. Postal 48-3, 62251 | (*)/\/ \ Cuernavaca, Morelos, México | mochan@fis.unam.mx /\_/\__/ O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

Wed May 16 01:15:26 2012 ishigaki [...] cpan.org - Correspondence added

Marked as resolved. Thanks.

Wed May 16 01:15:32 2012 ishigaki [...] cpan.org - Status changed from 'open' to 'resolved'