Bug #79467 for PerlX-ArraySkip: Skipping every other element

Wed Sep 05 19:44:15 2012 sharyanto [...] cpan.org - Ticket created

Subject:

Skipping every other element

Instead of making multiple skip calls (each of which incurs its own call overhead), like:

    give(
        arg giver     => $alice,
        arg recipient => $bob,
        arg gift      => $dinosaur,
        arg wrapped   => 1,
    );

how about something that can skip every other element, like:

    sub give { say join " ", @_ }
    sub array_keys   { my $i; grep { ++$i % 2 } @_ }
    sub array_values { my $i; grep { $i++ % 2 } @_ }

    give(array_keys   foo=>1, bar=>2, baz=>3);
    give(array_values foo=>1, bar=>2, baz=>3);

I suspect there is prior art on CPAN, though.

Wed Sep 05 21:20:56 2012 perl [...] toby.ink - Correspondence added

You might expect something like your array_values function to be a lot faster than arrayskip, but according to my benchmarking in many cases it is not.

array_values appears to start winning when there's seven or more positional parameters (i.e. 14 items in the list, seven of which need to be removed). Below that, PerlX::ArraySkip wins. (This is tested on Perl 5.16.0 on a reasonably slow computer.)

An XS implementation of either might be fun to try.

Also worth thinking about would be some Devel::Declare trickery to detect:

    arrayskip BAREWORD =>

at compile time and completely remove it from the op tree. In cases where there wasn't a bareword (e.g. a quoted string, or a more complex expression) then the arrayskip call would be left as-is. That would incur a slight compile-time penalty, but eliminate the run-time penalty altogether.

Subject:

bench-arrayskip.pl

    use Benchmark ':all';
    use PerlX::ArraySkip 'skip';

    sub give { die "ARGH" unless $_[0]==1 && $_[1]==2 && $_[2]==3 }
    sub array_values { my $i; grep { $i++ % 2 } @_ }

    cmpthese(200_000, {
        AllInOne  => sub { give(array_values foo => 1, bar => 2, baz => 3) },
        ArraySkip => sub { give(skip foo => 1, skip bar => 2, skip baz => 3) },
    });

    cmpthese(200_000, {
        LongAllInOne  => sub { give(array_values foo => 1, bar => 2, baz => 3, quux => 4, quuux => 5, xyzzy => 6, garble => 7) },
        LongArraySkip => sub { give(skip foo => 1, skip bar => 2, skip baz => 3, skip quux => 4, skip quuux => 5, skip xyzzy => 6, skip garble => 7) },
    });

    cmpthese(200_000, {
        VeryLongAllInOne  => sub { give(array_values foo => 1, bar => 2, baz => 3, quux => 4, quuux => 5, xyzzy => 6, garble => 7, alice => 8, bob => 9, carol => 10) },
        VeryLongArraySkip => sub { give(skip foo => 1, skip bar => 2, skip baz => 3, skip quux => 4, skip quuux => 5, skip xyzzy => 6, skip garble => 7, skip alice => 8, skip bob => 9, skip carol => 10) },
    });
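To see where the per-label overhead in the ArraySkip cases may come from, here is a minimal sketch. It assumes skip behaves like "drop the first argument, return the rest"; the skip defined below is a hypothetical stand-in with a call counter, not the module's actual code. Because skip is a list operator, each call swallows everything to its right, so the calls nest and every label costs one sub call.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical stand-in for PerlX::ArraySkip's skip (assumed behaviour:
# discard the first argument and return the remaining ones).
sub skip { $calls++; shift; return @_ }

# Parses as skip('foo', 1, skip('bar', 2, skip('baz', 3))):
# the innermost call returns (3), the middle one (2, 3), the outer (1, 2, 3).
my @got = (skip foo => 1, skip bar => 2, skip baz => 3);

print "result: @got\n";     # result: 1 2 3
print "calls:  $calls\n";   # calls:  3
```

Under that assumption, values near the end of the list are passed through every enclosing call, which would fit the observation that the all-in-one approach catches up as the list grows.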

Wed Sep 05 21:20:57 2012 The RT System itself - Status changed from 'new' to 'open'

Thu Sep 06 04:16:44 2012 sharyanto [...] cpan.org - Correspondence added

To be fair, grep is not the fastest method here. Here's another take. My array_values() wins all benchmark cases. There might be a faster version.

Subject:

bench-arrayskip-2.pl

    use Benchmark ':all';
    use PerlX::ArraySkip 'skip';

    sub give { die "ARGH" unless $_[0]==1 && $_[1]==2 && $_[2]==3 }

    my @odds = map { $_*2-1 } 1..100;
    sub array_values { @_[@odds[0..$#_/2]] }

    cmpthese(900_000, {
        AllInOne  => sub { give(array_values foo => 1, bar => 2, baz => 3) },
        ArraySkip => sub { give(skip foo => 1, skip bar => 2, skip baz => 3) },
    });

    cmpthese(900_000, {
        LongAllInOne  => sub { give(array_values foo => 1, bar => 2, baz => 3, quux => 4, quuux => 5, xyzzy => 6, garble => 7) },
        LongArraySkip => sub { give(skip foo => 1, skip bar => 2, skip baz => 3, skip quux => 4, skip quuux => 5, skip xyzzy => 6, skip garble => 7) },
    });

    cmpthese(900_000, {
        VeryLongAllInOne  => sub { give(array_values foo => 1, bar => 2, baz => 3, quux => 4, quuux => 5, xyzzy => 6, garble => 7, alice => 8, bob => 9, carol => 10) },
        VeryLongArraySkip => sub { give(skip foo => 1, skip bar => 2, skip baz => 3, skip quux => 4, skip quuux => 5, skip xyzzy => 6, skip garble => 7, skip alice => 8, skip bob => 9, skip carol => 10) },
    });
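The slice-based array_values in this attachment is terse, so here is a small worked example of how the index arithmetic plays out for a six-element list. The variable names @args, @picked and @values are only for illustration.

```perl
use strict;
use warnings;

# Precomputed odd indices 1, 3, 5, ... 199, as in the attachment.
my @odds = map { $_ * 2 - 1 } 1 .. 100;

# Take a slice of @_ at the first (number of pairs) odd indices.
sub array_values { @_[ @odds[ 0 .. $#_ / 2 ] ] }

my @args = (foo => 1, bar => 2, baz => 3);

# $#args is 5, so the range is 0 .. 2.5, which the range operator
# truncates to 0 .. 2; that selects the odd indices 1, 3 and 5.
my @picked = @odds[ 0 .. $#args / 2 ];
my @values = array_values(@args);

print "indices: @picked\n";   # indices: 1 3 5
print "values:  @values\n";   # values:  1 2 3
```

Note that the precomputed table only holds 100 indices, so this version appears to be limited to lists of up to 200 elements (100 label/value pairs) unless @odds is made larger.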

Thu Sep 06 08:20:48 2012 perl [...] toby.ink - Correspondence added

On 2012-09-06T09:16:44+01:00, SHARYANTO wrote:

> To be fair, grep is not the fastest method here.
>
> Here's another take. My array_values() wins all benchmark cases. There
> might be faster version.

Yes, that's a much faster array_values implementation. On my computer, arrayskip still wins in the three parameter case, but the difference is negligible (about 5%).

I also came up with this as a nifty array_values implementation:

    sub array_values { my $i=1; grep { $i=!$i } @_ }

... which is faster than the modulus arithmetic array_values implementation, but not as fast as the one with precomputed @odds.
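As a quick illustration of the toggle trick, the sketch below replays the same flag flipping outside of grep and records the decision for each element. The names $flag and @trace are illustrative only. The flag starts true, is negated before each test, and so is false for labels (dropped) and true for values (kept).

```perl
use strict;
use warnings;

# The toggle-based implementation from the message above.
sub array_values { my $i = 1; grep { $i = !$i } @_ }

# Replay the toggle by hand to show which elements survive.
my @trace;
my $flag = 1;
for my $item (foo => 1, bar => 2) {
    $flag = !$flag;    # false on labels, true on values
    push @trace, sprintf "%s:%s", $item, ($flag ? "keep" : "drop");
}

my @values = array_values(foo => 1, bar => 2, baz => 3);

print "@trace\n";    # foo:drop 1:keep bar:drop 2:keep
print "@values\n";   # 1 2 3
```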

Thu Sep 06 09:51:54 2012 perl [...] toby.ink - Correspondence added

The Devel::Declare version manages about a 40% speed-up on my original version. This is faster than array_values for lists of up to 6 positional parameters (i.e. 12 parameters including the labels).

It manages this by rewriting:

    give(skip foo => 1, skip bar => 2);

to:

    give(skip(), 1, skip(), 2);

The skip sub calls are still in there (I can't find a way to eliminate them entirely with Devel::Declare) but with an empty argument list and nothing to do, they run much faster.

Subject:

DD.pm

    package PerlX::ArraySkip::DD;

    use PerlX::ArraySkip 0.001 ();
    use Devel::Declare 0.006007 ();
    use Devel::Declare::Context::Simple 0 ();
    use B::Hooks::EndOfScope 0.09;
    use Sub::Install 0.925 qw( install_sub );
    use namespace::clean 0.19;

    sub import
    {
        my ($class, @funcs) = @_;
        my $caller = caller;
        @_ = ($class, into => $caller, as => \@funcs);
        goto \&install;
    }

    sub install
    {
        my ($class, %args) = @_;
        my $target = $args{into};
        my $func   = $args{as} || 'arrayskip';
        $func = [$func] unless ref $func;

        # Register a compile-time handler for each exported name.
        Devel::Declare->setup_for($target => {
            map {
                my $name = $_;
                ($name => {
                    const => sub {
                        my $ctx = Devel::Declare::Context::Simple->new;
                        $ctx->init(@_);
                        return $class->_transform($name, $ctx);
                    },
                })
            } @$func
        });

        # The real sub is still installed for calls that are not rewritten.
        for my $name (@$func)
        {
            install_sub {
                into => $target,
                as   => $name,
                code => \&PerlX::ArraySkip::arrayskip,
            };
        }

        on_scope_end {
            namespace::clean->clean_subroutines($target, @$func);
        };

        return 1;
    }

    sub _transform
    {
        my ($class, $name, $ctx) = @_;
        my $linestr = $ctx->get_linestr;
        my $start   = $ctx->offset;

        # Match the function name followed by a bareword, single-quoted or
        # double-quoted label, then a fat comma or comma; overwrite the match
        # with commas of the same length so source offsets stay unchanged.
        if (substr($linestr, $ctx->offset) =~ m{^(
            \s* $name \s+
            (?:
                (?:[^0-9\W]\w+)
                | (?:'[^']+')
                | (?:"[^"]+")
            )
            \s*
            (?: (?:=>) | \, )
            \s*
        )}x)
        {
            my $l = length $1;
            substr($linestr, $start, $l) = ',' x $l;
        }

        $ctx->set_linestr($linestr);
        return 1;
    }

    1;
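The rewrite described in this message can be sanity-checked without Devel::Declare. The sketch below assumes skip simply drops its first argument and returns the rest; the skip and give defined here are hypothetical stand-ins, not the module's code. It only shows that the original and rewritten call forms hand the same arguments to give.

```perl
use strict;
use warnings;

# Hypothetical stand-ins (assumed behaviour, not the module's code).
sub skip { shift; return @_ }
sub give { return join " ", @_ }

# Original form: each skip receives its label plus everything to its right.
my $original = give(skip foo => 1, skip bar => 2);

# Rewritten form: each skip is called with no arguments and returns an
# empty list, so only the values remain in give's argument list.
my $rewritten = give(skip(), 1, skip(), 2);

print "$original\n";    # 1 2
print "$rewritten\n";   # 1 2
```

In the rewritten form no arguments are passed into or returned from skip, which is presumably where the reported speed-up comes from.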

Thu Sep 06 11:38:15 2012 sharyanto [...] cpan.org - Correspondence added

Nice. Performance issue aside, I'd still suggest something like array_values() for the choice of syntax.

Tue May 14 04:45:45 2013 perl [...] toby.ink - Taken

Tue May 14 05:47:31 2013 perl [...] toby.ink - Correspondence added

I've just released PerlX::ArraySkip::XS which should improve performance significantly. For something like array_values(), something in the List:: namespace would be more appropriate. (e.g. List::MoreUtils.)
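As one example of what a List::-namespace equivalent can look like: List::Util, from version 1.29 onwards, provides pairkeys and pairvalues, which behave like the array_keys and array_values proposed at the top of this ticket. The sketch below assumes a List::Util of at least that version is installed; give here returns the joined string instead of printing it, purely for illustration.

```perl
use strict;
use warnings;
use List::Util 1.29 qw(pairkeys pairvalues);

# Illustrative stand-in for the ticket's give().
sub give { return join " ", @_ }

my $keys   = give(pairkeys   foo => 1, bar => 2, baz => 3);
my $values = give(pairvalues foo => 1, bar => 2, baz => 3);

print "$keys\n";     # foo bar baz
print "$values\n";   # 1 2 3
```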

Tue May 14 05:47:32 2013 perl [...] toby.ink - Status changed from 'open' to 'rejected'