Hi Jeffrey,
I would also like to help with the memory consumption. Are the workers
consuming a lot of memory, just the manager process, or both? Are you
calling the spawn method before creating the array?
The upcoming MCE 1.505 will allow a code reference to be specified for
input_data. The lazy array may be declared inside the iterator closure
below (I was not sure whether @a is needed after running MCE).
The code snippet below behaves like the other input_data types, with
full support for chunk_size => 1 or greater, MCE->abort, MCE->next and
MCE->last.

use Tie::Array::Lazy;
use MCE;

tie my @a, 'Tie::Array::Lazy', [], sub {
    $_[0]->index;
};

sub _iterator {
    my $j = 0; my $max = 1000;
    return sub {
        my $i = $j; $j += MCE->chunk_size;
        return if $i > $max;
        return $j <= $max ? @a[$i .. $j - 1] : @a[$i .. $max];
    };
}

MCE->new(
    max_workers => 4, chunk_size => 15, input_data => _iterator(),
    user_func => sub {
        my ($self, $chunk_ref, $chunk_id) = @_;
        ## $_ = $chunk_ref->[0] when chunk_size => 1, otherwise $_ = $chunk_ref
        ## MCE->print($_, "\n");
        MCE->print("$chunk_id: ", join(' ', @{ $chunk_ref }), "\n");
    }
)->run;

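To see the chunk boundaries the closure above produces, here is a rough sketch that runs without MCE. It is only an illustration under assumptions: a plain array stands in for the tied lazy array, the chunk size is passed in as an argument instead of coming from MCE->chunk_size, and make_iterator is a made-up name.

```perl
use strict;
use warnings;

# Assumption: a plain list 0 .. 1000 stands in for the Tie::Array::Lazy array.
my @a = (0 .. 1000);

# Same slicing logic as _iterator above, with $chunk_size passed in
# (hypothetical helper, not part of MCE).
sub make_iterator {
    my ($max, $chunk_size) = @_;
    my $j = 0;
    return sub {
        my $i = $j; $j += $chunk_size;
        return if $i > $max;                      # exhausted: empty list
        return $j <= $max ? @a[$i .. $j - 1]      # full chunk
                          : @a[$i .. $max];       # final, possibly short chunk
    };
}

my $iter = make_iterator(1000, 15);
my ($chunks, $items, @last) = (0, 0);

# Drain the iterator the way a consumer would: stop on the empty list.
while (my @chunk = $iter->()) {
    $chunks++;
    $items += @chunk;
    @last = @chunk;
}

print "$chunks chunks, $items items, last chunk: @last\n";
```

With indices 0 through 1000 and a chunk size of 15, the final chunk is short because 1001 is not a multiple of 15; the closure returns an empty list on the call after that, which is what ends the loop.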
I will commit the update to MCE into SVN in the next couple of days.
In the meantime, the following works with MCE 1.504 (chunk_size
defaults to 1). The while loop tests length because "defined" does not
work there; this will be fixed in 1.505, where the do method is updated
to handle both "" and undef properly.

use Tie::Array::Lazy;
use MCE;

tie my @a, 'Tie::Array::Lazy', [], sub {
    $_[0]->index
};

{
    my $max = 1000; my $j = 0;
    sub _iterator {
        return if $j > $max;
        return $a[$j++];
    }
}

MCE->new(
    max_workers => 4,
    user_func => sub {
        my ($self) = @_;
        while (length (my $next = MCE->do('_iterator'))) {
            MCE->print($next . "\n");
        }
    }
)->run;

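The length test in the loop above can be tried in isolation. This is a minimal sketch, not MCE code: next_item is a hypothetical stand-in for MCE->do('_iterator'), and it assumes that an exhausted iterator reaches the worker as an empty string, as described for 1.504.

```perl
use strict;
use warnings;

# Hypothetical stand-in for MCE->do('_iterator'): yields 0, 1, 2 and then
# an empty string (assumed to be what a worker sees once the data runs out).
my @queue = (0, 1, 2);
sub next_item { return @queue ? shift @queue : '' }

my @seen;

# length() is true for "0" (one character) and false for "", so the loop
# keeps the item 0 and still stops at the end of the data.
while (length(my $next = next_item())) {
    push @seen, $next;
}

print "seen: @seen\n";
```

A plain truth test would stop at the first item, since 0 is false, and a defined test would never stop, since the empty string is defined. One caveat: an item that is itself an empty string would also end the loop.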
Perhaps you want to chunk as well. The following works with MCE 1.504
until 1.505 is released.

use Tie::Array::Lazy;
use MCE;

tie my @a, 'Tie::Array::Lazy', [], sub {
    $_[0]->index
};

{
    my $max = 1000; my $chunk_size = 15; my $j = 0;
    sub _iterator {
        my $i = $j; $j += $chunk_size;
        return if $i > $max;
        return ($j <= $max ? @a[$i .. $j - 1] : @a[$i .. $max]);
    }
}

MCE->new(
    max_workers => 4,
    user_func => sub {
        my ($self) = @_;
        while (my @next = MCE->do('_iterator')) {
            MCE->print(join(' ', @next), "\n");
        }
    }
)->run;

Perhaps chunk_id is needed too.

use Tie::Array::Lazy;
use MCE;

tie my @a, 'Tie::Array::Lazy', [], sub {
    $_[0]->index
};

{
    my $max = 1000; my $chunk_size = 15; my $chunk_id = 0; my $j = 0;
    sub _iterator {
        my $i = $j; $j += $chunk_size;
        return if $i > $max;
        return (++$chunk_id, $j <= $max ? @a[$i .. $j - 1] : @a[$i .. $max]);
    }
}

MCE->new(
    max_workers => 4,
    user_func => sub {
        my ($self) = @_;
        while (my ($chunk_id, @next) = MCE->do('_iterator')) {
            MCE->print("$chunk_id: ", join(' ', @next), "\n");
        }
    }
)->run;

Again, the MCE 1.505 release is coming soon.
Regards,
Mario
On Fri, Jan 10, 2014 at 3:48 PM, Mario Roy via RT <bug-MCE@rt.cpan.org> wrote:
> Queue: MCE
> Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=91778 >
>
> Looking into this now. If feasible, will be included for the upcoming MCE
> 1.505 release.
>
> Regards,
> Mario
>
>
> On Wed, Jan 1, 2014 at 5:03 AM, Mario Roy via RT <bug-MCE@rt.cpan.org> wrote:
>
>
> > http://stackoverflow.com/questions/109880/is-there-a-perl-solution-for-lazy-lists-this-side-of-perl-6
> >
> > Perhaps, will enhance input_data or add a new input_iterator option. In
> > the meantime, MCE->do(...) can be used.
> >
> > Not related, I'm currently working on a script to wrap MCE around the
> > grep, egrep, fgrep, agrep and tre-agrep C binaries.
> >
> > Will look into lazy-arrays and/or custom iterators afterwards.
> >
> > Happy New Year,
> >
> > -mario
> >
> >
> >
> > On Wed, Jan 1, 2014 at 4:29 AM, Mario Roy via RT <bug-MCE@rt.cpan.org> wrote:
> >
> > > Queue: MCE
> > > Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=91778 >
> > >
> > > Hi Jeffrey,
> > >
> > > Yes, that's correct. MCE needs the length when processing data via the
> > > input_data option or via the process method.
> > >
> > > However, one can create a lazy array from the manager process. Workers
> > > can call MCE->do('callback_func', ...) to retrieve values from the lazy
> > > array. In this case, input_data is not specified and MCE->run(0) is used
> > > versus MCE->process(...).
> > >
> > > The MCE->do(...) method is bi-directional.
> > >
> > > my @list = MCE->do('get_items', $optional_arg1, $optional_argN);
> > > my $next = MCE->do('get_next');
> > >
> > >
> > > The MCE->do method can be called as often as needed. The worker will
> > > need to know when to break out of a loop.
> > >
> > > Define lazy array;
> > >
> > > sub get_next {
> > > return item(s) from lazy array;
> > > }
> > >
> > > MCE->new(
> > > ...
> > > user_func => sub {
> > > my ($self) = @_;
> > > while (1) {
> > > my $next = MCE->do('get_next');
> > > last unless defined $next;
> > > ...
> > > }
> > > }
> > > );
> > >
> > > MCE->run(0);
> > >
> > >
> > > Regards,
> > > Mario
> > >
> > >
> > >
> > > On Wed, Jan 1, 2014 at 3:35 AM, Jeffrey Ryan Thalhammer via RT <bug-MCE@rt.cpan.org> wrote:
> > >
> > > > Wed Jan 01 03:35:34 2014: Request 91778 was acted upon.
> > > > Transaction: Ticket created by jeff@stratopan.com
> > > > Queue: MCE
> > > > Subject: Feature Request: Support for lazy arrays of input data
> > > > Broken in: (no value)
> > > > Severity: (no value)
> > > > Owner: Nobody
> > > > Requestors: jeff@stratopan.com
> > > > Status: new
> > > > Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=91778 >
> > > >
> > > >
> > > > I've been using MCE as part of Stratopan.com and it has been
> > > > wonderful. Thanks so much for your great work!
> > > >
> > > > To further optimize performance and memory consumption, I'd like to
> > > > pass MCE->process() a lazy array that is filled only when each element
> > > > is accessed (such as Tie::Array::Lazy).
> > > >
> > > > This currently won't work because MCE wants to know the total length
> > > > of the input data array, but the length would be unknown for a lazy
> > > > array.
> > > >
> > > > Or perhaps I could tie the data to a filehandle instead?
> > > >
> > > > Happy New Year!