This queue is for tickets about the WWW-Mechanize CPAN distribution.

Report information
The Basics
Id: 5362
Status: resolved
Priority: 0/
Queue: WWW-Mechanize

People
Owner: Nobody in particular
Requestors: andy [...] petdance.com
book [...] cpan.org
chifung [...] csua.berkeley.edu
Cc:
AdminCc:

Bug Information
Severity: Wishlist
Broken in: 0.72
Fixed in: (no value)



Date: Mon, 5 Jan 2004 14:34:09 -0600
From: Andy Lester <andy [...] petdance.com>
To: bug-www-mechanize [...] rt.cpan.org
Subject: Allow turning off the history
Mech's history can eat up a lot of memory if you're doing a lot of fetching. If you don't need the history, it makes sense to be able to turn it off at constructor time.

The constructor would be the ONLY place to turn the history on or off. I do NOT want to mess with toggling history state on the fly.

--
Andy Lester => andy@petdance.com => www.petdance.com => AIM:petdance
Subject: Limit the page stack depth
Wouldn't it be useful to be able to give a maximum depth for the page stack? This way, when someone writes a bot that runs for a long time and never calls back(), the script does not eat up all the memory.

Attached is a patch proposal against 0.72.

--
BooK
--- WWW-Mechanize-0.72/lib/WWW/Mechanize.pm 2004-01-13 05:36:36.000000000 +0100
+++ WWW-Mechanize/lib/WWW/Mechanize.pm 2004-02-16 13:17:30.000000000 +0100
@@ -238,6 +238,11 @@
 Don't complain on warnings. Setting C<< quiet => 1 >> is the same as
 calling C<< $agent->quiet(1) >>. Default is off.

+=item * C<< stack_depth => $value >>
+
+Sets the depth of the page stack that keeps track of all the downloaded
+pages. Default is -1 (infinite).
+
 =back

 =cut
@@ -255,6 +260,7 @@
         onwarn => \&WWW::Mechanize::_warn,
         onerror => \&WWW::Mechanize::_die,
         quiet => 0,
+        stack_depth => -1,
     );

     my %passed_parms = @_;
@@ -1134,6 +1140,21 @@
     return $self->{quiet};
 }

+=head2 $mech->stack_depth($value)
+
+Get or set the page stack depth. Older pages are discarded first.
+
+A negative value means "keep all the pages".
+
+=cut
+
+sub stack_depth {
+    my $self = shift;
+    my $old = $self->{stack_depth};
+    $self->{stack_depth} = shift if @_;
+    return $old;
+}
+
 =head1 Overridden L<LWP::UserAgent> methods

 =head2 $mech->redirect_ok()
@@ -1402,6 +1423,8 @@
         $self->{page_stack} = [];

         push( @$save_stack, $self->clone );
+        shift @$save_stack if $self->stack_depth >= 0
+            and @$save_stack > $self->stack_depth;
         $self->{page_stack} = $save_stack;
     }

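The trimming step in the last hunk of the patch can be tried in isolation. What follows is a minimal standalone sketch, not Mechanize code: push_page and the plain arrays standing in for the page stack are hypothetical names, and strings stand in for the cloned agent objects. It follows the patch's convention that a negative depth means "keep all the pages".

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the patch's logic: append the newest
# page, then drop the oldest one if the stack now exceeds the limit.
# A negative $depth means "keep everything", as in the patch.
sub push_page {
    my ( $stack, $page, $depth ) = @_;
    push @$stack, $page;
    shift @$stack if $depth >= 0 and @$stack > $depth;
    return scalar @$stack;
}

my @bounded;      # limited to the two most recent pages
my @unbounded;    # negative depth, never trimmed
for my $n ( 1 .. 5 ) {
    push_page( \@bounded,   "page$n", 2 );
    push_page( \@unbounded, "page$n", -1 );
}

print "bounded:   @bounded\n";
print "unbounded: @unbounded\n";
```

Since each call adds at most one page, a single shift is enough to stay within the limit, as long as the limit is not lowered between calls.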
Date: Mon, 5 Apr 2004 14:49:14 +0800
From: Chi-Fung Fan <chifung [...] csua.berkeley.edu>
To: bug-WWW-Mechanize [...] rt.cpan.org
Subject: memory problem
Version: 0.72
Platform: FreeBSD 4.9

Problem: WWW::Mechanize's memory usage slowly goes up (300MB in my case) when many pages are visited. The problem is that the _push_page_stack method stores a cloned LWP::UserAgent object for every page it visits. Since I never use the back() method, can WWW::Mechanize provide an option to disable the back() feature?

Workaround: Create my own module which overrides WWW::Mechanize's request() to skip this line:

$self->_push_page_stack();

Thanks,
Chi-Fung
This ticket is getting merged into existing ticket 5362.
I'm not allowing you to disable the stack entirely. You can set it to a depth of one. I hope this is OK. I just don't want to get into the mess of how to handle no stack, and I've never liked using -1 as a special value. Thanks for all your input. Change is in 1.04.
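For a long-running bot that never calls back(), this resolution amounts to passing the option at construction time. A minimal sketch, assuming WWW::Mechanize 1.04 or later is installed; no pages are fetched here.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Keep only the most recent page in the history instead of every page.
my $mech = WWW::Mechanize->new( stack_depth => 1 );

# With no argument, the accessor reports the configured depth.
print "stack depth: ", $mech->stack_depth, "\n";
```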