Bug #56997 for Expect: more interact()/interconnect() bugs (multi-char escape, memory leaks)

Wed Apr 28 01:16:33 2010 alex.dupuy [...] mac.com - Ticket created

Subject:	more interact()/interconnect() bugs (multi-char escape, memory leaks)
Date:	Wed, 28 Apr 2010 01:16:13 -0400
To:	bug-Expect [...] rt.cpan.org
From:	Alexander Dupuy <alex.dupuy [...] mac.com>

The Expect.pm documentation says: Show quoted text

> It is B<_highly recommended_> that the escape sequence be a single > character since the > likelihood is great that the sequence will be broken into to separate > reads > from the $object's handle, making it impossible to strip $sequence from > getting printed to $object's listen group.

That is certainly true - but even if you don't care about stripping the escape sequence from output to the listen group, the current Expect 1.21 implementation will almost never recognize an escape sequence broken into separate reads, since the escape character buffer that is used to recognize the sequence contains data read from all available input objects interleaved together. It's easy to fix this by having a separate escape character buffer for each read handle, and the attached (for protection against line wraps) patch modifies Expect.pm to do just that. Furthermore, the current Expect 1.21 implementation of interact/interconnect will (in the absence of a non-zero max_accum() setting, which isn't possible for STDIN on interact, only with interconnect) grow the escape character buffers to hold all possible input as long as no escape sequence is seen. This should be documented, and to allow the max_accum() setting to be used without worrying that some input will be skipped due to trimming, the escape character buffer trimming should be moved to be *after* the foreach $escape_sequence loop, so that all input data will be scanned. The patch modifies Expect.pm to do this, and modifies interact() so that it will not leak memory reading the input filehandle unless a non-"EOF" escape sequence is provided. (The interact() documentation is modified to warn of the possibility that it may leak memory if a non-'EOF' escape sequence is provided, or while reading from the existing expect object if it has no max_accum). Although it is tempting to use the maximum length of all (non-"EOF") escape strings as an accumulation maximum, that could be incorrect if an escape string is actually a regex pattern, so this patch does not attempt to guess an accumulation maximum. Note that the attached patch file combines the above fixes with my previously submitted patch for Bug ID #56990, as well as a one line fix for Bug ID #36970 (at line 1560/1571) and a correction for a debugging statement at line 314. -- mailto:alex.dupuy@mac.com

diff --git a/Expect/Expect.pm b/Expect/Expect.pm index 1d0d201..767ff85 100644 --- a/Expect/Expect.pm +++ b/Expect/Expect.pm @@ -311,7 +311,7 @@ sub set_seq { print STDERR "Escape seq. '" . $escape_sequence; print STDERR "' function for ${*$self}{exp_Pty_Handle} set to '"; print STDERR ${${*$self}{exp_Function}}{$escape_sequence}; - print STDERR "(" . join(',', @_) . ")'\r\n"; + print STDERR "(" . join(',', @{${${*$self}{exp_Parameters}}{$escape_sequence}}) . ")'\r\n" if defined(${${*$self}{exp_Parameters}}{$escape_sequence}); } } @@ -1058,7 +1058,16 @@ sub interact { $self->set_group($in_object); } $in_object->set_group($self); - $in_object->set_seq($escape_sequence,undef) if defined($escape_sequence); + if (defined($escape_sequence) && $escape_sequence ne "EOF") + { + # note, this will leak memory as long as the escape sequence is not seen + $in_object->set_seq($escape_sequence,undef) + } + else + { + # prevent memory leakage if no escape sequence given + $in_object->max_accum(1); + } # interconnect normally sets stty -echo raw. Interact really sort # of implies we don't do that by default. If anyone wanted to they could # set it before calling interact, of use interconnect directly. @@ -1092,7 +1101,7 @@ sub interconnect { # my ($handle)=(shift); call as Expect::interconnect($spawn1,$spawn2,...) my ($rmask,$nfound,$nread); my ($rout, @bits, $emask, $eout, @ebits ) = (); - my ($escape_sequence,$escape_character_buffer); + my ($escape_sequence,%escape_character_buffer); my (@handles) = @_; my ($handle,$read_handle,$write_handle); my ($read_mask,$temp_mask) = ('',''); @@ -1177,13 +1186,13 @@ sub interconnect { print STDERR "interconnect: read $nread byte(s) from ${*$read_handle}{exp_Pty_Handle}.\r\n" if ${*$read_handle}{"exp_Debug"} > 1; # Test for escape seq. before printing. # Appease perl -w - $escape_character_buffer = '' unless defined ($escape_character_buffer); - $escape_character_buffer .= ${*$read_handle}{exp_Pty_Buffer}; + $escape_character_buffer{$read_handle} = '' unless defined ($escape_character_buffer{$read_handle}); + $escape_character_buffer{$read_handle} .= ${*$read_handle}{exp_Pty_Buffer}; foreach $escape_sequence (keys(%{${*$read_handle}{exp_Function}})) { - print STDERR "Tested escape sequence $escape_sequence from ${*$read_handle}{exp_Pty_Handle}"if ${*$read_handle}{"exp_Debug"} > 1; - # Make sure it doesn't grow out of bounds. - $escape_character_buffer = $read_handle->_trim_length($escape_character_buffer,${*$read_handle}{"exp_Max_Accum"}) if (${*$read_handle}{"exp_Max_Accum"}); - if ($escape_character_buffer =~ /($escape_sequence)/) { + next if $escape_sequence eq "EOF"; + print STDERR "Tested escape sequence $escape_sequence from ${*$read_handle}{exp_Pty_Handle}" if ${*$read_handle}{"exp_Debug"} > 1; + print STDERR " against $escape_character_buffer{$read_handle}" if ${*$read_handle}{"exp_Debug"} > 2; + if ($escape_character_buffer{$read_handle} =~ /($escape_sequence)/) { if (${*$read_handle}{"exp_Debug"}) { print STDERR "\r\ninterconnect got escape sequence from ${*$read_handle}{exp_Pty_Handle}.\r\n"; # I'm going to make the esc. seq. pretty because it will @@ -1204,12 +1213,14 @@ sub interconnect { # Clear the buffer so no more matches can be made and it will # only be printed one time. ${*$read_handle}{exp_Pty_Buffer} = ''; - $escape_character_buffer = ''; + $escape_character_buffer{$read_handle} = ''; # Do the function here. Must return non-zero to continue. # More cool syntax. Maybe I should turn these in to objects. last CONNECT_LOOP unless &{${${*$read_handle}{exp_Function}}{$escape_sequence}}(@{${${*$read_handle}{exp_Parameters}}{$escape_sequence}}); } } + # Make sure it doesn't grow out of bounds. + $escape_character_buffer{$read_handle} = $read_handle->_trim_length($escape_character_buffer{$read_handle},${*$read_handle}{"exp_Max_Accum"}) if (${*$read_handle}{"exp_Max_Accum"}); $nread = 0 unless defined($nread); # Appease perl -w? waitpid(${*$read_handle}{exp_Pid}, WNOHANG) if (defined (${*$read_handle}{exp_Pid}) &&${*$read_handle}{exp_Pid}); if ($nread == 0) { @@ -1557,7 +1568,7 @@ sub _trim_length { # purposes) AND debug >= 3, don't trim. return($string) if (defined ($self) and ${*$self}{"exp_Debug"} >= 3 and (!(defined($length)))); - my($indicate_truncation) = '...' unless $length; + my($indicate_truncation) = $length ? '' : '...'; $length = 1021 unless $length; return($string) unless $length < length($string); # We wouldn't want the accumulator to begin with '...' if max_accum is passed diff --git a/Expect/Expect.pod b/Expect/Expect.pod index 9384d74..ea0ccc4 100644 --- a/Expect/Expect.pod +++ b/Expect/Expect.pod @@ -496,6 +496,13 @@ group_ will be set to 'raw -echo' for the duration of interconnection. Setting $object->manual_stty() will stop this behavior per object. The original tty settings will be restored as interconnect exits. +Note that unless $object->max_accum() is set to a non-zero value for all +interconnected objects, the interconnection will accumulate input as long +as no escape sequence is seen on an input (until memory is exhausted). +You should set $object->max_accum() to be large enough to contain the +longest escape sequence for any input, or 1 if no escape sequence (other +than the special 'EOF' sequence) is given. + For a generic way to interconnect processes, take a look at L<IPC::Run>. @@ -521,6 +528,13 @@ is read from B<FILEHANDLE>, not $object. $object's listen group will consist solely of \*FILEHANDLE for the duration of the interaction. \*FILEHANDLE will not be echoed on STDOUT. +Note that unless $object->max_accum() is set to a non-zero value, the +interaction will accumulate input from it without bound (until memory +is exhausted). If a non-"EOF" escape sequence is specified, the +interaction will accumulate input from \*FILEHANDLE without bound as +long as the escape sequence is not seen. If that is an issue, +explicitly create the input object and set its max_accum, and use +interconnect() instead of interact(), =item $object->log_group(0 | 1 | undef) @@ -591,7 +605,8 @@ array defined by the caller, passed by reference to set_seq(). Your function should return a non-zero value if execution of interconnect() is to resume after the function returns, zero or undefined if interconnect() should return after your function returns. -The special sequence 'EOF' matches the end of file being reached by $object. +The special sequence 'EOF' matches the end of file being reached by $object; +if you need to match that as a character string, use '[E]OF' instead.. See interconnect() for details.

Fri May 21 06:05:34 2010 RGiersig [...] cpan.org - Correspondence added

The whole interconnect/interact part is just one big hack, it would be best to throw it out (no, I wouldn't due to backward-compatibility reasons) and rewrite it cleanly from scratch with a cleaner interface. This has been on my agenda for a long time but I still don't have the time for it. If you want to take it on, I'll be happy to help reviewing the design and clarifying any issues. Just be sure to use new names in the namespace (interconnect2()/interact2() or any other cute name you can come up with) and not step on anything existing. Roland -- RGiersig@cpan.org

Fri May 21 06:05:37 2010 The RT System itself - Status changed from 'new' to 'open'

Fri May 21 06:05:44 2010 RGiersig [...] cpan.org - Taken