Subject: no state reset at eof unless strict_comment is set
This is all under FreeBSD 4.8 on x86.
Consider this test script:
-----------------------8<----------------------------------8<------
#!perl -w
{
    package Foo;
    use strict;
    use base qw(HTML::Parser);
    use vars qw($AUTOLOAD);
    sub AUTOLOAD {
        my ($self, @args) = @_;
        return if $AUTOLOAD eq 'Foo::DESTROY';
        print join ',', $AUTOLOAD, @args;
        print "\n";
    }
}
my $foo = Foo->new(
    api_version      => 3,
    start_document_h => ['input_start_document', 'self'],
    start_h          => ['input_start', 'self,text'],
    end_h            => ['input_end', 'self,text'],
    text_h           => ['input_text', 'self,text'],
    declaration_h    => ['input_declaration', 'self,text'],
    comment_h        => ['input_comment', 'self,text'],
    process_h        => ['input_process', 'self,text'],
    end_document_h   => ['input_end_document', 'self'],
);
print "==== HTML::Parser $HTML::Parser::VERSION\n";
$foo->parse('<');
$foo->eof;
print "==\n";
$foo->parse('>');
$foo->eof;
print "====\n";
-----------------------8<----------------------------------8<-----
Under perl 5.005_03 and HTML::Parser 3.28, it works as expected:
==== HTML::Parser 3.28
Foo::input_start_document
Foo::input_text,<
Foo::input_end_document
==
Foo::input_start_document
Foo::input_text,>
Foo::input_end_document
====
But with HTML::Parser 3.31 (under both perl 5.8.0 and perl 5.8.1) it produces this output:
==== HTML::Parser 3.31
Foo::input_start_document
Foo::input_comment,<
Foo::input_end_document
==
Foo::input_comment,<>
Foo::input_end_document
====
Adding
strict_comment => 1,
to the constructor eliminates the problem.
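For anyone who wants to check this on their own installation, here is a minimal sketch that applies the strict_comment workaround and then tests whether each document begins with a start_document event. The variable names (@events, @first) are mine, and it uses array-ref handlers with the 'event' argspec rather than the AUTOLOAD trick above. I have not listed an expected output, since that depends on the HTML::Parser version installed.

```perl
#!perl -w
use strict;
use HTML::Parser;

# Every handler pushes its arguments onto @events as an array ref.
my @events;
my $p = HTML::Parser->new(
    api_version      => 3,
    strict_comment   => 1,    # the workaround described in this report
    start_document_h => [\@events, 'event'],
    text_h           => [\@events, 'event,text'],
    comment_h        => [\@events, 'event,text'],
    end_document_h   => [\@events, 'event'],
);

# Feed each chunk as its own document and note the first event seen.
my @first;
for my $chunk ('<', '>') {
    @events = ();
    $p->parse($chunk);
    $p->eof;
    push @first, (@events ? $events[0][0] : '(none)');
}

# If state is reset at eof, both documents begin with start_document.
for my $i (0 .. $#first) {
    print(($first[$i] eq 'start_document' ? 'ok' : 'not ok'),
          " - document ", $i + 1, " starts with $first[$i]\n");
}
```

Removing the strict_comment line should make it easy to compare the two behaviours on an affected version.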
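It looks to me like the code that treats an unclosed tag at eof as a comment is interfering with the state reset that's supposed to happen at eof: in the 3.31 output the second document never gets a start_document event, and the '<' left over from the first document is prepended to its text. Removing the end_document hook doesn't fix the problem.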