
This queue is for tickets about the HTML-Parser CPAN distribution.

Report information
The Basics
Id: 18595
Status: rejected
Priority: 0/
Queue: HTML-Parser

People
Owner: Nobody in particular
Requestors: MSISK [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 3.51
Fixed in: (no value)



Subject: parse of single text word fails
When parsing a single text word containing no tokens, HTML::Parser fails to pass that word to the text handler.

This fails: $p->parse('singleword')
This works: $p->parse('singleword ')
This works: $p->parse('<p>singleword')

Granted, this is a degenerate case, but it's causing big problems for me while trying to strip HTML from strings, because one of the source strings happens to be a single plain text word. When it fails, there is only a single start_document event with no content. The text word is never seen.

Thanks!
Matt
> This fails: $p->parse('singleword')
HTML::Parser keeps any incomplete word buffered until you either call parse with the next chunk that completes the word, or you call $p->eof to signal that this is really the end of the document. This is the expected behaviour.
> This works: $p->parse('<p>singleword')
Here the "<p>" will be reported, but "singleword" is still buffered.
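For readers hitting the same problem, the behaviour described above can be sketched as follows. This is a minimal illustration, not code from the ticket: the @text array, the $plain variable, and the choice of the 'dtext' argspec are assumptions made for the example. It registers a text handler, parses the single word, and then calls $p->eof so that any buffered text is flushed to the handler.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical accumulator for the text chunks the parser reports.
my @text;

my $p = HTML::Parser->new(
    api_version => 3,
    # 'dtext' passes the text to the handler with entities decoded.
    text_h      => [ sub { push @text, shift }, 'dtext' ],
);

# The trailing word may still be held in the parser's buffer after this
# call, since more input could follow and extend it.
$p->parse('singleword');

# Signal the end of the document; buffered text is reported now.
$p->eof;

# Text can arrive in more than one chunk, so join before using it.
my $plain = join '', @text;
print "$plain\n";
```

When stripping HTML from a list of independent strings, the same pattern applies per string: call $p->eof after each one before reading the collected text.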
Subject: Re: [rt.cpan.org #18595] parse of single text word fails
Date: Sat, 08 Apr 2006 03:31:46 -0500
To: bug-HTML-Parser [...] rt.cpan.org
From: Matt Sisk <sisk [...] mojotoad.com>
Gisle_Aas via RT wrote:
> HTML::Parser keeps any incomplete word until you either call parse with
> the next chunk that completes the word or you call $p->eof to signal
> that this is really the end of the document. This is the expected
> behaviour.
Thanks for the response -- clearly I missed the $p->eof part. In terms of strings, rather than 'words', this makes sense. Cheers, Matt