
This queue is for tickets about the HTML-Tree CPAN distribution.

Report information
The Basics
Id: 19724
Status: rejected
Priority: 0/
Queue: HTML-Tree

People
Owner: Nobody in particular
Requestors: ddascalescu+cpan [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 3.20
Fixed in: (no value)



Subject: Can't distinguish among ending tags
Consider this HTML code:

<p>Line 1.<br>Line 2.<br /></p>

My application needs to output the HTML with an RTF layer, preserving as much as possible of the HTML layout and spacing. The problem I'm running into is that HTML::Element doesn't let me detect that the <br> tag has no closing tag, that the <br /> element is XML-properly empty (bonus points for being able to detect the space before the ending slash), and that <p> does have a closing tag. Is there a way to distinguish between these three cases?
On Tue Jun 06 02:49:23 2006, guest wrote:
> The problem I'm running
> into is that HTML::Element doesn't allow me to detect that the <br> tag
> has no closing tag, the <br /> element is XML-properly empty (bonus
> points for being able to detect the space before the ending slash), and
> <p> does have a closing tag. Is there a way to distinguish between these
> three cases?
Distinguishing <br /> from <br> is possible: the former will have an attribute named '/', whereas the latter will not. However, this only works if the element is self-contained. As for detecting whether <p> has a closing tag, I'm evaluating how feasible that is, but I'm focusing on bug fixes first.
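For illustration, the '/' attribute check described above could look roughly like the sketch below. It uses the sample markup from this ticket; the variable name @br_has_slash is made up for the example, and the behaviour is the one described in this reply, so it should be verified against the HTML-Tree version actually installed.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Parse the sample fragment from this ticket.
my $tree = HTML::TreeBuilder->new_from_content(
    '<p>Line 1.<br>Line 2.<br /></p>'
);

# One flag per <br> element, in document order:
# 1 if the element carries a '/' attribute, 0 otherwise.
# (@br_has_slash is a hypothetical name used only for this sketch.)
my @br_has_slash;
for my $br ( $tree->look_down( _tag => 'br' ) ) {
    push @br_has_slash, defined $br->attr('/') ? 1 : 0;
}

print join( ',', @br_has_slash ), "\n";

# Free the tree explicitly, as older HTML-Tree releases require.
$tree->delete;
```

This only tells the two <br> spellings apart. It says nothing about the whitespace before the slash or about whether </p> was present in the source.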
From: grandpa [...] cpan.org
On Tue Jun 06 02:49:23 2006, guest wrote:
> Consider this HTML code:
>
> <p>Line 1.<br>Line 2.<br /></p>
>
> ... The problem I'm running
> into is that HTML::Element doesn't allow me to detect that the <br> tag
> has no closing tag ...
<br> never has a close tag - that is, <br>...</br> is not legal HTML or XHTML. <br /> is not a close tag, it is an empty br element. Note that br elements are always empty!
From: ddascalescu+perl [...] gmail.com
On Fri Feb 22 18:59:31 2008, GRANDPA wrote:
> On Tue Jun 06 02:49:23 2006, guest wrote:
> > Consider this HTML code:
> >
> > <p>Line 1.<br>Line 2.<br /></p>
> >
> > ... The problem I'm running
> > into is that HTML::Element doesn't allow me to detect that the
> > <br> tag has no closing tag ...
>
> <br> never has a close tag - that is, <br>...</br> is not legal HTML or
> XHTML. <br /> is not a close tag, it is an empty br element. Note that
> br elements are always empty!
What I meant is that I'm trying to tell <br>, <br/> and <br /> apart, because my application must preserve the input HTML as faithfully as possible. It has been suggested above that <br/> will have an attribute of '/', while <br> won't. How can I also get the amount of whitespace in <br />?
On Sun Feb 24 11:44:56 2008, dandv wrote:
> On Fri Feb 22 18:59:31 2008, GRANDPA wrote:
> > On Tue Jun 06 02:49:23 2006, guest wrote:
> > > Consider this HTML code:
> > >
> > > <p>Line 1.<br>Line 2.<br /></p>
> > >
> > > ... The problem I'm running
> > > into is that HTML::Element doesn't allow me to detect that the
> > > <br> tag has no closing tag ...
> >
> > <br> never has a close tag - that is, <br>...</br> is not legal HTML or
> > XHTML. <br /> is not a close tag, it is an empty br element. Note that
> > br elements are always empty!
>
> What I meant is that I'm trying to tell between <br>, <br/> and <br />
> because my application must preserve the input HTML as faithfully as
> possible. It has been suggested above that <br/> will have an attribute
> of '/', while <br> won't. How can I also get the amount of whitespace in
> <br /> ?
What you want can't be done using HTML::Parser, which is what HTML::TreeBuilder uses for parsing HTML. You would need to patch HTML::Parser to be able to get the white space information you want.

Cheers,
Jeff.
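For readers who can work at the token level instead of on the tree, one possible workaround is sketched below. It assumes HTML::Parser is driven directly (not through HTML::TreeBuilder) with its version 3 handler API, where the 'text' argspec passes each start tag's original source text to the handler. The array name @start_tags is made up for the example. This does not change the answer for HTML::TreeBuilder itself, whose elements do not keep the source text.

```perl
use strict;
use warnings;
use HTML::Parser;

# Collect the raw source text of every start tag.
# (@start_tags is a hypothetical name used only for this sketch.)
my @start_tags;

my $parser = HTML::Parser->new(
    api_version => 3,
    # 'text' asks HTML::Parser to pass the tag exactly as written.
    start_h => [ sub { push @start_tags, $_[0] }, 'text' ],
);

$parser->parse('<p>Line 1.<br>Line 2.<br /></p>');
$parser->eof;

print "$_\n" for @start_tags;
```

An end_h handler registered the same way would report whether a </p> was actually present in the input.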