
This queue is for tickets about the HTML-Tiny CPAN distribution.

Report information
The Basics
Id: 34378
Status: resolved
Worked: 1.3 hours (80 min)
Priority: 0/
Queue: HTML-Tiny

People
Owner: andy [...] hexten.net
Requestors: spamcollector_cpan [...] juerd.nl
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: (no value)
Fixed in: (no value)



Subject: Invalid HTML syntax
HTML::Tiny returns XHTML/XML syntax, which is *not* always valid HTML.

For example, "<br />" is invalid HTML. It should be "<br>".

My page using Captcha::reCAPTCHA does not validate as HTML because of this.

-- Juerd
On Sun Mar 23 18:30:23 2008, JUERD wrote:
> HTML::Tiny returns XHTML/XML syntax, which is *not* always valid HTML.
>
> For example, "<br />" is invalid HTML. It should be "<br>".
>
> My page using Captcha::reCAPTCHA does not validate as HTML because of this.
Just a suggestion to Andy: you could probably pass "HTML" or "XHTML" to ->new and get "<br>" or "<br />" accordingly. But it may not be his intent to produce "valid" HTML.
Indeed the generated HTML is broken. More details on the differences between XHTML and HTML:

http://www.w3.org/TR/xhtml1/diffs.html
http://www.w3.org/TR/xhtml1/guidelines.html

And more in depth:

http://www.cs.tut.fi/~jkorpela/html/empty.html

In particular, the problem is generating "minimized" elements, such as <br />. In HTML (which is a dialect of SGML and has little to do with XHTML or XML), the slash in this context is a null end tag, which can be used as a shorthand for closing tags. Thus, <br /> translated to the more ordinary form is <br>>, i.e. a line break followed by a greater-than sign. Naturally this breaks validators.

In HTML::Tiny, these elements are called "closed" elements. Apparently Andy Armstrong has already anticipated this, because there are two places in the source code marked with a comment indicating that a special "xml mode" flag is needed.

A patch is attached, which lets the user give a parameter to the constructor. By default, the module generates XHTML. Use the following to make it generate valid[1] HTML:

    my $h = HTML::Tiny->new(mode => 'html');

The patch is otherwise trivial, except that all n+1 tests needed to be updated as well. As a bonus, the patched version now generates correct empty attributes, i.e. <input checked="checked" /> instead of <input checked />.

[1] Valid as in "do what I intend, not what I ask for".
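For readers without the attachment, here is a minimal sketch of the kind of mode switch described above. It is not the patch and not HTML::Tiny's code: closed_tag and escape_attr are hypothetical helpers written for illustration, and only the mode values 'xml' and 'html' and the example outputs come from this ticket. Writing an empty attribute as name="name" in both modes is an assumption of the sketch (that form is accepted by both syntaxes).

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; NOT HTML::Tiny's implementation.
my %ESCAPE = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' );

sub escape_attr {
    my ($value) = @_;
    $value =~ s/([&<>"])/$ESCAPE{$1}/g;
    return $value;
}

# Render an element that has no content (a "closed" element in
# HTML::Tiny's terms). $mode is 'xml' (the default) or 'html'.
sub closed_tag {
    my ( $mode, $name, $attr ) = @_;
    $mode = 'xml' unless defined $mode;
    $attr ||= {};

    my $out = "<$name";
    for my $key ( sort keys %$attr ) {
        my $value = $attr->{$key};

        # Assumption: an undefined value means an empty attribute,
        # written as name="name" instead of a bare name.
        $value = $key unless defined $value;
        $out .= sprintf ' %s="%s"', $key, escape_attr($value);
    }

    # XML mode self-closes with " />"; HTML mode omits the slash.
    return $out . ( $mode eq 'html' ? '>' : ' />' );
}

print closed_tag( 'xml',  'br' ), "\n";
print closed_tag( 'html', 'br' ), "\n";
print closed_tag( 'xml',  'input', { checked => undef } ), "\n";
print closed_tag( 'html', 'input', { checked => undef } ), "\n";
```

The only mode-dependent part is the final return: everything else is shared, which matches the description of the change as trivial apart from the test updates.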
Download HTML-Tiny-1.01.tar.gz
application/x-gzip 16.1k

Naturally attached the wrong file... Correct file attached to this reply.
Download HTML-Tiny-1.01-htmlfix.diff.gz
application/x-gzip 4.4k

Thanks everyone. I've just applied VRK's excellent patch and released 1.03 to the CPAN.