Somehow, this reply that I sent to comment-HTML-Parser@rt.cpan.org never
showed up in the bug (is it supposed to?), so I'm putting it in by hand now.
----------------------
On Thu, Jan 13, 2005 at 10:07:17PM -0600, Kenneth Pronovici wrote:
> > Full context and any attached attachments can be found at:
> > <URL: http://rt.cpan.org/NoAuth/Bug.html?id=9676 >
> >
> > This was behaviour that I picked up by reading the KHTML sources
> > to figure out how they did it. Before I change this again I would
> > like to do some research to figure out what MSIE and FireFox do in
> > this situation. Is multiline quoted strings actually allowed
> > in JavaScript?
>
> I figured you had a good reason for the change. I'll do some research
> and see if I can figure out whether these strings are actually allowed.
Ok, I've done some digging in Google and Google Groups.
I've found two pages that directly discuss string syntax within <script>
tags. Both say that a line break is not allowed within a string
literal.
http://academ.hvcc.edu/~kantopet/javascript/index.php?page=js+syntax&parent=core+javascript
"You also cannot have line-breaks inside text strings literals. If you
need to run a text string across multiple lines, you should break the
string into multiple tokens and use a concatenation operator to string
it together."
http://www.netmechanic.com/news/vol4/javascript_no23.htm
"...JavaScript interprets the line breaks to mean that you're trying to
close the string improperly."
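The concatenation workaround that the first page describes might look something like the sketch below. The variable names are made up for illustration and are not taken from either page.

```javascript
// Hypothetical example: build one long string from several short literals,
// so that no single literal contains a raw line break.
var message = "Split " +
              "string.";

// To put an actual newline into the string's value, use the \n escape
// sequence rather than breaking the line inside the quotes.
var twoLines = "Split\nstring.";

console.log(message);
console.log(twoLines);
```

The line break in the first statement falls between two literals, not inside one, which is why it is accepted.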
Besides these, there are a lot of conversations on comp.lang.javascript
helping newbies debug exactly this problem (often errors about an
unterminated string literal).
I also found Douglas Crockford's online JavaScript validator:
http://www.crockford.com/javascript/jslint.html
Even with the "strict line ending" option unchecked, the validator does
not allow line breaks within string literals.
I tend to think that your implementation is correct, and I don't think
you'd want to support obviously invalid syntax unless MSIE and/or
Firefox do (and maybe not even then).
I have pretty much zero experience with JavaScript. However, I worked
up this minimal sample page:
<html>
<head><title>Test Javascript Page</title></head>
<body>
<script language="javascript">
document.write("Short string<br>");
document.write("Longer...........................string.<br>");
/*document.write("Split
string.");*/
</script>
</body>
</html>
The first two document.write() lines should be valid. The third, shown
commented out above, is the questionable string literal containing a line
break. So far I've tested this in recent versions of Mozilla, Firefox,
Epiphany, Kazehakase and Konqueror on my Debian box. All of them render
the page properly with the split string commented out, and render nothing
at all with the split string left in, presumably because the syntax error
keeps the whole script block from running. I can't test MSIE on this box,
unfortunately.
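For anyone who wants to repeat the check without a browser, here is a rough sketch that asks a JavaScript engine whether a piece of source text parses. The parses() helper is my own invention for illustration; it is not part of HTML::Parser or of any browser API, and it assumes an engine that permits new Function (Node.js, for example).

```javascript
// Hypothetical helper: returns true if the given source text parses as a
// JavaScript function body, false if the engine reports a syntax error.
function parses(source) {
  try {
    new Function(source); // compiles the body without running it
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) {
      return false;
    }
    throw e; // anything else is unexpected, so pass it along
  }
}

// The \n inside these single-quoted literals becomes a real line break
// in the source text that is handed to the parser.
var oneLine = 'var s = "Short string<br>";';
var split = 'var s = "Split\nstring.";';

console.log(parses(oneLine)); // true
console.log(parses(split));   // false
```

This only shows what one engine does with a raw line break inside a double-quoted literal; it says nothing about MSIE.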
Anyway, unless MSIE surprises me, I don't think you really need to
change HTML::Parser.
KEN
--
Kenneth J. Pronovici <pronovic@debian.org>