Subject: as_XML_indented omits contents of script tag
Date: Fri, 8 Nov 2013 22:27:40 +0100
To: bug-HTML-TreeBuilder-XPath [...] rt.cpan.org
From: Zsbán Ambrus <ambrus [...] math.bme.hu>
Hi Mirod.
It appears that the as_XML_indented method of HTML::TreeBuilder::XPath
can omit the text contents of a script tag.
This happens when you parse an HTML document that contains a plain script
tag with inline JavaScript as its text content (not wrapped in a comment).
Below is a Perl command that reproduces the problem for me, followed by
its output and the version numbers of the modules I used.
Ambrus
$ perl -we 'use warnings; use 5.016; use HTML::TreeBuilder::XPath;
  my $tb = HTML::TreeBuilder::XPath->new;
  $tb->parse(qq(<script>\nfunction cold() {}\n</script><p>hot)); $tb->eof;
  say $tb->as_XML; say $tb->as_XML_compact; say $tb->as_XML_indented;
  for (qw(HTML::TreeBuilder::XPath HTML::TreeBuilder XML::XPathEngine
          HTML::Entities HTML::Tagset)) { say "$_ ",$_->VERSION; }'
<html><head><script>
function cold() {}
</script></head><body><p>hot</p></body></html>
<html><head><script>
function cold() {}
</script></head><body><p>hot</p></body></html>
<html>
<head>
<script></script>
</head>
<body>
<p>hot</p>
</body>
</html>
HTML::TreeBuilder::XPath 0.14
HTML::TreeBuilder 5.02
XML::XPathEngine 0.13
HTML::Entities 3.69
HTML::Tagset 3.20
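The same check can also be written as a small standalone script, which may be easier to rerun than the one-liner. This is only a sketch: the helper name check_serializers is made up for illustration and is not part of any module. It uses only the methods already called above, plus the delete method that HTML::TreeBuilder provides for freeing a tree. It parses the same input and reports, for each serializer, whether the script text survives. With the versions listed above I would expect as_XML_indented to be the one reported as dropping it.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical helper (not part of any module): parses $html and returns a
# hash mapping each serializer method name to 1 if its output still contains
# $needle, or 0 if it does not.
sub check_serializers {
    my ($html, $needle) = @_;

    my $tb = HTML::TreeBuilder::XPath->new;
    $tb->parse($html);
    $tb->eof;

    my %kept;
    for my $method (qw(as_XML as_XML_compact as_XML_indented)) {
        my $xml = $tb->$method;
        $kept{$method} = index($xml, $needle) >= 0 ? 1 : 0;
    }

    $tb->delete;    # release the tree explicitly
    return %kept;
}

# Same input as in the one-liner above.
my %kept = check_serializers(
    "<script>\nfunction cold() {}\n</script><p>hot",
    'function cold()',
);

for my $method (sort keys %kept) {
    printf "%-16s %s\n", $method,
        $kept{$method} ? 'keeps the script text' : 'drops the script text';
}
```

The script only reports what each method returns. It does not try to work around or explain the omission.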
$ perl -V
Summary of my perl5 (revision 5 version 16 subversion 3) configuration:
Platform:
osname=linux, osvers=2.6.37, archname=x86_64-linux
uname='linux king 2.6.37 #6 smp sun mar 13 20:15:05 cet 2011
x86_64 gnulinux '
config_args='-der'
hint=previous, useposix=true, d_sigaction=define
useithreads=undef, usemultiplicity=undef
useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
use64bitint=define, use64bitall=define, uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='cc', ccflags ='-fno-strict-aliasing -pipe -fstack-protector
-I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O2',
cppflags='-fno-strict-aliasing -pipe -fstack-protector
-I/usr/local/include -fno-strict-aliasing -pipe -fstack-protector
-I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
-fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
ccversion='', gccversion='4.7.1', gccosandvers=''
intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'
libpth=/usr/local/lib /lib/../lib /usr/lib/../lib /lib /usr/lib
/lib64 /usr/lib64 /usr/local/lib64
libs=-lnsl -ldl -lm -lcrypt -lutil -lc
perllibs=-lnsl -ldl -lm -lcrypt -lutil -lc
libc=/lib/libc-2.11.3.so, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version='2.11.3'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'
cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib
-fstack-protector'
Characteristics of this binary (from libperl):
Compile-time options: HAS_TIMES PERLIO_LAYERS PERL_DONT_CREATE_GVSV
PERL_MALLOC_WRAP PERL_PRESERVE_IVUV USE_64_BIT_ALL
USE_64_BIT_INT USE_LARGE_FILES USE_LOCALE
USE_LOCALE_COLLATE USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC USE_PERLIO USE_PERL_ATOF
Built under linux
Compiled at May 12 2013 14:39:37
@INC:
/usr/local/perl5.16/lib/perl5/site_perl/5.16.3/x86_64-linux
/usr/local/perl5.16/lib/perl5/site_perl/5.16.3
/usr/local/perl5.16/lib/perl5/5.16.3/x86_64-linux
/usr/local/perl5.16/lib/perl5/5.16.3
/usr/local/perl5.16/lib/perl5/site_perl/5.16.1
/usr/local/perl5.16/lib/perl5/site_perl/5.16.1/x86_64-linux
/usr/local/perl5.16/lib/perl5/site_perl
.
$