This queue is for tickets about the XML-LibXML CPAN distribution.

Report information
The Basics
Id: 105399
Status: resolved
Priority: 0/
Queue: XML-LibXML

People
Owner: Nobody in particular
Requestors: mivkovic [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 1.89
Fixed in: 2.0108



Subject: whitespace after <br> between <span> tags not preserved
When parsing HTML and calling textContent, whitespace is lost if it follows a <br/> that sits between two <span> tags. If there is no <br/>, or if the space comes before the <br/>, the whitespace is preserved.

#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

my $html1 = <<END1;
<div><span>two</span><br/> <span>words</span></div>
END1

my $html2 = <<END2;
<div><span>two</span> <br/> <span>words</span></div>
END2

my $html3 = "<div><span>two</span> <br/><span>words</span></div>";
my $html4 = "<div><span>two</span><br/> <span>words</span></div>";

my $parser = XML::LibXML->new();

foreach my $html ($html1, $html2, $html3, $html4) {
    my $doc = $parser->load_html(string => $html)->getDocumentElement;
    my $div = $doc->findnodes('//div')->[0];
    print $div->textContent, "\n";
}
Testing on another machine, I see this is already fixed in a later version. Sorry. I should not use this old Ubuntu version (12.04) when preparing bug reports...