Subject: Parsing of tags in text not correct
Hello,
When I parse a node whose text content contains another node that is
followed by a newline character, that newline character is dropped.
Code to reproduce:
use XML::DOM::Lite;
my $xml = "<root><child>missing\n<tag>between</tag>\nnewline</child></root>";
my $doc = XML::DOM::Lite::Parser->new()->parse($xml);
my $appended = "";
foreach my $childChild
    (@{$doc->documentElement->getElementsByTagName("child")->item(0)->childNodes})
{
    if ($childChild->nodeType == XML::DOM::Lite::Constants::TEXT_NODE)
    {
        $appended .= $childChild->nodeValue;
    }
    elsif ($childChild->nodeType == XML::DOM::Lite::Constants::ELEMENT_NODE)
    {
        $appended .= $childChild->firstChild->nodeValue;
    }
}
print "'$appended'\n";
Output:
'missing
betweennewline'
Expected output:
'missing
between
newline'
Possible Fix:
I commented out line 160 of Parser.pm (in _handle_text_node), which seemed
to fix my problem. Is this safe to do, or will it break something else?
sub _handle_text_node {
    my ($self, $text) = @_;
    $parent = $self->{stack}->[$#{$self->{stack}}];
    #$text =~ s/^\n//so; return unless defined $text;
    return $self->_mk_gen_node($text, $parent, TEXT_NODE);
}
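
To illustrate why that line matters, here is a minimal sketch of the substitution in isolation, using core Perl only. It assumes the parser passes the text after </tag> to _handle_text_node as its own chunk beginning with the newline ("\nnewline"); I have not verified how Parser.pm actually splits the input. The helper name strip_leading_newline is hypothetical and is not part of XML::DOM::Lite.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::DOM::Lite): applies the same
# substitution as the line that was commented out in _handle_text_node.
sub strip_leading_newline {
    my ($text) = @_;
    # Without the /m modifier, ^ matches only at the very start of the
    # string, so at most one leading newline is removed.
    $text =~ s/^\n//so;
    return $text;
}

# Assumed text chunks of <child> from the report: one before <tag>,
# one after </tag>.
for my $chunk ("missing\n", "\nnewline") {
    my $result = strip_leading_newline($chunk);
    (my $shown = $result) =~ s/\n/\\n/g;   # make newlines visible
    print "'$shown'\n";
}
```

Under that assumption, the first chunk keeps its trailing newline while the second loses its leading one, which matches the output reported above. Note also that commenting out the whole line disables the "return unless defined $text" check as well, since both statements sit on the same line.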
System information:
Module: XML-DOM-Lite-0.15
Perl: v5.12.1
OS: Linux 2.6.34 x86_64
Thank you