
This queue is for tickets about the XML-Parser CPAN distribution.

Report information
The Basics
Id: 37610
Status: resolved
Priority: 0/
Queue: XML-Parser

People
Owner: Nobody in particular
Requestors: ivm41i114ak [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: (no value)
Fixed in: (no value)



Subject: char handler divides its strings into parts
Hello,

I ran into a strange problem while trying to parse a slightly larger XML file (example: the attached "generated.xml"). All strings found by the "Char" handler should begin with "de.sssssss.aaaaaaaa.in...", but after filtering out all such strings, as well as the empty ones, I get entries like the following:

3559795548
ngframework.beans.factory.3296971492.60479238
aa.in.feature.springframework.beans.factory.9166016451.5671657454
516.1874589391
springframework.beans.factory.2106908156.390806959
aaaa.in.feature.springframework.beans.factory.7914037839.8975530171
348733
ctory.3934897581.278099645
ngframework.beans.factory.6759656287.5782942729
sss.aaaaaaaa.in.feature.springframework.beans.factory.5831720.2441053869
1
s.factory.3780012509.9929918564
in.feature.springframework.beans.factory.9549407632.6695296495

These are the ends of strings that were divided by the Char handler; in the output, the line above each of these strings contains the beginning of the string.

The source XML file ("generated.xml") and the script ("1.pl") are attached.

Sorry if I've done something wrong and it's not a bug.

Regards,
iv
Subject: 1.pl
#!/usr/local/bin/perl -wT
use strict;
use XML::Parser;

my $p3 = new XML::Parser(Style=>'Subs');

sub imported_ {
    my $expat = shift;
    my $name = shift;
    $expat->setHandlers(Char => \&char, Default => \&other);
}

sub other {};

sub char {
    my $expat = shift;
    my $string = shift;
    print "$string\n";
};

#$p3->parsefile('INTY2C8C2200_java_20080710155505.xml');
$p3->parsefile('generated.xml');
Subject: generated.xml

Message body is not shown because it is too large.

This behavior is normal, and documented in the docs for the Char handler. The handler can (and will!) be called several times for a single string in various circumstances: new line in the data, entity in the data, reaching the end of expat's internal buffer. You need to buffer the data and wait until you get to the end tag before using the data (see for example http://www.perlmonks.org/?node_id=31798 for how to do this). __ mirod
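A minimal sketch of that buffering approach, for illustration only. The sample XML, the element name `imported`, and the variable names are assumptions made up for this example; they are not taken from the attached "generated.xml". The Char handler only appends to a buffer, and the complete string is used once the End handler fires for the element:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample input: one string contains an entity and a newline,
# two of the cases in which the Char handler may be called more than once.
my $xml = <<'XML';
<imports>
<imported>de.example.feature&amp;more
second.line</imported>
<imported>de.example.other</imported>
</imports>
XML

my $buffer     = '';   # text collected for the <imported> element being read
my @strings;           # complete strings, one per <imported> element
my $char_calls = 0;    # how many times the Char handler was invoked

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $element) = @_;
            # Start with an empty buffer for each element of interest.
            $buffer = '' if $element eq 'imported';
        },
        Char => sub {
            my ($expat, $text) = @_;
            # Never use $text directly: it may be only a fragment.
            $char_calls++;
            $buffer .= $text;
        },
        End => sub {
            my ($expat, $element) = @_;
            # Only at the end tag is the string known to be complete.
            if ($element eq 'imported') {
                push @strings, $buffer;
                $buffer = '';
            }
        },
    },
);

$parser->parse($xml);

print "$_\n" for @strings;
printf STDERR "Char handler calls: %d for %d strings\n",
    $char_calls, scalar @strings;
```

The same pattern should carry over to the 'Subs' style used in the report: append in the Char handler, and consume the buffer in the end-tag sub (`imported_`).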
From: ivm41i114ak [...] gmail.com
On Sat Nov 01 03:17:11 2008, MIROD wrote:
> This behavior is normal, and documented in the docs for the Char
> handler. The handler can (and will!) be called several times for a
> single string in various circumstances: new line in the data, entity in
> the data, reaching the end of expat's internal buffer. You need to
> buffer the data and wait until you get to the end tag before using the
> data (see for example http://www.perlmonks.org/?node_id=31798 for how to
> do this).
> __
> mirod
Thanks a lot for the info. I think this bug may be closed.

Best regards,
iv