Subject: bug in parse_file?
I am parsing the attached XML file with the following Perl code:
use XML::Simple;
use Data::Dumper;
my $xml  = XML::Simple->new;
my $data = $xml->XMLin("test.xml");   # parse directly from the file name
print Dumper($data);
The first few dozen lines of output are parsed correctly, but broken
characters appear in the later lines:
{
'translation' => {
'content' => "Dispositivo de Mem\x{f3}",
'TYPE' => 'done'
}
},
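A note for anyone reproducing this: the \x{f3} escape on its own may simply be how Data::Dumper prints the character U+00F3 in a string that Perl holds as characters, so it helps to list the code points of a suspect value and see where it actually goes wrong. The following is a minimal sketch using only core Perl; code_points is a hypothetical helper name, and the sample string is just the tail of the value shown above.

```perl
use strict;
use warnings;

# Return one "U+XXXX" label per character in $str.
sub code_points {
    my ($str) = @_;
    return map { sprintf('U+%04X', ord($_)) } split //, $str;
}

# Sample: the last four characters of the value in the dump above, written
# with an explicit escape so this script's own encoding does not matter.
my $sample = "Mem\x{f3}";
print join(' ', code_points($sample)), "\n";
```

Running the same helper on the value returned by each of the two XMLin calls should show whether the strings differ in length, in content, or only in how they are displayed.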
The result is correct when the whole file is first read into memory
and the string is then passed to XMLin:
use File::Slurp;
my $str = read_file("test.xml");   # read the whole file into memory
$data = $xml->XMLin($str);         # parse from the string instead
print Dumper($data);
So I am guessing there might be a bug in parse_file.
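To pin down exactly which values differ between the two code paths, the two parsed structures can be walked in parallel and the first mismatch reported. This is only a sketch in core Perl (written to run on 5.8 as well); first_diff is a hypothetical helper name, and the two small hashes at the bottom are made-up stand-ins for the XMLin results, not data from test.xml.

```perl
use strict;
use warnings;

# Walk two nested structures (hashes, arrays, plain scalars) in parallel.
# Returns a description of the first difference found, or undef if none.
sub first_diff {
    my ($x, $y, $path) = @_;
    $path = '$data' unless defined $path;

    return "$path: different types" if ref($x) ne ref($y);

    if (ref($x) eq 'HASH') {
        my %keys = map { $_ => 1 } keys %$x, keys %$y;
        for my $key (sort keys %keys) {
            my $sub = $path . "{$key}";
            return "$sub: missing on one side"
                unless exists $x->{$key} && exists $y->{$key};
            my $diff = first_diff($x->{$key}, $y->{$key}, $sub);
            return $diff if defined $diff;
        }
        return undef;
    }

    if (ref($x) eq 'ARRAY') {
        return "$path: array lengths differ" if @$x != @$y;
        for my $i (0 .. $#$x) {
            my $diff = first_diff($x->[$i], $y->[$i], $path . "[$i]");
            return $diff if defined $diff;
        }
        return undef;
    }

    # Plain scalars: undef only matches undef.
    return undef if !defined $x && !defined $y;
    return "$path: one side is undef" if !defined $x || !defined $y;
    return undef if $x eq $y;
    return sprintf("%s: '%s' vs '%s'", $path, $x, $y);
}

# Made-up stand-ins for the two parse results.
my $from_file   = { item => { content => 'abc',    TYPE => 'done' } };
my $from_string = { item => { content => 'abcdef', TYPE => 'done' } };

my $diff = first_diff($from_file, $from_string);
print defined $diff ? "$diff\n" : "no difference\n";
```

With the snippets from this report, the call would be first_diff($xml->XMLin("test.xml"), $xml->XMLin($str)).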
Output of perl -V:
Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
Platform:
osname=linux, osvers=2.4.21-40.elsmp, archname=i386-linux-thread-multi
uname='linux 2.4.21-40.elsmp #1 smp thu feb 2 22:22:39 est 2006 i686
i686 i386 gnulinux '
config_args='-des -Doptimize=-O2 -Dmyhostname=localhost
-Dperladmin=root@localhost -Dcc=gcc -Dprefix=/opt/perl
-Darchname=i386-linux -Duseshrplib -Dusethreads -Duseithreads
-Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm
-Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Ubincompat5005
-Uversiononly -Dpager=/usr/bin/less'
hint=recommended, useposix=true, d_sigaction=define
usethreads=define use5005threads=undef useithreads=define
usemultiplicity=define
useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
-fno-strict-aliasing -pipe -Wdeclaration-after-statement
-I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64
-I/usr/include/gdbm',
optimize='-O2',
cppflags='-D_REENTRANT -D_GNU_SOURCE -DTHREADS_HAVE_PIDS
-fno-strict-aliasing -pipe -Wdeclaration-after-statement
-I/usr/local/include -I/usr/include/gdbm'
ccversion='', gccversion='3.2.3 20030502 (Red Hat Linux 3.2.3-54)',
gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t',
lseeksize=8
alignbytes=4, prototype=define
Linker and Libraries:
ld='gcc', ldflags =' -L/usr/local/lib'
libpth=/usr/local/lib /lib /usr/lib
libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
libc=/lib/libc-2.3.2.so, so=so, useshrplib=true, libperl=libperl.so
gnulibc_version='2.3.2'
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E
-Wl,-rpath,/opt/perl/lib/5.8.8/i386-linux-thread-multi/CORE'
cccdlflags='-fpic', lddlflags='-shared -L/usr/local/lib'
Characteristics of this binary (from libperl):
Compile-time options: MULTIPLICITY PERL_IMPLICIT_CONTEXT
PERL_MALLOC_WRAP THREADS_HAVE_PIDS USE_ITHREADS
USE_LARGE_FILES USE_PERLIO USE_REENTRANT_API
Built under linux
Compiled at Feb 10 2007 15:57:49
@INC:
/opt/perl/lib/5.8.8/i386-linux-thread-multi
/opt/perl/lib/5.8.8
/opt/perl/lib/site_perl/5.8.8/i386-linux-thread-multi
/opt/perl/lib/site_perl/5.8.8
/opt/perl/lib/site_perl
.
Subject: test.xml
Message body is not shown because it is too large.