
This queue is for tickets about the XML-Tidy CPAN distribution.

Report information
The Basics
Id: 24113
Status: resolved
Priority: 0/
Queue: XML-Tidy

People
Owner: Pip [...] CPAN.Org
Requestors: Frank.G.Goss [...] aphis.usda.gov
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: XML-Tidy changes encoding
Date: Wed, 27 Dec 2006 10:29:05 -0700
To: bug-XML-Tidy [...] rt.cpan.org
From: Frank.G.Goss [...] aphis.usda.gov
version: XML-Tidy 1.2.43HJnFa
Perl version: v5.8.8 built for MSWin32-x86-multi-thread

This is the code fragment that I am running.

    opendir (DIR, $sourceDir) || die "Could not open directory, $sourceDir: $!\n";
    while (defined($file = readdir(DIR))) {
        next if $file =~ /^\.\.?$/; # skip . and ..
        print "processing file, $file\n";
        my $sourceFile = $sourceDir."/".$file;
        my $tidyObj = XML::Tidy->new('filename' => $sourceFile);
        $tidyObj->tidy(' ');
        $tidyObj->write('filename' => $sourceFile.".BAK");
    }
    closedir(DIR);

I have a number of XML files to tidy up. The original files have the following declaration:

    <?xml version="1.0" encoding="ISO-8859-1"?>

After tidying up, the declaration changes to:

    <?xml version="1.0" encoding="utf-8"?>

This causes errors with the validation since there are some accented characters not in the UTF-8 character set.

How can Tidy be changed to preserve the declaration and the encoding?

Regards,
Frank Goss
Subject: Re: [rt.cpan.org #24113] XML-Tidy changes encoding
Date: Wed, 27 Dec 2006 09:49:41 -0800
To: bug-XML-Tidy [...] rt.cpan.org, Frank.G.Goss [...] aphis.usda.gov
From: "Pip Stuart" <pipstuart [...] gmail.com>
Hello Frank,

Thanks for reporting this oversight. I'm not sure if the XML declaration header is exposed from the XML::XPath module that my XML::Tidy inherits from but... I'll try to use whatever is there unmolested (or include some work-around code) the next time I can get to packaging a new release. Sorry for any inconvenience caused by my bug.

Sincerely,
-Pip@CPAN.Org
I've just released XML-Tidy-1.4.A7QCvHw to the CPAN which resolves this issue (exclusively for the 'filename' constructor case).

--
-Pip@CPAN.Org
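
For anyone still on a release that predates this fix, one possible workaround is to post-process the written file: note the encoding declared in the original, then convert the tidied output back to it. The following is only a rough sketch, not part of XML::Tidy. The helper names declared_encoding and restore_encoding are hypothetical, and it assumes the tidied file is valid UTF-8 with a utf-8 declaration, as described in the report. It uses only the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: return the encoding named in a file's XML
# declaration, or undef if the file has no such declaration.
sub declared_encoding {
    my ($path) = @_;
    open(my $in, '<:raw', $path) or die "Could not open $path: $!\n";
    my $head = '';
    read($in, $head, 200);    # the declaration sits at the very start
    close($in);
    return $head =~ /^<\?xml[^>]*\bencoding\s*=\s*["']([A-Za-z][\w.\-]*)["']/
        ? $1
        : undef;
}

# Hypothetical helper: rewrite a UTF-8 XML file in place so that both its
# bytes and its declaration use $encoding.
# Assumption: the file on disk is UTF-8, as the report says the output is.
sub restore_encoding {
    my ($path, $encoding) = @_;
    open(my $in, '<:raw', $path) or die "Could not open $path: $!\n";
    my $bytes = do { local $/; <$in> };
    close($in);

    my $text = decode('UTF-8', $bytes);
    $text =~ s/^(<\?xml[^>]*\bencoding\s*=\s*["'])[^"']*/$1$encoding/;

    # FB_CROAK makes encode() die rather than silently substitute
    # characters the target encoding cannot represent.
    open(my $out, '>:raw', $path) or die "Could not write $path: $!\n";
    print {$out} encode($encoding, $text, Encode::FB_CROAK);
    close($out) or die "Could not close $path: $!\n";
}
```

Inside the loop from the original report, this would amount to calling declared_encoding($sourceFile) before constructing the XML::Tidy object, and calling restore_encoding($sourceFile.".BAK", $encoding) after write() whenever an encoding other than utf-8 was found.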