
This queue is for tickets about the XML-LibXML CPAN distribution.

Report information
The Basics
Id: 93117
Status: open
Priority: 0/
Queue: XML-LibXML

People
Owner: Nobody in particular
Requestors: MARKOV [...] cpan.org
Cc: achim.adam [...] univie.ac.at
AdminCc:

Bug Information
Severity: Important
Broken in: 2.0110
Fixed in: (no value)



CC: "Achim Adam" <achim.adam [...] univie.ac.at>
Subject: utf8 in toStringC14N
Dear XML::LibXML maintainers,

We bumped into a nasty utf8 problem.

We need to use canonicalization (c14n) to generate sha1 digests in SOAP messages, which are then signed cryptographically. XML::LibXML enables the utf8 flag on all output, using C2sv(), also in toStringC14N(). Since version 5.74, Digest::SHA does a utf8_downgrade on the strings it sums. In an example I have at hand, that downgrade changes the c14n output string. Hence, the sha is incorrect!

Canonicalization is a horribly sensitive process. We would like to see the output as bytes, not flagged to be "Perl's internal idea of utf8", which may trigger unexpected character conversions. Please, can you add a function toBytesC14N() which leaves the utf8 flag off? Probably, the output of toStringC14N is only good for printing during debugging, not for automated use.

Of course, Digest::SHA should consider its parameter as bytes, not strings. So, we will probably file a bug report for that module.
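The effect described above can be reproduced without XML::LibXML. The following is a minimal sketch, assuming a Digest::SHA of version 5.74 or later; the literal `<a>caf\x{e9}</a>` is only a hypothetical stand-in for real c14n output, and `utf8::upgrade` is used to mimic a string returned with the utf8 flag on.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);
use Encode qw(encode_utf8);

# Stand-in for c14n output (assumption): a character string
# containing U+00E9, stored with the utf8 flag on.
my $chars = "<a>caf\x{e9}</a>";
utf8::upgrade($chars);

# Explicit UTF-8 octets, utf8 flag off. This is what a c14n
# digest is supposed to be computed over.
my $bytes = encode_utf8($chars);

# Digest::SHA >= 5.74 downgrades a flagged string before summing,
# so U+00E9 is hashed as one Latin-1 byte, not as two UTF-8 octets.
my $digest_chars = sha1_hex($chars);
my $digest_bytes = sha1_hex($bytes);

print "flagged string: $digest_chars\n";
print "UTF-8 octets:   $digest_bytes\n";
```

Encoding to octets before calling the digest function sidesteps the question of what the digest module does with flagged strings.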
Hi Mark, thanks for your report.

On Tue Feb 18 04:12:45 2014, MARKOV wrote:
> Dear XML::LibXML maintainers,
>
> We bumped into a nasty utf8 problem.
>
> We need to use canonicalization (c14n) to generate sha1 digests in
> SOAP messages, which are then signed cryptographically. XML::LibXML
> enables the utf8 flag on all output, using C2sv(), also in
> toStringC14N(). Since version 5.74, Digest::SHA does a utf8_downgrade
> on the strings it sums. The latter changes the c14n output string, in
> an example I have at hand. Hence, the sha is incorrect!
>
> Canonicalization is a horribly sensitive process. We would like to
> see the output as bytes, not flagged to be "Perl's internal idea of
> utf8" which may trigger unexpected character conversions. Please, can
> you add a function toBytesC14N() which leaves the utf8 flag off?
> Probably, the output of toStringC14N is only good for print during
> debugging, not for automated use.
>
> Of course, Digest::SHA should consider its parameter as bytes, not
> strings. So, we probably file a bug-report for that module.
Assuming I understand it, it sounds acceptable. A patch, with an accompanying, reproducing patch to the automated tests will be welcome, and will speed up the process of getting it there. "Code talks!"

Regards,

-- Shlomi Fish
Subject: Re: [rt.cpan.org #93117] utf8 in toStringC14N
Date: Tue, 18 Feb 2014 10:55:32 +0100
To: Shlomi Fish via RT <bug-XML-LibXML [...] rt.cpan.org>
From: Mark Overmeer <secretaris [...] nluug.nl>
* Shlomi Fish via RT (bug-XML-LibXML@rt.cpan.org) [140218 09:34]:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=93117 >
>
> Assuming I understand it, it sounds acceptable. A patch, with an
> accompanying, reproducing patch to the automated tests will be welcome,
> and will speed up the process of getting it there. "Code talks!"
The toStringC14N() hook in XS.pm is quite large, and only the last statement has to be skipped in toBytesC14N(). Although I have a few modules with XS myself, I have no idea how to implement this wrapper simply, without duplicating code. I know you are much better at this than I am.

--
Regards,
MarkOv

------------------------------------------------------------------------
Mark Overmeer MSc                                MARKOV Solutions
Mark@Overmeer.net                          solutions@overmeer.net
http://Mark.Overmeer.net               http://solutions.overmeer.net
CC: "H. Merijn Brand" <h.m.brand [...] xs4all.nl>
Subject: Re: [rt.cpan.org #93117] utf8 in toStringC14N
Date: Tue, 18 Feb 2014 12:02:57 +0100
To: Shlomi Fish via RT <bug-XML-LibXML [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
* NLUUG, Mark Overmeer (secretaris@nluug.nl) [140218 10:55]:
> * Shlomi Fish via RT (bug-XML-LibXML@rt.cpan.org) [140218 09:34]:
> > <URL: https://rt.cpan.org/Ticket/Display.html?id=93117 >
> >
> > Assuming I understand it, it sounds acceptable. A patch, with an
> > accompanying, reproducing patch to the automated tests will be welcome,
> > and will speed up the process of getting it there. "Code talks!"
An implementation in Perl space is simple, as contributed by Merijn Brand (Tux):

diff -purd a/LibXML.pm b/LibXML.pm
--- a/LibXML.pm 2014-02-01 15:12:09.000000000 +0100
+++ b/LibXML.pm 2014-02-18 11:49:50.832573878 +0100
@@ -27,6 +27,7 @@ use XML::LibXML::Error;
 use XML::LibXML::NodeList;
 use XML::LibXML::XPathContext;
 use IO::Handle; # for FH reads called as methods
+use Encode qw(_utf8_off );

 BEGIN {
 $VERSION = "2.0110"; # VERSION TEMPLATE: DO NOT CHANGE
@@ -1346,6 +1347,13 @@ sub toStringC14N {
 );
 }

+sub toBytesC14N {
+    my $self = shift;
+    my $x = $self->toStringC14N( @_ );
+    _utf8_off( $x );
+    return $x;
+}
+
 {
 my $C14N_version_1_dot_1_val = 2;
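Until something like the patch above is part of the distribution, the same effect can be had from calling code. The following is a sketch only: `c14n_bytes` is a hypothetical helper name, not part of XML::LibXML, and it assumes the node object's `toStringC14N` returns a character string with the utf8 flag on, as described in this ticket. It uses the core `utf8::encode` rather than `Encode::_utf8_off`; for a flagged string both leave the same UTF-8 octets with the flag off.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::LibXML): return the canonical
# form of a node as UTF-8 octets, with the utf8 flag off.
sub c14n_bytes {
    my ($node, @args) = @_;

    # Any arguments are passed through to toStringC14N unchanged.
    my $c14n = $node->toStringC14N(@args);

    # Convert the character string to octets in place. The guard avoids
    # encoding a second time if the string is already unflagged.
    utf8::encode($c14n) if utf8::is_utf8($c14n);

    return $c14n;
}
```

A caller would then hash the result directly, for example `Digest::SHA::sha1_hex(c14n_bytes($doc))`, so the digest module only ever sees octets.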
Hi Mark,

On Tue Feb 18 06:03:14 2014, solutions@overmeer.net wrote:
> * NLUUG, Mark Overmeer (secretaris@nluug.nl) [140218 10:55]:
> > * Shlomi Fish via RT (bug-XML-LibXML@rt.cpan.org) [140218 09:34]:
> > > <URL: https://rt.cpan.org/Ticket/Display.html?id=93117 >
> > >
> > > Assuming I understand it, it sounds acceptable. A patch, with an
> > > accompanying, reproducing patch to the automated tests will be welcome,
> > > and will speed up the process of getting it there. "Code talks!"
>
> An implementation in Perl space is simple, as contributed by Merijn
> Brand (Tux)
Thanks! A few comments:

1. It seems a bit hacky, but that's not a big deal.

2. It lacks a test case and some documentation.

3. It can be done by using a wrapper function outside the scope of XML::LibXML (which may work as a workaround for you), but I still find it useful as a part of XML::LibXML itself.

Regards,

-- Shlomi Fish
Subject: Re: [rt.cpan.org #93117] utf8 in toStringC14N
Date: Tue, 18 Feb 2014 12:19:47 +0100
To: Shlomi Fish via RT <bug-XML-LibXML [...] rt.cpan.org>
From: Mark Overmeer <solutions [...] overmeer.net>
Thanks for your active maintenance!

* Shlomi Fish via RT (bug-XML-LibXML@rt.cpan.org) [140218 11:09]:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=93117 >
>
> 2. It lacks a test case and some documentation.
The description depends on the opinion you have about it. IMO, the canonicalization produces bytes: it is a serialization of the nodes. The bytes are in utf8 encoding (always, according to the C14N requirement) but *may* contain differences with Perl's internal concept of utf8. Actually, our problem with the downgrade performed by Digest::SHA shows that there is some difference. No idea what.

The C14N is probably always followed by a SHA checksum (not only sha1), which should accept bytes (not strings) as input as well. So, I would advise the use of "toBytesC14N" (oh, we also need a toBytesEC14N) over "toStringC14N". But you may prefer the reverse. That will impact the docs considerably.

> 3. It can be done by using a wrapper function outside the scope of
> XML::LibXML (which may work as a workaround for you), but I still find
> it useful as a part of XML::LibXML itself.

Thanks.

--
Regards,
MarkOv

------------------------------------------------------------------------
Mark Overmeer MSc                                MARKOV Solutions
drs Mark A.C.J. Overmeer                         MARKOV Solutions
Mark@Overmeer.net                          solutions@overmeer.net
http://Mark.Overmeer.net               http://solutions.overmeer.net
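On the difference left open above, one plausible explanation (an assumption, not confirmed in this ticket) is that a downgrade stores each character below U+0100 as a single Latin-1 byte, whereas UTF-8 encoding emits two octets for the same character. The sketch below shows both storage forms for U+00E9 using only core `utf8::` functions; `hex_octets` is a hypothetical helper for display.

```perl
use strict;
use warnings;

# Hypothetical display helper: hex dump of the stored bytes of an
# unflagged string.
sub hex_octets {
    my ($str) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $str;
}

my $chars = "\x{e9}";       # one character, U+00E9
utf8::upgrade($chars);      # utf8 flag on, as on flagged c14n output

# Downgrade: same character, stored as the single Latin-1 byte E9.
my $downgraded = $chars;
utf8::downgrade($downgraded);

# Encode: the character becomes the two UTF-8 octets C3 A9.
my $encoded = $chars;
utf8::encode($encoded);

print 'downgraded: ', hex_octets($downgraded), "\n";
print 'encoded:    ', hex_octets($encoded), "\n";
```

If this is indeed the cause, any non-ASCII character below U+0100 in the canonical output would give a digest that differs from one computed over the UTF-8 octets.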