Skip Menu |

This queue is for tickets about the Glib CPAN distribution.

Report information
The Basics
Id: 39439
Status: resolved
Priority: 0/
Queue: Glib

People
Owner: Nobody in particular
Requestors: schmorp [...] schmorp.de
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: SvPVutf8 and friends considered harmful
Date: Sat, 20 Sep 2008 05:37:25 +0200
To: bug-Glib [...] rt.cpan.org
From: Marc Lehmann <schmorp [...] schmorp.de>
I recently stumbled over an issue with the Perl XS API where perl destroys values. This problem is present in Glib as well. Consider the following function: log_message (SV *msg) CODE: const char *msg_ = SvPVutf8_nolen (msg); This looks innocent enough, but when wants to log a reference, e.g. for debuging purposes: my $ref = new Object; log_message $ref; Then ref is changed into a string by SvPVutf8. I consider this an API problem, but unfortunately, it won't be fixed within perl. I don't know the perfect solution for this, but I suspect that a wrapper around SvPVutf8 and friends is required that checks whether an SV is really a string, and only then call SvPVutf8 (if it isn't a string,m then SvPV will return a valid utf-8 string because it contains only ascii characters). If you want me to develop such a wrapper, just prompt me. -- The choice of a Deliantra, the free code+content MORPG -----==- _GNU_ http://www.deliantra.net ----==-- _ generation ---==---(_)__ __ ____ __ Marc Lehmann --==---/ / _ \/ // /\ \/ / pcg@goof.com -=====/_/_//_/\_,_/ /_/\_\
On Fri Sep 19 23:38:41 2008, schmorp@schmorp.de wrote: Show quoted text
> I don't know the perfect solution for this, but I suspect that a wrapper > around SvPVutf8 and friends is required that checks whether an SV is > really a string, and only then call SvPVutf8 (if it isn't a string,m then > SvPV will return a valid utf-8 string because it contains only ascii > characters). > > If you want me to develop such a wrapper, just prompt me.
As far as I can see, SvPVutf8 or its companions are only used in gperl_filename_from_sv, right? So I think we could just fix it there, if it needs fixing. If you can come up with a test case that demonstrates the problem, that would be appreciated. And of course, a patch to fix the problem is more than welcome too! -Torsten
On Fri Sep 19 23:38:41 2008, schmorp@schmorp.de wrote: Show quoted text
> I recently stumbled over an issue with the Perl XS API where perl destroys > values. This problem is present in Glib as well. > > Consider the following function: > > log_message (SV *msg) > CODE: > const char *msg_ = SvPVutf8_nolen (msg); > > This looks innocent enough, but when wants to log a reference, e.g. for > debuging purposes: > > my $ref = new Object; > log_message $ref; > > Then ref is changed into a string by SvPVutf8. > > I consider this an API problem, but unfortunately, it won't be fixed within > perl.
I’ve just noticed this ticket. This API problem was fixed in perl commit fe46cbda82 (5.15.8). Show quoted text
> > I don't know the perfect solution for this, but I suspect that a wrapper > around SvPVutf8 and friends is required that checks whether an SV is > really a string, and only then call SvPVutf8 (if it isn't a string,m then > SvPV will return a valid utf-8 string because it contains only ascii > characters).
Not true, as a reference could stringify to "\xFF" it if thas overloading. Show quoted text
> > If you want me to develop such a wrapper, just prompt me. >
Subject: Re: [rt.cpan.org #39439] SvPVutf8 and friends considered harmful
Date: Mon, 9 Sep 2013 07:03:09 +0200
To: Father Chrysostomos via RT <bug-Glib [...] rt.cpan.org>
From: Marc Lehmann <schmorp [...] schmorp.de>
On Sun, Sep 08, 2013 at 02:34:33PM -0400, Father Chrysostomos via RT <bug-Glib@rt.cpan.org> wrote: Show quoted text
> > I consider this an API problem, but unfortunately, it won't be fixed within > > perl.
> > I’ve just noticed this ticket. This API problem was fixed in perl commit fe46cbda82 (5.15.8).
Good to hear that the perl maintainers reverted their opinion. Now they only need to fix SvPV to be compatible with existing modules - ok, it's hopeless, but I still dream of perl maintainers actually fixing these bugs rather than cast them into concrete :) Show quoted text
> > I don't know the perfect solution for this, but I suspect that a wrapper > > around SvPVutf8 and friends is required that checks whether an SV is > > really a string, and only then call SvPVutf8 (if it isn't a string,m then > > SvPV will return a valid utf-8 string because it contains only ascii > > characters).
> > Not true, as a reference could stringify to "\xFF" it if thas overloading.
Well, according to your patch description, the patch is susceptible to that problem too as it uses the same logic: "As of commit fe46cbda82 SvPVutf8 no longer tries to coerce its argument if it is not already a string.", so you are contradicting yourself here somewhere. I still think the algorithm as outlined above (and supposedly now in perl) works, as the result of a stringification only contains "\xFF" when it is a string, so it should work for overloaded values. Anyway, either there is a later patch that actually fixes it, or the algorithm as outlined by me is actually correct, or both the algorithm and perl are buggy - but something must give. In the former two cases, this bug has been fixed. In the latter, perl is still causing problems for overloaded values. -- The choice of a Deliantra, the free code+content MORPG -----==- _GNU_ http://www.deliantra.net ----==-- _ generation ---==---(_)__ __ ____ __ Marc Lehmann --==---/ / _ \/ // /\ \/ / schmorp@schmorp.de -=====/_/_//_/\_,_/ /_/\_\
On Mon Sep 09 01:03:30 2013, schmorp@schmorp.de wrote: Show quoted text
> On Sun, Sep 08, 2013 at 02:34:33PM -0400, Father Chrysostomos via RT > <bug-Glib@rt.cpan.org> wrote:
> > > I consider this an API problem, but unfortunately, it won't be > > > fixed within > > > perl.
> > > > I’ve just noticed this ticket. This API problem was fixed in perl > > commit fe46cbda82 (5.15.8).
> > Good to hear that the perl maintainers reverted their opinion.
I wasn’t aware of any opinion that SvPVutf8 wasn’t broken as originally implemented. Show quoted text
> Now > they > only need to fix SvPV to be compatible with existing modules - ok, > it's > hopeless, but I still dream of perl maintainers actually fixing these > bugs > rather than cast them into concrete :)
That was long before I had anything to do with perl development. I’m afraid it’s too late to do anything about that now, specially since the majority of XS modules have updated to do things the new way (or were written after the new way became The Way). It was actually 5.10’s utter brokenness that led me to start contributing to perl, since the hundreds of bugs introduced in it didn’t seem to be getting addressed. Show quoted text
>
> > > I don't know the perfect solution for this, but I suspect that a > > > wrapper > > > around SvPVutf8 and friends is required that checks whether an SV > > > is > > > really a string, and only then call SvPVutf8 (if it isn't a > > > string,m then > > > SvPV will return a valid utf-8 string because it contains only > > > ascii > > > characters).
> > > > Not true, as a reference could stringify to "\xFF" it if thas > > overloading.
> > Well, according to your patch description, the patch is susceptible to > that problem too as it uses the same logic: "As of commit fe46cbda82 > SvPVutf8 no longer tries to coerce its argument if it is not already a > string.", so you are contradicting yourself here somewhere.
I meant that your workaround described above, if you want to apply it in 5.14 and before, would be incorrect if it used SvPV for references, as SvPV(ref) can return bytes sequences that are not valid UTF-8. For 5.14, you would have to use SvPV, check the utf8 flag, and then convert the buffer if the utf8 flag was off. Show quoted text
> > I still think the algorithm as outlined above (and supposedly now in > perl) > works, as the result of a stringification only contains "\xFF" when it > is > a string, so it should work for overloaded values.
The algorithm now in perl is to copy it first if it is a glob, a reference, a read-only non-PV, or a regular expression. (Incidentally, it would also copy a pad name, but the assumption is that pad names will never get through to that code.) Show quoted text
> > Anyway, either there is a later patch that actually fixes it, or the > algorithm as outlined by me is actually correct, or both the algorithm > and > perl are buggy - but something must give.
I don’t quite understand. I think it is because you misunderstood me. I think I wasn’t clear enough at first. Show quoted text
> In the former two cases, this bug has been fixed. In the latter, perl > is > still causing problems for overloaded values.
Got me really confused there!
On Mon Sep 09 01:30:34 2013, SPROUT wrote: Show quoted text
> (Incidentally, it would also copy a pad name, but the assumption is > that pad names will never get through to that code.)
Ahem. Correction: it would copy a pad name representing a closed-over lexical declared outside the sub.
Subject: Re: [rt.cpan.org #39439] SvPVutf8 and friends considered harmful
Date: Mon, 9 Sep 2013 23:17:27 +0200
To: Father Chrysostomos via RT <bug-Glib [...] rt.cpan.org>
From: Marc Lehmann <schmorp [...] schmorp.de>
On Mon, Sep 09, 2013 at 01:30:34AM -0400, Father Chrysostomos via RT <bug-Glib@rt.cpan.org> wrote: Show quoted text
> I wasn’t aware of any opinion that SvPVutf8 wasn’t broken as originally implemented.
AS often, when I reported it, I was told that this isn't considered a bug, but instead the usage of SvPVutf8 would be in error. I had to implement workarounds in quite a few places. Show quoted text
> That was long before I had anything to do with perl development. I’m > afraid it’s too late to do anything about that now, specially since > the majority of XS modules have updated to do things the new way (or > were written after the new way became The Way).
Well, 99% of all statistics are made up on the spot, and this smells like one. First of all, I doubt that this is even remotely true, and second, all these new modules would unlikely to be broken, because very few modules use SvPV and later check the utf-8 flag, and third, new modules will suffer from the same bugs - each time xs modules use "char *" or equivalent they are almost certain to be buggy, and a greta number of new xs modules does exactly do that. The problem is clearly ongoing, and not fixed as you naively assume. Show quoted text
> It was actually 5.10’s utter brokenness that led me to start > contributing to perl, since the hundreds of bugs introduced in it > didn’t seem to be getting addressed.
I thank you for it. Show quoted text
> > > > SvPV will return a valid utf-8 string because it contains only > > > > ascii > > > > characters).
> > > > > > Not true, as a reference could stringify to "\xFF" it if thas > > > overloading.
> > > > Well, according to your patch description, the patch is susceptible to > > that problem too as it uses the same logic: "As of commit fe46cbda82 > > SvPVutf8 no longer tries to coerce its argument if it is not already a > > string.", so you are contradicting yourself here somewhere.
> > I meant that your workaround described above, if you want to apply it in 5.14 and before, would be incorrect if it used SvPV for references, as SvPV(ref) can return bytes sequences that are not valid UTF-8.
That obviously wouldn't be a problem, as SvPV would only be used in the non-string case. Show quoted text
> For 5.14, you would have to use SvPV, check the utf8 flag, and then convert the buffer if the utf8 flag was off.
... or use any number of other correct ways that might or might not be faster, smaller, or worse. Show quoted text
> > Anyway, either there is a later patch that actually fixes it, or the > > algorithm as outlined by me is actually correct, or both the algorithm > > and > > perl are buggy - but something must give.
> > I don’t quite understand. I think it is because you misunderstood me. I think I wasn’t clear enough at first.
No, the point was that your own patch used basically the same algorithm, but for some reason you either applied a dual standard (when you describe the algorithm it's ok, when others do it's "not true"), or, more importantly(!) at the time you wrote the patch you weren't aware of that issue and the patch was wrong. I felt it was my duty to point this out to you, as the latter case would have meant that the problem you attributed to my description of the algorithm would also be in yours, and thus, still in perl. I'm glad to hear that it was just you being more picky about my words than your own, so the result seems sane. In any case, this discussion is too drawn out and off-topic for rt.cpan.org, as it's unrelated to the problem at hand, and since rt.cpan.org has this idiotic behaviour of hiding e-mail addresses (and thus making it hard to contatc people directly), I won't spam it more here. -- The choice of a Deliantra, the free code+content MORPG -----==- _GNU_ http://www.deliantra.net ----==-- _ generation ---==---(_)__ __ ____ __ Marc Lehmann --==---/ / _ \/ // /\ \/ / schmorp@schmorp.de -=====/_/_//_/\_,_/ /_/\_\