
This queue is for tickets about the Regexp-Grammars CPAN distribution.

Report information
The Basics
Id: 99980
Status: resolved
Priority: 0/
Queue: Regexp-Grammars

People
Owner: Nobody in particular
Requestors: bbkr [...] post.pl
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 1.036
Fixed in: (no value)



Subject: utf8 flag is lost in match object on v5.20+
#!/usr/bin/env perl
use utf8;
use Regexp::Grammars;

my $parser = qr{
    <TOP>
    <rule: TOP>.*
}xms;

'zażółć_gęślą_jaźń' =~ $parser;
print "parsed_as_utf8 = ", utf8::is_utf8( $/{'TOP'} );
__END__

On Perl 5.14.4 and 5.16.3 it correctly sets utf8 flag on captured string. On Perl 5.20.1 and 5.21.5 flag is lost.
Subject: Re: [rt.cpan.org #99980] utf8 flag is lost in match object on v5.20+
Date: Tue, 4 Nov 2014 19:24:36 +1100
To: bug-Regexp-Grammars [...] rt.cpan.org
From: Damian Conway <damian [...] conway.org>
Thanks for the report. However, this problem is not specific to Regexp::Grammars, as the attached test script demonstrates. I will report the issue to the core developers.

Damian

Message body is not shown because sender requested not to inline it.

From: bbkr [...] post.pl
Thanks! Can you please link ticket where it is reported so everyone can track its status? I don't see it on RT.
Subject: Re: [rt.cpan.org #99980] utf8 flag is lost in match object on v5.20+
Date: Thu, 6 Nov 2014 07:54:41 +1100
To: bug-Regexp-Grammars [...] rt.cpan.org
From: Damian Conway <damian [...] conway.org>
> Can you please link ticket where it is reported so everyone can track its status? I don't see it on RT.
There was a problem with the original report. I have just resubmitted. The ticket is:

https://rt.perl.org/Ticket/Display.html?id=123135

Damian
From: bbkr [...] post.pl
Bug is confirmed, should be fixed in Perl 5.20.2 release: https://rt.perl.org/Ticket/Display.html?id=122913
On Thu Nov 06 12:16:54 2014, bbkr@post.pl wrote:
> Bug is confirmed, should be fixed in Perl 5.20.2 release:
>
> https://rt.perl.org/Ticket/Display.html?id=122913
Yes, the fix has been backported to maint-5.20. If you need to work around the bug for 5.20.0 and 5.20.1, I believe

my $x = $^N;
utf8::decode $x if utf8::is_utf8 $_;
... do something with $x, not $^N ...

will do the trick. Within regexp code blocks, $_ is aliased to the string being matched against. And it is within code blocks that $^N behaves erratically.
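For anyone who wants to try the workaround in isolation, here is a minimal sketch of it inside a plain (?{ ... }) code block, without Regexp::Grammars. The names $text and $captured are made up for illustration. The extra check that $x is not already flagged is an addition that is not part of the suggestion above; it is there on the assumption that the same script may also run on a Perl where $^N is already correct.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;

# Hypothetical names, for illustration only.
my $text = 'zażółć_gęślą_jaźń';
my $captured;

$text =~ m{
    (\w+)
    (?{
        # Copy $^N first, then repair the copy rather than $^N itself.
        my $x = $^N;

        # Inside a code block $_ is the string being matched against.
        # Decode only when the subject is flagged but the copy is not
        # (the second condition is an added guard, see the note above).
        utf8::decode($x) if utf8::is_utf8($_) && !utf8::is_utf8($x);

        $captured = $x;
    })
}xms;

print "captured_as_utf8 = ", ( utf8::is_utf8($captured) ? 1 : 0 ), "\n";
print "same_as_input = ",    ( $captured eq $text ? 1 : 0 ), "\n";
```

The guard matters because calling utf8::decode on a string that is already decoded can, in some cases, change its contents.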
From: bbkr [...] post.pl
From Perl 5.20.2 changelog: "In Perl 5.20.0, $^N accidentally had the internal UTF8 flag turned off if accessed from a code block within a regular expression, effectively UTF8-encoding the value. This has been fixed. [perl #123135]" So ticket can be closed. Thanks!
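To illustrate what "effectively UTF8-encoding the value" means in that changelog entry, the sketch below does not reproduce the bug. It only simulates its effect with utf8::encode and shows that utf8::decode reverses it, which is presumably why the workaround above helps. The variable names are made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;

# Character string: the UTF8 flag is on because of "use utf8" and the
# non-ASCII literal.
my $chars = 'zażółć_gęślą_jaźń';

# Simulate the effect of the bug: same data, but seen as UTF-8 octets
# with the flag off.
my $bytes = $chars;
utf8::encode($bytes);

# What the workaround does: decode the octets back into characters.
my $round_trip = $bytes;
utf8::decode($round_trip);

printf "chars:      length=%d flag=%d\n", length($chars),      utf8::is_utf8($chars)      ? 1 : 0;
printf "bytes:      length=%d flag=%d\n", length($bytes),      utf8::is_utf8($bytes)      ? 1 : 0;
printf "round trip: length=%d flag=%d\n", length($round_trip), utf8::is_utf8($round_trip) ? 1 : 0;
```

The encoded copy is longer than the original because each non-ASCII character becomes more than one octet, so string comparisons and length checks against it no longer behave as expected.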