
This queue is for tickets about the URI CPAN distribution.

Report information
The Basics
Id: 81460
Status: new
Priority: 0/
Queue: URI

People
Owner: Nobody in particular
Requestors: tim [...] tim-landscheidt.de
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 1.60
Fixed in: (no value)



Subject: Unescaping UTF-8 file URLs doesn't work with "use encoding"
The script:

| use encoding 'utf8';
| use strict;
| use warnings;
| use Data::Dumper;
| use URI::file;
| my $u = URI->new ('file:///home/tim/Videos/Nerdist%20Podcast/Nerdist%20Podcast:%20Live%20 @%20Largo%20w%20%E2%80%93%20%20Adam%20Savage!%20(%2310).mp3');
| print Dumper $u->file ();

yields with perl 5.14.3 under Linux (Fedora 16):

| $VAR1 = "/home/tim/Videos/Nerdist Podcast/Nerdist Podcast: Live \@ Largo w \x{fffd}\x{fffd}\x{fffd} Adam Savage! (#10).mp3";

(Note the three "\x{fffd}".) I believe this is due to URI::Escape::uri_unescape()'s:

| [...]
| s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
| [...]

where (my assumption) each byte is converted individually, rather than the bytes first being concatenated into one string that is then converted as a whole. Unfortunately, thinking about Perl's UTF-8 magic makes my brain hurt, so I couldn't confirm this :-).
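To make the assumption above concrete, here is a minimal sketch of what the quoted substitution does on its own, without "use encoding" in effect. The helper name unescape_bytes is hypothetical and only mirrors the quoted line; it is not the code of URI::Escape itself. It shows that "%E2%80%93" turns into three separate one-byte characters, so anything that converts the result of each chr() call individually never sees the complete three-byte UTF-8 sequence. It does not reproduce the interaction with the encoding pragma.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the substitution quoted in the report:
# every %XX escape becomes one character whose ordinal is that byte value.
sub unescape_bytes {
    my ($str) = @_;
    $str =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return $str;
}

my $raw = unescape_bytes('w%20%E2%80%93%20Adam');

# Show the ordinals of the three characters produced from %E2%80%93.
my @ords = map { sprintf('0x%02X', ord($_)) } split //, substr($raw, 2, 3);
print join(' ', @ords), "\n";
```

Run on its own, this prints the three byte values 0xE2, 0x80 and 0x93, which together form the UTF-8 encoding of U+2013 (EN DASH) but are individually not valid UTF-8.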
From: tim [...] tim-landscheidt.de
This issue can be closed. I can no longer reproduce it with Fedora 25, perl-URI 1.71, URI/file.pm 4.21, which now yields the correct (in a sense):

| $VAR1 = "/home/tim/Videos/Nerdist Podcast/Nerdist Podcast: Live \@ Largo w \342\200\223 Adam Savage! (#10).mp3";
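The result is "correct (in a sense)" because it is a string of raw octets rather than decoded characters. A minimal sketch, assuming the caller wants a character string, is to decode the whole octet string in one step with the core Encode module. The helper name octets_to_chars is hypothetical, and the sample string is a shortened stand-in for the path above.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode a complete UTF-8 octet string at once.
# FB_CROAK makes malformed input die instead of yielding U+FFFD;
# LEAVE_SRC keeps Encode from modifying the string passed in.
sub octets_to_chars {
    my ($octets) = @_;
    return decode('UTF-8', $octets, Encode::FB_CROAK() | Encode::LEAVE_SRC());
}

# Shortened stand-in for the reported path, using the same octal escapes.
my $octets = "w \342\200\223 Adam";
my $chars  = octets_to_chars($octets);

printf "octets: %d, characters: %d, third character: U+%04X\n",
    length($octets), length($chars), ord(substr($chars, 2, 1));
```

Decoding after all escapes have been turned into octets avoids the per-byte conversion suspected in the original report, since the decoder sees the full multi-byte sequence.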