
This queue is for tickets about the Encode CPAN distribution.

Report information
The Basics
Id: 75670
Status: resolved
Priority: 0
Queue: Encode

People
Owner: DANKOGAI [...] cpan.org
Requestors: geoff.rowell [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 2.44
Fixed in: (no value)



Subject: Wrong decoding for GSM 3.38 character \x09
RE: Encode::GSM0338

Per the official 3GPP technical spec for GSM 3.38 (latest, revision 7.2.0):

http://www.3gpp.org/ftp/Specs/archive/03_series/03.38/0338-720.zip

The GSM 3.38 character \x09 is incorrectly decoded as LATIN SMALL LETTER C WITH CEDILLA (U+00E7).
Its correct decoding is LATIN CAPITAL LETTER C WITH CEDILLA (U+00C7).

Checked other sources and confirmed your claim is correct. Fixed in my repository.

diff -u -r2.0 ucm/gsm0338.ucm
--- ucm/gsm0338.ucm 2004/05/16 20:55:24 2.0
+++ ucm/gsm0338.ucm 2012/08/05 22:42:30
@@ -369,7 +369,8 @@
 <U00E4> \x7B |0 # LATIN SMALL LETTER A WITH DIAERESIS
 <U00E5> \x0F |0 # LATIN SMALL LETTER A WITH RING ABOVE
 <U00E6> \x1D |0 # LATIN SMALL LETTER AE
-<U00E7> \x09 |0 # LATIN SMALL LETTER C WITH CEDILLA
+#<U00E7> \x09 |0 # LATIN SMALL LETTER C WITH CEDILLA
+<U00C7> \x09 |0 # LATIN CAPITAL LETTER C WITH CEDILLA
 <U00E8> \x04 |0 # LATIN SMALL LETTER E WITH GRAVE
 <U00E9> \x05 |0 # LATIN SMALL LETTER E WITH ACUTE
 <U00EC> \x07 |0 # LATIN SMALL LETTER I WITH GRAVE

Dan the Maintainer Thereof

On Sat Mar 10 09:47:58 2012, gbrowell wrote:
> RE: Encode::GSM0338
>
> Per the official 3GPP technical spec for GSM 3.38 (latest, revision
> 7.2.0):
>
> http://www.3gpp.org/ftp/Specs/archive/03_series/03.38/0338-720.zip
>
> The GSM 3.38 character \x09 is incorrectly decoded as LATIN SMALL LETTER C
> WITH CEDILLA (U+00E7).
> Its correct decoding is LATIN CAPITAL LETTER C WITH CEDILLA (U+00C7).

! lib/Encode/GSM0338.pm t/gsm0338.t
  REALLY fixed RT#75670: Wrong decoding for GSM 3.38 character \x09
  ucm/gsm0338.ucm is dropped from MANIFEST since 2.25
  but I was fixing the wrong file!
  https://rt.cpan.org/Ticket/Display.html?id=75670

===================================================================
RCS file: lib/Encode/GSM0338.pm,v
retrieving revision 2.1
diff -u -r2.1 lib/Encode/GSM0338.pm
--- lib/Encode/GSM0338.pm 2008/05/07 20:56:05 2.1
+++ lib/Encode/GSM0338.pm 2012/08/15 04:40:49
@@ -1,5 +1,5 @@
 #
-# $Id: GSM0338.pm,v 2.1 2008/05/07 20:56:05 dankogai Exp dankogai $
+# $Id: GSM0338.pm,v 2.1 2008/05/07 20:56:05 dankogai Exp $
 #
 package Encode::GSM0338;
 
@@ -138,7 +138,8 @@
     "\x{00E4}" => "\x7B", # LATIN SMALL LETTER A WITH DIAERESIS
     "\x{00E5}" => "\x0F", # LATIN SMALL LETTER A WITH RING ABOVE
     "\x{00E6}" => "\x1D", # LATIN SMALL LETTER AE
-    "\x{00E7}" => "\x09", # LATIN SMALL LETTER C WITH CEDILLA
+    #"\x{00E7}" => "\x09", # LATIN SMALL LETTER C WITH CEDILLA
+    "\x{00C7}" => "\x09", # LATIN CAPITAL LETTER C WITH CEDILLA
     "\x{00E8}" => "\x04", # LATIN SMALL LETTER E WITH GRAVE
     "\x{00E9}" => "\x05", # LATIN SMALL LETTER E WITH ACUTE
     "\x{00EC}" => "\x07", # LATIN SMALL LETTER I WITH GRAVE
===================================================================
RCS file: t/gsm0338.t,v
retrieving revision 2.1
diff -u -r2.1 t/gsm0338.t
--- t/gsm0338.t 2007/04/22 14:56:12 2.1
+++ t/gsm0338.t 2012/08/15 05:09:26
@@ -13,7 +13,7 @@
 
 use strict;
 use utf8;
-use Test::More tests => 778;
+use Test::More tests => 780;
 use Encode;
 use Encode::GSM0338;
 
@@ -87,6 +87,10 @@
     }
 }
 
+# https://rt.cpan.org/Ticket/Display.html?id=75670
+is decode("gsm0338", "\x09") => chr(0xC7), 'RT75670: decode';
+is encode("gsm0338", chr(0xC7)) => "\x09", 'RT75670: encode';
+
 __END__
 for my $c (map { chr } 0..127){
     my $b = "\x1b$c";
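
For anyone wanting to check which mapping their installed Encode uses for \x09, here is a minimal sketch. It assumes Encode::GSM0338 is installed; the helper name describe_gsm_byte is hypothetical and not part of Encode. It only reports what the local install does and should not be read as the maintainer's own test.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Encode::GSM0338;

# Hypothetical helper (not part of Encode): decode a single GSM 03.38 byte
# and return the resulting code point formatted as U+XXXX.
sub describe_gsm_byte {
    my ($byte) = @_;
    my $char = decode('gsm0338', $byte);
    return sprintf 'U+%04X', ord $char;
}

# Per this ticket, an affected release reports U+00E7 here,
# while a release carrying the fix reports U+00C7.
printf "Encode %s decodes \\x09 as %s (spec says U+00C7)\n",
    Encode->VERSION, describe_gsm_byte("\x09");
```

If the reported code point is U+00E7, the installed release predates the change to lib/Encode/GSM0338.pm shown above.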