
This queue is for tickets about the Lingua-EN-Titlecase CPAN distribution.

Report information
The Basics
Id: 132504
Status: new
Priority: 0/
Queue: Lingua-EN-Titlecase

People
Owner: Nobody in particular
Requestors: rosyth168 [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Feature request, titlecase doesn't handle '4th' etc correctly.
Date: Tue, 5 May 2020 13:58:18 +0200
To: bug-Lingua-EN-Titlecase [...] rt.cpan.org
From: John Tweed <rosyth168 [...] gmail.com>
Hi,

Thanks for the work you put into creating Titlecase; it's been helpful for normalising my record collection. The exception is names like 1st, 3rd, 4th etc., which come out as 1St, 3Rd or 4Th, not the intended result.

Here is my not very rigorous modification to address this issue. Note that .bak is the original, so the diff runs from my modified file to the original and the lines marked '-' are the ones I added.

$ diff -Naur ../Titlecase.pm*
--- ../Titlecase.pm 2020-05-05 13:48:49.030787393 +0200
+++ ../Titlecase.pm.bak 2020-05-05 13:55:01.632107658 +0200
@@ -10,7 +10,6 @@
 uc_threshold
 mixed_threshold
 mixed_rx
-numeric_rx
 wordish_rx
 allow_mixed
 word_punctuation
@@ -108,9 +107,6 @@
 | \G(?<!\A)[[:upper:]]
 /x) unless $self->mixed_rx;
-$self->numeric_rx(qr/
-[[:digit:]]+(?:th|st|nd|rd)
-/x) unless $self->numeric_rx;
 $self->allow_mixed(undef);
 $self->mixed_threshold(0.25) unless $self->mixed_threshold;
@@ -212,11 +208,8 @@
 my $wp = $self->word_punctuation;
 my $wordish = $self->wordish_rx;
-my $numeric = $self->numeric_rx;
 $self->{_lexer} = sub {
-# print("LEXER -> ",$_[0]);
-$_[0] =~ s/\A($numeric)// and return [ "word", "$1" ];
 $_[0] =~ s/\A($wordish)// and return [ "word", "$1" ];
 $_[0] =~ s/\A(.)//s and return [ undef, "$1" ];
 return ();
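
To show the effect of the patch in isolation, here is a minimal standalone sketch of the same idea: try the ordinal pattern before the general word pattern, so that "4th" is lexed as a single word token. It does not use Lingua::EN::Titlecase at all. The $wordish pattern, make_lexer and simple_titlecase are hypothetical stand-ins written for this illustration, and the casing rule (ucfirst on every word token) is an assumption, not the module's actual logic.

```perl
use strict;
use warnings;

# Ordinal pattern taken from the patch above.
my $numeric = qr/[[:digit:]]+(?:th|st|nd|rd)/x;
# Hypothetical simplification; the module's real wordish_rx is more involved.
my $wordish = qr/[[:alpha:]]+/x;

# Return a lexer that strips one token from the front of its argument.
# When $ordinals_first is true, ordinals such as "4th" become one word token.
sub make_lexer {
    my ($ordinals_first) = @_;
    return sub {
        if ($ordinals_first) {
            $_[0] =~ s/\A($numeric)// and return [ "word", "$1" ];
        }
        $_[0] =~ s/\A($wordish)// and return [ "word", "$1" ];
        $_[0] =~ s/\A(.)//s       and return [ undef,  "$1" ];
        return;    # input exhausted
    };
}

# Assumed casing rule for this sketch: capitalise the first character of
# every word token and pass all other characters through unchanged.
sub simple_titlecase {
    my ($text, $ordinals_first) = @_;
    my $lexer = make_lexer($ordinals_first);
    my $out   = '';
    while (my $token = $lexer->($text)) {
        my ($type, $value) = @$token;
        $out .= defined $type ? ucfirst(lc $value) : $value;
    }
    return $out;
}

print simple_titlecase("the 4th symphony", 0), "\n";   # The 4Th Symphony
print simple_titlecase("the 4th symphony", 1), "\n";   # The 4th Symphony
```

Without the ordinal rule, the digit is consumed as a plain character and "th" is then capitalised as a word of its own; with it, ucfirst sees "4th" as a whole and leaves the letters lowercase.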