Subject: | Hebrew 'ben' fix breaks the forename Ben |
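The attached namecase-test.pl demonstrates the problem, and the attached
patch fixes it. For a name such as 'ben jones', the first name remains
lowercase because of the code that accommodates the Hebrew 'ben', meaning
'son of'. The patch adds a lookbehind so that 'Ben' is lowercased only
when another word precedes it.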
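To show the two patterns in isolation, here is a minimal sketch that does not load Lingua::EN::NameCase at all. The helpers ben_before and ben_after are hypothetical names made up for this illustration; each applies only the 'Ben' substitution (old and patched respectively) to a name that is assumed to be already capitalised word by word, which is the state the module's own rule sees.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; not part of Lingua::EN::NameCase.

# Rule as shipped in 1.15: lowercase any 'Ben' that is followed by another word.
sub ben_before {
    my $s = shift;
    $s =~ s{ \b Ben(?=\s+\w) }{ben}gox;
    return $s;
}

# Patched rule: additionally require a non-space character and one whitespace
# character immediately before 'Ben', so a leading 'Ben' is left alone.
sub ben_after {
    my $s = shift;
    $s =~ s{ (?<=\S\s)\b Ben(?=\s+\w) }{ben}gox;
    return $s;
}

for my $name ('Ben Jones', 'Aharon Ben Amram') {
    printf "%-18s old: %-18s patched: %s\n",
        $name, ben_before($name), ben_after($name);
}
```

With the old rule 'Ben Jones' comes out as 'ben Jones', while the patched rule leaves it as 'Ben Jones'; 'Aharon Ben Amram' becomes 'Aharon ben Amram' under both.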
Subject: | ben.patch |
--- Lingua-EN-NameCase-1.15/NameCase.pm 2008-02-27 00:46:53.000000000 -0700
+++ /Library/Perl/5.8.8/Lingua/EN/NameCase.pm 2009-06-04 11:41:38.000000000 -0600
@@ -110,7 +110,9 @@
# Fixes for "son (daughter) of" etc. in various languages.
s{ \b Al(?=\s+\w) }{al}gox ; # al Arabic or forename Al.
s{ \b Ap \b }{ap}gox ; # ap Welsh.
- s{ \b Ben(?=\s+\w) }{ben}gox ; # ben Hebrew or forename Ben.
+ # See <http://www.jewfaq.org/jnames.htm> (search for "followed by ben").
+ # Without the leading (?<=\S\s), the first name of 'ben jones' remains lowercase.
+ s{ (?<=\S\s)\b Ben(?=\s+\w) }{ben}gox ; # ben Hebrew or forename Ben.
s{ \b Dell([ae])\b }{dell$1}gox ; # della and delle Italian.
s{ \b D([aeiu]) \b }{d$1}gox ; # da, de, di Italian; du French.
s{ \b De([lr]) \b }{de$1}gox ; # del Italian; der Dutch/Flemish.
Subject: | namecase-test.pl |
#!/usr/bin/perl
use strict;
use Lingua::EN::NameCase 'nc';
while (<DATA>) {
    chomp;
    my $nc = nc($_);    # nc() returns a case-corrected copy of the name
    print "[$_]\tbecomes [$nc]\n";
}
__DATA__
DR SARAH BEETLE
june O'LEARY
MICHAEL JOHN JACOBS JR
MR. jon whitacre iii
MARY BETH DAVIDSON MD
MS LAURA CONLEY-ROSE
LAURA&DAVID SMITH
ESTATE OF LAURA JONES
MS MS. LAURA J BYRD
ben mcgrath
Aharon ben Amram ha-Kohein