Subject: | regexps for upper and lower case are anglocentric |
Using \p{Upper} and \p{Lower} would be an improvement.
Test case attached.
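To make the difference concrete, here is a minimal sketch comparing the two kinds of character class. The helper names has_ascii_lower and has_unicode_lower are hypothetical and exist only for this illustration; they are not the module's actual code, which may be structured differently.

```perl
use strict;
use warnings;
use utf8;    # the string literals below contain non-ASCII characters

# Hypothetical helpers for illustration only (not taken from the module).
# An "is uppercase" check typically means "contains no lower-case letter",
# so the question is which characters count as lower-case.
sub has_ascii_lower   { $_[0] =~ /[a-z]/     ? 1 : 0 }   # ASCII range only
sub has_unicode_lower { $_[0] =~ /\p{Lower}/ ? 1 : 0 }   # Unicode Lowercase property

binmode STDOUT, ':encoding(UTF-8)';

for my $word ( 'CAFÉ', 'CAFé' ) {
    printf "%s: ascii=%d unicode=%d\n",
        $word, has_ascii_lower($word), has_unicode_lower($word);
}
```

For 'CAFé' the ASCII range check finds no lower-case letter, so a check built on it would wrongly treat the string as entirely upper-case, whereas \p{Lower} does match the é.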
Subject: | not-just-a-z.t |
=pod
=encoding utf-8
=head1 PURPOSE
At the time of writing, MooseX::Types::Common::String uses regular
expressions featuring C<< [a-z] >> and C<< [A-Z] >> to test lower-
and upper-caseness.
This is very anglocentric - there are many, many other lower- and
upper-case characters commonly used in other languages. The current
situation is not even sufficient for English text, where many loan words,
and even some native words, include accented characters. These include
I<< café >>, I<< encyclopædia >> and I<< naïve >>.
There's no excuse for this; Perl has very good Unicode support, including
built-in character classes for matching lower- and upper-case characters.
=head1 AUTHOR
Toby Inkster E<lt>tobyink@cpan.orgE<gt>.
=head1 COPYRIGHT AND LICENCE
This software is copyright (c) 2013 by Toby Inkster.
This is free software; you can redistribute it and/or modify it under
the same terms as the Perl 5 programming language system itself.
=cut
use strict;
use warnings;
use utf8;
use Test::More;
use MooseX::Types::Common::String -all;
ok( is_UpperCaseStr('CAFÉ'), q[CAFÉ is uppercase] );
ok( !is_UpperCaseStr('CAFé'), q[CAFé is not (entirely) uppercase] );
ok( is_LowerCaseStr('café'), q[café is lowercase] );
ok( !is_LowerCaseStr('cafÉ'), q[cafÉ is not (entirely) lowercase] );
ok( is_UpperCaseSimpleStr('CAFÉ'), q[CAFÉ is uppercase] );
ok( !is_UpperCaseSimpleStr('CAFé'), q[CAFé is not (entirely) uppercase] );
ok( is_LowerCaseSimpleStr('café'), q[café is lowercase] );
ok( !is_LowerCaseSimpleStr('cafÉ'), q[cafÉ is not (entirely) lowercase] );
done_testing;