Subject: Removing non-word characters breaks contractions
Pod::Spelling removes non-word characters ("punctuation") within
_clean_text():
    $text =~ s/[\W]+/ /gs; # Remove punctuation
However, this means that contractions such as "isn't" are broken:
"isn't" becomes "isn t", so by the time the spell checker sees the
text, the fragment before the apostrophe ("isn") is reported as a
spelling error.
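To make the effect concrete, here is a minimal standalone sketch. It
does not use Pod::Spelling itself: clean_current() simply wraps the
substitution quoted above, and clean_keep_apostrophes() is a
hypothetical alternative (both names are mine, not part of the module)
that keeps an apostrophe only when it sits between two word
characters. Treat it as one possible direction rather than a tested
patch.

```perl
use strict;
use warnings;

# Wraps the substitution quoted from _clean_text() in this report.
sub clean_current {
    my ($text) = @_;
    $text =~ s/[\W]+/ /gs;    # every run of non-word characters becomes a space
    return $text;
}

# Hypothetical alternative (an assumption, not the module's code):
# keep apostrophes that sit between two word characters.
sub clean_keep_apostrophes {
    my ($text) = @_;
    $text =~ s/[^\w']+/ /g;             # drop punctuation other than apostrophes
    $text =~ s/(?<!\w)'|'(?!\w)/ /g;    # drop apostrophes that only quote a word
    $text =~ s/\s+/ /g;                 # collapse the resulting runs of spaces
    return $text;
}

print clean_current("This isn't right"), "\n";             # This isn t right
print clean_keep_apostrophes("This isn't right"), "\n";    # This isn't right
```

This only handles the ASCII apostrophe; a typographic apostrophe
(U+2019) would still be stripped, and the spell checker backend would
also need to accept words containing an apostrophe.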
A brief example, from a script which uses Pod::Spelling to check the
named file for spelling errors:
[davidp@columbia:~]$ ~/podspellcheck dancer/customer/lib/customer.pm
Spelling errors: couldn, isn, isn, couldn, isn, isn, couldn, Couldn,
isn, couldn, GMT, wasn, isn, isn, doesn, couldn, isn, isn, couldn, API,
[... snipped rest ....]