Bug #34868 for Games-Cryptoquote: Patterns list wrong buildup

Subject:	Patterns list wrong buildup
Date:	Fri, 11 Apr 2008 10:59:55 +0200
To:	bug-Games-Cryptoquote [...] rt.cpan.org
From:	"Walter Baeck" <walter.baeck [...] gmail.com>

I haven't actually run CryptoQuote.pm; I was just browsing through the code to understand its approach. From the included patterns.txt file, I get the general idea of how the algorithm works: based on which letters recur and which are unique in a word, a lookup key is formed that allows quick access to a list of known plaintext words with exactly the same pattern.

However, these lookup keys treat uppercase and lowercase letters as different, which shouldn't be the case. I'm used to CryptoQuotes printed in the newspaper in all uppercase, so I had never thought about the issue. But from the example in your own source code, I understand that lowercase and uppercase substitutions are meant to be consistent: when an uppercase 'B' stands for an 'N', the lowercase 'b' is guaranteed to stand for an 'n', and vice versa. Therefore, word encodings should be classified regardless of case, and the patterns.txt file should be built accordingly. When presenting the solution, the casing could be taken from the encoded quote and reproduced, for aesthetic correctness.

Perhaps the intention is to keep proper names from matching lowercase codes within the quote itself. (I think this is a dangerous idea, because common words can also be capitalized, at the beginning of a sentence within the quote.) But even then, information is lost by blindly treating the uppercase codes as wholly different. The coded author's first name "Npn" should match "Ada" or "Bob", but not "Cat".

Walter
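
For illustration, the case-insensitive classification suggested above could look roughly like the sketch below. It is written in Python rather than the module's Perl, and it is not taken from CryptoQuote.pm: the function names pattern_key and restore_case are hypothetical, and the dotted-number key format is an assumption that need not match the format actually used in patterns.txt.

```python
def pattern_key(word):
    """Build a case-insensitive repetition pattern for a word.

    Each distinct letter gets a number in order of first appearance,
    so words with the same recurring/unique structure share a key.
    The dotted-number format is an assumption made for this sketch.
    """
    seen = {}
    parts = []
    for ch in word.lower():          # fold case before classifying
        if ch not in seen:
            seen[ch] = len(seen) + 1
        parts.append(str(seen[ch]))
    return ".".join(parts)


def restore_case(cipher_word, plain_word):
    """Copy the casing of the encoded word onto the decoded word."""
    return "".join(
        p.upper() if c.isupper() else p.lower()
        for c, p in zip(cipher_word, plain_word)
    )


if __name__ == "__main__":
    for w in ("Npn", "Ada", "Bob", "Cat"):
        print(w, pattern_key(w))
    print(restore_case("Npn", "bob"))
```

With keys built this way, "Npn", "Ada" and "Bob" fall under the same key while "Cat" does not, and the casing of the encoded quote can be reapplied to whichever solution is found.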