
This queue is for tickets about the WWW-Contact CPAN distribution.

Report information
The Basics
Id: 46280
Status: open
Priority: 0/
Queue: WWW-Contact

People
Owner: Nobody in particular
Requestors: olaf [...] wundersolutions.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.24
Fixed in: (no value)



Subject: Funky encoding of some non-alphanumeric chars in Hotmail names
When returning Hotmail contacts that don't have a first name associated with them, the module returns funky stuff like:

    { 'email' => 'john_sample@hotmail.com', 'name' => 'john_sample\\x26\\x2364\\x3bhotmail.com' },

(Obviously the @ sign here.) This could be fixed with:

    =~ s{\\x26\\x2364\\x3b}{@}gxms

Another example:

    { 'email' => 'sample@domain.com', 'name' => 'Sample \\x26\\x2340\\x3bhome\\x26\\x2341\\x3b' }

This looks to be "(home)".

I don't really understand the encoding issues well enough to fix it myself for all possible characters, but I think it merits a ticket. :)

Thanks for all of your helpful work!

Olaf
On Thu May 21 12:14:08 2009, OALDERS wrote:
> I don't really understand the encoding issues well enough to fix it
> myself for all possible characters, but I think it merits a ticket. :)
One of my co-workers had a crack at this. I've attached an example for decoding the funkiness. Olaf
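    #!/usr/bin/perl

    use strict;
    use warnings;

    use HTML::Entities;

    my $name = "john_sample\\x26\\x2364\\x3bhotmail.com";

    # Turn each literal \xNN escape into the character it names ...
    $name =~ s{\\x(..)}{chr(hex($1))}egxms;

    # ... then decode the resulting HTML entity (&#64; becomes @).
    $name = decode_entities($name);

    print $name, "\n";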
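The script above undoes two layers of encoding: literal \xNN escapes that spell out an HTML numeric entity, and then the entity itself. The sketch below walks through the two layers separately using only core Perl. The function name decode_layers is made up for illustration, and the second step handles decimal numeric entities only; HTML::Entities, as used above, covers far more cases.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WWW::Contact.
# Returns the intermediate (entity) form and the final plain form.
sub decode_layers {
    my ($raw) = @_;

    # Layer 1: each literal \xNN escape becomes the character it names,
    # so \x26 \x23 \x3b turn into & # ;
    (my $entities = $raw) =~ s/\\x([0-9A-Fa-f]{2})/chr(hex($1))/eg;

    # Layer 2: decimal numeric entities such as &#64; become characters.
    (my $plain = $entities) =~ s/&#(\d+);/chr($1)/eg;

    return ($entities, $plain);
}

# Single quotes keep the backslashes literal, as in the data Hotmail returns.
my ($entities, $plain) = decode_layers('john_sample\x26\x2364\x3bhotmail.com');
print "$entities\n";    # john_sample&#64;hotmail.com
print "$plain\n";       # john_sample@hotmail.com
```

The second sample name from the report, 'Sample \x26\x2340\x3bhome\x26\x2341\x3b', goes through the same two steps and comes out as "Sample (home)".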
Thanks. 0.25 is on its way to CPAN. http://fayland.org/CPAN/WWW-Contact-0.25.tar.gz
I was working with Olaf on this issue earlier, and I was thinking that the regex I came up with before could cause some issues in some rare instances, so I came up with a new one that should be a bit safer:

    $name =~ s{(?|\\x\{([a-f0-9]{4})\}|\\x([a-f0-9]{2}))}{chr(hex($1))}egxms;

That should account for all hex characters in either the \x?? or \x{????} formats. This way, on the off chance that the text returned actually has \\x followed by something other than a hex code, it won't bother trying to do a replacement on it.
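To illustrate how the branch-reset group behaves, here is a minimal sketch that wraps the same pattern in a function and feeds it both escape formats plus a string that should be left alone. It assumes Perl 5.10 or later. The function name decode_hex_escapes is hypothetical, and the sketch uses / delimiters rather than braces to keep the escaped braces unambiguous.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WWW::Contact. Needs Perl 5.10+ for (?|...).
sub decode_hex_escapes {
    my ($text) = @_;

    # Inside (?|...), capture numbering restarts in each branch, so the
    # hex digits end up in $1 whichever alternative matched.
    $text =~ s/(?|\\x\{([a-f0-9]{4})\}|\\x([a-f0-9]{2}))/chr(hex($1))/eg;

    return $text;
}

# Two-digit form: only the \xNN layer is undone here, the entity remains.
print decode_hex_escapes('Sample \x26\x2340\x3bhome\x26\x2341\x3b'), "\n";
# Sample &#40;home&#41;

# Four-digit form: \x{0041} names the letter A.
print decode_hex_escapes('\x{0041}'), "\n";    # A

# \x followed by non-hex characters is not touched.
print decode_hex_escapes('C:\xyz'), "\n";      # C:\xyz
```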
Sorry, it has a syntax error:

    'Sequence (?|...) not recognized in regex; marked by <-- HERE in m/(?| <-- HERE \\x{([a-f0-9]{4})}|\\x([a-f0-9]{2}))/'

Do you mean:

    $name =~ s{(\\x\{([a-f0-9]{4})\}|\\x([a-f0-9]{2}))}{chr(hex($1))}egxms;

Thanks
On Fri May 22 00:10:39 2009, FAYLAND wrote:
> Sorry, it has syntax error: 'Sequence (?|...) not recognized in regex;
> marked by <-- HERE in m/(?| <-- HERE \\x{([a-f0-9]{4})}|\\x([a-f0-9]{2}))/'
>
> do you mean
>
> $name =~ s{(\\x\{([a-f0-9]{4})\}|\\x([a-f0-9]{2}))}{chr(hex($1))}egxms;
>
> Thanks
That's strange that you get a syntax error with that code. I've just run it on my machine here and it works fine. The ?| portion tells the Perl regex engine to restart capture numbering from 1 in each branch. In this case, that means whichever of the ([a-f0-9]{4}) or ([a-f0-9]{2}) captures matches ends up in $1, which is needed so that the chr(hex($1)) portion works correctly in either case. Not sure why you get a syntax error when you try it.

With:

    $name =~ s{(\\x\{([a-f0-9]{4})\}|\\x([a-f0-9]{2}))}{chr(hex($1))}egxms;

I get errors on my end saying illegal hexadecimal digit (\), not to mention that there would then technically be 3 capture buffers in there, I think. You could do:

    $name =~ s{\\x([a-f0-9]{2})}{chr(hex($1))}egxms;

That only tests for 2-digit hex characters, but it should cover most cases at least and still shouldn't produce any false replacements.
In the end I went with:

    $name =~ s{\\x([A-Fa-f0-9]{2})}{chr(hex($1))}egxms;

Thanks for your patch. 0.26 is on its way to CPAN.
Glad that one works for you. One reason the other code I left might not work is that the (?|pattern) functionality is only available as of Perl 5.10.0, so if you are using an earlier build, that might have been the issue.
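For perls older than 5.10, one possible way to handle both escape formats without (?|...) is plain alternation with two ordinary capture groups, picking whichever one matched. This is only a sketch of that idea, not what was released. The function name decode_hex_escapes_compat is made up for illustration, and / delimiters are used to keep the escaped braces unambiguous.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WWW::Contact. Avoids (?|...) entirely.
sub decode_hex_escapes_compat {
    my ($text) = @_;

    # $1 is set when the \x{NNNN} branch matched, $2 when the \xNN branch did.
    # Wrapping the whole alternation in an extra group would put the full
    # match, backslash included, into $1, which hex() cannot parse.
    $text =~ s/\\x\{([A-Fa-f0-9]{4})\}|\\x([A-Fa-f0-9]{2})/chr(hex(defined($1) ? $1 : $2))/eg;

    return $text;
}

print decode_hex_escapes_compat('john_sample\x26\x2364\x3bhotmail.com'), "\n";
# john_sample&#64;hotmail.com
```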
Do you know how to check if the Perl version is > 5.10? Should I use $[ or $^V, and how do I compare them?

Thanks.
On Sat May 23 01:58:17 2009, FAYLAND wrote:
> do you know how to check if perl version is > 5.10?
> use $[ or $^V? how to compare?
>
> Thanks.
I think either of those should work. Perl::Version looks like it would handle the version comparison for you:

http://search.cpan.org/~andya/Perl-Version-1.009/lib/Perl/Version.pm#Comparison

Olaf
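A core-only alternative is a numeric comparison against $], which holds the interpreter version as a decimal such as 5.008009 or 5.010001. Note that $[ is the array-base variable and carries no version information. The sketch below picks a pattern based on that check. The names has_branch_reset and decode_hex_escapes_any are hypothetical, and the patterns are kept as strings on the assumption that this keeps an older perl from parsing (?|...) at compile time.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the running perl is 5.10.0 or newer.
sub has_branch_reset {
    return $] >= 5.010;
}

# Pattern source as single-quoted strings; '\\\\x' is the regex \\x,
# which matches a literal backslash followed by x.
my $pattern = has_branch_reset()
    ? '(?|\\\\x\{([A-Fa-f0-9]{4})\}|\\\\x([A-Fa-f0-9]{2}))'
    : '\\\\x([A-Fa-f0-9]{2})';
my $hex_escape = qr/$pattern/;

# Hypothetical helper, not part of WWW::Contact.
sub decode_hex_escapes_any {
    my ($text) = @_;
    $text =~ s/$hex_escape/chr(hex($1))/eg;
    return $text;
}

print decode_hex_escapes_any('john_sample\x26\x2364\x3bhotmail.com'), "\n";
# john_sample&#64;hotmail.com
```

On an older perl the fallback handles only the two-digit \xNN form, which matches the pattern that was finally adopted in this ticket.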