
This queue is for tickets about the WWW-Contact CPAN distribution.

Report information
The Basics
Id: 42925
Status: resolved
Priority: 0/
Queue: WWW-Contact

People
Owner: Nobody in particular
Requestors: david [...] axiombox.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: W::C::Hotmail returns whacky results
Date: Fri, 30 Jan 2009 19:12:03 -0500
To: bug-WWW-Contact [...] rt.cpan.org
From: David Moreno <david [...] axiombox.com>
I have only tested with a single account so far, but the contacts returned are a bit screwed after a few correct ones. See:

<contact>
  <name>antgonza&#64;hotmail.com‎</name>
  <mail>antgonza@hotmail.com</mail>
</contact>
<contact>
  <name>antilooped&#64;msn.com‎</name>
  <mail>antilooped@msn.com</mail>
</contact>
<contact>
  <name>antonio_ognio‎</name>
  <mail>antuanmanguan@hotmail.com</mail>
</contact>
<contact>
  <name>antuanmanguan&#64;hotmail.com‎</name>
  <mail>aphextwin_bucephalus@hotmail.com</mail>
</contact>
<contact>
  <name>aphextwin_bucephalus&#64;hotmail.com‎</name>
  <mail>arellanoac@hotmail.com</mail>
</contact>
<contact>
  <name>arellanoac&#64;hotmail.com‎</name>
  <mail>arggovea@yahoo.com.mx</mail>
</contact>
<contact>
  <name>arggovea&#64;yahoo.com.mx‎</name>
  <mail>arkangel_800@hotmail.com</mail>
</contact>
<contact>
  <name>arkangel_800&#64;hotmail.com‎</name>
  <mail>asciigirlzita@hotmail.com</mail>
</contact>

Don't mind the XML output, that's only my project's output that iterates on the array returned. See that after antilooped, the next result doesn't carry an 'at' symbol or a &#64; entity and then the key-value pairs are mixed with the next hash.

David Moreno
http://damog.net/
If you can provide a faked HTML of the "/mail/PrintShell.aspx?type=contact" page, it would be great, so that I can take a look and fix it.

Or patches are welcome. :)

Thanks.
Subject: Re: [rt.cpan.org #42925] W::C::Hotmail returns whacky results
Date: Sat, 31 Jan 2009 11:29:42 -0500
To: bug-WWW-Contact [...] rt.cpan.org
From: David Moreno <david [...] axiombox.com>
Hello!

So no, I can't provide the print page since it has my entire contacts list and I wouldn't want it to be public :-)

However, I found that some of the contacts on that page had no email: each had a name, but no email. I don't know if this is because my Live account is set up with a custom domain (related to the previous bug I reported), but I doubt it, so I guess it applies to all kinds of accounts.

Anyway, the issue was manageable with the following patch, as you said they were welcome :)

--- WWW-Contact-0.19/lib/WWW/Contact/Hotmail.pm 2009-01-02 19:01:45.000000000 -0500
+++ perl/WWW-Contact-0.19/lib/WWW/Contact/Hotmail.pm 2009-01-31 11:13:06.000000000 -0500
@@ -91,6 +91,12 @@
                 my $name = $p->peek(1);
                 $name =~ s/(^\s+|\s+$)//isg;
                 push @names, $name;
+            } elsif( $class and $class eq 'cCol1' ) {
+                $p->get_tag; # this "should" be table
+                $tag = $p->get_tag;
+                unless ($tag->is_start_tag('tr')) {
+                    push @emails, undef;
+                }
             }
         } elsif ( $token->is_start_tag('td') ) {
             my $class = $token->get_attr('class');

I'm not experienced with TokeParser, so my code might not make complete sense to an experienced person. I'm more of a regex/scraping dude :-) Anyway, that should be enough to give you an idea of what I'm trying to do: if the current div class is "cCol1", then try to find the "tr" tag. If it's not there, as was the case for the contacts with missing emails, then add undef to the @emails array. That way, I now get this:

$VAR14 = {
    'email' => 'antgonza@hotmail.com',
    'name' => "antgonza&#64;hotmail.com\x{200e}"
};
$VAR15 = {
    'email' => 'antilooped@msn.com',
    'name' => "antilooped&#64;msn.com\x{200e}"
};
$VAR16 = {
    'email' => undef,
    'name' => "antonio_ognio\x{200e}"
};
$VAR17 = {
    'email' => 'antuanmanguan@hotmail.com',
    'name' => "antuanmanguan&#64;hotmail.com\x{200e}"
};
$VAR18 = {
    'email' => 'aphextwin_bucephalus@hotmail.com',
    'name' => "aphextwin_bucephalus&#64;hotmail.com\x{200e}"

See my previous mail, as it was at my "antonio_ognio" contact that my list broke.

Out of my 208 contacts, 19 returned an undef email, so it's not a very isolated bug. It's up to you whether pushing undef into @emails is the best thing, or whether it's better to push neither the email nor the name.

Hope this works.

David Moreno
http://damog.net/

On Jan 31, 2009, at 6:56 AM, Fayland Lin via RT wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=42925 >
>
> if you can provide a faked HTML of "/mail/PrintShell.aspx?type=contact"
> page, it would be great. so that I can take a look and fix it.
>
> or patches are welcome. :)
>
> Thanks.
On Sat Jan 31 11:30:39 2009, david@axiombox.com wrote:
> Anyway, the issue was manageable with the following patch, as you said
> they were welcome :)
Apparently the patch appears broken on the RT ticket page. I'm attaching it now.
--- WWW-Contact-0.19/lib/WWW/Contact/Hotmail.pm 2009-01-02 19:01:45.000000000 -0500
+++ perl/WWW-Contact-0.19/lib/WWW/Contact/Hotmail.pm 2009-01-31 11:13:06.000000000 -0500
@@ -91,6 +91,12 @@
                 my $name = $p->peek(1);
                 $name =~ s/(^\s+|\s+$)//isg;
                 push @names, $name;
+            } elsif( $class and $class eq 'cCol1' ) {
+                $p->get_tag; # this "should" be table
+                $tag = $p->get_tag;
+                unless ($tag->is_start_tag('tr')) {
+                    push @emails, undef;
+                }
             }
         } elsif ( $token->is_start_tag('td') ) {
             my $class = $token->get_attr('class');
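
For anyone reading this ticket later, the misalignment reported here can be reproduced in isolation. The following is only a minimal sketch, not the module's actual code: the contact data and the zip_contacts helper are hypothetical, and it assumes the parser collects names and emails into two parallel arrays that are later paired by index, as the patch suggests. It shows how one contact without an email shifts every later pairing, and how pushing undef as a placeholder keeps the arrays aligned.

```perl
use strict;
use warnings;

# Hypothetical scraped rows: the second contact has a name but no email.
my @rows = (
    { name => 'alice', email => 'alice@example.com' },
    { name => 'bob',   email => undef },
    { name => 'carol', email => 'carol@example.com' },
);

# Hypothetical helper: pair two parallel arrays by index.
sub zip_contacts {
    my ($names, $emails) = @_;
    return map { +{ name => $names->[$_], email => $emails->[$_] } }
           0 .. $#{$names};
}

my @names = map { $_->{name} } @rows;

# Buggy behaviour: a missing email is simply never pushed,
# so every later email slides up by one position.
my @emails_skipped = grep { defined } map { $_->{email} } @rows;
my @broken = zip_contacts(\@names, \@emails_skipped);

# Patched behaviour: push undef as a placeholder to keep the arrays aligned.
my @emails_padded = map { $_->{email} } @rows;
my @fixed = zip_contacts(\@names, \@emails_padded);

# Alternative raised in the ticket: drop contacts that have no email at all.
my @complete = grep { defined $_->{email} } @fixed;

for my $c (@fixed) {
    printf "%s => %s\n", $c->{name},
        defined($c->{email}) ? $c->{email} : '(no email)';
}
```

In @broken, 'bob' ends up paired with carol's address, which is the same symptom as the "antonio_ognio" entry above. Whether to keep the undef entries (@fixed) or filter them out (@complete) is the design choice left open in the ticket.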