
This queue is for tickets about the WWW-Myspace CPAN distribution.

Report information
The Basics
Id: 27707
Status: resolved
Priority: 0/
Queue: WWW-Myspace

People
Owner: Nobody in particular
Requestors: tobiesch [...] users.sourceforge.net
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 0.64
Fixed in: (no value)



Subject: fixed method for comments, added new method
Hi,

sorry to "spam" the bug reports but seems the easiest way to contact. I've fixed the methods that retrieve the comments from a profile and I also added a function that gets some basic info from the profile (age, gender, country, etc). I attach the code in the file for you to look at. If you want to incorporate it, I could probably check it into your CVS (given a login) and then you can adapt it as I'm sure there is room for improvement.

best, tobias
Subject: code.txt
#---------------------------------------------------------------------
# get_basic_info_on_page( $page );
# This routine takes the SOURCE CODE of the page and returns
# a hash of different information contained in the box on the
# top left corner

=head2 get_basic_info_on_page( $friends_page );

This routine takes the SOURCE CODE of an HTML page and returns
a hash of information containing:

    country    - country in profile (names of countries are standardised on MySpace)
    cityregion - the line with city and region information (this is free text)
    headline   - whatever it says next to the picture
    age        - as number
    gender     - as text, either male or female
    lastlogin  - date of last login

Note: MySpace joins the profile data from city and region to one line
(such as Berlin, Germany). However, both city and region are free text so
people can write whatever they want. What is more, region is optional.
This function tries to extract the city and the region by splitting
cityregion at the last comma. However, it might not work (depending on the
profile information) so both city and region can be empty.

    city   - city
    region - region

=cut

sub get_basic_info_on_page {
    my ( $page ) = @_;

    ##THIS IS LANGUAGE DEPENDENT SO SITE HAS TO BE ACCESSED IN ENGLISH!!!
    #my $BASIC_INFO = 'Table2".*?>.*<td.*?>(.*?)Last Login';
    my $BASIC_INFO = 'Table2".*?>(.*Last Login:.*?)<br>';
    $BASIC_INFO = qr/$BASIC_INFO/smo;

    #my $time=time;
    #matching does take quite long... (around 6s)
    $page =~ /$BASIC_INFO/;
    #$page =~ /Table2.*?>.*<td.*?>(.*?)Last Login/smo;
    $page=$1;
    $page =~ /align="left">(.*)/smo;
    #print "took time:",time-$time,", found $1\n";
    ( $DEBUG ) && print $1,"\n";

    my %info = ();
    #assign values and trim leading and trailing white spaces
    ($info{'headline'},$info{'empty'},$info{'gender'},$info{'age'},
     $info{'cityregion'},$info{'country'},$info{'empty'},$info{'empty'},
     $info{'lastlogin'})=map {s/^\s+//;s/\s+$//;$_} split('<br>',$1);

    #return age as number only
    $info{'age'} =~ s/^(\d+).*/$1/;
    #return last login as date only
    $info{'lastlogin'} =~ s/Last Login:\s+([\d\/]*)/$1/;

    #let's guess what is the city and what the region
    if ($info{'cityregion'} =~ /(.*), (.*)/){
        $info{'city'} = $1;
        $info{'region'} = $2;
    }

    return (%info);
}

#########################################
# fixed comments method
#######################################
sub tobiesch_get_comments {
    my ( $friend_id ) = @_;
    my @comments = ();
    my $url="http://comment.myspace.com/index.cfm?fuseaction=user.viewComments&friendID=". $friend_id;
    my $eventtarget='ctl00$Main$PagedComments$pagingNavigation1$NextLinkButton';
    my $page="";
    my $commentcount;

    $self->_die_unless_logged_in( 'get_comments' );

    #only get a maximum of 20 comment pages
    #this should translate to 1000 comments
    #and also serves as a safety measure in case
    #the method breaks again

    ( $DEBUG ) && print "Getting $url\n";
    $page = $self->get_page( $url );

    #find out how many comments in total
    #STILL NEEDS UPDATE IN CASE OF MORE THAN 1000 COMMENTS...
    #ALSO SEEMS LIKE SOMETIMES MYSPACE REPORTS MORE COMMENTS (OR FRIENDS)
    #THAN ARE ACTUALLY THERE...
    if($page->decoded_content =~ /.*Listing [\d-]+ of (\d+).*/smo){
        $commentcount=$1;
    }else{
        $self->error("Could not find how many comments are on profile");
        return undef;
    }

    for(my $i=1;$i<=20;$i++) {
        $page=$self->{current_page};
        push @comments, $self->tobiesch_get_comments_from_page( $page->decoded_content );

        #make sure we did not get an error
        return undef if ($self->error);

        last unless ( $self->_next_button( $page->decoded_content ) );

        ( $DEBUG ) && print "try to submit form to access comments page #",$i+1,"\n";
        #submit the form to get to next page
        $self->submit_form({
            follow => 0,
            form_name => "aspnetForm",
            no_click => 1,
            fields_ref => { __EVENTTARGET => $eventtarget, __EVENTARGUMENT => '' }
            #re1 => 'something unique.?about this[ \t\n]+page',
        });

        sleep ( int( rand( 2 ) ) + 1 );
    }

    unless(scalar (@comments) == $commentcount){
        $self->error("Could not collect all comments. Have " . @comments .", should have $commentcount");
        return undef;
    }

    return \@comments;
}

# Take a page, return a list of comment data
sub tobiesch_get_comments_from_page {
    my ( $page ) = @_;
    my @comments = ();

    # Get to the comments section to avoid mis-reads
    if ( $page !~ m/Add Comment<\/a>/gs ) {
        $self->error("Comment section not found on page");
        return undef;
    }

    # Read the comment data and push it into our array.
    while ( $page =~ s/.*?UserID=([0-9]+).*?<h4>(.*?)<\/h4>\s*(.*?)\s*<\/textarea>//smo ) {
        push @comments, { sender => $1, date => $2, comment => $3 };
        #print "found 1:$1\nfound 2:$2\nfound 3:$3\n";
    }

    return @comments;
}
################################################
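The field handling described in the POD above (split on <br>, trim, keep the number from the age, split city and region at the last comma) can be tried in isolation. The following is only a sketch: parse_basic_info_fragment is a hypothetical helper that is not part of WWW::Myspace, and the sample fragment and its field order are assumptions taken from the attached code, not from a verified page layout.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WWW::Myspace. It expects the already
# isolated fragment of <br>-separated profile fields.
sub parse_basic_info_fragment {
    my ($fragment) = @_;

    # Split on <br> and trim each field, working on copies.
    my @fields = map { my $f = $_; $f =~ s/^\s+|\s+$//g; $f }
                 split /<br>/, $fragment;

    # Positions 1, 6 and 7 are the empty lines the attached code discards.
    my %info;
    @info{qw(headline gender age cityregion country lastlogin)}
        = @fields[ 0, 2, 3, 4, 5, 8 ];
    for my $key ( keys %info ) {
        $info{$key} = '' unless defined $info{$key};
    }

    # Keep only the leading number of the age and the date of the last login.
    $info{age}       = $info{age}       =~ /^(\d+)/                  ? $1 : '';
    $info{lastlogin} = $info{lastlogin} =~ m{Last Login:\s+([\d/]+)} ? $1 : '';

    # The greedy first group makes the split happen at the LAST comma.
    @info{qw(city region)} = ( '', '' );
    if ( $info{cityregion} =~ /^(.*),\s*(.*)$/ ) {
        @info{qw(city region)} = ( $1, $2 );
    }

    return %info;
}

# Made-up sample input in the assumed field order.
my %info = parse_basic_info_fragment(
    'Some headline<br><br>Male<br>27 years old<br>Berlin, Germany'
  . '<br>Germany<br><br><br>Last Login: 6/20/2007'
);
print "$_: $info{$_}\n" for sort keys %info;
```

Because the first capture group is greedy, a value such as "Frankfurt, Hesse, DE" ends up with everything before the last comma as the city, which matches the "might not work" caveat in the POD.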
From: olaf [...] wundersolutions.com
On Sat Jun 23 10:20:20 2007, tobiesch wrote:
> Hi,
>
> sorry to "spam" the bug reports but seems the easiest way to contact.
> I've fixed the methods that retrieve the comments from a profile and I
> also added a function that gets some basic info from the profile (age,
> gender, country, etc). I attach the code in the file for you to look at.
> If you want to incorporate it, I could probably check it into your CVS
> (given a login) and then you can adapt it as I'm sure there is room for
> improvement.
>
> best, tobias

Hi Tobias,

Thanks very much for this! Can you have a look at the cache_friend method in WWW::Myspace::Data? This already does some of the basic info that you're fetching here. I know bands have different info than the info on personal pages. If we can cover everything in your new method, we could add it to WWW::Myspace and I can have WWW::Myspace::Data::cache_friend access this new method that you've written.

All the best,
Olaf
Subject: Re: [rt.cpan.org #27707] fixed method for comments, added new method
Date: Sat, 23 Jun 2007 10:42:39 -0700
To: bug-WWW-Myspace [...] rt.cpan.org
From: Grant Grueninger <grantg [...] spamarrest.com>
Hi Tobias,

To echo Olaf, thanks very much! (and see below).

On Jun 23, 2007, at 7:20 AM, via RT wrote:
> sorry to "spam" the bug reports but seems the easiest way to contact.

This is the right place to submit fixes - otherwise they get lost. :)

> I've fixed the methods that retrieve the comments from a profile and I
> also added a function that gets some basic info from the profile (age,
> gender, country, etc). I attach the code in the file for you to look at.
> If you want to incorporate it, I could probably check it into your CVS
> (given a login) and then you can adapt it as I'm sure there is room for
> improvement.

I would like to give you a login. Here are the basic developer guidelines:

- Sign up for a sourceforge account and send me your username.
- You'll be added to the development mailing list so you'll get commit notices and email regarding the module
- IMPORTANT: We try to keep the module in a releasable state at all times because we frequently need to make a change to react to a myspace change. This means if you're developing a new method, don't commit it until it at least passes syntax checks. If it's incomplete, make sure the POD says so clearly and it's in the "IN PROGRESS" section.
- In-development methods go at the bottom of the module in the "IN PROGRESS" section.
- Put new working methods in the appropriate section.
- Test your POD:
  - Check the formatting
  - Make sure someone reading your documentation will know how to use your method without looking at the code.
  - Make sure the documentation is complete and accurate
- Write a test for your method.
  - You have to debug your method anyway - write a test while you're at it. Tests are just simple debugging scripts that call your method and look for results. See the existing tests in the "t" folder and add a new one. Ask if you need help.

Current issues:

- We are working on migrating all regular expressions to the "regex" hash near the top of Myspace.pm. All regular expressions should be accessed using the "_apply_regex" method, or the "_regex" method if appropriate. See the "is_band" and "is_inactive" methods for examples. (Note that this migration is under a bit of debate, since it's easier to maintain the module if the REs are in the method itself, but due to potential upcoming myspace changes we've generally decided that handling them through centralized methods will probably be necessary soon).
- Myspace recently added multiple-language versions. We're requiring that users set their account information to use English for now. The login method has been updated to work based on URL, which is (currently) language-independent. For internationalization (i18n) we may adopt use of language-specific regex (hence the start of the above migration), or we may have the module automatically switch the user's country on login and switch it back on logout if possible.

Other notes:

- Olaf and I use BBedit. If you use BBedit, turn on "Auto-Expand Tabs". You'll notice the blank placeholder subroutines in the module that denote sections. These appear conveniently in the subroutine list in BBedit. If you have a Mac and don't use BBedit, try the demo: www.barebones.com
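
As an illustration of the "write a test for your method" guideline, here is a minimal sketch of what such a script in the "t" folder might look like. It is an assumption-laden example: extract_comments is a hypothetical stand-in defined inline so the script runs without a Myspace login, the canned HTML is made up, and the real tests in the distribution may be organized differently.

```perl
use strict;
use warnings;
use Test::More tests => 5;

# Hypothetical stand-in for the method under test. It mirrors the pattern
# from the submitted comment parser but matches without modifying the page.
sub extract_comments {
    my ($html) = @_;
    my @comments;
    while ( $html =~ m{UserID=([0-9]+).*?<h4>(.*?)</h4>\s*(.*?)\s*</textarea>}sg ) {
        push @comments, { sender => $1, date => $2, comment => $3 };
    }
    return @comments;
}

# Canned page fragment (made up) instead of a live request.
my $html = join "\n",
    '<a href="profile?UserID=101">Alice</a><h4>Jun 20 2007 9:15A</h4>',
    '  Nice page!  </textarea>',
    '<a href="profile?UserID=202">Bob</a><h4>Jun 21 2007 4:02P</h4>',
    'See you at the show</textarea>';

my @comments = extract_comments($html);
my @none     = extract_comments('no comments here');

is( scalar @comments,      2,                   'found both comments' );
is( $comments[0]{sender},  '101',               'first sender id' );
is( $comments[0]{comment}, 'Nice page!',        'surrounding whitespace trimmed' );
is( $comments[1]{date},    'Jun 21 2007 4:02P', 'second comment date' );
is( scalar @none,          0,                   'no match gives an empty list' );
```

Testing against canned input keeps the script repeatable; a test that exercises the real method would additionally need a logged-in account.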

> =head2 get_basic_info_on_page( $friends_page );
>
> This routine takes the SOURCE CODE of an HTML page and returns
> a hash of information containing:
> country - country in profile (names of countries are standardised on MySpace)
> cityregion - the line with city and region information (this is free text)
> headline - whatever it says next to the picture
> age - as number
> gender - as text, either male or female
> lastlogin - date of last login
>
> Note: MySpace joins the profile data from city and region to one line (such as Berlin, Germany).
> However, both city and region are free text so people can write whatever they want. What is more,
> region is optional. This function tries to extract the city and the region by splitting cityregion
> at the last comma. However, it might not work (depending on the profile information) so both city
> and region can be empty.
> city - city
> region - region
>
> =cut

I think the formatting here got messed up in the email. If it formats ok when you "perldoc Myspace.pm" then maybe you can just commit it once I have your sourceforge info set up.

> sub get_basic_info_on_page {
>     my ( $page ) = @_;
>
>     ##THIS IS LANGUAGE DEPENDENT SO SITE HAS TO BE ACCESSED IN ENGLISH!!!
>     #my $BASIC_INFO = 'Table2".*?>.*<td.*?>(.*?)Last Login';
>     my $BASIC_INFO = 'Table2".*?>(.*Last Login:.*?)<br>';
>     $BASIC_INFO = qr/$BASIC_INFO/smo;
>
>     #my $time=time;
>     #matching does take quite long... (around 6s)
>     $page =~ /$BASIC_INFO/;

This is probably a good candidate for the regex hash and associated methods.
We try to use all-caps variable names for global variables.
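
To make the idea of a centralized regex hash concrete, here is a rough sketch of the general technique. It does not reproduce the module's actual "regex" hash, "_apply_regex" or "_regex" methods, whose signatures are not shown in this ticket; %regex, apply_regex and its named arguments are hypothetical names chosen for illustration, and the two patterns are copied from the submitted code.

```perl
use strict;
use warnings;

# Hypothetical central table of patterns, keyed by name. Keeping them in one
# place means a site change (or a new language) touches only this hash.
my %regex = (
    basic_info    => qr/Table2".*?>(.*Last Login:.*?)<br>/s,
    comment_count => qr/Listing [\d-]+ of (\d+)/,
);

# Hypothetical accessor: look up a pattern by name and apply it to a string.
# Returns the captures in list context, the first capture in scalar context,
# and nothing if the pattern does not match.
sub apply_regex {
    my (%args) = @_;
    my $re = $regex{ $args{regex} }
        or die "Unknown regex name: $args{regex}";

    my @captures = $args{source} =~ $re;
    return unless @captures;
    return wantarray ? @captures : $captures[0];
}

my $count = apply_regex(
    regex  => 'comment_count',
    source => 'Listing 1-50 of 137',
);
print "Reported comments: $count\n";
```

Callers then refer to patterns only by name, so a method like get_basic_info_on_page would not need its own all-caps pattern variable.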

> #STILL NEEDS UPDATE IN CASE OF MORE THAN 1000 COMMENTS...
> #ALSO SEEMS LIKE SOMETIMES MYSPACE REPORTS MORE COMMENTS (OR FRIENDS)
> #THAN ARE ACTUALLY THERE...

Yup. Myspace almost never reports the correct number of friends or comments. Usually, this is due to deleted accounts not being subtracted from your friend count. But, in some profiles, if you approve friend requests using the checkboxes and "approve selected friends" button, it will *decrease* your friend count by the number of friends you approve.

Grant
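
Since the reported total is unreliable, one option is to treat it as a hint rather than a hard requirement. The sketch below shows that idea only; count_mismatch_warning is a hypothetical helper, not something the module provides, and whether to warn or fail on a mismatch is a design choice this ticket does not settle.

```perl
use strict;
use warnings;

# Hypothetical helper: compare the number of comments actually collected
# with the total the page claims. Returns an empty string when they agree,
# otherwise a message the caller can log or store as a warning.
sub count_mismatch_warning {
    my ( $collected, $reported ) = @_;

    return '' if $collected == $reported;

    my $kind = $collected < $reported ? 'fewer' : 'more';
    return "Collected $collected comments, $kind than the $reported reported";
}

# Example: the page claims 50 comments but only 48 could be read.
my $warning = count_mismatch_warning( 48, 50 );
print "$warning\n" if $warning;
```

A caller could still return the collected comments and attach the message, instead of discarding everything when the two numbers differ.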
From: tobiesch [...] users.sourceforge.net
Hi Grant and Olaf,

thanks for the positive feedback.

I am a bit busy right now but give me some days and I will try to incorporate all your comments in the code and submit it to the current CVS version.

Thanks for the login, my sourceforge user name is tobiesch(@users.sourceforge.net).

best, tobias
Subject: Re: [rt.cpan.org #27707] fixed method for comments, added new method
Date: Sun, 24 Jun 2007 23:03:59 -0700
To: bug-WWW-Myspace [...] rt.cpan.org
From: Grant Grueninger <grantg [...] spamarrest.com>
Hi Tobias,

I've added you as a developer. Log into sourceforge and you should see www-myspace in your "my projects" list. Limited developer documentation is in the Wiki at http://www-myspace.wiki.sourceforge.net/

You can poke around your menu options in there. We aren't using most of the features, and the release on sourceforge is very old (too hard to do a release there so we just do it on CPAN). You should also have svn commit access.
Added to version 0.66