
This queue is for tickets about the WWW-Google-SiteMap CPAN distribution.

Report information
The Basics
Id: 30592
Status: resolved
Priority: 0/
Queue: WWW-Google-SiteMap

People
Owner: Nobody in particular
Requestors: bryn.dole [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: WWW-Google-SiteMap bug: duplicate URLs
Date: Thu, 8 Nov 2007 15:37:50 -0800
To: bug-WWW-Google-SiteMap [...] rt.cpan.org
From: "Bryn Dole" <bryn.dole [...] gmail.com>
I get duplicate URLs in my sitemap. Here is an easy fix to urls() in WWW-Google-SiteMap-1.09/lib/WWW/Google/SiteMap.pm

Maybe the bug is in the crawler part, but this is an easy fix.

Bryn

    sub urls {
        my $self = shift;
        $self->{urls} = \@_ if @_;
        my %hist;
        my @urls = grep { ref($_) && defined $_->loc && !$hist{$_->loc}++ }
            @{$self->{urls}};
        return wantarray ? @urls : \@urls;
    }
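The filtering idea in the proposed urls() can be tried on its own, outside the module. The following is a minimal sketch, not the module's code: `Demo::Url` is a hypothetical stand-in class that models only a `loc` accessor, and `dedupe_by_loc` is an illustrative helper name. It keeps the first entry seen for each `loc` value and preserves input order.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a sitemap URL entry; only loc() is modelled.
package Demo::Url;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub loc { $_[0]{loc} }

package main;

# Keep the first entry for each loc value, in input order.
# Non-references and entries without a loc are dropped, as in the patch.
sub dedupe_by_loc {
    my @entries = @_;
    my %seen;
    return grep { ref($_) && defined $_->loc && !$seen{ $_->loc }++ } @entries;
}

my @entries = map { Demo::Url->new(loc => $_) } qw(
    http://example.com/
    http://example.com/a
    http://example.com/
);

my @unique = dedupe_by_loc(@entries);
print scalar(@unique), " unique entries\n";
print $_->loc, "\n" for @unique;
```

Note that a filter of this kind hides duplicates at read time only; the stored list is left unchanged.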
Subject: Re: [rt.cpan.org #30592] AutoReply: WWW-Google-SiteMap bug: duplicate URLs
Date: Thu, 8 Nov 2007 16:28:45 -0800
To: bug-WWW-Google-SiteMap [...] rt.cpan.org
From: "Bryn Dole" <bryn.dole [...] gmail.com>
Never mind. It was pilot error. It looks like the sitemap.gz file gets appended to, and this is why I was seeing dups.

Bryn