Subject: SiteMap parsing doesn't report failure reason, upon failure
Date: Thu, 20 Jul 2006 13:43:48 +0200
To: bug-WWW-Google-SiteMap [...] rt.cpan.org
From: Vinko Vrsalovic Bolte <vinko.vb [...] raditech.es>
Reproducing the error:
Use a SiteMap with a misspelled tag; for instance, change "changefreq" to
"changefrec". Example (let's call this file badsitemap.xml):
<?xml version="1.0" encoding="UTF-8"?><urlset
xmlns="http://www.google.com/schemas/sitemap/0.84">
<url><loc>http://www.readymade.com/foo/bar?code=000001457</loc><lastmod>2006-05-30</lastmod><changefrec>weekly</changefrec></url>
<url><loc>http://www.readymade.com/foo/bar?code=005009420</loc><lastmod>2006-02-21</lastmod><changefrec>weekly</changefrec></url>
</urlset>
Feed it to this program:
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Google::SiteMap;

print STDERR "Parsing ".$ARGV[0]."\n";
my $map = WWW::Google::SiteMap->new(
    file => $ARGV[0],
);
$map->read();
foreach my $url ($map->urls()) {
    print $url->loc();
    print "\n";
}
by using this command (let's assume you named the script
parseSitemap.pl):
$ perl parseSitemap.pl badsitemap.xml
Parsing badsitemap.xml
Could not parse badsitemap.xml
at /usr/lib/perl5/site_perl/5.8.5/WWW/Google/SiteMap.pm line 131, <GEN0>
line 4.
The trailing "<GEN0> line 4" gives no hint about what actually went wrong.
However, if we change line 131 of SiteMap.pm from
$twig->safe_parse(join('',$fh->getlines)) || die "Could not parse $file";
to
$twig->safe_parse(join('',$fh->getlines)) || die "Could not parse $file due to $@";
we get a clearer error description:
$ perl parseSitemap.pl badsitemap.xml
Parsing badsitemap.xml
Could not parse badsitemap.xml due to Can't locate object method
"changefrec" via package "WWW::Google::SiteMap::URL"
at /usr/local/share/perl/5.8.7/WWW/Google/SiteMap.pm line 119, <GEN0>
line 4.
This points in the right direction when tracking down the bug.
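To illustrate why appending $@ helps, here is a minimal, self-contained sketch of the pattern. It does not use XML::Twig or WWW::Google::SiteMap; fragile_parse, safe_parse_sketch and describe_failure are hypothetical names invented for this example, and the stub simply dies with a shortened version of the message seen above. The assumption is that a "safe" parse runs the real parse inside eval, returns false on failure, and leaves the reason in $@.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a parser whose handlers die on an unknown tag.
# The trailing newline keeps Perl from appending "at ... line N".
sub fragile_parse {
    my ($text) = @_;
    die qq{Can't locate object method "changefrec"\n} if $text =~ /<changefrec>/;
    return 1;
}

# Hypothetical helper: run the parse inside eval and return 1 or 0.
# On failure the reason stays in $@ for the caller to read.
sub safe_parse_sketch {
    my ($text) = @_;
    return eval { fragile_parse($text); 1 } ? 1 : 0;
}

# Build the error message, including the reason captured in $@.
# Returns undef (in scalar context) when parsing succeeds.
sub describe_failure {
    my ($file, $text) = @_;
    return if safe_parse_sketch($text);
    chomp(my $reason = $@);
    return "Could not parse $file due to $reason";
}

print describe_failure('badsitemap.xml', '<changefrec>weekly</changefrec>'), "\n";
```

Run as is, this prints the file name followed by the captured reason. Dropping "due to $reason" from the message reproduces the original, uninformative behaviour.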
Regards,
Vinko