
This queue is for tickets about the File-Fetch CPAN distribution.

Report information
The Basics
Id: 18942
Status: resolved
Priority: 0/
Queue: File-Fetch

People
Owner: Nobody in particular
Requestors: grousse [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: (no value)
Fixed in: (no value)



Subject: unproper handling of http errors in external handlers
The following test case claims a valid download, whereas the given URL returns a 404 error, because of lynx handler. Blacklisting is enough to fix the issue.

As a side note:
- the wget handler creates an empty file, while returning an error
- the curl handler does not create a file, while returning a success
Subject: test_fetch.pl
#!/usr/bin/perl

use File::Fetch;
use strict;

my $url = 'http://search.cpan.org/CPAN/authors/id/A/AD/ADAMK/Config-Tiny-2.06.tar.bz2';
my $ff = File::Fetch->new(uri => $url);
my $result = $ff->fetch();
if ($result) {
    print "OK\n";
} else {
    print "NOK\n";
}
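The blacklisting workaround mentioned in the report can be sketched roughly as follows. This is a minimal sketch, not the maintainer's fix: it assumes the `$File::Fetch::BLACKLIST` package variable accepts `lynx` as an entry, and the helper names `blacklist_with` and `fetch_without_lynx` are hypothetical. It also treats an empty or missing output file as a failure, which is an extra assumption meant to cover the curl side note.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Fetch;

# Hypothetical helper: the current blacklist plus extra entries,
# compared case-insensitively so nothing is listed twice.
sub blacklist_with {
    my @extra = @_;
    my %seen;
    return [ grep { !$seen{ lc $_ }++ }
             @{ $File::Fetch::BLACKLIST || [] }, @extra ];
}

# Hypothetical helper: fetch $url into $dir with the lynx handler disabled.
# Returns the path of the downloaded file, or nothing on failure.
sub fetch_without_lynx {
    my ($url, $dir) = @_;

    # 'local' restores the previous blacklist when this sub returns.
    local $File::Fetch::BLACKLIST = blacklist_with('lynx');

    my $ff    = File::Fetch->new(uri => $url) or return;
    my $where = $ff->fetch(to => $dir)        or return;

    # Assumption: a missing or zero-byte file counts as a failed fetch.
    return unless -s $where;
    return $where;
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    print fetch_without_lynx($ARGV[0], '.') ? "OK\n" : "NOK\n";
}
```

Run with the URL from the test case as its only argument, this should print NOK as long as the remaining handlers report the 404 correctly.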
On Wed Apr 26 16:41:59 2006, GROUSSE wrote:
> The following test case claims a valid download, whereas the given URL
> returns a 404 error, because of lynx handler. Blacklisting is enough to
> fix the issue.
>
> As a side note:
> - the wget handler creates an empty file, while returning an error
> - the curl handler does not create a file, while returning a success
Thanks for the report. I've made the following changes, in the hope of fixing a few of these issues:

* the wget handler, on a failed attempt, now unlinks its output file
* the curl handler is updated to follow '302 moved' and suchlike status messages
* lynx use is further discouraged, as it doesn't communicate http status messages back to the caller at all.

Full patch below.

==== //member/kane/file-fetch/lib/File/Fetch.pm#20 - /Users/kane/sources/p4/other/file-fetch/lib/File/Fetch.pm ====
411a413,416
> ### wget creates the output document always, even if the fetch
> ### fails.. so unlink it in that case
> 1 while unlink $to;
>
505a511,517
> ### XXX on a 404 with a special error page, $captured will actually
> ### hold the contents of that page, and make it *appear* like the
> ### request was a success, when really it wasn't :(
> ### there doesn't seem to be an option for lynx to change the exit
> ### code based on a 4XX status or so.
> ### the closest we can come is using --error_file and parsing that,
> ### which is very unreliable ;(
587c599,601
< push @$cmd, '--fail', '--output', $to, $self->uri;
---
> ### curl doesn't follow 302 (temporarily moved) etc automatically
> ### so we add --location to enable that.
> push @$cmd, '--fail', '--location', '--output', $to, $self->uri;
864a879,891
> =head2 I used 'lynx' to fetch a file, but its contents is all wrong!
>
> C<lynx> can only fetch remote files by dumping its contents to C<STDOUT>,
> which we in turn capture. If that content is a 'custom' error file
> (like, say, a C<404 handler>), you will get that contents instead.
>
> Sadly, C<lynx> doesn't support any options to return a different exit
> code on non-C<200 OK> status, giving us no way to tell the difference
> between a 'successfull' fetch and a custom error page.
>
> Therefor, we recommend to only use C<lynx> as a last resort. This is
> why it is at the back of our list of methods to try as well.
>
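
The wget and curl hunks in the patch share one pattern: run an external tool, trust its exit status, and leave no partial output behind on failure. The sketch below illustrates that pattern outside of File::Fetch. It is not the module's actual code: `curl_command` and `run_fetcher` are hypothetical helper names, and treating an empty output file as a failure is an added assumption. The curl flags are the ones shown in the patch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the curl argument list from the patched line.
# --fail     makes curl exit non-zero on HTTP errors such as 404
# --location makes curl follow redirects such as 302
sub curl_command {
    my ($uri, $to) = @_;
    return ('curl', '--fail', '--location', '--output', $to, $uri);
}

# Hypothetical helper: run @cmd, which is expected to write to $to.
# Returns $to on success; on failure removes $to and returns nothing.
sub run_fetcher {
    my ($to, @cmd) = @_;

    # List-form system() bypasses the shell; 0 means the tool succeeded.
    my $ok = system(@cmd) == 0;

    # Assumption: a missing or zero-byte file also counts as a failure.
    $ok &&= -s $to;

    unless ($ok) {
        1 while unlink $to;    # same idiom as the wget hunk
        return;
    }
    return $to;
}

# Only touch the network when a URL and a target file are given.
if (@ARGV == 2) {
    my ($uri, $to) = @ARGV;
    print run_fetcher($to, curl_command($uri, $to)) ? "OK\n" : "NOK\n";
}
```

Because the cleanup lives in one place, any external handler that writes to a file could be wrapped the same way. lynx does not fit this pattern, since its exit status does not reflect the HTTP status at all.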