Subject: | cached downloads to avoid unneeded HTTP GETs |
If you are using a remote repository (a PAR::Repository::Client
object), each module you require can trigger an HTTP GET of the .par
file from the repository. If you require a large number of such
modules from a repo, all those GETs take too much time. Even though
they are conditional GETs, so only the first one actually downloads
content, the cumulative latency of all the HTTP round trips is still
significant.
The attached patch caches the download result to avoid the extra HTTP
GETs.
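The idea behind the patch can be sketched as a simple memoization wrapper. This is a standalone illustration, not the patched code itself: `fetch_file` is a hypothetical stand-in for the client's real `_fetch_file` (which uses LWP::Simple::mirror() for the conditional GET), and the GET counter exists only to show that repeated requests for the same local file hit the network once.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the conditional-GET fetch; in the real
# client this is LWP::Simple::mirror() plus timeout/error handling.
my $gets = 0;
sub fetch_file {
    my ($url, $local_file) = @_;
    $gets++;              # count simulated HTTP GETs
    return $local_file;   # pretend the mirror succeeded
}

# Memoize successful fetches so each file is GET'd at most once per run.
my %fetched_already;
sub fetch_file_cached {
    my ($url, $local_file) = @_;
    return $fetched_already{$local_file}
        if exists $fetched_already{$local_file};
    my $result = fetch_file($url, $local_file) or return;
    return $fetched_already{$local_file} = $result;
}

# Three requests for the same file issue only one GET.
fetch_file_cached("http://repo/foo.par", "/tmp/foo.par") for 1 .. 3;
print "GETs: $gets\n";   # prints "GETs: 1"
```

Note that only successful fetches are cached, so a transient failure is retried on the next require rather than being remembered as a miss.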
-Ken
Subject: | fetch-once.diff |
--- lib/PAR/Repository/Client/HTTP.pm.orig 2012-10-12 10:47:04.000000000 -0500
+++ lib/PAR/Repository/Client/HTTP.pm 2012-10-12 11:43:54.000000000 -0500
@@ -85,6 +85,7 @@
{
my %escapes;
+ my %fetched_already;
sub _fetch_file {
my $self = shift;
$self->{error} = undef;
@@ -99,6 +100,16 @@
$local_file =~ s/([^\w\._])/$escapes{$1}/g;
$local_file = File::Spec->catfile( $self->{cache_dir}, $local_file );
+ # Each module you require from a repo will get you here if the
+ # repo is checked for that module. If you're require'ing a large
+ # number of such modules from a repo, all those
+ # LWP::Simple::mirror() GETs take too much time. Even though
+ # they're conditional GETs, so only the first one actually
+ # downloads content, the latency of all the http GETs is still
+ # significant. So, cache the download result to avoid
+ # HTTP-GET'ing the same file more than once.
+ return $local_file if $fetched_already{$local_file};
+
my $timeout = $self->{http_timeout};
my $old_timeout = $ua->timeout();
$ua->timeout($timeout) if defined $timeout;
@@ -109,7 +120,7 @@
return();
}
- return $local_file if -f $local_file;
+ return $fetched_already{$local_file} = $local_file if -f $local_file;
return();
}
}