Subject: | during updating of local repository, older module versions were not being removed |
First of all, thank you for developing CPAN::Mini! It helps a loot,
since I don't have Internet connection available sometimes.
Today I decided to update my local CPAN, that was made a long time ago
(July of 2007) and was located in a CDROM.
I justed copied it to a directory in my Cygwin, updated to the last
version of CPAN::Mini and start the update process (which is indeed much
faster than fetching everything again).
The problem is that some old modules were not removed in the process. I
got the size of the local repository and in fact was bigger than a CDROM
can hold!
I looked for bugs and found one related to my problem. Since the issue
seems to be still broken (at least as shown in RT) I made a simple
script, nothing more than a hack, to find the repeated modules and
remove than.
It's quite complicated to make 100% of the job, so this script
(attached) does only part of the removing older versions. Due the
complete lack of formal specification about publishing a module, it
seems impossible to match all module names and versions and there are
author that put their modules inside directories, others that use ZIPed
files or TGZ instead of tar.gz in the tarballs.
The script will just skip those cases and keep working. It will check
the last modification date of each release and remove all of them except
the most recent.
I hope the script helps others until the problem in CPAN::Mini is fixed.
I would be glad to help fixing it too.
Regards,
Alceu
Subject: | fix.pl |
#!/usr/bin/perl
use warnings;
use strict;
use Cwd;
use File::Spec;
use File::stat;
my $previous_dir;
my $regex = '^([\w\-]+\w)(\-[\d\.\-]+)((\.tar\.gz)|(\.zip)|(\.tgz))$';
chdir File::Spec->catdir( 'authors', 'id' );
my $root_dir = cwd();
my $ids = read_dir($root_dir);
foreach my $id ( @{$ids} ) {
chdir($id) or die 'Cannot change from ' . cwd() . " to $id: $!\n";
my $intermediates = read_dir( cwd() );
foreach my $intermediate ( @{$intermediates} ) {
chdir($intermediate)
or die 'Cannot change from ' . cwd() . "to $intermediate: $!\n";
my $authors = read_dir( cwd() );
foreach my $author ( @{$authors} ) {
chdir($author)
or die 'Cannot change from ' . cwd() . "to $author: $!\n";
my $modules = read_dir( cwd() );
my %table;
foreach my $module ( @{$modules} ) {
next if $module eq 'CHECKSUMS';
if ( -d $module ) {
print "found a dir in $author\n";
next;
}
get_module( $module, $author, \%table );
}
foreach my $module ( keys(%table) ) {
if ( scalar( keys( %{ $table{$module} } ) ) > 1 ) {
print "there are repeated modules in $author\n";
# sort by last modification
my @sorted =
sort { $b <=> $a } keys( %{ $table{$module} } );
#removes the higher value (most recent module)
shift(@sorted);
foreach my $last_mod (@sorted) {
unlink $table{$module}->{$last_mod}
or die
"Cannot remove $table{$module}->{$last_mod}: $!\n";
}
}
}
# going down one level
chdir('..') or die "Cannot go down one level: $!\n";
}
chdir('..') or die "Cannot go down one level: $!\n";
}
chdir('..') or die "Cannot go down one level: $!\n";
}
sub read_dir {
my $path = shift;
my @list;
opendir( DIR, $path ) or die "Cannot read $path: $!\n";
foreach ( readdir(DIR) ) {
next if $_ eq '.';
next if $_ eq '..';
push( @list, $_ );
}
return \@list;
}
sub get_module {
my $module = shift;
my $author = shift;
my $table_ref = shift;
if ( $module =~ /$regex/io ) {
#my ( $name, $version, $extension ) = ( $module =~ /$regex/io );
my $name = ( $module =~ /$regex/io )[0];
my $file_props = stat($module) or die "Cannot stat $module: $!\n";
if ( exists( $table_ref->{$name} ) ) {
$table_ref->{$name}->{ $file_props->mtime() } = $module;
}
else {
$table_ref->{$name} = { $file_props->mtime() => $module };
}
}
else {
print "$module of $author does not match regex\n";
}
}