This queue is for tickets about the File-Rsync-Mirror-Recent CPAN distribution.

Report information
The Basics
Id: 123180
Status: open
Priority: 0/
Queue: File-Rsync-Mirror-Recent

People
Owner: Nobody in particular
Requestors: ask [...] perl.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Add rrr-client option to not process "-Z" file
Date: Mon, 2 Oct 2017 20:11:11 -0700
To: Andreas Koenig via RT <bug-File-Rsync-Mirror-Recent [...] rt.cpan.org>
From: Ask Bjørn Hansen <ask [...] perl.org>
If you do a full rsync initially and after any pause in syncing shorter than a year (with 1Y files), it’d be safe to not process the Z file. For CPAN this would save a lot of memory. A variation would be an option to skip changes older than X (and to skip loading the next RECENT file if the previous one already included “X time ago”). (I’m setting up a new backup www.cpan.org origin and trying to make it work on a VM with not much memory.)
Hmmm, just to make sure that there is no misunderstanding: the running rrr client never touches the Z file after the initial pass through it. Except if somebody would mark the whole installation as dirty which has never happened so far. Here are the two processes on my running client:

    # ps auxww| grep rer|grep -v grep
    root 29318 0.0 0.0 61928 2352 pts/6 S+ Apr27 0:00 sudo /home/src/perl/repoperls/installed-perls/host/k93jessie/v5.24.1/2397/bin/perl -I /home/k/sources/rersyncrecent/lib /home/k/sources/CPAN/andk-cpan-tools/bin/testing-rmirror.pl pull
    root 29319 0.0 0.1 55144 8988 pts/6 S+ Apr27 142:39 /home/src/perl/repoperls/installed-perls/host/k93jessie/v5.24.1/2397/bin/perl -I /home/k/sources/rersyncrecent/lib /home/k/sources/CPAN/andk-cpan-tools/bin/testing-rmirror.pl pull

You see moderate memory consumption.
Subject: Re: [rt.cpan.org #123180] Add rrr-client option to not process "-Z" file
Date: Tue, 3 Oct 2017 01:14:50 -0700
To: Andreas Koenig via RT <bug-File-Rsync-Mirror-Recent [...] rt.cpan.org>
From: Ask Bjørn Hansen <ask [...] perl.org>
Hmm, where does it keep local state? I started it yesterday and until I got more memory it kept quitting with an OOM error. It got more memory and finished, I think, but stuck around using ~400-600MB of memory. It also appears to have missed some files (ironically the report I got was that the 0.4.4 update was missing :-) ). Maybe because it was being quit[1]. I had to reboot the system for other reasons, so it’s running again now (working on the 1Y batch right now).

[1] I know we usually use another word, but I’ve been reading way too much news the last 24 hours to want to use that…
The *initial* iteration requires memory; all subsequent iterations require very little. To provide the initial memory, I believe swap space should do. The forking that you see going on returns the memory to the operating system, so that the long-running process stays frugal. In other words, it is as you diagnosed: the required amount of state after the initial iteration through the Z file is minuscule, and that's already reflected in the behaviour of the rrr-client. At least that's how it should behave unless there is a bug somewhere.
Subject: Re: [rt.cpan.org #123180] Add rrr-client option to not process "-Z" file
Date: Tue, 3 Oct 2017 10:19:03 -0700
To: Andreas Koenig via RT <bug-File-Rsync-Mirror-Recent [...] rt.cpan.org>
From: Ask Bjørn Hansen <ask [...] perl.org>
> On Oct 3, 2017, at 2:23, Andreas Koenig via RT <bug-File-Rsync-Mirror-Recent@rt.cpan.org> wrote:
>
> In other words. It is as you diagnosed: the required amount of state after the initial iteration through the Z file is miniscule and that's already reflected in the behaviour of the rrr-client. At least that's how it should behave unless there is a bug somewhere.
Yeah, that’s how it works elsewhere, too. :-) I’m running it under Docker, though I don’t see how it’d make a difference. Pid 1 (the first process) is nice and tiny, but the second process seems to “live forever” so the memory doesn’t get reclaimed. I’ll try getting a full log out. This is how I run it:

    docker run -v /cpan:/cpan quay.io/perl/cpanorg:latest /cpan/rrr-client

where /cpan/rrr-client is

    #!/bin/sh
    exec </dev/null
    exec 2>&1
    exec rrr-client \
      --verbose \
      --source cpan-rsync.perl.org::CPAN/RECENT.recent \
      --target /cpan/CPAN/ \
      --tmpdir /cpan/tmp/

(My goal here was to make it really quick to set up another server with an up-to-date mirror and a consistent http configuration; I have another container that does the http part.)
Subject: Re: [rt.cpan.org #123180] Add rrr-client option to not process "-Z" file
Date: Tue, 3 Oct 2017 10:29:41 -0700
To: Andreas Koenig via RT <bug-File-Rsync-Mirror-Recent [...] rt.cpan.org>
From: Ask Bjørn Hansen <ask [...] perl.org>
Hmm, it looks like it’s still holding the memory because it didn’t finish the Z file? (Though it was last worked on ~8 hours ago.)

Last thousands of lines from the container: https://tmp.askask.com/2017/10/rrr-client-4000.log
All the logs (might include a restart or two): https://tmp.askask.com/2017/10/rrr-client.log
Subject: Re: [rt.cpan.org #123180] Add rrr-client option to not process "-Z" file
Date: Tue, 3 Oct 2017 10:31:53 -0700
To: Andreas Koenig via RT <bug-File-Rsync-Mirror-Recent [...] rt.cpan.org>
From: Ask Bjørn Hansen <ask [...] perl.org>
Oh - last mail for now: the pid of the second process is changing every few minutes, but it remains a (very) large process. I also got a few monitoring alerts through the night that it hadn’t been updating promptly (properly reflected in the logs too).
CC: undisclosed-recipients:;
Subject: Re: [rt.cpan.org #123180] Add rrr-client option to not process "-Z" file
Date: Tue, 03 Oct 2017 22:04:39 +0200
To: "ask\ [...] perl.org via RT" <bug-File-Rsync-Mirror-Recent [...] rt.cpan.org>
From: Andreas Koenig <andreas.koenig.7os6VVqR [...] franz.ak.mind.de>
I saw this in the log:

    Sync 1507021875 (240841/642218/Z) authors/id/D/DW/DWHEELER/App-Sqitch-0.994.meta ...

And there is where my language so far was not precise enough. If I remember correctly, the memory consumption goes down after the Z file has once been fully processed. This will be when you see something akin to:

    Sync NNNNNNNNNN (642218/642218/Z) authors/id/X/XX/XXXXXXXX/Xxx-Xxxxxx-N.NNN.xxxx ...

--
andreas
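The check described above (has the Z file been fully processed yet?) could be scripted roughly as follows. This is only a sketch: it assumes every progress line has the shape `Sync <epoch> (<current>/<total>/Z) <path> ...` seen in the excerpt, and the function name `z_progress` is made up for illustration, not part of rrr-client.

```shell
#!/bin/sh
# Hypothetical helper: read rrr-client --verbose output on stdin and report
# how far the most recent "Sync ... (current/total/Z)" line got.
# Assumes the line format quoted in this ticket; other formats are ignored.
z_progress() {
    awk '
        # Third field looks like "(240841/642218/Z)"
        $1 == "Sync" && $3 ~ /^\([0-9]+\/[0-9]+\/Z\)$/ {
            # Strip the surrounding parentheses, then split on "/"
            split(substr($3, 2, length($3) - 2), f, "/")
            cur = f[1]; total = f[2]
        }
        END {
            if (total == "") { print "no Z entries seen"; exit }
            printf "%d/%d %s\n", cur, total, (cur == total ? "complete" : "incomplete")
        }
    '
}
```

Piping the container output through it, for example `docker logs <container> 2>&1 | z_progress`, would print the last Z position and whether it reached the total.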
Subject: Re: [rt.cpan.org #123180] Add rrr-client option to not process "-Z" file
Date: Tue, 3 Oct 2017 13:27:24 -0700
To: Andreas Koenig via RT <bug-File-Rsync-Mirror-Recent [...] rt.cpan.org>
From: Ask Bjørn Hansen <ask [...] perl.org>
> On Oct 3, 2017, at 13:20, (Andreas J. Koenig) via RT <bug-File-Rsync-Mirror-Recent@rt.cpan.org> wrote:
>
> Sync 1507021875 (240841/642218/Z) authors/id/D/DW/DWHEELER/App-Sqitch-0.994.meta ...
Yeah, that was almost 11 hours ago. It looks like it’s not processing the Z file anymore. Maybe it is too slow and it keeps thinking it needs to process the 1h file and never gets back to the Z file?
Looks like a bug to me. Procrastination is not an allowed state. Looking around for possible tunings that might influence the likelihood of success, I find max_files_per_connection and rsync_options/timeout the most interesting ones. I made a release 0.4.5 with --max-files-per-connection and --rsync-timeout available on the command line. I'd try 1000000 and 600, respectively, on the idea that swapping may play a role, so a higher timeout may help. Raising the number of files per connection above the number of files we have means one can expect all files to go through in the first iteration.
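Applied to the /cpan/rrr-client wrapper from earlier in this ticket, the suggestion might look like the sketch below. The two option names and values come from the reply above; the `--option value` spelling is assumed to match the other options in the wrapper, and wrapping the call in a function named `run_rrr_client` is purely illustrative. It requires release 0.4.5 or later.

```shell
#!/bin/sh
# Hypothetical variant of the wrapper script quoted earlier in this ticket,
# with the two tunables from release 0.4.5 added. Values are the ones
# suggested in the reply, not tested recommendations.
run_rrr_client() {
    rrr-client \
        --verbose \
        --source cpan-rsync.perl.org::CPAN/RECENT.recent \
        --target /cpan/CPAN/ \
        --tmpdir /cpan/tmp/ \
        --max-files-per-connection 1000000 \
        --rsync-timeout 600 \
        "$@"
}
```

Calling `run_rrr_client` at the end of the script starts the client; any extra arguments are passed through unchanged.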
Subject: Re: [rt.cpan.org #123180] Add rrr-client option to not process "-Z" file
Date: Wed, 4 Oct 2017 22:40:57 -0700
To: "(Andreas J. Koenig) via RT" <bug-File-Rsync-Mirror-Recent [...] rt.cpan.org>
From: Ask Bjørn Hansen <ask [...] perl.org>
Cool, it’s running now with those options. We’ll see tomorrow how it’s working!