
This queue is for tickets about the SVN-Web CPAN distribution.

Report information
The Basics
Id: 17359
Status: resolved
Priority: 0/
Queue: SVN-Web

People
Owner: Nobody in particular
Requestors: dietrich.streifert [...] visionet.de
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in: 0.43
Fixed in: (no value)



Subject: Large number of commits (>1000) breaks SVN::Web
We have installed SVN::Web on Solaris 10 x86 (perl v5.8.7) and use it with mod_perl2 and apache2.

After we committed a huge number of files (>1000) to an svn repository, browsing that repository with SVN::Web leads to huge memory consumption in httpd (mod_perl2) of up to 300 MByte per process.

So accessing this repository through SVN::Web is no longer usable. Retrieving RSS feeds from the repository leads to a timeout.

I think this is due to the number of repository paths which are held in a hash.

Is there any solution for this (besides just not doing such big commits with svn)?

Best regards...
Subject: Re: [rt.cpan.org #17359] Large number of commits (>1000) breaks SVN::Web
Date: Tue, 31 Jan 2006 02:13:57 +0000
To: bug-SVN-Web [...] rt.cpan.org
From: Nik Clayton <nik [...] ngo.org.uk>
Guest via RT wrote:

> We have installed SVN::Web on Solaris 10 x86 (perl v5.8.7) and use it
> with mod_perl2 and apache2.
>
> After we have committed a huge number of files to svn (>1000) to a
> repository browsing with SVN::Web in this repository leads to huge
> memory consumption in httpd (mod_perl2) of up to 300 MByte per process.

Do you mean

a) You've got a repository with a large number of files in it, or

b) You have a commit that touched 1000's of files.

I assume you mean (b), but I want to be sure.

> So accessing this repository through SVN::Web isn't usable anymore.
> Retrieving RSS-Feeds from the repository leads to a timeout.
>
> I think this is due to the number of repository paths which are held in
> a hash.
>
> Any solution on this (beside of just not doing such big commits with svn)?

Which SVN::Web actions have the problem? You say SVN::Web::RSS. I would expect SVN::Web::Revision, and probably SVN::Web::Log too.

Does browsing the repository (i.e., just clicking on the links to files and directories) work?

If so I can probably whip up a patch that disables path generation after a configurable number of paths have been seen.

N
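The idea of stopping path generation after a configurable number of paths can be sketched in shell. This is only an illustration of the technique, not the SVN::Web patch: `limit_paths` is a hypothetical helper, the `/trunk/file.N` paths are made up, and a POSIX awk is assumed. In practice the input might come from something like `svnlook changed`.

```shell
#!/bin/sh
# Hypothetical helper (not part of SVN::Web): print at most $1 lines from
# stdin, then a one-line note saying how many lines were left out.
limit_paths() {
    awk -v limit="$1" '
        NR <= limit { print }
        END {
            if (NR > limit)
                printf "... (%d more paths not shown)\n", NR - limit
        }
    '
}

# Example: 1640 made-up paths, capped at 3.
awk 'BEGIN { for (i = 1; i <= 1640; i++) print "/trunk/file." i }' |
    limit_paths 3
```

Note that this sketch only bounds the output; the input is still read to the end so that the omitted paths can be counted.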
Subject: Re: [rt.cpan.org #17359] Large number of commits (>1000) breaks SVN::Web
Date: Tue, 31 Jan 2006 08:45:16 +0100
To: bug-SVN-Web [...] rt.cpan.org
From: Dietrich Streifert <dietrich.streifert [...] visionet.de>
Hello Nik!

First of all: thank you for your fast answer and your really nice work on SVN::Web.

Now to your questions:

Answer (b) is true: I committed about 1000 new files to the repository.

And yes, the problem is also in SVN::Web::Revision and SVN::Web::Log.

Browsing the repository is no problem.

As I understood from reading your source, you generate a hash of the touched paths for a specific revision. So I think this path is consuming a lot of memory (in my case httpd grows up to 300 MByte). Every time my development team members retrieved the RSS of the revision log of this repository, it ended up in 3 to 6 running httpd processes, which made the subversion server unresponsive.

So as a first measure, the patch with a configurable limit on path generation would be great. I don't know if this should be configurable globally or per action (class).

Another solution would be a paging option for SVN::Web::Log and SVN::Web::Revision, so that even commits with a huge number of paths can be browsed. But that may break your code.

Best

--
With kind regards
Dietrich Streifert
Visionet GmbH
Subject: Re: [rt.cpan.org #17359] Large number of commits (>1000) breaks SVN::Web
Date: Tue, 31 Jan 2006 08:51:19 +0100
To: bug-SVN-Web [...] rt.cpan.org
From: Dietrich Streifert <dietrich.streifert [...] visionet.de>
Hello Nik!

Just to have concrete numbers: in the commit, 1640 new files were added to the repository.

Best regards.

--
With kind regards
Dietrich Streifert
Visionet GmbH
Subject: Re: [rt.cpan.org #17359] Large number of commits (>1000) breaks SVN::Web
Date: Tue, 31 Jan 2006 12:07:01 +0000
To: bug-SVN-Web [...] rt.cpan.org
From: Nik Clayton <nik [...] ngo.org.uk>
Dietrich,

dietrich.streifert@visionet.de via RT wrote:

> Now to your questions:
>
> Answer (b) is true: I committed about 1000 new files to the repository.
>
> And yes the problem is also in SVN::Web::Revision and SVN::Web::Log.
>
> Browsing the repository is no problem.

Right, that makes sense. They all generate lists of changed files.

> As I understood by reading your source you generate a hash of the
> touched paths for a specific revision. So I think this path is consuming
> a lot of memory (in my case httpd grows up to 300 MByte). Every time my
> development team members retrieved the RSS of the revision log of this
> repository it ended up in 3 to 6 running httpd which made the subversion
> server unresponsive.

I'm not sure it's the hash. At least, not exactly. A hash with a few thousand small keys and values shouldn't consume 300MB of memory.

I suspect that SVN::Web isn't using Subversion's memory management functions correctly, so memory's not being freed as quickly as it should be. Memory usage for these functions should really be O(1), not O(n).

> So as a first measure the patch which has a configurable limit to the
> path generation would be great. I don't know if this should be
> configurable globally or per action (class).

I'll work on this over the next few days, and I'll send you a patch to test when I've got one.

N
Subject: Re: [rt.cpan.org #17359] Large number of commits (>1000) breaks SVN::Web
Date: Wed, 01 Feb 2006 11:25:39 +0000
To: bug-SVN-Web [...] rt.cpan.org
From: Nik Clayton <nik [...] ngo.org.uk>
Dietrich,

dietrich.streifert@visionet.de via RT wrote:

> Now to your questions:
>
> Answer (b) is true: I committed about 1000 new files to the repository.
>
> And yes the problem is also in SVN::Web::Revision and SVN::Web::Log.
>
> Browsing the repository is no problem.
>
> As I understood by reading your source you generate a hash of the
> touched paths for a specific revision. So I think this path is consuming
> a lot of memory (in my case httpd grows up to 300 MByte). Every time my
> development team members retrieved the RSS of the revision log of this
> repository it ended up in 3 to 6 running httpd which made the subversion
> server unresponsive.

I don't think it's the number of files that were committed that is the problem.

As a test, I just created a repository and committed 1,729 files to it in one commit. Then I configured SVN::Web to browse the repo. The ::Revision and ::Log actions are just as snappy as they should be, and memory usage is normal.

Could you confirm that you see the same thing? To do this:

  * Create a repo:

      mkdir /tmp/repo
      svnadmin create /tmp/repo/big-commit

  * Check out the repo:

      mkdir /tmp/checkout
      cd /tmp/checkout
      svn checkout file:///tmp/repo/big-commit
      cd big-commit

  * Run the attached 'create' script in the 'big-commit' directory.

      sh create

    This should create 1,700+ files in various subdirectories.

  * Do a recursive add of all the directories, and then commit them.

      svn add 1 2 3 4 5 6 7 8 9 10 11 12
      svn commit -m 'Big commit for testing'

  * Configure SVN::Web to browse the repo. Something like this in config.yaml will do.

      repos:
        big: '/tmp/repo/big-commit'

  * Now point your browser at the 'big' repo using SVN::Web, and verify that you can browse it, and that the ::Revision and ::Log actions are as fast as expected.

If you do all that, and things are as fast as expected, then it's not the number of files in the commit that's slowing things down. There must be some other factor in your commit that's resulting in the slowdown.

One other thing -- the testing I've just done is with svnweb-server, rather than Apache. If you try the above with Apache and it shows the problem, could you try it with svnweb-server, and see if that has a problem too?

If it does then there's something in your environment that's not in mine that's causing the problem.

If it doesn't then it's something different between Apache and svnweb-server.

Thanks for your help in tracking this down.

N
#!/bin/sh

for dir in 1 2 3 4 5 6 7 8 9 10 11 12; do
  for subdir in 1 2 3 4 5 6 7 8 9 10 11 12; do
    mkdir -p $dir/$subdir
    for file in 1 2 3 4 5 6 7 8 9 10 11 12; do
      touch $dir/$subdir/file.$file
    done
  done
done
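Before adding and committing, it can be useful to check how many files the 'create' script produces. The following is a minimal sketch that reruns the same three nested loops in a scratch directory and counts the result. The scratch directory, the variable names, and the use of `mktemp -d` (widely available, but not specified by POSIX) are assumptions of this sketch. As shown, the loops create 12 x 12 x 12 = 1,728 files, consistent with the "1,700+" mentioned in the steps.

```shell
#!/bin/sh
# Recreate the 12 x 12 x 12 tree from the 'create' script in a scratch
# directory (an assumption of this sketch) and count the files.
scratch=$(mktemp -d) || exit 1
(
    cd "$scratch" || exit 1
    for dir in 1 2 3 4 5 6 7 8 9 10 11 12; do
        for subdir in 1 2 3 4 5 6 7 8 9 10 11 12; do
            mkdir -p "$dir/$subdir"
            for file in 1 2 3 4 5 6 7 8 9 10 11 12; do
                touch "$dir/$subdir/file.$file"
            done
        done
    done
)

expected=$((12 * 12 * 12))
actual=$(find "$scratch" -type f | wc -l | tr -d ' ')
echo "expected=$expected actual=$actual"

# Clean up the scratch tree.
rm -rf "$scratch"
```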
Subject: Re: [rt.cpan.org #17359] Large number of commits (>1000) breaks SVN::Web
Date: Wed, 01 Feb 2006 16:15:06 +0100
To: bug-SVN-Web [...] rt.cpan.org
From: Dietrich Streifert <dietrich.streifert [...] visionet.de>
Hello Nik!

I tried your test with mod_perl2 and apache2. I can see httpd grow to a peak memory usage of 120 MByte when retrieving ::Revision and ::Log.

This is exactly the test you proposed, in a file system repository. We use BDB repositories.

Tomorrow I'll do some more tests with BDB repositories, and the test with svnweb-server.

Maybe in turn you can test this on apache?

Thank you for your help.

Best regards....

--
With kind regards
Dietrich Streifert
Visionet GmbH
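Peak memory figures like the one reported above can be collected by sampling the resident set size of the server processes. The sketch below is one possible way to do it, not the method used in this ticket: `peak_rss_mb` is a hypothetical helper, a POSIX awk is assumed, and the `rss` column of `ps -o` is a common extension (reported in kilobytes on most systems) rather than something POSIX guarantees. The current shell's PID stands in for an httpd child.

```shell
#!/bin/sh
# Hypothetical helper: read RSS samples in kilobytes (one per line) on
# stdin and print the largest one in megabytes, rounded down.
peak_rss_mb() {
    awk '$1 > max { max = $1 } END { printf "%d\n", max / 1024 }'
}

# Take a single sample of this shell as a stand-in for an httpd process.
# In real use, the PID of each httpd child would be sampled repeatedly
# while the ::Revision or ::Log page is being requested.
ps -o rss= -p "$$" | peak_rss_mb
```

Feeding the helper many samples taken over the lifetime of a request gives the peak rather than a momentary value.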
I think I've tracked this down, at least for Revision.pm. Can you try the attached patch and let me know if it solves the problem when using the Revision action? If that works for you I'll see if I can use a similar fix for Log.pm.
Attachment: diff (application/octet-stream, 701b)

The attached patch is for Log.pm -- similar problem, similar fix. Can you let me know if it solves the problem?
Attachment: diff (application/octet-stream, 780b)

On Wed Feb 01 19:47:01 2006, NIKC wrote:

> The attached patch is for Log.pm -- similar problem, similar fix. Can
> you let me know if it solves the problem?

One more thing. If this does fix Log.pm then it also fixes RSS.pm, as the RSS functionality is implemented as a subclass of Log.pm.
Subject: Re: [rt.cpan.org #17359] Large number of commits (>1000) breaks SVN::Web
Date: Thu, 02 Feb 2006 08:46:27 +0100
To: bug-SVN-Web [...] rt.cpan.org
From: Dietrich Streifert <dietrich.streifert [...] visionet.de>
Yes! The patches have solved the problem! Now httpd consumes up to 38 MByte for the big-commit repository and up to 45 MByte for our production repository, which was the cause of this issue. I've tested this for ::Log, ::Repository and ::RSS.

That's OK!

Your patches also seem to have sped up browsing for "normal" repositories a little bit.

Thanks a lot for your work! You should release these patches as soon as possible.

Best regards...

--
With kind regards
Dietrich Streifert
Visionet GmbH
Subject: Re: [rt.cpan.org #17359] Large number of commits (>1000) breaks SVN::Web
Date: Thu, 02 Feb 2006 11:06:16 +0100
To: bug-SVN-Web [...] rt.cpan.org
From: Dietrich Streifert <dietrich.streifert [...] visionet.de>
I've discovered a "commented" section in the log template which creates links to the modified paths of a revision through a loop. The problem is that at runtime this loop is still executed by the templating system, so in my case, with over 1000 files, this leads to an HTML file with a huge number of comments. The browser runs into a timeout and simply stops retrieving in the middle of the page.

Maybe this comment should be deleted from the template? I've done this here and now the ::Log browsing is very fast.

A solution would be to implement paging in the log and revision templates. Then you would page, let's say, 50 paths forward or backward.

Best regards....

--
With kind regards
Dietrich Streifert
Visionet GmbH
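The paging arithmetic suggested above (50 paths per page) can be sketched in shell. This only illustrates the technique and says nothing about how the SVN::Web templates would implement it: `page_of_paths` is a hypothetical helper, and the numbers 1640 and 50 are taken from the figures mentioned in this ticket.

```shell
#!/bin/sh
# Hypothetical helper: print page $1 (1-based) of the lines on stdin,
# with $2 lines per page.
page_of_paths() {
    p_first=$(( ($1 - 1) * $2 + 1 ))
    p_last=$(( $1 * $2 ))
    sed -n "${p_first},${p_last}p"
}

# Number of pages needed for 1640 paths at 50 per page, rounding up.
total=1640
size=50
pages=$(( (total + size - 1) / size ))
echo "pages=$pages"
```

With these numbers the last page is only partly filled, so a template would need to handle a short final page as well as the forward and backward links.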
Subject: Re: [rt.cpan.org #17359] Large number of commits (>1000) breaks SVN::Web
Date: Fri, 03 Feb 2006 03:51:04 +0000
To: bug-SVN-Web [...] rt.cpan.org
From: Nik Clayton <nik [...] ngo.org.uk>
dietrich.streifert@visionet.de via RT wrote:

> Yes! The patches have solved the problem! Now httpd consumes up to 38
> MByte for the big-commit repository and up to 45 MByte for our
> production repository which was the cause for this issue. I've tested
> this for ::Log ::Repository and ::RSS.
>
> That's OK!
>
> Your patches seem to have sped up browsing for "normal" repositories a
> little bit.

Good. I've committed the changes. I need to work on some automated tests that warn me if memory usage for any of the actions starts getting suspiciously high.

> Thanks a lot for your work! You should release these patches as soon as
> possible.

That'll be at least another 8 or 9 days. I'm in the US at the moment, and my laptop doesn't have my GPG keys on it, so I can't sign the release. I'll do another release when I get back home though.

You might want to use the changes in revs 775, 777, 778, and 779 in your local installation. Get them from here:

http://jc.ngo.org.uk/svnweb/jc/log/nik/CPAN/SVN-Web/trunk/

N
FYI, View.pm has a similar problem. See http://jc.ngo.org.uk/svnweb/jc/revision?rev=780 for a patch that fixes it.