This queue is for tickets about the YAML-LibYAML CPAN distribution.

Report information
The Basics
Id: 62827
Status: resolved
Priority: 0/
Queue: YAML-LibYAML

People
Owner: Nobody in particular
Requestors: nrh [...] ikami.com
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: 0.34
Fixed in: (no value)



Subject: poor performance with large(r) files
YAML::LibYAML's performance seems to degrade *significantly* with file size. see the benchmarking below.

    [nrh@toki ~] cat comp.pl
    #!/usr/local/bin/perl -w

    use common::sense;
    use Benchmark qw(:all);

    use YAML::Syck qw();
    use YAML::XS qw();

    my $yaml = `cat foo.yaml`;
    my $size = -s "foo.yaml";
    print "file size: " . $size . "b\n";

    cmpthese( 10000, {
      'YAML::Syck-' . $YAML::Syck::VERSION => sub { my $obj = YAML::Syck::Load($yaml); my $string = YAML::Syck::Dump($obj); },
      'YAML::XS-' . $YAML::XS::VERSION => sub { my $obj = YAML::XS::Load($yaml); my $string = YAML::XS::Dump($obj); },
      }
    );

    [nrh@toki ~/projects/pogo] perl ./comp.pl
    file size: 650b
                      Rate   YAML::XS-0.34 YAML::Syck-1.15
    YAML::XS-0.34   1927/s              --             -2%
    YAML::Syck-1.15 1969/s              2%              --

    (cp large.yaml foo.yaml)

    [nrh@toki ~] perl ./comp.pl
    file size: 32455b
                      Rate   YAML::XS-0.34 YAML::Syck-1.15
    YAML::XS-0.34   44.3/s              --            -99%
    YAML::Syck-1.15 6098/s          13666%              --
Subject: Re: [rt.cpan.org #62827] poor performance with large(r) files
Date: Mon, 8 Nov 2010 13:32:52 +1100
To: bug-YAML-LibYAML [...] rt.cpan.org
From: Ingy dot Net <ingy [...] ingy.net>
Can you explain the output to me? It's weird, because understanding how syck works, I would expect quite the opposite. IOW, I would expect Syck to be slow on large data.

Also do you have any suspicions of what the slowdown might be?

On Mon, Nov 8, 2010 at 12:02 PM, Nicholas Harteau via RT <bug-YAML-LibYAML@rt.cpan.org> wrote:
Subject: Re: [rt.cpan.org #62827] poor performance with large(r) files
Date: Mon, 8 Nov 2010 00:46:47 -0500
To: bug-YAML-LibYAML [...] rt.cpan.org
From: Nicholas Harteau <nrh [...] ikami.com>
On Nov 7, 2010, at 9:33 PM, Ingy dot Net via RT wrote:
> Can you explain the output to me? It's weird, because understanding how syck
> works, I would expect quite the opposite. IOW, I would expect Syck to be
> slow on large data.

I've crammed the data I was using here: https://gist.github.com/667383, as well as some sample output from some of my machines. the expense seems to be in Load(). do you happen to have a corpus of yaml data to toss at this?

> Also do you have any suspicions of what the slowdown might be?

not a clue! I was quite surprised.

--
nicholas harteau
nrh@ikami.com
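One way to check that the expense is in Load() rather than Dump() is to time the two phases separately instead of together. The following is only a rough sketch: `time_load_and_dump` is a hypothetical helper name, the sample YAML document is made up for illustration, and the usage part runs only if YAML::XS is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

# Time Load and Dump separately so the slow phase is visible.
# $load and $dump are code references, e.g. \&YAML::XS::Load.
sub time_load_and_dump {
    my ($load, $dump, $yaml, $count) = @_;
    my $obj = $load->($yaml);    # parse once so Dump has something to serialize
    return {
        load => timeit($count, sub { my $o = $load->($yaml) }),
        dump => timeit($count, sub { my $s = $dump->($obj) }),
    };
}

# Usage, only when YAML::XS is available (hypothetical sample document).
if (eval { require YAML::XS; 1 }) {
    my $yaml = join '', "---\n", map { "key$_: value$_\n" } 1 .. 500;
    my $t = time_load_and_dump(\&YAML::XS::Load, \&YAML::XS::Dump, $yaml, 200);
    print "$_: ", timestr($t->{$_}), "\n" for qw(load dump);
}
```

Passing `\&YAML::Syck::Load` and `\&YAML::Syck::Dump` instead gives the corresponding numbers for Syck on the same input.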
From: eam [...] frap.net
On Sun Nov 07 21:33:03 2010, ingy@ingy.net wrote:
> Can you explain the output to me? It's weird, because understanding how syck
> works, I would expect quite the opposite. IOW, I would expect Syck to be
> slow on large data.
>
> Also do you have any suspicions of what the slowdown might be?

Looks like a memory leak to me. I reduced this down a bit:

    #!/usr/local/bin/perl -w
    use YAML::XS qw();
    my $file = 'large.yaml';
    my $count = $ARGV[0] || 1000;
    print "using $count iterations\n";
    $yaml = `cat $file`;
    $size = length $yaml;
    print "$file: " . $size . "b\n";
    for (1..$count) {
      my $obj = YAML::XS::Load($yaml);
    }
    END { system "pmap $$" }

    #!/usr/local/bin/perl -w
    use YAML::Syck qw();
    my $file = 'large.yaml';
    my $count = $ARGV[0] || 1000;
    print "using $count iterations\n";
    $yaml = `cat $file`;
    $size = length $yaml;
    print "$file: " . $size . "b\n";
    for (1..$count) {
      my $obj = YAML::Syck::Load($yaml);
    }
    END { system "pmap $$" }

And I see XS allocating about 135k each loop and ending up with a few hundred megs of memory in use. One brk() each loop. Syck has no such error.
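The per-iteration growth described above can also be measured from inside the process rather than by reading pmap output at exit. This is a minimal sketch, assuming Linux (it reads /proc/self/status); `rss_kb` and `growth_per_call` are hypothetical helper names, not part of YAML::XS, and the sample document is invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Resident set size of this process in kB, read from /proc (Linux only).
# Returns undef where /proc/self/status is unavailable.
sub rss_kb {
    open my $fh, '<', '/proc/self/status' or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return;
}

# Run $code $count times and return the average RSS growth per call in kB.
sub growth_per_call {
    my ($code, $count) = @_;
    my $before = rss_kb();
    return unless defined $before;
    $code->() for 1 .. $count;
    return (rss_kb() - $before) / $count;
}

# Usage, only when YAML::XS is available (hypothetical sample document).
if (eval { require YAML::XS; 1 }) {
    my $yaml = join '', "---\n", map { "key$_: value$_\n" } 1 .. 500;
    my $kb = growth_per_call(sub { my $obj = YAML::XS::Load($yaml) }, 1000);
    printf "average growth: %.2f kB per Load\n", $kb if defined $kb;
}
```

A figure that stays well above zero as the iteration count rises would be consistent with a leak; RSS is a coarse measure, so small or noisy values do not prove much either way.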
Subject: Re: [rt.cpan.org #62827] poor performance with large(r) files
Date: Mon, 8 Nov 2010 19:57:09 +1100
To: bug-YAML-LibYAML [...] rt.cpan.org
From: Ingy dot Net <ingy [...] ingy.net>
Awesome work. Thanks. I'll look into it or find someone to, or take a patch from you if you can make one.

Cheers, Ingy

On Mon, Nov 8, 2010 at 5:10 PM, Evan Miller via RT <bug-YAML-LibYAML@rt.cpan.org> wrote:
I'm assuming the memory leak fixes in 0.35 fix this.