Skip Menu |

This queue is for tickets about the Text-CSV_XS CPAN distribution.

Report information
The Basics
Id: 81539
Status: resolved
Priority: 0/
Queue: Text-CSV_XS

People
Owner: Nobody in particular
Requestors: mayeu.tik [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.80
Fixed in: 0.94



Subject: Memory leaking
Date: Thu, 29 Nov 2012 15:21:46 +0100
To: bug-Text-CSV_XS [...] rt.cpan.org
From: Matthieu Maury <mayeu.tik [...] gmail.com>
Hi, At my company we work on a software which handle a lot of different type of input file. The software is a long lasting software running on a server. We recently had the need to add the possibility to read data from CSV file and we add a little piece of driver using Text::CSV_XS (we need speed, a lot of speed). But using Text::CSV_XS 0.93 made you software blow up under ram leaking (the script was taking something like 10Gb of memory, but usualy the script run on 250Mb ~ 500Mb of memory). We create a little script presenting the issue : ------------------------------ -------------------- use Text::CSV; $i = 0; foreach $file (@ARGV) { open $FILE, "<$file"; my $CSVFile = Text::CSV->new( { 'quote_char' => '"', 'sep_char' => ';', 'always_quote' => 0, 'quote_space' => 1, 'quote_null' => 1, 'binary' => 1, 'blank_is_undef' => 1, 'empty_is_undef' => 1, } ); my $header_ref = $CSVFile->getline($FILE); my $columns_ref = $CSVFile->getline($FILE); $CSVFile->column_names( @{$columns_ref} ); my $lines = $CSVFile->getline_hr_all($FILE); close FILE ; print "File $i done\n" ; $i++ ; } --------------------------------------------------- By calling this script with 200 csv file in input (and with Text::CSV_XS installed), my perl process increase its memory at every loop, if I remove Text::CSV_XS (and thus using the pure perl implementation of Text::CSV), my perl process use a fixed quantity of memory. We got this issue on different system : Archlinux, perl v5.16.2 x86_64-linux-thread-multi (from arch repository), Text::CSV_XS 0.93 Debian Sqeeze, v5.10. x86_64-linux-gnu-thread-multi (from debian repository, TexT::CSV_XS from debian repository (libtext-csv-xs-perl). Thanks for your help and insight. -- Mayeu http://6x9.fr
Subject: Re: [rt.cpan.org #81539] AutoReply: Memory leaking
Date: Thu, 29 Nov 2012 15:24:35 +0100
To: bug-text-csv_xs <bug-Text-CSV_XS [...] rt.cpan.org>
From: Matthieu Maury <mayeu.tik [...] gmail.com>
I realise the bad file handle is closed, should be "close $FILE;", but that does not change the memory issues.
On Thu Nov 29 09:22:18 2012, mayeu.tik@gmail.com wrote: Show quoted text
> By calling this script with 200 csv file in input (and with > Text::CSV_XS installed), my perl process increase its memory at every > loop, if I remove Text::CSV_XS (and thus using the pure perl > implementation of Text::CSV), my perl process use a fixed quantity of > memory.
I believe the attached patches fix the problem. 0001 fixes the issue 0002 removes the now unused rav_free() (separate if there's some other reason to keep it.) Tony
Subject: 0001-avoid-producing-double-references-on-per-line-AVs-in.patch
From b7623f803c1d78b565d4f7aa9b2850a09b256df5 Mon Sep 17 00:00:00 2001 From: Tony Cook <tony@develop-help.com> Date: Mon, 3 Dec 2012 20:36:14 +1100 Subject: [PATCH 1/2] avoid producing double references on per-line AVs in xsParse_all() The code used newRV(), which is an alias for newRV_inc(), hence the line AVs each had a reference count of 2 when they were returned. rav_free() handled this by releasing both the RV and the underlying AV, but this is unnecessary when the AVs have the correct reference count. --- CSV_XS.xs | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/CSV_XS.xs b/CSV_XS.xs index 553f7fc..09fcdef 100644 --- a/CSV_XS.xs +++ b/CSV_XS.xs @@ -1533,11 +1533,11 @@ static SV *cx_xsParse_all (pTHX_ SV *self, HV *hv, SV *io, SV *off, SV *len) } if (n++ >= tail) { - rav_free (av_shift (avr)); + SvREFCNT_dec(av_shift (avr)); n--; } - av_push (avr, newRV ((SV *)row)); + av_push (avr, newRV_noinc ((SV *)row)); if (n >= length && skip >= 0) break; /* We have enough */ @@ -1545,7 +1545,7 @@ static SV *cx_xsParse_all (pTHX_ SV *self, HV *hv, SV *io, SV *off, SV *len) row = newAV (); } while (n > length) { - rav_free (av_pop (avr)); + SvREFCNT_dec(av_pop (avr)); n--; } -- 1.7.10.4
Subject: 0002-eliminate-rav_free-it-s-not-longer-used.patch
From 6fab5ebee68ee16fb5e3ad926fb7cda0180dae0c Mon Sep 17 00:00:00 2001 From: Tony Cook <tony@develop-help.com> Date: Mon, 3 Dec 2012 20:40:05 +1100 Subject: [PATCH 2/2] eliminate rav_free(), it's not longer used --- CSV_XS.xs | 7 ------- 1 file changed, 7 deletions(-) diff --git a/CSV_XS.xs b/CSV_XS.xs index 09fcdef..b0fed4a 100644 --- a/CSV_XS.xs +++ b/CSV_XS.xs @@ -1497,13 +1497,6 @@ static void cx_av_free (pTHX_ AV *av) sv_free ((SV *)av); } /* av_free */ -#define rav_free(rv) cx_rav_free (aTHX_ rv) -static void cx_rav_free (pTHX_ SV *rv) -{ - av_free ((AV *)SvRV (rv)); - sv_free (rv); - } /* rav_free */ - #define xsParse_all(self,hv,io,off,len) cx_xsParse_all (aTHX_ self, hv, io, off, len) static SV *cx_xsParse_all (pTHX_ SV *self, HV *hv, SV *io, SV *off, SV *len) { -- 1.7.10.4
Thanks Tony. That works well Before: size = 1806, resident = 876, share = 446, data = 506 size = 5366, resident = 4445, share = 472, data = 4055 size = 8574, resident = 7679, share = 473, data = 7263 size = 11775, resident = 10847, share = 473, data = 10464 size = 14956, resident = 14081, share = 473, data = 13645 size = 18158, resident = 17249, share = 473, data = 16847 size = 21359, resident = 20483, share = 473, data = 20048 size = 24560, resident = 23651, share = 473, data = 23249 size = 27761, resident = 26885, share = 473, data = 26450 size = 30962, resident = 30053, share = 473, data = 29651 size = 34163, resident = 33287, share = 473, data = 32852 size = 37364, resident = 36455, share = 473, data = 36053 size = 40565, resident = 39623, share = 473, data = 39254 size = 43766, resident = 42857, share = 473, data = 42455 size = 46967, resident = 46025, share = 473, data = 45656 size = 50174, resident = 49259, share = 473, data = 48863 size = 53375, resident = 52427, share = 473, data = 52064 size = 56558, resident = 55661, share = 473, data = 55247 size = 59760, resident = 58829, share = 473, data = 58449 size = 62961, resident = 62063, share = 473, data = 61650 size = 66162, resident = 65231, share = 473, data = 64851 After: size = 1784, resident = 876, share = 444, data = 502 size = 5334, resident = 4438, share = 460, data = 4041 size = 5378, resident = 4438, share = 460, data = 4085 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 size = 5362, resident = 4485, share = 462, data = 4069 Release on its way …
Subject: Re: [rt.cpan.org #81539] Resolved: Memory leaking
Date: Tue, 4 Dec 2012 10:43:34 +0100
To: bug-text-csv_xs <bug-Text-CSV_XS [...] rt.cpan.org>
From: Matthieu Maury <mayeu.tik [...] gmail.com>
Show quoted text
> <URL: https://rt.cpan.org/Ticket/Display.html?id=81539 > > > According to our records, your request has been resolved. If you have any > further questions or concerns, please respond to this message.
Thank you for your quick help ! We have tested the patch overnight and nothing was wrong on our side :) Thank again, Regards -- Mayeu http://6x9.fr