Subject: Running into a memory error using Perl and DateTime
Date: Wed, 7 Jan 2015 09:30:41 -0800
To: bug-DateTime [...] rt.cpan.org
From: david kayal <davidkayal [...] gmail.com>
As detailed in:
http://stackoverflow.com/questions/27809615/running-into-a-memory-error-using-perl-and-datetime
I am writing a small tool to parse some application logs and collect data
that will be used as input for Zabbix monitoring. I only want to keep data
from log entries written within the past two hours.
The format of the logs is simple: the fields are separated by whitespace,
and the first three fields give the time when the log entry was written.
Here is an example of the first three fields of a log line:
Jan 5 13:42:07
What I set out to do was use one of my favorite modules, DateTime: convert
the timestamp above into a DateTime object and compare it against another
DateTime object created when the utility is invoked.
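Roughly, the relevant part looks like this (a simplified sketch rather than
the actual script; the year is hard-coded here because the logs do not carry
one, and the variable names are just for illustration):

#!/usr/bin/perl
use strict;
use warnings;
use DateTime;

my %mon = ( Jan => 1, Feb => 2, Mar => 3, Apr => 4,  May => 5,  Jun => 6,
            Jul => 7, Aug => 8, Sep => 9, Oct => 10, Nov => 11, Dec => 12 );

# Anything older than two hours before the moment the utility starts is dropped.
my $cutoff = DateTime->now( time_zone => 'UTC' )->subtract( hours => 2 );

while ( my $line = <> ) {
    # First three whitespace-separated fields, e.g. "Jan  5 13:42:07"
    my ( $mon_name, $day, $hms ) = split ' ', $line;
    my ( $hour, $min, $sec ) = split /:/, $hms;

    my $dt = DateTime->new(
        year      => 2015,              # the logs do not include a year
        month     => $mon{$mon_name},
        day       => $day,
        hour      => $hour,
        minute    => $min,
        second    => $sec,
        time_zone => 'UTC',
    );

    next if $dt < $cutoff;              # keep only the last two hours
    # ... collect the rest of the line for Zabbix here ...
}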
Everything was fine and dandy and working nicely until I actually set the
utility against a portion of the logs it would really be parsing, only a
couple of gigabytes in size. The test run was done on a kitchen-invoked
Ubuntu VirtualBox instance on my laptop, so the resources are, as expected,
rather limited. The script would halt with the word 'Killed' displayed.
Looking into /var/log/messages I would see log lines describing the process
being killed due to resource issues.
When I invoked the process again and switched to another screen instance to
watch top, I noticed that the memory percentage would grow and swap space
would begin to be consumed, until the script again stopped with the 'Killed'
message.
When I would rerun the script with the DateTime portion commented out, the
script would execute as expected.
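In case it helps anyone reproduce the observation without watching top,
something like the following (Linux-specific, and my own addition for
illustration rather than part of the tool) lets the loop print its own
resident set size:

# Read the VmRSS line from /proc/self/status (Linux only; illustrative helper).
sub rss_kb {
    open my $fh, '<', '/proc/self/status' or return;
    while ( my $line = <$fh> ) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return;
}

# e.g. inside the main loop:
# print "RSS: ", rss_kb(), " kB\n" unless $count % 1_000_000;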
In the script I have a subroutine that is called to create a DateTime object
from the information found in the first three fields of the log line. I have
tried creating the object at the beginning of the subroutine and then
undef-ing it just before returning a value at the end; I have also tried
creating a global object (using our) and then using the DateTime set_*
methods to modify what I thought would be a single object's values.
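The global-object variant looks roughly like this (a simplified sketch, not
the actual code):

# One package-wide object, mutated with the set_* methods instead of
# constructing a new DateTime per log line.
our $dt = DateTime->new(
    year => 2015, month => 1, day => 1,
    hour => 0, minute => 0, second => 0,
    time_zone => 'UTC',
);

sub get_epoch {
    my ( $mon, $day, $hour, $min, $sec ) = @_;
    $dt->set_month( $mon );
    $dt->set_day( $day );
    $dt->set_hour( $hour );
    $dt->set_minute( $min );
    $dt->set_second( $sec );
    return $dt->epoch;
}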
I have read that Perl does not return freed hash memory to the operating
system but keeps it for reuse by the program; I suspect this is at the root
of the issue I am running into.
At this point I feel the need to get input from others, and that is the
reason for this post. All comments and criticism would be appreciated.
This utility was running on Perl v5.14.2.
This code produces the memory leak:
#!/usr/bin/perl -w
use strict;
use DateTime;

my $month  = 1;
my $day    = 6;
my $hour   = 20;
my $minute = 30;
my $second = 0;

for ( my $count = 0; $count <= 25_000_000; $count++ ) {
    my $epoch = get_epoch( $month, $day, $hour, $minute, $second );
}

# Build a DateTime object from the supplied fields and return its epoch.
sub get_epoch {
    my $mon  = shift;
    my $day  = shift;
    my $hour = shift;
    my $min  = shift;
    my $sec  = shift;

    my $temp_dt = DateTime->new(
        year       => 2015,
        month      => $mon,
        day        => $day,
        hour       => $hour,
        minute     => $min,
        second     => $sec,
        nanosecond => 500_000_000,
        time_zone  => 'UTC',
    );

    return $temp_dt->epoch;
}
---
Note:
~/bin# perl -MDateTime -e 'print "$DateTime::VERSION\n"'
1.18
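In case it is useful as a point of comparison for isolating the
DateTime->new call, here is the same loop with the constructor swapped for
the core Time::Local module. I am not suggesting it as a fix, only as a way
to narrow down where the memory goes:

#!/usr/bin/perl
use strict;
use warnings;
use Time::Local qw( timegm );

my ( $month, $day, $hour, $minute, $second ) = ( 1, 6, 20, 30, 0 );

for ( my $count = 0; $count <= 25_000_000; $count++ ) {
    # timegm takes sec, min, hour, mday, month (0-based), year
    my $epoch = timegm( $second, $minute, $hour, $day, $month - 1, 2015 );
}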