Subject: Make a copy of the returned hashref from the parse method
Hi,
I've been using your module to process some files with fixed-length records and ran into an issue with the parse method. I think it would be good to change the behavior of the parse method so that it returns a copy of the hashref instead of the same one over and over again. I realize the current setup is for performance reasons, but to me it is confusing, and the need to copy the result is easy to forget. The method does not work the way you would expect unless you pay close attention to the Caveats section of the documentation. Most modules like this do not require you to make a copy of the data returned.
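To illustrate the kind of surprise I mean, here is a minimal sketch. ToyParser is a hypothetical stand-in I wrote for this message, not your module's actual code; the field names and the 5-character widths are assumptions too. It only mimics the documented behavior of reusing one internal hash for every record:

```perl
use strict;
use warnings;

# Hypothetical stand-in parser, NOT the real module: it reuses one
# internal hash ($self->{DATA}) for every record it parses.
package ToyParser;

sub new {
    my ($class) = @_;
    my $self = { DATA => {} };
    return bless $self, $class;
}

sub parse {
    my ($self, $line) = @_;
    # Two fixed-width fields of 5 characters each (assumed layout).
    @{ $self->{DATA} }{qw(first last)} = unpack 'A5 A5', $line;
    return $self->{DATA};    # the same reference on every call
}

package main;

my $parser = ToyParser->new;
my @lines  = ('AliceSmith', 'Bob  Jones');

# Storing the returned reference directly: every element points at one hash.
my @shared = map { $parser->parse($_) } @lines;

# Copying the hash after each call keeps the records distinct.
my @copied = map { +{ %{ $parser->parse($_) } } } @lines;

print "shared: $shared[0]{first}, $shared[1]{first}\n";    # shared: Bob, Bob
print "copied: $copied[0]{first}, $copied[1]{first}\n";    # copied: Alice, Bob
```

With the shared reference, the first record is silently overwritten by the second, which is the mistake I keep making.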
To see how much of a performance difference it would make to return a copy, I changed the method to return a copy (by simply placing the fix you mention in Caveats inside the sub) and ran some benchmarks comparing the two:
Benchmark: timing 1000000 iterations of Copy, Original...
Copy: 57 wallclock secs (57.53 usr + 0.00 sys = 57.53 CPU) @ 17382.24/s (n=1000000)
Original: 27 wallclock secs (26.55 usr + 0.00 sys = 26.55 CPU) @ 37664.78/s (n=1000000)
            Rate  Copy Original
Copy     17382/s    --     -54%
Original 37665/s  117%       --
So there is a performance loss (the copying version is about 54% slower), but it isn't too horrible unless you process a large number of records. I also tried modifying the method so that it would create a new place to store the data on every call, instead of using $parser->{DATA} and making a copy afterwards, but that made it even slower.
Perhaps you could have the default method return a copy (which is what most people expect), and have another method that is more optimized for those who need the extra performance.
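Roughly, I am picturing something like the sketch below. Again, ToyParser is a hypothetical stand-in rather than your module's code, and the method name parse_ref, the field names, and the record layout are just placeholders I made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical stand-in, NOT the real module's code.
package ToyParser;

sub new {
    my ($class) = @_;
    my $self = { DATA => {} };
    return bless $self, $class;
}

# Fast path (made-up name): fills and returns the shared internal hash,
# i.e. the current behavior, for callers who need the speed.
sub parse_ref {
    my ($self, $line) = @_;
    @{ $self->{DATA} }{qw(first last)} = unpack 'A5 A5', $line;
    return $self->{DATA};
}

# Safe default: returns a fresh shallow copy on every call.
sub parse {
    my ($self, $line) = @_;
    return +{ %{ $self->parse_ref($line) } };
}

package main;

my $parser = ToyParser->new;
my $first  = $parser->parse('AliceSmith');
my $second = $parser->parse('Bob  Jones');

# The earlier result survives the second call.
print "$first->{first} $second->{first}\n";    # Alice Bob
```

That way existing users who depend on the speed only need to switch method names, and new users get the behavior they expect by default.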
Anyway, I know this isn't really a "bug," just a difference in programming style. But let me know what you think about it.
Thanks for your work on this module; it made finishing a recent project much easier than developing my own solution.