Subject: Suggestion for optimizing binary dump size
I've been experimenting with dumping large hashes in both ASCII and binary (Storable) formats. I noticed that, although the binary dumps are about twice as fast as the ASCII dumps, the binary files are only about 10% smaller than their ASCII counterparts.
On closer examination of the binary dump files, I noticed that the storage of hash keys is not optimized to save space. For example, if I have 500 hashes that each contain a key called "CUSTOMERID", then the string "CUSTOMERID" appears 500 times in the binary dump file. It would be great to have a Storable option which, when selected, analyses key usage on the fly while a hash is being dumped and stores each distinct key only once.
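
Until something like that exists, a similar effect can probably be had in user code by factoring the keys out before dumping. The following is only a rough sketch of the idea, not a proposal for how Storable should implement it: pack_records and unpack_records are hypothetical helper names (not part of Storable), and the sketch assumes the data is a list of flat hashes that all share the same keys. A key missing from a record would come back with an undef value.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical helper: replace a list of hashes with one shared key list
# plus one array of values per record, so each key is stored only once.
sub pack_records {
    my ($records) = @_;
    my (%seen, @keys);
    for my $rec (@$records) {
        for my $k (sort keys %$rec) {
            push @keys, $k unless $seen{$k}++;
        }
    }
    my @rows = map { my $rec = $_; [ map { $rec->{$_} } @keys ] } @$records;
    return { keys => \@keys, rows => \@rows };
}

# Hypothetical helper: rebuild the list of hashes from the packed form.
sub unpack_records {
    my ($packed) = @_;
    my @keys = @{ $packed->{keys} };
    return [ map { my %h; @h{@keys} = @$_; \%h } @{ $packed->{rows} } ];
}

# Sample data: 500 records with identical keys.
my @records = map { +{ CUSTOMERID => $_, NAME => "customer $_" } } 1 .. 500;

my $plain  = freeze(\@records);                # keys repeated in every hash
my $packed = freeze(pack_records(\@records));  # keys stored once

# Count how often the key string occurs in each frozen image.
my $plain_hits  = () = $plain  =~ /CUSTOMERID/g;
my $packed_hits = () = $packed =~ /CUSTOMERID/g;

printf "plain:  %d bytes, key seen %d times\n", length($plain),  $plain_hits;
printf "packed: %d bytes, key seen %d times\n", length($packed), $packed_hits;

# Round trip back to the original shape.
my $restored = unpack_records(thaw($packed));
```

The packed image should contain the key string once instead of once per record, at the cost of an extra pass over the data on both dump and load. A built-in option would still be preferable, since it could handle nested structures and hashes with differing keys without any changes to the calling code.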