
This queue is for tickets about the Algorithm-Diff-XS CPAN distribution.

Report information
The Basics
Id: 56241
Status: new
Priority: 0/
Queue: Algorithm-Diff-XS

People
Owner: Nobody in particular
Requestors: Stuart Watt (no email address)
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.04
Fixed in: (no value)



Subject: High memory usage for very large sequences
This module appears to have exponential (or similarly steep) memory usage. I have been using it with sequences of around 300000 elements that differ only slightly, and I see memory usage of multiple gigabytes, often more than a 32-bit system can handle. I suspect this is down to the algorithm. I am really hoping that I am wrong, as I would have expected this issue to show up before.

On the plus side, for the sequences it can handle, this module is definitely a lot faster than Algorithm::Diff. The same sequences take many hours with Algorithm::Diff, so that wasn't an option either.

Since I only needed numeric comparison, I used the diff code from libmba as an alternative. This is a better-licensed equivalent to the algorithm used by GNU diff, and it seems to perform well even for very large sequences. I am happy to extend this into another Algorithm::Diff variation (I need practice at XS), but I thought it worth reporting as an issue.

The attached (large) file contains a test and a couple of the large sequences I used to test this, compressed using 7z to cut down the file sizes.
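
For illustration only, here is a minimal Python sketch of the greedy O(ND) approach described by Myers, which is the family of algorithm that the GNU diff documentation credits. It is not the libmba code and not the Algorithm::Diff::XS implementation, and the function name `myers_edit_distance` is hypothetical. The sketch computes only the edit distance (insertions plus deletions), not the edit script itself. It keeps one furthest-reaching position per diagonal, so its working memory grows with the number of differences rather than with the sequence lengths, which is why this style of algorithm suits long sequences with small differences.

```python
def myers_edit_distance(a, b):
    """Return the number of insertions plus deletions that turn a into b.

    Sketch of the greedy O(ND) search: for each candidate distance d,
    track the furthest x reached on every diagonal k = x - y.
    """
    n, m = len(a), len(b)
    max_d = n + m
    # v[k] is the furthest x reached so far on diagonal k.
    # The seed v[1] = 0 lets the d = 0 round start from (0, 0).
    v = {1: 0}
    for d in range(max_d + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]        # move down: take an element from b
            else:
                x = v[k - 1] + 1    # move right: drop an element from a
            y = x - k
            # Follow the "snake" of matching elements for free.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return max_d
```

As a usage example, comparing `list(range(300000))` against a copy with one element removed and one appended finishes after three rounds of the outer loop and stores only a handful of diagonal entries. Recovering the full edit script needs extra bookkeeping that this sketch leaves out.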
Attachment: files.7z (application/octet-stream, 526.4k)