Subject: | Faster rewrite of Variables::ProhibitReusedNames |
After some profiling, I found that Variables::ProhibitReusedNames was
approximately 30% of the perlcritic runtime on some of our largest
files. I was going to just disable it when I read this comment in the POD:
"The current implementation walks the tree over and over. For a big
file, this can be a huge time sink. I'm considering rewriting to
search the document just once for variable declarations and cache the
tree walking on that single analysis."
So I decided to give rewriting it a try. Rather than caching, I've
altered the policy to apply to the document as a whole and then
recursively descend, bubbling up all the errors that it encounters. I
was a little worried that the recursion might bloat memory for a second
but the PPI objects dwarf the recursion overhead on any decent size
document (which is what I was aiming to optimize anyway).
I have an 8000 line perl module that takes about 18 seconds to process
using the official ProhibitReusedNames and only 9 seconds with the
attached version. That's not as fast as I was hoping for but its
certainly a huge improvement (which can probably be improved even
further). Also, in case anyone is worried, it still runs a tiny bit
faster on just a few line document (0.696s vs 0.709s on my box, fwiw)
The attached code (I didn't bother with a patch, since I rewrote
everything from 'applies_to' down) also passes the regression tests
perfectly (although those don't really exercise the possibility of false
positives). More importantly for me, I get the same results on all the
multi-thousand line modules I could find to test it on (written by
people whose code I don't generally trust, since they wrote
multi-thousand line modules).
I was also a little worried that this wouldn't be 'acceptable' P::C
style since it 'applies_to' the entire document and then searches for
its specific violations but CodeLayout::RequireConsistentNewlines does
the same thing so I felt vaguely justified.
Let me know what you think.
Subject: | ProhibitReusedNames.pm.fast |
Message body not shown because it is not plain text.