Subject: | 3 possible optimisations |
In core, t/porting/podcheck.t is slow. Slow as in roughly a minute runtime, which is
frustrating. But to be fair to it, it's doing a *lot* of work.
I ran it under Devel::NYTProf. [Runtime there about 150s] There's nothing obvious. However,
these 3 changes seem to help.
[diff is from the path in the core - I hope that patch -p3 will apply it cleanly. All tests pass in
the core]
diff --git a/cpan/Pod-Parser/lib/Pod/Checker.pm b/cpan/Pod-Parser/lib/Pod/Checker.pm
index a230542..f901802 100644
--- a/cpan/Pod-Parser/lib/Pod/Checker.pm
+++ b/cpan/Pod-Parser/lib/Pod/Checker.pm
@@ -1099,7 +1099,7 @@ sub _check_ptree {
$text .= $self->_check_ptree($contents, $line, $file, "$nestlist$cmd");
next;
}
- if($nestlist =~ /$cmd/) {
+ if(index($nestlist, $cmd) != -1) {
$self->poderror({ -line => $line, -file => $file,
-severity => 'WARNING',
-msg => "nested commands $cmd<...$cmd<...>...>"});
Superficially I find that more ugly, but it actually is clearer what the code is doing, as [if I've
understood it correctly] $cmd is always 1 letter. NYProf reports this line goes from 0.725s to
0.056s
diff --git a/cpan/Pod-Parser/lib/Pod/Parser.pm b/cpan/Pod-Parser/lib/Pod/Parser.pm
index c807f3f..7d54864 100644
--- a/cpan/Pod-Parser/lib/Pod/Parser.pm
+++ b/cpan/Pod-Parser/lib/Pod/Parser.pm
@@ -840,7 +840,7 @@ sub parse_text {
$seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_);
}
## Keep track of line count
- $line += s/\r*\n//;
+ $line += /\n/;
## Remember the "current" sequence
$seq = $seq_stack[-1];
}
˘
$_ isn't used subsequently, so the substitution isn't necessary.
That makes the need to match \r* redundant.
1.99s to 1.61s
Does this code date from a time when the parser was different, chewing through the text, and
the substitution was needed?
@@ -1096,7 +1096,7 @@ sub parse_from_filehandle {
## See if this line is blank and ends the current paragraph.
## If it isnt, then keep iterating until it is.
- next unless (($textline =~ /^([^\S\r\n]*)[\r\n]*$/)
+ next unless (($textline =~ /^[^\S\r\n]*[\r\n]*$/)
&& (length $paragraph));
## Now process the paragraph
The capture isn't needed. 4.73s to 4.43s
I'm wondering if there are other unneeded captures, but didn't look, as there weren't any
others in obviously hot places.
The structure of parse_text is bugging me, as this code repeats work:
time calls time/call code
0.153206 87640 0.000002 $_ = $text;
0.765981 87640 0.000009 my @tokens = split /([A-Z]<(?:<+\s)?)/;
0.111030 87640 0.000001 while ( @tokens ) {
0.407500 255096 0.000002 $_ = shift @tokens;
0.000000 0 0.000000 ## Look for the beginning of a sequence
1.999841 255096 0.000008 if ( /^([A-Z])(<(?:<+\s)?)$/ ) {
and it's not cheap work. That last line alone is 2 seconds of repeated work, and over 1% of the
*total* runtime. :-(
I can't see a way to avoid that, without a substantial refactoring.
Nicholas Clark
Subject: | Pod-Parser.patch |
diff --git a/cpan/Pod-Parser/lib/Pod/Checker.pm b/cpan/Pod-Parser/lib/Pod/Checker.pm
index a230542..f901802 100644
--- a/cpan/Pod-Parser/lib/Pod/Checker.pm
+++ b/cpan/Pod-Parser/lib/Pod/Checker.pm
@@ -1099,7 +1099,7 @@ sub _check_ptree {
$text .= $self->_check_ptree($contents, $line, $file, "$nestlist$cmd");
next;
}
- if($nestlist =~ /$cmd/) {
+ if(index($nestlist, $cmd) != -1) {
$self->poderror({ -line => $line, -file => $file,
-severity => 'WARNING',
-msg => "nested commands $cmd<...$cmd<...>...>"});
diff --git a/cpan/Pod-Parser/lib/Pod/Parser.pm b/cpan/Pod-Parser/lib/Pod/Parser.pm
index c807f3f..7d54864 100644
--- a/cpan/Pod-Parser/lib/Pod/Parser.pm
+++ b/cpan/Pod-Parser/lib/Pod/Parser.pm
@@ -840,7 +840,7 @@ sub parse_text {
$seq->append($expand_text ? &$xtext_sub($self,$_,$seq) : $_);
}
## Keep track of line count
- $line += s/\r*\n//;
+ $line += /\n/;
## Remember the "current" sequence
$seq = $seq_stack[-1];
}
@@ -1096,7 +1096,7 @@ sub parse_from_filehandle {
## See if this line is blank and ends the current paragraph.
## If it isnt, then keep iterating until it is.
- next unless (($textline =~ /^([^\S\r\n]*)[\r\n]*$/)
+ next unless (($textline =~ /^[^\S\r\n]*[\r\n]*$/)
&& (length $paragraph));
## Now process the paragraph