Subject: Regexp::Grammars is very slow
I'm trying to parse "{}" blocks in a C++ header file (10K of size) using
the following regular expression:
qr{
    <curly_block>

    <rule: curly_block>
        \{ (?: <curly_block> | [^{}] )* \}
}xms;
Unfortunately, it takes way too long. Here's the breakdown with dprofpp:
Total Elapsed Time = 25.65577 Seconds
User+System Time = 25.38577 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
98.6 25.05 25.151 3 8.3505 8.3837 Converter::__ANON__
0.32 0.082 0.082 6142 0.0000 0.0000 Regexp::Grammars::_open_log
0.16 0.040 0.049 10 0.0040 0.0049 Converter::BEGIN
0.12 0.030 0.039 6 0.0050 0.0065 FindBin::BEGIN
0.08 0.020 0.030 5 0.0040 0.0060 Scalar::Util::PP::BEGIN
0.08 0.020 0.079 5 0.0040 0.0158 Error::BEGIN
0.08 0.020 0.255 11 0.0018 0.0232 main::BEGIN
0.07 0.019 0.019 2 0.0097 0.0093 Regexp::Grammars::_translate_subrule_calls
0.04 0.010 0.010 1 0.0100 0.0100 B::bootstrap
0.04 0.010 0.010 1 0.0100 0.0100 File::Spec::Unix::path
0.04 0.010 0.010 1 0.0100 0.0100 main::find_comp_files
0.04 0.010 0.010 3 0.0033 0.0033 vars::BEGIN
0.04 0.010 0.010 3 0.0033 0.0033 Config::BEGIN
0.04 0.010 0.020 2 0.0050 0.0100 lib::BEGIN
0.04 0.010 0.010 6 0.0017 0.0017 Exporter::as_heavy
The equivalent plain vanilla 5.10 regexp is shown below and it produces
the result much faster.
qr{
    (?<text> (?&curly_block) )
    (?(DEFINE)
        (?<curly_block>
            \{ (?: (?&curly_block) | [^{}] )* \}
        )
    )
}xms;
Total Elapsed Time = 0.493133 Seconds
User+System Time = 0.243133 Seconds
Exclusive Times
%Time ExclSec CumulS #Calls sec/call Csec/c Name
16.4 0.040 0.049 11 0.0036 0.0045 Converter::BEGIN
16.4 0.040 0.079 5 0.0080 0.0158 Error::BEGIN
16.4 0.040 0.255 11 0.0036 0.0232 main::BEGIN
8.23 0.020 0.039 6 0.0033 0.0065 FindBin::BEGIN
8.23 0.020 0.029 28 0.0007 0.0010 Fatal::BEGIN
4.11 0.010 0.010 5 0.0020 0.0020 DynaLoader::dl_load_file
4.11 0.010 0.010 3 0.0033 0.0033 vars::BEGIN
4.11 0.010 0.010 4 0.0025 0.0025 Data::Dumper::BEGIN
4.11 0.010 0.010 3 0.0033 0.0033 Config::BEGIN
4.11 0.010 0.018 3 0.0033 0.0062 Converter::__ANON__
4.11 0.010 0.010 2 0.0050 0.0050 Regexp::Grammars::_translate_subrule_call
4.11 0.010 0.010 9 0.0011 0.0011 Tie::RefHash::BEGIN
4.11 0.010 0.030 4 0.0025 0.0074 XSLoader::load
4.11 0.010 0.019 1 0.0099 0.0192 FindBin::init
4.11 0.010 0.010 66 0.0001 0.0001 File::Spec::Unix::canonpath
That is 25 seconds with Regexp::Grammars vs. 0.5 seconds with a stock Perl
regex, and this is a pretty simple regular expression. With anything more
complex than this, Regexp::Grammars doesn't even terminate. This is
a showstopper for me.
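For anyone who wants to reproduce the plain-regex side of the comparison without my header file, here is a minimal sketch. It assumes a synthetic input in place of the real header: the sample text, the make_sample helper and the variable names are made up for illustration, and the timing it prints will not match the dprofpp figures above.

```perl
use strict;
use warnings;
use 5.010;
use Time::HiRes qw(time);

# The plain 5.10 recursive pattern from this report.
my $curly_re = qr{
    (?<text> (?&curly_block) )
    (?(DEFINE)
        (?<curly_block>
            \{ (?: (?&curly_block) | [^{}] )* \}
        )
    )
}xms;

# Hypothetical stand-in for the 10K C++ header: one line with
# nested "{}" blocks, repeated $copies times.
sub make_sample {
    my ($copies) = @_;
    my $unit = "class A { void f() { if (x) { y(); } } int z; };\n";
    return $unit x $copies;
}

my $input = make_sample(200);    # roughly 10K characters

# Collect every top-level "{}" block and time the whole scan.
my $start = time;
my @blocks;
while ( $input =~ /$curly_re/g ) {
    push @blocks, $+{text};
}
my $elapsed = time - $start;

printf "matched %d top-level blocks in %.4f seconds\n",
    scalar(@blocks), $elapsed;
```

The same loop can be pointed at a Regexp::Grammars version of the pattern to compare the two on identical input; that side is left out here so the sketch runs on a stock perl 5.10 or later with no extra modules.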
perl -V:
Summary of my perl5 (revision 5 version 10 subversion 0) configuration:
Platform:
osname=solaris, osvers=2.9, archname=sun4-solaris-64int
uname='sunos sundev32 5.9 generic_118558-18 sun4u sparc sunw,sun-fire '
config_args=''
hint=recommended, useposix=true, d_sigaction=define
useithreads=undef, usemultiplicity=undef
useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
use64bitint=define, use64bitall=undef, uselongdouble=undef
usemymalloc=n, bincompat5005=undef
Compiler:
cc='/opt/SUNWspro/bin/cc', ccflags ='-I/bbs/opt/include -I/opt/swt/include -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
optimize='-O',
cppflags='-I/bbs/opt/include -I/opt/swt/include -I/usr/local/include
-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64'
ccversion='Sun C 5.5 Patch 112760-09 2004/03/31', gccversion='',
gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=87654321
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
ivtype='long long', ivsize=8, nvtype='double', nvsize=8,
Off_t='off_t', lseeksize=8
alignbytes=8, prototype=define
Linker and Libraries:
ld='/opt/SUNWspro/bin/cc', ldflags ='-L/bbs/opt/lib -L/opt/swt/lib -L/usr/lib -L/usr/ccs/lib -L/bb/util/common/studio8-v3/SUNWspro/prod/lib -L/usr/local/lib '
libpth=/bbs/opt/lib /opt/swt/lib /usr/lib /usr/ccs/lib
/bb/util/common/studio8-v3/SUNWspro/prod/lib /usr/local/lib
libs=-lsocket -lnsl -lgdbm -ldl -lm -lc
perllibs=-lsocket -lnsl -ldl -lm -lc
libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
cccdlflags='-KPIC', lddlflags='-G -L/bbs/opt/lib -L/opt/swt/lib -L/usr/lib -L/usr/ccs/lib -L/bb/util/common/studio8-v3/SUNWspro/prod/lib -L/usr/local/lib'
Characteristics of this binary (from libperl):
Compile-time options: PERL_DONT_CREATE_GVSV PERL_MALLOC_WRAP
PERL_USE_SAFE_PUTENV USE_64_BIT_INT
USE_LARGE_FILES
USE_PERLIO
Built under solaris
Compiled at Jun 17 2008 17:56:25
%ENV:
PERL5LIB="/bb/util/common/perlmod/lib/site_perl"
@INC:
/bb/util/common/perlmod/lib/site_perl
/bbs/opt/perl-5.10.0/lib/5.10.0/sun4-solaris-64int
/bbs/opt/perl-5.10.0/lib/5.10.0
/bbs/opt/perl-5.10.0/lib/site_perl/5.10.0/sun4-solaris-64int
/bbs/opt/perl-5.10.0/lib/site_perl/5.10.0
.