::File uses a tremendous number of macros and uses Perl_get_context().
Try running it through the C preprocessor and then through a code
formatter, and it will go 2-3 pages off to the left. Under -O1 with VS
2003 generating 32-bit code, I got a 49 KB DLL with Perl 5.17.2. My
stock 32-bit ActivePerl 5.10 ::File DLL is 85 KB. Two XS functions (the
A and W variants) were each ~0x9D0 bytes of machine code on my Perl
5.17.2, nearly a whole page. A C compiler cannot optimize away ST(X)
between function calls: how can it know that the Perl stack slot
(SV **) or my_perl->base_of_stack didn't change between function calls?
A macro is always equal to its full expansion, and I don't think any
compiler will automatically turn a macro into a function call (maybe
GCC will at some C89/C99-breaking -O level, IDK; VC absolutely won't).
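To illustrate the reload problem, here is a minimal sketch in plain C. It
does not use the real perl API: mock_sv, mock_interp, MOCK_ST and
mock_grow_stack are all hypothetical stand-ins. The macro version has to
re-read the stack base after the call because the call may have moved the
stack, while the version that keeps the SV pointer in a local does not
touch the stack again.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, NOT the real perl types. */
typedef struct mock_sv { long iv; } mock_sv;

typedef struct mock_interp {
    mock_sv **stack_base;   /* plays the role of the Perl stack base */
    size_t    stack_len;
} mock_interp;

/* Like ST(n): every use expands to a fresh load through the interp. */
#define MOCK_ST(interp, n) ((interp)->stack_base[(n)])

/* Set up a two-slot stack with `first` in slot 0; returns 0 on success. */
static int mock_interp_init(mock_interp *interp, mock_sv *first)
{
    interp->stack_len = 2;
    interp->stack_base = calloc(interp->stack_len, sizeof *interp->stack_base);
    if (!interp->stack_base)
        return -1;
    interp->stack_base[0] = first;
    return 0;
}

/* Any call into the interpreter might move the stack like this. */
static void mock_grow_stack(mock_interp *interp)
{
    size_t new_len = interp->stack_len * 2;
    mock_sv **moved = calloc(new_len, sizeof *moved);
    if (!moved)
        abort();
    memcpy(moved, interp->stack_base, interp->stack_len * sizeof *moved);
    free(interp->stack_base);
    interp->stack_base = moved;
    interp->stack_len = new_len;
}

/* Macro style: the stack base must be reloaded after the call. */
static long sum_with_macro(mock_interp *interp)
{
    long total = MOCK_ST(interp, 0)->iv;
    mock_grow_stack(interp);
    total += MOCK_ST(interp, 0)->iv;
    return total;
}

/* Local style: the SV pointer lives in a C auto (or a register). */
static long sum_with_local(mock_interp *interp)
{
    mock_sv *sv0 = MOCK_ST(interp, 0);
    long total = sv0->iv;
    mock_grow_stack(interp);
    total += sv0->iv;       /* the SV itself did not move */
    return total;
}
```

Both functions return the same value; the difference is only in how many
times the stack has to be consulted.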
I am around 60% done with my cleanup as of today. The ::File DLL is
currently at 21 KB. I have decided to move the macros into function
calls, AKA typemap helpers. Each typemap helper is static, of course,
and does not call any other function in the ::File DLL. In the ideal
branch, a typemap helper makes no function calls into the perl interp;
in the second-best branch it makes 1 function call (2uv, 2iv, av_len,
etc). The macros inside each typemap helper were not split off into
their own calls, so that the compiler can optimize by merging and
reordering all the different flag and SV type checks on the SV head
(I checked this by looking at the disassembly). There was also a
questionable check of the return value of SvPV_nolen for a NULL PV,
which resulted in 2 2pv() calls because the SvPV_nolen macro was used
twice.
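The double 2pv() call can be sketched in plain C as follows. Everything
here is a hypothetical mock (mock_sv, mock_sv_2pv, MOCK_SvPV_nolen), not
the real perl macros, and the mock conversion deliberately does not cache
its result, which is the worst case. Writing the macro twice costs two
out-of-line calls on the slow path; evaluating it once into a local costs
one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an SV with an optional cached string. */
typedef struct mock_sv {
    int         pok;   /* nonzero when pv can be used directly */
    const char *pv;
} mock_sv;

static int mock_2pv_calls = 0;   /* counts out-of-line conversions */

/* Stand-in for the out-of-line string conversion (assumed uncached). */
static const char *mock_sv_2pv(mock_sv *sv)
{
    ++mock_2pv_calls;
    return sv->pv ? sv->pv : "";
}

/* Fast path inline, slow path is a function call. */
#define MOCK_SvPV_nolen(sv) ((sv)->pok ? (sv)->pv : mock_sv_2pv(sv))

/* Old style: the macro is written twice, so it expands twice. */
static size_t len_macro_twice(mock_sv *sv)
{
    if (MOCK_SvPV_nolen(sv) == NULL)
        return 0;
    return strlen(MOCK_SvPV_nolen(sv));
}

/* Helper style: evaluate once and keep the result in a local. */
static size_t len_helper(mock_sv *sv)
{
    const char *pv = MOCK_SvPV_nolen(sv);
    return pv ? strlen(pv) : 0;
}
```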
Another optimization is saving the SV *s on the C stack or in a
register, rather than writing ST(X) multiple times in the CODE: section.
The old ::File often uses INPUT: and OUTPUT: section initializers that
override the typemap. In those cases, I will use the C stack SV * rather
than ST(X). I have no control over the ST(X) used for $arg in the
OUTPUT: boilerplate code (as opposed to a typemap override initializer)
or in the native typemap code.
Saving the SV *s on the C stack poses a design difficulty. I thought of
4 ways to do it; case 4 is my preferred one. I will assume that the
croak parameter names, and their C Hungarian prefixes, must stay the
same. The croak parameter names are determined from the INPUT/XSUB
prototype names. If someone thinks that the parameter names can change
and lose their C-side Hungarian prefixes, tell me and I will do that.
case 1:
If the INPUT: section entry, for example, is changed from
    DWORD uShare
to
    SV * uShare
so that a usage croak still tells the user that Share is an unsigned
integer, what will the real C auto DWORD var be called? _uShare?
real_uShare? uuShare?
case 2:
Change
    DWORD uShare
to
    SV * svShare
or
    SV * Share
Now the usage croak doesn't say what Share is, or shows just a confusing
"sv" prefix. DWORD uShare is now declared in a PREINIT: section.
case 3:
Use (...), write the usage croak check against items by hand
(copy-paste from File.c), and process all input parameters by hand in C
(copy-paste from File.c), not using XS's typemap system. I dislike this
approach, since I am trying to keep XS sub definition changes to a
minimum to reduce the number of bugs. It also increases maintenance
difficulty in the future.
case 4:
This is what I hacked together, and it solves the usage croak parameter
name problem. I need to clear this with p5p/ParseXS's authors to make
sure that it will remain forwards compatible indefinitely somehow.
This is the typemap definition; it saves the SV * on the C stack and
allows the C var to be the usage croak parameter.
T_UV
".(unshift($ExtUtils::ParseXS::VERSION >= 3.01 ? $self->{line} :
@{eval('\@line')}, (
'PREINIT:',
' SV * sv'.$var.';',
'INPUT:')),'').
"{sv${\$var} = $arg;
$var= INT2PTR($type, IntIn(aTHX_ sv${\$var}, 1).uv);}
It works fine at the moment (the SV is called "svuShare", slightly
awkward, but it makes sense: an SV and an unsigned int), but as you
see, it uses undocumented data of ParseXS and is subject to breakage.
The commit responsible was
http://perl5.git.perl.org/perl.git/commit/879310359dd0a26e227299023420b4cc6501f6b0
which is part of a large series of commits in 2011 to clean up ParseXS
to be OOP instead of functions with globals. Worst case, the PREINIT
sections are written by hand in every XSUB to declare the SV * autos,
but then the XSUBs change, which is something I am trying to keep to a
minimum, as I said before.
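For readers wondering what a helper like the IntIn call in the typemap
might look like, here is a rough plain-C sketch. It is not the actual
IntIn code: the names, the flag bits and the union layout are
assumptions, and the second argument of the real helper is left out
because its meaning is not shown. The sketch only illustrates the shape
described earlier: one merged flag test, no calls in the ideal branch,
and one out-of-line call otherwise.

```c
/* Hypothetical flag bits, NOT the real SV flags. */
#define MOCK_IOK  0x1u   /* integer value is valid */
#define MOCK_ISUV 0x2u   /* integer value is unsigned */

/* The helper returns a union so the typemap can pick .iv or .uv. */
typedef union mock_int {
    long          iv;
    unsigned long uv;
} mock_int;

/* Hypothetical stand-in for an SV head plus its integer body. */
typedef struct mock_sv {
    unsigned flags;
    mock_int value;
} mock_sv;

static int mock_2uv_calls = 0;   /* counts out-of-line conversions */

/* Stand-in for the interpreter's out-of-line integer conversion. */
static unsigned long mock_sv_2uv(mock_sv *sv)
{
    ++mock_2uv_calls;
    return sv->value.uv;
}

/* Static typemap helper: a single combined flag check up front. */
static mock_int mock_int_in(mock_sv *sv)
{
    mock_int out;
    if ((sv->flags & (MOCK_IOK | MOCK_ISUV)) == (MOCK_IOK | MOCK_ISUV))
        out.uv = sv->value.uv;       /* ideal branch: no function calls */
    else
        out.uv = mock_sv_2uv(sv);    /* second-best branch: one call */
    return out;
}
```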
In some places, ::File declared a return type just for the C RETVAL, but
never used it in the OUTPUT section and instead did an XS_RETURN* with
the RETVAL. That was changed to a void CODE: with RETVAL declared in a
PREINIT:. If the XS_RETURN line is ever removed, a freed scalar will
appear as the return value on the Perl language side, followed by a
bizarre copy panic/crash. "ST(0) = sv_newmortal();" is not emitted by
XSPP if RETVAL isn't in OUTPUT:.
I will also research whether it's possible to merge some of the A and W
calls into ALIAS XSUBs with no side effects on the Perl language side.
I currently have my code as a fork of
http://github.com/chorny/Win32API-File . Is this correct? I plan to put
all the changes into 1 git patch. Do you want the patch as a git patch
against github, a git patch against perl interp blead, or something
else? Do you want a 60% done patch just for review and comments right
now, or a 60% done patch to apply to some git repo? Any other comments?