Bug #78600 for Win32API-File: refactoring Win32API-File's XS code

Thu Jul 26 19:35:03 2012 BULKDD [...] cpan.org - Ticket created

Subject:

refactoring Win32API-File's XS code

I see numerous problems in the XS code of Win32API::File. Does anyone have objections to me refactoring the XS code in Win32API::File? I dont plan do any changes in ::File's public API and very minor changes in ::File's internal API.

Mon Jul 30 15:21:47 2012 BULKDD [...] cpan.org - Correspondence added

::File uses a tremendous amount of macros and uses Perl_get_context() Try running it through the C preprocessor and then through a code formatter, and it will go 2-3 pages off to the left. Under -O1 with VS 2003 32 bit code, I got a 49 KB DLL with Perl 5.17.2. My ActivePerl 5.10 32 bits stock ::File DLL is 85 KB. Two (A and W variants) XS functions were ~0x9D0 bytes long each of machine code on my Perl 5.17.2, nearly a whole page. A C compiler can not optimize away ST(X) between function calls, how can the C compiler know the Perl stack slot (SV **) or my_perl->base_of_stack didn't change between function calls? A macro is always equal to its full expansion, and I dont think any compiler will automatically function call a macro (maybe GCC will with a C89/C99 breaking -O level, IDK, VC absolutely wont). I am around 60% done in my cleanup as of today. The ::File DLL is currently at 21 KB (at 60% done). I have decided to move the macros into function calls, AKA typemap helpers. Each typemap helper is a static of course, and does not call any other functions in ::File DLL. Each typemap helper, in the ideal branch taken, makes no function calls to the perl interp, or 2nd best branch, 1 function call (2uv, 2iv, av_len, etc). The macros inside of each typemap helper were not split off into their own calls so the compiler can optimize by merging all the different flag and SV type checks and reordering them in the SV head together (checked by looking at the disassembly code). There was also a checking of return of SvPV_nolen for a null PV check that was questionable (with 2 2pv() calls as a result of the SvPV_nolen macro being used twice). Another optimization is, saving the SV *s on the C stack/to a register. rather than writing ST(X) multiple times in the CODE: section. The old ::File often uses INPUT: and OUTPUT: section initializers that override the typemap. In this case, I will use the C stack SV * rather than the ST(X). The ST(X) for $arg in the OUTPUT: boilerplate code (not typemap override initializer), and when using the native typemap code, I have no control over it. To save the SV *s on the C stack, I have some design difficulty, I thought of 4 ways to do it, case 4 is my preferred. I will assume that the croak parameter names, and their C hungarion prefixes, must stay the same, the croak parameter names are determined from the INPUT/XSUB prototype names. If someone thinks that the parameter names can change and loose their C side hungarian prefixes, tell me and I will do that. case 1: If the INPUT: section, for example, is changed DWORD uShare to SV * uShare So that in a usage croak, it still tells the user that Share is an unsigned integer, what will the real C auto DWORD var be called? _uShare? real_uShare? uuShare? case 2: DWORD uShare to SV * svShare or SV * Share Now the usage croak doesn't say what Share is, just a confusing "sv" prefix. DWORD uShare is now in a PREINIT: section. case 3: Use (...), write usage croak check against items by hand (copy paste from File.c) and process all in parameters by hand in C (copy paste from File.c), not using XS's typemap system. I dislike this approach, since I am trying to keep XS sub definition changes to a minimum to reduce the number of bugs. Also increased maintenance difficulty in the future. case 4: This is what I hacked together which solves the usage croak parameter name problem, I need to clear this with p5p/ParseXS's authors to make sure that this will remain forwards compatible indefinitely somehow. This is the typemap definition, it will save the SV * on the C stack and allows the C var to be the usage croak parameter. T_UV ".(unshift($ExtUtils::ParseXS::VERSION >= 3.01 ? $self->{line} : @{eval('\@line')}, ( 'PREINIT:', ' SV * sv'.$var.';', 'INPUT:')),''). "{sv${\$var} = $arg; $var= INT2PTR($type, IntIn(aTHX_ sv${\$var}, 1).uv);} It works fine at the moment (the SV is called "svuShare", slightly awkward, but it makes sense, SV and a unsigned int), but as you see, it is using undocumented data of ParseXS and it subject to breakage, the commit responsible was http://perl5.git.perl.org/perl.git/commit/879310359dd0a26e227299023420b4cc6501f6b0 which is part of a large series of commit in 2011 to cleanup of ParseXS to be OOP instead of functions with globals. Worst case, the PREINIT sections are written by hand in every XSUB to declare the SV * autos, but then the XSUBs change which is something I am trying to keep the a minimum as I said before. In some places, ::File declared a return type just for the C RETVAL, but never used it in the OUTPUT section and instead did a XS_RETURN* with the RETVAL, that was fixed to be a void CODE: and a PREINIT: RETVAL. If the XS_RETURN line is ever removed, a freed scalar will appear as the return value in Perl language and subsequent bizarre copy panic/crash. "ST(0) = sv_newmortal();" is not placed by XSPP if RETVAL isn't in OUTPUT:. I also will also research if its possible to merge some of the A and W calls into ALIAS XSUBs with no Perl language side, side effects. I currently have my code as a fork of http://github.com/chorny/Win32API-File . Is this correct? I plan to put all the changes into 1 git patch. Do you want the patch as a git patch against github or a git patch against perl interp blead or something else? do you want a 60% done patch just for review and comments right now, or a 60% done patch to apply to some git rep? Any other comments?

Mon Jul 30 15:21:49 2012 BULKDD [...] cpan.org - Status changed from 'new' to 'open'

Tue Jul 31 13:46:12 2012 tyemq [...] cpan.org - Correspondence added

Thanks for providing detailed information so an informed reply is possible. The numerous problems appear to mostly boil down to optimization of the resulting machine code. Most of these changes sound reasonable. The macros were likely slightly easier to reuse in other modules but the cost there is surely very minor. So changing to functions sounds worthwhile, especially since some improvements in correctness are likely also included. In general, I am not against optimizations but am very wary about adopting optimizations that decrease code clarity and will usually just reject optimizations that decrease correctness. In this particular case, Win32API::File is a thunk layer between a full-sized interpreter and calls that mostly deal with file systems (and not the most commonly used calls). So there is relatively little real benefit to optimizing the flea so it doesn't weigh down the dog, especially when this flea is usually used only a relatively few times and mostly only when the dog needs to wait for something relatively slow. So I'm most appreciative of any improvements to correctness that you offer. Please be careful to not become too focused on efficiency and end up reducing correctness (it doesn't sound like you have). On Mon Jul 30 15:21:47 2012, BULKDD wrote: Show quoted text

> To save the SV *s on the C stack, I have some design difficulty

All of your proposed approaches to this particular aspect appear to pose significant problems to code clarity. I think these optimizations clearly fail to justify such a cost. Show quoted text

> T_UV > ".(unshift($ExtUtils::ParseXS::VERSION >= 3.01 ? $self->{line} : > @{eval('\@line')}, ( > 'PREINIT:', > ' SV * sv'.$var.';', > 'INPUT:')),''). > "{sv${\$var} = $arg; > $var= INT2PTR($type, IntIn(aTHX_ sv${\$var}, 1).uv);}

I don't really care how ugly or bloated the machine code ends up compared to making the code the human must read bloated and ugly. The uncertainty of the long-term validity of the approach makes it even less desirable. Perhaps you can work on changes to the XS pre-parser to make it simple to override the usage message without such complexity. Perhaps a 5th approach would be to just append "..." to the list of svBlah arguments (if XS supports such, I haven't dived into such details recently) so code readability is only slightly reduced while allowing argument parsing to pull out each SV* for you while also allowing a custom usage message to preserve the current information. I quite often use several of these routines interactively for ad-hoc things where I will call one with no arguments to see how I need to use it. So I want the usage message to continue to be enough to convey how to call the method correctly. I don't find the cost of a few C array look-ups per file system access to be enough to justify more than the most trivial in reductions of code readability, if that. Show quoted text

> I also will also research if its possible to merge some of the A and W > calls into ALIAS XSUBs with no Perl language side, side effects.

I'm not sure what you are describing there. I see value if having raw access to the A and W calls separately, even if there is some kind of smart wrapper that gives the power of the W calls in a relatively modern Perl. I have versions of such wrappers in versions of the module code that I have not brought to a point of release, so such is desired, just not at the expense of raw access. Show quoted text

> I currently have my code as a fork of > http://github.com/chorny/Win32API-File . Is this correct?

Yes. Show quoted text

> I plan to put > all the changes into 1 git patch.

You mean a single commit? It would be more useful as a collection of commits, breaking the work into cohesive chunks of work, especially since some changes may be immediately accepted and others might not. Show quoted text

> Do you want the patch as a git patch > against github or a git patch against perl interp blead or something > else?

A pull request on github would be most useful because you don't have to generate patches and I (or chorny) don't need to convert the patch into commits. Show quoted text

> do you want a 60% done patch just for review and comments right now

Sure. That'd be useful. Thanks, Tye

Sat Aug 04 14:32:15 2012 BULKDD [...] cpan.org - Correspondence added

On Tue Jul 31 13:46:12 2012, TYEMQ wrote: Show quoted text

> > do you want a 60% done patch just for review and comments right now

> > Sure. That'd be useful. > > Thanks, > Tye

This is my preliminary patch, its broken, dont apply it. I will delete the commit and repush it when I fix it. I have a feeling there is a bug in NumOut on x64 when in the caller it casts a DWORD * as a un/signed __int64 * (64 bit UV/IV) without 32 to 64 bit promotion with sign honoring as an argument before passing it to NumOut. The fix will probably be (psuedocode) "sizeof($type) == 8 ? NUMOUT_SZ8B : sizeof($type) == 4 ? NUMOUT_SZ4B : croak("impossible") | NUMOUT_OTHERFLAGS" which will constant fold away unless its an assert fail, with I32 and U32 union members added somewhere. https://github.com/bulk88/Win32API-File/commit/af1fc142d311cc3cd36e91b39fa81776cb5b66b2 Also DeviceIoControl's fixes were scrapped because they were too radically different and too large (maintainability as you call it) (I converted it to a PPCODE and manually updating all @_ SV). I will try something with less patch lines on DeviceIoControl eventually. Comments?

Sat Aug 04 14:46:49 2012 BULKDD [...] cpan.org - Correspondence added

Even with the promotion bug, it passes all its tests on x64 (probably a bad thing, not enough testing). The existing test files do not test every last ::File XSUB. I can't write tests for all the XSUBs due to time reasons, and some of the XSUBs can be quite dangerous (DeviceIoControl for example). But any tests I write for my own testing or my own curiosity I will publish. I already made 1 XSUB test in the 1st commit.

Sat Aug 04 15:07:55 2012 BULKDD [...] cpan.org - Correspondence added

On Tue Jul 31 13:46:12 2012, TYEMQ wrote: Show quoted text

> > I also will also research if its possible to merge some of the A and W > > calls into ALIAS XSUBs with no Perl language side, side effects.

> > I'm not sure what you are describing there. I see value if having raw > access to the A and W calls separately, even if there is some kind of > smart wrapper that gives the power of the W calls in a relatively modern > Perl. I have versions of such wrappers in versions of the module code > that I have not brought to a point of release, so such is desired, just > not at the expense of raw access.

The XSUBs for the A and W calls are exactly the same number of machine code bytes on ASM level and are identical on C and XS levels except for the function call/macro names (typemap helpers and the final call to kernel32). The only difference is a * 1 (CHAR) or a * 2 (WCHAR) in argument processing/buffer growing, to convert from bytes to characters and back. ::File does not do automatic UTF16 conversion for W functions. I dont plan to add that. There is also the problem of massive API breakage for existing people who already pass encoded UTF16 strings to ::File, so if automatic utf16 conversion from Perl UTF8 SVs or Perl ANSI SVs is added, it can't break existing PMs that use ::File internally and Encode themselves but the caller or another PM in the perl process does want to use AutoWide. That would have to be solved before any kind of AutoWide is introduced.