Subject: Win32::API doesn't keep C stack aligned on x64
After a long IRC conversation with someone running an x64 Win 7 machine
with Strawberry Perl and a MinGW-compiled Win32::API DLL, it was found
that a crash occurred when a movdqa was executed to save an XMM register
in a delay-load DLL tailmerge stub. movdqa requires a 16-byte-aligned
memory address, but it was found that (rsp+0x20)%16 == 8, i.e. unaligned.
Adding an extra 0 parameter to the prototype (and to the call) fixed the
crash, confirming that the C stack is misaligned when Win32::API calls a
function with 5 args.
Before:
____________________________________________________________________
my $acquireCtx = Win32::API->new('advapi32', 'CryptAcquireContext',
'PPPNN', 'I') or die;
$acquireCtx->Call($ctx, 0, 0, 1, hex('40') | hex('F0000000')) or die
"CryptAcquireContext failed";
____________________________________________________________________
After:
____________________________________________________________________
my $acquireCtx = Win32::API->new('advapi32', 'CryptAcquireContext',
'PPPNNN', 'I') or die;
$acquireCtx->Call($ctx, 0, 0, 1, hex('40') | hex('F0000000'), 0) or die
"CryptAcquireContext failed";
____________________________________________________________________
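This fixed the crash. However, the workaround would crash on 32 bit,
because the function is stdcall there and the callee pops only its 5
declared args. Disassembly of the tailmerge stub where the crash occurs: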
____________________________________________________________________
__tailMerge_CRYPTSP_dll:
000007FEFE6FDC70 48 89 4C 24 08 mov qword ptr [rsp+8],rcx
000007FEFE6FDC75 48 89 54 24 10 mov qword ptr [rsp+10h],rdx
000007FEFE6FDC7A 4C 89 44 24 18 mov qword ptr [rsp+18h],r8
000007FEFE6FDC7F 4C 89 4C 24 20 mov qword ptr [rsp+20h],r9
000007FEFE6FDC84 48 83 EC 68 sub rsp,68h
000007FEFE6FDC88 66 0F 7F 44 24 20 movdqa xmmword ptr [rsp+20h],xmm0 ;crash here
000007FEFE6FDC8E 66 0F 7F 4C 24 30 movdqa xmmword ptr [rsp+30h],xmm1
000007FEFE6FDC94 66 0F 7F 54 24 40 movdqa xmmword ptr [rsp+40h],xmm2
000007FEFE6FDC9A 66 0F 7F 5C 24 50 movdqa xmmword ptr [rsp+50h],xmm3
000007FEFE6FDCA0 48 8B D0 mov rdx,rax
000007FEFE6FDCA3 48 8D 0D 5E E1 08 00 lea rcx,[__DELAY_IMPORT_DESCRIPTOR_CRYPTSP_dll (7FEFE78BE08h)]
000007FEFE6FDCAA E8 01 DA 00 00 call __delayLoadHelper2 (7FEFE70B6B0h)
000007FEFE6FDCAF 66 0F 6F 44 24 20 movdqa xmm0,xmmword ptr [rsp+20h]
000007FEFE6FDCB5 66 0F 6F 4C 24 30 movdqa xmm1,xmmword ptr [rsp+30h]
000007FEFE6FDCBB 66 0F 6F 54 24 40 movdqa xmm2,xmmword ptr [rsp+40h]
000007FEFE6FDCC1 66 0F 6F 5C 24 50 movdqa xmm3,xmmword ptr [rsp+50h]
000007FEFE6FDCC7 48 8B 4C 24 70 mov rcx,qword ptr [rsp+70h]
000007FEFE6FDCCC 48 8B 54 24 78 mov rdx,qword ptr [rsp+78h]
000007FEFE6FDCD1 4C 8B 84 24 80 00 00 00 mov r8,qword ptr [rsp+80h]
000007FEFE6FDCD9 4C 8B 8C 24 88 00 00 00 mov r9,qword ptr [rsp+88h]
000007FEFE6FDCE1 48 83 C4 68 add rsp,68h
000007FEFE6FDCE5 EB 00 jmp __tailMerge_CRYPTSP_dll+77h (7FEFE6FDCE7h)
000007FEFE6FDCE7 FF E0 jmp rax
000007FEFE6FDCE9 90 nop
000007FEFE6FDCEA 90 nop
000007FEFE6FDCEB 90 nop
000007FEFE6FDCEC 90 nop
000007FEFE6FDCED 90 nop
000007FEFE6FDCEE 90 nop
______________________________________________________________________
This delay loading of cryptsp.dll doesn't happen on my x64 Server 2003
machine, so I can't reproduce this crash there. Due to MS's
reorganization of DLLs over the last couple of years, on Win 7
CryptAcquireContext is no longer implemented in advapi32.dll but in
cryptsp.dll, and is reached through the delay-load stub above. The Win64
ABI requires the C stack to be 16-byte aligned when a call is made.
I am creating this ticket as a reminder to myself.