Subject: ARM in-line assembler code for fixed_mul(fixed_t x, fixed_t y) doesn't match
Andy,
Image::Scale uses in-line assembler code for fixed-point multiplies in 19.12 format, but the code for ARM architectures doesn't produce results that are bit-identical to those of the code used on i386 or x86_64 platforms.
The difference is whether the last fractional bit is rounded ("yes" for ARM, but "no" for x86_64 and "no" for all platforms that use the fall-back C code).
Hence the Perl module tests for Image-Scale never pass on ARM platforms.
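To make the difference concrete, here is a minimal C sketch of the two behaviors. It is not copied from fixed.h: the typedef, the FRAC_BITS value of 12 (taken from the 19.12 format), and the function names fixed_mul_trunc and fixed_mul_round are assumptions for illustration only.

```c
#include <stdint.h>

#define FRAC_BITS 12            /* assumed: 19.12 format => 12 fractional bits */
typedef int32_t fixed_t;        /* assumed to match the typedef in fixed.h */

/* Truncating multiply (hypothetical name): the low FRAC_BITS bits of the
 * 64-bit product are simply dropped. Right-shifting a negative value is
 * implementation-defined in C, but is an arithmetic shift on mainstream
 * compilers. */
static fixed_t fixed_mul_trunc(fixed_t x, fixed_t y)
{
    int64_t p = (int64_t)x * (int64_t)y;
    return (fixed_t)(p >> FRAC_BITS);
}

/* Rounding multiply (hypothetical name): the last bit shifted out,
 * bit FRAC_BITS-1 of the product, is added back in. */
static fixed_t fixed_mul_round(fixed_t x, fixed_t y)
{
    int64_t p = (int64_t)x * (int64_t)y;
    return (fixed_t)((p >> FRAC_BITS) + ((p >> (FRAC_BITS - 1)) & 1));
}
```

For example, with x = 0x1800 (1.5) and y = 0x0001 (1/4096), the product is 0x1800; truncation returns 1 while rounding returns 2, so the two results differ in the last bit.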
This can be fixed by changing one character in the ARM in-line assembly code in ./include/fixed.h:
adc => add when combining the integer and fractional parts, as shown in the patch below.
--- fixed.h_old 2014-02-27 11:21:41.703063000 -0800
+++ fixed.h 2014-02-27 11:21:53.574703000 -0800
@@ -52,7 +52,7 @@
__asm__ __volatile__(
"smull %0, %1, %3, %4\n\t"
"movs %0, %0, lsr %5\n\t"
- "adc %2, %0, %1, lsl %6"
+ "add %2, %0, %1, lsl %6"
: "=&r" (__lo), "=&r" (__hi), "=r" (__result)
: "%r" (x), "r" (y), "M" (FRAC_BITS), "M" (32 - (FRAC_BITS))
: "cc"
);
return __result;
}
This is arguably less optimal (rounding would give better fidelity than truncation), but I would argue that passing the module tests and behaving identically on all machine architectures should take priority.
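For anyone who wants to check the patch without ARM hardware, the following is a rough portable C model of the three-instruction sequence. It is only a sketch: the function name arm_fixed_mul_model and the use_carry parameter are hypothetical, and FRAC_BITS is assumed to be 12. The carry flag left by "movs ... lsr" is the last bit shifted out, which "adc" adds to the result and "add" ignores.

```c
#include <stdint.h>

#define FRAC_BITS 12  /* assumed: 19.12 format */

/* Hypothetical model of the ARM sequence.
 * use_carry = 1 models the original "adc", use_carry = 0 the patched "add". */
static int32_t arm_fixed_mul_model(int32_t x, int32_t y, int use_carry)
{
    /* smull: signed 32x32 -> 64-bit product, split into low and high words */
    int64_t  prod = (int64_t)x * (int64_t)y;
    uint32_t lo   = (uint32_t)((uint64_t)prod & 0xFFFFFFFFu);
    uint32_t hi   = (uint32_t)((uint64_t)prod >> 32);

    /* movs lo, lo, lsr #FRAC_BITS: carry flag = last bit shifted out */
    uint32_t carry = (lo >> (FRAC_BITS - 1)) & 1u;
    lo >>= FRAC_BITS;

    /* adc/add result, lo, hi, lsl #(32 - FRAC_BITS) */
    uint32_t result = lo + (hi << (32 - FRAC_BITS)) + (use_carry ? carry : 0u);

    /* Conversion of out-of-range values to int32_t is implementation-defined,
     * but wraps as expected on mainstream compilers. */
    return (int32_t)result;
}
```

With use_carry = 0 the model returns the 64-bit product shifted right by FRAC_BITS, i.e. plain truncation; with use_carry = 1 it returns one more whenever bit 11 of the product is set.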