Bug #85314 for Math-Random-OO: Unneeded voodoo-math in Math::OO::Random::Normal

Wed May 15 08:40:04 2013 intvnut [...] gmail.com - Ticket created

Subject:	Unneeded voodoo-math in Math::OO::Random::Normal
Date:	Wed, 15 May 2013 07:39:45 -0500
To:	bug-Math-Random-OO [...] rt.cpan.org
From:	Joe Zbiciak <intvnut [...] gmail.com>

I tried sending this directly to dagolden, but it bounced. So, here it is as a bug report. Show quoted text

---------- Forwarded message ---------- From: Joe Zbiciak <intvnut@gmail.com> Date: Wed, May 15, 2013 at 7:31 AM Subject: Unneeded voodoo-math in Math::OO::Random::Normal David, I've been reading through the implementation of Math::OO::Random::Normal, and I think I see some mathematical voodoo in there. I want to see if you agree it's unneeded voodoo that can be removed. # _ltqnorm on (0,1) ranges from about -38 to 8, so we'll use # the bottom half and invert it for the top half my $norm = ( $rnd <= 0.5 ) ? _ltqnorm($rnd) : -( _ltqnorm(1-$rnd) ); Here's what I think is voodoo here: _ltqnorm can return the range -38 to -8 only for very small random numbers. By very small, I mean smaller than about 2^-53. You can see that in Peter's analysis of ltqnorm here: http://home.online.no/~pjacklam/notes/invnorm/ The comment in your code suggests that 1 - $rnd will somehow generate outputs in the range 8 .. 38, but I don't see how that's possible. Because $rnd is a double precision float (at least, I presume it is), for numbers in the range [0.5,1), the least significant bit of the mantissa (also called the "units of the last place" or "ulp") is 2^-53. The largest value of $rnd in [0.5,1) is 1.0 - 2^-53, so the smallest value 1-$rnd could have is 2^-53. That is, we never get into that long tail of 8 .. 38 this way. Depending on how rand() is implemented, we also probably don't get into the long negative take -38 to -8 *either*. At least, common implementations of rand() I've seen (I can't comment on Perl's—it may be better than this) take a uniformly distributed N-bit random integer, where N is on the order of 53 to 64 bits, typecast this double, and then adjust the mantissa. The smallest random number greater than zero this could get you is 2^-63, if I'm not mistaken. That'll get you a bit beyond -8. To get to -38, Peter's analysis indicates you need 2^-1073... which is rather smaller. :-) So, what that's saying is for practical random number generators, norm() won't return a value more than about 8 stdev from the mean, which as Peter's analysis indicates, is less than about 2^-53 in each tail (2^-52) total. All this is a long-winded way of saying that next() could have been written more simply as: sub next { my ($self) = @_; my $rnd = rand() || 1e-254; # can't have zero for normals return _ltqnorm( $rnd ) * $self->stdev + $self->mean; } And even if you don't buy all my math argument, you might look at the $plow / $phigh stuff in _ltqnorm itself, which basically does the "1 - $rnd" for you anyway. If $rnd < 0.02425, _ltqnorm uses it directly. If $rnd > 1-0.02425, _ltqnorm uses 1-$rnd. So on that basis alone, you should either remove the $phigh branch of _ltqnorm, or your own 1-$rnd. :-) (In case you're wondering why I even bothered: I'm writing an optimized _ltqnorm for a DSP, and I was trying to *really* understand what I was looking at.) —Joe -- We sell spatulas, and that's all! http://spatula-city.org/~im14u2c/ http://spacepatrol.info/ -- We sell spatulas, and that's all! http://spatula-city.org/~im14u2c/ http://spacepatrol.info/

Wed May 15 09:51:28 2013 dagolden [...] cpan.org - Correspondence added

Thank you. (Wow, that module is so old the email address is wrong!) Perl supports "long double" precision, so I think that would allow getting beyond 8 standard deviations. The possible voodoo problem I do see looking at it again is that I do the inversion before the multiplication. Better would be to multiply the result by the target standard deviation, then do the inversion, but I think with long doubles, it probably doesn't matter much one way or the other.

Wed May 15 09:51:29 2013 The RT System itself - Status changed from 'new' to 'open'

Wed May 15 10:21:53 2013 intvnut [...] gmail.com - Correspondence added

Subject:	Re: [rt.cpan.org #85314] Unneeded voodoo-math in Math::OO::Random::Normal
Date:	Wed, 15 May 2013 09:21:30 -0500
To:	bug-Math-Random-OO [...] rt.cpan.org
From:	Joe Zbiciak <intvnut [...] gmail.com>

Fair enough that perl supports long double, but that just pushes the boundary out a bit. It doesn't change the argument, only the details. You should be able to get above 8 stdev on the positive side, but also below -38 on the negative side.

The fundamental issue is this: the ulp for numbers in [0.5,1.0) drives the limited output range for the upper tail. Flipping the input with 1-$r doesn't achieve the stated goal, because the smallest value of 1-$r is bounded by that ulp, not the full dynamic range of the underlying floating point type.

And by inversion, do you mean negation? That's lossless.

--
Spatula-City.org: We sell spatulas and that's all.

From: David Golden via RT

Sent: Wednesday, May 15, 2013 8:51 AM

To: intvnut@gmail.com

Reply To: bug-Math-Random-OO@rt.cpan.org

Subject: [rt.cpan.org #85314] Unneeded voodoo-math in Math::OO::Random::Normal

<URL: https://rt.cpan.org/Ticket/Display.html?id=85314 >

Thank you. (Wow, that module is so old the email address is wrong!)

Perl supports "long double" precision, so I think that would allow getting beyond 8 standard deviations.

The possible voodoo problem I do see looking at it again is that I do the inversion before the multiplication. Better would be to multiply the result by the target standard deviation, then do the inversion, but I think with long doubles, it probably doesn't matter much one way or the other.

Wed May 15 17:19:04 2013 dagolden [...] cpan.org - Correspondence added

Ah. I see. I think to get 8+, I would have to divide p by 2 to get into the range of [0,.5) and then do a second random variable to coin flip it positive or negative. That would take advantage of the ability to denormalize the IEEE floats to get closer to zero. But that's probably too much work. Thanks for pointing this out.

Wed May 15 17:19:17 2013 dagolden [...] cpan.org - Severity Normal added

Wed May 15 23:19:09 2013 intvnut [...] gmail.com - Correspondence added

Subject:	Re: [rt.cpan.org #85314] Unneeded voodoo-math in Math::OO::Random::Normal
Date:	Wed, 15 May 2013 22:18:51 -0500
To:	bug-Math-Random-OO [...] rt.cpan.org
From:	Joe Zbiciak <intvnut [...] gmail.com>

I suppose if you really want to win the Thermonuclear Gnatswatter Award on this one, you could make a long-double rand() function that would guarantee to fill the 'chasm' between 2^-63 and -2^-63 by additional calls to the uniform unsigned-long-long random number generator. Something like this: (WARNING: Untested. Just intended to be a quick mockup.) /* Generate a random long double in the range (-0.5, 0.5). If the source random number generator returns 0, slide the scale down and try again. Repeat until we get a non-zero source random number to convert to float. This allows exponents as small as the floating point type supports. */ #include <float.h> #include <stdint.h> #define RAND_MASK ((~0ULL) >> (64 - LDBL_MANT_DIG)) extern uint64_t uniform_long_long_rand(); static const long double init_scale = 0.25L / (long double)(1ULL << (LDBL_MANT_DIG - 1)); static const long double iter_scale = 0.125L / (long double)(1ULL << (LDBL_MANT_DIG - 1)); long double zero_centered_long_double_rand(void) { uint64_t r; long double scale = init_scale; while ( (r = uniform_long_long_rand() & RAND_MASK) == 0 ) scale *= iter_scale; if (uniform_long_long_rand() & 1) scale = -scale; return (long double)r * scale; } With this random number generator, you would then return $r >= 0 ? _ltqnorm( $r ) : - _ltqnorm( -$r ) and get the full range of possibilities. :-) (And if that while() loop iterates more than 3 times for anyone, given a correct and properly functioning, properly seeded source random number generator, I will fly out and buy that person a beer or other beverage of their choice in person. ;-) ) —Joe On Wed, May 15, 2013 at 4:19 PM, David Golden via RT < bug-Math-Random-OO@rt.cpan.org> wrote: Show quoted text

> <URL: https://rt.cpan.org/Ticket/Display.html?id=85314 > > > Ah. I see. > > I think to get 8+, I would have to divide p by 2 to get into the range of > [0,.5) and then do a second random variable to coin flip it positive or > negative. > > That would take advantage of the ability to denormalize the IEEE floats to > get closer to zero. > > But that's probably too much work. > > Thanks for pointing this out. >

-- We sell spatulas, and that's all! http://spatula-city.org/~im14u2c/ http://spacepatrol.info/

Sun Jun 09 20:41:02 2013 dagolden [...] cpan.org - Correspondence added

Fixed and shipped to CPAN.

Sun Jun 09 20:41:02 2013 dagolden [...] cpan.org - Status changed from 'open' to 'resolved'

Bug #85314 for Math-Random-OO: Unneeded voodoo-math in Math::OO::Random::Normal

Preferred bug tracker