Bug #28203 for Test-LectroTest: Random test results PASS 150 : FAIL 2

Sat Jul 14 13:31:50 2007 ANDK [...] cpan.org - Ticket created

Subject: Random test results PASS 150 : FAIL 2

Now that I have tested T:LT 152 times and have two failures I believe I should let you know. Both fails were with different bleadperls and both were not reproducible. They look like real random results.

This was the first:

    t/gens............
    #   Failed test 'String() length under sizing [1..1] dist mean is 0.5 (z-score = -3.98)'
    #   at t/gens.t line 1149.
    #     '3.98295682672687'
    #         <
    #     '3.89'
    # Looks like you failed 1 test of 248.
    dubious
        Test returned status 1 (wstat 256, 0x100)
    DIED. FAILED test 174

and this was the second:

    t/gens............
    #   Failed test 'Float(sized=>0,range=>[-400,-200]) dist mean is -300 (z-score = -3.93)'
    #   at t/gens.t line 1149.
    #     '3.92902581170997'
    #         <
    #     '3.89'
    # Looks like you failed 1 test of 248.
    dubious
        Test returned status 1 (wstat 256, 0x100)
    DIED. FAILED test 55

Hope this helps,

Sat Jul 14 14:19:42 2007 tom [...] moertel.com - Correspondence added

From: tom [...] moertel.com

On Sat Jul 14 13:31:50 2007, ANDK wrote:

> Now that I have tested T:LT 152 times and have two failures I believe I should let you know. Both fails were with different bleadperls and both were not reproducible. They look like real random results.

Random failures now and again are to be expected when testing LectroTest's generators because I must necessarily test not only the generator code but also the underlying Perl random number generator; if either is broken, LT cannot do its job.

Unfortunately, to test for randomness, you need to take samples and make statistical inferences. The more samples you take, the more unlikely a false result becomes. So there's a tradeoff: you can make false failures (such as you encountered) less likely, but only by making the tests run much more slowly.

Already, the LT generator tests are painfully slow. Given that the false-failure rate is under 1 percent, I'd rather make people re-run the suite on the occasional failure than I would make *everybody* who runs the tests suffer with yet-even-slower tests.

See the documentation for the generator tests for more:

    $ perldoc t/gens.t

    Important: This test suite relies upon a number of randomized tests and statistical inferences. As a result, there is a small probability (about 1 in 200) that some part of the suite will fail even if everything is working properly. Therefore, if a test fails, re-run the test suite to determine whether the supposed problem is real or just a rare instance of the Fates poking fun at you.

Cheers,
Tom
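For readers unfamiliar with the kind of check that fails here, the following is a minimal Python sketch, not LectroTest's actual code. It assumes a two-sided z-test on a sample mean with the 3.89 cutoff that appears in the reported failure output. The function names, the sample size, the uniform generator, and the count of 50 checks are all illustrative assumptions.

```python
import math
import random
import statistics

Z_CUTOFF = 3.89  # cutoff visible in the reported failure output


def two_sided_tail(z):
    """Probability that a standard normal variate satisfies |Z| > z."""
    return math.erfc(z / math.sqrt(2))


def mean_z_score(samples, expected_mean):
    """z-score of the sample mean measured against the expected mean."""
    std_err = statistics.stdev(samples) / math.sqrt(len(samples))
    return (statistics.fmean(samples) - expected_mean) / std_err


def suite_false_failure_rate(per_check_rate, num_checks):
    """Chance that at least one of num_checks independent checks fails by luck."""
    return 1 - (1 - per_check_rate) ** num_checks


if __name__ == "__main__":
    rng = random.Random(28203)  # fixed seed so the demo is repeatable
    # Hypothetical stand-in for a generator under test: uniform floats in [-400, -200].
    samples = [rng.uniform(-400, -200) for _ in range(10_000)]
    z = mean_z_score(samples, expected_mean=-300)
    print(f"z = {z:.2f}, pass = {abs(z) < Z_CUTOFF}")

    p = two_sided_tail(Z_CUTOFF)
    print(f"per-check false-failure rate: {p:.2e}")
    # 50 is an assumed number of statistical checks, not a figure from the suite.
    print(f"suite-wide rate with 50 checks: {suite_false_failure_rate(p, 50):.4f}")
```

Under a normal approximation, a 3.89 cutoff corresponds to a per-check false-failure probability of roughly 1 in 10,000. How that aggregates into a suite-wide rate depends on how many such checks the suite runs, which this ticket does not state.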

Sat Jul 14 14:19:43 2007 The RT System itself - Status changed from 'new' to 'open'

Sat Jul 14 15:38:25 2007 andreas.koenig.7os6VVqR [...] franz.ak.mind.de - Correspondence added

CC:	ANDK [...] cpan.org
Subject:	Re: [rt.cpan.org #28203] Random test results PASS 150 : FAIL 2
Date:	Sat, 14 Jul 2007 21:37:53 +0200
To:	bug-Test-LectroTest [...] rt.cpan.org
From:	andreas.koenig.7os6VVqR [...] franz.ak.mind.de (Andreas J. Koenig)

>>>>> On Sat, 14 Jul 2007 14:19:45 -0400, "Tom Moertel via RT" <bug-Test-LectroTest@rt.cpan.org> said:

> <URL: http://rt.cpan.org/Ticket/Display.html?id=28203 >

> On Sat Jul 14 13:31:50 2007, ANDK wrote:

>> Now that I have tested T:LT 152 times and have two failures I believe I should let you know. Both fails were with different bleadperls and both were not reproducible. They look like real random results.

> Random failures now and again are to be expected when testing LectroTest's generators because I must necessarily test not only the generator code but also the underlying Perl random number generator; if either is broken, LT cannot do its job.

> Unfortunately, to test for randomness, you need to take samples and make statistical inferences. The more samples you take, the more unlikely a false result becomes. So there's a tradeoff: you can make false failures (such as you encountered) less likely, but only by making the tests run much more slowly.

> Already, the LT generator tests are painfully slow.

Agreed:)

> Given that the false-failure rate is under 1 percent, I'd rather make people re-run the suite on the occasional failure than I would make *everybody* who runs the tests suffer with yet-even-slower tests.

> See the documentation for the generator tests for more:

> $ perldoc t/gens.t

> Important: This test suite relies upon a number of randomized tests and statistical inferences. As a result, there is a small probability (about 1 in 200) that some part of the suite will fail even if everything is working properly. Therefore, if a test fails, re-run the test suite to determine whether the supposed problem is real or just a rare instance of the Fates poking fun at you.

How about printing this text out when the test fails? That would make the person running the test aware of what's up. The installer is not always the same person who has read the documentation.

Thanks for your time,
-- andreas

Thu Aug 30 16:08:13 2007 tom [...] moertel.com - Correspondence added

From: tom [...] moertel.com

On Sat Jul 14 13:31:50 2007, ANDK wrote:

> Now that I have tested T:LT 152 times and have two failures I believe I should let you know. Both fails were with different bleadperls and both were not reproducible. They look like real random results.

I'm not sure what you mean by "real random results". Do you mean that the results you observed are not consistent with my earlier explanation and deserve further investigation? Cheers, Tom

Thu Aug 30 17:33:55 2007 tom [...] moertel.com - Correspondence added

On Thu Aug 30 16:08:13 2007, TMOERTEL wrote:

> I'm not sure what you mean by "real random results". Do you mean that the results you observed are not consistent with my earlier explanation and deserve further investigation?

Please disregard my previous message. (I mistook an old email notification from RT as new and thought you had encountered new test failures.)

Thu Aug 30 17:46:34 2007 tom [...] moertel.com - Correspondence added

On Sat Jul 14 15:38:25 2007, andreas.koenig.7os6VVqR@franz.ak.mind.de wrote:

> How about printing this text out when the test fails? That would make the person running the test aware of what's up. The installer is not always the same person who has read the documentation.

Done! Test::LectroTest 0.3600 emits the following diagnostic message when its generator test-suite has a failure:

============================================================
IMPORTANT! A TEST FAILURE MAY NOT REPRESENT A REAL PROBLEM.
This test suite relies upon a number of randomized tests and statistical inferences. So, there is a small probability that some part of the suite will fail even if everything is actually fine. Therefore, re-run the test suite. You do not have a problem unless the suite fails repeatably.
============================================================

Thanks for the good idea!

Cheers,
Tom
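The idea of attaching the notice to a failing run can be sketched in a few lines. This is a language-neutral illustration in Python, not the code shipped in Test::LectroTest 0.3600; the `run_checks` helper and its `(name, callable)` input format are hypothetical.

```python
import sys

NOTICE = """\
============================================================
IMPORTANT! A TEST FAILURE MAY NOT REPRESENT A REAL PROBLEM.
This test suite relies upon a number of randomized tests and
statistical inferences. So, there is a small probability that
some part of the suite will fail even if everything is
actually fine. Therefore, re-run the test suite. You do not
have a problem unless the suite fails repeatably.
============================================================"""


def run_checks(checks, out=sys.stderr):
    """Run (name, callable) checks; print NOTICE once if any check fails.

    Hypothetical helper: each callable returns True on success.
    Returns the list of names of the failed checks.
    """
    failed = [name for name, check in checks if not check()]
    for name in failed:
        print(f"not ok - {name}", file=out)
    if failed:
        # The notice appears only on failure, so passing runs stay quiet.
        print(NOTICE, file=out)
    return failed


if __name__ == "__main__":
    run_checks([("always passes", lambda: True),
                ("always fails", lambda: False)])
```

Printing the notice only when something fails keeps the message in front of exactly the person who needs it: the installer looking at a failed run.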

Fri Aug 31 12:44:29 2007 tom [...] moertel.com - Correspondence added

With release 0.3600 hitting CPAN, I'm resolving this ticket. In sum, we must live with the possibility of false test failures when testing the random-value generators, but now we warn users about it in a way that's hard to overlook.
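As a rough sanity check on the numbers in this ticket, the short Python sketch below asks how surprising 2 failures in 152 runs would be if the documented rate of "about 1 in 200" held exactly. It assumes independent runs and a simple binomial model; the function name is illustrative.

```python
import math


def prob_at_least(k, runs, rate):
    """P(at least k false failures in `runs` independent runs at the given rate)."""
    below = sum(
        math.comb(runs, i) * rate**i * (1 - rate) ** (runs - i)
        for i in range(k)
    )
    return 1 - below


if __name__ == "__main__":
    runs, failures = 152, 2  # figures from the original report
    rate = 1 / 200           # "about 1 in 200" from the t/gens.t documentation
    print(f"expected false failures: {runs * rate:.2f}")
    print(f"P(>= {failures} failures): {prob_at_least(failures, runs, rate):.3f}")
```

Under these assumptions the probability comes out in the neighborhood of 0.18, so two failures in 152 runs is unremarkable and consistent with the explanation given in the thread.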

Fri Aug 31 12:44:31 2007 tom [...] moertel.com - Status changed from 'open' to 'resolved'