
This queue is for tickets about the Statistics-Discrete CPAN distribution.

Report information
The Basics
Id: 110211
Status: new
Priority: 0/
Queue: Statistics-Discrete

People
Owner: Nobody in particular
Requestors: seth.williams [...] galaxysemi.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: error code 25 returned with message Can't take sqrt of -2.22045e-16
Date: Tue, 8 Dec 2015 14:45:27 -0700
To: <bug-Statistics-Discrete [...] rt.cpan.org>
From: "Seth Williams" <seth.williams [...] galaxysemi.com>
I think the calculation of variance is incorrect. The variance should NEVER be negative. I'm no statistician, but below are the edits I made, which I think give the more correct calculation and avoid a negative variance. I left the original lines commented out. Can someone verify my changes?

# Reason for changes:
From what I read online, especially at this site (http://www.statcan.gc.ca/edu/power-pouvoir/ch12/5214891-eng.htm#a2), when using a frequency table you should square the difference between each result and the mean. This avoids the negative variance and seems to be the correct calculation.

# Summary of changes:
I changed this line:
    $cumul_value += ($v**2) * $self->{"data_frequency"}{$v};
To this:
    $cumul_value += (($v - $mean)**2) * $self->{"data_frequency"}{$v};

I changed this line:
    $self->{"stats"}{"Desc"}{"variance"} = $square_mean - ($mean**2);
To this:
    $self->{"stats"}{"Desc"}{"variance"} = $square_mean;

# Entire subroutine variance() from Discrete.pm:
    sub variance {
        my $self = shift;
        if(!defined($self->{"stats"}{"Desc"}{"variance"})) {
            my $mean = $self->mean();
            my $count = $self->count();
            my $cumul_value = 0;
            my $square_mean = 0;
            my $v;
            # key is the measurement, value is the number of occurrences
            foreach $v(keys %{$self->{"data_frequency"}}) {
                #$cumul_value += ($v**2) * $self->{"data_frequency"}{$v};
                $cumul_value += (($v - $mean)**2) * $self->{"data_frequency"}{$v};
            }
            if($count > 0) {
                $square_mean = $cumul_value / $count;
            }
            #$self->{"stats"}{"Desc"}{"variance"} = $square_mean - ($mean**2);
            $self->{"stats"}{"Desc"}{"variance"} = $square_mean;
        }
        return $self->{"stats"}{"Desc"}{"variance"};
    }
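
A note for anyone verifying the change: the original expression (mean of squares minus the squared mean) and the proposed one (mean of squared deviations) are algebraically the same population variance. The reported value, -2.22045e-16, is about the size of double-precision machine epsilon, which suggests rounding in the final subtraction rather than a wrong formula. The sketch below compares the two forms on a small frequency table. It is only an illustration: the helper names `freq_mean`, `variance_one_pass` and `variance_two_pass` are hypothetical and are not part of Statistics::Discrete, and the data is made up.

```perl
use strict;
use warnings;

# Hypothetical standalone helpers, not part of Statistics::Discrete.
# Each takes a hash ref mapping measurement => number of occurrences.

sub freq_mean {
    my ($freq) = @_;
    my ($sum, $count) = (0, 0);
    for my $v (keys %$freq) {
        $sum   += $v * $freq->{$v};
        $count += $freq->{$v};
    }
    return $count > 0 ? $sum / $count : 0;
}

# Shortcut form: mean of squares minus squared mean.
# The final subtraction can round to a tiny negative number.
sub variance_one_pass {
    my ($freq) = @_;
    my $mean = freq_mean($freq);
    my ($sum_sq, $count) = (0, 0);
    for my $v (keys %$freq) {
        $sum_sq += ($v ** 2) * $freq->{$v};
        $count  += $freq->{$v};
    }
    return $count > 0 ? $sum_sq / $count - $mean ** 2 : 0;
}

# Deviation form: every term is a square times a count,
# so the result cannot go below zero.
sub variance_two_pass {
    my ($freq) = @_;
    my $mean = freq_mean($freq);
    my ($sum_dev, $count) = (0, 0);
    for my $v (keys %$freq) {
        $sum_dev += (($v - $mean) ** 2) * $freq->{$v};
        $count   += $freq->{$v};
    }
    return $count > 0 ? $sum_dev / $count : 0;
}

# Made-up frequency table: value 1 seen twice, 2 once, 4 once.
my %freq = (1 => 2, 2 => 1, 4 => 1);
printf "one-pass: %s\n", variance_one_pass(\%freq);
printf "two-pass: %s\n", variance_two_pass(\%freq);
```

On well-behaved data the two agree. The deviation form is generally the safer choice when the values are nearly identical, since it never subtracts two large, nearly equal quantities at the end.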