Subject: possible bug in Math::KullbackLeibler::Discrete
Date: Sun, 15 Dec 2013 22:17:47 -0500
To: "bug-math-kullbackleibler-discrete [...] rt.cpan.org" <bug-math-kullbackleibler-discrete [...] rt.cpan.org>
From: "Seideman, Jeremy" <JSeideman [...] gc.cuny.edu>
Hi--
I think I have identified a bug in this module. The line in Qline:
return exists($Q->{$i}) ? $Q->{$i} - $pc : $eps
does not seem consistent with the information on http://www.cs.bgu.ac.il/~elhadad/nlp09/KL.html which states that:
" P'(i) = P(i) - pc if i in SP
P'(i) = eps otherwise for i in SU - SP
and similarly for Q' and where pc and qc are computed so that sum(P'(i)) = 1.0 and sum(Q'(j)) = 1.0. "
For Q'(i), shouldn't it be Q(i) - qc if i in SQ? In other words, instead of using pc in the calculation of Q(i), shouldn't the calculation use qc? In fact, the example on that page indicates using qc for Q'.
I noticed that in some cases pc was greater than $Q->{$i}, so Q'(i) came out negative, which caused problems with the log function.
Instead, using the line:
return exists($Q->{$i}) ? $Q->{$i} - $qc : $eps;
eliminates those issues.
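Here is a small standalone sketch of the case I mean. It is not the module's code: the smooth helper and the example distributions are made up for illustration, and the correction is simply the value implied by the condition that the smoothed probabilities sum to 1.0.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): smooth one distribution
# over the universe of symbols, using that distribution's own correction.
sub smooth {
    my ($dist, $universe, $eps) = @_;
    my @missing = grep { !exists $dist->{$_} } @$universe;
    # sum(D') = 1.0 requires |SD| * c = eps * |SU - SD|
    my $c = $eps * scalar(@missing) / scalar(keys %$dist);
    my %smoothed;
    for my $i (@$universe) {
        $smoothed{$i} = exists $dist->{$i} ? $dist->{$i} - $c : $eps;
    }
    return (\%smoothed, $c);
}

# Made-up example: P has one symbol, Q has a rare symbol c.
my $eps = 0.01;
my $P   = { a => 1.0 };
my $Q   = { a => 0.5, b => 0.495, c => 0.005 };
my @SU  = qw(a b c);

my ($Pp, $pc) = smooth($P, \@SU, $eps);   # pc = 0.01 * 2 / 1
my ($Qp, $qc) = smooth($Q, \@SU, $eps);   # qc = 0.01 * 0 / 3

printf "pc = %.3f, qc = %.3f\n", $pc, $qc;
printf "Q(c) - pc = %.3f\n", $Q->{c} - $pc;   # negative: log would fail
printf "Q(c) - qc = %.3f\n", $Qp->{c};        # stays positive
```

With these numbers pc is larger than Q(c), so subtracting pc gives a negative Q'(c), while subtracting qc keeps it positive.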
Is this correct?
Thanks,
Jeremy