Subject: Bug in treecluster
I found some errant warning messages when running Algorithm::Cluster::treecluster. I was calling treecluster with the following parameters:
my %params = (
    applyscale => 0,
    transpose  => 0,
    method     => 'a',
    dist       => 'e',
    data       => \@distance,
    mask       => '',
    weight     => '',
);
Running the code produced several warning messages. I eventually traced the messages to validity tests for mask and weight in the sub data_is_valid_matrix() which is called by check_matrix_dimensions().
line ~93
module_warn( "Wanted array reference, but got a reference to ",
There was a similar warning for the weight validity test, again caused by the sub data_is_valid_matrix(). I corrected the problem by wrapping the weight and mask validity tests with an 'unless' in the sub check_matrix_dimensions():
unless ($param->{mask} eq '')
{
    ... do the validity test
}
unless ($param->{weight} eq '')
{
    ... do the validity test
}
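A minimal standalone sketch of the guard above, for illustration only. The helper name param_supplied() is hypothetical and not part of Algorithm::Cluster. It also covers one case the plain `eq ''` test does not: under `use warnings`, comparing an undefined value with `eq` emits its own "uninitialized value" warning, so the sketch checks definedness first.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Algorithm::Cluster): returns 1 when an
# optional parameter such as mask or weight was actually supplied.
sub param_supplied {
    my ($value) = @_;
    return 0 unless defined $value;    # missing key or explicit undef
    return 1 if ref($value);           # any reference counts as supplied
    return length($value) ? 1 : 0;     # '' means "not supplied"
}

# Illustrative parameters only; the real module receives these from the caller.
my $param = {
    data   => [ [ 1, 2 ], [ 3, 4 ] ],
    mask   => '',
    weight => undef,
};

for my $name (qw(data mask weight)) {
    if ( param_supplied( $param->{$name} ) ) {
        print "$name: would run the validity test\n";
    }
    else {
        print "$name: skipped\n";
    }
}
```

With this kind of check, an empty or missing mask or weight never reaches the matrix validity test, so no warning is raised for it.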
I then make sure to set the mask and weight parameters to '' when I pass them to the module. They end up being set to this eventually anyway, but the warnings caused much consternation. From the documentation, I think this is supposed to be the behavior.
The main problem seems to be that the warnings generated by data_is_valid_matrix() don't give any indication as to which matrix has a problem. It was impossible to know whether a warning was significant or just something to ignore. It would be nice to have a somewhat more informative message if you are going to do the validity check (although I don't see an easy way to do this at the moment).
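One possible way to make such a warning more informative is to pass the parameter name into the check and report where the check was called from. The sketch below is only an illustration under that assumption: check_array_ref() is a hypothetical function, not the module's data_is_valid_matrix(), and it tests only the top-level reference type.

```perl
use strict;
use warnings;

# Hypothetical check (not the module's own code): takes the parameter name
# as a label so the warning says which matrix failed.
sub check_array_ref {
    my ( $label, $value ) = @_;
    return 1 if ref($value) eq 'ARRAY';

    my $got = ref($value)
        ? 'a reference to ' . ref($value)
        : 'a non-reference value';

    # caller() gives the file and line of the code that invoked this check.
    my ( undef, $file, $line ) = caller;
    warn "Parameter '$label': wanted array reference, but got $got "
        . "(checked at $file line $line)\n";
    return 0;
}

# Illustrative parameters only.
my %params = (
    data   => [ [ 1, 2 ], [ 3, 4 ] ],
    mask   => '',
    weight => {},
);

for my $name (qw(data mask weight)) {
    check_array_ref( $name, $params{$name} );
}
```

Here the warnings for mask and weight each name the offending parameter and the calling line, while data passes silently.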
One final item: the C prototype accepts either a 'data' matrix or a 'distance' matrix. This is also true of the Python module. I spent quite a while messing around until I realized that there does not seem to be any distinction between the two in the Perl module (again, the nonspecific validity warnings were confusing).
It might be nice if treecluster.pm recognized 'distancematrix' as a valid parameter in addition to 'data'. It would also be nice if the warnings could reflect which test was failing (print the module line number?).
Thanks