Subject: Bug in open_dbf method
Date: Fri, 5 Jan 2007 15:45:00 +1000
To: <bug-Xbase [...] rt.cpan.org>
From: "Stephen Royce" <Stephen.Royce [...] nehta.gov.au>
Pratap,
I think there's a bug in the open_dbf method of the Xbase module. It is
dropping the last character off the field name. The code in question
(beginning at line 86 of v1.07) is:
    seek($self->{'DBFH'},($i-1)*32+32,0);
    read($self->{'DBFH'},$field_header,31);
    my($fname)=unpack("A*",substr($field_header,0,10));
    my($null_pos)=index($fname,chr(0));
    $self->{$fn}=substr($fname,0,$null_pos);
    $self->{$fname}=$i;
The unpack conversion uses 'A*', which strips trailing nulls and spaces.
This means that the index search for the terminating null (chr(0)) in
$fname finds nothing and returns -1, which, in turn, means that the
following substr removes the last character from $fname before
assigning it to $self->{$fn}.

Beyond the loss of the character in the name, I think this may also
have implications for the get_field method, since the field name held
in $self->{$fn} and the key of $self->{$fname} (i.e., $fname itself)
are not the same. I haven't needed this functionality, but scanning the
code suggests that the problem will only occur if you were to access
the object hash directly for the field names, looking up
$self->{f_name1}, $self->{f_name2}, etc. manually and then passing
those values to get_field at a later point.

Anyway, I tried this code myself and found it to work:
    seek($self->{'DBFH'},($i-1)*32+32,0);
    read($self->{'DBFH'},$field_header,31);
    my($fname)=unpack("A*",substr($field_header,0,10));
    #
    # The following code is included in the Xbase module, but commented
    # out here because of the following bug:
    #
    # The uppercase 'A' conversion letter in unpack returns the string
    # stripped of both nulls and spaces. This means that the index
    # search for chr(0) _always_ returns -1, which, in turn, causes the
    # following substr to drop the last character of the field name.
    #
    # my($null_pos)=index($fname,chr(0));
    # $self->{$fn}=substr($fname,0,$null_pos);
    #
    # Instead, we just assign the result of the unpack directly to the
    # field name element of the object hash.
    #
    $self->{$fn}=$fname;
    #
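
The truncation can be reproduced outside the module. The following is a minimal sketch, assuming a hypothetical 10-byte name slot holding "NAME" padded with nulls; the variable names are illustrative only and nothing here is taken from Xbase itself. It also shows the 'Z*' template, which keeps everything before the first null, as one possible alternative that is not part of the fix proposed above.

```perl
use strict;
use warnings;

# Hypothetical 10-byte field-name slot: "NAME" followed by null padding.
my $raw = "NAME" . ("\0" x 6);

# 'A*' strips trailing spaces and nulls, so no null is left to find.
my ($fname) = unpack("A*", $raw);
my $null_pos = index($fname, chr(0));    # -1, because nothing matches

# A negative length makes substr leave that many characters off the end.
my $truncated = substr($fname, 0, $null_pos);

# Alternative (an assumption, not from the report): 'Z*' returns the
# data up to the first null, so no index/substr step is needed.
my ($zname) = unpack("Z*", $raw);

print "fname=$fname null_pos=$null_pos truncated=$truncated zname=$zname\n";
```

Run as written, this prints "fname=NAME null_pos=-1 truncated=NAM zname=NAME".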
I hope this is useful.
Incidentally, I am using Xbase to read .dbf files and transfer the data
to a relational database (PostgreSQL in my case), creating new tables if
they don't already exist. It would be enormously useful if there were
some method(s) to extract the field details as (perhaps) an array of
hashes like this:
    [ { name => <f_name1>, type => <f_type1>, length => <f_len1>, scale => <f_ldec1> },
      { name => <f_name2>, type => <f_type2>, length => <f_len2>, scale => <f_ldec2> },
      .
      .
      .
    ]
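
To illustrate why this shape would be convenient for the table-creation step, here is a rough sketch that turns such a list into a CREATE TABLE statement. Everything in it is an assumption made for illustration: the sample descriptors, the helper names pg_type and create_table_sql, and above all the mapping from dBASE type codes to PostgreSQL column types. It does not quote identifiers, so it is not suitable for arbitrary field names.

```perl
use strict;
use warnings;

# Hypothetical field descriptors in the shape proposed above.
my $fields = [
    { name => 'ID',    type => 'N', length => 8,  scale => 0 },
    { name => 'PRICE', type => 'N', length => 10, scale => 2 },
    { name => 'TITLE', type => 'C', length => 40, scale => 0 },
    { name => 'ADDED', type => 'D', length => 8,  scale => 0 },
];

# Assumed mapping from dBASE type codes to PostgreSQL column types.
sub pg_type {
    my ($f) = @_;
    return "varchar($f->{length})"             if $f->{type} eq 'C';
    return "numeric($f->{length},$f->{scale})" if $f->{type} eq 'N';
    return 'date'                              if $f->{type} eq 'D';
    return 'boolean'                           if $f->{type} eq 'L';
    return 'text';    # fallback for memo and unrecognised types
}

# Build one column definition per descriptor, in file order.
sub create_table_sql {
    my ($table, $fields) = @_;
    my @cols = map { lc($_->{name}) . ' ' . pg_type($_) } @$fields;
    return "CREATE TABLE $table (\n    " . join(",\n    ", @cols) . "\n)";
}

print create_table_sql('example', $fields), "\n";
```

Because the descriptors arrive as an ordered list, the columns come out in the same order as the fields in the .dbf file without any extra bookkeeping.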
I tried to create a child class to provide this additional functionality
with this code:
    sub get_field_descriptor_arrayref
    {
        my ($self) = shift;
        my ($i);
        my $field_descriptor_arrayref = [];
        # Nothing to describe until a DBF file has been opened.
        if (not ($self->{'hasdbf'})) {
            carp "DBF file has not been opened\n";
            return undef;
        }
        for ($i=1; $i<=$self->{'num_fields'}; $i++)
        {
            # Keys under which the object hash holds field $i's details.
            my ($fn, $ft, $fd, $fl, $fld) = ("f_name$i", "f_type$i",
                "f_disp$i", "f_len$i", "f_ldec$i");
            # Append one descriptor hash per field, in file order.
            push(@$field_descriptor_arrayref,
                 { name => $self->{$fn}, type => $self->{$ft},
                   length => $self->{$fl}, scale => $self->{$fld} });
        }
        return $field_descriptor_arrayref;
    }
I haven't tested it very thoroughly yet (mainly because I can't actually
get the child class to work without significantly modifying the Xbase
module itself: there are problems with the file handle(s) and the class
that $self is blessed into), but when I incorporated it into the module
code, it seemed to work okay. I had also thought of providing the
option of returning the data as a hash keyed on field name, but it
seemed awkward to retain the field ordering, so I haven't bothered.
(You could include "position => $i", but, as I said, this seems awkward
and unnecessarily complicated.) If you think this is useful, please
feel free to incorporate the code as you think best. I have attached my
updated code to help.
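
For reference, the hash-keyed variant with a stored position might look roughly like the sketch below. The sample descriptors and variable names are hypothetical; the point is only to show the extra sort needed to get the fields back in file order, which is the awkwardness mentioned above.

```perl
use strict;
use warnings;

# Hypothetical descriptors in file order, in the array-of-hashes shape.
my $fields = [
    { name => 'TITLE', type => 'C', length => 40, scale => 0 },
    { name => 'ID',    type => 'N', length => 8,  scale => 0 },
];

# Key the descriptors by field name, recording each 1-based position.
my %by_name;
my $i = 0;
for my $f (@$fields) {
    $by_name{ $f->{name} } = { %$f, position => ++$i };
}

# Hash keys come back in no particular order, so the original field
# order has to be recovered by sorting on the stored position.
my @ordered = sort { $by_name{$a}{position} <=> $by_name{$b}{position} }
              keys %by_name;

print join(',', @ordered), "\n";
```

This prints "TITLE,ID", matching the order of the input list.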
Cheers
Stephen Royce
Senior Systems Specialist
Stephen.Royce@nehta.gov.au