
This queue is for tickets about the SOAP-Lite CPAN distribution.

Report information
The Basics
Id: 75374
Status: resolved
Priority: 0/
Queue: SOAP-Lite

People
Owner: Nobody in particular
Requestors: cmanley [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.69
Fixed in: 0.714



Subject: Strings of digits with preceding zeros (e.g. telnums) should be serialized as strings
See subject. This is to avoid mutilation of data such as telephone numbers, serial numbers, or any other such string of digits that may have preceding zero characters.

The regular expression fixes are simple with an added negative look-ahead for a "0", i.e. (?!0):

    # In custom SOAP child class (if you use one):
    sub new {
        my $self = shift;
        unless(ref($self)) {
            $self = $self->SUPER::new(@_);
            # ...snip...
            my $typelookup = $self->serializer()->typelookup();
            $typelookup->{'int'}->[1] = sub {$_[0] =~ /^(?!0)([+-]?\d{1,10})$/ && ($1 <= 2147483647) && ($1 >= -2147483648); };
            $typelookup->{'long'}->[1] = sub {$_[0] =~ /^(?!0)([+-]?\d{1,19})$/ && ($1 <= 9223372036854775807); };
            $typelookup->{'float'}->[1] = sub {$_[0] =~ /^(?!0)(-?(?:\d+(?:\.\d*)?|\.\d+|NaN|INF)|([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?)$/};
            # ...snip...
        }
        return $self;
    }
On Tue Feb 28 10:06:02 2012, CMANLEY wrote:
> See subject. This is to avoid mutilation of data such as telephone
> numbers, serial numbers, or any other such string of digits that may
> have preceding zero characters.
> The regular expression fixes are simple with an added negative
> look-ahead for a "0", i.e. (?!0):
>
> # In custom SOAP child class (if you use one):
> sub new {
>     my $self = shift;
>     unless(ref($self)) {
>         $self = $self->SUPER::new(@_);
>         # ...snip...
>         my $typelookup = $self->serializer()->typelookup();
>         $typelookup->{'int'}->[1] = sub {$_[0] =~ /^(?!0)([+-]?\d{1,10})$/ && ($1 <= 2147483647) && ($1 >= -2147483648); };
>         $typelookup->{'long'}->[1] = sub {$_[0] =~ /^(?!0)([+-]?\d{1,19})$/ && ($1 <= 9223372036854775807); };
>         $typelookup->{'float'}->[1] = sub {$_[0] =~ /^(?!0)(-?(?:\d+(?:\.\d*)?|\.\d+|NaN|INF)|([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?)$/};
>         # ...snip...
>     }
>     return $self;
> }
I just realized that with the (?!0) look-ahead a lone '0' would be serialized as a string, so I fixed the negative lookaheads in my fixes so that only a zero followed by more characters is excluded:

    $typelookup->{'int'}->[1] = sub {$_[0] =~ /^(?!0.+)([+-]?\d{1,10})$/ && ($1 <= 2147483647) && ($1 >= -2147483648); };
    $typelookup->{'long'}->[1] = sub {$_[0] =~ /^(?!0.+)([+-]?\d{1,19})$/ && ($1 <= 9223372036854775807); };
    $typelookup->{'float'}->[1] = sub {$_[0] =~ /^(?!0.+)(-?(?:\d+(?:\.\d*)?|\.\d+|NaN|INF)|([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?)$/};
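To see what the two look-ahead variants actually accept, here is a small standalone sketch. It copies the 'int' patterns from this ticket without the range checks, plus the second 'float' pattern. The helper name guess_type and the variable names are made up for illustration and are not part of SOAP::Lite.

```perl
use strict;
use warnings;

# The two 'int' checks proposed in this ticket, minus the range tests.
my $int_first  = qr/^(?!0)([+-]?\d{1,10})$/;
my $int_second = qr/^(?!0.+)([+-]?\d{1,10})$/;

# The second 'float' check from this ticket, copied as posted.
my $float_second = qr/^(?!0.+)(-?(?:\d+(?:\.\d*)?|\.\d+|NaN|INF)|([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?)$/;

# Hypothetical helper: report $type if the pattern accepts the value,
# otherwise fall back to 'string'.
sub guess_type {
    my ($pattern, $value, $type) = @_;
    return $value =~ $pattern ? $type : 'string';
}

for my $value ('0', '12345', '012312345', '-42', '+0123') {
    printf "%-12s first: %-6s second: %s\n",
        "'$value'",
        guess_type($int_first,  $value, 'int'),
        guess_type($int_second, $value, 'int');
}

# Side effect worth checking: the look-ahead also rejects ordinary
# decimals that start with a zero.
printf "%-12s float:  %s\n", "'0.5'", guess_type($float_second, '0.5', 'float');
```

Two things stand out when running this: a signed value such as '+0123' still passes both 'int' patterns because the look-ahead only inspects the first character, and '0.5' no longer passes the 'float' pattern because it is a zero followed by more characters.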
Hi,

this is not a bug, it is by design - a string consisting of digits is encoded as a number by default (which is usually correct, even with leading zeros - the string "0" usually means the number zero, and for all fixed-length numerical data preceding zeros are quite normal).

You can always set $soap->autotype(0) and assign types (in this case: string) manually. This should always be done for non-uniform data like telephone numbers.

Just an example: These three strings can all be (valid) telephone numbers (depending on their context):

    12345
    012312345
    ++49123456789

The first and second would (by default) be encoded as number, the third as string.

There's no way for SOAP::Lite (as for all computer programs) to reliably infer the meaning of input data from the data itself. The encoding SOAP::Lite chooses with autotype on is just an educated guess - and on not-so-easy-to-guess data like phone numbers, it just guesses wrong.

Best regards,

Martin
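For anyone landing here later, the manual typing suggested above could look roughly like the sketch below. It only builds a typed SOAP::Data value and makes no network call. The helper name as_soap_string, the element name phone, and the commented-out $endpoint and someMethod are placeholders, not anything from this ticket.

```perl
use strict;
use warnings;
use SOAP::Lite;

# Hypothetical helper: wrap a value so it carries an explicit 'string'
# type instead of relying on autotype guessing.
sub as_soap_string {
    my ($name, $value) = @_;
    return SOAP::Data->name($name => $value)->type('string');
}

my $phone = as_soap_string(phone => '012312345');
printf "%s is typed as %s\n", $phone->name, $phone->type;

# With a real client, guessing can also be switched off entirely:
#   my $soap = SOAP::Lite->proxy($endpoint);   # $endpoint is a placeholder
#   $soap->autotype(0);
#   $soap->call(someMethod => $phone);         # someMethod is a placeholder
```

Typing the value explicitly keeps the leading zero regardless of what the autotype rules in the installed version decide.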
On Wed Feb 29 07:24:58 2012, MKUTTER wrote:
> Hi,
>
> this is not a bug, it is by design - a string consisting of digits is
> encoded as a number by default (which is usually correct, even with
> leading zeros - the string "0" usually means the number zero, and for
> all fixed-length numerical data preceding zeros are quite normal).
>
> You can always set $soap->autotype(0) and assign types (in this case:
> string) manually. This should always be done for non-uniform data like
> telephone numbers.
>
> Just an example: These three strings can all be (valid) telephone
> numbers (depending on their context):
>
> 12345
> 012312345
> ++49123456789
>
> The first and second would (by default) be encoded as number, the third
> as string.
>
> There's no way for SOAP::Lite (as for all computer programs) to
> reliably infer the meaning of input data from the data itself. The
> encoding SOAP::Lite chooses with autotype on is just an educated guess -
> and on not-so-easy-to-guess data like phone numbers, it just guesses
> wrong.
I understand, and that's why the guess should be done as smartly as possible.

A Perl scalar like '012312345' can never be the result of a numeric calculation, nor can it be the result of fetching a value from a numeric database field, so you can safely assume it's a string for autotyping. Passing the scalar '012312345' as an 'int' back to a more strictly typed client based on a language such as Java, PHP, or .NET (or just about any other language besides Perl) will end up with the value being deserialized into an int and therefore losing the preceding zero(s).

I hope you understand and can fix the flaw in the design.

Regards,

Craig
I did some further inspection into the SOAP-Lite code:

I'm using version 0.69 and confirmed through testing the incorrect serialization behavior with this version. The autotype regexps in SOAP::Serializer for int and long are unchanged in the latest version, which led me to presume that the behavior persists in that version too.

However, I just noticed that the latest version contains an extra 'zerostring' autotype which is executed before the int and long autotypes. This effectively fixes the serialization issue in the way I wanted it to be. So you can consider this ticket as already fixed.

It is strange though that you rejected this requested serialization behavior when it is already built into the latest release of SOAP-Lite.

Thanks for looking anyway.
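As a rough illustration of why an earlier 'zerostring' check settles the matter, here is a simplified, standalone model of a priority-ordered lookup table. It is not the SOAP::Lite source: the priorities, the zerostring pattern, and the function name autotype_guess are assumptions made for this sketch, and it only models three types.

```perl
use strict;
use warnings;

# Simplified model: each entry is [priority, check, resulting type].
# Lower priority numbers are tried first; the first matching check wins.
# The numbers and the zerostring pattern are assumptions for this sketch.
my %typelookup = (
    zerostring => [12,  sub { $_[0] =~ /^0\d+$/ }, 'string'],
    int        => [20,  sub { $_[0] =~ /^([+-]?\d{1,10})$/
                              && $1 <= 2147483647
                              && $1 >= -2147483648 }, 'int'],
    string     => [100, sub { 1 }, 'string'],
);

# Hypothetical function: walk the table in priority order.
sub autotype_guess {
    my ($value) = @_;
    my @names = sort { $typelookup{$a}[0] <=> $typelookup{$b}[0] } keys %typelookup;
    for my $name (@names) {
        my (undef, $check, $type) = @{ $typelookup{$name} };
        return $type if $check->($value);
    }
    return 'string';
}

for my $value ('0', '12345', '012312345', '++49123456789') {
    printf "%-16s => %s\n", "'$value'", autotype_guess($value);
}
```

Because the zerostring entry is consulted before int, the int and long patterns can stay unchanged while digit strings with a leading zero are still sent as strings, and a lone '0' still falls through to int.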