Skip Menu |

This queue is for tickets about the Convert-Binary-C CPAN distribution.

Report information
The Basics
Id: 45130
Status: open
Priority: 0/
Queue: Convert-Binary-C

People
Owner: Nobody in particular
Requestors: bitcard [...] post2.25u.com
Cc:
AdminCc:

Bug Information
Severity: Wishlist
Broken in: (no value)
Fixed in: 0.73



Subject: unpack() returns nothing when Dimension tag sets dynamic entry to zero size
Hello, dealing with dynamic-size arrays, I ran into a problem when using the "Dimension" tag: - when a compound contains an array that is declared as variable size, or as size 0, then unpacking data with a zero-element array works fine. - however, if the size is changed to 0 by a Dimension tag (from an originally positive value), then: o unpack() in string context still returns proper decoding o unpack() in list context returns an *empty* list; no exception is thrown. (Example script attached). It seems that in the second case, unpack() in list context still assumes that "something" has to be there if an initially positive size has been specified for a dynamic-size entry -- even when the size is set to 0 by the Dimension tag afterwards. I am aware that the "Dimension" tag is experimental yet; however, this bug makes it impossible to detect whether there really wasn't enough data for decoding (leading to an empty list returned from unpack()) or not. I hope this is just a minor issue... Thanks (again) for your support! Best regards, /alex
Subject: dimtest.pl
#!/usr/bin/perl use strict; use Convert::Binary::C; use Data::Dumper; use Tie::Hash::Indexed; ## ## test C code ## my $testCode = <<'EOF'; struct S { int count; int array[1]; /* with [] or [0], everything is fine... */ }; EOF ## ## create converter, parse code ## my $c = new Convert::Binary::C; $c->OrderMembers(1); $c->parse( $testCode ); $c->tag( 'S.array', Dimension => 'count' ); ## ## test unpacking ## my $binString = pack( "I*", 0 ); { my @unpacked; eval { @unpacked = $c->unpack( "struct S", $binString ); }; if ( $@ ) { print STDERR "exception: $@\n"; } print "array mode yields:\n" . Dumper( @unpacked ); } { my $unpacked; $unpacked = $c->unpack( "struct S", $binString ); print "string mode yields:\n" . Dumper( $unpacked ); }
Subject: Re: [rt.cpan.org #45130] unpack() returns nothing when Dimension tag sets dynamic entry to zero size
Date: Sat, 18 Apr 2009 21:21:21 +0200
To: bug-Convert-Binary-C [...] rt.cpan.org
From: Marcus Holland-Moritz <mhx-perl [...] gmx.net>
Hi Alexander, On 2009-04-18, at 14:44:48 -0400, Alexander Ost via RT wrote: Show quoted text
> Sat Apr 18 14:44:47 2009: Request 45130 was acted upon. > Transaction: Ticket created by arost > Queue: Convert-Binary-C > Subject: unpack() returns nothing when Dimension tag sets dynamic entry to > zero size > Broken in: (no value) > Severity: Normal > Owner: Nobody > Requestors: bitcard@post2.25u.com > Status: new > Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=45130 > > > dealing with dynamic-size arrays, I ran into a problem when using the > "Dimension" tag: > > - when a compound contains an array that is declared as variable size, > or as size 0, then unpacking data with a zero-element array works fine. > > - however, if the size is changed to 0 by a Dimension tag (from an > originally positive value), then: > o unpack() in string context still returns proper decoding > o unpack() in list context returns an *empty* list; no exception is > thrown. > > (Example script attached). > > It seems that in the second case, unpack() in list context still assumes > that "something" has to be there if an initially positive size has been > specified for a dynamic-size entry -- even when the size is set to 0 by > the Dimension tag afterwards.
Yes, that's exactly the one "shortcoming" that the docs talk about. The Dimension tag doesn't affect compound layout. This is not easy to change (I'm explicitly not talking of "fix"). Actually, if you had turned on warnings, you'd see the following: mhx@r2d2 ~ $ perl -w dimtest.pl array mode yields: Data too short at dimtest.pl line 51. string mode yields: $VAR1 = { 'count' => '0', 'array' => [] }; The warning is being issued when unpacking in scalar context. This is because "unpack" believes that the struct is larger than it would be based on the dimension from count. And that's because the real size isn't known until the data structure is unpacked. So, to implement your suggested behaviour, "unpack" would need to be changed to a two-pass algorithm. Thinking about all the various subtleties this behavourial change implies (like making sizeof work consistently in more complex scenarios, for example) makes my head hurt... Show quoted text
> I am aware that the "Dimension" tag is experimental yet; however, this > bug makes it impossible to detect whether there really wasn't enough > data for decoding (leading to an empty list returned from unpack()) or > not. I hope this is just a minor issue...
Unfortunately not. The best thing you can do is to use a flexible array member in the compound. But even then unpack won't do the right thing in list context. Show quoted text
> Thanks (again) for your support!
Sorry for not being very helpful this time. Cheers, Marcus
Download signature.asc
application/pgp-signature 197b

Message body not shown because it is not plain text.

Hi Marcus, please excuse my late reply - I did some additional tests (0.74 works fine, thanks for that) and now I think I completely understood the implications of your last email... On Sat Apr 18 15:22:00 2009, mhx-perl@gmx.net wrote: Show quoted text
> [...] > > Yes, that's exactly the one "shortcoming" that the docs talk about. > The Dimension tag doesn't affect compound layout.
Ow... my initial understanding was that this affects just direct access to the compound element offsets. However, now I noticed that it also affects normal "straightforward" unpacking, since the data _after_ a variable size array is always read from the original, "static" offset. So currently it is not possible to correctly decode data for a compound like the one in the attached script...? It outputs $ perl -w dimtest2.pl array mode yields: $VAR1 = { 'count' => '1', 'array' => [ 2 ], 'foo' => '2' }; string mode yields: $VAR1 = { 'count' => '1', 'array' => [ 2 ], 'foo' => '2' }; but correct output should be '1',[2],'3'. Show quoted text
> The warning is being issued when unpacking in scalar context. > This is because "unpack" believes that the struct is larger than it > would be based on the dimension from count. And that's because the > real > size isn't known until the data structure is unpacked. So, to > implement > your suggested behaviour, "unpack" would need to be changed to a two- > pass > algorithm. Thinking about all the various subtleties this behavourial > change implies (like making sizeof work consistently in more complex > scenarios, for example) makes my head hurt...
I can imagine that this is a nightmare :-) However, I'm not sure that you really need a two-pass algorithm in general. As long as you're just interested in decoding a binary string completely from beginning to end, you could just maintain some sort of "dynamic component correction offset" which is increased on the fly. Whenever decoding of a dynamic component has been completed, this offset needs to be adapted. Subsequently, the offset has to be considered by all other decoding functions when accessing the input data. But then, I'm completely unaware of the inner workings of your package.... maybe this is much more complex then it seems now (I'll have a close look for sure :-) Show quoted text
> The best thing you can do is to use a flexible > array member in the compound. But even then unpack won't do the right > thing in list context.
Yep... this will only prevent the "message too short error", because dynamic elements will be treated as zero size elements whenever accessing data that is behind them... correct? Show quoted text
> Sorry for not being very helpful this time.
You were, really, even if there's no solution right now. I really appreciate your support. Just one minor question: maybe an explicit hint in the documentation would be useful, indicating that decoding any data that follows dynamic elements will not give the expected results. To me, this wasn't immediately clear from the documentation. Best regards, /alex
attachment added.
#!/usr/bin/perl use strict; use Convert::Binary::C; use Data::Dumper; use Tie::Hash::Indexed; ## ## test C code ## my $testCode = <<'EOF'; struct S { int count; int array[]; int foo; }; EOF ## ## create converter, parse code ## my $c = new Convert::Binary::C; $c->OrderMembers(1); $c->parse( $testCode ); $c->tag( 'S.array', Dimension => 'count' ); ## ## test unpacking ## my $binString = pack( "I*", 1, 2, 3 ); { my @unpacked; eval { @unpacked = $c->unpack( "struct S", $binString ); }; if ( $@ ) { print STDERR "exception: $@\n"; } print "array mode yields:\n" . Dumper( @unpacked ); } { my $unpacked; eval { $unpacked = $c->unpack( "struct S", $binString ); }; if ( $@ ) { print STDERR "exception: $@\n"; } print "string mode yields:\n" . Dumper( $unpacked ); }
Hello, one more suggestion: as a first step, would it be easier to support dynamic size elements that are located at the end of a compound? Currently, a dynamic child compound also destroys the offsets in its parent (example attached): x [~/mac/ctest]> perl -w dimtest3.pl array mode yields: $VAR1 = { 'foo' => { 'count' => '1', 'array' => [ 2 ] }, 'bar' => '2' }; string mode yields: $VAR1 = { 'foo' => { 'count' => '1', 'array' => [ 2 ] }, 'bar' => '2' }; Sorry for spamming - I think this ticket is also a feature request, not a bug), so it can be closed. Best regards, /alex
Hello, one more suggestion: as a first step, would it be easier to support dynamic size elements that are located at the end of a compound? Currently, a dynamic child compound also destroys the offsets in its parent (example attached): x [~/mac/ctest]> perl -w dimtest3.pl array mode yields: $VAR1 = { 'foo' => { 'count' => '1', 'array' => [ 2 ] }, 'bar' => '2' }; string mode yields: $VAR1 = { 'foo' => { 'count' => '1', 'array' => [ 2 ] }, 'bar' => '2' }; Sorry for spamming - I think this ticket is also a feature request, not a bug), so it can be closed. Best regards, /alex
#!/usr/bin/perl use strict; use Convert::Binary::C; use Data::Dumper; use Tie::Hash::Indexed; ## ## test C code ## my $testCode = <<'EOF'; struct S1 { int count; int array[]; }; struct S2 { struct S1 foo; int bar; }; EOF ## ## create converter, parse code ## my $c = new Convert::Binary::C; $c->OrderMembers(1); $c->parse( $testCode ); $c->tag( 'S1.array', Dimension => 'count' ); ## ## test unpacking ## my $binString = pack( "I*", 1, 2, 3 ); { my @unpacked; eval { @unpacked = $c->unpack( "struct S2", $binString ); }; if ( $@ ) { print STDERR "exception: $@\n"; } print "array mode yields:\n" . Dumper( @unpacked ); } { my $unpacked; eval { $unpacked = $c->unpack( "struct S2", $binString ); }; if ( $@ ) { print STDERR "exception: $@\n"; } print "string mode yields:\n" . Dumper( $unpacked ); }
Subject: Re: [rt.cpan.org #45130] unpack() returns nothing when Dimension tag sets dynamic entry to zero size
Date: Sun, 26 Apr 2009 23:24:22 +0200
To: bug-Convert-Binary-C [...] rt.cpan.org
From: Marcus Holland-Moritz <mhx-perl [...] gmx.net>
Hi Alex, On 2009-04-24, at 04:41:23 -0400, Alexander Ost via RT wrote: Show quoted text
> Queue: Convert-Binary-C > Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=45130 > > > Hi Marcus, > > please excuse my late reply - I did some additional tests (0.74 works > fine, thanks for that) and now I think I completely understood the > implications of your last email... > > On Sat Apr 18 15:22:00 2009, mhx-perl@gmx.net wrote:
> > [...] > > > > Yes, that's exactly the one "shortcoming" that the docs talk about. > > The Dimension tag doesn't affect compound layout.
> > Ow... my initial understanding was that this affects just direct > access to the compound element offsets. However, now I noticed that it > also affects normal "straightforward" unpacking, since the data > _after_ a variable size array is always read from the original, > "static" offset. > > So currently it is not possible to correctly decode data for a > compound like the one in the attached script...?
I think that greatly depends on the definition of "correctly". ;-) The above code is not legal C, as the standard forbids a flexible array member anywhere but at the end of a structure: mhx@r2d2 ~ $ cat foo.c struct S { int count; int array[]; int foo; }; mhx@r2d2 ~ $ cc -c foo.c foo.c:4: error: flexible array member not at end of struct Show quoted text
> It outputs > > $ perl -w dimtest2.pl > array mode yields: > $VAR1 = { > 'count' => '1', > 'array' => [ > 2 > ], > 'foo' => '2' > }; > string mode yields: > $VAR1 = { > 'count' => '1', > 'array' => [ > 2 > ], > 'foo' => '2' > }; > > but correct output should be '1',[2],'3'.
Again, that depends on what you expect to get from a snippet of illegal C code. Actually, to me, the output makes perfect sense. The following is a valid piece of C code that is more or less equivalent to your example: struct S { int count; struct { int count2; int array[]; } a; int foo; }; And in this case, for the C compiler, S.foo is actually an alias for S.a.array[0]. As stated in a previous reply, the flexible array member doesn't contribute to struct size. I fully agree that if you actually tag a dimension to the array (which ideally would turn that array from a flexible array member into a fixed-size array), the behaviour should be different and yes, I would in this case expect your output. Show quoted text
> > The warning is being issued when unpacking in scalar context. > > This is because "unpack" believes that the struct is larger than it > > would be based on the dimension from count. And that's because the > > real > > size isn't known until the data structure is unpacked. So, to > > implement > > your suggested behaviour, "unpack" would need to be changed to a two- > > pass > > algorithm. Thinking about all the various subtleties this behavourial > > change implies (like making sizeof work consistently in more complex > > scenarios, for example) makes my head hurt...
> > I can imagine that this is a nightmare :-) > > However, I'm not sure that you really need a two-pass algorithm in > general. As long as you're just interested in decoding a binary string > completely from beginning to end, you could just maintain some sort of > "dynamic component correction offset" which is increased on the > fly. Whenever decoding of a dynamic component has been completed, this > offset needs to be adapted. Subsequently, the offset has to be > considered by all other decoding functions when accessing the input > data.
You're probably correct. The logic that issues the "Data too short" warning currently needs to know the size of a type before doing the unpacking. But in case of dynamic sized types, that logic will most probably have to change anyway. Show quoted text
> But then, I'm completely unaware of the inner workings of your > package.... maybe this is much more complex then it seems now (I'll > have a close look for sure :-)
Don't look too closely ;-) No, not because I fear you might dislike the code, but because it's unlikely that I will do any large changes to the currently released code base. My development version features a completely new parser and AST generator, which will have a major impact on the rest of the code (so major that currently nothing but the new parser is working really well). Show quoted text
> > The best thing you can do is to use a flexible > > array member in the compound. But even then unpack won't do the right > > thing in list context.
> > Yep... this will only prevent the "message too short error", because > dynamic elements will be treated as zero size elements whenever > accessing data that is behind them... correct?
Yes. Show quoted text
> > Sorry for not being very helpful this time.
> > You were, really, even if there's no solution right now. I really > appreciate your support. > > Just one minor question: maybe an explicit hint in the documentation > would be useful, indicating that decoding any data that follows > dynamic elements will not give the expected results. To me, this > wasn't immediately clear from the documentation.
Ok, I'll keep that in mind for the next release. Cheers, Marcus
Download signature.asc
application/pgp-signature 198b

Message body not shown because it is not plain text.