Bug #132263 for Object-Pad: Method calls in non-O::P parent constructor fail

Sun Mar 29 11:33:27 2020 TEAM [...] cpan.org - Ticket created

Subject:

Method calls in non-O::P parent constructor fail

When inheriting from a plain class that has its own constructor, method calls within that constructor fail: Expected $self->{"Object::Pad/slots"} to be an ARRAY reference because the object is not yet set up correctly. Attaching a simple test case.

Subject:

parent-calls-method-in-new.t

use strict; use warnings; use Test::More; use Test::Fatal; use Object::Pad; { # Simple parent class that calls a method during ->new package Example::Parent; sub new { my ($class, %args) = @_; my $self = bless \%args, $class; $self->example_method; return $self; } } { # We would expect a working child class to have a valid object instance by the time # the method is called class Example::Child extends Example::Parent; method example_method { 1 } } new_ok('Example::Child'); done_testing;

Tue Mar 31 19:25:39 2020 leonerd-cpan [...] leonerd.org.uk - Correspondence added

On Sun Mar 29 11:33:27 2020, TEAM wrote: Show quoted text

> When inheriting from a plain class that has its own constructor, > method calls within that constructor fail: > > Expected $self->{"Object::Pad/slots"} to be an ARRAY reference > > because the object is not yet set up correctly. > > Attaching a simple test case.

In summary, I don't think this is an easy one to fix. I shall at least explain the current implementation and why the code breaks. For this we'll need the injected constructor that Object::Pad puts into every class. That's actually implemented in XS, but in essence at 0.17 it does the following for a non-native "subclass of HASH" class: *new = sub { my $class = shift; my $self = $class->SUPER::new(@_); $self->INITSLOTS; $self->BUILDALL(@_); return $self; }; (from 0.18 onwards it also does BUILDARGS but that's an irrelevant complication here) This means that the superclass constructor is entirely completed and returns before we call ->INITSLOTS, which is the part responsible for creating the ->{'Object::Pad/slots'} arrayref. This is a problem here because, during the superclass constructor (i.e. before we have our slots array) it invokes an overridden $self-> method, which hits a sub created by the `method` keyword. Early in its setup it expects to find that array, and complains because it doesn't find it. A first thought of an attempted solution might seem to lazily create the slots array when first required. That would indeed solve this particular class, but wouldn't correctly handle a more complex example such as: class Example::Child extends Example::Parent; has $_ten = 10; method example_method { say "Ten is $_ten" } The $self->example_method code would still be invoked and it would lazily create the slots array, but until it has had a chance to run INITSLOTS the $_ten slot would not be filled in, and we'd get an uninitialised value warning instead. Further thought reveals this isn't really an Object::Pad-specific problem. Indeed the exact same problem comes for basically the exact same reason if Example::Child were a regular "naked perl OO" class and we did the following: package Example::Child; use base qw( Example::Parent ); sub new { my $self = shift->SUPER::new(@_); $self->{ten} = 10; } sub example_method { my $self = shift; say "Ten is $self->{ten}"; } This would fail in the same way as $self->{ten} would be missing at that point. It could be argued that actually, the real bug here is the expectation of a Perl OO class that calling $self-> methods during its own `sub new` constructor makes sense, because the subclass constructor hasn't had a chance to run yet. This admittedly is a fairly common design in practice however. I am reluctant to give the answer "Well don't do that then", but at present I don't have any good ideas for what might be better. Thoughts from anyone welcome? -- Paul Evans

Tue Mar 31 19:25:40 2020 The RT System itself - Status changed from 'new' to 'open'

Wed Apr 01 09:05:31 2020 leonerd-cpan [...] leonerd.org.uk - Correspondence added

Discussion on #cor points out it's not actually the method call itself which is problematic, only the slot accesses within it. There's no problem, semantically, in defining what should happen in this example, because the method only returns a constant 1. Indeed, any method body that contained no slot accesses should be perfectly safe to invoke here. As such, I have a partial fix for this issue. It won't allow any method, but it will at least allow those that don't contain slot accesses. Further discussion suggests it might be safe to allow slot accesses even though they're not set up yet by just presenting all values as (immutable) undef. Forbid writing to them but at least let them read as undef; possibly with a warning about slot access to not-fully-constructed instance. Thoughts continue on that. -- Paul Evans

Subject:

rt132263-1.patch

=== modified file 'Build.PL' --- old/Build.PL 2020-03-27 22:10:11 +0000 +++ new/Build.PL 2020-04-01 13:01:17 +0000 @@ -13,6 +13,7 @@ }, test_requires => { 'Data::Dump' => 0, + 'Test::Fatal' => 0, 'Test::More' => '0.88', # done_testing 'Test::Refcount' => 0, }, === modified file 'lib/Object/Pad.xs' --- old/lib/Object/Pad.xs 2020-03-30 01:28:05 +0000 +++ new/lib/Object/Pad.xs 2020-04-01 13:01:17 +0000 @@ -211,9 +211,18 @@ if(SvTYPE(rv) != SVt_PVHV) croak("Not a HASH reference"); SV **slotssvp = hv_fetchs((HV *)rv, "Object::Pad/slots", 0); - if(!slotssvp || !SvROK(*slotssvp) || SvTYPE(SvRV(*slotssvp)) != SVt_PVAV) - croak("Expected $self->{\"Object::Pad/slots\"} to be an ARRAY reference"); - slotsav = SvRV(*slotssvp); + /* A method invoked during a superclass constructor of a classic perl + * class might encounter $self without slots. For this situation we'll + * pretend we have an empty immutable set of slots + * https://rt.cpan.org/Ticket/Display.html?id=132263 + */ + if(!slotssvp) + slotsav = &PL_sv_undef; + else { + if(!SvROK(*slotssvp) || SvTYPE(SvRV(*slotssvp)) != SVt_PVAV) + croak("Expected $self->{\"Object::Pad/slots\"} to be an ARRAY reference"); + slotsav = SvRV(*slotssvp); + } break; } } @@ -250,6 +259,9 @@ #endif PADOFFSET targ = PL_op->op_targ; + if(SvTYPE(PAD_SV(PADIX_SLOTS)) != SVt_PVAV) + croak("ARGH: expected ARRAY of slots at PADIX_SLOTS"); + AV *slotsav = (AV *)PAD_SV(PADIX_SLOTS); if(slotix > av_top_index(slotsav)) === modified file 't/06extends-HASH.t' --- old/t/06extends-HASH.t 2019-11-20 18:22:03 +0000 +++ new/t/06extends-HASH.t 2020-04-01 13:01:17 +0000 @@ -4,6 +4,7 @@ use warnings; use Test::More; +use Test::Fatal; use Object::Pad; @@ -50,4 +51,27 @@ 'pp($obj) of Object::Pad-extended foreign HASH class' ); } +# Various RT132263 test cases +{ + package RT132263::Parent; + sub new { + my $class = shift; + my $self = bless {}, $class; + $self->example_method; + return $self; + } +} + +# Test case one - no slot access in example_method +{ + class RT132263::Child1 extends RT132263::Parent { + method example_method { 1 } + } + + my $e; + ok( !defined( $e = exception { RT132263::Child1->new } ), + 'RT132263 case 1 constructs OK' ) or + diag( "Exception was $e" ); +} + done_testing;

Wed Apr 01 13:44:11 2020 TEAM [...] cpan.org - Correspondence added

On 2020-04-01 14:05:31, PEVANS wrote: Show quoted text

> Further discussion suggests it might be safe to allow slot accesses > even though they're not set up yet by just presenting all values as > (immutable) undef. Forbid writing to them but at least let them read > as undef; possibly with a warning about slot access to not-fully- > constructed instance. Thoughts continue on that.

Unfortunately that limits options for workarounds: the entire BUILD chain breaks on a trivial `sub new { bless; $self->BUILD }` pattern, because all the attempted setup within the child methods will have no write access to slots. I don't think partial slot support is going to help much in practical cases. Hooking `bless` would be the ideal time for the ->INIT_SLOTS upgrade to happen, although not sure whether that's even possible at CPAN level (even if it is, bless is a hot path - this could have significant performance implications?) How would a child class work around this issue at the moment? Is there anything that could be done to make that easier? For now, it looks like the best option is to reïmplement the logic from the parent class constructor(s) in `method BUILD` - we can't bless an empty hash and pass that over: $ perl -e'use strict; use warnings; package Parent { sub new { my ($class, %args) = @_; my $self = bless { }, $class; $self->BUILD(%args); } sub BUILD { my ($self, %args) = @_; $self } } use Object::Pad; class Child extends Parent { has $thing; sub new { my ($class, %args) = @_; my $self = bless {}, $class; $self->INITSLOTS; return $self->next::method(%args) } method BUILD (%args) { $thing = delete $args{thing}; $self } } Child->new' Subroutine new redefined at -e line 1. Attempt to bless into a reference at -e line 1. and it's too late to call ->INITSLOTS at the start of the method: $ perl -e'use strict; use warnings; package Parent { sub new { my ($class, %args) = @_; my $self = bless { }, $class; $self->BUILD(%args); } sub BUILD { my ($self, %args) = @_; $self } } use Object::Pad; class Child extends Parent { has $thing; method BUILD (%args) { $self->INITSLOTS; $thing = delete $args{thing}; $self } } Child->new' Expected $self->{"Object::Pad/slots"} to be an ARRAY reference at -e line 1. Providing ways for parent classes to make this easier would be a good starting point, assuming that there isn't a neat solution to let existing parent classes work without needing any changes - IO::Async::Notifier in particular would really benefit from a cleaner option than the current workarounds, would be nice to avoid proliferation of code like this: use Object::Pad; { package Workaround::Notifier::Empty; use parent qw(IO::Async::Notifier); sub new { my ($class, %args) = @_; bless \%args, $class } } class Workaround::Notifier extends Workaround::Notifier::Empty; method BUILD (%args) { $self->_init(\%args); $self->configure(%args); $self }

Fri Apr 03 16:26:29 2020 leonerd-cpan [...] leonerd.org.uk - Correspondence added

On Wed Apr 01 13:44:11 2020, TEAM wrote: Show quoted text

> Hooking `bless` would be the ideal time for the ->INIT_SLOTS upgrade > to happen, although not sure whether that's even possible at CPAN > level (even if it is, bless is a hot path - this could have > significant performance implications?)

It wouldn't be possible sufficiently reliably. Options are modifications to either: a) CORE::GLOBAL::bless, or b) PL_ppaddr[OP_BLESS] and both of these will only affect new code compiled afterwards; so the whole setup would be too prone to module load order. -- Paul Evans

Fri Apr 03 16:31:43 2020 leonerd-cpan [...] leonerd.org.uk - Correspondence added

Further discussion on #cor comes up with a potential idea. The trouble isn't really related to the `BUILD` methods, but more the `INITSLOTS` which is responsible for creating the ->{'Object::Pad/slots'} ARRAYref. There's no constructor args involved in this part; it's just a matter of invoking all the partial expression optrees stored in the class data. I.e. it can be done at OP_METHSTART time. Currently the OP_METHSTART simply throws an exception if it doesn't find the slots AV. This could easily be changed to make a call to INITSLOTS to create that, and have the proper constructor skip that step if it's already been done. A potential downside here is that `caller`, stacktraces during exceptions, etc.. would look a bit wrong if any exceptions were thrown. A mitigating factor here is currently we only allow constant expressions to initialise slots with, so it's impossible to create one of those. As and when we ever do allow computed expressions that could peek at `caller` or throw an exception, we can just document the odd behaviour around this situation. -- Paul Evans

Fri Apr 03 20:35:09 2020 leonerd-cpan [...] leonerd.org.uk - Correspondence added

On Fri Apr 03 16:31:43 2020, PEVANS wrote: Show quoted text

> Currently the OP_METHSTART simply throws an exception if it doesn't > find the slots AV. This could easily be changed to make a call to > INITSLOTS to create that, and have the proper constructor skip that > step if it's already been done.

Newly-CPAN'ed version 0.19 allows the following: use IO::Async::Loop; use Object::Pad 0.19; class Net::Async::Bananarama extends IO::Async::Notifier { has $_banana; method configure ( %args ) { $_banana = delete $args{banana}; $self->SUPER::configure( %args ); } method _add_to_loop ( $loop ) { warn "Should talk to the loop about $_banana"; } } my $loop = IO::Async::Loop->new; $loop->add( Net::Async::Bananarama->new( banana => "bright and yellow", ) ); $ perl subclass-io-async-notifier.t Should talk to the loop about bright and yellow at subclass-io-async-notifier.t line 16. -- Paul Evans

Sun Apr 05 12:39:11 2020 leonerd-cpan [...] leonerd.org.uk - Correspondence added

This is resolved by 0.19 -- Paul Evans

Sun Apr 05 12:39:12 2020 leonerd-cpan [...] leonerd.org.uk - Status changed from 'open' to 'resolved'

Sun Apr 05 12:39:12 2020 leonerd-cpan [...] leonerd.org.uk - Fixed in 0.19 added