
This queue is for tickets about the HTML-Tree CPAN distribution.

Report information
The Basics
Id: 61673
Status: resolved
Priority: 0/
Queue: HTML-Tree

People
Owner: Jeff.Fearn [...] gmail.com
Requestors: sprout [...] cpan.org
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: 4.1



Subject: Some methods die when elem isn’t an element_class
Date: Sun, 26 Sep 2010 12:32:04 -0700
To: bug-HTML-Tree [...] rt.cpan.org
From: Father Chrysostomos <sprout [...] cpan.org>
In HTML::Tree 4, some methods check that the element passed as an argument belongs to the element_class or a subclass of it. Up till now, element_class only determined which constructor to call to create the element object. The ability for a constructor to return an object of any class I would consider a *feature* of Perl. HTML::Tree now breaks that. HTML::DOM::Element->new sometimes returns an HTML::DOM::Comment object. I’ve had to add a specialised isa method to HTML::DOM::Comment to work around this.
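To illustrate the situation described above, here is a minimal sketch. The packages My::Element and My::Comment are hypothetical stand-ins (they are not HTML::DOM or HTML::Tree code), and the isa test is only an assumed approximation of the kind of check the report describes.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for HTML::DOM::Element and HTML::DOM::Comment.
package My::Comment;
sub new { my ($class, %args) = @_; return bless { %args }, $class }

package My::Element;
# The constructor hands back a My::Comment for the '~comment' pseudo-tag,
# so its return value does not always inherit from My::Element.
sub new {
    my ($class, $tag, %args) = @_;
    return My::Comment->new(_tag => $tag, %args) if $tag eq '~comment';
    return bless { _tag => $tag, %args }, $class;
}

package main;

my $element_class = 'My::Element';
my $p       = $element_class->new('p');
my $comment = $element_class->new('~comment', text => 'note');

# A strict class check accepts the first object and rejects the second.
my $p_ok       = $p->isa($element_class)       ? 1 : 0;
my $comment_ok = $comment->isa($element_class) ? 1 : 0;
print "p: $p_ok, comment: $comment_ok\n";
```

In this sketch the comment object fails the check even though it was produced by the element class's own constructor.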
Subject: Re: [rt.cpan.org #61673] Some methods die when elem isn’t an element_class
Date: Mon, 27 Sep 2010 14:26:24 +1000
To: bug-HTML-Tree [...] rt.cpan.org
From: Jeff Fearn <jefffearn [...] gmail.com>
> In HTML::Tree 4, some methods check that the element passed as an argument belongs to the element_class or a subclass of it. Up till now, element_class only determined which constructor to call to create the element object. The ability for a constructor to return an object of any class I would consider a *feature* of Perl. HTML::Tree now breaks that. HTML::DOM::Element->new sometimes returns an HTML::DOM::Comment object. I’ve had to add a specialised isa method to HTML::DOM::Comment to work around this.
This is probably getting caught by the fix for https://rt.cpan.org/Ticket/Display.html?id=35948

The only place this is checked is in the traverse method in as_HTML, which seems a relatively sane place to check, since it hits internal features that will segfault given the wrong input.

XML::TreeBuilder avoids this by setting _element_class in its new function:

    $self->{'_element_class'} = 'XML::Element';

Maybe that's a cleaner approach to subclassing than munging the isa.

Cheers, Jeff.
Subject: Re: [rt.cpan.org #61673] Some methods die when elem isn’t an element_class
Date: Sun, 3 Oct 2010 13:04:46 -0700
To: bug-HTML-Tree [...] rt.cpan.org
From: Father Chrysostomos <sprout [...] cpan.org>
On Sep 26, 2010, at 9:26 PM, Jeff Fearn via RT wrote:
> This is probably getting caught by the fix for
> https://rt.cpan.org/Ticket/Display.html?id=35948
>
> The only place this is checked is in the traverse method in as_HTML,
> where it seems a relatively sane place to check since it is hitting
> internal features that will segfault given the wrong input.
>
> XML::TreeBuilder avoids this by setting _element_class in its new function:
>
>     $self->{'_element_class'} = 'XML::Element';
>
> Maybe that's a cleaner approach to subclassing than munging the isa.
I am doing that. But it doesn’t work in my case, as the constructor that methods like objectify_text call (HTML::DOM::Element->new) is not in a package that all nodes inherit from. Again, constructors do not necessarily return objects belonging to, or inheriting from, the same package.
Subject: Re: [rt.cpan.org #61673] Some methods die when elem isn’t an element_class
Date: Mon, 4 Oct 2010 10:20:53 +0100
To: bug-HTML-Tree [...] rt.cpan.org
From: Jeff Fearn <jefffearn [...] gmail.com>
On Sun, Oct 3, 2010 at 9:04 PM, Father Chrysostomos via RT <bug-HTML-Tree@rt.cpan.org> wrote:
> I am doing that. But it doesn’t work in my case, as the constructor that methods like objectify_text call (HTML::DOM::Element->new) is not in a package that all nodes inherit from. Again, constructors do not necessarily return objects belonging to, or inheriting from, the same package.
The check makes sure all the objects in the current tree are of a compatible type, which seems a reasonable thing to check. Presumably all the objects returned by a constructor inherit from some base class, even if it's not the one being used to call new. For example, XML::DOM::Element->new() returns various XML::DOM::* objects.

So if XML::DOM::* was using HTML::Element, I'd expect XML::DOM, or maybe XML::DOM::Node, to set $self->{'_element_class'} to 'XML::DOM::Node' or similar in the super constructor.

In this case it looks like HTML::DOM::Element and HTML::DOM::Comment both call SUPER::new, so it should be just a matter of setting $self->{'_element_class'} in SUPER::new.

I'm willing to have a class variable to disable sanity checking if you want, but I'd like to keep the sanity check in place for people who accidentally mix stuff up.

Cheers, Jeff.
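The base-class suggestion above can be sketched roughly as follows. All package names here (My::Node, My::Element, My::Comment) are hypothetical, and the sketch assumes that every node type really does share one base class, which is the assumption under discussion in this ticket.

```perl
use strict;
use warnings;

# Hypothetical shared base class; every node type inherits from it.
package My::Node;
sub new { my ($class, %args) = @_; return bless { %args }, $class }

package My::Element;
our @ISA = ('My::Node');

package My::Comment;
our @ISA = ('My::Node');

package main;

# The class used for the sanity check is the shared base class,
# not the class whose constructor happens to be called.
my $check_class = 'My::Node';

my @nodes  = (My::Element->new(_tag => 'p'), My::Comment->new(text => 'note'));
my @failed = grep { !$_->isa($check_class) } @nodes;
print scalar(@failed), " node(s) failed the check\n";
```

Note that in this sketch the check class and the class used for construction are separate values; a single setting that serves both purposes would not allow that.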
Subject: Re: [rt.cpan.org #61673] Some methods die when elem isn’t an element_class
Date: Sun, 10 Oct 2010 16:28:41 -0700
To: bug-HTML-Tree [...] rt.cpan.org
From: Father Chrysostomos <sprout [...] cpan.org>
On Oct 4, 2010, at 2:21 AM, Jeff Fearn via RT wrote:
> In this case it looks like both HTML::DOM::Element and
> HTML::DOM::Comment both call SUPER::new so it should be just a matter
> of setting $self->{'_element_class'} in SUPER::new.
The problem here, again, is that the base class they inherit from is not the class that has the constructor that objectify_text should call. element_class is now doing two different things.
> I'm willing to have a class variable to disable sanity checking if you want,
That would be nice, or duck typing: just check whether $object->can('starttag').
> but I'd like to keep the sanity check in place for people who accidentally mix stuff up.
On Mon Oct 11 09:28:52 2010, sprout@cpan.org wrote:
> > I'm willing to have a class variable to disable sanity checking if you want,
>
> That would be nice, or duck typing: Just check whether the $object->can('starttag').
Oh I like this one! This way really crazy people, like me, can have trees with nodes from completely unrelated subclasses!

http://github.com/jfearn/HTML-Tree/commit/b778b99e7a4b32ff94d6cf6875b4cdb36b86db74

Cheers, Jeff.
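A minimal sketch of the duck-typing idea discussed above. This is not the code from the linked commit; the packages My::Element and Other::Comment and the helper looks_like_node are hypothetical names used only for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Two hypothetical, unrelated node classes that both provide starttag().
package My::Element;
sub new      { my ($class, $tag) = @_; return bless { _tag => $tag }, $class }
sub starttag { my ($self) = @_; return "<$self->{_tag}>" }

package Other::Comment;
sub new      { my ($class, $text) = @_; return bless { text => $text }, $class }
sub starttag { my ($self) = @_; return "<!-- $self->{text} -->" }

package main;

# Duck-typed check: accept any blessed object that can starttag(),
# regardless of which class it belongs to.
sub looks_like_node {
    my ($thing) = @_;
    return (blessed($thing) && $thing->can('starttag')) ? 1 : 0;
}

my @things  = (My::Element->new('p'), Other::Comment->new('note'), 'plain string', {});
my @results = map { looks_like_node($_) } @things;
print "@results\n";
```

Checking blessed() first keeps the test from calling can() on an unblessed reference, which would die.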
Version 4.1 shipped with a fix for this bug.