Skip Menu |

This queue is for tickets about the HTML-Tree CPAN distribution.

Report information
The Basics
Id: 64220
Status: resolved
Priority: 0/
Queue: HTML-Tree

People
Owner: Nobody in particular
Requestors: miko [...] idocs.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Submitting ideas for revisions to HTML::TreeBuilder
Date: Sat, 25 Dec 2010 11:21:41 -0500
To: bug-html-tree [...] rt.cpan.org
From: "Miko O'Sullivan" <miko [...] idocs.com>
Thank you for HTML::TreeBuilder. It's a great module! Very useful. I'd like to submit some suggested revisions to HTML::TreeBuilder. I needed to get these revisions into production, so instead of hacking HTML::TreeBuilder I created a module that subclasses HTML::TreeBuilder. I'll use that module if and until these ideas are incorporated. The subclssing module, HTML::TreeEvolve, is attached. See the documentation in the module. -Miko

Message body is not shown because sender requested not to inline it.

Subject: re: Submitting ideas for revisions to HTML::TreeBuilder
From: miko [...] idocs.com
And after uploading the module I discovered that I had broken one of my cardinal rules: always run tests *after* making changes (in this case documentation) and *before* uploading. Serves me right that the uploaded module won't compile. Here's a corrected version. Miko On Sat Dec 25 11:21:51 2010, miko@idocs.com wrote: Show quoted text
> Thank you for HTML::TreeBuilder. It's a great module! Very useful. > > I'd like to submit some suggested revisions to HTML::TreeBuilder. I > needed to get these revisions into production, so instead of hacking > HTML::TreeBuilder I created a module that subclasses > HTML::TreeBuilder. I'll use that module if and until these ideas are > incorporated. > > The subclssing module, HTML::TreeEvolve, is attached. See the > documentation in the module. > > -Miko
Subject: TreeEvolve.pm
=head1 NAME HTML::TreeEvolve -- Additions I'd like to see in HTML::TreeBuilder This module adds or modifies some methods to HTML::TreeBuilder and HTML::TreeBuilder::Element that I think would be useful. I needed to get these changes into production, so instead of hacking HTML::TreeBuilder, resulting in code that can't be updated through the open-source process, I created this module which subclasses HTML::TreeBuilder. I'm submitting this module to the developed in the hope that the changes will be incorporated into the standard distribution. This module is meant as a temporary fix, not a permanent production module. =head1 SYNOPSIS # Instantiate tree object. See documentation for some of the # additions/changes to HTML::TreeBuilder->new(). my $tree = HTML::TreeEvolve->new(); # set the class used for new element $tree->element_class(HTML::MyClass); # get the root element $root = $node->root(); # Output HTML, but only the guts. Include end tags where they were # originally sent, even for tags with optional enders. $tree->html(guts_only=>1); =head1 NEW METHODS =cut ############################################################################### # HTML::TreeEvolve::Element # package HTML::TreeEvolve::Element; use strict; use Debug::ShowStuff ':all'; use MemHandle; use base 'HTML::Element'; #------------------------------------------------------------------------------ # $node->html # =head2 $node->html Returns an the HTML code to rebuild the tree. This method is meant as a *sort-of* replacement for $node->as_HTML(). It accomplishes the same basic task: producing HTML. It does not have options for indenting the output or setting which characters get changed to HTML entitity references. It does provide the option to only output the "guts" of the tree by using the guts_only option. It also provides a different interface for setting options. Instead of using positional based options, it uses an options hash (%opts). I'd like to suggest that as_HTML be deprecated in favor of a method like this that would provide a better platform for continued development. Using an options hash means that it's always easy to add another option should it be desired. A positional set of options, where every new option requires adding to the list of arguments being sent, makes it increasingly difficult to use as features are added. It would not even be necessary to actually have two duplicate modules. The code from as_HTML could be modified to use an options hash and renamed html (or some other name, I'm not actually thrilled with "html"... perhaps "as_HTML2"?). Then another as_HTML method could be provided that simply accepts positional options then calls $tree->html() using optional arguments. Easy peasy. =cut sub html { my ($node, %opts) = @_; my ($top, $mem, $optionalEndTag, $empty_element_map, $in_guts); # determine if this is the outermost node in the recursion $top = ! $opts{'recursed'}; # memory handle for storing up HTML $mem = $opts{'memhandle'} || MemHandle->new(''); # hash of tags that are allowed to be without matching end tags $optionalEndTag ||= \%HTML::Element::optionalEndTag; # hash of elements that are always empty $empty_element_map = \%HTML::Element::emptyElement; # If output should only be of guts, determine if we're # in the guts yet. We're in the guts if the caller has # already determined that we're already in the guts, or # if this ISN'T an implicit tag. If output should be of # everything, then we'll just act like we're in the guts. if ($opts{'in_guts'}) { $in_guts = 1 } elsif ($opts{'guts_only'}) { $in_guts = ! $node->implicit } else { $in_guts = 1 } # add start tag if ($in_guts) { print $mem $node->starttag() } # if this isn't an empty element, recurse through children unless ($empty_element_map->{$node->tag}) { # loop through children foreach my $child ($node->content_list) { # recurse if it's an object if (ref $child) { $child->html( memhandle => $mem, recursed => 1, guts_only => $opts{'guts_only'}, in_guts => $in_guts, ); } # else it's text, so just output it else { print $mem HTML::Entities::encode_entities($child); } } } # add end tag if this tag does the end tag thing if ($in_guts) { if ( (! $optionalEndTag->{$node->tag}) || $node->{'_explicit_end_tag'} ) { print $mem $node->endtag(); } } # if top node, return Memhandle string if ($top) { return $mem->mem } } # # html #------------------------------------------------------------------------------ #------------------------------------------------------------------------------ # $tree->element_class # =head2 $tree->element_class Sets/returns the class which will be used for new elements. This method did not seem to be present or called in the current release of HTML::TreeBuilder, although it is mentioned in the documentation. The docmentation says to subclass this method but as far as I can tell no check is ever made if that subclassed method is ever called. Feel free to show me where I'm wrong. It's happened once or twice or 1,000,000 times before. Unlike in HTML::TreeBuilder's documentation, this method does not require subclassing, although it could be if the programmer wants to. This method allowed the programmer to set _element_class by sending an optional second parameter which should be the name of the class for new elements. That value is stored in _element_class, which is what HTML::TreeBuilder seems to use in the current version. Just in case someone wants to send an example of the class instead of the name of the class, you can send an existing object and the ref of that object will be used. Not sure why that would ever be necessary but it's cool. Cool counts. The current class is always returned. =cut sub element_class { my $node = shift; # set class if it was sent if (@_) { my $class = $_[0]; if (ref $class) { $class = ref $class } # store class $node->{'_element_class'} = $class; } # return class return $node->{'_element_class'}; } # # element_class #------------------------------------------------------------------------------ #------------------------------------------------------------------------------ # $node->root # =head2 $node->element_class Returns the root node. If the calling object is itself the root node then returns itself. =cut sub root { my ($node) = @_; my @ancestors = $node->lineage(); # if any ancestors, return the first one if (@ancestors) { return $ancestors[-1] } # else return this node return $node; } # # root #------------------------------------------------------------------------------ # # HTML::TreeEvolve::Element ############################################################################### ############################################################################### # HTML::TreeEvolve # package HTML::TreeEvolve; use strict; use Debug::ShowStuff ':all'; use String::Util ':all'; use Carp 'croak'; use UNIVERSAL qw[isa can]; # extend use base 'HTML::TreeBuilder'; use base 'HTML::TreeEvolve::Element'; #------------------------------------------------------------------------------ # new # =head2 HTML::TreeEvolve->new() This subclassed "new" method provides some functionality I'd like to see in HTML::TreeBuilder->new(). See notes in code. =cut sub new { my $class = shift; my $tree = $class->SUPER::new(); my ($el_class); # default HTML::TreeEvolve::Element class to the called # class plus '::Element'. Not sure this is a good idea for everyone, # but for now using it for my code. $el_class = $tree->element_class($class . '::Element'); # KLUDGE: rebless existing elements # When a tree is first created using HTML::TreeBuilder ()i.e. before it # even gets to this point in the code), it already has <head> and <body> # elements, which are blessed as HTML::Element. This loop reblesses them # as the class returns by $tree->element_class. foreach my $node ($tree->descendents) { bless $node, $el_class; } # return return $tree; } # # new #------------------------------------------------------------------------------ #------------------------------------------------------------------------------ # end # =head2 $tree->end() Overriding but calling HTML::TreeBuilder $tree->end() method. This method determines if an explicit end tag (e.g. </p>) was sent. If it was then that fact is stored in the tag object. When the $node->html() method is called, tags that have explicit enders -- even tags that don't require them -- will have the end tag added to the code. =cut sub end { my ($tree, $tag, @args) = @_; my (@rv); # get return value @rv = $tree->SUPER::end($tag, @args); # If the opening tag can be found, not that is has an explicit # closing tag. TO_CLOSE: foreach my $node (@rv) { if ( (isa $node, 'HTML::Element') && ($node->tag eq $tag) ) { $node->{'_explicit_end_tag'} = 1; last TO_CLOSE; } } # return super call's return values return @rv; } # # end #------------------------------------------------------------------------------ # # HTML::TreeEvolve ############################################################################### # return true 1; __END__ =head1 TERMS AND CONDITIONS Copyright (c) 2010 by Miko O'Sullivan. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. This software comes with B<NO WARRANTY> of any kind. =head1 AUTHOR Miko O'Sullivan F<miko@idocs.com> =head1 VERSION =over =item Version 0.10 December 25, 2010 Initial release. Yes, this is what I did on Christas day -- while my wife was working. Don't be sad for me, we had a great Christmas. =back =cut
Hi Miko, two things, firstly this is a very old module, changing the way the main output method works would create untold numbers of issues for existing code, so I can't see that happening without extensive testing of backwards compatibility. Secondly you don't explain the rationale for the changes, e.g. what you couldn't achieve with the current functionality, or what the motivation behind "guts" is. I don;t have time to go through the code and work out this rationale. Sorry for the delay in replying, I'm very busy but i will try and keep up to date on responding to these modules. Cheers, Jeff.
Subject: Re: [rt.cpan.org #64220] Submitting ideas for revisions to HTML::TreeBuilder
Date: Sun, 13 Mar 2011 14:11:30 -0400
To: bug-HTML-Tree [...] rt.cpan.org
From: "Miko O'Sullivan" <miko [...] idocs.com>
That posting pretty much became moot once I realized I was using an outdated module. For some reason the package manager doesn't grab the latest version unless I specifically request that version. Anyway you can ignore the whole post, most of the issues have been addressed in the later release. Miko On Sun, Mar 13, 2011 at 4:23 AM, Jeff Fearn via RT <bug-HTML-Tree@rt.cpan.org> wrote: Show quoted text
> <URL: https://rt.cpan.org/Ticket/Display.html?id=64220 > > > Hi Miko, two things, firstly this is a very old module, changing the way > the main output method works would create untold numbers of issues for > existing code, so I can't see that happening without extensive testing > of backwards compatibility. > > Secondly you don't explain the rationale for the changes, e.g. what you > couldn't achieve with the current functionality, or what the motivation > behind "guts" is. I don;t have time to go through the code and work out > this rationale. > > Sorry for the delay in replying, I'm very busy but i will try and keep > up to date on responding to these modules. > > Cheers, Jeff. >
Hi Miko, https://rt.cpan.org/Ticket/Display.html?id=66110 addresses the cpan indexer not picking up the new version of sub modules. I'll be pushing out a new version to address that soon. Good to hear your problem was addressed in the newer versions. Cheers, Jeff.