Skip Menu |

This queue is for tickets about the XML-Twig CPAN distribution.

Report information
The Basics
Id: 20322
Status: rejected
Priority: 0/
Queue: XML-Twig

People
Owner: Nobody in particular
Requestors: cmccutcheon [...] oneil.com
Cc:
AdminCc:

Bug Information
Severity: Critical
Broken in:
  • 3.25
  • 3.26
Fixed in: (no value)



Subject: The method being employed to break circular references is just a little *too* eager.
See the below script for an example of broken and working code: #!/usr/bin/perl use strict; use warnings; use XML::Twig; my $broken = 1; my $xml = "<a><b>foo</b><c><d>bar</d><e>baz</e></c></a>"; if($broken) { my $root = undef; { my $twig = XML::Twig->new(pretty_print => "none"); my $doc = $twig->safe_parse($xml); if($@) { die "ACK, GASP: Parse error: $@\n"; } $root = $doc->root; } warn "Finding nodes...\n"; my @nodes = $root->findnodes("/a/c/e"); warn "This message will never print.\n"; print "Nodes: ", $nodes[0]->sprint; } else { my $twig = XML::Twig->new(pretty_print => "none"); my $doc = $twig->safe_parse($xml); if($@) { die "ACK, GASP: Parse error: $@\n"; } my $root = $doc->root; warn "Finding nodes...\n"; my @nodes = $root->findnodes("/a/c/e"); warn "This message will print.\n"; print "Nodes: ", $nodes[0]->sprint; } The exact error I get is: Can't call method "children" on an undefined value at (eval 12) line 1. If all mentions of XML::Twig are changed to XML::Twig::XPath, the error changes to: Can't use an undefined value as a HASH reference at /usr/lib/perl5/site_perl/5.8.8/XML/Twig/XPath.pm line 131. This problem is particularly frustrating when an XML::Twig is passed between functions. The work-around (read "very ugly hack") is to ->sprint() out the twig (yes, ->sprint() works but ->findnodes() does not... go figure :/) and reparse if it is needed in a different scope than the one in which it was declared -- not exactly conductive to clean code. Also, upgrading Scalar::Util to the latest and greatest (I was at 1.14, then upgraded to 1.18) did not fix the problem either.
Hi, Sorry for the delay, I was away for a couple of weeks. The problem I see with your code is that you go (slightly!) out of your way to get the variable containing the twig out of scope. You keep $root in scope, by declaring it outside of the block in which you create and parse the twig. But you don't do the same for the entire twig. It is declared inside the block, and goes out of scope before you call findnodes. Then findnodes looks for the twig when it sees the leading / in the XPath expression, and gets an undef. But in any case. if you only keep part of a twig in scope (the root being just one possible case), the using an XPath expression with a leading / is bound to create problems. I can probably fix the XML::Twig version of findnodes, the XML::Twig::XPath version might be a little more tricky (some of the code is either in XML::XPath or in XML::XPathEngine). The thing is, why do you discard the twig and keep only the root? I understand that your code is only meant to show the problem, but I kinda fail to imagine situations where you really can't keep the whole twig around, but still need XPath queries with a leading /. So at this point my advice would be "don't do it", and the fix I am thinking of is just to generate a nicer looking error. I am willing to be convinced that this is not the best solution though. Does it make sense? __ mirod
Subject: RE: [rt.cpan.org #20322] The method being employed to break circular references is just a little *too* eager.
Date: Wed, 19 Jul 2006 10:35:43 -0400
To: <bug-XML-Twig [...] rt.cpan.org>
From: "Calvin McCutcheon" <cmccutcheon [...] oneil.com>
Greetings, Yes, what you said made sense. Going slightly out of my way to get the twig out of scope is the simplest way I can demonstrate what happens when an XML::Twig root is passed to a function (such as, a recursive function). Or, when you do a copy of a root node (in preparation for some destructive testing) and try to run an XPath on the copied node (with or without the original twig in scope). As for having XPath queries which begin with a leading "/", think of a tree walker that can get an instruction to go to any point on the tree from any point on the tree. And yes, I can see your point about using an XPath with a leading "/" from an arbitrarily deep node (I still think the root node should be an exception to this though). Speaking of going from any point to any point, attached is a proposed patch for the XML::Twig->xpath method. The key difference is the ability to say something like this: my $xpath = $first_node->xpath($second_node); and get the XPath from the first node to the second node. Also, (for backward compatibility) if no second node is passed in, the output is identical to the original ->xpath method. As for the problem at hand, how is this for a possible solution? Instead of (or in conjunction with) creating a better error message (which would help), why not allow the use of Scalar::Util (and WeakRef) to be user configurable with something like: XML::Twig->new(memory_management => 0); and the corresponding method: $twig->set_memory_management(0); where the default would be "memory_management => 1". In conjunction with this change, the write-ups for ->dispose and ->delete would need a little more prominence in the documentation. (Although, I would vastly prefer the ability to use ->findnodes from arbitrarily deep nodes (with or without the original twig in scope).) Calvin Show quoted text
-----Original Message----- From: via RT [mailto:bug-XML-Twig@rt.cpan.org] Sent: Sunday, July 16, 2006 10:49 AM To: Calvin McCutcheon Subject: [rt.cpan.org #20322] The method being employed to break circular references is just a little *too* eager. <URL: http://rt.cpan.org/Ticket/Display.html?id=20322 > Hi, Sorry for the delay, I was away for a couple of weeks. The problem I see with your code is that you go (slightly!) out of your way to get the variable containing the twig out of scope. You keep $root in scope, by declaring it outside of the block in which you create and parse the twig. But you don't do the same for the entire twig. It is declared inside the block, and goes out of scope before you call findnodes. Then findnodes looks for the twig when it sees the leading / in the XPath expression, and gets an undef. But in any case. if you only keep part of a twig in scope (the root being just one possible case), the using an XPath expression with a leading / is bound to create problems. I can probably fix the XML::Twig version of findnodes, the XML::Twig::XPath version might be a little more tricky (some of the code is either in XML::XPath or in XML::XPathEngine). The thing is, why do you discard the twig and keep only the root? I understand that your code is only meant to show the problem, but I kinda fail to imagine situations where you really can't keep the whole twig around, but still need XPath queries with a leading /. So at this point my advice would be "don't do it", and the fix I am thinking of is just to generate a nicer looking error. I am willing to be convinced that this is not the best solution though. Does it make sense? __ mirod ********************************************************************** Confidentiality Notice The information contained in this e-mail is confidential and intended for use only by the person(s) or organization listed in the address. If you have received this communication in error, please contact the sender at O'Neil & Associates, Inc., immediately. Any copying, dissemination, or distribution of this communication, other than by the intended recipient, is strictly prohibited. **********************************************************************

Message body is not shown because sender requested not to inline it.

Sorry for the delay, job has kept me quite busy this week. On Wed Jul 19 10:36:54 2006, cmccutcheon@oneil.com wrote: Show quoted text
> Going slightly out of my way to get the twig out of scope is the > simplest way I can demonstrate what happens when an XML::Twig root is > passed to a function (such as, a recursive function). Or, when you do a > copy of a root node (in preparation for some destructive testing) and > try to run an XPath on the copied node (with or without the original > twig in scope).
If you pass nodes (root or otherwise) to a function, you just have to make sure the twig itself is still in scope somewhere. If you want to copy the root node, then I will provide a copy (or clone) method for a twig so you can play with it to your heart's content. In the meantime you can write your own, just create an empty twig and attach the copy of the root node to it: # $twig is the original twig my $root_node_copy= $twig->root->copy; my $new_twig= XML::Twig->new; $new_twig->set_root( $root_node_copy); Show quoted text
> As for having XPath queries which begin with a leading "/", think of a > tree walker that can get an instruction to go to any point on the tree > from any point on the tree. And yes, I can see your point about using > an XPath with a leading "/" from an arbitrarily deep node (I still think > the root node should be an exception to this though).
I thought about making an exception for the root node, but really, in all XML models I know, the document node and the root node are 2 different nodes, which really makes sense (the document node can have the XML declaration, or a DTD attached for starters). Show quoted text
> Speaking of going from any point to any point, attached is a proposed > patch for the XML::Twig->xpath method. The key difference is the > ability to say something like this: > > my $xpath = $first_node->xpath($second_node); > > and get the XPath from the first node to the second node. Also, (for > backward compatibility) if no second node is passed in, the output is > identical to the original ->xpath method.
This is quite different from the original xpath method, and seems to me quite specific to your code. You could subclass XML::Twig (and XML::Twig::Elt, look at the elt_class option: http://xmltwig.com/xmltwig/twig_dev.html#METHODS_XML_Twig_new_elt_class) and either use the method in your patch, or one with a different name. Show quoted text
> As for the problem at hand, how is this for a possible solution? > Instead of (or in conjunction with) creating a better error message > (which would help), why not allow the use of Scalar::Util (and WeakRef) > to be user configurable with something like: > > XML::Twig->new(memory_management => 0); > > and the corresponding method: > > $twig->set_memory_management(0); > > where the default would be "memory_management => 1". In conjunction > with this change, the write-ups for ->dispose and ->delete would need a > little more prominence in the documentation. (Although, I would vastly > prefer the ability to use ->findnodes from arbitrarily deep nodes (with > or without the original twig in scope).)
The thing is, all you have to do to avoid the problem, is not have the twig go out of scope. Having XML::Twig test an option for each element during parsing seems like a pretty sub-optimal solution to this. Really, I don't quite understand why you NEED to drop the twig in your code. It's not like it's a huge object, compared to the tree. It's not small, but it has a fixed size, a few Kb at most. __ mirod