
This queue is for tickets about the XML-XPathEngine CPAN distribution.

Report information
The Basics
Id: 18705
Status: resolved
Priority: 0/
Queue: XML-XPathEngine

People
Owner: Nobody in particular
Requestors: rnapier [...] employees.org
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: (no value)
Fixed in: (no value)



Subject: Unexpected sorting of XPathEngine
In XML::XPathEngine::find, there is a sort->remove_duplicates call for NodeSets. I think I understand why this is being done, but it seems to break at least one page I'm looking at (and breaks it very consistently). You appear to be sorting on the memory location of the item in the NodeSet. Consider this code:

---
use LWP::Simple;
use HTML::TreeBuilder::XPath;
my $page = get( "http://rdu.news14.com/content/weather/7day_forecast/" );
my $tree = HTML::TreeBuilder::XPath->new_from_content( $page );
my $nodes = $tree->findnodes( '//b' );
print $nodes;
---

I've attached the specific version of index.html that causes the problem. The issue is that the data in the HTML (and in $tree) is in a different order than the information in $nodes. The data in the first two rows of the forecast table wind up at the end of the NodeList.

Is the remove_duplicates step actually necessary? (I've removed it without seeing immediate problems.) If so, could you remove duplicates without sorting? This would probably be faster, though it might take a little more memory to hold a %seen hash.
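The %seen approach suggested above could look roughly like the following. This is a minimal sketch, not code from XML::XPathEngine: the helper name unique_in_order is hypothetical, and plain hash references stand in for real tree nodes. It keeps the first occurrence of each reference and preserves input order, which only matches document order if the input was already in document order.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helper (not part of XML::XPathEngine): drop repeated node
# references while keeping the first occurrence of each in its original position.
sub unique_in_order {
    my @nodes = @_;
    my %seen;    # keyed by reference address, so object identity decides duplicates
    return grep { !$seen{ refaddr($_) }++ } @nodes;
}

# Stand-in nodes: plain hash references instead of real tree elements.
my ($first, $second, $third) = map { +{ text => $_ } } qw(Mon Tue Wed);
my @unique = unique_in_order($first, $second, $first, $third, $second);

print join(' ', map { $_->{text} } @unique), "\n";    # prints "Mon Tue Wed"
```

The trade-off is the one mentioned in the report: no sort is needed, at the cost of one hash entry per distinct node.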
Subject: index.html

Message body is not shown because it is too large.

Hi,

The bug is not in XML::XPathEngine; it is in HTML::TreeBuilder::XPath. The comparison method used cmp instead of <=>, which caused the problem you had.

I have put an updated version of HTML::TreeBuilder::XPath at http://www.xmltwig.com/module/html-treebuilder-xpath/. Let me know if it works better for you, in which case I will upload it to CPAN.

Thanks
__ mirod
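To illustrate why cmp versus <=> matters for ordering, here is a small sketch. The position numbers are made up for the example and are not taken from the module's source; the point is only that cmp compares operands as strings, so multi-digit values sort out of numeric order.

```perl
use strict;
use warnings;

# Hypothetical numeric keys standing in for whatever values a comparison
# method might order nodes by; not taken from HTML::TreeBuilder::XPath.
my @positions = (1, 2, 9, 10, 11, 100);

my @as_strings = sort { $a cmp $b } @positions;    # string comparison
my @as_numbers = sort { $a <=> $b } @positions;    # numeric comparison

print "cmp: @as_strings\n";    # prints "cmp: 1 10 100 11 2 9"
print "<=>: @as_numbers\n";    # prints "<=>: 1 2 9 10 11 100"
```

With string comparison, "10" and "100" sort before "2", which is the kind of reordering that can push early items toward the end of a result list.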
The new version of HTML::TreeBuilder::XPath seems to fix the problem. Thanks.
On Thu Apr 20 12:23:40 2006, guest wrote:
> The new version of HTML::TreeBuilder::XPath seems to fix the problem.
> Thanks.
Thanks, I just uploaded HTML-TreeBuilder-XPath-0.03 to CPAN.
__ mirod