
This queue is for tickets about the HTML-ExtractMain CPAN distribution.

Report information
The Basics
Id: 65560
Status: new
Priority: 0
Queue: HTML-ExtractMain

People
Owner: Nobody in particular
Requestors: carla.lopes [...] fe.up.pt
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: HTML::ExtractMain - problems finding relevant content
Date: Tue, 8 Feb 2011 12:24:09 +0000
To: bug-html-extractmain [...] rt.cpan.org
From: Carla Teixeira Lopes <carla.lopes [...] fe.up.pt>
Hi,

I'm using HTML::ExtractMain to extract the main content of web pages, and I'm running into problems on pages where I would not expect any. For example, when I run it on the contents of http://www.carlalopes.com/research.html, it cannot find any relevant content, even though the page is very "clean" and well-formed. I also tried the Readability application, online at http://lab.arc90.com/experiments/readability/, and it returned no error.

Any idea why this happens?

Thanks,
Carla Teixeira Lopes