Bug #65560 for HTML-ExtractMain: HTML::ExtractMain - problems finding relevant content

Subject:	HTML::ExtractMain - problems finding relevant content
Date:	Tue, 8 Feb 2011 12:24:09 +0000
To:	bug-html-extractmain [...] rt.cpan.org
From:	Carla Teixeira Lopes <carla.lopes [...] fe.up.pt>

Hi,

I'm using HTML::ExtractMain to extract the main content of web pages, and it is failing on pages that should not cause any problems. For example, when I run it on the contents of http://www.carlalopes.com/research.html, it cannot find any relevant content, even though this page is very "clean" and well-formed. I also tried the Readability application, online at http://lab.arc90.com/experiments/readability/, and it returned no error.

Any idea why this happens?

Thanks,
Carla Teixeira Lopes