This queue is for tickets about the HTML-Tree CPAN distribution.

Report information
The Basics
Id: 58941
Status: rejected
Priority: 0/
Queue: HTML-Tree

People
Owner: Nobody in particular
Requestors: ashishfa [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: bug
Date: Tue, 29 Jun 2010 18:21:23 +0530
To: bug-HTML-Tree [...] rt.cpan.org
From: Ashish Almeida <ashishfa [...] gmail.com>
Dear Pete Krawczyk,

Ubuntu 9.10
perl v5.10.0
HTML::Element 3.23
as_text()

When I fetch text using this function, the text from separate tags runs together. There should be some white space, or another user-defined separator character, between words that belong to different tags.

Because the words stick together, I get unexpected errors when I use whole-word search: query words do not match words at the start or end of a tag, since they stick to the adjoining words.

E.g.

<div><a>TATA Manza</a>This is a best car you can buy... </div>

returns

TATA ManzaThis is a best car you can buy...

and the search query for "Manza" as a full-word match fails, since my system treats "ManzaThis" as a single word.

Regards,
Ashish

--
Ashish Almeida
---------------------------------
On Tue Jun 29 22:51:34 2010, ashishfa@gmail.com wrote:
> Dear Pete Krawczyk,
>
> ubuntu 9.10
> perl, v5.10.0
> HTML::Element::3.23
> *as_text() *
>
> when i fetch text using this function, the text sticks to each other even
> when it is under separate tags.
> there should be some white space or other user-defined separator character
> between words when they belong to different tags
>
> since the words stick to each other, I get unexpected errors when i use
> complete word based search. (query word do not match to words which are at
> the end or start of the tags since they stick to adjoining words)
> e.g.
>
> <div><a>TATA Manza</a>This is a best car you can buy... </div>
>
> this returns
>
> TATA ManzaThis is a best car you can buy...
>
> and the search query for "Manza" as a full word match fails since my system
> treats "ManzaThis" as a single word.

This is the correct output for inline HTML. If the source contains no white space between the elements, as_text() has no way to tell HTML that is accidentally missing white space from HTML that deliberately omits it.

Cheers,
Jeff.
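
For anyone who still needs a separator for whole-word search, one possible workaround is to collect the text segments yourself and join them with a separator of your choice, instead of calling as_text(). The following is only a minimal sketch and not something proposed in this ticket: text_segments() is a hypothetical helper name, the single-space separator is an assumption, and skipping script and style elements is a choice made here to stay close to what as_text() returns.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (not part of HTML::Element): return the text
# segments below $node in document order, one list item per text node.
sub text_segments {
    my ($node) = @_;
    my @segments;
    for my $child ($node->content_list) {
        if (ref $child) {
            # Child is an element; skip script/style, recurse into the rest.
            next if $child->tag eq 'script' || $child->tag eq 'style';
            push @segments, text_segments($child);
        }
        else {
            # Child is a plain text string.
            push @segments, $child;
        }
    }
    return @segments;
}

my $html = '<div><a>TATA Manza</a>This is a best car you can buy... </div>';
my $tree = HTML::TreeBuilder->new_from_content($html);

# Trim each segment, drop empty ones, and join with the chosen separator.
my @clean;
for my $segment (text_segments($tree)) {
    $segment =~ s/^\s+//;
    $segment =~ s/\s+$//;
    push @clean, $segment if length $segment;
}
my $text = join ' ', @clean;

print "$text\n";
$tree->delete;
```

Note that this inserts a separator at every element boundary, so markup such as `<b>T</b>ATA` would be split into two words. That is the same ambiguity described in the reply above, so the separator is best treated as a search-indexing aid rather than a faithful rendering of the page.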