[GAAS - Fri Oct 3 08:50:30 2003]:
> My mailbox is a mess. Can you post the patch you suggest here?
Here is the email again, patch attached (arjen.diff)
cheers,
arjen
I have thought about the two ways of extending the get_text sub in
HTML::TokeParser: (1) let it accept a list of arguments, or (2) let it
accept a reference to an array.
I think the array reference (2) could be useful in the future, but it
breaks backward compatibility, doesn't it?
Option 1 is a one-line patch (excluding the documentation changes, which
I've also done), and existing calls to get_text remain valid. It would be
great if you could consider the attached patch, which implements it.
Please let me know what you think.
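To illustrate option 1 outside the module, here is a minimal standalone sketch of the same grep-based check. It does not use HTML::TokeParser itself: the helper names stops_at and text_until and the simplified token list are assumptions made up for this example, and only the grep expression mirrors the patch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the patched end-tag test (not the module's code).
# With no end tag given we stop at any tag, as the unpatched code does.
sub stops_at {
    my ($tag, $endat, @rest) = @_;
    return 1 unless defined $endat;
    return scalar(grep { $_ eq $tag } ($endat, @rest));
}

# Collect text from a simplified token list until one of the end tags is seen.
# Tokens here are ["T", $text], ["S", $tag] or ["E", $tag]; this is a
# reduced form chosen for the example, not the parser's full token layout.
sub text_until {
    my ($tokens, @endtags) = @_;
    my $text = "";
    for my $token (@$tokens) {
        my ($type, $value) = @$token;
        if ($type eq "T") {
            $text .= $value;
            next;
        }
        # End tags are compared with a leading slash, as in the patch context.
        my $tag = $type eq "E" ? "/$value" : $value;
        last if stops_at($tag, @endtags);
    }
    return $text;
}

my @tokens = (
    ["T", "first "], ["S", "b"], ["T", "bold"], ["E", "b"],
    ["T", " line"],  ["S", "br"], ["T", "second"], ["S", "p"],
);

my $one = text_until(\@tokens, "p");        # single end tag, as before
my $two = text_until(\@tokens, "p", "br");  # stops at whichever comes first
print "$one\n$two\n";
```

A call with a single end tag behaves as it did before, which is why existing callers would be unaffected; the extra arguments only add more places to stop.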
--- TokeParser.pm Tue Apr 10 19:44:04 2001
+++ TokeParser_new.pm Sat Mar 15 19:07:38 2003
@@ -88,7 +88,7 @@
} else {
$tag = "/$tag";
}
- if (!defined($endat) || $endat eq $tag) {
+ if (!defined($endat) || grep { $_ eq $tag } ($endat,@_) ) {
$self->unget_token($token);
last;
}
@@ -200,13 +200,15 @@
["/$tag", $text]
-=item $p->get_text( [$endtag] )
+=item $p->get_text( [$endtag, ...] )
This method returns all text found at the current position. It will
-return a zero length string if the next token is not text. The
-optional $endtag argument specifies that any text occurring before the
-given tag is to be returned. Any entities will be converted to their
-corresponding character.
+return a zero length string if the next token is not text. If
+one or more arguments are given, then we return any text occurring before the first of the specified tags found. For example:
+
+ $p->get_text("p", "br");
+
+will return the text up to either a paragraph or line break element. Any entities will be converted to their corresponding character.
The $p->{textify} attribute is a hash that defines how certain tags can
be treated as text. If the name of a start tag matches a key in this
@@ -225,7 +227,7 @@
This means that <IMG> and <APPLET> tags are treated as text, and that
the text to substitute can be found in the ALT attribute.
-=item $p->get_trimmed_text( [$endtag] )
+=item $p->get_trimmed_text( [$endtag, ...] )
Same as $p->get_text above, but will collapse any sequences of white
space to a single space character. Leading and trailing white space is