Date: Tue, 30 Nov 2004 20:01:00 +0100
From: Dominique Quatravaux <dom [...] idealx.com>
To: bug-www-mechanize [...] rt.cpan.org
Subject: New features: input_has_label() and other HTML-tree methods
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Dear Mechanists,
Enclosed is a quite jumbo'ed patch (1800+ lines!) to WWW::Mechanize
that purports to add some of the "technical-free" features I've been
raving about on the developer list. The "Changes" file (near the top
of the patch) details the added methods; allow me to outline a few of
them:
  * input_has_label(): the most important one; it allows the programmer
    to match input widgets using their nearby text labels in the
    HTML source in a wide variety of situations. Labels may be in
    the same paragraph as the form controls, or they may be <h3>
    titles, or they may be in the same HTML table row, etc. Look at
    t/html-tree.t to see how robust the heuristics are. The following
    code (excerpt from the POD) tells the Mech to click on whatever
    button is labeled "I want no spam" in a technical-free fashion
    (that is, no need to "view source..." in the browser or count
    widgets anymore in order to implement that):

      map { my $input = $_;
            $input->value($mech->input_has_label($input, qr/I want no spam/i)) }
          ($mech->forms->[0]->inputs);

  * ->node_of_form() and ->nodes_of_input(): starting from the
    HTML::Form::Input objects, one can get back to their position in
    the HTML parse tree. Quite handy to get additional contextual
    info about the widgets that the parser in HTML::Form might
    neglect to remember;
  * ->text_node_at(): likewise, starting from a given string offset
    in the plain-text version of the current page, one can get back
    to the corresponding HTML node. The converse (from tree to text)
    is also possible using ->textify_tree(). The programmer can now
    mix and match regex-based and tree-based methods to check the
    contents of those HTML documents!
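
To make the offset-to-node idea concrete without relying on the patch, here is a minimal sketch that uses only plain HTML::TreeBuilder. The helpers flatten_with_offsets() and element_at_offset() are hypothetical names made up for this illustration; they are not the patch's implementation of ->text_node_at() or ->textify_tree(), and the "plain text" here is simply the concatenation of the tree's text nodes in document order, which is an assumption.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (not part of the patch): flatten a parse tree into
# plain text, remembering which element directly contains each run of text.
# Returns ($text, \@segments); each segment is [start, end, parent_element],
# with "end" being exclusive.
sub flatten_with_offsets {
    my ($root) = @_;
    my $text = '';
    my @segments;
    my @pile = map { [ $_, $root ] } $root->content_list;
    while (@pile) {
        my ( $node, $parent ) = @{ shift @pile };
        if ( ref $node ) {
            # An element: visit its children next, keeping document order.
            unshift @pile, map { [ $_, $node ] } $node->content_list;
        }
        else {
            # A text node: record the offsets it occupies in the flat text.
            my $start = length($text);
            $text .= $node;
            push @segments, [ $start, length($text), $parent ];
        }
    }
    return ( $text, \@segments );
}

# Hypothetical helper: map an offset in the flat text back to the element
# that contains it, or return nothing if the offset is out of range.
sub element_at_offset {
    my ( $segments, $offset ) = @_;
    for my $seg (@$segments) {
        return $seg->[2] if $offset >= $seg->[0] && $offset < $seg->[1];
    }
    return;
}

my $tree = HTML::TreeBuilder->new_from_content(
    '<html><body><h3>Newsletter</h3><p>I want <b>no spam</b></p></body></html>'
);
my ( $text, $segments ) = flatten_with_offsets($tree);

# Regex on the text side, then hop over to the tree side.
if ( $text =~ /no spam/ ) {
    my $elem = element_at_offset( $segments, $-[0] );
    print "match starts inside <", $elem->tag, ">\n" if $elem;
}
```

The point of the design is that the regex works on the flat text while the segment table keeps the way back to the tree, so the two styles of checking can be combined on the same page.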
Please let me know what you think of it all. Pursuant to Andy's recent
advice, I have done my best to ensure that this patch is ready for
immediate inclusion in a Mech release (docs, tests, no tabs :-) but of
course I'm prepared to deal with modification requests and resubmit.
Best regards,
- --
Dominique QUATRAVAUX Senior Engineer
01 44 42 00 08 IDEALX
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFBrMNsMJAKAU3mjcsRAsBiAJ0SMKFdMnqUnxUvgkTfXMiQuJ5/nQCdG4Mz
zlZcOiArJ2FemblOR2dtpEQ=
=qOeq
-----END PGP SIGNATURE-----