
This queue is for tickets about the HTML-DOM CPAN distribution.

Report information
The Basics
Id: 65363
Status: resolved
Priority: 0/
Queue: HTML-DOM

People
Owner: Nobody in particular
Requestors: martin [...] akm.com.au
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: 0.046
Fixed in: (no value)



Subject: Problem with HTML::DOM::Element::Form->inputs
Hi,

I have found a problem that affects reading the inputs that are in a form. This problem only occurs when an HTML page has multiple forms. Basically it seems that whatever works out which inputs are in which form (I could not find it in the code) is not detecting the </form>, so the first form contains all inputs on the page. Then the second form contains all the inputs not in the first form. Also, it always assigns form1 as the parent form of the input object, instead of the form that it is actually in.

Below is an example of this problem:

Form   | Form | input | input
Number | Name | type  | value
------------------------------------------------------------------------
Correct Form1 inputs
------------------------------------------------------------------------
Form 1 => form_login_compact => hidden => source
Form 1 => form_login_compact => hidden => token
Form 1 => form_login_compact => hidden => id
Form 1 => form_login_compact => hidden => redirect_to
Form 1 => form_login_compact => hidden => qty
Form 1 => form_login_compact => hidden => syndicate_purchase_unique_id
Form 1 => form_login_compact => hidden => referring_url
Form 1 => form_login_compact => hidden => game_offer_id
Form 1 => form_login_compact => text => email
Form 1 => form_login_compact => password => password
Form 1 => form_login_compact => submit => submit
Form 1 => form_login_compact => hidden => md5
------------------------------------------------------------------------
Form2 inputs appearing in Form1
------------------------------------------------------------------------
Form 1 => form_login_compact => option => lottery_id
Form 1 => form_login_compact => text => from_date
Form 1 => form_login_compact => text => from_draw
Form 1 => form_login_compact => submit => Display
------------------------------------------------------------------------
Form2 inputs, they have Form1 as a parent
------------------------------------------------------------------------
Form 2 => form_login_compact => option => lottery_id
Form 2 => form_login_compact => text => from_date
Form 2 => form_login_compact => text => from_draw
Form 2 => form_login_compact => submit => Display
------------------------------------------------------------------------
The above was generated with the following code
------------------------------------------------------------------------
foreach my $form_number (1..2) {
    my $form = $mech->form_number($form_number);
    my $submit_link = URI->new($form->action)->path;
    foreach my $input ($form->inputs) {
        my $input_form;
        DEBUG($input->{_HTML_DOM_f} . ' ' . $input->{_parent});
        if (defined($input->{_HTML_DOM_f}) && $input->{_HTML_DOM_f} =~ m/^HTML::DOM::Element::Form/) {
            $input_form = $input->{_HTML_DOM_f}->name;
        } elsif (defined($input->{_parent}) && $input->{_parent} =~ m/^HTML::DOM::Element::Form/) {
            $input_form = $input->{_parent}->name;
        } elsif ($input =~ m/^HTML::DOM::Collection::Options/) {
            my $select = $input->[0]->{_parent};
            if (defined($select->{_HTML_DOM_f}) && $select->{_HTML_DOM_f} =~ m/^HTML::DOM::Element::Form/) {
                $input_form = $select->{_HTML_DOM_f}->name;
            } elsif (defined($select->{_parent}) && $select->{_parent} =~ m/^HTML::DOM::Element::Form/) {
                $input_form = $select->{_parent}->name;
            }
        }
        print("Form $form_number => " . $input_form . " => " . $input->type . " => " . ($input->name || $input->value) . "\n");
    }
}
On Wed Feb 02 00:05:35 2011, makman wrote:
> Hi,
>
> I have found a problem that affects reading the inputs that are in a
> form. This problem only occurs when a HTML page has multiple forms.
> Basically it seems like what ever works out what inputs are in which
> form (I could not find it in the code), is not detecting the </form>, so
> the first form contains all inputs on the page. Then the second form
> contains all the inputs not in the first form. Also it is assigning the
> parent form of the input object to always be form1, instead of the form
> that it is actually in.
Could you send me the source of the page? It’s hard to debug this without that.

There is some funny business going on in HTML::DOM with regard to form-to-input associations. Many people expect this to work:

    <td><form>
    <td><input>

So browsers make it work. HTML::DOM tries to copy web browsers in that regard. Basically, if an input is not inside any form, it is associated with the last form that was implicitly closed, if any. That means that, in this case,

    <div><form></div>
    <form></form>
    <input>

the input is associated with the first form.

There could be a bug in that code somewhere (it’s in the HTML::DOM::Element::HTML package inside lib/HTML/DOM.pm and also in lib/HTML/DOM/Element/Form.pm [search for mg_elem]), but I’ve not been able to come up with a case that exhibits it.
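The fallback rule described above (an input that is not inside any form is attached to the last form that was implicitly closed) can be modelled in a few lines of plain Perl. This is only a toy sketch under stated assumptions: it does not use HTML::DOM and does not reproduce its parser, and the `associate_inputs` function and the event names (`open`, `close`, `implicit_close`, `input`) are hypothetical, invented here for illustration.

```perl
use strict;
use warnings;

# Toy model of the association rule. Each event is [kind, name]:
#   open           - a <form> start tag
#   close          - an explicit </form>
#   implicit_close - the form was closed because an enclosing element ended
#   input          - a form control
# These event kinds are hypothetical; they are not part of HTML::DOM.
sub associate_inputs {
    my @events = @_;
    my ($current, $last_implicit);   # currently open form; last implicitly closed form
    my %owner;                       # input name => form name (undef if none)
    for my $event (@events) {
        my ($kind, $name) = @$event;
        if ($kind eq 'open') {
            $current = $name;
        }
        elsif ($kind eq 'close') {
            $current = undef;
        }
        elsif ($kind eq 'implicit_close') {
            $last_implicit = $current;
            $current       = undef;
        }
        elsif ($kind eq 'input') {
            # Inside a form: use that form. Otherwise fall back to the
            # last implicitly closed form, if there is one.
            $owner{$name} = defined $current ? $current : $last_implicit;
        }
    }
    return \%owner;
}

# Models: <div><form></div> <form></form> <input>
my $owner = associate_inputs(
    [ 'open',           'form1'  ],
    [ 'implicit_close', 'form1'  ],
    [ 'open',           'form2'  ],
    [ 'close',          'form2'  ],
    [ 'input',          'orphan' ],
);
print "orphan => $owner->{orphan}\n";   # prints "orphan => form1"
```

In this model an explicitly closed form never becomes a fallback target, which matches the description above. How HTML::DOM itself decides that a form was closed implicitly is not covered by the sketch.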
From: martin [...] akm.com.au
Ok, well I had a look at it all and it looks like it is probably due to bad markup, caused by the crappy form templating system this site uses, so I can't easily change where it puts the <form> & </form> tags.

But basically what is happening is, the page contains:

    <table>
      <tr>
        <form>
          <input type="hidden">
          <input type="hidden">
          <td><input type="text"></td>
          <td><input type="text"></td>
          <td><input type="text"></td>
        </form>
      </tr>
      <tr><td>....</td><td>....</td><td>....</td></tr>
    </table>

When the page is loaded and converted to an HTML DOM tree, the closing form tag is moved, so it then looks like:

    <table>
      <tr>
        <td>
          <form>
            <input type="hidden">
            <input type="hidden">
          </form>
        </td>
        <td><input type="text"></td>
        <td><input type="text"></td>
        <td><input type="text"></td>
      </tr>
      <tr><td>....</td><td>....</td><td>....</td></tr>
    </table>

Notice how it wraps the form and hidden inputs into a new TD, so the text inputs are not in any form. I think this is what makes it start implicitly closing forms, and makes the wrong form/input associations. I think it may be due to your <td><form> workaround; I say that just because it is wrapping it up in a TD.

It looks like that is the only thing going on, but I am not 100% sure. I tried looking at the code, but had no idea what to edit, or where that conversion happened.

Anyway, the HTML file is attached.

Thanks,
Martin
Subject: page_with_error.html

Message body is not shown because it is too large.

On Sun Feb 06 20:18:07 2011, makman wrote:
> Ok, well i had a look at it all and it looks like it probably due to bad
> markup, caused by the crappy form templating system this site uses, so I
> can't easily change where it puts the <form> & </form> tags.
>
> But basically what is happening is, the page contains:
> <table>
>   <tr>
>     <form>
>       <input type="hidden">
>       <input type="hidden">
>       <td><input type="text"></td>
>       <td><input type="text"></td>
>       <td><input type="text"></td>
>     </form>
>   </tr>
>   <tr><td>....</td><td>....</td><td>....</td></tr>
> </table>
>
> And when the page is loaded and the page is converted to a HTML DOM
> tree, which moves the closing form tag, so it then looks like
> <table>
>   <tr>
>     <td>
>       <form>
>         <input type="hidden">
>         <input type="hidden">
>       </form>
>     </td>
>     <td><input type="text"></td>
>     <td><input type="text"></td>
>     <td><input type="text"></td>
>   </tr>
>   <tr><td>....</td><td>....</td><td>....</td></tr>
> </table>
>
> Notice how it wraps the form and hidden inputs into a new TD. So the
> text inputs are not in any form, I think this is what makes it start
> implicitly close forms, and makes the wrong forms/input association. I
> think it may be due to your <td><form> workaround, i say that just cause
> it is wrapping it up in a TD.
>
> It looks that is the only thing going on, but not 100% sure, I tried
> looking at the code, but had no idea what to edit, or where that
> conversion happened.
>
> Anyway the HTML file is attached.
Thank you. I was able to track it down, and have fixed it in version 0.047.