Date: Fri, 1 Oct 2004 18:37:03 +0200
From: Honza Pazdziora <adelton [...] fi.muni.cz>
To: bug-WWW-Mechanize [...] rt.cpan.org
CC: libwww [...] perl.org
Subject: WWW::Mechanize security: uses default value of file inputs
Hello,
I started using WWW::Mechanize to retrieve some data from a server
which is not under my control. Suddenly, the error
Can't open file $sou: No such file or directory at /usr/lib/perl5/site_perl/5.8.1/HTML/Form.pm line 525
appeared. Whoa. I never told my script to work with $sou, or
'$sou'. Where did it come from?
After digging through the sources, I figured out that
<input type="file" name="soubor" value="$sou" vsize="60">
really is in the source HTML and HTML::Form passes it to
WWW::Mechanize. And since I never intended to touch this value, the
default is used and WWW::Mechanize happily tries to upload the file.
I consider this to be a major security hole -- what if the authors
of the HTML page had used the path to the user's id_dsa file?
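To check whether a given page carries such prefilled file inputs, a small
audit along the following lines might help. This is only a sketch: the
helper name prefilled_file_inputs and the example.com base URL are made up
for illustration, and whether anything is reported depends on how the
installed HTML::Form version treats the value attribute of file inputs.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical helper (not part of HTML::Form or WWW::Mechanize):
# return one hash per file input that arrives with a non-empty value.
sub prefilled_file_inputs {
    my ($html, $base) = @_;
    my @found;
    for my $form (HTML::Form->parse($html, $base)) {
        for my $input ($form->inputs) {
            next unless $input->type eq 'file';
            my $value = $input->value;
            next unless defined $value && length $value;
            push @found, { name => $input->name, value => $value };
        }
    }
    return @found;
}

# Sample markup modelled on the input quoted above; the URL is a placeholder.
my $html = <<'HTML';
<form action="/upload" method="post" enctype="multipart/form-data">
  <input type="file" name="soubor" value="$sou">
  <input type="submit">
</form>
HTML

for my $hit (prefilled_file_inputs($html, 'http://example.com/')) {
    print "file input '$hit->{name}' is prefilled with '$hit->{value}'\n";
}
```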
The HTML 4 spec says in section 17.4.1:
    file
        Creates a file select control. User agents may use the
        value of the value attribute as the initial file name.
Since user agents _may_ use this value but are not required to,
and since with WWW::Mechanize there is no interactive control of
the values used, I believe it is crucial that the default value is
not used for file inputs. Even the visual user agents (Mozilla,
etc.) clear the file inputs, since it would otherwise be too easy to
script an automatic upload with JS.
Please consider the following patch:
--- WWW/Mechanize.pm.orig 2004-10-01 18:06:39.000000000 +0200
+++ WWW/Mechanize.pm 2004-10-01 18:09:26.000000000 +0200
@@ -1387,6 +1387,15 @@
 sub _parse_html {
     my $self = shift;
     $self->{forms} = [ HTML::Form->parse($self->content, $self->base) ];
+    if (@{ $self->{forms} }) {
+        for my $form (@{ $self->{forms} }) {
+            for my $input ($form->inputs) {
+                if ($input->type eq 'file') {
+                    $input->value( undef );
+                }
+            }
+        }
+    }
     $self->{form} = $self->{forms}->[0];
     $self->_extract_links();
 }
which removes all file values from all forms, preventing accidental
upload of files.
Alternatively, HTML::Form could do this for us. But there may be
situations in which you want to see the value that came in the
HTML, so I cannot argue strongly for removing the value as early as
HTML::Form. It could, however, be done on line 128 of HTML::Form 1.44.
I'm Cc'ing libwww@perl.org to gather views -- maybe for file inputs
a separate method like default_value could be used to get the original
value, while value would return undef.
Or maybe there are situations when WWW::Mechanize could retain the
default value. Then the suggested patch could be wrapped in a check
for some option that would keep the file input values intact.
Nonetheless, the default behaviour should be to remove the values.
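Such an opt-out could look roughly like the sketch below. It is written as
a standalone function rather than as a change to WWW::Mechanize, and both
the function name sanitize_forms and the option name keep_file_defaults
are hypothetical; no such option exists in WWW::Mechanize as far as this
report is concerned.

```perl
use strict;
use warnings;
use HTML::Form;

# Hypothetical helper: clear the value of every file input unless the
# caller explicitly asks to keep the defaults that came with the HTML.
sub sanitize_forms {
    my ($forms, %opt) = @_;
    return $forms if $opt{keep_file_defaults};   # assumed option name
    for my $form (@$forms) {
        for my $input ($form->inputs) {
            $input->value(undef) if $input->type eq 'file';
        }
    }
    return $forms;
}

# Placeholder markup and URL, for illustration only.
my $html = <<'HTML';
<form action="/upload" method="post" enctype="multipart/form-data">
  <input type="file" name="soubor" value="$sou">
  <input type="text" name="note" value="hello">
</form>
HTML

my @forms = HTML::Form->parse($html, 'http://example.com/');
sanitize_forms(\@forms);                            # default: clear file values
# sanitize_forms(\@forms, keep_file_defaults => 1); # opt out, values stay intact
```

Clearing by default and requiring an explicit option to keep the values
matches the behaviour argued for above: the unsafe case has to be asked for.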
I'd appreciate your comments on this issue,
--
------------------------------------------------------------------------
Honza Pazdziora | adelton@fi.muni.cz | http://www.fi.muni.cz/~adelton/
.project: Perl, mod_perl, DBI, Oracle, large Web systems, XML/XSL, ...
Only self-confident people can be simple.