Subject: | Gratuitous laundering (untaining) data from the wild |
Data from the question section and from resource records derived
from DNS packets as received 'from the wild' should be kept tainted.
While it is probably acceptable to return untainted simple
fields like answer->type and counters from the header, it is
unacceptable to just blindly untaint fields like rdatastr
and txtdata, or qname from the question section. A remote
server or someone in between could be placing almost arbitrary
text in such DNS packets.
The Net::DNS 0.68 was still reasonable in this respect (leaving
resource records tainted, although it should not have untainted
the qname), but versions 0.69 .. 0.71 unwarrantedly also untaint
strings in TXT records (possibly also in SPF and HINFO, perhaps
elsewhere).
Here is a demonstration:
$ perl -T -MNet::DNS::Resolver -MScalar::Util -e '
sub pr{my($p,$m)=@_;
$p->can($m) && printf("%s %-4s %-8s: %s\n",
Scalar::Util::tainted($p->$m)?"tainted ":"LAUNDERED!",
$p->can("type") ? $p->type : "",
$m, $p->$m());
};
$p=Net::DNS::Resolver->new->send("2.0.0.127.bl.spamcop.net","ANY");
for ($p->question){pr($_,"qname")};
for ($p->answer)
{pr($_,"rdatastr"); pr($_,"txtdata"); pr($_,"address")}'
0.68:
LAUNDERED! ANY qname : 2.0.0.127.bl.spamcop.net
tainted rdatastr: 127.0.0.2
tainted rdatastr: "Blocked - see http://www.spamcop.net/
bl.shtml?127.0.0.2"
tainted txtdata : Blocked - see http://www.spamcop.net/bl.shtml?
127.0.0.2
0.69 .. 0.71:
LAUNDERED! ANY qname : 2.0.0.127.bl.spamcop.net
tainted A rdatastr: 127.0.0.2
tainted A address : 127.0.0.2
LAUNDERED! TXT rdatastr: "Blocked - see http://www.spamcop.net/
bl.shtml?127.0.0.2"
LAUNDERED! TXT txtdata : Blocked - see http://www.spamcop.net/bl.shtml?
127.0.0.2
The attached patch addresses the problem - by two approaches:
- by adding a global "use re 'taint';" to most if not all packages
while overriding its affect selectively only where untaining
is intentional;
- by passing taintedness flag from the argument of Encode::decode
to its result - the Encode::decode is guilty of laundering too,
but is not the only culprit.
Here is my text from an old SpamAssassin problem report
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5645
explaining the situation:
======
The implicit untainting mechanism offered by Perl in a form
of regexp matching has its good and its undesired sides.
The good side is that in some cases it does the right thing
when a programmer is careful to use it carefully.
The down side is that regexp matching is also used for other
purposes, and the untainting action is more often than not
unintentional, leading to data laundering as a side effect,
defeating the taint checking safety net.
The SA module Util already provides routines for explicit
untainting of variables, but so far the untaint_var() has
only rarely been used. On the other front, Perl provides
a pragma (already in 5.6.1, possibly earlier):
use re 'taint';
which allows turning off the implicit untainting.
Combining both offers us best of both worlds, allowing
a programmer to explicitly untaint data when required,
and avoid implicit untainting when just thinking of
program's functionality.
[...]
Letting tainted data propagate to a final consumer as much as
is practical, is a general policy to be followed here - is my
suggestion.
This may smoke out some additional Perl taint bugs regarding the
global variables $1, $2, ... which sometimes get mysteriously tainted.
Localizing these variables is a good practice anyway (avoiding horrible
Perl practice to let subroutines use global temporary variables).
I already added local($1,$2,...) where I stumbled across; it is
possible that some more will be needed or desired.
======
See also:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=3838#c11
Regarding the tainting of $1, $2, etc and propagating the
taintedness bug, this was addressed in
[perl #67962] spamassassin and tainted mode
which I can no longer find in the bug tracker,
but here is a relevant section:
Re: [perl #67962] spamassassin and tainted mode
From: Dave Mitchell <davem@iabyn.com>
To: Mark Martinec <Mark.Martinec@ijs.si>
CC: demerphq <demerphq@gmail.com>, perlbug-followup@perl.org
Date: Thu, 25 Mar 2010 11:10:10 +0000
Show quoted text
> Yves,
>
>
> Done. And I believe I have it distilled now to a small test case.
>
Now fixed by commit 447ee1343739cf8e34c4ff1ba9b30eae75c3f1ab
in branch davem/post-5.12, which should be merged back into blead
once 5.12 has been released, and thus appear in 5.13 onwards:
commit 447ee1343739cf8e34c4ff1ba9b30eae75c3f1ab
Author: David Mitchell <davem@iabyn.com>
AuthorDate: Thu Mar 25 10:56:35 2010 +0000
Commit: David Mitchell <davem@iabyn.com>
CommitDate: Thu Mar 25 10:56:35 2010 +0000
RT #67962: $1 treated as tainted in untainted match
Fix the issue in the following:
use re 'taint';
$tainted =~ /(...)/;
# $1 now correctly tainted
$untainted =~ s/(...)/$1/;
# $untainted now incorrectly tainted
The problem stems from when $1 is updated.
pp_substcont, which is called after the replacement expression has
been
evaluated, checks the returned expression for taintedness, and if
so,
taints the variable being substituted. For a substitution like
s/(...)/x$1/ this works fine: the expression "x".$1 causes $1's get
magic
to be called, which sets $1 based on the recent match, and is
marked as
not tainted. Thus the returned expression is untainted. In the
variant
s/(...)/$1/, the returned value on the stack is $1 itself, and its
get
magic hasn't been called yet. So it still has the tainted flag from
the
previous pattern.
The solution is to mg_get the returned expression before testing for
taintedness.
Affected files ...
M pp_ctl.c
M t/op/taint.t
> > > I'm running 5.10.1 on our mailers now. I suppose I could
> > > remove these localizations of $1,$2,etc and see what happens.
> > > Will let you know if I can reproduce it on 5.10.1.
> > Also it would be really nice to get to the bottom of this.
> >
> > I have looked at the regex code and i have looked at the $1 fetch
> > logic and i dont see how it possibly could ever be tainted.
> >
> > At the very least we should assert that it isnt.
>
> #!/usr/bin/perl -T
>
> use strict;
> use re 'taint';
> use Scalar::Util qw(tainted);
>
> my $mailbox = 'abc@example.com';
> $mailbox .= substr($ENV{PATH},0,0); # make it tainted
>
> # $1 and $2 become tainted
> my(@r) = $mailbox =~ /^(.*?)(\@.*)$/ ? ($1,$2) : ($mailbox,'');
> printf("%d %d\n", tainted($1), tainted($2));
>
> my($nm) = 'aaa-ccc'; # not tainted
> printf("%d\n", tainted($nm));
>
> $nm =~ s/^aaa-(.*)$/$1/; # $nm becomes tainted
> printf("%d\n", tainted($nm));
Subject: | Net-DNS-0.71.patch |
--- DNS.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS.pm 2012-12-27 18:52:43.000000000 +0100
@@ -113,4 +113,5 @@
use strict;
+use re 'taint';
use Carp;
use Net::DNS::RR;
@@ -226,4 +227,5 @@
return $typesbyname{$name} if defined $typesbyname{$name};
+ local $1;
confess "unknown type $name" unless $name =~ m/TYPE(\d+)/o;
@@ -273,4 +275,5 @@
return $classesbyname{$name} if defined $classesbyname{$name};
+ local $1;
confess "unknown class $name" unless $name =~ m/CLASS(\d+)/o;
@@ -451,4 +454,5 @@
my $presentation=shift; # Really wire...
+ local $1;
# Prepend these with a backslash
$presentation =~ s/(["$();@.\\])/\\$1/g;
@@ -490,4 +494,5 @@
my $wire="";
+ local($1,$2);
while ($presentation =~ /\G([^.\\]*)([.\\]?)/g){
$wire .= $1 if defined $1;
--- DNS/Domain.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Domain.pm 2012-12-27 18:52:43.000000000 +0100
@@ -38,4 +38,5 @@
use strict;
+use re 'taint';
use integer;
use Carp;
@@ -176,5 +177,7 @@
my $self = shift;
- return $self->{xname} ||= UTF8->decode( Net::LibIDN::idn_to_unicode( $name, 'utf-8' ) || $name );
+
+ # preserve taintedness, Encode::decode drops it!
+ return $self->{xname} ||= substr($name,0,0).UTF8->decode( Net::LibIDN::idn_to_unicode( $name, 'utf-8' ) || $name );
}
@@ -214,4 +217,5 @@
sub string {
my $name = &name;
+ local $1;
$name =~ s/^(['"\$;@])/\\$1/; # escape leading special char
return $name =~ /[$dot]$/o ? $name : $name . $dot; # append trailing dot
@@ -263,5 +267,6 @@
sub _decode_ascii {
- return ASCII->decode(shift) if ASCII;
+ # preserve taintedness, Encode::decode drops it!
+ return substr($_[0],0,0) . ASCII->decode($_[0]) if ASCII;
unless (ASCII) {
@@ -320,4 +325,5 @@
sub _escape { ## Insert escape sequences in string
my $s = shift;
+ local $1;
$s =~ s/([^\055\101-\132\141-\172\060-\071])/$esc{$1}/eg;
return $s;
@@ -340,4 +346,5 @@
sub _unescape { ## Remove escape sequences in string
my $s = shift;
+ local $1;
$s =~ s/\134([\060-\062][\060-\071]{2})/$unesc{$1}/eg; # numeric escape
$s =~ s/\134\066\066\066/\134\134/g; # reveal escaped escape
--- DNS/DomainName.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/DomainName.pm 2012-12-27 18:52:43.000000000 +0100
@@ -42,4 +42,5 @@
use strict;
+use re 'taint';
use integer;
use Carp;
--- DNS/Header.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Header.pm 2012-12-27 18:52:43.000000000 +0100
@@ -28,4 +28,5 @@
use strict;
+use re 'taint';
use integer;
use Carp;
--- DNS/Mailbox.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Mailbox.pm 2012-12-27 18:52:43.000000000 +0100
@@ -29,4 +29,5 @@
use strict;
+use re 'taint';
use Carp;
@@ -50,4 +51,5 @@
my $class = shift;
local $_ = shift;
+ local($1,$2);
confess 'undefined mail address' unless defined $_;
@@ -84,4 +86,5 @@
my @label = shift->label;
local $_ = shift(@label) || return '<>';
+ local $1;
s/\\\./\./g; # unescape dots
s/\\032/ /g; # unescape space
--- DNS/Nameserver.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Nameserver.pm 2012-12-27 18:52:43.000000000 +0100
@@ -34,4 +34,5 @@
use strict;
+use re 'taint';
use integer;
use Carp qw(cluck);
--- DNS/Packet.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Packet.pm 2012-12-27 18:52:43.000000000 +0100
@@ -32,4 +32,5 @@
use strict;
+use re 'taint';
use integer;
use Carp;
--- DNS/Parameters.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Parameters.pm 2012-12-27 18:52:43.000000000 +0100
@@ -22,4 +22,5 @@
use strict;
+use re 'taint';
use integer;
use Carp;
--- DNS/Question.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Question.pm 2012-12-27 18:52:43.000000000 +0100
@@ -27,4 +27,5 @@
use strict;
+use re 'taint';
use integer;
use Carp;
@@ -282,4 +283,5 @@
# arg looks like IPv4 address: map to in-addr.arpa space
+ local($1,$2,$3,$4,$5);
if (m#(^|:.*:)((^|\d+\.)+\d+)(/(\d+))?$#) {
return undef if new Net::DNS::DomainName('@')->label;
--- DNS/RR/AAAA.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/RR/AAAA.pm 2012-12-27 18:52:43.000000000 +0100
@@ -17,4 +17,5 @@
use strict;
+use re 'taint';
use integer;
@@ -76,4 +77,5 @@
sub address_short {
+ local $1;
for ( sprintf ':%x:%x:%x:%x:%x:%x:%x:%x:', unpack 'n8', shift->{address} ) {
s/(:0[:0]+:)(?!.+:0\1)/::/; # squash longest zero sequence
--- DNS/RR/APL.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/RR/APL.pm 2012-12-27 18:52:43.000000000 +0100
@@ -17,4 +17,5 @@
use strict;
+use re 'taint';
use integer;
@@ -74,4 +75,5 @@
my $self = shift;
+ local($1,$2);
while (@_) { # parse apitem strings
last unless $_[0] =~ m|^(!?)(\d+):(.+)/(\d+)$|;
--- DNS/RR/SSHFP.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/RR/SSHFP.pm 2012-12-27 18:52:43.000000000 +0100
@@ -17,4 +17,5 @@
use strict;
+use re 'taint';
use integer;
@@ -45,4 +46,5 @@
my $babble = $self->babble;
my $fingerprint = $self->fp;
+ local $1;
$fingerprint =~ s/(\S{64})/$1\n/g;
$fingerprint = "(\n$fingerprint )" if length $fingerprint > 40;
--- DNS/RR/TLSA.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/RR/TLSA.pm 2012-12-27 18:52:43.000000000 +0100
@@ -16,4 +16,7 @@
+use strict;
+use re 'taint';
+
sub decode_rdata { ## decode rdata from wire-format octet string
@@ -43,4 +46,5 @@
my @params = map $self->$_, qw(usage selector matchingtype);
my $certificate = $self->cert;
+ local $1;
$certificate =~ s/(\S{64})/$1\n/g;
$certificate = "(\n$certificate )" if length $certificate > 40;
--- DNS/RR/TXT.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/RR/TXT.pm 2012-12-27 18:52:43.000000000 +0100
@@ -19,4 +19,5 @@
use strict;
+use re 'taint';
use integer;
--- DNS/RR.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/RR.pm 2012-12-27 18:52:43.000000000 +0100
@@ -36,4 +36,5 @@
use strict;
+use re 'taint';
use integer;
use Carp;
--- DNS/Resolver/Base.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Resolver/Base.pm 2012-12-27 18:52:43.000000000 +0100
@@ -5,4 +5,5 @@
use strict;
+use re 'taint';
BEGIN {
@@ -242,4 +243,5 @@
local $/ = "\n";
local $_;
+ local $1;
while (<FILE>) {
@@ -249,4 +251,8 @@
next unless m/\S/;
+ # allow automatic untainting for compatibility,
+ # trying not to break applications
+ no re 'taint';
+
SWITCH: {
/^\s*domain\s+(\S+)/ && do {
@@ -400,4 +406,5 @@
my @addr;
my @names = @{$names};
+ local $1;
my $oct2 = '(?:2[0-4]\d|25[0-5]|[0-1]?\d\d|\d)';
--- DNS/Resolver/MSWin32.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Resolver/MSWin32.pm 2012-12-27 18:52:43.000000000 +0100
@@ -16,4 +16,5 @@
use strict;
+use re 'taint';
use Win32::IPHelper;
use Win32::TieRegistry qw(KEY_READ REG_DWORD);
@@ -77,4 +78,5 @@
my @a;
my %h;
+ local $1;
foreach my $entry ( split( m/[\s,]+/, lc $searchlist ) ) {
push( @a, $entry ) unless $h{$entry}++;
--- DNS/Resolver/Recurse.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Resolver/Recurse.pm 2012-12-27 18:52:43.000000000 +0100
@@ -4,4 +4,5 @@
#
use strict;
+use re 'taint';
use Net::DNS::Resolver;
@@ -290,4 +291,5 @@
} elsif (my @authority = $packet->authority) {
my %auth = ();
+ local $1;
foreach my $rr (@authority) {
if ($rr->type =~ /^(NS|SOA)$/) {
--- DNS/Resolver/UNIX.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Resolver/UNIX.pm 2012-12-27 18:52:43.000000000 +0100
@@ -16,4 +16,5 @@
use strict;
+use re 'taint';
--- DNS/Resolver/cygwin.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Resolver/cygwin.pm 2012-12-27 18:52:43.000000000 +0100
@@ -16,4 +16,5 @@
use strict;
+use re 'taint';
@@ -119,4 +120,5 @@
my @a;
my %h;
+ local $1;
foreach my $entry ( split( m/[\s,]+/, $searchlist ) ) {
push( @a, $entry ) unless $h{$entry}++;
--- DNS/Resolver/os2.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Resolver/os2.pm 2012-12-27 18:52:43.000000000 +0100
@@ -16,4 +16,5 @@
use strict;
+use re 'taint';
--- DNS/Resolver.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Resolver.pm 2012-12-27 18:52:43.000000000 +0100
@@ -15,4 +15,5 @@
use strict;
+use re 'taint';
use vars qw(@ISA);
--- DNS/Text.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Text.pm 2012-12-27 18:52:43.000000000 +0100
@@ -37,4 +37,5 @@
use strict;
+use re 'taint';
use integer;
use Carp;
@@ -78,4 +79,5 @@
local $_ = &_encode_utf8;
+ local($1,$2);
s/^([\042\047])(.*)\1$/$2/; # strip paired quotes
@@ -201,7 +203,8 @@
sub _decode_utf8 {
- return UTF8->decode(shift) if UTF8;
+ # preserve taintedness, Encode::decode drops it!
+ return substr($_[0],0,0) . UTF8->decode($_[0]) if UTF8;
- return ASCII->decode(shift) if ASCII && not UTF8;
+ return substr($_[0],0,0) . ASCII->decode($_[0]) if ASCII && not UTF8;
unless (ASCII) {
--- DNS/Update.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/Update.pm 2012-12-27 18:52:43.000000000 +0100
@@ -33,4 +33,5 @@
use strict;
+use re 'taint';
use Net::DNS;
--- DNS/ZoneFile.pm 2012-12-15 12:15:26.000000000 +0100
+++ DNS/ZoneFile.pm 2012-12-27 18:52:43.000000000 +0100
@@ -45,4 +45,5 @@
use strict;
+use re 'taint';
use integer;
use Carp;