Skip Menu |

This queue is for tickets about the XML-LibXML CPAN distribution.

Report information
The Basics
Id: 118032
Status: open
Priority: 0/
Queue: XML-LibXML

People
Owner: Nobody in particular
Requestors: ppisar [...] redhat.com
Cc: CARNIL [...] cpan.org
gortan [...] cpan.org
AdminCc:

Bug Information
Severity: (no value)
Broken in: 2.0128
Fixed in: (no value)



Subject: External entities are parsed by default
$XML::LibXML::XML_LIBXML_PARSE_DEFAULTS is set to XML_PARSE_NODICT | XML_PARSE_DTDLOAD | XML_PARSE_NOENT. That overrides libxml2 default behavior that stopped processing external entities by default many years ago <https://git.gnome.org/browse/libxml2/commit/?id=4629ee02ac649c27f9c0cf98ba017c6b5526070f> because it could leak data from a local file system or a local network. This XML-LibXML issue was reported to Debian <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=838097> and to Fedora <https://bugzilla.redhat.com/show_bug.cgi?id=1377996>. Would it be possible to change XML-LibXML default behavior in similar way?
From: ppisar [...] redhat.com
Dne Čt 22.zář.2016 07:44:39, ppisar napsal(a): Show quoted text
> $XML::LibXML::XML_LIBXML_PARSE_DEFAULTS is set to XML_PARSE_NODICT | > XML_PARSE_DTDLOAD | XML_PARSE_NOENT. That overrides libxml2 default > behavior that stopped processing external entities by default
Attached patch disables expanding entities. Maybe XML_PARSE_DTDLOAD should be disables also. Or maybe one should define a default ext_ent_handler that always. That could preserve ability to expand internal entities.
Subject: 0001-Do-not-enable-expanding-entities-by-default.patch
From 05749ae525317d05bd9d4232c080e530854f1d88 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Petr=20P=C3=ADsa=C5=99?= <ppisar@redhat.com> Date: Fri, 30 Sep 2016 14:31:26 +0200 Subject: [PATCH] Do not enable expanding entities by default MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Expanding external entity is insecure. <https://www.owasp.org/index.php/XML_External_Entity_(XXE)_Processing>. This patch makes expand_entities option disabled by default. CPAN RT#118032 Signed-off-by: Petr Písař <ppisar@redhat.com> --- LibXML.pm | 2 +- docs/libxml.dbk | 2 +- t/43options.t | 4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/LibXML.pm b/LibXML.pm index eb3cbd6..9ab4748 100644 --- a/LibXML.pm +++ b/LibXML.pm @@ -261,7 +261,7 @@ use constant { HTML_PARSE_NOERROR => (1<<5), # suppress error reports }; -$XML_LIBXML_PARSE_DEFAULTS = ( XML_PARSE_NODICT | XML_PARSE_DTDLOAD | XML_PARSE_NOENT ); +$XML_LIBXML_PARSE_DEFAULTS = ( XML_PARSE_NODICT | XML_PARSE_DTDLOAD ); # this hash is made global so that applications can add names for new # libxml2 parser flags as temporary workaround diff --git a/docs/libxml.dbk b/docs/libxml.dbk index 30f279b..2c6674b 100644 --- a/docs/libxml.dbk +++ b/docs/libxml.dbk @@ -1676,7 +1676,7 @@ local $XML::LibXML::setTagCompression = 1;</programlisting> <term>expand_entities</term> <listitem> <para>/parser, reader/</para> - <para>substitute entities; possible values are 0 and 1; default is 1</para> + <para>substitute entities; possible values are 0 and 1; default is 0</para> <para>Note that although this flag disables entity substitution, it does not prevent the parser from loading external entities; when substitution of an external entity is disabled, the diff --git a/t/43options.t b/t/43options.t index 826f0ad..53dd35e 100644 --- a/t/43options.t +++ b/t/43options.t @@ -50,7 +50,7 @@ no_network { my $p = XML::LibXML->new(); for my $opt (@all) { - my $ret = (($opt =~ /^(?:load_ext_dtd|expand_entities)$/) ? 1 : 0); + my $ret = (($opt =~ /^(?:load_ext_dtd)$/) ? 1 : 0); # TEST*$all ok( ($p->get_option($opt)||0) == $ret @@ -110,7 +110,7 @@ no_network ok( $p->get_option('recover') == 2, ' TODO : Add test name' ); # TEST - ok( $p->expand_entities() == 1, ' TODO : Add test name' ); + ok( $p->expand_entities() == 0, ' TODO : Add test name' ); # TEST ok( $p->load_ext_dtd() == 1, ' TODO : Add test name' ); $p->load_ext_dtd(0); -- 2.7.4
On Thu Sep 22 07:44:39 2016, ppisar wrote: Show quoted text
> $XML::LibXML::XML_LIBXML_PARSE_DEFAULTS is set to XML_PARSE_NODICT | > XML_PARSE_DTDLOAD | XML_PARSE_NOENT. That overrides libxml2 default > behavior that stopped processing external entities by default many > years ago > <https://git.gnome.org/browse/libxml2/commit/?id=4629ee02ac649c27f9c0cf98ba017c6b5526070f> > because it could leak data from a local file system or a local > network. > > This XML-LibXML issue was reported to Debian > <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=838097> and to > Fedora <https://bugzilla.redhat.com/show_bug.cgi?id=1377996>. > > Would it be possible to change XML-LibXML default behavior in similar > way?
If the default behavior with external entities is being changed to prevent XXE attacks, please also change the default behavior to NOT load external DTDs. DTD loading by default is an SSRF attack.
The given patch as stated seems to be over-applied.

Not only does it inhibit expanding external entities, but it inhibits expanding native entity encodings such as &#38;

Subsequently, applying this patch breaks XML-Simple when it runs its tests using XML::LibXML::SAX due to expecting to decode "&#38;" as "&" , and instead getting it undecoded as "&#38;"

-- 
- CPAN kentnl@cpan.org
- Gentoo Perl Maintainer kentnl@gentoo.org ( perl@gentoo.org )
I have started a pull request on github which changes both the load_ext_dtd and expand_entities defaults: https://github.com/shlomif/perl-XML-LibXML/pull/39 Some doc fixes are still required. I added a test which confirms that the behaviour of the predefined entities (as mentioned by KENTNL) should not change, but I have not yet tested that XML::Simple builds.