
This queue is for tickets about the XML-SAX-ExpatXS CPAN distribution.

Report information
The Basics
Id: 63715
Status: resolved
Priority: 0/
Queue: XML-SAX-ExpatXS

People
Owner: PCIMPRICH [...] cpan.org
Requestors: lindell.dm [...] gmail.com
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: NoExpand = 0 still expanding entity references
Date: Wed, 8 Dec 2010 08:35:29 +1100
To: bug-XML-SAX-ExpatXS <bug-XML-SAX-ExpatXS [...] rt.cpan.org>
From: David Lindell <lindell.dm [...] gmail.com>
In expat_2.0.1
XML::SAX::ExpatXS version: 1.31
Perl version: 5.10.0
OS vendor and version: Linux UbuntuDesktop9 2.6.28-11-generic #42-Ubuntu SMP Fri Apr 17 01:58:03 UTC 2009 x86_64 GNU/Linux

*I am setting NoExpand = 0 and internal entity references are still resolving*.

For example
    &amp; to &
    &lt; to <
    &#37; to %
etc.

Code:

    my $handler = p2handler->new($element_path_qualifier, $log_output_path, $max_output_size);
    my $parser = XML::SAX::ExpatXS->new(
        NoExpand => 0,
        Handler  => $handler,
    );
    eval {
        print "Loading $sourcefile into handler...\n";
        $parser->parse_uri($sourcefile);
    };
    if ($@) {
        die "$@";
    }

Thanks,
David
I think this is correct behavior. NoExpand doesn't expand entities defined in the internal DTD subset, like &js; in the following example.

    <?xml version="1.0" standalone="yes" ?>
    <!DOCTYPE author [
    <!ELEMENT author (#PCDATA)>
    <!ENTITY js "Jo Smith">
    ]>
    <author>&js;</author>

Predefined entities (&lt; etc.) and numeric entities (&#39; etc.) are always expanded by expat.
See my previous comment.
Subject: Re: [rt.cpan.org #63715] NoExpand = 0 still expanding entity references
Date: Fri, 11 Mar 2011 09:52:21 +1100
To: bug-XML-SAX-ExpatXS [...] rt.cpan.org
From: David Lindell <aikon3390 [...] gmail.com>
Thanks for the response. However, if this is the case, is there a way to have the parser NOT expand predefined and numeric entities?

On Fri, Mar 11, 2011 at 2:43 AM, Petr Cimprich via RT <bug-XML-SAX-ExpatXS@rt.cpan.org> wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=63715 >
>
> I think this is a correct behavior. NoExpand doesn't expand entities
> defined in the internal DTD subset. Like &js; in the following example.
>
> <?xml version="1.0" standalone="yes" ?>
> <!DOCTYPE author [
> <!ELEMENT author (#PCDATA)>
> <!ENTITY js "Jo Smith">
> ]>
> <author>&js;</author>
>
> Predefined entities (&lt; etc.) and numeric entities (&#39; etc.) are
> always expanded by expat.
On Thu Mar 10 17:52:30 2011, aikon3390@gmail.com wrote:
> Thanks for the response. However, if this is the case, is there a way to
> have the parser NOT expand predefined and numeric entities?
I don't think so. XML character data is a stream of Unicode characters (code points). Numeric references to characters must be resolved by each XML processor (parser). There is no difference in XML processing between the character itself and the numeric reference to the same character. You should not need to keep this input variation.
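If the distinction matters only when the data is written back out, one possible workaround is to re-escape character data in the handler itself. The following is a minimal sketch under that assumption. `escape_text` is a hypothetical helper, not part of XML::SAX::ExpatXS. It restores the markup-significant characters as predefined entities, but it cannot recover which form (literal character, predefined entity, or numeric reference) the source document originally used.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::SAX::ExpatXS: re-escape character
# data that the parser has already expanded, so it can be written out as XML.
my %ESCAPE = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;');

sub escape_text {
    my ($text) = @_;
    $text =~ s/([&<>])/$ESCAPE{$1}/g;   # replace each special character
    return $text;
}

# In a SAX handler this would typically be called from characters(),
# where the already-expanded text arrives in $data->{Data}.
print escape_text('AT&T < 100%'), "\n";   # prints: AT&amp;T &lt; 100%
```

Characters such as % that have no predefined entity pass through unchanged, which matches the point above that the parser sees no difference between a character and a numeric reference to it.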
I consider this ticket resolved.

On Fri Mar 11 09:12:01 2011, PCIMPRICH wrote:
> On Thu Mar 10 17:52:30 2011, aikon3390@gmail.com wrote:
> > Thanks for the response. However, if this is the case, is there a way to
> > have the parser NOT expand predefined and numeric entities?
>
> I don't think so. XML character data is a stream of Unicode characters
> (code points). Numeric references to characters must be resolved by each
> XML processor (parser). There is no difference in XML processing between
> the character itself and the numeric reference to the same character.
> You should not need to keep this input variation.