
This queue is for tickets about the XML-RSS CPAN distribution.

Report information
The Basics
Id: 2285
Status: resolved
Priority: 0/
Queue: XML-RSS

People
Owner: KELLAN [...] cpan.org
Requestors: apearson [...] operamail.com
stephen.hall [...] mweb.co.za
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: (no value)
Fixed in: (no value)



Subject: doesn't escape emdashes
if an item contains an emdash character (\227), XML::RSS lets it through without escaping it. this results in non well-formed XML. since XML::RSS properly escapes other things like '<' and '&', i'm assuming that always producing well-formed output is one of the goals.
Would you happen to have a sample RSS feed that shows the problem available?
Thanks
kellan
[guest - Tue Mar 25 17:17:19 2003]:
> if an item contains an emdash character (\227), XML::RSS lets it
> through without escaping it. this results in non well-formed XML.
> since XML::RSS properly escapes other things like '<' and '&', i'm
> assuming that always producing well-formed output is one of the
> goals.
From: Gregory Williams
[guest - Tue Mar 25 17:17:19 2003]:
> if an item contains an emdash character (\227), XML::RSS lets it
> through without escaping it. this results in non well-formed XML.
> since XML::RSS properly escapes other things like '<' and '&', i'm
> assuming that always producing well-formed output is one of the
> goals.
I'm not sure if this affects the issue, but (AFAIK) \227 is only an emdash in a Microsoft codepage. The emdash in Unicode is U+2014: &#8212;. The emdash doesn't exist in ISO-8859-1.
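To illustrate the point about codepages, here is a minimal sketch in Python (not the Perl used by XML::RSS, and not part of the module). It assumes the "Microsoft codepage" in question is Windows-1252, and the helper name to_char_ref is made up for this example. It shows that octal \227 (hex 0x97) decodes to U+2014 only under Windows-1252, and that the lone byte is not valid UTF-8, which is presumably why a feed treated as UTF-8 stops being well-formed.

```python
# Sketch only: the same byte means different things in different encodings.
raw = b"\x97"  # octal \227 == hex 0x97 == decimal 151

as_cp1252 = raw.decode("cp1252")   # Windows-1252 (assumed codepage): EM DASH, U+2014
as_latin1 = raw.decode("latin-1")  # ISO-8859-1: a C1 control character, U+0097


def to_char_ref(ch: str) -> str:
    """Hypothetical helper: XML decimal character reference for one character."""
    return f"&#{ord(ch)};"


# A lone 0x97 byte is a stray continuation byte, so it is not valid UTF-8.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

print(hex(ord(as_cp1252)), to_char_ref(as_cp1252))
print(hex(ord(as_latin1)), valid_utf8)
```

Under these assumptions, a generator would need to know the input encoding before it could turn the byte into the character reference mentioned above.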
Subject: How to duplicate this bug.
From: perl [...] crystalflame.net
[guest - Tue Mar 25 17:17:19 2003]:
> if an item contains an emdash character (\227), XML::RSS lets it
> through without escaping it. this results in non well-formed XML.
> since XML::RSS properly escapes other things like '<' and '&', i'm
> assuming that always producing well-formed output is one of the
> goals.
Create a basic RSS feed with a dc:subject of "This & that", plain ASCII. Output the RSS feed using $rss->as_string. The resulting RSS feed will fail to validate as proper XML, as the '&' in the subject has not been encoded to the ASCII string '&amp;' in the resultant RSS output. This is contrary to the documentation, which states that as_string will encode special characters. This bug is responsible for virtually all of the reported problems I've seen with XML::RSS, as well as virtually all of the resentment. May I ask why it's been stalled?
On Mon Dec 08 09:59:59 2003, RSOD wrote:
> This is contrary to the documentation, which states that as_string
> will encode special characters.
>
> This bug is responsible for virtually all of the reported problems
> I've seen with XML::RSS, as well as virtually all of the resentment.
> May I ask why it's been stalled?
Is this ever going to be fixed, or should I just s/&/&amp;/g?
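For anyone weighing the s/&/&amp;/g workaround, here is a rough Python sketch of what a blanket escape does. It uses the standard library's xml.sax.saxutils.escape as a stand-in; it is not XML::RSS code. It suggests the workaround is fine for raw text such as "This & that" but will double-escape anything that has already been encoded.

```python
from xml.sax.saxutils import escape

# Raw text, as in the dc:subject example: escaping gives well-formed output.
subject = "This & that"
escaped_once = escape(subject)  # escapes &, < and >

# Text that was already escaped upstream gets escaped a second time.
already_escaped = "This &amp; that"
escaped_twice = escape(already_escaped)

print(escaped_once)
print(escaped_twice)
```

So a blanket substitution is only safe when the caller knows the input has not been escaped yet.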
Subject: Encoding of special characters in 'source' field of RSS 2.0 items
Special characters (e.g. '&') in the 'source' field of items are not being encoded when generating RSS 2.0, although they are being correctly encoded in the 'sourceUrl' field. It appears that '$self->encode($item->{source})' is missing in line 1234 of RSS.pm ver 1.05
The source encoding is fixed in r7957 and will be released as 1.11. (The emdash problem should be fixed as part of the "use HTML::Entities" change, also in 1.11.)