
This queue is for tickets about the Text-WikiCreole CPAN distribution.

Report information
The Basics
Id: 67769
Status: open
Priority: 0/
Queue: Text-WikiCreole

People
Owner: Nobody in particular
Requestors: peter_retep [...] gmx.de
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: missing last character with links in lists, when URL ends with slash.
Date: Tue, 26 Apr 2011 20:22:01 +0200
To: bug-Text-WikiCreole [...] rt.cpan.org
From: Peter Retep <peter_retep [...] gmx.de>
Hello,

I have a problem with Text::WikiCreole on Ubuntu 8.04 (using the Ubuntu repository) when I create HTML containing URLs ending with slashes. In the href attribute the last character is missing, but the link text is displayed correctly in the browser:

    $html=Text::WikiCreole::creole_parse(qq{
    == Information
    * http://domain.org/14/
    });

results in

    <ul>
    <li><a href="http://domain.org/14">http://domain.org/14/</a></li>
    </ul>

I also tried

    $html=Text::WikiCreole::creole_parse(qq{
    == Information
    * [[http://domain.org/14/ | http://domain.org/14/]]
    });

but got another erroneous result. Maybe this is a general issue with slashes at the end of links?

BR,
Peter
Subject: PATCH - missing last character with links in lists, when URL ends with slash.
On 2011-04-26 14:22:11, peter_retep@gmx.de wrote:
> Maybe this is a general issue with slashes at the end of links?
It seems to be. The problem is that the parser uses the POSIX character class [:punct:], which includes the '/' character. Expanding the class out to its individual characters and deleting '/' fixes the problem. There may be strange side effects in other locales, because the POSIX class may change whilst the explicit list doesn't, but I don't know the rules for URLs well enough.

    ilink => {
        curpat => '(?=(?:https?|ftp):\/\/)',
        # stops => '(?=[[:punct:]]?(?:\s|$))',
        # 2011-09-14 djh - / is not a terminator
        # (inside q//, \\\\ yields an escaped backslash and \] an escaped ])
        stops => q/(?=[-!"#$%&'()*+,.:;<=>?@[\\\\\]^_`{|}~]?(?:\s|$))/,
        hint => ['h', 'f'],
        filter => sub {

Cheers,
Dave
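To see the effect of the two lookaheads in isolation, here is a minimal sketch. It does not call Text::WikiCreole; link_end() is a hypothetical helper that only imitates "consume the link up to the first point where stops matches", and the explicit class is derived from the ASCII range rather than typed out by hand, so treat it as an illustration under those assumptions rather than the module's actual behaviour.

```perl
use strict;
use warnings;

# Original lookahead: one trailing POSIX punctuation character may be
# left out of the link, and that class includes '/'.
my $old_stops = qr/(?=[[:punct:]]?(?:\s|$))/;

# Explicit class: every ASCII punctuation character except '/'.
my $class = join '',
    map  { quotemeta($_) }
    grep { /[[:punct:]]/ && $_ ne '/' }
    map  { chr($_) } 33 .. 126;
my $new_stops = qr/(?=[$class]?(?:\s|$))/;

# Hypothetical helper (not part of the module): return the shortest
# prefix of $text that is followed by a position where $stops matches.
sub link_end {
    my ($text, $stops) = @_;
    return $text =~ /^(.*?)$stops/s ? $1 : $text;
}

my $line = "http://domain.org/14/\n";
print link_end($line, $old_stops), "\n";   # http://domain.org/14
print link_end($line, $new_stops), "\n";   # http://domain.org/14/
```

With the explicit class, other trailing punctuation is still left out of the link: link_end("http://domain.org/page. Next", $new_stops) gives http://domain.org/page.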
There's another issue buried here, though as far as I can tell it's just ugly, not dangerous. Consider the markup:

    [[http://pcx36/14/ | http://pcx36/15/]]

The generated HTML looks like this:

    <a href="http://pcx36/14/"><a href="http://pcx36/15/">http://pcx36/15/</a></a>

That is, the left-hand URL is turned into an <a> element because of the 'link' parsing rule, as we'd expect. But then the textual description, which happens to look like a URL, is turned into a child <a> element by the 'ilink' parsing rule, because the description is allowed to contain @all_inline. Firefox is happy with this and follows the child href. I don't know what other browsers do. I guess a cure could be to define an @all_inline_except_links array.

BTW, I've just noticed that the Creole 1.0 spec includes an explicit list of punctuation at the end of links: "Single punctuation characters (,.?!:;"')". I don't know whether that list is supposed to be definitive.
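A rough sketch of both ideas, for discussion only. The contents of @all_inline below are a made-up subset (the module's real list is longer), and $spec_stops is just the Creole 1.0 punctuation list dropped into the same lookahead shape; neither is taken from the module's source.

```perl
use strict;
use warnings;

# Hypothetical subset of inline rule names; the real @all_inline
# in the module contains more entries.
my @all_inline = qw(strong em br link ilink);

# Drop the rules that generate <a> elements, so a link description
# cannot produce a nested link.
my %makes_link = map { $_ => 1 } qw(link ilink);
my @all_inline_except_links = grep { !$makes_link{$_} } @all_inline;
print "@all_inline_except_links\n";   # strong em br

# Stop pattern restricted to the punctuation listed in Creole 1.0.
my $spec_stops = qr/(?=[,.?!:;"']?(?:\s|$))/;
my ($url) = "http://pcx36/14/, and more" =~ /^(\S+?)$spec_stops/;
print "$url\n";                       # http://pcx36/14/
```

The 'link' rule would then list @all_inline_except_links for its description instead of @all_inline; where exactly that goes depends on how the rule table is laid out.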