
This queue is for tickets about the HTML-WikiConverter CPAN distribution.

Report information
The Basics
Id: 126932
Status: new
Priority: 0/
Queue: HTML-WikiConverter

People
Owner: diberri [...] cpan.org
Requestors: alan [...] ufies.org
Cc:
AdminCc:

Bug Information
Severity: Important
Broken in: 0.68
Fixed in: (no value)



Subject: Using $base_uri not working
I don't know if this is even maintained anymore, but I've been fighting all day with the base_uri attribute not working. From the docs:

---
URI to use for converting relative URIs to absolute ones. This effectively ensures that the src and href attributes of image and anchor tags, respectively, are absolute before converting the HTML to wiki markup, which is necessary for wiki dialects that handle internal and external links separately. Relative URIs are only converted to absolute ones if the base_uri argument is present. Defaults to undef.
---

As I understand it, if you have a link

    <a href="foo.html">foo</a>

and base_uri is set, the output Markdown will include the full URL. So if base_uri is set to "http://google.com", the output Markdown would be:

    [foo](http://google.com/foo.html)

If base_uri isn't set, it would simply return the relative URL:

    [foo](foo.html)

This doesn't seem to work. I've dug through the code, and the conversion of the HTML is working internally: the HTML is converted from

    <a href="foo.html">foo</a>

to

    <a href="http://google.com/foo.html">foo</a>

within WikiConverter.pm. But when the rules are executed on an element, the conversion from that HTML to Markdown doesn't work (at least not as expected).

I seem to have it narrowed down to the _abs2rel() function called from _link() in WikiConverter::Markdown, which creates a new relative URL from the (properly converted) HTML if base_uri is set. BUT this seems to go against what the documentation for the attribute says: "Relative URIs are only converted to absolute ones if the base_uri argument is present. Defaults to undef." To me this means that if a URL is relative (foo.html), it's converted to absolute (google.com/foo.html) if base_uri is set (i.e. google.com).

To fix this (at least for my case) I simply return $uri from _abs2rel() without checking for base_uri (line 519), avoiding the creation of a relative link for it (line 520).
So it seems like the internal HTML cleanup is doing what it should and creating absolute URLs for all the relative ones, but the HTML-to-Markdown conversion is undoing that. Test code attached. If I'm misunderstanding what base_uri is doing, or if the documentation is wrong, please excuse me and let me know. If it is working as it should now, is there a way to do what I'm looking for, which is to create absolute URLs in the resulting Markdown from relative ones in the HTML input?
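The round trip described above (relative to absolute, then back to relative against the same base) can be reproduced with the URI module on its own. This is only a minimal sketch of the suspected mechanism: it assumes, without having confirmed it in the source, that _abs2rel() does something equivalent to URI's rel(). The variable names and the trailing slash on the base are choices made for the sketch, not taken from the module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI;

# Assumption for this sketch: a base with a trailing slash, so path
# resolution is unambiguous. The ticket uses "http://google.com".
my $base = 'http://google.com/';
my $href = 'foo.html';

# Step 1: what the HTML cleanup appears to do (relative -> absolute).
my $abs = URI->new_abs( $href, $base );

# Step 2: what _abs2rel() appears to do when base_uri is set
# (absolute -> relative against the same base), undoing step 1.
my $rel = $abs->rel($base);

print "after step 1: $abs\n";
print "after step 2: $rel\n";
```

If the second line printed is relative again, that matches the behaviour seen in the Markdown output: any link under base_uri ends up relative, whether it started out relative or absolute in the input HTML.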
Subject: html-wikiconverter-markdown-base_uri-test.pl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use HTML::WikiConverter;

my $base_uri = "http://arcterex.net";

my $html = <<'END';
<p>This is <a href="/test1.html">link to just a page</a>.
<p>This is <a href="http://arcterex.net/test1.html">link to a server + page</a>.</p>
<p>This is <a href="http://google.com/test1.html">link to a dfferent server</a>.</p>
END

print "Input:\n$html\n-----\n";

# Convert with base_uri set
print "With Base_URI set\n";
my $wc = HTML::WikiConverter->new(
    dialect    => 'Markdown',
    link_style => 'inline',
    base_uri   => $base_uri,
);
print $wc->html2wiki($html) . "\n";

# Convert without base_uri for comparison
print "\n----\nWithout BaseURI set\n";
my $wc2 = HTML::WikiConverter->new(
    dialect    => 'Markdown',
    link_style => 'inline',
);
print $wc2->html2wiki($html) . "\n";