
This queue is for tickets about the IO-HTML CPAN distribution.

Report information
The Basics
Id: 109527
Status: rejected
Priority: 0
Queue: IO-HTML

People
Owner: Nobody in particular
Requestors: michael.adamcik [...] pro-pos.at
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: bug in io::html which ends up in eating all ram when reading a line from the handle
Date: Mon, 23 Nov 2015 20:57:58 +0100
To: bug-IO-HTML [...] rt.cpan.org
From: Michael Adamcik <michael.adamcik [...] pro-pos.at>
hey there!

i found a bug in io::html when using "html_file()" which results in eating all ram when reading a line from the handle. i attach the html and a small script which reproduce this behaviour. i can't explain why it happens with exactly this file, and at exactly this line... you will see (hopefully)

i'm using
perl v5.20.2
libio-html-perl 1.001-1
on debian stable/testing

best regards,
michael

Message body is not shown because sender requested not to inline it.


This does appear to be a bug, but it's not in IO::HTML. I get the exact same results with this program:

#!/usr/bin/perl -w
use strict;

binmode(STDOUT, ":utf8");

open (my $fh, '<:encoding(ISO-2022-JP)', "bug.html");

while (my $l=<$fh>){
    print "L: " . $l . "\n";
}

The only way IO::HTML is involved is that it correctly detects that the file claims to use ISO-2022-JP and applies that encoding to the filehandle.

I suggest reporting this as a bug in Encode (or possibly the Perl core, but I'd try Encode first).

Thanks for using IO::HTML, though :-)

Subject: Re: [rt.cpan.org #109527] bug in io::html which ends up in eating all ram when reading a line from the handle
Date: Tue, 24 Nov 2015 02:32:17 +0100
To: bug-IO-HTML [...] rt.cpan.org
From: Michael Adamcik <michael.adamcik [...] pro-pos.at>
yes, i figured out the same... already reported it to perl core.

io::html is wonderful if you deal with many html sites ;)

On 2015-11-24 01:37, Christopher J. Madsen via RT wrote:
> <URL: https://rt.cpan.org/Ticket/Display.html?id=109527 >
>
> This does appear to be a bug, but it's not in IO::HTML. I get the exact same results with this program:
>
> #!/usr/bin/perl -w
> use strict;
>
> binmode(STDOUT, ":utf8");
>
> open (my $fh, '<:encoding(ISO-2022-JP)', "bug.html");
>
> while (my $l=<$fh>){
>     print "L: " . $l . "\n";
> }
>
> The only way IO::HTML is involved is that it correctly detects that the file claims to use ISO-2022-JP and applies that encoding to the filehandle.
>
> I suggest reporting this as a bug in Encode (or possibly the Perl core, but I'd try Encode first).
>
> Thanks for using IO::HTML, though :-)