Subject: Off-by-one error in strip_html()
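There's an off-by-one error in strip_html(): the loop over the stripped-tag list runs with i <= stripper->numstriptags, so it also compares the tag name against o_striptags[numstriptags], one slot past the last initialized entry. As a result, the attached test script offbyone.t may return different results depending on whether the data structures happen to be initialized with zeros, which may or may not be the case during a normal script run. Initialization with zeros can be forced on FreeBSD systems by running the test with MALLOC_OPTIONS=Z, or on any system by running it under valgrind. valgrind additionally reports these errors: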
==20936== Conditional jump or move depends on uninitialised value(s)
==20936== at 0x559A37E: tolower (ctype.c:47)
==20936== by 0x4C2C53F: strcasecmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==20936== by 0xEC5A585: strip_html (in /home/slaven.rezic/trash/HTML-Strip-1.06/blib/arch/auto/HTML/Strip/Strip.so)
==20936== by 0xEC59CE8: XS_HTML__Strip_strip_html (in /home/slaven.rezic/trash/HTML-Strip-1.06/blib/arch/auto/HTML/Strip/Strip.so)
==20936== by 0x49FAF4: Perl_pp_entersub (in /opt/perl-5.18.2/bin/perl)
==20936== by 0x498382: Perl_runops_standard (in /opt/perl-5.18.2/bin/perl)
==20936== by 0x439E97: perl_run (in /opt/perl-5.18.2/bin/perl)
==20936== by 0x41DA84: main (in /opt/perl-5.18.2/bin/perl)
==20936==
==20936== Use of uninitialised value of size 8
==20936== at 0x559A39C: tolower (ctype.c:47)
==20936== by 0x4C2C53F: strcasecmp (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==20936== by 0xEC5A585: strip_html (in /home/slaven.rezic/trash/HTML-Strip-1.06/blib/arch/auto/HTML/Strip/Strip.so)
==20936== by 0xEC59CE8: XS_HTML__Strip_strip_html (in /home/slaven.rezic/trash/HTML-Strip-1.06/blib/arch/auto/HTML/Strip/Strip.so)
==20936== by 0x49FAF4: Perl_pp_entersub (in /opt/perl-5.18.2/bin/perl)
==20936== by 0x498382: Perl_runops_standard (in /opt/perl-5.18.2/bin/perl)
==20936== by 0x439E97: perl_run (in /opt/perl-5.18.2/bin/perl)
==20936== by 0x41DA84: main (in /opt/perl-5.18.2/bin/perl)
The attached patch fixes this problem.
Regards,
Slaven
Subject: HTML-Strip-1.06-offbyone.patch
From f59077592fa7c8ba4d0f2ce184a55e0c04b60e7a Mon Sep 17 00:00:00 2001
From: Slaven Rezic <srezic@cpan.org>
Date: Fri, 11 Apr 2014 17:16:59 +0200
Subject: [PATCH] fix off-by-one error in strip_html()
---
strip_html.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/strip_html.c b/strip_html.c
index 8eaecfe..dcc80db 100644
--- a/strip_html.c
+++ b/strip_html.c
@@ -41,7 +41,7 @@ strip_html( Stripper * stripper, const char * raw, char * output ) {
/* if we're outside a stripped tag block, check tagname against stripped tag list */
} else if( !stripper->f_in_striptag && !stripper->f_closing ) {
int i;
- for( i = 0; i <= stripper->numstriptags; i++ ) {
+ for( i = 0; i < stripper->numstriptags; i++ ) {
if( strcasecmp( stripper->tagname, stripper->o_striptags[i] ) == 0 ) {
stripper->f_in_striptag = 1;
strcpy( stripper->striptag, stripper->tagname );
--
1.7.9.5
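
For illustration, here is a minimal standalone sketch of the corrected loop bound. It is not code from HTML::Strip: the TagList type, the MAX_TAGS and MAX_TAG_LEN sizes, and the is_strip_tag() helper are hypothetical stand-ins, and the array layout is only an assumption about how o_striptags is stored. Only the field names numstriptags and o_striptags and the strcasecmp() comparison are taken from the patch.

```c
#include <assert.h>
#include <strings.h>   /* strcasecmp (POSIX) */

#define MAX_TAGS    8   /* hypothetical capacity, not from HTML::Strip */
#define MAX_TAG_LEN 16  /* hypothetical name length, not from HTML::Strip */

/* Hypothetical stand-in for the Stripper struct: only the two fields
 * touched by the patched loop are modeled here. */
typedef struct {
    char o_striptags[MAX_TAGS][MAX_TAG_LEN];
    int  numstriptags;  /* number of initialized entries */
} TagList;

/* Returns 1 if tagname matches one of the first numstriptags entries
 * (case-insensitively), 0 otherwise. */
int is_strip_tag(const TagList *list, const char *tagname)
{
    int i;

    assert(list->numstriptags >= 0 && list->numstriptags <= MAX_TAGS);

    /* Valid indices are 0 .. numstriptags - 1. Using "<=" here would
     * also read o_striptags[numstriptags], which was never filled in. */
    for (i = 0; i < list->numstriptags; i++) {
        if (strcasecmp(tagname, list->o_striptags[i]) == 0) {
            return 1;
        }
    }
    return 0;
}
```

With a list whose count is 2 and whose third slot happens to hold leftover bytes, is_strip_tag() consults only the first two entries, so the result no longer depends on what the unused slot contains.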
Subject: offbyone.t
use strict;
use Test::More 'no_plan';
use HTML::Strip;
my $text = "<li>abc < 0.5 km</li><li>xyz</li>";
my $htmlStripper = HTML::Strip->new;
my $change_text = $htmlStripper->parse($text);
$htmlStripper->eof();
is $change_text, 'abc xyz';