
This queue is for tickets about the File-Slurp CPAN distribution.

Report information
The Basics
Id: 5638
Status: resolved
Priority: 0/
Queue: File-Slurp

People
Owner: uri [...] sysarch.com
Requestors: peterm [...] zeta.org.au
Cc:
AdminCc:

Bug Information
Severity: Normal
Broken in: (no value)
Fixed in: (no value)

Attachments


Subject: Unexpected newline handling in windows
This is perl, v5.8.3 built for MSWin32-x86-multi-thread
(with 8 registered patches, see perl -V for more detail)

Copyright 1987-2003, Larry Wall

Binary build 809 provided by ActiveState Corp.
http://www.ActiveState.com
ActiveState is a division of Sophos.
Built Feb 3 2004 00:28:51

C:\>pmvers File::Slurp
9999.04

Bug ?? Hmmm...but unpleasant effects if one assumes slurping behaviour is similar to earlier File::Slurp or matches other slurp behaviour (e.g. Perl6::Slurp).

Newline handling is different in Windows.

Regexes of a form like $text =~ m/^(.*?)\n/g on a scalar read_file get unexpected results, and this breaks quite a few things based on line end matching.

Nasty. I've had to revert.
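To illustrate the kind of breakage described above, here is a minimal sketch in core Perl. It does not use File::Slurp at all: the two strings are made-up sample data standing in for what a slurp might return with LF and with CR/LF line ends. The /m flag is an assumption added here so that ^ anchors at every line; the regex in the report does not show it. The helper name first_fields is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample data: the same two lines with LF and with CR/LF endings.
my $lf_text   = "first\nsecond\n";
my $crlf_text = "first\r\nsecond\r\n";

# Return the text captured before each "\n".
# /m is an assumption (not in the report) so that ^ matches at every line start.
sub first_fields {
    my ($text) = @_;
    my @fields = $text =~ m/^(.*?)\n/mg;
    return @fields;
}

my @lf   = first_fields($lf_text);
my @crlf = first_fields($crlf_text);

# "." matches "\r", so with CR/LF data every capture keeps a trailing "\r"
# and is one character longer than the visible text.
printf "LF lengths:   %s\n", join ',', map { length } @lf;
printf "CRLF lengths: %s\n", join ',', map { length } @crlf;
```

If the in-memory data really does retain CR/LF, code that compares or prints the captured fields will see the stray "\r" even though the pattern itself still matches.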
From: peterm [...] zeta.org.au
[guest - Thu Mar 11 21:34:45 2004]: (original report quoted in full; see above)
Sorry: Further elaboration:

A Windows text file with CR/LF combinations, read into memory by the new File::Slurp's read_file, appears to have its line ends handled differently from the old File::Slurp's behaviour. In the new version, data in memory appears to retain CR/LF line ends, which for me then breaks regex operations that involve $ line-end anchors and/or "\n" characters.

The new File::Slurp's write_file does away with CR/LF endings (which is why I was confounded by the source of the problem for a while). Ptkdb display of data appears to provide evidence.

The following script assumes a "slurptest1.txt" file in Windows text format in the current directory, and for me reports a difference between two different read_file operations on the Windows text file. (Same file attached in zip file.)

#!perl
# scriptname here

use strict;
use Carp;
use File::Slurp;

my ($lines, $newlines, $oldlines, $hex_string, $nline, $oline, $line_text) = "";
my (@newlines, @oldlines, $datalines);

while(<DATA>) {
    $datalines .= $_;
}

sub old_read_file
{
    my ($file) = @_;

    local($/) = wantarray ? $/ : undef;
    local(*F);
    my $r;
    my (@r);

    open(F, "<$file") || croak "open $file: $!";
    @r = <F>;
    close(F) || croak "close $file: $!";

    return $r[0] unless wantarray;
    return @r;
}

write_file("slurpingtest2.txt", $datalines);
$newlines = read_file("slurpingtest2.txt");
$oldlines = old_read_file("slurpingtest2.txt");

print "Test 1:\n";
if($newlines eq $oldlines) {
    print "New and Old same\n";
} else {
    print "New and Old different\n";
}
if($newlines eq $datalines) {
    print "New and Data same\n";
} else {
    print "New and Data different\n";
}
if($oldlines eq $datalines) {
    print "Old and Data same\n";
} else {
    print "Old and Data different\n";
}

print "Test 2:\n";
$newlines = read_file("slurptest1.txt");
$oldlines = old_read_file("slurptest1.txt");
if($newlines eq $oldlines) {
    print "New and Old same\n";
} else {
    print "New and Old different\n";
}
if($newlines eq $datalines) {
    print "New and Data same\n";
} else {
    print "New and Data different\n";
}

__END__
A test file with newline separation

It appears data in memory is different, but the difference disappears when written back to file. But my regex searches work in memory. :(
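One way to confirm what is actually in memory, and to work around it on the caller's side, might look like the sketch below. It uses only core Perl. Both helpers (hex_dump and normalize_newlines) are hypothetical names introduced here and are not part of File::Slurp, and the sample string merely stands in for the result of a read_file call on a Windows text file.

```perl
use strict;
use warnings;

# Hypothetical helper: space-separated hex bytes, useful for spotting
# "0d 0a" (CR/LF) versus a lone "0a" (LF) in slurped data.
sub hex_dump {
    my ($text) = @_;
    return join ' ', unpack '(H2)*', $text;
}

# Hypothetical helper, not part of File::Slurp: turn CR/LF pairs into LF.
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\r\n/\n/g;
    return $text;
}

# Stand-in for data slurped from a Windows text file (an assumption).
my $slurped = "A\r\ntest\r\n";

print hex_dump($slurped), "\n";    # 41 0d 0a 74 65 73 74 0d 0a
my $clean = normalize_newlines($slurped);
print hex_dump($clean), "\n";      # 41 0a 74 65 73 74 0a
```

Normalizing after the read keeps existing regexes that rely on "\n" and $ working, at the cost of an extra pass over the data. It is a caller-side workaround, not a statement about how File::Slurp itself should behave.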
Download slurptest1.zip
application/zip 256b


is this bug still around? if so, can you send a test for it that fails?

uri