Subject: | 'read' not returning data as soon as available ('read' with 'nowait' not behaving as expected) |
When testing with 'nowait' I was running into some unexpected behaviour:
to get the data that was appended to the file two calls to read are needed..
Looking further: even with 'nowait' it does not fully behave as one would
expect.. I would expect 'read' to return immediately when data is available
in the file. But that does not happen: even if data is available then it
still waits for 'interval' seconds before returning the data..
Example with nowait:
#!/usr/bin/perl
use File::Tail;
use File::Temp;
use IO::Handle;
my $fh = File::Temp->new();
my $fname = $fh->filename;
$fh->autoflush(1);
my $file = File::Tail->new(
name => $fname,
nowait => 1,
tail => -1,
interval => 1,
maxinterval => 1,
);
print $fh "foo\n";
my $r = $file->read();
print "Data read (1): $r\n";
my $r = $file->read();
print "Data read (2): $r\n";
sleep 1;
print $fh "bar\n";
my $r = $file->read();
print "Data read (3): $r\n";
my $r = $file->read();
print "Data read (4): $r\n";
Output:
Data read (1):
Data read (2): foo
Data read (3):
Data read (4): bar
Expected (/ with attached patch):
Data read (1): foo
Data read (2):
Data read (3): bar
Data read (3):
Example without nowait:
#!/usr/bin/perl
use File::Tail;
use File::Temp;
use IO::Handle;
use Time::HiRes qw# time #;
my $fh = File::Temp->new();
my $fname = $fh->filename;
$fh->autoflush(1);
my $file = File::Tail->new(
name => $fname,
tail => -1,
interval => 10,
maxinterval => 10,
);
print $fh "foo\n";
time_read();
print "Sleeping 10 seconds\n";
sleep 10; # this is needed to ensure interval is set
# to 10 seconds.. ->readin adjusts the interval
# when a newline is read
print $fh "bar\n";
time_read();
sub time_read {
print "Calling ->read()\n";
my $start = time;
my $data = $file->read();
my $end = time;
chomp $data;
print "Read: '$data', took: " . ($end - $start) . "s\n";
}
Output:
Calling ->read()
Read: 'foo', took: 0.264503955841064s
Sleeping 10 seconds
Calling ->read()
Read: 'bar', took: 10.0002119541168s
=> unexpected in this output is that it took 10s to read
"bar".. "bar" was appended to the test file *before*
'->read()' was called..
Expected (/ with attached patch):
Calling ->read()
Read: 'foo', took: 3.91006469726562e-05s
Sleeping 10 seconds
Calling ->read()
Read: 'bar', took: 6.60419464111328e-05s
What happens:
* 'read' calls 'predict': if return value of 'predict' is non-zero then read
will return empty string when nowait is set (or sleep when nowait isn't set)
* 'predict' called 'checkpending': check_pending' should return number of bytes
that are available to be read.
Problem: 'checkpending' always returns 0, which in turn causes 'predict' to
return a non-zero value..
'checkpending' contains:
if ($object->{"endpos"}-$object->{curpos}) {
sysseek($object->{handle},$object->{curpos},SEEK_SET);
readin($object,$object->{"endpos"}-$object->{curpos});
}
return ($object->{"endpos"}-$object->{curpos});
If there is data to be read then it does a syseek to curpos (good), and then
reads the data (bad).
Reading the data will cause 'curpos' to be updated.. ('readin' reads all data
==> 'curpos' is now the same as 'endpos')
Then it returns difference between 'endpos' and 'curpos' which will be 0...
'predict' contains:
if (my $len=$object->checkpending) {
readin($object,$len);
return 0;
}
=> if checkpending is modified to not call 'readin' then 'predict' will call it
and return 0 (which causes 'read' to immediately return the data).
But even the 'readin' in predict looks a bit odd..
Docs for 'predict':
Once select returns, when you want to determine which File::Tail objects
have input ready, you can either use the list of objects select returned,
or you can check each individual object with $object->predict. This returns
the ammount of time (in fractional seconds) after which the handle expects
input. If it returns 0, there is input waiting.
Based on that and based on how it's used in 'select' I would expect 'predict'
to *not* read data..
When call to 'readin' is also removed in 'predict' then it falls back on code in 'read' which will call 'readin'..
Attached patch removes these calls to 'readin'.
Note: this changes how 'read' with 'nowait' behaves: now first call to 'read' will return data...
It also alters behaviour of 'checkpending', 'pending' and 'select': these no longer read the file but only adjust 'endpos'..
(My particular use-case for using 'nowait' was using it when adding a test case for a File::Tail bug)
Subject: | 001-nowait.patch |
diff --git a/Tail.pm b/Tail.pm
index 3a7436e..3ad3590 100644
--- a/Tail.pm
+++ b/Tail.pm
@@ -434,7 +434,6 @@ sub checkpending {
if ($object->{"endpos"}-$object->{curpos}) {
sysseek($object->{handle},$object->{curpos},SEEK_SET);
- readin($object,$object->{"endpos"}-$object->{curpos});
}
return ($object->{"endpos"}-$object->{curpos});
}
@@ -447,7 +446,6 @@ sub predict {
my $ttw=$object->{"nextcheck"}-time();
return $ttw if $ttw>0;
if (my $len=$object->checkpending) {
- readin($object,$len);
return 0;
}
if ($object->{"sleepcount"}>$object->adjustafter) {
diff --git a/Tail.pm.debug b/Tail.pm.debug
index 7ebdddd..c79b4f3 100644
--- a/Tail.pm.debug
+++ b/Tail.pm.debug
@@ -450,7 +450,6 @@ sub checkpending {
if ($object->{"endpos"}-$object->{curpos}) {
sysseek($object->{handle},$object->{curpos},SEEK_SET);
- readin($object,$object->{"endpos"}-$object->{curpos});
}
$object->logit("Returning length is ".($object->{"endpos"}-$object->{curpos}));
return ($object->{"endpos"}-$object->{curpos});
@@ -468,7 +467,6 @@ sub predict {
$object->logit("- ttw is $ttw");
return $ttw if $ttw>0;
if (my $len=$object->checkpending) {
- readin($object,$len);
return 0;
}
if ($object->{"sleepcount"}>$object->adjustafter) {
diff --git a/t/40nowait.t b/t/40nowait.t
new file mode 100644
index 0000000..7133043
--- /dev/null
+++ b/t/40nowait.t
@@ -0,0 +1,35 @@
+#!/usr/bin/perl
+
+use strict;
+use warnings;
+use File::Tail;
+use File::Temp;
+use IO::Handle;
+use Test::More;
+
+my $fh = File::Temp->new();
+my $fname = $fh->filename;
+$fh->autoflush(1);
+
+my $max_interval = 1;
+my $file = File::Tail->new(
+ name => $fname,
+ nowait => 1,
+ tail => 0,
+ interval => $max_interval,
+ maxinterval => $max_interval,
+);
+
+print $fh "foo\n";
+is($file->read(), "foo\n", "read returns data as soon as available");
+is($file->read(), "", "read returns empty string when no data");
+
+sleep $max_interval;
+
+print $fh "bar\n";
+is($file->read(), "bar\n", "read returns data as soon as available");
+is($file->read(), "", "read returns empty string when no data");
+
+close $fh;
+
+done_testing();