Skip Menu |

This queue is for tickets about the File-Tail CPAN distribution.

Report information
The Basics
Id: 125814
Status: new
Priority: 0/
Queue: File-Tail

People
Owner: Nobody in particular
Requestors: bram-pcna [...] mail.wizbit.be
Cc:
AdminCc:

Bug Information
Severity: (no value)
Broken in: (no value)
Fixed in: (no value)



Subject: Race condition 'checkpending'/'reset_pointers'
There exist a race condition between 'checkpending' and 'reset_pointers'. 'checkpending' first determines the 'endpos' of the file (using sysseek); if it's determine that the file is not changed and if resetafter interval is passed then a call to 'reset_pointers' is done. What happens in 'reset_pointers' basically boils down to: - open file again (to newhandle) - get size of newhandle (using stat) - compare size returned by stat with size returned by syseek Race condition: data can be written to the file between call to syseek and call to stat.. This is turn causes 'reset_pointers' to reset and read the entire file again.. Note: there is another issue with reset_pointers which is described in #125812/#107522, code/test below assumes fix in that issue is applied. Easiest way to test this is by slightly altering the code to make time of race condition greater by adding a sleep. In 'checkpending' change: ... } elsif (($object->{curpos}==$object->{"endpos"}) && (time()-$object->{lastread})>$object->{'resetafter'}) { $object->reset_pointers; $object->{"endpos"}=sysseek($object->{handle},0,SEEK_END); } ... into: ... } elsif (($object->{curpos}==$object->{"endpos"}) && (time()-$object->{lastread})>$object->{'resetafter'}) { print STDERR "# Race condition time! Sleeping 5 seconds\n"; sleep 5; print STDERR "# Done sleeping, continuing\n"; $object->reset_pointers; $object->{"endpos"}=sysseek($object->{handle},0,SEEK_END); } ... Then run: $ echo foo > /tmp/foo $ perl -MFile::Tail -we 'my $f = File::Tail->new(name => "/tmp/foo", maxinterval => 1, tail => 0); print while ($_ = $f->read());' Wait until it prints "# Race condition time! Sleeping 5 seconds' and then in another shell: $ echo bar >> /tmp/foo Output: $ perl -MFile::Tail -we 'my $f = File::Tail->new(name => "/tmp/foo", maxinterval => 1, tail => 0); print while ($_ = $f->read());' # Race condition time! Sleeping 5 seconds # Done sleeping, continuing foo bar => shows 'foo' and 'bar' which is wrong.. only 'bar' was logged in the file after tail started.. Relevant part of 'reset_pointers' (with some additional comments) sub reset_pointers { my $object=shift @_; ... if (defined($oldhandle)) { # If file has not been changed since last OK read do not do anything $st=stat($newhandle); # lastread uses fractional time, stat doesn't. This can cause false # negatives. # If the file was changed the same second as it was last read, # we only reopen it if it's length has changed. The alternative is that # sometimes, files would be reopened needlessly, and with reset_tail # set to -1, we would see the whole file again. # Of course, if the file was removed the same second as when it was # last read, and replaced (within that second) with a file of equal # length, we're out of luck. I don't see how to fix this. # If file was written to between 'checkpending' and 'now' then: # - '$st->mtime' will be more > "$object->{'lastread'}" (reset_pointers is only called after 'resetafter' interval) # - '$st->size' will be larger then "$object->{'curpos'}" if ($st->mtime<=int($object->{'lastread'})) { if (($st->size==$object->{"curpos"}) && ($st->ino == $object->{"inode"})) { # Reached when file is still unchanged $object->{lastread} = $st->mtime; return; } else { # will continue further to reset # This code is only reached when: # - data is written to the file between 'checkpending' and 'now' # - *and* mtime of file is in the past (system clock changed to a past-time or mtime manually changed (using utime or `touch`) } } else { # Reached when: # - data is written to the file between 'checkpending' and 'now' # - mtime of file updated [without data being added; i.e. `touch`] # - file removed and new file added with same name } # reset file and read it from start ... What the code attempts to do is check if a file was removed and replaced with a new file. When 'stat' returns a sensible value for inode then this code can be replaced with: if ($st->ino == $object->{"inode"}) { # Note: should also check $st->dev.. return; } For platforms that do not return a value for ino this is a bit complexer: Attached code will: - call stat on $oldhandle and compare size with $object->{'endpos'}. If data was written (or file truncated) between 'checkpending' and 'reset_pointers' then this this will be different --> Return and let 'checkpending' handle it - compare stat of $oldhandle with stat of $newhandle: * if size is different then reset file * if mtime of newhandle is after mtime of oldhandle: also reset file Cases in which file is *not* reset: - size of oldhandle and newhandle is the same - mtime of oldhandle is the same as mtime of newhandle: this means it's the same file *OR* file was replaced by a file with identical size and identical mtime - mtime of oldhandle is after mtime of newhandle: this indicates mtime of file was updated without appending data (updated by utime, `touch`, ...) [this can be considered a separate bug fix]
Subject: 004-reset_pointers_race.patch
diff --git a/Tail.pm b/Tail.pm index dc7acc9..327464b 100644 --- a/Tail.pm +++ b/Tail.pm @@ -378,23 +378,40 @@ sub reset_pointers { if (defined($oldhandle)) { # If file has not been changed since last OK read do not do anything $st=stat($newhandle); - # lastread uses fractional time, stat doesn't. This can cause false - # negatives. - # If the file was changed the same second as it was last read, - # we only reopen it if it's length has changed. The alternative is that - # sometimes, files would be reopened needlessly, and with reset_tail - # set to -1, we would see the whole file again. - # Of course, if the file was removed the same second as when it was - # last read, and replaced (within that second) with a file of equal - # length, we're out of luck. I don't see how to fix this. - if ($st->mtime<=int($object->{'lastread'})) { - if (($st->size==$object->{"curpos"}) && ($st->ino == $object->{"inode"})) { - $object->{lastread} = $st->mtime; + + if ($st->ino) { + if ($st->ino == $object->{"inode"}) { + # inode the same --> file not changed --> no reset + # Note: should also check $st->dev.. return; - } else { - # will continue further to reset } } else { + # inode info not available.. check if file needs to be reopend + my $st_old = stat($oldhandle); + + if ($st_old->size != $object->{"curpos"}) { + # file was changed between calling of 'reset_pointers' and now + # -> do not reset file, checkpending will handle it + return; + } + + if ($st->size == $st_old->size) { + # filesize oldhandle and newhandle the same.. + if ($st->mtime == $st_old->mtime) { + # modified time the same.. same file *or* file + # replaced by a file of identical size with + # identical mtime.. --> Can't differentiate + # between these so assume it's the same file.. + return; + } + + if ($st_old->mtime > $st->mtime) { + # mtime of oldhandle after mtime of newhandle but + # no new data added in file + # -> only mtime was updated (i.e. utime/`touch`) + return; + } + } } $object->{handle}=$newhandle; $object->{inode} = $st->ino; diff --git a/Tail.pm.debug b/Tail.pm.debug index 5abdd53..ef9cee7 100644 --- a/Tail.pm.debug +++ b/Tail.pm.debug @@ -382,30 +382,46 @@ sub reset_pointers { if (defined($oldhandle)) { # If file has not been changed since last OK read do not do anything $st=stat($newhandle); - # lastread uses fractional time, stat doesn't. This can cause false - # negatives. - # If the file was changed the same second as it was last read, - # we only reopen it if it's length has changed. The alternative is that - # sometimes, files would be reopened needlessly, and with reset_tail - # set to -1, we would see the whole file again. - # Of course, if the file was removed the same second as when it was - # last read, and replaced (within that second) with a file of equal - # length, we're out of luck. I don't see how to fix this. - if ($st->mtime<=int($object->{'lastread'})) { - $object->logit(" File not modified since last read. Reset skipped."); - $object->logit("mtime:",$st->mtime," lastread: ",$object->{lastread}); - if (($st->size==$object->{"curpos"}) && ($st->ino == $object->{"inode"})) { - $object->{lastread} = $st->mtime; + + if ($st->ino) { + if ($st->ino == $object->{"inode"}) { + # inode the same --> file not changed --> no reset + # Note: should also check $st->dev.. + $object->logit("inode the same. Reset skipped."); return; - } else { - $object->logit("Mtime is old (mtime:".$st->mtime." lastread ".$object->{'lastread'}."), but size is not equal to position, therefore reset!"); - $object->logit("Size :".$st->size." position: ".$object->{'curpos'}); - # will continue further to reset } } else { - $object->logit("considered skipping reset, but did not. file mtime:",$st->mtime," lastread: ",$object->{lastread}," size=",$st->size," read position=",$object->{"curpos"}); + # inode info not available.. check if file needs to be reopend + my $st_old = stat($oldhandle); + + if ($st_old->size != $object->{"curpos"}) { + # file was changed between calling of 'reset_pointers' and now + # -> do not reset file, checkpending will handle it + $object->logit("File modified between checkpending and reset_pointers. Reset skipped."); + return; + } + + if ($st->size == $st_old->size) { + # filesize oldhandle and newhandle the same.. + if ($st->mtime == $st_old->mtime) { + # modified time the same.. same file *or* file + # replaced by a file of identical size with + # identical mtime.. --> Can't differentiate + # between these so assume it's the same file.. + $object->logit("File not modified. Reset skipped."); + return; + } + + if ($st_old->mtime > $st->mtime) { + # mtime of oldhandle after mtime of newhandle but + # no new data added in file + # -> only mtime was updated (i.e. utime/`touch`) + $object->logit("File not modified but mtime updated. Reset skipped."); + return; + } + } } - $object->logit("reseting file mtime:",$st->mtime," lastread: ",$object->{lastread}," size=",$st->size," read position=",$object->{"curpos"}); + $object->logit("reseting file"); $object->{handle}=$newhandle; $object->{inode} = $st->ino; $object->position;