Zero byte read from NFS #189

Closed
sv-gh opened this issue May 11, 2018 · 2 comments

sv-gh commented May 11, 2018

Logstash 6.2.4, logstash-input-file 4.0.5, CentOS 6.4, NFS v3.2.29.
Config: file input, ruby filter, file output.
Sample data: plain-text log files.
Steps to reproduce: set up an NFS server, have several programs write logs to an NFS-mounted directory, and run Logstash with a file input reading several of those NFS-mounted logs.

When reading input files from NFS,
FileWatch::WatchedFile.file_read(amount)
uses @file.sysread(amount) to read data.
This sometimes returns a buffer that is filled
with '\0' bytes from some position onward.

We could reliably reproduce the problem on NFS v3.
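
To make the symptom concrete, here is a minimal diagnostic sketch, assuming the bad reads look like valid data followed by a run of '\0' bytes. The helper names `first_nul_offset` and `nul_padded_tail?` are hypothetical and not part of filewatch.

```ruby
# Hypothetical diagnostic helpers, not part of filewatch.

# Byte offset of the first NUL in the buffer, or nil if there is none.
# String#b gives a binary copy so the index is a byte offset.
def first_nul_offset(buf)
  buf.b.index("\0")
end

# True when everything from the first NUL to the end of the buffer is NUL,
# i.e. the buffer looks like real data followed by zero padding.
def nul_padded_tail?(buf)
  offset = first_nul_offset(buf)
  return false if offset.nil?
  tail = buf.b.byteslice(offset..-1)
  tail.count("\0") == tail.bytesize
end
```

Logging the offset alongside the file path on each bad read would show whether the padding always begins where the previously flushed data ended.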

Suggested fix:

  • check for a zero byte in the result; if one is present, wait, seek back, and re-read.

Ruby code fix in lib/filewatch/watched_file.rb :

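    def file_read(amount)
      #debug "*** file_read #{path}"
      cc = 3        # read attempts left before giving up
      dt = 1.0/128  # initial delay in seconds, quadrupled after each retry
      loop do
        set_accessed_at
        buf = @file.sysread(amount)
        # No NUL byte in the buffer: the read is good, return it as is.
        zc = buf.index("\0")
        return buf if zc.nil?
        amount = buf.bytesize # re-read exactly the bytes just consumed
        # warn "*** ZERO-#{cc} byte (#{zc} of #{amount}) in #{path}"
        cc -= 1
        # Out of attempts: return the buffer as read, NUL bytes included.
        return buf if cc.zero?
        sleep(dt)
        dt *= 4
        # Rewind so the next sysread starts at the same offset.
        @file.sysseek(-amount, IO::SEEK_CUR)
      end
    end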

The fix is applied in the filewatch gem and tested against NFS v3; it runs fine under heavy NFS
load (log files actively updated from numerous client boxes).

sv-gh pushed a commit to sv-gh/logstash-input-file that referenced this issue May 11, 2018
@guyboertje
Contributor

Using the file input on remote filesystems is not advised. See these links:
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html#_reading_from_remote_network_volumes
#38
#45
#163

There has been a big change in the file input recently. We copied the filewatch library code into the plugin repo and refactored it a lot.

The plugin is single threaded and that thread must alternate between discovering files and reading them.

There is also the legitimate inclusion of NUL characters to consider. A new setting would need to be introduced. I will consider raising an exception in the file_read method so the read loop exits without advancing the bytes read pointer. This would have the effect of spinning between all the files and discovery until files can be read without NUL bytes.
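
The exception idea above could look roughly like the following. This is only a sketch under stated assumptions, not the plugin's code: `NulBytesInRead` and `read_chunk` are hypothetical names, a plain `File` stands in for the watched file, and a local temp file imitates a partially flushed NFS file. The sketch rewinds explicitly so the file position does not advance when a chunk is rejected.

```ruby
require "tempfile"

# Hypothetical error class; not part of logstash-input-file.
class NulBytesInRead < StandardError; end

# Hypothetical helper: read up to `amount` bytes, but reject the chunk
# (and restore the file position) if it contains a NUL byte.
def read_chunk(file, amount)
  buf = file.sysread(amount)
  if buf.include?("\0")
    file.sysseek(-buf.bytesize, IO::SEEK_CUR) # undo the read
    raise NulBytesInRead, "NUL byte in chunk of #{buf.bytesize} bytes"
  end
  buf
end

# Demo: valid data followed by NUL padding, as in the report.
demo = Tempfile.new("nul-demo")
demo.binmode
demo.write("line one\n" + "\0" * 4)
demo.flush

reader = File.open(demo.path, "rb")
begin
  read_chunk(reader, 64)
  rejected = false
rescue NulBytesInRead
  rejected = true # a caller would move on to the next file or to discovery
end
position_after = reader.sysseek(0, IO::SEEK_CUR) # current offset
```

In this sketch the rejected read leaves the offset where it was, so a later pass over the same file retries the same bytes.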

Author

sv-gh commented May 14, 2018

"It is not advised to use" I translate as "we don't know how to make it work, don't even try", but a lot of people do it anyway, not because they are masochists but because of business needs.

"Big change and refactoring" arguments are irrelevant to this discussion, sorry; please let me ignore them. You've done a great job and made a lot of changes, I admit,
but they don't address or improve any of the issues reported in the 'Read from NFS' cases.

"legitimate inclusion of NUL characters": IMHO, a NUL character in a text log is quite a special case.
In contrast, "reading logs from NFS" IS a very valid use case; multiple issues have been raised about it before,
and it should have been solved a long time ago.

If you don't agree with the suggested solution in general,
could you please introduce a special option for
"Re-reading buffer with NUL char"
to provide at least partial support for poor Logstash users who must read logs from NFS v3?

"I will consider raising an exception": may I suggest instead to SEEK back and return an empty string?
There is no need for a global control transfer when handling a well-detected situation.
Can we keep the code a little cleaner and more functional?
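
A minimal sketch of the "seek back and return an empty string" alternative, for comparison. `read_or_empty` is a hypothetical helper, a plain `File` stands in for the watched file, and a local temp file that is later overwritten imitates NFS data arriving late.

```ruby
require "tempfile"

# Hypothetical helper: return the chunk if it is clean; otherwise rewind
# to where the read started and return "" so the caller can retry later.
def read_or_empty(file, amount)
  buf = file.sysread(amount)
  return buf unless buf.include?("\0")
  file.sysseek(-buf.bytesize, IO::SEEK_CUR)
  ""
end

# Demo: the file first shows NUL padding where data is still missing.
demo = Tempfile.new("nul-demo")
demo.binmode
demo.write("ok\n" + "\0" * 3)
demo.flush

reader = File.open(demo.path, "rb")
first = read_or_empty(reader, 64) # padded chunk: rewound, "" returned

# Simulate the real bytes arriving by overwriting the padding.
File.open(demo.path, "r+b") { |f| f.seek(3); f.write("end") }
second = read_or_empty(reader, 64) # same offset, now clean
```

Because `IO#sysread` raises `EOFError` at end of file rather than returning an empty string, `""` is unambiguous as a "try again later" signal in this sketch.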
