Related tasks: #1372 (comment)
This PR uses the TensorFlow Filesystem API to access HDFS instead of relying on `libhdfspp`, which is not included in the current compilation setup.

By the way, `libhdfspp` is not another wrapper around the C `libhdfs`. It is an implementation based on the RPC protocol, which is quite complex, and some of the code does not seem well maintained.

IMHO, we can rely on TensorFlow's modular Filesystem HDFS API, which is based on `libhdfs` and quite stable.

`libtensorflow_io_plugins.so` is loaded when `import tensorflow_io` is executed in Python. So the following C++ code returns a successful `RandomAccessFile`. In this way, we can support reading ORC from HDFS.
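
As a rough illustration of why the plugin has to be loaded before an `hdfs://` path can be opened, here is a toy model of scheme-based filesystem dispatch. It is not TensorFlow code and not the snippet from this PR: `ToyRandomAccessFile`, `ToyFileSystemRegistry`, and `LoadToyHdfsPlugin` are hypothetical names made up for this sketch, and the assumption is only that a filesystem is looked up by the URI scheme and that a plugin registers itself for that scheme when it is loaded.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Toy stand-in for a read-only file handle.
// Hypothetical type; this is NOT tensorflow::RandomAccessFile.
struct ToyRandomAccessFile {
  std::string uri;
};

// Hypothetical registry that maps a URI scheme ("hdfs", "file", ...)
// to a factory that opens files for that scheme.
class ToyFileSystemRegistry {
 public:
  using Factory =
      std::function<std::unique_ptr<ToyRandomAccessFile>(const std::string&)>;

  // What a plugin would do when it is loaded: claim a scheme.
  void Register(const std::string& scheme, Factory factory) {
    factories_[scheme] = std::move(factory);
  }

  // Returns nullptr when the URI has no scheme or when no filesystem
  // has been registered for that scheme.
  std::unique_ptr<ToyRandomAccessFile> NewRandomAccessFile(
      const std::string& uri) const {
    const std::string::size_type pos = uri.find("://");
    if (pos == std::string::npos) return nullptr;
    const auto it = factories_.find(uri.substr(0, pos));
    if (it == factories_.end()) return nullptr;
    return it->second(uri);
  }

 private:
  std::map<std::string, Factory> factories_;
};

// Simulates loading an HDFS plugin by registering a factory for "hdfs".
inline void LoadToyHdfsPlugin(ToyFileSystemRegistry& registry) {
  registry.Register("hdfs", [](const std::string& uri) {
    return std::make_unique<ToyRandomAccessFile>(ToyRandomAccessFile{uri});
  });
}
```

In this sketch, calling `NewRandomAccessFile` with an `hdfs://` URI on a fresh registry yields `nullptr`, and the same call succeeds once `LoadToyHdfsPlugin` has run. That mirrors the ordering described above, where the plugin library is loaded first and the file is opened afterwards.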