Bring back GCS ops. #1229

Open
wants to merge 85 commits into base: master

Commits on Dec 15, 2020

  1. 59ddbc4
  2. Revert "Deprecate gcs-config (tensorflow#1024)"

    This reverts commit 9702a15.
    michaelbanfield committed Dec 15, 2020
    ca8e327
  3. Rebase change

    michaelbanfield committed Dec 15, 2020
    cb8ffe7
  4. Clean up merge

    michaelbanfield committed Dec 15, 2020
    3536338
  5. Fix lint errors

    michaelbanfield committed Dec 15, 2020
    11c8a51

Commits on Mar 29, 2021

  1. 3560ac6
  2. 42c2867

Commits on Mar 30, 2021

  1. Update the API Compatibility test to include tf-nightly vs. tensorflow-io==0.17.0 (tensorflow#1230)
    
    * Update the API Compatibility test to include tf-nightly vs. tensorflow-io==0.17.0
    
    as we release tensorflow-io==0.17.0
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Bump Linux and Windows version checks
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    7c04ee3
  2. Bump Apache Arrow to 2.0.0 (tensorflow#1231)

    * Bump Apache Arrow to 2.0.0
    
    Also bumps Apache Thrift to 0.13.0
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Update code to match Arrow
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Bump pyarrow to 2.0.0
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Stay with version=1 for write_feather to pass tests
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Bump flatbuffers to 1.12.0
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Fix Windows issue
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Fix tests
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Fix Windows
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Remove -std=c++11 and leave default -std=c++14 for arrow build
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Update sha256 of libapr1
    
    As the hash changed by the repo.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    760dd25
  3. Bump Avro to 1.10.1 (tensorflow#1235)

    This PR bumps Avro to 1.10.1.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    9a3663c
  4. Add emulator for gcs (tensorflow#1234)

    * Bump com_github_googleapis_google_cloud_cpp to `1.21.0`
    
    * Add gcs testbench
    
    * Bump `libcurl` to `7.69.1`
    vnghia authored and michaelbanfield committed Mar 30, 2021
    2e6936f
  5. 04d6913
  6. Remove the CI build for CentOS 8 (tensorflow#1237)

    Building shared libraries on CentOS 8 is pretty much the same as
    on Ubuntu 20.04, except that `apt` should be changed to `yum`. For that
    reason our CentOS 8 CI test is not adding a lot of value.
    
    Furthermore, with the upcoming CentOS 8 change:
    https://www.phoronix.com/scan.php?page=news_item&px=CentOS-8-Ending-For-Stream
    
    CentOS 8 effectively reaches EOL in 2021.
    
    For that reason we may want to drop the CentOS 8 build (and only leave a comment in README.md).
    
    Note that we keep the CentOS 7 build for now, as there are still many users on
    CentOS 7 and CentOS 7 will only reach EOL in 2024. We might drop the CentOS 7 build in
    the future as well if similar changes are made to CentOS 7.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    aa1a95d
  7. [MongoDB] update API docstrings (tensorflow#1243)

    * [mongoDB] update API docs
    
    * lint fixes
    
    * rename wrong API
    
    * lint fixes
    kvignesh1420 authored and michaelbanfield committed Mar 30, 2021
    6c29813
  8. 371877e
  9. 88b9d8d
  10. Add fail-fast: false to API Compatibility GitHub Actions (tensorflow#1246)
    
    This PR adds `fail-fast: false` to the API Compatibility GitHub Actions workflow.
    The main reason is to make sure that if any job fails, the parallel jobs
    within the same matrix of the workflow can continue. The API Compatibility
    test checks how our plugin binaries match up with different versions,
    so we want to see the full set of compatibility results.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    d695644
  11. b0ffa2e
  12. S3 Improvements (tensorflow#1248)

    * fix `curl ssl` BUILD on macos
    
    * split `@aws-sdk-cpp` to smaller components
    
    * More http code support for `TF_SetStatusFromAWSError`
    
    * Fix memory problem with `s3`
    
    * s3: `RecursivelyCreateDir` fallback to `CreateDir`
    vnghia authored and michaelbanfield committed Mar 30, 2021
    a68633e
  13. Add missed function RecursivelyCreateDir for hdfs file system implementation (tensorflow#1218)
    
    This PR adds the missing function RecursivelyCreateDir to the hdfs file system implementation.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    e29e8a3
  14. fd7bc1a
  15. [audio] cleanup vorbis file after usage (tensorflow#1249)

    * [audio] cleanup vorbis file after usage
    
    * move the file cleanup to destructor
    kvignesh1420 authored and michaelbanfield committed Mar 30, 2021
    7bb3d30
  16. [s3] add support for testing on macOS (tensorflow#1253)

    * [s3] add support for testing on macOS
    
    * modify docker-compose cmd
    kvignesh1420 authored and michaelbanfield committed Mar 30, 2021
    9966944
  17. 0342a16
  18. [docs] Restructure README.md content (tensorflow#1257)

    * Refactor README.md content
    
    * bump to run ci jobs
    kvignesh1420 authored and michaelbanfield committed Mar 30, 2021
    b56ad5d
  19. Update libtiff/libgeotiff dependency (tensorflow#1258)

    This PR updates libtiff/libgeotiff to the latest version.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    ab9004d
  20. Update openjpeg to 2.4.0 (tensorflow#1259)

    This PR updates openjpeg to the latest 2.4.0
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    0381b91
  21. ac8da58
  22. 3641d2a
  23. Exposes num_parallel_reads and num_parallel_calls (tensorflow#1232)

    * Exposes num_parallel_reads and num_parallel_calls
    
    -Exposes `num_parallel_reads` and `num_parallel_calls` in AvroRecordDataset and `make_avro_record_dataset`
    -Adds parameter constraints
    -Fixes lint issues
    
    * Fixes Lint Issues
    
    * Removes Optional typing for method parameter
    
    * Adds test method for _require() function
    
    -This update adds a test to check if ValueErrors
    are raised when given an invalid input for num_parallel_calls
    
    * Uncomments skip for macOS pytests
    
    * Fixes Lint issues
    
    Co-authored-by: Abin Shahab <[email protected]>
    2 people authored and michaelbanfield committed Mar 30, 2021
    4f340e0
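The parameter constraints mentioned in this commit can be illustrated with a small sketch. This is not the code from the PR: only the name `_require` and the `ValueError` behaviour come from the commit message, while the signature of `_require`, the helper `check_parallelism`, and the rule "positive integer or unset" are assumptions made for illustration.

```python
def _require(condition, message):
    """Raise ValueError with `message` when `condition` is false (assumed signature)."""
    if not condition:
        raise ValueError(message)


def check_parallelism(num_parallel_reads=None, num_parallel_calls=None):
    """Validate both knobs; None means "not set" (hypothetical helper)."""
    for name, value in (
        ("num_parallel_reads", num_parallel_reads),
        ("num_parallel_calls", num_parallel_calls),
    ):
        if value is not None:
            # Assumed constraint: a positive int (bool is rejected explicitly).
            _require(
                isinstance(value, int) and not isinstance(value, bool) and value > 0,
                f"{name} must be a positive integer, got {value!r}",
            )
    return num_parallel_reads, num_parallel_calls
```

Checking the arguments up front means an invalid value fails with a message naming the offending parameter, instead of failing later inside the dataset pipeline.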
  24. Fix incomplete row reading issue in parquet files (tensorflow#1262)

    This PR tries to address the issue raised in 1254, where reading parquet
    files results in `InvalidArgumentError: null value in column`.
    
    The issue comes from the fact that parquet's ColumnReader C++ API
    `ReadBatch(...)` does not necessarily respect the number of rows
    requested and may return fewer instead.
    
    This PR fixes 1254.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    0663d38
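The short-read behaviour described in this commit can be modelled with a small sketch. It does not use the parquet C++ API: `ShortReader` and `read_rows` are hypothetical stand-ins, and the only idea taken from the commit message is that one batch call may return fewer rows than requested, so the caller has to loop.

```python
class ShortReader:
    """Stand-in reader that may return fewer values than requested."""

    def __init__(self, values, max_per_call):
        self._values = list(values)
        self._pos = 0
        self._max = max_per_call

    def read_batch(self, n):
        # Return at most `max_per_call` values, even if `n` is larger.
        count = min(n, self._max, len(self._values) - self._pos)
        chunk = self._values[self._pos:self._pos + count]
        self._pos += count
        return chunk


def read_rows(reader, n):
    """Call read_batch repeatedly until `n` values are read or data runs out."""
    rows = []
    while len(rows) < n:
        chunk = reader.read_batch(n - len(rows))
        if not chunk:  # no progress: the column is exhausted
            break
        rows.extend(chunk)
    return rows
```

A caller that trusts a single `read_batch(n)` call would treat the rows that were never read as missing values, which matches the kind of symptom the commit message describes.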
  25. 7cb6b0f
  26. add avro tutorial testing data (tensorflow#1267)

    Co-authored-by: Cheng Ren <[email protected]>
    2 people authored and michaelbanfield committed Mar 30, 2021
    cebc613
  27. Update Kafka tutorial to work with Apache Kafka (tensorflow#1266)

    * Update Kafka tutorial to work with Apache Kafka
    
    Minor update to the Kafka tutorial to remove the dependency on
    Confluent's distribution of Kafka, and instead work with vanilla
    Apache Kafka.
    
    Signed-off-by: Dale Lane <[email protected]>
    
    * Address review comments
    
    Remove redundant pip install commands
    
    Signed-off-by: Dale Lane <[email protected]>
    dalelane authored and michaelbanfield committed Mar 30, 2021
    fc8d472
  28. Update pulsar download link. (tensorflow#1270)

    This PR updates the pulsar download link, as the old link no longer works.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    5df32c6
  29. add github workflow for performance benchmarking (tensorflow#1269)

    * add github workflow for performance benchmarking
    
    * add github-action-benchmark step
    kvignesh1420 authored and michaelbanfield committed Mar 30, 2021
    2bbdd40
  30. handle missing dependencies while benchmarking (tensorflow#1271)

    * handle missing dependencies while benchmarking
    
    * setup test_sql
    
    * job name change
    
    * set auto-push to true
    
    * remove auto-push
    
    * add personal access token
    
    * use alternate method to push to gh-pages
    
    * add name to the action
    
    * use different id
    
    * modify creds
    
    * use github_token
    
    * change repo name
    
    * set auto-push
    
    * set origin and push results
    
    * set env
    
    * use PERSONAL_GITHUB_TOKEN
    
    * use push changes action
    
    * use github.head_ref to push the changes
    
    * try using fetch-depth
    
    * modify branch name
    
    * use alternative push approach
    
    * git switch -
    
    * test by merging with forked master
    kvignesh1420 authored and michaelbanfield committed Mar 30, 2021
    f8efb10
  31. Disable s3 macOS for now as docker is not working on GitHub Actions for macOS (tensorflow#1277)
    
    * Revert "[s3] add support for testing on macOS (tensorflow#1253)"
    
    This reverts commit 81789bd.
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Update
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    544740a
  32. b71fcb3
  33. 337ef96
  34. c0d56ee
  35. f16c613
  36. Bump Apache Arrow to 3.0.0 (tensorflow#1285)

    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    ef2927c
  37. Add bazel cache (tensorflow#1287)

    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    03b77de
  38. Add initial bigtable stub test (tensorflow#1286)

    * Add initial bigtable stub test
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Fix kokoro test
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    f492bc8
  39. Update azure lite v0.3.0 (tensorflow#1288)

    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    2808ac3
  40. Add reference to github-pages benchmarks in README (tensorflow#1289)

    * add reference to github-pages benchmarks
    
    * minor grammar change
    
    * Update README.md
    
    Co-authored-by: Yuan Tang <[email protected]>
    2 people authored and michaelbanfield committed Mar 30, 2021
    5a42d1e
  41. ff6245a
  42. 8a1fead
  43. fix kafka online-learning section in tutorial notebook (tensorflow#1274)

    * kafka notebook fix for colab env
    
    * change timeout from 30 to 20 seconds
    
    * reduce stream_timeout
    kvignesh1420 authored and michaelbanfield committed Mar 30, 2021
    b07d1ab
  44. Only enable bazel caching writes for tensorflow/io github actions (tensorflow#1293)
    
    This PR makes sure that bazel cache writes are only enabled for
    GitHub Actions runs on the tensorflow/io repo.
    
    Without this update, actions in a forked repo will cause an error.
    
    Note that once bazel cache read permissions are enabled from gcs,
    forked repos will be able to access the bazel cache (read-only).
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    5299d14
  45. Enable read-only bazel cache (tensorflow#1294)

    This PR enables read-only bazel cache
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    f02af15
  46. Update xz to 5.2.5, and switch the download link. (tensorflow#1296)

    This PR updates xz to 5.2.5, and switches the download link
    to GitHub, as it is more stable.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    84bba4c
  47. 0a7c5a2
  48. Rename tests (tensorflow#1297)

    yongtang authored and michaelbanfield committed Mar 30, 2021
    880c8b3
  49. Combine Ubuntu 20.04 and CentOS 7 tests into one GitHub jobs (tensorflow#1299)
    
    When GitHub Actions runs, there appears to be an implicit limit on concurrent
    jobs. As such, the CentOS 7 test is normally scheduled later, after
    other jobs complete. However, the CentOS 7 test often hangs
    (e.g., https://github.com/tensorflow/io/runs/1825943449). This is likely
    because the CentOS 7 test sits in the GitHub Actions queue for too long.
    
    This PR moves CentOS 7 to run after the Ubuntu 20.04 test completes, to try to
    avoid hangs.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    7010a48
  50. Update names of api tests (tensorflow#1300)

    We renamed the tests to remove the "_eager" parts. This PR updates the api test to use the correct filenames.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    5091b94
  51. Fix wrong benchmark tests names (tensorflow#1301)

    Fixes wrong benchmark test names caused by the last commit.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    79ccf5e
  52. Patch arrow to temporarily resolve the ARROW-11518 issue (tensorflow#1304)
    
    This PR patches arrow to temporarily resolve the ARROW-11518 issue.
    
    See 1281 for details
    
    Credit to diggerk.
    
    We will update arrow after the upstream PR is merged.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    7945ec5
  53. Avoid error if plugins .so module is not available (tensorflow#1302)

    This PR raises a warning instead of an error in case the
    plugins .so module is not available, so that the tensorflow-io
    package can be at least partially used with python-only
    functions.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    a398d26
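The warn-instead-of-fail behaviour from this commit can be sketched in a generic way. This is not the tensorflow-io loader: `load_optional` is a hypothetical helper that uses a plain Python import as a stand-in for loading the plugins .so module, and it only illustrates turning a load failure into a warning so that the rest of the package stays usable.

```python
import importlib
import warnings


def load_optional(module_name):
    """Import `module_name`, returning None with a warning if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        # Degrade gracefully: warn and let the caller continue without the module.
        warnings.warn(f"unable to load {module_name}: {e}")
        return None
```

Callers then check for `None` before using functionality that needs the optional module, while python-only functions keep working.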
  54. Remove AWS headers from tensorflow, and use headers from third_party … (tensorflow#1241)
    
    * Remove external headers from tensorflow, and use third_party headers instead
    
    This PR removes external headers from tensorflow, and
    uses third_party headers instead.
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Address review comment
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    df04e37
  55. cc93afa
  56. Switch to use github to download libgeotiff (tensorflow#1307)

    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    f34d193
  57. Add @com_google_absl//absl/strings:cord (tensorflow#1308)

    Fix read/STDIN_FILENO
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    801569f
  58. Switch to modular file system for hdfs (tensorflow#1309)

    * Switch to modular file system for hdfs
    
    This PR is part of the effort to switch to modular file system for hdfs.
    When TF_ENABLE_LEGACY_FILESYSTEM=1 is provided, old behavior will
    be preserved.
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Build against tf-nightly
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Update tests
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Adjust the if else logic, follow review comment
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    a871e52
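The opt-out switch named in this commit can be sketched as a simple environment check. Only the variable name `TF_ENABLE_LEGACY_FILESYSTEM` and the value `1` come from the commit message. The helper names and the "legacy"/"modular" labels are hypothetical, and treating every value other than `1` as "not set" is an assumption.

```python
import os


def use_legacy_filesystem(environ=None):
    """Return True when TF_ENABLE_LEGACY_FILESYSTEM is set to "1"."""
    env = os.environ if environ is None else environ
    return env.get("TF_ENABLE_LEGACY_FILESYSTEM", "") == "1"


def pick_filesystem(environ=None):
    # Labels are illustrative only; they are not names used by tensorflow-io.
    return "legacy" if use_legacy_filesystem(environ) else "modular"
```

Passing the environment as an argument keeps the check easy to test without changing the real process environment.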
  59. Disable test_write_kafka test for now. (tensorflow#1310)

    With the tensorflow upgrade to tf-nightly, the test_write_kafka test
    is failing, which blocks the planned modular file system migration.
    
    This PR disables the test temporarily so that CI can continue
    to push the tensorflow-io-nightly image (needed for modular file system migration).
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    5b77f96
  60. Modify --plat-name for macosx wheels (tensorflow#1311)

    * modify --plat-name for macosx wheels
    
    * switch to 10.14
    kvignesh1420 authored and michaelbanfield committed Mar 30, 2021
    53c9a71
  61. Switch to modular file system for s3 (tensorflow#1312)

    This PR is part of the effort to switch to modular file system for s3.
    When TF_ENABLE_LEGACY_FILESYSTEM=1 is provided, old behavior will
    be preserved.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    3f7f292
  62. Update to enable python 3.9 building on Linux (tensorflow#1314)

    * Update to enable python 3.9 building on Linux
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Switch to always use ubuntu:20.04
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    33fca56
  63. 314f406
  64. fb5cab8
  65. Experimental: Add initial wavefront/obj parser for vertices (tensorflow#1315)
    
    This PR is an early experimental implementation of a Wavefront obj
    parser in tensorflow-io for 3D objects.
    This PR is the first step: obtaining raw vertices as a float32
    tensor with a shape of `[n, 3]`.
    
    Additional follow-up PRs will be needed to handle meshes with
    different shapes (not sure if a ragged tensor will be a good fit
    in that case).
    
    Some background on the obj file format:
    Wavefront (obj) is a format widely used in 3D modeling (another is ply)
    (http://paulbourke.net/dataformats/obj/). It is simple
    (ASCII) and well supported by many software packages. Machine learning
    in 3D has been an active field, with some advances such as
    PolyGen (https://arxiv.org/abs/2002.10880).
    
    Processing obj files is needed to process 3D data with tensorflow.
    
    In 3D the basic elements could be vertices or faces. This PR
    tries to cover vertices first, so that vertices in an obj file
    can be loaded into TF's graph for further processing within
    a graph pipeline.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    221e221
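The vertex extraction described in this commit can be sketched in plain Python. This is not the tensorflow-io op: `parse_obj_vertices` is a hypothetical helper that returns nested lists of floats, which could then be converted into a float32 tensor of shape `[n, 3]`. It relies only on the obj convention that geometric vertices are written as `v x y z` records, with an optional fourth value that is ignored here.

```python
def parse_obj_vertices(text):
    """Collect `v x y z` records from Wavefront OBJ text as [x, y, z] rows."""
    vertices = []
    for line in text.splitlines():
        fields = line.split()
        # Geometric vertices start with "v"; "vt", "vn", "f" and comments are skipped.
        if len(fields) >= 4 and fields[0] == "v":
            vertices.append([float(f) for f in fields[1:4]])
    return vertices
```

Because every row has exactly three values, the result is rectangular. Faces can reference a varying number of vertices, which is the case the commit message flags as a possible fit for ragged tensors.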
  66. 3b81b85
  67. 1c85b77
  68. Enable python 3.9 build on macOS (tensorflow#1324)

    This PR enables python 3.9 build on macOS, as tf-nightly
    is available with macOS now.
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    ef46f8c
  69. 3121308
  70. 57d840b
  71. Adds AVRO_PARSER_NUM_MINIBATCH to override num_minibatches and logs the parsing time (tensorflow#1283)
    
    * Exposes num_parallel_reads and num_parallel_calls
    
    -Exposes `num_parallel_reads` and `num_parallel_calls` in AvroRecordDataset and `make_avro_record_dataset`
    -Adds parameter constraints
    -Fixes lint issues
    -Adds test method for _require() function
    -This update adds a test to check if ValueErrors
    are raised when given an invalid input for num_parallel_calls
    
    * Bump Apache Arrow to 2.0.0 (tensorflow#1231)
    
    * Bump Apache Arrow to 2.0.0
    
    Also bumps Apache Thrift to 0.13.0
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Update code to match Arrow
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Bump pyarrow to 2.0.0
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Stay with version=1 for write_feather to pass tests
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Bump flatbuffers to 1.12.0
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Fix Windows issue
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Fix tests
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Fix Windows
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Remove -std=c++11 and leave default -std=c++14 for arrow build
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Update sha256 of libapr1
    
    As the hash changed by the repo.
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Add emulator for gcs (tensorflow#1234)
    
    * Bump com_github_googleapis_google_cloud_cpp to `1.21.0`
    
    * Add gcs testbench
    
    * Bump `libcurl` to `7.69.1`
    
    * Remove the CI build for CentOS 8 (tensorflow#1237)
    
    Building shared libraries on CentOS 8 is pretty much the same as
    on Ubuntu 20.04, except that `apt` should be changed to `yum`. For that
    reason our CentOS 8 CI test is not adding a lot of value.
    
    Furthermore, with the upcoming CentOS 8 change:
    https://www.phoronix.com/scan.php?page=news_item&px=CentOS-8-Ending-For-Stream
    
    CentOS 8 effectively reaches EOL in 2021.
    
    For that reason we may want to drop the CentOS 8 build (and only leave a comment in README.md).
    
    Note that we keep the CentOS 7 build for now, as there are still many users on
    CentOS 7 and CentOS 7 will only reach EOL in 2024. We might drop the CentOS 7 build in
    the future as well if similar changes are made to CentOS 7.
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * add tf-c-header rule (tensorflow#1244)
    
    * Skip tf-nightly:tensorflow-io==0.17.0 on API compatibility test (tensorflow#1247)
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * [s3] add support for testing on macOS (tensorflow#1253)
    
    * [s3] add support for testing on macOS
    
    * modify docker-compose cmd
    
    * add notebook formatting instruction in README (tensorflow#1256)
    
    * [docs] Restructure README.md content (tensorflow#1257)
    
    * Refactor README.md content
    
    * bump to run ci jobs
    
    * Update libtiff/libgeotiff dependency (tensorflow#1258)
    
    This PR updates libtiff/libgeotiff to the latest version.
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * remove unstable elasticsearch test setup on macOS (tensorflow#1263)
    
    * Exposes num_parallel_reads and num_parallel_calls (tensorflow#1232)
    
    -Exposes `num_parallel_reads` and `num_parallel_calls` in AvroRecordDataset and `make_avro_record_dataset`
    -Adds parameter constraints
    -Fixes lint issues
    - Adds test method for _require() function
    -This update adds a test to check if ValueErrors
    are raised when given an invalid input for num_parallel_calls
    
    Co-authored-by: Abin Shahab <[email protected]>
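The parameter constraint described in the commit above can be sketched roughly as follows. This is a minimal illustration only: the helper name `require_positive` and the exact rule (accept `None` or an integer of at least 1) are assumptions made for the example, not the actual `_require()` implementation in tensorflow-io.

```python
def require_positive(name, value):
    """Hypothetical validator (not the tensorflow-io `_require()` helper).

    Accepts None (meaning "use the default") or an int >= 1 and
    raises ValueError for anything else.
    """
    if value is None:
        return None
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"{name} must be an int or None, got {value!r}")
    if value < 1:
        raise ValueError(f"{name} must be >= 1, got {value}")
    return value
```

A test in the spirit of the one mentioned in the commit would call the helper with an invalid value such as `0` and check that a `ValueError` is raised.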
    
    * Added AVRO_PARSER_NUM_MINIBATCH to override num_minibatches
    
    Added AVRO_PARSER_NUM_MINIBATCH to override num_minibatches. This is recommended to be set equal to the vcore request.
    
    * Exposes num_parallel_reads and num_parallel_calls (tensorflow#1232)
    
    * Exposes num_parallel_reads and num_parallel_calls
    
    -Exposes `num_parallel_reads` and `num_parallel_calls` in AvroRecordDataset and `make_avro_record_dataset`
    -Adds parameter constraints
    -Fixes lint issues
    
    * Exposes num_parallel_reads and num_parallel_calls
    
    -Exposes `num_parallel_reads` and `num_parallel_calls` in AvroRecordDataset and `make_avro_record_dataset`
    -Adds parameter constraints
    -Fixes lint issues
    
    * Exposes num_parallel_reads and num_parallel_calls
    
    -Exposes `num_parallel_reads` and `num_parallel_calls` in AvroRecordDataset and `make_avro_record_dataset`
    -Adds parameter constraints
    -Fixes lint issues
    
    * Fixes Lint Issues
    
    * Removes Optional typing for method parameter
    
    * Adds test method for _require() function
    
    -This update adds a test to check if ValueErrors
    are raised when given an invalid input for num_parallel_calls
    
    * Uncomments skip for macOS pytests
    
    * Fixes Lint issues
    
    Co-authored-by: Abin Shahab <[email protected]>
    
    * add avro tutorial testing data (tensorflow#1267)
    
    Co-authored-by: Cheng Ren <[email protected]>
    
    * Update Kafka tutorial to work with Apache Kafka (tensorflow#1266)
    
    * Update Kafka tutorial to work with Apache Kafka
    
    Minor update to the Kafka tutorial to remove the dependency on
    Confluent's distribution of Kafka, and instead work with vanilla
    Apache Kafka.
    
    Signed-off-by: Dale Lane <[email protected]>
    
    * Address review comments
    
    Remove redundant pip install commands
    
    Signed-off-by: Dale Lane <[email protected]>
    
    * add github workflow for performance benchmarking (tensorflow#1269)
    
    * add github workflow for performance benchmarking
    
    * add github-action-benchmark step
    
    * handle missing dependencies while benchmarking (tensorflow#1271)
    
    * handle missing dependencies while benchmarking
    
    * setup test_sql
    
    * job name change
    
    * set auto-push to true
    
    * remove auto-push
    
    * add personal access token
    
    * use alternate method to push to gh-pages
    
    * add name to the action
    
    * use different id
    
    * modify creds
    
    * use github_token
    
    * change repo name
    
    * set auto-push
    
    * set origin and push results
    
    * set env
    
    * use PERSONAL_GITHUB_TOKEN
    
    * use push changes action
    
    * use github.head_ref to push the changes
    
    * try using fetch-depth
    
    * modify branch name
    
    * use alternative push approach
    
    * git switch -
    
    * test by merging with forked master
    
    * Disable s3 macOS for now as docker is not working on GitHub Actions for macOS (tensorflow#1277)
    
    * Revert "[s3] add support for testing on macOS (tensorflow#1253)"
    
    This reverts commit 81789bd.
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Update
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * rename testing data files (tensorflow#1278)
    
    * Add tutorial for avro dataset API (tensorflow#1250)
    
    * remove docker based mongodb tests in macos (tensorflow#1279)
    
    * trigger benchmarks workflow only on commits (tensorflow#1282)
    
    * Bump Apache Arrow to 3.0.0 (tensorflow#1285)
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Add bazel cache (tensorflow#1287)
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Add initial bigtable stub test (tensorflow#1286)
    
    * Add initial bigtable stub test
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Fix kokoro test
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Add reference to github-pages benchmarks in README (tensorflow#1289)
    
    * add reference to github-pages benchmarks
    
    * minor grammar change
    
    * Update README.md
    
    Co-authored-by: Yuan Tang <[email protected]>
    
    Co-authored-by: Yuan Tang <[email protected]>
    
    * Clear outputs (tensorflow#1292)
    
    * fix kafka online-learning section in tutorial notebook (tensorflow#1274)
    
    * kafka notebook fix for colab env
    
    * change timeout from 30 to 20 seconds
    
    * reduce stream_timeout
    
    * Only enable bazel caching writes for tensorflow/io github actions (tensorflow#1293)
    
    This PR enables bazel cache writes only for GitHub Actions runs on
    the tensorflow/io repo.
    
    Without this change, actions run from a forked repo will fail.
    
    Note that once bazel cache read permissions are enabled on GCS,
    forked repos will be able to access the bazel cache (read-only).
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Enable read-only bazel cache (tensorflow#1294)
    
    This PR enables read-only bazel cache
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Rename tests (tensorflow#1297)
    
    * Combine Ubuntu 20.04 and CentOS 7 tests into one GitHub job (tensorflow#1299)
    
    GitHub Actions appears to have an implicit limit on concurrent
    jobs. As a result, the CentOS 7 test is normally scheduled after
    other jobs complete. However, the CentOS 7 test often hangs
    (e.g., https://github.com/tensorflow/io/runs/1825943449), likely
    because it sits in the GitHub Actions queue for too long.
    
    This PR moves CentOS 7 to run after the Ubuntu 20.04 test completes, to try to
    avoid hangs.
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Update names of api tests (tensorflow#1300)
    
    We renamed the tests to remove "_eager" parts. This PR updates the api test for correct filenames
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Fix wrong benchmark test names (tensorflow#1301)
    
    Fixes wrong benchmark test names caused by the last commit
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Patch arrow to temporarily resolve the ARROW-11518 issue (tensorflow#1304)
    
    This PR patches arrow to temporarily resolve the ARROW-11518 issue.
    
    See 1281 for details
    
    Credit to diggerk.
    
    We will update arrow after the upstream PR is merged.
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Remove AWS headers from tensorflow, and use headers from third_party … (tensorflow#1241)
    
    * Remove external headers from tensorflow, and use third_party headers instead
    
    This PR removes external headers from tensorflow and
    uses third_party headers instead.
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Address review comment
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Switch to use github to download libgeotiff (tensorflow#1307)
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Add @com_google_absl//absl/strings:cord (tensorflow#1308)
    
    Fix read/STDIN_FILENO
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Switch to modular file system for hdfs (tensorflow#1309)
    
    * Switch to modular file system for hdfs
    
    This PR is part of the effort to switch to modular file system for hdfs.
    When TF_ENABLE_LEGACY_FILESYSTEM=1 is provided, old behavior will
    be preserved.
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Build against tf-nightly
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Update tests
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Adjust the if else logic, follow review comment
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Disable test_write_kafka test for now. (tensorflow#1310)
    
    With the tensorflow upgrade to tf-nightly, the test_write_kafka test
    is failing, which blocks the planned modular file system migration.
    
    This PR disables the test temporarily so that CI can continue
    to push the tensorflow-io-nightly image (needed for the modular file system migration)
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Switch to modular file system for s3 (tensorflow#1312)
    
    This PR is part of the effort to switch to modular file system for s3.
    When TF_ENABLE_LEGACY_FILESYSTEM=1 is provided, old behavior will
    be preserved.
    
    Signed-off-by: Yong Tang <[email protected]>
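The opt-out described in the hdfs and s3 commits above can be pictured as a simple environment check. The sketch below is only an illustration of that idea: the function name `use_legacy_filesystem` is hypothetical, and treating exactly the string `"1"` as "enabled" is an assumption based on the wording of the commit message, not on the actual tensorflow-io code.

```python
import os


def use_legacy_filesystem(environ=None):
    """Hypothetical check: True when the legacy file system is requested.

    Assumes the flag counts as set only when its value is exactly "1".
    """
    env = os.environ if environ is None else environ
    return env.get("TF_ENABLE_LEGACY_FILESYSTEM", "") == "1"
```

In practice a flag like this would presumably need to be set (for example with `export TF_ENABLE_LEGACY_FILESYSTEM=1`) before the Python process imports the library, so that it is visible when the file systems are registered.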
    
    * Add python 3.9 on Windows (tensorflow#1316)
    
    * Updates the PR to use attribute instead of Env Variable
    
    -Originally AVRO_PARSER_NUM_MINIBATCH was set as an environment
    variable. Because tensorflow-io rarely uses env vars to fine-tune
    kernel ops, this was changed to an attribute. See comment here:
    tensorflow#1283 (comment)
    
    * Added AVRO_PARSER_NUM_MINIBATCH to override num_minibatches
    
    Added AVRO_PARSER_NUM_MINIBATCH to override num_minibatches. This is recommended to be set equal to the vcore request.
    
    * Updates the PR to use attribute instead of Env Variable
    
    -Originally AVRO_PARSER_NUM_MINIBATCH was set as an environment
    variable. Because tensorflow-io rarely uses env vars to fine-tune
    kernel ops, this was changed to an attribute. See comment here:
    tensorflow#1283 (comment)
    
    * Adds additional comments in source code for understandability
    
    Co-authored-by: Abin Shahab <[email protected]>
    Co-authored-by: Yong Tang <[email protected]>
    Co-authored-by: Vo Van Nghia <[email protected]>
    Co-authored-by: Vignesh Kothapalli <[email protected]>
    Co-authored-by: Cheng Ren <[email protected]>
    Co-authored-by: Cheng Ren <[email protected]>
    Co-authored-by: Dale Lane <[email protected]>
    Co-authored-by: Yuan Tang <[email protected]>
    Co-authored-by: Mark Daoust <[email protected]>
    10 people authored and michaelbanfield committed Mar 30, 2021
    c7e99a5
  72. Super Serial- automatically save and load TFRecords from Tensorflow datasets (tensorflow#1280)
    
    * super_serial automatically creates TFRecords files from dictionary-style Tensorflow datasets.
    
    * pep8 fixes
    
    * more pep8 (undoing tensorflow 2 space tabs)
    
    * bazel changes
    
    * small change so github checks will run again
    
    * moved super_serial test to tests/
    
    * bazel changes
    
    * moved super_serial to experimental
    
    * refactored super_serial test to work for serial_ops
    
    * bazel fixes
    
    * refactored test to load from tfio instead of full import path
    
    * licenses
    
    * bazel fixes
    
    * fixed license dates for new files
    
    * small change so tests rerun
    
    * small change so tests rerun
    
    * cleanup and bazel fix
    
    * added test to ensure proper crash occurs when trying to save in graph mode
    
    * bazel fixes
    
    * fixed imports for test
    
    * fixed imports for test
    
    * fixed yaml imports for serial_ops
    
    * fixed error path for new tf version
    
    * prevented flaky behavior in graph mode for serial_ops.py by preemptively raising an exception if graph mode is detected.
    
    * sanity check for graph execution in graph_save_fail()
    
    * it should be impossible for serial_ops not to raise an exception now outside of eager mode. Impossible.
    
    * moved eager execution check in serial_ops
    markemus authored and michaelbanfield committed Mar 30, 2021
    de54c3c
  73. Fix link in avro reader notebook (tensorflow#1333)

    Correct the link to Avro Reader tests in notebook
    oliverhu authored and michaelbanfield committed Mar 30, 2021
    9644be3
  74. Bump abseil-cpp to 6f9d96a1f41439ac172ee2ef7ccd8edf0e5d068c (tensorflow#1336)
    
    * Bump abseil-cpp to 6f9d96a1f41439ac172ee2ef7ccd8edf0e5d068c
    
    This PR bumps abseil-cpp to 6f9d96a1f41439ac172ee2ef7ccd8edf0e5d068c
    to fix the build issue.
    
    See related changes in tensorflow/tensorflow/commit/1c9eeb9eaa1b712d71fc29bcc9054c25c7236fa2
    
    Signed-off-by: Yong Tang <[email protected]>
    
    * Remove flaky CentOS 7 build
    
    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    8d7d28f
  75. Release nightly even if test fails (tensorflow#1339)

    Signed-off-by: Yong Tang <[email protected]>
    yongtang authored and michaelbanfield committed Mar 30, 2021
    3de431d
  76. ef8a5d5
  77. gcs switch to env (tensorflow#1319)

    * switch to env
    
    * switch to gcs on tensorflow-io according to tensorflow/tensorflow#47247
    vnghia authored and michaelbanfield committed Mar 30, 2021
    4154a2c
  78. improvements for s3 environment variables (tensorflow#1343)

    * lazy loading for `s3` environment variables
    
    * `S3_ENDPOINT` supports http/https
    
    * remove `S3_USE_HTTPS` and `S3_VERIFY_SSL`
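One way to picture an endpoint setting that carries its own scheme, instead of relying on a separate HTTPS flag, is the sketch below. It is purely illustrative: `parse_s3_endpoint` is a hypothetical helper, and defaulting to `https` when no scheme is given is an assumption made for this example rather than documented tensorflow-io behavior.

```python
from urllib.parse import urlsplit


def parse_s3_endpoint(endpoint, default_scheme="https"):
    """Hypothetical helper: split an endpoint string into (scheme, host).

    Assumes a bare "host:port" value falls back to `default_scheme`.
    """
    if "://" not in endpoint:
        endpoint = f"{default_scheme}://{endpoint}"
    parts = urlsplit(endpoint)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    return parts.scheme, parts.netloc
```

For example, a local emulator could be addressed as `http://localhost:4566`, while a bare host name would be treated as HTTPS under the assumption above.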
    vnghia authored and michaelbanfield committed Mar 30, 2021
    Configuration menu
    Copy the full SHA
    64eb761 View commit details
    Browse the repository at this point in the history