-
Notifications
You must be signed in to change notification settings - Fork 179
Remove zetasql #272
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Remove zetasql #272
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This pull request removes the zetasql dependency from the data-validation component by eliminating zetasql‐related files, targets, and configuration settings. Key changes include:
- Removal of custom validation proto and C++ implementation files that indirectly functioned with zetasql.
- Deletion of BUILD targets and WORKSPACE zetasql archive/imports.
- Updating various build and configuration files (e.g. .bazelrc, setup.py, and CI workflows) to reflect the removal.
Reviewed Changes
Copilot reviewed 19 out of 137 changed files in this pull request and generated no comments.
Show a summary per file
File | Description |
---|---|
tensorflow_data_validation/anomalies/status_util.h | Removal of extra trailing newline characters. |
tensorflow_data_validation/anomalies/proto/custom_validation_config.proto | Entire file removed, reflecting removal of custom validation. |
tensorflow_data_validation/anomalies/proto/init.py | Minor formatting changes with no functional impact. |
tensorflow_data_validation/anomalies/BUILD | Removal of custom_validation cc_library and cc_test targets. |
tensorflow_data_validation/anomalies/custom_validation.h & .cc | Entire files removed as part of the zetasql dependency removal. |
tensorflow_data_validation/init.py | Updated import grouping for consistency after removal. |
setup.py and pyproject.toml | Formatting and dependency updates; no functional issues noted. |
WORKSPACE and .bazelrc | Removal of zetasql imports with updated comments. |
CI workflows (.github/workflows/test.yml and ci-lint.yml) | Adjustments to installation and linting configurations. |
.pre-commit-config.yaml | Newly added to enforce code style and linting standards. |
Comments suppressed due to low confidence (2)
.bazelrc:2
- Consider removing these obsolete commented-out zetasql references entirely rather than leaving them in the file to reduce potential confusion in the future.
# Zetasql is removed.
tensorflow_data_validation/anomalies/BUILD:428
- [nitpick] Ensure that all dependencies and references to the removed custom_validation targets are updated accordingly in the rest of the codebase to avoid any unresolved references.
cc_library(
Built on top of #267, which will be merged before this. Final tests running now 🤞 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By "custom validation", they're discussing validation done at the SQL level, right? As far as I can tell in the docs there's no mention of "custom validation" or anything related to SQL.
For anyone reading this doing future feature development: it would be great to have more code documentation...
@@ -71,18 +71,6 @@ http_archive( | |||
], | |||
) | |||
|
|||
# Needed by abseil-py by zetasql. | |||
http_archive( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this! Can't believe we're still dealing with 2->3 16 years later...
@@ -112,6 +100,16 @@ http_archive( | |||
url = "https://github.com/abseil/abseil-cpp/archive/%s.tar.gz" % COM_GOOGLE_ABSL_COMMIT, | |||
) | |||
|
|||
|
|||
# re2 required for google tests | |||
http_archive( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, were tests even running before? Not sure how this wasn't part of the build earlier...
), | ||
per_feature_parent_indices, | ||
) | ||
flattened, value_parent_indices = array_util.flatten_nested( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrewfulton9 Is this intended here, or did I mess this up when I merged master
?
Removes zetaSQL dependency from data-validation.