-
Notifications
You must be signed in to change notification settings - Fork 7
test: add sqllogictest runner #169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
88877e5
to
4ee8428
Compare
3a10cd2
to
3c2d05e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CI looks like his passed, but clicking on the individual tests and inspecting output shows,
...
❌ tests/sqllogictest/explain.slt: FAILED
...
which looked like it might have been missed.
0272567
to
82ed70e
Compare
743b276
to
35d428b
Compare
This change adds a sqllogictest runner + CLI which runs .slt files in the tests/sqllogictest directory. Right now, the runner uses a hardcoded distributed config (ex. `with_network_shuffle_tasks(2)` etc.) but can be extended in the future. Using sqllogictest will make it much easier to write tests (especially for `explain` and `explain (analyze)`). `explain (analyze)` tests will be added in a future commit as it is still being implemented. Also, this change deletes `tests/distributed_aggregation.rs` and moves the test cases to `.slt` files. Documentation: - https://sqlite.org/sqllogictest/doc/trunk/about.wiki - https://github.com/risinglightdb/sqllogictest-rs
We now run any sqllogictest commands from tests/sqllogictest.sh in CI
c30ebf4
to
0dd8d3e
Compare
@robtandy Fixed! Good catch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's nice to see more efforts towards testing 😄 , although I'm going to try to persuade you to not go down this path:
From sqllogictest definition itself:
Sqllogictest is a program designed to verify that an SQL database engine computes correct results by comparing the results to identical queries from other SQL database engines.
Here are my reason to think sqllogictests are not suitable for testing this codebase:
-
In this project, we care about bringing distributed capabilities to DataFusion, but it's not in our scope to test DataFusion correctness itself, that's done already upstream in their sqllogictest.
-
This approach is not suitable for performing assertions over big pieces of data, where we want to assert results against normal DataFusion regardless of what they are (e.g.
assert_eq!(df_results, df_distributed_results)
). -
We lose
insta
as our snapshotting tool in favor of manually maintaining assertions over big portions of text. We can no longer docargo insta accept
for accepting changes in snapshots. -
Knowing that we still need our current snapshot testing, we would be maintaining two paths of testing with overlapping intentions, when we could reinforce on one.
-
This approach requires adding additional custom scripts, pipelines, executable binaries and overall overhead to the project.
-
It's more code to maintain. In terms of amount of code, this approach would pay off if we were to add a lot of sqllogictests, but I don't think we want or should do it, we should focus on testing things like:
- Are the metrics for each node reconstructed across the network?
- Are we able to propagate config properly across network boundaries?
- Are we able to encode/decode custom user nodes?
Answering to the PR description:
Using sqllogictest will make it much easier to write tests
I'm almost certain that writing tests this way is going to be far more complicated than using https://github.com/mitsuhiko/insta as we do in the other tests. I'd strongly recommend to get familiar and comfortable with the tool, and then reconsider if sqllogictest would help.
In short:
I'd redirect the efforts towards reinforcing on the current integration testing tooling in order to make it even more ergonomic so that we can bring more tests if needed, but I'd advocated towards not using sqllogictests in this project.
Ahh I didn't realize we had a solution for this. https://github.com/mitsuhiko/insta looks good. |
test: add sqllogictest runner
This change adds a sqllogictest runner + CLI which runs .slt files in the tests/sqllogictest directory. Right
now, the runner uses a hardcoded distributed config (ex.
with_network_shuffle_tasks(2)
etc.) but can beextended in the future. Using sqllogictest will make it much easier to write tests (especially for
explain
and
explain (analyze)
).explain (analyze)
tests will be added in a future commit as it is still beingimplemented.
Also, this change deletes
tests/distributed_aggregation.rs
and moves the test cases to.slt
files.Documentation:
ci: add sqllogictest files to github ci
We now run any sqllogictest commands from tests/sqllogictest.sh in CI
misc: add .vscode dir to .gitignore