@amotl
Member

@amotl amotl commented Nov 7, 2023

About

More yak shaving.

Details

Similar to pyveci/pueblo#12 and #74, this patch adds generalized testing helper functions to avoid code duplication within the educational repository cratedb-examples, originally conceived at crate/cratedb-examples@dd09e144ef0. We need those kinds of utility functions at many more spots, so we need to find canonical places to store them.

Synopsis

pip install 'cratedb-toolkit[io]'
from cratedb_toolkit.io.sql import DatabaseAdapter

# Define database connection.
cratedb = DatabaseAdapter(dburi="crate://crate@localhost:4200")

# Mangle data. You choose.
cratedb.import_csv_pandas(filepath="test.csv", tablename="foobar")
cratedb.import_csv_dask(filepath="test.csv", tablename="foobar")
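
For readers unfamiliar with the idea behind the two import helpers, here is a minimal sketch of a batched CSV import using only the standard library. It is an illustration under stated assumptions, not the code from this patch: `import_csv_chunked` is a hypothetical helper, `sqlite3` stands in for a CrateDB connection, and the `chunksize` parameter is made up for the example.

```python
import csv
import io
import sqlite3
from itertools import islice


def import_csv_chunked(conn, csv_file, tablename, chunksize=1000):
    """Load CSV rows into `tablename` in batches and return the row count.

    Hypothetical helper for illustration only: sqlite3 stands in for a
    CrateDB connection, and the columns are created without explicit types.
    """
    reader = csv.reader(csv_file)
    header = next(reader)  # The first row carries the column names.
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{tablename}" ({columns})')

    total = 0
    while True:
        # Pull at most `chunksize` rows, so memory use stays bounded.
        batch = list(islice(reader, chunksize))
        if not batch:
            break
        conn.executemany(
            f'INSERT INTO "{tablename}" VALUES ({placeholders})', batch
        )
        total += len(batch)
    conn.commit()
    return total


# Usage with in-memory stand-ins for the CSV file and the database.
conn = sqlite3.connect(":memory:")
data = io.StringIO("id,name\n1,foo\n2,bar\n3,baz\n")
imported = import_csv_chunked(conn, data, "foobar", chunksize=2)
```

Reading in fixed-size batches is what keeps memory use flat for large files; the pandas and Dask variants presumably achieve the same through their own chunking or partitioning.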

References

Notes

Note that details of the interface may change as we go, specifically on the "naming things" / DWIM side. If you have any suggestions, feel free to add your voice. Right now, the main focus is to ship it, so that we can reuse it in the Jupyter Notebooks we are currently publishing and reduce the boilerplate code within them.

/cc @karynzv, @marijaselakovic, @hlcianfagna, @hammerhead, @WalBeh, @andnig, @ckurze, @vvulf

@codecov
codecov bot commented Nov 7, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Files Coverage Δ
cratedb_toolkit/util/database.py 92.78% <100.00%> (+1.75%) ⬆️


@amotl amotl force-pushed the amo/add-import-csv branch 2 times, most recently from 7a2cb59 to 94b0cdd Compare November 7, 2023 20:01
@amotl amotl requested review from matriv and seut November 7, 2023 20:04
@amotl amotl marked this pull request as ready for review November 7, 2023 20:09
@amotl amotl force-pushed the amo/add-import-csv branch from 94b0cdd to f60d9d3 Compare November 7, 2023 20:13
pbar = ProgressBar()
pbar.register()

# Mangle data.

Sorry, I don't get it. What do you mean by "mangle"?

Member Author

@amotl amotl Nov 8, 2023

It's probably a non-word, apologies. I've changed the comment to »Load data into database.« now.

Comment on lines 230 to 231
# TODO: Use amount of CPU cores instead?
npartitions = npartitions or 4
Member

Sounds good, let's do it? :)
Or are there any objections?

Suggested change
- # TODO: Use amount of CPU cores instead?
- npartitions = npartitions or 4
+ npartitions = npartitions or os.cpu_count()

Member Author

Added with ffbb1a0.
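
One detail worth keeping in mind with the suggestion above: `os.cpu_count()` is documented to return `None` when the number of CPUs cannot be determined. A minimal sketch of a defensive default follows; `resolve_npartitions` and its `fallback` parameter are hypothetical names for illustration, not what the commit actually contains.

```python
import os


def resolve_npartitions(npartitions=None, fallback=4):
    """Pick a partition count: explicit value, else CPU count, else fallback.

    Hypothetical helper. `os.cpu_count()` may return None, so a final
    fallback keeps the result a usable integer in every case.
    """
    return npartitions or os.cpu_count() or fallback


explicit = resolve_npartitions(8)
automatic = resolve_npartitions()
```

The previous hard-coded value of 4 is reused as the fallback here, so behaviour stays the same on platforms where the CPU count is unavailable.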

@amotl amotl force-pushed the amo/add-import-csv branch from ffbb1a0 to 2f7bcd6 Compare November 8, 2023 11:24
Base automatically changed from amo/add-run-sql to main November 8, 2023 11:26
@amotl amotl force-pushed the amo/add-import-csv branch from 2f7bcd6 to 9259e60 Compare November 8, 2023 11:27
@amotl amotl force-pushed the amo/add-import-csv branch from 9259e60 to e93902b Compare November 8, 2023 11:27
@amotl amotl merged commit 27ef4ad into main Nov 8, 2023
@amotl amotl deleted the amo/add-import-csv branch November 8, 2023 11:37