Awesome project.
Just a short question: I have about 2000 stored dataframes now and I would like to load 500 of them as fast as possible into one Python process. Is there a batch-load function for this?

I coded something with `ThreadPoolExecutor`, and it loads 3 GB on disk into roughly a 40 GB DataFrame (which is pretty heavy) in under four minutes using 5 threads. Does somebody see a faster variant? The SSD is barely loaded; the performance limitation seems to lie in `df = item.to_pandas()`, which is CPU-intensive.
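For reference, a minimal sketch of the kind of batch loader described above, assuming a PyStore-style API where a collection object exposes `item(item_id)` and each item exposes `.to_pandas()` (the `collection` and `item_ids` names and the `load_one`/`batch_load` helpers are hypothetical, not part of the library):

```python
import pandas as pd
from concurrent.futures import ThreadPoolExecutor
from functools import partial

def load_one(collection, item_id):
    # `collection.item(...)` and `.to_pandas()` are assumed accessors,
    # modeled on the `item.to_pandas()` call quoted above.
    return collection.item(item_id).to_pandas()

def batch_load(collection, item_ids, max_workers=5):
    # Threads overlap the disk reads, but the CPU-bound to_pandas()
    # calls still contend on the GIL, matching the observed bottleneck.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        frames = list(pool.map(partial(load_one, collection), item_ids))
    # A single concat at the end avoids repeatedly reallocating a
    # growing result frame.
    return pd.concat(frames)
```

Since the reported bottleneck is CPU-bound deserialization rather than disk I/O, swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` might help by sidestepping the GIL, though each resulting frame would then be pickled back to the parent process, which can eat the gains at these sizes.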