richard fahey edited this page Nov 23, 2025 · 6 revisions

How to Run Python Scripts in GitHub Action Workflows

https://www.youtube.com/watch?v=zk4bSTD8uWM&

Python ETL repo

https://github.com/Ludwinic1/Data-Pipeline

UV

Stop Using Pip - This New Tool is 100x Faster (UV Tutorial)

https://www.youtube.com/watch?v=6pttmsBSi8M

python imports

import time
import pandas as pd
import polars as pl
import numpy as np

df = pd.read_csv("textfile.csv")

Polars is best for large tables: it is quicker and more efficient than pandas.

Keep pandas when:

Working with small datasets (< 1 GB)
Doing heavy visualization or plotting work
The team has pandas expertise and the project has tight deadlines
Using libraries that expect pandas (statsmodels, etc.)
Doing exploratory data analysis in notebooks

Polars shines when:

Data is large (> 1 GB)
Performance matters
Building data pipelines
Need to process clean data
Starting new projects

Data wrangler

https://www.youtube.com/watch?v=tSY7wXv5W-Q&t=2s
