Define a minimal Python version of the package containing vars_rename
#31
Conversation
… test it out" This reverts commit 9cc256b.
Codecov Report: All modified and coverable lines are covered by tests ✅

```
@@ Coverage Diff @@
##            master      #31   +/- ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files            6         6
  Lines          382       382
=========================================
  Hits           382       382
```

View full report in Codecov by Sentry.
```yaml
        run: pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE)
        shell: Rscript {0}

      - name: Build Python docs
        run: sphinx-build python/docs/source docs/python
```
Building these docs into the `docs/python` subdirectory makes them show up under the relative path `/python` in the deployed docs site.
suggestion (non-blocking): IMO it would be nice to have the R docs at `/r` for consistency, with a redirect/301 at the root to maintain old links.
I don't actually think redirects are possible on GitHub Pages, at least if this feature request is still up to date. We could implement redirects if we switched to a different static hosting provider like Netlify, but that's a bigger discussion. Let me know if you can think of any other way to accomplish something similar and I'm happy to take a crack at it.
You might be able to do client-side redirects with `http-equiv`.
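A client-side redirect like the one suggested above could be sketched as a static page committed at the old docs root. This is only an illustration of the `http-equiv` approach; the target path `/ccao/r/` is an assumption, not a path the PR defines:

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <!-- Instant client-side redirect to the relocated R docs (assumed path) -->
    <meta http-equiv="refresh" content="0; url=/ccao/r/">
    <link rel="canonical" href="/ccao/r/">
    <title>Redirecting…</title>
  </head>
  <body>
    <p>These docs have moved to <a href="/ccao/r/">/ccao/r/</a>.</p>
  </body>
</html>
```

Since this runs entirely in the browser, it works on GitHub Pages without server-side redirect support, at the cost of not returning a true 301 status.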
```
"pin","year","class","char_yrblt","char_bldg_sf","char_land_sf","char_beds","char_rooms","char_fbath","char_hbath","char_frpl","char_type_resd","char_cnst_qlty","char_apts","char_tp_dsgn","char_attic_fnsh","char_gar1_att","char_gar1_area","char_gar1_size","char_gar1_cnst","char_attic_type","char_bsmt","char_ext_wall","char_heat","char_repair_cnd","char_bsmt_fin","char_roof_cnst","char_use","char_age","char_site","char_ncu","char_renovation","char_porch","char_air","char_tp_plan"
```
I wish we could just share these fixtures with the R package, but the R package needs the fixtures to be saved as RData objects, and I feel like it would be kind of weird to bundle that in the Python package. Down to give it a try if you have a strong preference, though.
suggestion: IMO it's worth it to create a shared directory for these fixtures that the R scripts use to instantiate the `.rda` files and the Python package uses for fixtures. The dir would contain this data, saved as CSV and created via SQL (from the existing R `data-raw` scripts).
@jeancochrane This is looking great! Nice first pass, no major blockers. Looking forward to the remaining package.
You can install the released version of `ccao` directly from GitHub:

```bash
pip install "git+https://github.com/ccao-data/ccao.git#egg=ccao&subdirectory=python"
```
There doesn't seem to be a `ccao` package on PyPI. Why not push it there?
python/pyproject.toml (Outdated)

```toml
    {name = "Dan Snow", email="[email protected]"},
    {name = "Jean Cochrane", email="[email protected]"},
```
suggestion (non-blocking): I didn't make any of this package! At least put yourself first.
Kind of you! Done in cd5bc3e.
python/pyproject.toml (Outdated)

```toml
]

[project.optional-dependencies]
test = [
```
suggestion (non-blocking): I've taken to naming this `dev` instead of `test`, since `ruff` and `mypy` aren't specific to testing. I don't feel super strongly about it though.
Makes sense, done in aaebf7c!
python/.python-version (Outdated)
question (non-blocking): Why have a `.python-version` file if it's not used in the GH workflows? Just to set the default version for `uv`?
Good point, removed in 1b6bbef.
python/ccao/vars_funs.py (Outdated)

```python
vars_dict = pd.read_csv(str(_data_path / "vars_dict.csv"))

# Prefix we use to identify variable name columns in the variable dictionary
var_name_prefix = "var_name"
```
nitpick: Shouldn't this be an all caps constant?
True, done in c8f4e49.
python/ccao/vars_funs.py (Outdated)

```python
    data: list[str] | pd.DataFrame,
    names_from: str,
    names_to: str,
    output_type: OutputType | str = OutputType.INPLACE,
```
question: I'm kind of a dumdum when it comes to this kind of stuff, but what's the advantage of enumerating the possible type values like this? Why not just check them in the function body?
My thinking was that a DRYer approach would reduce maintenance burden in case we have to change the options someday or define logic that might best be owned by the param abstraction. But I think it's a bit overengineered for the current use-case so I stripped it out in 81d8c20 in favor of strings that we check in the body of the function.
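For reference, the stripped-down approach described above might look something like this sketch. The accepted option names and the helper's name are assumptions for illustration, not the actual package code:

```python
# Sketch: validate output_type as a plain string checked in the function
# body, instead of a dedicated enum type. The accepted values here are
# assumptions, not the real package's options.
VALID_OUTPUT_TYPES = ("inplace", "vector")


def validate_output_type(output_type: str) -> str:
    """Return output_type unchanged, or raise ValueError if unrecognized."""
    if output_type not in VALID_OUTPUT_TYPES:
        raise ValueError(
            f"output_type must be one of {VALID_OUTPUT_TYPES}, "
            f"got {output_type!r}"
        )
    return output_type
```

This keeps the valid options in one place (so changing them later is still a one-line edit) without the ceremony of an enum class.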
python/ccao/vars_funs.py (Outdated)

```python
    # Validate names arguments
    if not isinstance(names_from, str) or not isinstance(names_to, str):
        raise ValueError("names_from and names_to must be strings")
```
question (non-blocking): Since the function is typed, isn't this check redundant? Same goes for the other `isinstance` checks.
I was thinking that it might be helpful to double-check types in case users don't have type-checking enabled in their project, but since this package is pretty specific to us and we always enable type checking, I doubt it's worth the extra code. I removed the duplicative type checking in b837865.
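As a minimal illustration of the tradeoff discussed above (the function name is hypothetical): annotations are enforced by a static checker like mypy, not by Python at runtime, so a runtime `isinstance` guard is the only thing that catches a bad argument when type checking is skipped.

```python
def rename_pair(names_from: str, names_to: str) -> tuple[str, str]:
    # No runtime isinstance guard here: mypy would reject
    # rename_pair(1, 2) at check time, but at runtime Python
    # accepts any arguments without complaint.
    return names_from, names_to


# Annotations alone do not raise at runtime, so this call succeeds:
result = rename_pair(1, 2)  # type: ignore[arg-type]
```

For a package whose callers always run mypy in CI, dropping the runtime checks is a reasonable simplification; for a widely distributed library, the guards would catch misuse earlier and with clearer errors.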
python/uv.lock (Outdated)
question (non-blocking): Why include this lock file if the dependencies are already captured in `pyproject.toml`?
Good point, removed in 5960235.
c1c40d9 to b837865 (Compare)
I made some major changes based on your comments @dfsnow, so it would be good to get a second look before I merge!
This looks great @jeancochrane. Nice fixups!
```python
if output_type == "inplace":
    assert list(result.columns) == expected
else:

class TestVarsRename:
```
🤦 How did I not notice this wasn't in a test class. Nice catch.
This PR defines a minimal Python `ccao` package in this repo with one function ported over from R, `vars_rename`. The package lives in the `python/` subdir and uses the following tools:

- `uv` for packaging
- `ruff` for linting/formatting
- `pytest` for unit tests
- `sphinx` for docs

We add a new `pytest-coverage` GitHub workflow to ensure that tests run on PRs and commits to the main branch. We also rename the `pkgdown` workflow to `docs` and update it so that it deploys Python docs alongside R docs at the relative path `/ccao/python/`.

Closes #29.
It looks like GitHub thinks the `build-pkgdown-site` check needs to complete before merging, so it's not reporting the CI checks as being successful. I don't understand why this is happening, given that the `build-pkgdown-site` step no longer exists. Maybe there's a repo-level setting that is requiring this stage? If not, I figure we can just force merge this when it's ready and hope it's a caching bug that resolves itself.