
BUG: Python 3.13 migration stuck on pyarrow due to wrong metadata #3198

Open
h-vetinari opened this issue Nov 23, 2024 · 10 comments

Comments

@h-vetinari
Contributor

Just noticed on the status page that pyarrow is still marked as awaiting parents, specifically numba. However, pyarrow was already migrated for 3.13 a while ago, and we specifically excluded sparse (which causes the dependence on numba) for py<313. So the bot is picking up incorrect metadata for the graph here.
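The exclusion being discussed would look something like the following in a recipe's requirements. This is an illustrative fragment, not the actual pyarrow/arrow-cpp `meta.yaml`; the point is the `# [py<313]` selector, which drops the dependency only for Python 3.13 and newer:

```yaml
requirements:
  run:
    - numpy
    # optional dep excluded on 3.13+ because its own stack (numba) lags
    - sparse  # [py<313]
```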

CC @beckermr

@beckermr
Contributor

The bot won't understand that exclusion or how to deal with it.

@beckermr
Contributor

You'll need to do one of 2+ things:

  1. extend the bot's logic to understand package dependent node links in the graph (hard)
  2. wait until numba has py313 support (easy)
  3. your clever idea here...

The advantage of item 2 is that you don't have to go around special-casing recipes.

@h-vetinari
Contributor Author

h-vetinari commented Nov 23, 2024

Huh? This surely worked in the past. We did the exact same thing (conda-forge/arrow-cpp-feedstock@403ae86) for 3.12, and the migration then moved on.

@beckermr
Contributor

I am surprised and I think that must have been something else. The bot aggregates the links between feedstocks over all variants of the recipe and so would have had the link. See the logic here: https://github.com/regro/cf-scripts/blob/main/conda_forge_tick/feedstock_parser.py#L289. That logic has been in place for at least three years, if not longer.
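To illustrate why the selector-excluded dependency still shows up in the graph: if the bot unions the requirements of every rendered variant, a dependency that appears in even one variant (e.g. only for `py<313`) ends up in the node's edge set. This is a minimal sketch of that aggregation behavior, with illustrative names, not the bot's actual API from `feedstock_parser.py`:

```python
def aggregate_requirements(variants):
    """Union the run requirements over all rendered variants of a recipe."""
    combined = set()
    for variant in variants:
        combined |= set(variant.get("run", []))
    return combined

# Two hypothetical ci_support variants: sparse is only required for py<313
variants = [
    {"python": "3.12", "run": ["numpy", "sparse"]},
    {"python": "3.13", "run": ["numpy"]},
]

deps = aggregate_requirements(variants)
# "sparse" survives aggregation, so the pyarrow -> sparse -> numba
# chain exists in the graph even though the 3.13 variant doesn't need it
```

Under this model, the only way the edge would have been missing in the past is if a single variant (rather than the union) was used when building the graph.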

@h-vetinari
Contributor Author

This surely worked in the past.

Well, perhaps it was an accident then. Could it depend on which python version (CONDA_PY?) is used when rendering recipes to generate the graph? Not sure I followed all the trails correctly, but the following calls into conda-build

config = conda_build.config.get_or_merge_config(
    None,
    platform=platform,
    arch=arch,
    variant_config_files=[
        cbc_path,
    ],
)
_cbc, _ = conda_build.variants.get_package_combined_spec(
    tmpdir,
    config=config,
)

so this could play a role?

@h-vetinari
Contributor Author

Specifically for new python migrations, if/once the bot runs on that newest version (here 3.13), it would ignore this case. This could explain how it ended up "working" previously.

@beckermr
Copy link
Contributor

The bot is not supposed to use any of the CONDA_PY variables and those are only set by conda build during some, but not all, operations.

The intent is that each recipe is rendered directly with each of its ci_support files.

@h-vetinari
Contributor Author

Understood about the intent, but this was actually a very good "accidental" feature. We're migrating python once per year, and it would be good to be able to keep optional dependencies without throwing the bot out of whack.

How would

extend the bot's logic to understand package dependent node links in the graph (hard)

work for a given migration (say python again)? Wouldn't that create an explosion of separate graphs? Or we'd have one graph, but map the selectors to the edges somehow, in a way that we can query the graph while ignoring certain edges?

Something like

graph_dup = copy.deepcopy(graph)  # the reference graph with edges that have metadata
all_platforms = # ...
plat_allowed = migrator.get("platform_allowlist", all_platforms)
wrong_plat = get_edge_to_plat(graph_dup, plat_allowed)  # edges that aren't in allowlist
graph_dup.remove_edges_from(wrong_plat)
# specific to python migrations: ignore edges that don't apply to the
# newest python (e.g. `# [py<313]` if 3.13 is newest)
if migration.startswith("python"):
    # not sure how to determine newest for now
    wrong_py = get_edge_to_py(graph_dup, "3.13")
    graph_dup.remove_edges_from(wrong_py)

# rest of migrator processing
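The sketch above can be made concrete with plain dicts standing in for the bot's networkx graph: store the selector constraint as edge metadata, then prune non-matching edges on a copy before the migrator queries it. All names here (`max_py`, `prune_for_python`) are hypothetical, not part of cf-scripts:

```python
import copy

# adjacency: node -> {neighbor: edge metadata}; max_py encodes a
# `# [py<MAX]` selector on that dependency, None means unconditional
graph = {
    "pyarrow": {"sparse": {"max_py": (3, 13)}, "numpy": {"max_py": None}},
    "sparse": {"numba": {"max_py": None}},
}

def prune_for_python(graph, py_version):
    """Return a copy with edges dropped whose selector excludes py_version."""
    pruned = copy.deepcopy(graph)
    for node, neighbors in pruned.items():
        for nbr in list(neighbors):
            max_py = neighbors[nbr]["max_py"]
            if max_py is not None and py_version >= max_py:
                # e.g. a `# [py<313]` edge does not apply under 3.13
                del neighbors[nbr]
    return pruned

pruned = prune_for_python(graph, (3, 13))
# pyarrow no longer waits on sparse (and transitively numba) for 3.13,
# while unconditional edges like pyarrow -> numpy are kept
```

This keeps a single graph with annotated edges rather than one graph per variant, which avoids the combinatorial explosion mentioned above; the hard part remains extracting the selector metadata reliably during parsing.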

@h-vetinari
Contributor Author

You'll need to do one of 2+ things:

Simplest solution for now is conda-forge/pyarrow-feedstock@1a16682. Though I'd still be very interested in finding a way that optional dependencies with a # [py<313] selector don't end up hanging the bot.

@beckermr
Contributor

Yeah that’s the idea. It’s not an easy thing to do for sure.
