Skip to content

Conversation

flying-sheep
Copy link

@flying-sheep flying-sheep commented May 26, 2025

See #9076 (comment)

Closes #9075

  • No tests necessary
  • Passes pre-commit run --all-files

Companion PR to dask/dask#11966

Copy link
Contributor

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

    27 files  ±0      27 suites  ±0   11h 32m 11s ⏱️ + 15m 21s
 4 113 tests ±0   3 997 ✅ +1    111 💤 ±0   5 ❌ ±0 
51 569 runs   - 1  49 256 ✅ ±0  2 284 💤 ±0  29 ❌ ±0 

For more details on these failures, see this check.

Results for commit 492099e. ± Comparison against base commit 801d0ed.

Copy link
Member

@guillaumeeb guillaumeeb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As indicated in dask/dask#11966 (review), I agree that these default configuration can be remove, but Dask and Distributed documentation should clearly reflect this change (it could have impact for some users, especially in HPC centers).

So pointing to how to configure this in distributed doc is mandatory to me.

@flying-sheep
Copy link
Author

as you’ve figured out, that’s why the companion PR dask/dask#11966 is linked above.

so in other words you’re saying “LGTM”?

@dchudz
Copy link
Contributor

dchudz commented Sep 26, 2025

I'm not trying to block (not something I can do anyway), but I do want to mention:

I'm a bit concerned about the impact here for anyone using Dask and Numpy together for non-trivial operations (if they don't notice this change and set the variable themselves, they could have either terrible performance, or process failures that are very difficult to understand).

Preferring all or none does make make logical sense to me, but the tradeoff with this change isn't totally clear to me.

(Context/disclosure on me: I work in engineering at Coiled, which supports a number of Dask customers. I expect this change would create some difficulty for us & our customers, but it's not just us that I'm worried about.)

@guillaumeeb
Copy link
Member

so in other words you’re saying “LGTM”?

@flying-sheep sorry I wasn't clear enough, I mean that distributed documentation, so in this repository, should mention this change, and how a user can configure it back, I guess by adding by himself the correct pre-spawn-environ value in the configuration, either in Python code or Yaml files.

I'm a bit concerned about the impact here for anyone using Dask and Numpy together for non-trivial operations (if they don't notice this change and set the variable themselves, they could have either terrible performance, or process failures that are very difficult to understand).

Preferring #9076 (comment) does make make logical sense to me, but the tradeoff with this change isn't totally clear to me.

@dchudz I agree with you, hence my comment, I think a change like that could have some hard to debug impact on existing workloads. So I guess we should choose between all or none:

  • if none, properly document and warn about this change, explain how to configure it back if needed,
  • if all, still add some documentation on how to remove this variables for experts users who want to benefit from core libraries multi threading.

@dchudz
Copy link
Contributor

dchudz commented Sep 26, 2025

Thanks @guillaumeeb. I'll say I'm still a bit worried even if we properly document and warn about this change, explain how to configure it back if needed. But I know it's not my decision.

@flying-sheep
Copy link
Author

flying-sheep commented Sep 28, 2025

distributed documentation, so in this repository, should mention this change

documentation here doesn’t mention this behavior at all, so I changed it at the only location where it is documented instead. where would I add it?

@jacobtomlinson
Copy link
Member

jacobtomlinson commented Sep 29, 2025

I feel like this is a really tough situation because it's a footgun if we do and a footgun if we dont.

I wonder if there is somewhere more visible that we can highlight this? Could we raise a warning somwhere that makes sense? Could we show these values discretely in the dask.array repr, or on the dashboard somewhere?

How can we make this easier to debug?

@flying-sheep
Copy link
Author

I think that’s for dask maintainers to decide. I just walked into the footgun and wanted to be helpful by doing a quick PR with the changes that a maintainer agreed to. Since this seems to be a bigger discussion now, I feel like the dask devs should come to a conclusion.

I’ll happily do small, clearly stated change requests to my PRs, but I won’t change anything until there has been a decision.

@jacobtomlinson
Copy link
Member

Given this change was originally introduced by Coiled I'm keen to hear @dchudz's thoughts. Also if other maintainers are around and have opinions I'd love to hear them @guillaumeeb @quasiben @jrbourbeau

@dchudz
Copy link
Contributor

dchudz commented Sep 29, 2025

Given this change was originally introduced by Coiled

Looks like that's this PR (#5098), in case anyone else is looking for the context.

I'm keen to hear @dchudz's thoughts.

My gut is to leave it alone. It seems to me that the reasoning from that PR still applies. To me it seems that removing these would cause more user pain than it solves (although as @jacobtomlinson notes, I recognize there's user pain in both directions).

I understand that would leave the question of whether to add more, and a kind of inconsistency if we don't. My view on whether to add more is less strong, but I don't really think it's necessary to be consistent by adding every similar env var. I'd be inclined to suggest only thinking about it in response to common user pain.

@guillaumeeb
Copy link
Member

To be fair, setting these variables for users (without novice users really knowing it) is also common practice on HPC systems. I don't think there really is a solution better than the other.

However, I would at least make sure that this setting is also visible in distributed docs somewhere, currently it looks hard to find or understand and might cause problem to a (really) few proportions of expert users.

I would also tend a bit to put all the other variables identified there in default configuration, especially if they have an effect when using Numpy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support NUMBA_NUM_THREADS env variable

4 participants