Conversation


@moi90 moi90 commented Oct 10, 2025

Closes #9122

  • Tests added / passed
  • Passes pre-commit run --all-files


moi90 commented Oct 10, 2025

I would still need help to write a useful test...


github-actions bot commented Oct 10, 2025

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

    27 files ±0       27 suites ±0      9h 46m 19s ⏱️ +3m 6s
 4 113 tests  +1    4 006 ✅  −1      104 💤 ±0     3 ❌  +2
51 529 runs  +13   49 330 ✅  ±0    2 184 💤 −1    15 ❌ +14

For more details on these failures, see this check.

Results for commit e54b412. ± Comparison against base commit c9e7aca.

♻️ This comment has been updated with latest results.


@jacobtomlinson jacobtomlinson left a comment


Given that this is a PR to distributed, we don't need to do this in a plugin; this could happen directly in the Cluster object for handing off the key and in the Worker object for receiving it. We could use the Dask config as the transport for the key, to avoid injecting it into the environment all the time.

For testing we could just ensure the key is successfully set, transmitted, and received.
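A minimal sketch of the config-as-transport idea, using a plain dict to stand in for Dask's config; the key name "distributed.worker.authkey" is a hypothetical example, not a real Dask option:

```python
import multiprocessing.process

# Stand-in for dask.config: a plain dict. The key name is hypothetical.
config = {}

# Cluster side: hand off the current authkey via config. Hex-encode it,
# since the raw key is bytes and config values should be plain text.
config["distributed.worker.authkey"] = bytes(
    multiprocessing.process.current_process().authkey
).hex()

# Worker side: install the received key before any multiprocessing use, so
# authkey-based connections authenticate against the parent's key.
multiprocessing.process.current_process().authkey = bytes.fromhex(
    config["distributed.worker.authkey"]
)
```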

@moi90 moi90 marked this pull request as draft October 14, 2025 10:38

moi90 commented Oct 14, 2025

Could you give me a hint on how to test this properly?

The following works instantly (without any change), because the cluster started by gen_cluster likely relies on the multiprocessing module, which of course correctly forwards the authkey to child processes.

from distributed.utils_test import gen_cluster


def _get_authkey():
    # Runs on the worker: report the worker process's authkey.
    # Cast to bytes because AuthenticationString refuses to be pickled.
    import multiprocessing.process

    return bytes(multiprocessing.process.current_process().authkey)


@gen_cluster(client=True)
async def test_authkey(c, s, a, b):
    import multiprocessing.process

    worker_authkey = await c.submit(_get_authkey)

    assert worker_authkey == multiprocessing.process.current_process().authkey

Is there a factory for a cluster that is started using, say, subprocess instead?

@jacobtomlinson
Member

I don't think there is a factory, but you could use subprocess to run the dask scheduler and dask worker commands separately.
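A small stdlib-only sketch of why a subprocess-based setup makes the test meaningful: a child started via subprocess is a fresh interpreter with its own random authkey, unlike children created through the multiprocessing module, which inherit the parent's. An actual test would launch the dask scheduler and dask worker commands the same way.

```python
import multiprocessing.process
import subprocess
import sys

# Child process prints its own authkey, hex-encoded, to stdout.
child_code = (
    "import multiprocessing.process; "
    "print(bytes(multiprocessing.process.current_process().authkey).hex())"
)
child_key = bytes.fromhex(
    subprocess.run(
        [sys.executable, "-c", child_code],
        capture_output=True,
        text=True,
        check=True,
    ).stdout.strip()
)
parent_key = bytes(multiprocessing.process.current_process().authkey)

# Without explicit forwarding the keys differ, so authkey-based
# authentication between the two processes would fail.
assert child_key != parent_key
```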

@moi90
Contributor Author

moi90 commented Oct 15, 2025

you could use subprocess

That seems to work, thanks!

We could use the Dask config as the transport for the key

Is this secure? Is the config visible or even serialized to disk? The Python developers make a big deal about the fact that the authkey must always remain inaccessible from outside...

@jacobtomlinson
Member

Is this secure?

When you add it to the config in a Python process, it is only held in memory; we never write config back to disk, and it is only read once, when dask.config is imported.

When workers are launched, it would be transferred via environment variables, which is the same as the original proposal here.
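The environment-variable transfer could look like the following sketch; the variable name DASK_MULTIPROCESSING_AUTHKEY is an assumption for illustration, not an existing Dask variable.

```python
import multiprocessing.process
import os

# Launcher side: export the authkey before spawning workers. Hex-encode it,
# since the raw key is bytes and environment values must be text.
# DASK_MULTIPROCESSING_AUTHKEY is a hypothetical variable name.
os.environ["DASK_MULTIPROCESSING_AUTHKEY"] = bytes(
    multiprocessing.process.current_process().authkey
).hex()

# Worker side: install the received key early, before any multiprocessing
# connections are made in the worker process.
key = os.environ.get("DASK_MULTIPROCESSING_AUTHKEY")
if key:
    multiprocessing.process.current_process().authkey = bytes.fromhex(key)
```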



Development

Successfully merging this pull request may close these issues.

Forward multiprocessing.current_process().authkey to workers

2 participants