add support for SFTP by delegating to an OpenSSH sftp-server subsystem running in the user pod. Doesn't support SCP, yet.
yuvipanda
left a comment
This is beautiful :) I've left a couple comments about approaches...
How do you think we can test this? I think the project is now at a stage where tests are becoming necessary...
Either way, this is amazing work! I'm very excited to land this!
    import asyncssh.stream
    import asyncssh.sftp

    def run_override(sftp_server, reader, writer):
Do you think we can turn this into a function we explicitly call somewhere, instead of automatically running on import? I don't have much experience with monkey patching, however - so if you think this is the best way to do this, I'm ok with that.
I think it's fine to have this in a function, probably better to make it explicit
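A minimal sketch of the explicit approach discussed above: the patch is applied only when a function is called, not as a side effect of importing the module. The names `fake_lib`, `run_server`, and `install_override` are hypothetical stand-ins for illustration; this does not use the asyncssh API or name the function that the PR actually patches.

```python
import types

# Stand-in for a third-party module; this is NOT asyncssh itself.
fake_lib = types.ModuleType("fake_lib")
fake_lib.run_server = lambda: "library behaviour"


def run_override():
    """Replacement for the library function (hypothetical)."""
    return "patched behaviour"


def install_override(module=fake_lib):
    """Apply the monkey patch explicitly and return the original callable.

    Returning the original lets callers (or tests) restore it later.
    """
    original = module.run_server
    module.run_server = run_override
    return original
```

Calling `install_override()` once during server startup makes the patch visible at the call site, and keeping the returned original makes it easy to undo in a test.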
    parent=self.parent, namespace=self.namespace,
    username=self.username).pod_name
    # ensure sftp-server binary is copied into the user pod
    self._run_setup_command(
Instead of running kubectl cp, how about:
- We create a small container image that just has sftp-binary
- Set up an emptyDir volume by default in our launched pod
- Run an initContainer with our sftp-binary image, where we copy the binary to the emptyDir.
- When running kubectl exec, we just refer to the binary path in the emptyDir volume, which we will know.
This has a few advantages over kubectl cp:
- sftp-binary image is pulled only once per node, and should be pretty small
- The initContainer should execute pretty quickly, since the image is already there
- We can put other binaries (or whole packages) we want in the emptyDir volume later if we need
- SFTP can begin immediately, without us needing to copy a binary in.
What do you think of this suggestion?
Great idea! Plus, it removes the need to have tar available in the user image, as would be needed for kubectl cp to work.
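The initContainer suggestion could be sketched roughly as follows, building the pod spec as a plain dict. The image name, volume name, mount path, and the binary's location inside the image are all assumptions made for illustration; only the Kubernetes field names (`initContainers`, `volumes`, `emptyDir`, `volumeMounts`) and the `kubectl exec` flags are standard.

```python
import copy


def add_sftp_init_container(pod_spec,
                            image="example/sftp-server:latest",  # hypothetical image
                            mount_path="/opt/kubessh-bin"):      # hypothetical path
    """Return a copy of pod_spec with an emptyDir volume that an
    initContainer fills with the sftp-server binary."""
    spec = copy.deepcopy(pod_spec)
    mount = {"name": "sftp-bin", "mountPath": mount_path}
    spec.setdefault("volumes", []).append({"name": "sftp-bin", "emptyDir": {}})
    spec.setdefault("initContainers", []).append({
        "name": "copy-sftp-server",
        "image": image,
        # Source path inside the image is an assumption.
        "command": ["cp", "/usr/lib/openssh/sftp-server",
                    f"{mount_path}/sftp-server"],
        "volumeMounts": [dict(mount)],
    })
    # Mount the same volume into every user container so the binary is visible.
    for container in spec.get("containers", []):
        container.setdefault("volumeMounts", []).append(dict(mount))
    return spec


def sftp_exec_command(pod_name, namespace, mount_path="/opt/kubessh-bin"):
    """Build the kubectl exec argv that starts sftp-server from the shared volume."""
    return ["kubectl", "exec", "-i", "-n", namespace, pod_name,
            "--", f"{mount_path}/sftp-server"]
```

Because the binary arrives through the shared volume, nothing has to be copied at connection time and the user image does not need tar.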
    config=True
    )

    def __init__(self, server, reader, writer):
Do you think you can write docstrings for each of these methods? It would be very helpful in decoding what exactly is happening here.
    def _run_setup_command(self, *kubectl_command):
        self.logger.info(f'executing: {kubectl_command}')
        cmd = subprocess.Popen(kubectl_command)
Can this be async? I think .wait() is blocking, so this would cause the entire server (not just this connection) to block until the request completes. Also I'm slightly confused about us starting the process and immediately waiting for it to terminate. Can you expand a little on what that does?
You're right, this should probably be async. It only executes one-off setup commands, such as copying the sftp binary into the pod and making sure it has execute permissions, which is why it waits for the process to terminate right away. With the volume-based solution you suggested, none of these steps might be needed anymore, though :)
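A non-blocking version of such a one-off setup command might look like the sketch below, which uses `asyncio.create_subprocess_exec` from the standard library in place of `subprocess.Popen` plus `.wait()`. The function name and the error handling are assumptions, not the code from this PR.

```python
import asyncio
import sys


async def run_setup_command(*command):
    """Run a one-off command without blocking the event loop.

    Other connections keep being served while this coroutine awaits
    the child process.
    """
    proc = await asyncio.create_subprocess_exec(*command)
    returncode = await proc.wait()
    if returncode != 0:
        raise RuntimeError(f"setup command failed ({returncode}): {command}")
    return returncode


async def main():
    # Demo with a harmless command; a real caller would pass kubectl arguments.
    await run_setup_command(sys.executable, "-c", "pass")


if __name__ == "__main__":
    asyncio.run(main())
```

Raising on a non-zero exit code is a design choice in this sketch: it surfaces a failed setup step instead of letting the SFTP session start against a pod that is not ready.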
Regarding testing: I've kept the SFTP-related code in the superclass that doesn't know about kubernetes yet, because I thought at least that part should be fairly straightforward to write an automated test for. Will include that, too!
Awesome! I'm really excited by where this can go!
Hello! Thank you for your interest :) I'm currently focusing on jupyterhub-ssh instead, so I've marked this project as inactive. This PR was great, and I'm very grateful for your work here. I have SFTP working there, but with many more constraints: https://github.com/yuvipanda/jupyterhub-ssh/tree/main/jupyterhub-sftp. Hope you'll participate there too. Thank you! <3
See #26
This adds basic support for SFTP by delegating to an OpenSSH sftp-server subsystem running in the user pod. It is still a work in progress: getting this to work with asyncssh is a little ugly at the moment and involves monkey-patching a function in the asyncssh code. SFTP file transfers into and out of the user pod work fine in my initial trials so far, but are a bit slow (<10 MB/s over a gigabit line), maybe due to the rather verbose logging. SCP doesn't work yet: if you simply set allow_scp=True, the SCP file transfers end up in the kubessh pod instead of the user pod. I'm also not 100% sure whether downloading the sftp-server binary from GitHub during the Docker build is the best way to get it in there. Comments welcome :)