Reliable list_running_servers(runtime_dir) design #5716

abaelhe · 2020-08-31T11:14:40Z

active check whether that process is really available via real HTTP request;
process pid maybe recycle and reused by host Operating System that i encountered;
and we must use real HTTP request other than the past check_pid(info['pid']).

kevin-bates

Thanks for opening this pull request. I understand why this is being done, although I'm curious how often you're finding the PIDs getting recycled (I know some OSs recycle much more frequently than others) and wonder why you're encountering "stale" nbserver files in the first place. How are you terminating the Notebook server normally? Is there some interaction that is using SIGKILL instead of SIGTERM or a larger problem where the server is improperly terminating? I suppose it takes just one bad apple (i.e., stale PID) and things can go wrong, and, I agree, this approach helps mitigate that - I'm just not sure about the trade-off.

I've provided a few comments and observations that I hope you find helpful. I didn't comment on the test code and it appears some socket-based tests are failing which rely on the list and stop functionality.

kevin-bates · 2020-09-01T16:31:49Z

notebook/notebookapp.py

-    Returns True if the server was stopped by any means, False if stopping it
-    failed (on Windows).
-    """
+def kernel_request(server_info, path='/login', method='GET', body=None, headers=None, timeout=5, log=None):


This method should be renamed since its unrelated to kernel. Perhaps server_request?

Also, please remove the default value from path since that should be explicitly provided by the caller (which is already the case anyway).

This method should be renamed since its unrelated to kernel. Perhaps server_request?

we creator have this naming right;
with respect to Guido name Python; and their first creator name IPython and Jupyter also Anaconda;
i am working on my IPython Swift Kernel, kernel in my brain;
don't mind, different think lead sparks.

Also, please remove the default value from path since that should be explicitly provided by the caller (which is already the case anyway).

caller will provide their; this default value does not conflict to different path;
we use '/login' as 'default value' to remind caller authentication first.

This method needs to change and the default for path either removed or updated to /api

Yeah, I agree with @kevin-bates here. This method needs to be renamed to something like api_request before we can accept it. It's not requesting a response from a "kernel" (as we define it in the Jupyter ecosystem). It's just a general server API ping.

I also agree, the default path should probably be /api.

who create new who own naming rights.
answered.

The name matters here. It obscures this library's API if function names are not clear.

kernel_request is the wrong name for this function. In the Jupyter ecosystem, kernel refers specifically to the underlying language kernel being executed by calls to the server's kernel API. This method is not making a kernel request. It's making a generic request the server's REST API. Let's generalize this function name to something like server_request or api_request.

particularly because Jupyter has a notion of kernels that is specific.

I'm setting this conversation to unresolved, since this change is necessary.

Suggested change

def kernel_request(server_info, path='/login', method='GET', body=None, headers=None, timeout=5, log=None):

def api_request(server_info, path='/api', method='GET', body=None, headers=None, timeout=5, log=None):

notebook/notebookapp.py

abaelhe · 2020-09-01T20:35:03Z

Thanks for opening this pull request. I understand why this is being done, although I'm curious how often you're finding the PIDs getting recycled (I know some OSs recycle much more frequently than others) and wonder why you're encountering "stale" nbserver files in the first place. How are you terminating the Notebook server normally? Is there some interaction that is using SIGKILL instead of SIGTERM or a larger problem where the server is improperly terminating? I suppose it takes just one bad apple (i.e., stale PID) and things can go wrong, and, I agree, this approach helps mitigate that - I'm just not sure about the trade-off.

i am building a Dockerfile on RedHat's quay.io, to deploy my Cloud based Amazon.com Crawler on Mac/Linux/...;
in the meantime; god or the Operating System selected my dear "jupyter-notebook" process;
meanwhile, i am making my IPython Kernel for Apple Swift Language(Python/C++/Swift/LLVM/CLang/LLDB... );
and i choose this reliable and better design and implement as my first milestone;

I've provided a few comments and observations that I hope you find helpful. I didn't comment on the test code and it appears some socket-based tests are failing which rely on the list and stop functionality.

your sparks are welcome.

Accepted, Thanks. Co-authored-by: Kevin Bates <[email protected]>

Accepted. Co-authored-by: Kevin Bates <[email protected]>

kevin-bates · 2020-09-01T22:18:29Z

I just now found that Lab returns yet a different title. This implies any front-end extension will have a different "title", making this approach too fragile. Since those applications could also rely on jupyter notebook list and jupyter notebook stop <port/socket> functionality, I believe a different approach should be adopted and, again, question the utility when, in practice, PID reuse would be rare in normal circumstances (i.e., when the notebook server is stopped gracefully).

…not self.sock

abaelhe · 2020-09-02T07:44:29Z

reopen

kevin-bates

Thanks @abaelhe - this is getting much closer. I'm still slightly concerned about requiring the use of the same notebook version, but I'm not sure that's a very big deal. It's certainly better than terminating the wrong process! To that point, and as I noted in the review, we should probably eliminate the PID-based termination altogether for this PR.

notebook/notebookapp.py

Co-authored-by: Kevin Bates <[email protected]>

abaelhe

Thanks @abaelhe - this is getting much closer. I'm still slightly concerned about requiring the use of the same notebook version, but I'm not sure that's a very big deal. It's certainly better than terminating the wrong process! To that point, and as I noted in the review, we should probably eliminate the PID-based termination altogether for this PR.

really useful key point;
i will change the "/api" protocol, define a better policy for jupyter determination;
and i still invite you as the reviewer.

notebook/notebookapp.py

notebook/base/handlers.py

abaelhe · 2020-09-03T17:13:00Z

Thanks @abaelhe - this is getting much closer. I'm still slightly concerned about requiring the use of the same notebook version, but I'm not sure that's a very big deal. It's certainly better than terminating the wrong process! To that point, and as I noted in the review, we should probably eliminate the PID-based termination altogether for this PR.

really useful key point;
i will change the "/api" protocol, define a better policy for jupyter determination;
and i still invite you as the reviewer.

commit 978d883
Author: git [email protected]
Date: Thu Sep 3 21:19:26 2020 +0800

Multi-instance and multi-version of Jupyter Notebook

commit 07570c6
Author: Abael He [email protected]
Date: Thu Sep 3 08:11:22 2020 +0800

kevin-bates · 2020-09-03T21:18:12Z

Hi @abaelhe - thank you for your patience.

I believe this is getting too complicated for the original goals.

List running servers without using an HTTP request instead of PID.
Do not terminate any process that may have reused the PID.

Instead of adding a version to the /api request and force upgrades, could we just check that the response succeeded? I think we can change to use /api/status later so useful information can be displayed when listing the servers, but for now, I think we just need to check the success status.

Let's try to keep these changes as simple as possible. Does that work for you?

abaelhe · 2020-09-04T03:54:08Z

Hi @abaelhe - thank you for your patience.

I believe this is getting too complicated for the original goals.

List running servers without using an HTTP request instead of PID.

we are on the way to build a reliable ipython/jupyter; a right way requires a bit efforts;

Do not terminate any process that may have reused the PID.

no matter reused process is processing a billions transaction or just hello world;
we MUST respect to user's ownner rights.
especially ipython/jupyter are used to process valuable massive data at most time;

Instead of adding a version to the /api request and force upgrades, could we just check that the response succeeded? I think we can change to use /api/status later so useful information can be displayed when listing the servers, but for now, I think we just need to check the success status.
Let's try to keep these changes as simple as possible. Does that work for you?

the response status is success if response.body marked with JUPYTER_NOTEBOOK_TAG;
we already did that in a reliable and fast "/api" way;
i can't achieve it without you.

kevin-bates

Thank you @abaelhe. There is no need to alter the REST API for this change. Let's just use the fact that a response was successful and issue a message if the version strings don't match.

I still ask that the name and signature of kernel_request() be changed. Thank you.

notebook/base/handlers.py

notebook/notebookapp.py

kevin-bates · 2020-09-04T14:46:52Z

notebook/notebookapp.py

-    Returns True if the server was stopped by any means, False if stopping it
-    failed (on Windows).
-    """
+def kernel_request(server_info, path='/login', method='GET', body=None, headers=None, timeout=5, log=None):


This method needs to change and the default for path either removed or updated to /api

notebook/notebookapp.py

Zsailer · 2020-09-04T18:15:03Z

kernel_request() MUST retrieve Jupyter Notebook identification tag before or after authentication.
and "api/status" is an AUTHENTICATED ONLY call.

Ah, I see. You're right. That is an issue. I agree with you, the /api endpoint should probably return more specific information about the application (namely, Jupyter Notebook).

Here's the problem... changing the response model is a significant REST API breaking change. Such changes take longer to review for us (at Jupyter). As I mentioned before, this type of change affects multiple other applications. In order to accept it, we would have to get buy-in from the larger community. That could take weeks (or even months).

What you're trying to achieve in this PR is a rather small change (though much appreciate!!). To move forward, I see two possible choices:

Drop the API breaking change to the response model in this PR and merge the api_request function that pings the server to verify it's running when list_running_servers is called. Then, sometime in a later PR, we can lobby to change the /api endpoint and improve this function further.
Pause any further review on this PR, propose the change to the REST API through the Jupyter Enhancement Proposals repository, wait for approval there, then return to this PR once everything is approved.

Unfortunately, I (and likely @kevin-bates) don't have the bandwidth to shepherd (2) through the process, but I'd be happy to review a proposal over there. In full disclosure, (2) will likely take much longer to happen.

If you decide (1) is the direction you want to go, I'm happy to merge this after the final review.

Thanks again, @abaelhe, for all your work here. It's great stuff!

Zsailer · 2020-09-04T22:40:55Z

notebook/notebookapp.py

-    Returns True if the server was stopped by any means, False if stopping it
-    failed (on Windows).
-    """
+def kernel_request(server_info, path='/login', method='GET', body=None, headers=None, timeout=5, log=None):


Suggested change

def kernel_request(server_info, path='/login', method='GET', body=None, headers=None, timeout=5, log=None):

def api_request(server_info, path='/api', method='GET', body=None, headers=None, timeout=5, log=None):

abaelhe

kernel_request() MUST retrieve Jupyter Notebook identification tag before or after authentication.
and "api/status" is an AUTHENTICATED ONLY call.

Ah, I see. You're right. That is an issue. I agree with you, the /api endpoint should probably return more specific information about the application (namely, Jupyter Notebook).

Here's the problem... changing the response model is a significant REST API breaking change. Such changes take longer to review for us (at Jupyter). As I mentioned before, this type of change affects multiple other applications. In order to accept it, we would have to get buy-in from the larger community. That could take weeks (or even months).

add one JUPYTER_NOTEBOOK_TAG feature is transparent to those who using "verson" field from "/api";
i don't see that is a break to REST API response model; they are compatible.

What you're trying to achieve in this PR is a rather small change (though much appreciate!!). To move forward, I see two possible choices:

less code, better product; more user.so,
focus on our main road;

Drop the API breaking change to the response model in this PR and merge the api_request function that pings the server to verify it's running when list_running_servers is called. Then, sometime in a later PR, we can lobby to change the /api endpoint and improve this function further.

Pause any further review on this PR, propose the change to the REST API through the Jupyter Enhancement Proposals repository, wait for approval there, then return to this PR once everything is approved.

Unfortunately, I (and likely @kevin-bates) don't have the bandwidth to shepherd (2) through the process, but I'd be happy to review a proposal over there. In full disclosure, (2) will likely take much longer to happen.

If you decide (1) is the direction you want to go, I'm happy to merge this after the final review.

Thanks again, @abaelhe, for all your work here. It's great stuff!

thanks to customers tolerantly using Jupyter Notebook with previous check_pid().

systematic design principle to flexible and diverse methods; shield to spear; they are all unity.
Apple's Logo keeps one bite in history;
why not place JUPYTER_NOTEBOOK_TAG at "/api" or somewhere required; for us, for users.

we break DOS into Windows; RPC into REST API; and many things.
we keep break REST API if that is reasonable;

merge into master branch first;
JUST DO IT!

notebook/notebookapp.py

Zsailer · 2020-09-08T17:16:36Z

we break DOS into Windows; RPC into REST API; and many things.
we keep break REST API if that is reasonable;

merge into master branch first;
JUST DO IT!

This isn't how we do things in the Jupyter ecosystem, for better or worse. Jupyter's REST API is an old protocol that many applications depend on. Changes require formal review through the enhancement proposal process.

In this case, we need to follow Jupyter's process. If you want to propose such a change, we'd appreciate if you'd open the discussion on the enhancement-proposals repo. I am guessing you'll get plenty of support for this idea and we'll be able to move forward fairly quickly. (I just don't have the bandwidth to champion this change myself.)

As this PR stands, we can't accept it until that process is complete or we drop the changes to the REST API.

Reliable list_running_servers(runtime_dir) design

bf52746

kevin-bates reviewed Sep 1, 2020

View reviewed changes

both TCP and UNIX socket are works.

7ae97ff

abaelhe and others added 3 commits September 2, 2020 05:05

Update notebook/notebookapp.py

9cbf95f

Accepted, Thanks. Co-authored-by: Kevin Bates <[email protected]>

Update notebook/notebookapp.py

d774b3e

Accepted. Co-authored-by: Kevin Bates <[email protected]>

consistent persistent file naming rule

ca17f0b

abaelhe added 3 commits September 2, 2020 12:11

consistent persistent file naming rule and same version check rule

a4aae8f

updated list_running_servers() and write_browser_open_file() ONLY if …

40796a6

…not self.sock

bind JUPYTER_PORT or random available port

737a963

abaelhe closed this Sep 2, 2020

nbserver-%s.json % self.file_id

7139ad6

abaelhe reopened this Sep 2, 2020

tests update for new connection file naming rule

19210cb

Zsailer self-requested a review September 2, 2020 15:34

kevin-bates reviewed Sep 2, 2020

View reviewed changes

abaelhe and others added 2 commits September 3, 2020 08:05

Update notebookapp.py

281966d

Update notebook/notebookapp.py

07570c6

Co-authored-by: Kevin Bates <[email protected]>

abaelhe commented Sep 3, 2020

View reviewed changes

notebook/notebookapp.py Show resolved Hide resolved

Multi-instance and multi-version of Jupyter Notebook

978d883

kevin-bates reviewed Sep 3, 2020

View reviewed changes

notebook/notebookapp.py Outdated Show resolved Hide resolved

kevin-bates reviewed Sep 3, 2020

View reviewed changes

notebook/base/handlers.py Show resolved Hide resolved

abaelhe added 4 commits September 3, 2020 22:43

json.loads(response.body) ---> json.loads(response.body.decode())

958151a

User Notifications for policy change of list_running_servers()

a305e0d

User Notifications for policy change of list_running_servers()

908a630

let user make choices to their data, preference or environment.

57330ff

abaelhe closed this Sep 3, 2020

abaelhe reopened this Sep 3, 2020

abaelhe requested a review from kevin-bates September 4, 2020 12:59

kevin-bates reviewed Sep 4, 2020

View reviewed changes

better message to users

40850c4

abaelhe requested a review from kevin-bates September 4, 2020 18:06

Zsailer requested changes Sep 4, 2020

View reviewed changes

abaelhe commented Sep 5, 2020

View reviewed changes

notebook/notebookapp.py Show resolved Hide resolved

abaelhe requested a review from Zsailer September 5, 2020 07:26

abaelhe mentioned this pull request Sep 9, 2020

JEP: Reliable method of Jupyter running status inspection. jupyter/enhancement-proposals#64

Open

Zsailer changed the base branch from main to 6.4.x March 7, 2022 18:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Reliable list_running_servers(runtime_dir) design #5716

Reliable list_running_servers(runtime_dir) design #5716

abaelhe commented Aug 31, 2020

kevin-bates left a comment

kevin-bates Sep 1, 2020

abaelhe Sep 1, 2020 •

edited

Loading

kevin-bates Sep 4, 2020

Zsailer Sep 4, 2020

abaelhe Sep 4, 2020

Zsailer Sep 4, 2020

Zsailer Sep 4, 2020

Zsailer Sep 4, 2020

abaelhe commented Sep 1, 2020 •

edited

Loading

kevin-bates commented Sep 1, 2020

abaelhe commented Sep 2, 2020

kevin-bates left a comment

abaelhe left a comment

abaelhe commented Sep 3, 2020 •

edited

Loading

kevin-bates commented Sep 3, 2020

abaelhe commented Sep 4, 2020 •

edited

Loading

kevin-bates left a comment

kevin-bates Sep 4, 2020

Zsailer commented Sep 4, 2020

Zsailer Sep 4, 2020

abaelhe left a comment •

edited

Loading

Zsailer commented Sep 8, 2020

	def kernel_request(server_info, path='/login', method='GET', body=None, headers=None, timeout=5, log=None):
	def api_request(server_info, path='/api', method='GET', body=None, headers=None, timeout=5, log=None):

Reliable list_running_servers(runtime_dir) design #5716

Are you sure you want to change the base?

Reliable list_running_servers(runtime_dir) design #5716

Conversation

abaelhe commented Aug 31, 2020

kevin-bates left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

abaelhe Sep 1, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

abaelhe commented Sep 1, 2020 • edited Loading

kevin-bates commented Sep 1, 2020

abaelhe commented Sep 2, 2020

kevin-bates left a comment

Choose a reason for hiding this comment

abaelhe left a comment

Choose a reason for hiding this comment

abaelhe commented Sep 3, 2020 • edited Loading

kevin-bates commented Sep 3, 2020

abaelhe commented Sep 4, 2020 • edited Loading

kevin-bates left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zsailer commented Sep 4, 2020

Choose a reason for hiding this comment

abaelhe left a comment • edited Loading

Choose a reason for hiding this comment

Zsailer commented Sep 8, 2020

abaelhe Sep 1, 2020 •

edited

Loading

abaelhe commented Sep 1, 2020 •

edited

Loading

abaelhe commented Sep 3, 2020 •

edited

Loading

abaelhe commented Sep 4, 2020 •

edited

Loading

abaelhe left a comment •

edited

Loading