gh-131591: Allow pdb to attach to a running process #132451

godlygeek · 2025-04-12T18:59:38Z

Initial proof-of-concept and design. This is lacking tests and documentation, but otherwise works well, and it would be very helpful if anyone interested can try it out!

It has been tested on macOS and Linux, though not yet on Windows.

Issue: Implement PEP 768 – Safe external debugger interface for CPython #131591

Signed-off-by: Matt Wozniski <[email protected]>

godlygeek · 2025-04-12T19:02:19Z

Oh, one missing feature: this doesn't currently send a SIGINT to the PID when the client gets a ctrl-c.

That shouldn't be tough to add, at least for Linux and macOS - I'm not sure if something equivalent is doable on Windows or not?

EDIT: This has been implemented in a way that works for all platforms.

pablogsal · 2025-04-12T21:14:34Z

CC @gaogaotiantian

pablogsal · 2025-04-12T21:16:36Z

@gaogaotiantian as promised this is the remote PDB initial implementation. Let’s try to get it ready for beta freeze 🤘

I suggest to try to get the basics landed first so we can iterate in parallel between all of us instead of dying on this PR

gaogaotiantian · 2025-04-14T14:24:05Z

I agree that we should merge the basic stuff in first. It's a relatively long PR so I'll need some time :)

pablogsal · 2025-04-14T15:13:19Z

I agree that we should merge the basic stuff in first. It's a relatively long PR so I'll need some time :)

Feel free to push to the PR if you have small nits or thinks like that to avoid review roundtripping

gaogaotiantian

Overall I think the approach is good. I'm not the socket expert but it seems solid to me. I would like the pdb part to be even simpler - it would be much better to provide some basic version then add features when requested, than to promise a lot and realize some of them are too complicated to make it correct. The users would already be thrilled to have a basic attachable debugger that functions, even without some rarely used commands.

It would be a nice chance to understand what the users are actually using as well. Unsupported features are much better than buggy ones.

Lib/pdb.py

gaogaotiantian · 2025-04-14T19:04:28Z

Lib/pdb.py

+    if opts.pid:
+        # If attaching to a remote pid, unrecognized arguments are not allowed.
+        # This will raise an error if there are extra unrecognized arguments.
+        parser.parse_args()


I think attach is so different than the rest, we should only have one entry to parse and do the attach. parser.parse_args() is also a bit magical here. I think we should combine this piece and the one below and just check if -m or args exists. If not, print some easy to understand message. Then do attach. Put the attaching code together and as a short-cut.

I also considered putting -p and -m in a ArgumentParser.add_mutually_exclusive_group but decided to err on the side of simplicity. That said, if you have a strong opinion on how best to do this, I'm happy to follow your lead. Want to push something that you're happy with?

I'm thinking of combining the two parts together:

if opts.pid: # If attaching to a remote pid, unrecognized arguments are not allowed. # This will raise an error if there are extra unrecognized arguments. opts = parser.parse_args() if opts.module: parser.error("argument -m: not allowed with argument --pid") attach(opts.pid, opts.commands) return # Existing code opts, args = parser.parse_known_args()

gaogaotiantian · 2025-04-14T19:45:55Z

Lib/pdb.py

+    ns: dict[str, typing.Any]
+
+
+class _RemotePdb(Pdb):


If we call the client _PdbClient, why not _PdbServer for this? I think it would clearer that they are a pair.

I think _RemotePdb makes it clearer that it is-a Pdb in the OOP sense.

_PdbServer doesn't imply anything about whether it uses inheritance or composition, but _RemotePdb does, I think.

That said, it's a private class, so there's not much need to bikeshed on the name. If you prefer _PdbServer we can switch it.

I don't think we really do this kind of OOP naming in stdlib. _RemotePdb is easily confused with a pdb that can remote to other processes. I still prefer _PdbServer as it's a clear pair with _PdbClient and it's obvious that it runs in the same process.

Lib/pdb.py

gaogaotiantian · 2025-04-14T20:20:16Z

Lib/pdb.py

+            ]
+            # fmt: on
+
+            self._send(commands_entry={"bpnum": bnum, "terminators": end_cmds})


Why do we need to send these end commands back and forth? They are constants. Also end is a very important one (arguably the most important one as it's in the example).

They're constants, but they're not necessarily the same constants for the client and the server, since they can be running different versions of Python.

If the attaching client is running 3.14.9 and the remote PDB server is running 3.14.0, it's possible that the client will think that some command should terminate the command list, and send a command list to the server that the server thinks is incomplete because it doesn't recognize that terminator. If that happened, we'd be out of sync - the server would still be waiting for more lines, and the client would think that it's done sending lines.

That said, perhaps I'm being overly defensive, and we shouldn't expect any new command to ever be added after 3.X final. But the risk of getting it wrong is leaving you with a broken debugger session that might only be able to be fixed by killing the client.

Another possible option is to be defensive in the:

for line in commands: if self.handle_command_def(line): break

loop below. We could say that loop must break, and that if it doesn't, it can only mean that we got a list that was unterminated, and so we should roll back to the old set of commands. If you prefer that over sending list of early-command-terminators back over the wire, I'm fine with that.

That'd just be something like (pseudocode):

saved = old_commands_for_breakpoint(bnum) for line in commands: if self.handle_command_def(line): break else: self.error("Incomplete command list!") set_commands_for_breakpoint(bnum, saved)

the server would still be waiting for more lines, and the client would think that it's done sending lines.

That won't happen. When the client sent the commands as a list, the server will just finish the list and consider it done for the commands. Nothing worse would happen.

Also, I don't expect the list of resuming commands to change. And as discussed elsewhere, I don't think we should support cross version debugging.

I just tested it and CTRL+D does not work either (to terminate the commands).

That won't happen. When the client sent the commands as a list, the server will just finish the list and consider it done for the commands. Nothing worse would happen.

Ah, you're right - I misremembered and thought that _RemotePdb sets self.commands_defining = True, but it doesn't (it used to in one of my earlier iterations).

OK, in that case, the only reason I can think of to send the list of resuming commands from the remote to the client is aliases. Currently I'm not handling that, and I'm just hardcoding, but in regular PDB you can:

(Pdb) alias blah cont (Pdb) import functools (Pdb) b functools.cache Breakpoint 1 at /opt/bb/lib/python3.13/functools.py:676 (Pdb) commands (com) p "hello from breakpoint" (com) blah (Pdb)

Note that blah ended the command list. The only way we can support that for remote PDB is to send the list of resuming commands from the server to the client. But, I also don't care much if we support that. As you say, if we don't, the only impact is that the user needs to a) use the regular command instead of the alias, or b) add an end command.

That seems fine to me. I think that having the resuming commands end the command list is a weird feature anyway, but I was aiming for feature parity.

So: if you care about aliases for resuming commands ending the commands list, I'll keep this code and update it to also send the list of aliases that expand to one of the commands in the list.

If you don't care about aliases for resuming commands ending the command list, I'm happy to hardcode the list client-side instead.

I just tested it and CTRL+D does not work either (to terminate the commands).

Not for regular PDB either, so that's fine.

But ctrl-c just kills the whole remote pdb client, that's wrong.

Also end is a very important one

lol oops, that's what I get for focusing on edge cases 🤣

Fixed 'end' and ctrl-c in e44a670

I believe the only "correct" way to do this is to send the commands to the server one by one - you'll never get the correct result with client parsing. For example, x = 1;;continue should end the command prompt, and the current client does not do that. Unless you parse everything exactly as the server does, you won't get the same result.

For your blah example - the current code does not handle it, and it would be complicated to make it work properly. If we want to make it right, we need to simulate the send/receive line by line.

The worse case in the current implementation is that the user needs to end the command with an end - that's not too bad. We can do it for now.

For example, x = 1;;continue should end the command prompt, and the current client does not do that.

Oof, yeah. I didn't think about ;;

If we want to make it right, we need to simulate the send/receive line by line.

I can do that. It wouldn't be too tough, and it would wind up looking a lot like interact - set a flag for the "mode", and make it be possible for the client to return to command mode (on ctrl-c or ctrl-d) or for the server to (when the command list is terminated).

I understand how to do this, and it wouldn't be difficult, but it would be almost exactly as complex as interact, for better or worse.

Addressed in 600aa05

Lib/pdb.py

These were removed from the Pdb base class in b577460

The only thing _RemotePdb needs to customize is how the recursive debugger is constructed. Have the base class provide a customization point that we can use.

This makes us robust against changes in the temporary script.

Zheaoli · 2025-04-16T19:47:56Z

To be honest, I think the TCP socket here is not good for the user. Here's the detail

For now, we use closing(socket.create_server(("localhost", 0))) to allocate a random port from the system. This means that the port can be a common port used by other services, like 3000, 9432, 9092, etc. So the service may can not bootup when the debug process has existed and the people may don't know why(I think we don't have documentation for this behavior yet)

So I think we may need to allow people setup a port allocate range to avoid the common port usage range.

But here's a better way to do this. Maybe we can let people choose to use the Unix Domain Socket(aka UDS) or the TCP Socket. For now, most of the OS platform has supported the UDS (even if Windows since 17063), So we can use UDS first and fallback to TCP if the OS platform not support UDS yet)

cc @gaogaotiantian

godlygeek · 2025-04-16T20:24:47Z

For now, we use closing(socket.create_server(("localhost", 0))) to allocate a random port from the system. This means that the port can be a common port used by other services, like 3000, 9432, 9092, etc.

So I think we may need to allow people setup a port allocate range to avoid the common port usage range.

The common port range is avoided automatically on every operating system I know of.

POSIX says:

If the sin_port value passed to bind() is zero, the port number bound to the socket shall be one chosen by the implementation from an implementation-defined port range to produce an unused local address.

On Linux that implementation-defined port range is configurable via /proc/sys/net/ipv4/ip_local_port_range but the default range is 32768 to 60999.

On Windows the port range is 49152 to 65535.

I'm having trouble finding official docs for macOS, but StackOverflow says that it's configurable like on Linux, with a default range of 49152 to 65535.

So the service may can not bootup when the debug process has existed and the people may don't know why

Statically configuring a service to start on a port that's allocated to the pool of ephemeral ports is incorrect. The configuration should be corrected, rather than assuming that no other software on the machine ever uses the widely portable feature of supplying bind() with 0 as the port.

Zheaoli · 2025-04-16T20:44:12Z

Statically configuring a service to start on a port that's allocated to the pool of ephemeral ports is incorrect. The configuration should be corrected, rather than assuming that no other software on the machine ever uses the widely portable feature of supplying bind() with 0 as the port.

This is correct, I agree with your idea. But my concern is the different Linux distro may have different default local port range out of box. If we still use TCP rather than the UDS, I think we may need mention this behavior on the documentation.

But in my side, I think the random server port is not a common usage for socket programming. When we try to communicate between the processes, UDS should be a better way to reach the target. It's more effective and nor more port conflict

Lib/pdb.py

The base class sometimes calls these methods with an exception object.

gaogaotiantian · 2025-04-18T01:30:58Z

Okay I tried to read the protocol for interact and breakpoint commands. Here are my thoughts.

I think the protocol could be simpler and easier to maintain if we reuse more of the existing structure of pdb. We can't fully mimic a stdin/stdout because that could be difficult to parse, but I don't think we need to make the state of pdb controllable from both client and server. I don't think the client needs to send different commands for "interact", "commands" and "breakpoint commands" (which is very unfortunate because it's called commands in pdb but commands is used in the protocol for normal pdb commands).

I think RemotePdb should be the only object that maintains the state, and it tells PdbClient what state it's in - either "normal"? (there might be a better word for it), "interact" or "commands". That should be an enum string that's sent through prompt protocol - it's possible we have more in the future, they'll all be a kind of states and we have a single source of truth.

Then, with the prompt protocol, the server sends this information to the client, and the client display prompts based on the mode (I don't even know if prompt itself is needed, might be). The client should only listen to the server about what the current state is, and behave solely based on that.

In order to do this, we need to give the client the capability to send "control signals" to server, mainly EOF and KeyboardInterrupt - we need a separate prototol keyword for it. This could unify the bahavior on remote pdb. For example, the commands command actually stops from Ctrl+D - it's not obvious, but try it out.

I believe with this protocol, we can do minimum changes to make commands work - not on the client side, but on server side. We shouldn't need to duplicate all the parsing loops on the client side. The client side should almost only do - listen, display, send what user types (or control signal), and listen again.

gaogaotiantian · 2025-04-23T15:57:33Z

Okay sure. I'm good with creating an issue for the bug so we won't forget it and working on landing this PR.

gaogaotiantian · 2025-04-23T15:58:40Z

Oh, we definitely need a whatsnew entry for it.

pablogsal · 2025-04-23T15:59:27Z

Oh, we definitely need a whatsnew entry for it.

I am working on more tests, will be pushing them soon

pablogsal · 2025-04-23T18:03:10Z

@gaogaotiantian what's new entry added and added some tests that cover most of what is not super complicated to test. Do you mind formally approving so I know you are fine to land and then we can open new issues for the different things we want to improve.

@godlygeek will open a issue as soon as we land for the infinite loop problem.

gaogaotiantian · 2025-04-23T19:27:09Z

Lib/test/test_remote_pdb.py

+    def test_protocol_version(self):
+        """Test that incompatible protocol versions are properly detected."""
+        # Create a script using an incompatible protocol version
+        script = f"""


Could you do textwrap.dedent() for all the scripts? So it's visually indented properly.

@gaogaotiantian done

gaogaotiantian · 2025-04-24T00:34:39Z

Is the test flaky? That's something I really want to avoid because everyone is going to be disrupted by flaky tests. I'll run the tests a few times and see if it's stable.

gaogaotiantian · 2025-04-24T00:40:16Z

It failed for a second time, that is worth checking. It's the keyboard interrupt one.

pablogsal · 2025-04-24T01:04:57Z

It failed for a second time, that is worth checking. It's the keyboard interrupt one.

Will check tomorrow

Lib/test/test_remote_pdb.py

pablogsal · 2025-04-25T00:13:15Z

@gaogaotiantian Ok the test is fixed. I plan to land this tomorrow unless you have any other concern.

gaogaotiantian · 2025-04-25T00:32:53Z

Sure let's land this first and polish it later. We still have some work to do but we don't have to do them all together immediately. Thanks for the work, this is a very cool feature for pdb.

pablogsal · 2025-04-25T00:43:01Z

Awesome! Thanks for the thorough review @gaogaotiantian! This was not easy (it's 1,330 of very tricky code and more than a hundred comments) and you have been not only very thorough, patient and understanding and I am happy we have been able to navigate a plan. pdb is certainly in good hands :)

Let's iterate together on polishing the feature! ❤️

Allow pdb to attach to a running process

5225333

Signed-off-by: Matt Wozniski <[email protected]>

bedevere-app bot mentioned this pull request Apr 12, 2025

Implement PEP 768 – Safe external debugger interface for CPython #131591

Open

pablogsal requested a review from Copilot April 14, 2025 13:15

This comment was marked as off-topic.

Sign in to view

gaogaotiantian reviewed Apr 14, 2025

View reviewed changes

godlygeek added 6 commits April 15, 2025 16:32

Remove 2 unused _RemotePdb instance attributes

78a3085

These were removed from the Pdb base class in b577460

Reduce duplication for 'debug' command

90e0a81

The only thing _RemotePdb needs to customize is how the recursive debugger is constructed. Have the base class provide a customization point that we can use.

End commands entry on 'end' and ^C and ^D

e44a670

Set the frame for remote pdb to stop in explicitly

e837246

This makes us robust against changes in the temporary script.

Fix an unbound local in an error message

27efa97

Clean up remote PDB detaching

557a725

Zheaoli reviewed Apr 16, 2025

View reviewed changes

Lib/pdb.py Outdated Show resolved Hide resolved

Zheaoli reviewed Apr 16, 2025

View reviewed changes

Lib/pdb.py Show resolved Hide resolved

godlygeek and others added 8 commits April 17, 2025 17:35

Allow ctrl-c to interrupt a running process

7f7584a

Automatically detach if the client dies unexpectedly

5666ffb

Clear _last_pdb_instance on detach

325f166

Refuse to attach if another PDB instance is installed

72830e2

Handle the confirmation prompt issued by 'clear'

27c6780

Make message and error handle non-string args

baaf28a

The base class sometimes calls these methods with an exception object.

Add some basic tests

e61cc31

Don't use deprecated method

5d59ce1

More tests

c654fdf

pablogsal added 2 commits April 23, 2025 18:31

More tests

205bc55

More tests

bbe784b

pablogsal force-pushed the remote_pdb branch from 7a2bf51 to bbe784b Compare April 23, 2025 17:52

Add what's new entry

30cb537

gaogaotiantian reviewed Apr 23, 2025

View reviewed changes

use dedent

659556f

pablogsal force-pushed the remote_pdb branch 3 times, most recently from 28e5992 to 659556f Compare April 24, 2025 16:20

godlygeek commented Apr 24, 2025

View reviewed changes

Lib/test/test_remote_pdb.py Outdated Show resolved Hide resolved

godlygeek added 2 commits April 24, 2025 16:52

Add synchronization to test_keyboard_interrupt

6c2d970

Stop sending a "signal" message in test_keyboard_interrupt"

100be44

pablogsal approved these changes Apr 25, 2025

View reviewed changes

bedevere-app bot added awaiting merge and removed awaiting review labels Apr 25, 2025

gaogaotiantian approved these changes Apr 25, 2025

View reviewed changes

pablogsal merged commit 797b29b into python:main Apr 25, 2025
43 checks passed

bedevere-app bot removed the awaiting merge label Apr 25, 2025

godlygeek mentioned this pull request Apr 25, 2025

Remote PDB can't interrupt an infinite loop in an evaluated command #132975

Closed

Uh oh!

gh-131591: Allow pdb to attach to a running process #132451

gh-131591: Allow pdb to attach to a running process #132451

Uh oh!

Conversation

godlygeek commented Apr 12, 2025 • edited by bedevere-app bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

godlygeek commented Apr 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pablogsal commented Apr 12, 2025

Uh oh!

pablogsal commented Apr 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment was marked as off-topic.

Uh oh!

gaogaotiantian commented Apr 14, 2025

Uh oh!

pablogsal commented Apr 14, 2025

Uh oh!

gaogaotiantian left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

godlygeek Apr 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Zheaoli commented Apr 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

godlygeek commented Apr 16, 2025

Uh oh!

Zheaoli commented Apr 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

gaogaotiantian commented Apr 18, 2025

Uh oh!

gaogaotiantian commented Apr 23, 2025

Uh oh!

gaogaotiantian commented Apr 23, 2025

Uh oh!

pablogsal commented Apr 23, 2025

Uh oh!

godlygeek commented Apr 12, 2025 •

edited by bedevere-app bot

Loading

godlygeek commented Apr 12, 2025 •

edited

Loading

pablogsal commented Apr 12, 2025 •

edited

Loading

godlygeek Apr 15, 2025 •

edited

Loading

Zheaoli commented Apr 16, 2025 •

edited

Loading

Zheaoli commented Apr 16, 2025 •

edited

Loading