Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make the run command wait for Hub to get ready #30

Open
lobodpav opened this issue Jan 29, 2022 · 18 comments
Open

Make the run command wait for Hub to get ready #30

lobodpav opened this issue Jan 29, 2022 · 18 comments

Comments

@lobodpav
Copy link
Contributor

lobodpav commented Jan 29, 2022

Introduction

When fine-tuning programs and running them via pybricksdev, I often run the program several times per minute.
However, what happens to me frequently is that

  1. the Hub is off (turns off itself after a certain inactivity period)
  2. the Hub is running the old program when I forget to stop it by pressing the Hub's button

In both cases, the execution fails and I have to either turn the Hub on or stop the program, and then re-run pybricksdev run command.

It would be much more user firiendly if pybricksdev would handle these scenarios:

  1. If the Hub is turned off while pybricksdev run is waiting for connection, the program should be loaded and executed once the Hub is turned on
  2. If the Hub runs a program while pybricksdev run is executed, pybricksdev should
    1. either wait a moment till the user stops the old running program and then load and run the new one
    2. or even better, stop the old program and run the new one

Environment

macOS 12
Python 3.10.1
pybricksdev v1.0.0-alpha.24
Inventor Hub

Replication steps

  1. Switch off the Hub
  2. Execute pybricksdev run ble program.py
  3. Switch on the Hub
  4. Actual behaviour: asyncio.exceptions.TimeoutError
  5. Expected behaviour: The program should be uploaded and executed

Stack trace


Traceback (most recent call last):
  File "/Users/pavel/.pyenv/versions/3.8.12/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/pavel/.pyenv/versions/3.8.12/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/pavel/Git/Other/pybricksdev/pybricksdev/__main__.py", line 9, in <module>
    main()
  File "/Users/pavel/Git/Other/pybricksdev/pybricksdev/cli/__init__.py", line 374, in main
    asyncio.run(subparsers.choices[args.tool].tool.run(args))
  File "/Users/pavel/.pyenv/versions/3.8.12/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/pavel/.pyenv/versions/3.8.12/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/Users/pavel/Git/Other/pybricksdev/pybricksdev/cli/__init__.py", line 198, in run
    await hub.run(script_path, args.wait)
  File "/Users/pavel/Git/Other/pybricksdev/pybricksdev/connections.py", line 367, in run
    await self.send_block(length)
  File "/Users/pavel/Git/Other/pybricksdev/pybricksdev/connections.py", line 340, in send_block
    await asyncio.wait_for(self.checksum_ready.wait(), timeout=0.5)
  File "/Users/pavel/.pyenv/versions/3.8.12/lib/python3.8/asyncio/tasks.py", line 501, in wait_for
    raise exceptions.TimeoutError()
asyncio.exceptions.TimeoutError

Analysis (cli/__init__.py and connections.py)

  1. The device_or_address = await find_device(args.name) works with a default 10s timeout
  2. Increasing the await asyncio.wait_for(self.checksum_ready.wait(), timeout=0.5) timeout does not help
  3. The nus_handler isn't caled at all for some reason, once the Hub is turned on
  4. If I do sleep before a connection is made, everything works.
diff --git a/pybricksdev/connections.py b/pybricksdev/connections.py
index c56d1f4..520e8b3 100644
--- a/pybricksdev/connections.py
+++ b/pybricksdev/connections.py
@@ -291,6 +291,7 @@ class PybricksHub:
             logger.info("Disconnected!")
             self.connected = False
 
+        await asyncio.sleep(5)
         await self.client.connect(disconnected_callback=disconnected_handler)
         try:
             logger.info("Connected successfully!")

I've tried to fix the problem myself but failed. Would you guide me on how could I achieve the behaviour described in the Introduction section? Happy to code the stuff myself if that's possible.

@laurensvalk
Copy link
Member

This would be nice. 👍

However... if this were to become a direction we're going in, perhaps it would be time to pick up the old idea of making a proper Visual Studio Code extension (or something like it) to download and run programs and multi-file projects, and stay connected in between runs.

pybricksdev was originally (and probably still is) intended as a basic tool for pybricks developers to simplify deployment and testing.

@lobodpav
Copy link
Contributor Author

Visual Studio plugin would be nice but then we're stuck with that single IDE.

I pretty much like pybricksdev because I can easily integrate it with IntelliJ which is my favorite IDE.

That's why I volunteer to improve pybricksdev 😊

@lobodpav
Copy link
Contributor Author

Giving the issue a second look, I have found the following:

  1. Run a program
  2. Turned the Hub on
  3. The Hub is found successfully when calling device_or_address = await find_device(args.name)
  4. The connection to the Hub is successful using await hub.connect(device_or_address)
  5. The length of the compiled program is sent to the Hub using await send_block(length)
  6. The device does not respond by sending back the checksum

When adding a 1s wait time before the program length is sent, everything works fine.

I suspect the problem is that the Hub initialisation takes some time and when we send data prior to the Hub being initialised, it simply does not receive it and does not reply with the checksum. Am I correct here?

I have also noticed the StatusFlags do not have any HUB_IS_READY flag which means it is impossible to delay the data sending until the HUB can actually receive any data 😢

@dlech
Copy link
Member

dlech commented Feb 27, 2022

I added a similar delay in cli/flash.py because of the same problem. I don't think that the hub can actually known when it is ready though. I think the problem is that the Bluetooth stack on the computer is still probing the hub for a while and it causes the request to get dropped somewhere along the way. If the hub is able to connect via Bluetooth, then everything in the firmware that we wrote should be "ready" already (unless perhaps you are still holding down the power button).

@lobodpav
Copy link
Contributor Author

I can see the await asyncio.sleep(1) now in the cli/flash.py. Thanks!

It's weird though that the same issue does not happen when the Hub is on before executing pybricksdev run.

Since the Hub is ready when the connection is successful, I don't understand what is happening to the data sent. I'd like to fix this stuff properly without using sleep, but I haven't the foggiest idea what is going on and how to find out the root cause 😞.

@dlech
Copy link
Member

dlech commented Feb 27, 2022

Capturing Bluetooth packets to see what is actually going on is usually a good first step.

@lobodpav
Copy link
Contributor Author

Capturing Bluetooth packets to see what is actually going on is usually a good first step.

Will give it a try. The problem I see is to be able to understand what all the packets mean 😊

@dlech
Copy link
Member

dlech commented Feb 27, 2022

If you are using SPIKE/MINDSTORMS hub, then the issue could also be that BTStack is doing more in the background than the other Bluetooth drivers on other hubs and it needs a bit of delay before advertising on boot. Also, those hubs now have some blocking code for the external flash memory that could be interfering.

@lobodpav
Copy link
Contributor Author

lobodpav commented Feb 28, 2022

Tried to capture BLE packets for both scenarios - Hub turned on prior to running pybricksdev, and Hub turned on after running pybricksdev.

The results are the same for finding and connecting to the Hub.

Two differences I've observed

  1. Once pybricksdev tries to send program length (in my case b'O\x02\x00\x00')
    1. Hub turned on prior to running pybricksdev received the correct \x4d checksum (escaped string "\x03\x41\x20\x08\x00\x04\x00\x04\x00\x1b\x13\x00\x4d"), and the program upload finished successfully.
    2. Hub turned on after running pybricksdev received \x00 value (escaped string "\x03\x40\x20\x0c\x00\x08\x00\x04\x00\x1b\x0d\x00\x00\x00\x00\x00\x00"), and this is where all communication from/to the hub ended.
  2. Source and destination addresses
    1. Hub turned on prior to running pybricksdev is identified as TexasIns_fa:c7:28 (prime_hub) in Wireshark
    2. Hub turned on after running pybricksdev is identified as TexasIns_fa:c7:28 () in Wireshark

@laurensvalk
Copy link
Member

The blocking flash stuff on startup is a likely offender. To find out, try downgrading the firmware to this one or something older. This commit is from just before merging the new flash code.

@lobodpav
Copy link
Contributor Author

lobodpav commented Feb 28, 2022

The blocking flash stuff on startup is a likely offender. To find out, try downgrading the firmware to this one or something older. This commit is from just before merging the new flash code.

Used the latest pybricksdev together with several older commits of the pybricksdev-micropython and found the guilty one 😊

Four subsequent pybricksdev-micropython commits:

  • 8017506 Timeouts if the hub is turned on after executing pybricksdev run
  • 773c927 Code uploads but the program does not run
  • 79c9f76 Code uploads but the program does not run
  • d804d66 Works OK

I guess the code in bricks/stm32/main.c is causing it

static void stm32_main(void) {
     #if PYBRICKS_HUB_PRIMEHUB
     mp_hal_delay_ms(500);
     pb_flash_init();
     #endif

@laurensvalk
Copy link
Member

All of flash needs to be rewritten, not just that one in particular.

But if it’s choosing between waiting a second after boot versus no program being stored at all, I’d rather just wait a second for now. :)

@lobodpav
Copy link
Contributor Author

All of flash needs to be rewritten, not just that one in particular.

I don't know how this stuff works but wondering, @dlech mentioned that once the Hub is discoverable over BLE, it should be ready. So wondering if the flash init could be done before the Hub can be connected to, that would do the trick. But it might just be it's not possible this way.

But if it’s choosing between waiting a second after boot versus no program being stored at all, I’d rather just wait a second for now. :)

As a workaround, I can add 1s sleep into the pybricksdev.cli.Run.run method, the same way the pybricksdev.cli.flash.reboot_official_to_bootloader has.

@dlech
Copy link
Member

dlech commented Feb 28, 2022

So wondering if the flash init could be done before the Hub can be connected to,

Assuming that pb_flash_init() does not call any MicroPython code, we should be able to move it to pbsys_process before pbsys_bluetooth_init().

@dlech
Copy link
Member

dlech commented Feb 28, 2022

@lobodpav, can you please test pybricks/pybricks-micropython@e884640 when it is finished building?

@lobodpav
Copy link
Contributor Author

lobodpav commented Mar 1, 2022

@lobodpav, can you please test pybricks/pybricks-micropython@e884640 when it is finished building?

Tested it now. Something is wrong - after writing the firmware the 3x3 LED square keeps on flashing and the Hub's main button light is off.

An attempt to run a program ends with RuntimeError: disconnected during operation.

@dlech
Copy link
Member

dlech commented Mar 1, 2022

I see now that the "block" operations are calling MICROPY_EVENT_POLL_HOOK, so that change doesn't work.

@Eggmanplant
Copy link

Same issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants