Making `panic-never` a requirement or convention for `rust-embedded` libraries where feasible

## TL;DR

It would be great if `rust-embedded` adopted [`panic-never`](https://crates.io/crates/panic-never) as a standard for libraries. I found it impossible to take advantage of `panic-never` while also taking advantage of the rust-embedded libraries necessary to build my first non-trivial Rust embedded project. This was due to frequent uses of `panic!` throughout libraries. This is totally understandable as Rust itself provides very little tooling to avoid this and almost encourages it (e.g. indexing, slicing). While `panic!` is very useful for quick iteration in software, it can be detrimental to firmware without significant tooling/logging in the `panic-handler`, which isn't always feasible for embedded projects, especially given the severe lack of context in the panic handler compared to regular error handling. The newish [`no-panic`](https://crates.io/crates/no-panic) crate may help with retrofitting `panic-never`, and const generics may help to avoid common causes of panicking code branches.

---

## Context & Motivation

This suggestion comes from my experience writing firmware for a kinetic artwork over the past few months. I finally have had some time to reflect and thought I'd open an issue here to get others' thoughts on this :) It seems particularly good timing considering that avoiding Rust panics seems to be the hot topic for landing Rust support in the Linux kernel today https://lkml.org/lkml/2021/4/14/1099.

This was one of my first major firmware projects, involving a pretty tall stack of protocols including SPI for LEDs, I2C for time of flight, one-wire UART for motor driver control and Ethernet for real-time TCP/IP communication with the master software. Completing this project within the deadline we had would have been **impossible** without all the awesome existing work in the rust-embedded ecosystem. The fact that I could include an Ethernet bootloader, serialization between software and firmware, use a real-time scheduler and more thanks to existing work made the project possible! Naturally, this required leaning on quite a few dependencies, including `postcard` (and in turn `serde`), `rtic`, `smoltcp` and loads of others.

During early prototype testing, I quickly learned just how drastic `panic!`ing could be in firmware compared to my experience with writing software, particularly when controlling a large number of motors attached to expensive parts. This led me to search for solutions to ensure that I could avoid panicking entirely, which led me to [the `panic-never` crate](https://crates.io/crates/panic-never).

After a few days of commenting out the entire project and trying to add modules back one by one with `panic-never` included, I quickly realised that, while I *could* track down and address all of the `panic!` sites in my own code, it would be impossible for me to track down and address all `panic!` sites throughout all the dependencies that I required for the project to function - especially considering the limited, cryptic linker errors that `panic-never` could provide, resulting in an approach that consisted of commenting everything out and re-adding parts one at a time until the linker error showed up.

Following the realisation that I would have to accept the possibility of `panic!`s, I began work on a custom panic handler. *Easily* the largest problem with the custom panic handler was the lack of context, and not knowing what state the device was in when the panic occurred... This led to the need for moving parts of the application state into global state. This was necessary to 1. send some indication of an error back to the master via Ethernet (provided it was even possible to do so in the panicking state) and 2. *disable the motor via UART!* This was of utmost importance as the motor driver has its own step generator, and if the last thing it received was some high velocity before the panic, then there was nothing else stopping it from endlessly driving out the motors until someone freaks out and cuts the power :scream:

Beyond the obvious reasons why moving state into a global context was unpleasant, I was using RTIC to handle scheduling. RTIC requires managing state in a certain way in order for its priority task system to function in a safe-yet-efficient manner. This meant **lots** of acrobatics with mutexes and critical sections in order to expose the necessary networking and motor state to the panic handler through a global context, much of which I'm still uncertain is actually *safe* to this day.
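
To give a rough idea of the shape of this workaround, here is a minimal sketch. It is not the code from my project: `MOTOR_ENABLED`, `LAST_VELOCITY`, `set_velocity` and `emergency_stop` are all hypothetical names, the real UART and Ethernet writes are left as comments, and the handler is gated on `target_os = "none"` on the assumption that the crate is `#![no_std]` when built for a bare-metal target.

```rust
use core::sync::atomic::{AtomicBool, AtomicU32, Ordering};

// Hypothetical global state, duplicated outside the scheduler's resource
// system purely so that the panic handler can reach it.
static MOTOR_ENABLED: AtomicBool = AtomicBool::new(false);
static LAST_VELOCITY: AtomicU32 = AtomicU32::new(0);

/// Called by normal application code whenever it commands the motor driver.
fn set_velocity(velocity: u32) {
    LAST_VELOCITY.store(velocity, Ordering::SeqCst);
    MOTOR_ENABLED.store(velocity != 0, Ordering::SeqCst);
    // Real firmware would also send the velocity command over UART here.
}

/// Best-effort stop. It only touches atomics, so it can be called from any
/// context, including the panic handler.
fn emergency_stop() {
    LAST_VELOCITY.store(0, Ordering::SeqCst);
    MOTOR_ENABLED.store(false, Ordering::SeqCst);
    // Real firmware would send a "disable" command to the motor driver here,
    // and try to notify the master over Ethernet if that is still possible.
}

// Only compiled for bare-metal targets (assumes a `#![no_std]` crate there).
#[cfg(target_os = "none")]
#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    emergency_stop();
    loop {}
}
```

Even in this toy form, the handler knows nothing about *why* the panic happened; all it can do is fall back to whatever was mirrored into the globals beforehand.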

---

I want to acknowledge that all of these problems are ultimately our own fault. Specifically, for cornering ourselves by accepting a timeline for a project that meant I simply couldn't both 1. take advantage of many of the awesome existing crates throughout the Rust ecosystem that were necessary to make such a sophisticated project possible in a short amount of time and 2. actually review all of these dependencies and develop enough familiarity with their source to guarantee there could be no `panic!` conditions throughout. It is this choice that led to the need for the aforementioned hacks and awkward panic handling solution.

That said, I think it is at least worth checking whether or not it is possible to have our cake and eat it too by investigating the feasibility of having `panic-never` as a standard practice for embedded libraries. I cannot tell you how much of a relief it would be to know for certain that it simply wasn't possible to panic, particularly when the firmware is moving 100s of motors around on an artist's budget that provides very little room for repairs :joy: While custom panic handlers help, they provide almost no context about the state of the system during a panic by default and encourage some serious anti-patterns in order to handle those cases.

**`no-panic`**

I think perhaps this is more achievable now that [`no-panic`](https://crates.io/crates/no-panic) exists, allowing for a more granular approach to narrowing down `panic!` sites, also with *slightly* better error messages. The function attribute approach allows for achieving a panic-less codebase one function at a time, without having to solve everything at once as is required with `panic-never` alone.

**Indexing, slicing and const generics**

I think const generics may also play a large role in making this possible. Perhaps the sneakiest and most prevalent culprit for introducing panicking code is Rust's core `Index`ing and slicing methods. This is especially frustrating when most embedded code works with fixed size arrays, where the author performing the indexing/slicing *knows* that it is safe to do so and that it is impossible for the panic to actually occur. I wonder if we can come up with some const-generic based approach to bounds checking for indexing and slicing of fixed size arrays that avoids the need for generating panicking branches.
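
As a starting point for discussion, here is a rough sketch of what such an approach could look like on stable Rust. This is just one possible shape, not an existing API: `Bounds`, `get_const` and `prefix` are hypothetical names. The idea is to move the bounds check into an associated `const`, so that an out-of-range index becomes a build error when the function is instantiated, and the runtime code path contains no panicking branch.

```rust
/// Hypothetical helper type that carries compile-time bounds checks.
struct Bounds<const A: usize, const B: usize>;

impl<const A: usize, const B: usize> Bounds<A, B> {
    // Each check is evaluated at compile time, once per (A, B) pair that is
    // actually referenced by instantiated code.
    const LESS: () = assert!(A < B, "index out of bounds for fixed-size array");
    const LESS_EQ: () = assert!(A <= B, "prefix longer than fixed-size array");
}

/// Returns element `I` of a fixed-size array without a runtime bounds check.
fn get_const<T, const I: usize, const N: usize>(arr: &[T; N]) -> &T {
    // Referencing the constant forces the compile-time check for this (I, N).
    let () = Bounds::<I, N>::LESS;
    // SAFETY: the check above guarantees `I < N`.
    unsafe { arr.get_unchecked(I) }
}

/// Returns the first `K` elements of a fixed-size array as a smaller array.
fn prefix<T, const K: usize, const N: usize>(arr: &[T; N]) -> &[T; K] {
    let () = Bounds::<K, N>::LESS_EQ;
    // SAFETY: the check above guarantees `K <= N`, so the first `K` elements
    // are in bounds, initialised, and laid out exactly like a `[T; K]`.
    unsafe { &*(arr.as_ptr() as *const [T; K]) }
}
```

With `let frame = [10u8, 20, 30, 40];`, a call like `get_const::<u8, 2, 4>(&frame)` builds fine, while `get_const::<u8, 4, 4>(&frame)` should be rejected at build time rather than panicking on the device. The obvious limitation is that this only helps when the index is known at compile time; indices computed at runtime would still need `get` and explicit handling of the `None` case.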

**rustc**

Another approach might be to instead focus on landing support for avoiding panicking in `rustc` itself? While `no-panic` is already a big improvement over `panic-never`, it is still a long way from having a nicely formatted call-stack with line-numbered links to the source code of each function call that leads to each panic. I'm yet to investigate existing proposals for such a tool.

---

My aim with this issue is mostly to begin a discussion. I'm curious to hear others' thoughts, e.g. have you had similar experiences? Is this a worthy/practical goal? Or perhaps infeasible for reasons I haven't touched on yet?
