Skip to content

Commit 6e9da78

Browse files
committed
Setting the transient pref. to current active slot clears it.
Given the current API for setting the transient boot preference there is an implementation choice. While setting the transient selection to the alternate image (currently non-active) is the common use case, we also need to define what happens when the transient preference is specified to be the currently active RoT Hubris image. The simple case is that setting transient to active will clear the transient selection. That is the choice this PR makes. The other case, setting an actual transient preference to the active gives rise to some more complex behavior that is just annoying. Assume that A is active, B is the pending persistent preference, and A is also the transient preference. On the first reset, the ROM will promote the CFPA scratch page to active, thus making B the persistent preference. However, Bootleby will use the transient preference of A to select the active image. The RoT will then be running Hubris image A. Bootleby has cleared the transient selection. On the second reset, there is no transient selection or pending persistent. Bootleby will select Hubris image B. These scenarios have been tested and analyzed and there is no motivation found for enabling this sequence. If, after merging this PR, it is found that we do want to enable this sequence of two resets in a row running different images, we can change the API to an enum of `Select(SlotId)` or `ClearSelection`. --- Hubris issue 2093 was found in testing where setting the persistent preference to different values in succession leads to the inability to clear a pending persistent boot preference. This may be needed to recover from a previous failed/abandoned RoT Hubris update where one wants to avoid additional reset operations. This is a corner case with a workaround, but it is also against the intent of the API.
1 parent 32d7676 commit 6e9da78

File tree

3 files changed

+53
-18
lines changed

3 files changed

+53
-18
lines changed

Cargo.lock

Lines changed: 1 addition & 1 deletion
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

Cargo.toml

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -44,9 +44,6 @@ path = "sys/abi"
4444
[patch."https://github.com/oxidecomputer/hubris".counters]
4545
path = "lib/counters"
4646

47-
[patch."https://github.com/oxidecomputer/management-gateway-service".gateway-messages]
48-
path = "/home/stoltz/Oxide/src/management-gateway-service/gateway-messages"
49-
5047
[workspace.dependencies]
5148
anyhow = { version = "1.0.31", default-features = false, features = ["std"] }
5249
array-init = { version = "2.1.0" }

drv/lpc55-update-server/src/main.rs

Lines changed: 52 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -501,38 +501,44 @@ impl idl::InOrderUpdateImpl for ServerImpl<'_> {
501501
let image = match (component, slot) {
502502
(RotComponent::Hubris, SlotId::A)
503503
| (RotComponent::Hubris, SlotId::B) => {
504-
let active = match bootstate().map_err(|_| UpdateError::MissingHandoffData)?.active {
504+
let active = match bootstate()
505+
.map_err(|_| UpdateError::MissingHandoffData)?
506+
.active
507+
{
505508
stage0_handoff::RotSlot::A => SlotId::A,
506509
stage0_handoff::RotSlot::B => SlotId::B,
507510
};
508511
if active == slot {
509512
return Err(UpdateError::InvalidSlotIdForOperation.into());
510513
}
511-
// Since we will be enforcing rollback protection at the time when the
512-
// boot preference is set, we cannot yet have the alternate image as the
513-
// preferred image. The full image needs to be in place in order to
514-
// evaluate the policy.
515-
let (persistent, pending_persistent, transient) = self.boot_preferences()?;
514+
// Rollback protection will be implemented by refusing to set
515+
// boot preference to an image that has a lower EPOC vale in the
516+
// caboose.
517+
//
518+
// Setting the boot preference before updating would sidestep that
519+
// protection. So, we will fail the prepare step if any
520+
// preference settings are selecting the update target image.
521+
let (persistent, pending_persistent, transient) =
522+
self.boot_preferences()?;
516523
if let Some(pref) = transient {
517524
if active != pref {
518525
return Err(UpdateError::InvalidPreferredSlotId.into());
519526
}
520527
}
521528
if let Some(pref) = pending_persistent {
522529
if active != pref {
523-
return Err(UpdateError::InvalidPreferredSlotId.into())
530+
return Err(UpdateError::InvalidPreferredSlotId.into());
524531
}
525532
}
526533
if active != persistent {
527-
return Err(UpdateError::InvalidPreferredSlotId.into())
534+
return Err(UpdateError::InvalidPreferredSlotId.into());
528535
}
529536
Some((component, slot))
530537
}
531538
(RotComponent::Stage0, SlotId::B) => Some((component, slot)),
532539
_ => return Err(UpdateError::InvalidSlotIdForOperation.into()),
533540
};
534541

535-
536542
self.image = image;
537543
self.state = UpdateState::InProgress;
538544
ringbuf_entry!(Trace::State(self.state));
@@ -938,13 +944,32 @@ impl ServerImpl<'_> {
938944
slot: SlotId,
939945
duration: SwitchDuration,
940946
) -> Result<(), RequestError<UpdateError>> {
947+
// Note: Rollback policy will be checking epoch values before activating.
941948
match duration {
942949
SwitchDuration::Once => {
943-
// TODO check Rollback policy vs epoch before activating.
944-
// TODO: prep-image-update should clear transient selection.
945-
// e.g. update, activate, update, reboot should not have
946-
// transient boot set.
947-
set_hubris_transient_override(Some(slot));
950+
// Get the currently active slot.
951+
let active_slot = match bootstate()
952+
.map_err(|_| UpdateError::MissingHandoffData)?
953+
.active
954+
{
955+
stage0_handoff::RotSlot::A => SlotId::A,
956+
stage0_handoff::RotSlot::B => SlotId::B,
957+
};
958+
959+
// If the requested slot is the same as the active slot,
960+
// treat this as a request to CLEAR the transient override.
961+
//
962+
// If we did allow setting the active slot as the transient
963+
// preference, then we get this confusing 2-boot scenario when
964+
// pending persistent preference is also used:
965+
// the next reset returns us to the original image and the 2nd
966+
// reset has us running the new/alternate image.
967+
if active_slot == slot {
968+
set_hubris_transient_override(None);
969+
} else {
970+
// Otherwise, set the preference as requested.
971+
set_hubris_transient_override(Some(slot));
972+
}
948973
}
949974
SwitchDuration::Forever => {
950975
// Locate and return the authoritative CFPA flash word number
@@ -976,6 +1001,19 @@ impl ServerImpl<'_> {
9761001
let (cfpa_word_number, _) =
9771002
self.cfpa_word_number_and_version(CfpaPage::Active)?;
9781003

1004+
// TODO: Hubris issue #2093: If there is a pending update in the
1005+
// scratch page, it is not being considered.
1006+
// Consider:
1007+
// - A is active
1008+
// - persistent switch to B without reset
1009+
// - scratch page is updated
1010+
// - return Ok(())
1011+
// - persistent switch to A without reset
1012+
// - A is already active, no work performed
1013+
// - return Ok(())
1014+
// - reset
1015+
// - B is active
1016+
9791017
// Read current CFPA contents.
9801018
let mut cfpa = [[0u32; 4]; 512 / 16];
9811019
indirect_flash_read_words(

0 commit comments

Comments
 (0)