Memory corruption on Debian 13 linux 6.12.63+deb13-powerpc64le #416

@jfiusdq

The issue is reproducible. The device holds around 850 GB of data, so it is fairly large. It used to work fine, but no longer does. Note that I previously toggled dedup on and off on a live VDO volume through LVM for testing. I'm not sure whether that is related, because the error did not appear back then. It now appears on every boot, and if I disable auto-activation of the VDO LV at boot it stops happening, so the VDO volume is clearly the source.

I have checked the other possible causes, such as the disk: it's not disk corruption, and my server is running ECC memory. It's definitely memory corruption caused by dm-vdo in the kernel. I'm using all default settings for LVM VDO on Debian 13, except that the block map cache is 4096 MB. The VDO device is on a separate disk from the root disk, which the system boots from and where the kernel modules are located (again, the module hashes on disk match the official repos, so they are not corrupted on disk).

Something in dm-vdo writes to arbitrary memory addresses in kernel space and corrupts the loading of other modules.

[    8.453342] module_64: garp: Address 00000000bba0fdc3 of stub out of range of 00000000a8dc9270.
[    8.612851] module_64: cfg80211: REL32 4996086348 out of range!
[    8.670602] module_64: 8021q: Address 00000000d4f483ce of stub out of range of 00000000a8dc9270.
[    8.794161] module_64: cfg80211: REL32 5020002892 out of range!
[   10.007360] device-mapper: vdo: vdo0:physQ0: VDO commencing normal operation
[   10.007409] device-mapper: vdo: vdo0:journalQ: Setting UDS index target state to online
[   10.007457] device-mapper: vdo0:vgchange: device '253:4' started
[   10.007466] device-mapper: vdo0:vgchange: resuming device '253:4'
[   10.007976] device-mapper: vdo: vdo0:dedupeQ: loading or rebuilding index: 253:3
[   10.007986] device-mapper: vdo: vdo0:dedupeQ: Using 8 indexing zones for concurrency.
[   10.025552] device-mapper: vdo0:vgchange: device '253:4' resumed
[   13.354193] device-mapper: vdo: vdo0:dedupeQ: loaded index from chapter 4598 through chapter 5620
[   38.723094] systemd-journald[451]: Time jumped backwards, rotating.
[  198.155103] module_64: dm_crypt: REL32 -4996348216 out of range!
[  201.898237] module_64: xfs: REL32 5252766060 out of range!

I don't really have more details to give: besides those logs, there are no other visible memory errors.
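
To make the `REL32 ... out of range!` lines easier to read, here is a minimal Python sketch that pulls the offsets out of the log excerpt above and compares them with the signed 32-bit range. It assumes that `REL32` refers to a 32-bit relative relocation and that the printed number is the offset the module loader tried to encode. That is an interpretation of the message, not something the log itself confirms. The helper names `rel32_offsets` and `fits_int32` are made up for this example.

```python
import re

# REL32 lines copied verbatim from the dmesg excerpt above.
LOG = """\
[    8.612851] module_64: cfg80211: REL32 4996086348 out of range!
[    8.794161] module_64: cfg80211: REL32 5020002892 out of range!
[  198.155103] module_64: dm_crypt: REL32 -4996348216 out of range!
[  201.898237] module_64: xfs: REL32 5252766060 out of range!
"""

# Assumption: the relocation must fit in a signed 32-bit field.
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1

REL32_RE = re.compile(r"module_64: (\w+): REL32 (-?\d+) out of range!")


def rel32_offsets(log):
    """Return (module, offset) pairs for every REL32 error line in the log."""
    return [(m.group(1), int(m.group(2))) for m in REL32_RE.finditer(log)]


def fits_int32(value):
    """True if value can be stored in a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


for module, offset in rel32_offsets(LOG):
    gib = offset / 2 ** 30
    print(f"{module}: offset {offset} ({gib:.2f} GiB), fits in int32: {fits_int32(offset)}")
```

Under that assumption, all four offsets are between roughly 4.6 and 4.9 GiB in magnitude, well outside the ±2 GiB a signed 32-bit field can hold, and they appear both before and after the VDO start-up messages in the log.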
