nwipe getting out of memory error #332
Comments
ip invoked the oom-killer. Nwipe was killed simply because it was the biggest consumer of memory, even though the out-of-memory condition might have nothing to do with nwipe. Since ip invoked the oom-killer, it may be something to do with the network. Can you plug in an ethernet lead and connect it to a network running a DHCP server? I'm wondering whether IP software running in the background is causing a memory leak. If plugging into a network fixes the problem, it will give me an idea where to look.
Something else you can do to determine whether it's a network issue. Once nwipe has appeared, type ALT F2 to switch virtual terminals and run |
Can you also post |
I have it running right now with a network connection. If that fails, I'll try the rest. |
Well, it worked unplugged. Now I can't get it to fail again. I'll update again when I find some consistency. |
I'm afraid I don't have any technical details, but it seems to correlate to choosing Lagged Fibonacci as the PRNG. |
I'm going to ask @Knogle to comment as the author of the lagged Fibonacci PRNG. For what it's worth I've been using lagged Fibonacci to wipe at least 50 drives on two different systems, Dell I7 8 core 8GB (8 drives simultaneously) and a Clevo I7 20 core (one drive at a time) and haven't experienced any out of memory errors. @rdellaripa Can you post the output of |
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 16h (Models 30h-3fh) Processor Root Complex |
I hope you are doing fine :) Unfortunately I think it has nothing to do with nwipe. According to the trace, the OOM Killer was invoked by "ip" responsible for networking. |
Continuing to test. Just throwing these out here in case they're helpful later. /var/log/messages: meminfo from just before and just after the oom: Sun Mar 2 16:53:10 UTC 2025 |
Further testing has shown that this issue occurs with ISAAC and ISAAC64 as well as lagged Fibonacci. I have not been able to reproduce it with Mersenne Twister or XORoshiro.
I would also like to join in here. I have exactly the same "ip invoked oom-killer" problem.
I have disabled the onboard LAN adapter in the BIOS. dmesg_2025-03-05-11-52-53_system_uuid_00000000-0000-0000-0000-4ccc6ad83f18.txt |
Can you maybe run |
Whether you use the network is entirely optional. However, ShredOS supports NTP (Network Time Protocol), so if you are plugged into a network with internet access it will attempt to contact an NTP server to check the date and time and adjust the system clock if it is wrong. If you don't want to connect to a network, you can adjust the system date and time via Nwipe's config menu 'c' (accessed via the drive selection screen) or alternatively via the command line. Network access is required if you are PXE booting. Network access is also required if you want to upload your PDFs and logs automatically to a lftp, ftp or sftp server. In addition, you can store Nwipe's config and customer list files remotely, so if you are running a hundred ShredOS systems they all boot up with the same customer list and config. This is configured via the shredos_output=".." and shredos_config=".." commands that are placed on the kernel command line in grub.cfg
That's a possibility. I generally run ShredOS with an active, internet-enabled network on my desktop systems, but I run laptop wipes without a network connection. When I get a chance I'll run the desktops (which have more than one ethernet connection) without a network connection and see if I can reproduce the out of memory. It might also be specific to your hardware, as well as only happening without an active network connection. I need to take a closer look at your lspci -k output.
One other thought: when there is no active network connection, the program called 'ip' is executed periodically to check the ethernet port status. I don't remember how often, whether every 5 seconds or every 30 seconds. This polling doesn't occur once an IP address has been requested and received from the DHCP server, so it's possible that if the program called 'ip' is not freeing up allocated memory on exit then memory will slowly get used up. As noted in previous comments about the -V option, the version of 'ip' being used is the stripped-down busybox version. Just something to bear in mind when searching for the root cause of this issue.
yeh, I'll provide it on Monday (with enabled LAN adapter)
I could also test that on monday. I still have a few old routers lying around here so at least there would be a network to connect to, albeit without internet. |
I'm actually running a wipe right now connected just to an old switch, so it doesn't have DHCP, internet, etc. What I don't understand at this point is that I've run top -oVIRT in another console, and I haven't seen ip just hanging out there slowly eating up memory. But it is obvious something is, because MemFree goes from 50704 kB to 6101760 kB free when the oom triggers...despite nwipe (or ntpd, the second largest process) growing in size over time. I thought maybe a device driver wasn't freeing memory, but lsmod doesn't show anything. PS: I just noticed that pre-oom, meminfo shows Buffers as 5757772 kB and after, 1112 kB.
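For anyone who wants to compare two meminfo snapshots without eyeballing them, here is a minimal Python sketch. It is not part of ShredOS or nwipe: the helper names `parse_meminfo` and `largest_changes` are made up for illustration, and it assumes the snapshots are copied to a machine that has Python. The only real data in it are the MemFree and Buffers figures quoted in the comment above; a full /proc/meminfo capture would have many more fields.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field_name: value} dict.

    Values are the first integer on each line (kB for most fields).
    Lines that do not look like 'Name: number ...' are skipped.
    """
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def largest_changes(before, after, top=5):
    """Return the fields present in both snapshots, largest absolute change first."""
    deltas = {key: after[key] - before[key] for key in before if key in after}
    ranked = sorted(deltas.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:top]


# Only the two fields quoted in this thread; everything else is omitted.
BEFORE_OOM = "MemFree: 50704 kB\nBuffers: 5757772 kB\n"
AFTER_OOM = "MemFree: 6101760 kB\nBuffers: 1112 kB\n"

if __name__ == "__main__":
    before = parse_meminfo(BEFORE_OOM)
    after = parse_meminfo(AFTER_OOM)
    for name, delta in largest_changes(before, after):
        print(f"{name:>10}: {delta:+d} kB")
```

With just those two fields, the increase in MemFree and the drop in Buffers are close to the same size, which is why Buffers looks like the place to focus rather than any single process in top.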
so first of all good news. switching off the NIC helped. my disks were successfully zeroed. Duration: approx. 44-45 hours.
I have now started a new wipe process but this time I have connected an old TP-Link router. a network with DHCP etc. but without internet. |
Attached is the output of
run once a minute, snipped to the minute before and after the oom. |
Can you try killing the ntpd process and the dhcpd process in turn, to try to identify which one is causing the issue? With no network connection I still can't reproduce the issue on my laptop. I was reading about a buffer overflow problem in ntpd, triggered by a rogue ntp server, but that was way back in the early 2000s. So it may be worth killing those processes to see if they are the cause.
I killed the ntpd process and the issue recurred. I can't kill the dhcpd process, as it's spawned from an "if up" from shredos_net.sh
I'm guessing then that killing shredos_net.sh then stops the out of memory? |
So I boot fine, set parameters and get nwipe running, and then, after a while, it gets killed with an out of memory error. I don't think it's the drive, I managed to do a dd through the whole thing with no errors.