[Win32] High CPU load #1278
Comments
That was d717937, which was originally done as a response to #525. A better response would be to have, perhaps, a .1-second timeout, which is what Wireshark uses - or, at least, to do it when packets are being printed. (Perhaps that's even good enough when they're only being written to a file, given that the buffer is no longer the default system size on *BSD capture, the faster CPUs of today should be better able to handle packets being delivered every .1 second, and the faster networks of today should be more likely to fill up even a larger kernel packet buffer within .1 second, than was the case in 1992 or so.) |
Oh, wait, that's what I already did in 2cd0a90, in 2020. Is this on current versions of tcpdump? Or is it in pre-2cd0a90c24ccf01ad9a034d7d5a6a651c82a4785 versions? 2cd0a90 and post-2cd0a90c24ccf01ad9a034d7d5a6a651c82a4785 versions should only run in immediate mode if the user requests it with |
Yes. From |
OK, so that's probably Windows-specific; I threw a quick test |
Another possible cause for higher CPU usage when printing packets is that the Visual Studio C library's notion of "line-buffered" output is "one character written, one |
Every packet capture mechanism is weird in its own way. (Hat tip to Tolstoy. :-) As per https://npcap.com/guide/npcap-internals.html, the NPF code from WinPcap and Npcap has a single circular buffer (unlike the BPF capture mechanism's pair of buffers, or the Linux PF_PACKET/TPACKET_V3 ring of buffers). There's a "minimum number of bytes to copy" parameter and a read timeout; a
The packet buffer size on Windows, in libpcap, is 1000000 bytes (1 MB - not 1 MiB). WinPcap dates back to at least the early 2000s, if not earlier; 16KB seems like a rather small "minimum amount to transfer". The default buffer size from the early-90s BPF is, I think, 4 KiB, which is absurd by modern standards; the maximum, these days, is somewhere between 512 KiB and 16 MiB, and libpcap sets it as high as it can. There's a {WinPcap,Npcap}-only libpcap API to set the minimum amount to copy, |
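As a rough check on the buffer-size argument in this thread, the arithmetic can be sketched in C. This is only an illustration: `seconds_to_fill` and `fills_before_timeout` are hypothetical helpers, not libpcap or Npcap calls, and the link rates are assumed values for a fully saturated link.

```c
#include <assert.h>

/* Hypothetical helper (not a libpcap call): seconds needed to fill a
 * capture buffer of buffer_bytes when traffic arrives at
 * link_bits_per_sec, assuming the link is fully saturated. */
static double seconds_to_fill(double buffer_bytes, double link_bits_per_sec)
{
    assert(link_bits_per_sec > 0.0);
    return buffer_bytes * 8.0 / link_bits_per_sec;
}

/* Hypothetical helper: nonzero if a saturated link would fill the buffer
 * before a read timeout of timeout_sec expires. */
static int fills_before_timeout(double buffer_bytes,
                                double link_bits_per_sec,
                                double timeout_sec)
{
    return seconds_to_fill(buffer_bytes, link_bits_per_sec) < timeout_sec;
}
```

Under these assumptions, a 1000000-byte buffer on a saturated 1 Gb/s link fills in about 8 ms, well inside a .1-second timeout, while a 16 MiB buffer on a 10 Mb/s link would take over 13 seconds.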
There aren't any. There aren't any calls to |
There isn't one. If the timeout set by If the timeout in question is > 0, If the timeout in question is -1,
|
Well, I guess setting it to any non-zero value has the effect of turning off non-blocking mode, so that's the connection. The pcap-npf.c code in libpcap is what calls |
Ok, very well. But what's the option in |
The one that fixes the problem that increases the CPU time. Ideally, the one that does so without introducing any new options. The question is "where is that CPU time going?" Your option sets the timeout to 0, meaning it causes reads from the However, setting the timeout to 0 causes captures on Linux and, I think, all systems using the BPF capture mechanism, as well as Solaris 10 with DLPI, to block without a timeout as well. If that significantly reduces CPU time on
If it doesn't significantly reduce CPU time on other platforms, that may be either 1) an issue with NPF or 2) an issue with how pcap-npf.c uses NPF. If this happens without If that happens even with Otherwise, it'd probably be time to see whether the Npcap developers can figure out whether there's something being done inefficiently that needs to be fixed. (I.e., I don't think the problem is that there needs to be an option to reduce the CPU load, I think the problem is that there need to be fixes to some component of the Windows capture code, whether it's in the libpcap code or the packet.dll/NPF driver code or both, to reduce the CPU load by default.) |
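The timeout cases discussed in this thread can be summarized in a small model. This is a sketch, not code from libpcap or packet.dll: the enum and function names are hypothetical, and the mapping of -1 to "return immediately" is an assumption based on my reading of the Packet.dll read-timeout documentation. The 0 case (block without a timeout) is the one described in the comments above.

```c
#include <assert.h>

/* Hypothetical names: how a read on the capture handle is assumed to behave. */
enum read_behavior {
    READ_RETURNS_IMMEDIATELY,   /* returns even when no packets are available */
    READ_BLOCKS_UNTIL_PACKETS,  /* no timeout: waits until data arrives */
    READ_WAITS_UP_TO_TIMEOUT    /* waits at most timeout_ms milliseconds */
};

/* Classify a read timeout in milliseconds. Only -1 and non-negative
 * values are modeled; treating -1 as "immediate" is an assumption. */
static enum read_behavior classify_timeout(int timeout_ms)
{
    assert(timeout_ms >= -1);
    if (timeout_ms == -1)
        return READ_RETURNS_IMMEDIATELY;
    if (timeout_ms == 0)
        return READ_BLOCKS_UNTIL_PACKETS;
    return READ_WAITS_UP_TO_TIMEOUT;
}

/* Upper bound on read-loop wakeups per second on an idle network.
 * Returns -1.0 for "unbounded" (a busy loop) and 0.0 when the read
 * never wakes up without traffic. */
static double idle_wakeups_per_second(int timeout_ms)
{
    switch (classify_timeout(timeout_ms)) {
    case READ_RETURNS_IMMEDIATELY:
        return -1.0;
    case READ_BLOCKS_UNTIL_PACKETS:
        return 0.0;
    default:
        return 1000.0 / timeout_ms;
    }
}
```

In this model a 100 ms timeout wakes an idle read loop about 10 times per second, a timeout of 0 never wakes it without traffic, and an immediate-return read spins as fast as the loop can run, which would be consistent with the high CPU load reported here if that is what is happening.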
Indeed. I fail to see it's VS's line-buffering. |
It could help to state the exact steps to reproduce the problem on one system and the normal operation on the other. |
For the other system, the claim was that "I'm successfully running tcpdump (or windump.exe) on my main Win-10 PC (AMD 3.9 GHz) just fine." Do "successfully" and "just fine" mean that the CPU percentage is at the 0.1% level that it was on the other machine with the change, or does it mean that it was a higher percentage but the other system had more CPU to spare? I.e., was it using more CPU than it should on both machines, but was that less of an issue on the faster machine? |
Correct. If |
Right now, I'm hoping to figure out whether one or more of the knobs for the NPF driver is inappropriately set (whether by pcap-npf.c or something else) and causing more CPU usage than required for the desired level of responsiveness. What level of traffic was the i3 PC receiving and sending? |
I'm successfully running tcpdump (or windump.exe) on my main Win-10 PC (AMD 3.9 GHz) just fine.
But after installing Win-11 on a slow Intel i3 CPU, 2.3 GHz (yes, it's possible), I find that tcpdump.exe runs with a CPU load of approx. 24%. That almost makes the PC unusable.
I also use the latest Npcap 1.80 on this Win-11 PC.
The high CPU load is, AFAICS, due to PacketReceivePacket() returning immediately when there are no packets to receive. No?
So adding an option --blocking to tcpdump.exe: the CPU load on the Win-11 PC decreases to only 0.1% on average.
Does this make sense?
I'm not sure about the relationship between a timeout = 0 and a call to pcap_setnonblock(pc, 0).
And BTW, I find no call to pcap_setnonblock() in Wireshark either.