Nmap Development mailing list archives
Re: Writing high-performance npcap application
From: Daniel Miller <bonsaiviking () gmail com>
Date: Fri, 29 Apr 2022 13:11:05 -0500
Jan,

Thanks for your interest in Npcap! I'll try to answer questions inline below.

On Wed, Apr 27, 2022 at 1:21 PM Jan Danielsson <jan.m.danielsson () gmail com> wrote:
[The npcap page said it was ok to use nmap mailing list for npcap related questions. If there's a more appropriate forum, please point me to it.]
Questions can also be posted as Issues on our Github page, but the nmap-dev mailing list is also publicly archived, so it works well for this type of discussion.
At first I got a pretty abysmal performance because I used pcap_sendpacket().
I believe most of the performance difference there would be because Npcap so far does not support a "nonblocking mode" for send operations. This is inherited from WinPcap, and sounds like an awesome place to start with performance enhancements!
This was expected, so I implemented sendqueue support in the pcap crate and used that instead. This, however, did not work -- I kept running out of memory. This was the first minor stumbling block: I thought that one could reuse a sendqueue buffer (i.e. that it implicitly gets reset after a transmission), but that does not seem to be the case?
The sendqueue buffer is not reset, partly because comparing the queue's `len` to the return value of pcap_sendqueue_transmit() is how you can know if some of the queue was not transmitted due to error (check pcap_geterr() if so). You *can* reuse the memory, just set the `len` to 0 before calling pcap_sendqueue_queue() again. In fact, the code behind pcap_sendqueue_queue() is pretty simple (just some offset math and 2 memcpy operations), so if you have a way to do it faster, feel free to construct your own buffer and attach it to the pcap_send_queue structure.
When I rewrote the code to allocate/free a new sendqueue for each batch, then it worked. And I got _really_ good performance, as well. Just to be clear: Have I understood it correctly that the sendqueue does not autoreset after transmission, and I need to allocate a new sendqueue for each batch?
It does not reset, but allocating a new one each time is not necessary;
just reset the queue's len to 0.
However, when I ran a long test, I got an error saying that some resources were exhausted. I obviously need to double-check that the sendqueue is actually being released on each iteration -- but I'm pretty sure it is. That said, I'm sending *a lot* of packets. Is there any known resource leak in npcap when sending very many packets using sendqueues?
There's not a known resource leak. We'd want to start by determining whether the resource exhaustion is on the user side (your app, Packet.dll, and wpcap.dll) or on the driver side (npcap.sys). If the error was reported via pcap_geterr(), then it almost certainly came from the driver. To diagnose, we need to identify which condition triggers the error:

1. A particular pcap_send_queue reliably triggers the error, even if it is the first one sent.
2. The error triggers only after several different calls to pcap_sendqueue_transmit().

Then we can identify whether the issue is the total number of packets or the rate of packet transmission. If it is the rate of transmission, then adding a timestamp to each pcap_pkthdr and using the sync parameter will not trigger the error. I'm not suggesting this as a workaround necessarily, but more as a diagnostic tool. Some resources are used while the packet is being sent asynchronously in the driver, and if we overcommit those resources, the driver will end up returning STATUS_INSUFFICIENT_RESOURCES.
The receiver is in much worse shape. It will receive a number of
packets (a few thousand, IIRC) and then simply stop receiving new packets.
Are there any special considerations one must take into account when
trying to receive packets at a high rate? At first I thought the
capture buffer may be overflowing (it was set at 1MB), but when I
increased it to 16MB it stopped at roughly the same number of packets.
(The application does not try to store any data on the receiver -- it
just receives the packet, checks that its index matches the
expected index, and then throws away the packet.)
This is a bit more concerning, since receiving packets gets a lot more attention and the code is already very well tested at high rates. However, we should rule out a few things to be sure. When you say "stop receiving new packets," do you mean that you start getting errors when you call pcap_* functions, or do you mean that a call to pcap_next() or pcap_next_ex() does not return and/or your callback to pcap_dispatch() or pcap_loop() is not called?

Npcap has a few extra configuration parameters beyond the standard ones for libpcap, and in some cases these can mean that packets have arrived but you can't get them from the kernel because there is not enough data:

1. If the read timeout is 0 (the default for pcap_create(); the to_ms parameter to pcap_open_live()), then the application will wait "forever" for the read event to be signaled by the driver before issuing a ReadFile() to fetch packets.
2. If the MinToCopy value has been set (default 16KB), then the driver will not signal the read event until at least that much data has been captured. This is intended to reduce the overhead of frequent calls to ReadFile().

So if you have a packet filter set for a very specific type of traffic, and that traffic stops when there is less than 16KB (the default MinToCopy) in the kernel buffer, a read timeout of 0 means you will never get those last few packets: the PacketReadPacket() function in Packet.dll is doing a WaitForSingleObject(ReadEvent, INFINITE), and the driver will never signal the event. There are a few solutions:

1. Use immediate mode (pcap_set_immediate_mode()) to get packets as soon as they come in (for Npcap, this is implemented as pcap_setmintocopy(0)). This may have a negative impact on performance for the application as a whole, including on other platforms. Setting the MinToCopy value directly is also supported via pcap_setmintocopy(), but it may result in differences in behavior between platforms, since it is an Npcap/WinPcap-only setting.
2. Set a positive timeout with pcap_set_timeout(). This guarantees (for Npcap) that you will get any waiting packets within the timeout period, even if they total less than MinToCopy. Not all libpcap-supported platforms support a read timeout.
3. Put the pcap handle in nonblocking mode with pcap_setnonblock(). This will cause Npcap to ignore the read event entirely and issue a ReadFile() no matter what. In this case, pcap_dispatch() and pcap_next() may return either 0 or PCAP_ERROR if there are no packets available. You can use pcap_getevent() to get a handle to the event that your application can wait on directly, in case you want to have other logic around the wait, such as using WaitForMultipleObjects() if you have several capture handles or other I/O sources to wait for.

I hope this has helped a bit. It has definitely given me a lot of good stuff to think about for the future of Npcap!

Dan
_______________________________________________
Sent through the dev mailing list
https://nmap.org/mailman/listinfo/dev
Archived at https://seclists.org/nmap-dev/
