tcpdump mailing list archives
Re: libpcap for linux, to_ms redefined
From: Phil Wood <cpw () lanl gov>
Date: Mon, 7 Oct 2002 16:59:47 -0600
On Wed, Sep 18, 2002 at 01:01:30PM -0700, Guy Harris wrote:
On Thu, Mar 28, 2002 at 09:45:07PM -0700, Phil Wood wrote:
With the advent of memory mapped ring buffers developed by Alexey
Kuznetsov, this function could be accommodated. I treat the value of
'to_ms' in the following manner:
    if (to_ms == 0)
        return; // if no packet immediately available then return
                // to calling program it will poll (good for old
                // versions of NFR or programs that have other
                // things to do besides capture packets)
And bad for compatibility with other platforms, on which a "to_ms" value
of 0 means "if no packet immediately available, block, and wait as long
as necessary for enough packets to arrive to fill up a chunk".
Oops. I sure got that one wrong.
Well, I just changed my code so to_ms == 0 will block as long as is
necessary. I still like to have the ability to return to the caller if
there are no packets available. How bad would it be to use a negative
value (or just -1) to mean "if there are no packets this instant in time,
return to the calling program"? I guess the answer is related to how many
libpcap programs use a negative value for to_ms.
For what it's worth, my Linux mmap'd pcap behaves as follows:
A. With a positive timeout (initialized by the to_ms value on each call
to pcap_read), a "read" will return if either
1) enough polls have been called to exhaust the timeout value.
or
2) the timeout expires even if no packets have been received.
B. With a zero timeout, a "pcap_read" will never return. The timeout
is considered infinite. Of course callbacks will continue for each
packet that arrives.
C. With a negative value, "pcap_read" will return if either
1) there are no packets on the ring
or
2) the packets that have been queued on the ring have all been
processed.
Basically, the only system call that comes into play while in pcap_read
is the poll system call. And that is only for cases A and B.
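The three cases above can be sketched roughly as follows. This is not the mmap'd pcap code itself, only a minimal illustration under stated assumptions: a non-blocking descriptor stands in for the ring, each byte counts as one "packet", and the names sketch_read, drain, make_nonblocking and packet_cb are hypothetical. Case A is simplified to a single bounded wait, which is only one reading of the description.

```c
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical callback type: one call per "packet" (here, one byte). */
typedef void (*packet_cb)(unsigned char pkt, void *user);

/* Put fd into non-blocking mode; returns 0 on success, -1 on error. */
int make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Deliver everything currently queued on fd without waiting.
 * fd must be non-blocking. Returns the number of packets delivered. */
static int drain(int fd, packet_cb cb, void *user)
{
    unsigned char b;
    int n = 0;

    while (read(fd, &b, 1) == 1) {
        if (cb)
            cb(b, user);
        n++;
    }
    return n;
}

/*
 * to_ms <  0 (case C): never call poll(); process what is queued, return.
 * to_ms >  0 (case A): if nothing was queued, wait at most to_ms once.
 * to_ms == 0 (case B): wait without limit; this branch never returns.
 * The sketch ignores EOF and hang-up on fd.
 */
int sketch_read(int fd, int to_ms, packet_cb cb, void *user)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int delivered = drain(fd, cb, user);

    if (to_ms < 0)
        return delivered;

    if (to_ms > 0) {
        if (delivered == 0 && poll(&pfd, 1, to_ms) > 0)
            delivered += drain(fd, cb, user);
        return delivered;
    }

    for (;;) {
        if (poll(&pfd, 1, -1) > 0)
            drain(fd, cb, user);
    }
}
```

As in the description, poll() is the only blocking call, and it is reached only in cases A and B.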
There are several timeout behaviors that can be provided by various
platforms' native packet capture mechanisms that support timeouts:
BSD:
With a non-zero timeout, a read will return if either
1) enough data arrives to fill up the buffer
or
2) the timeout expires, even if no data has arrived.
With a zero timeout, the read will return only if enough data
arrives to fill up the buffer, blocking as long as is necessary.
You can do BIOCIMMEDIATE to cause packets to be delivered as
soon as they arrive; if combined with a timeout, that *probably*
means that a read will return if either
1) a packet arrives
or
2) the timeout expires, even if no data has arrived.
Digital UNIX:
With a positive timeout, at least as I read the man page, it
might be the case that a read will return if either
1) a packet arrives
or
2) the timeout expires, even if no data has arrived
so that batching is presumably done only if packets are arriving
faster than the application can read them one at a time - i.e.,
if, before the read wakes up and copies data to userland, more
packets arrive. I don't know whether that's the case, however.
With a zero timeout, the read will return if a packet arrives,
blocking as long as is necessary.
With a negative timeout, the read will return immediately, even
if no packet is available; the value of the timeout is, I infer,
ignored.
They say that BIOCIMMEDIATE has no effect as immediate mode
is always on, which is why I infer that batching isn't done
BSD-style.
Windows with WinPcap:
With a positive timeout, a read will return if either
1) enough data arrives to fill up the buffer
or
2) the timeout expires, even if no data has arrived
at least as I read the current packet.dll documentation. The
buffer size is set with "PacketSetMinToCopy()" on Windows NT
(NT 4.0, W2K, WXP, W.NETServer); I'm guessing from what the
documentation says about Windows OT (95/98/Me) that one packet
is always enough to fill up the buffer on those OSes.
With a zero timeout, a read will return only if enough data
arrives to fill up the buffer, blocking as long as is necessary.
With "PacketSetMinToCopy()" you can presumably get the
equivalent of BIOCIMMEDIATE.
SunOS 5.x:
With a non-zero timeout, a read will return if either
1) enough data arrives to fill up the buffer
or
2) the timeout expires *AND* at least one packet has
arrived. (Yes, this means that you can't use the
timeout to break out of a loop and do something else
while you're waiting. Such is life.)
With a zero timeout, a read will return as soon as a packet
arrives.
With the timeout cleared, a read will return only if enough data
arrives to fill up the buffer, blocking as long as is necessary.
(libpcap treats a "to_ms" value of 0 as meaning "don't set the
timeout", which means it's cleared, *not* as "return
immediately".)
SunOS 4.x:
I don't have the "bufmod" man page handy for SunOS 4.x, but I
*suspect* it's similar to 5.x, as the 5.x "bufmod" is probably
derived from the 4.x "bufmod". No guarantees, however.
SunOS 3.x:
*Probably* behaves like 4.x.
On OSes whose packet capture mechanism *doesn't* support timeouts, a
read will return if a packet arrives, and will wait indefinitely for
that to happen.
All this means that libpcap cannot, merely by using the underlying OS's
mechanisms:
guarantee that a read will always return within a certain
timeout period (the Solaris timeout mechanism only sets the
timeout for batching of packets);
guarantee that packets will not be delivered ASAP (some OSes
don't do batching, and others do only a limited amount of
batching).
On most if not all of the OSes, however, you *can* do a "select()" or
"poll()" or "WaitFor...()" call on the pcap device/socket/whatever, so
that you *can* multiplex reading packets and doing other things. (On
BSD, you may have to use a timeout in the "select()" or "poll()", plus
non-blocking I/O, as "select()" or "poll()" on BPF devices doesn't
always work correctly.)
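A minimal sketch of that kind of multiplexing with "poll()", assuming the capture handle exposes a pollable descriptor. How that descriptor is obtained is outside the sketch ("pcap_fileno()" is one candidate, subject to the BPF caveat above); wait_for_either and the READY_* names are hypothetical.

```c
#include <poll.h>

/* Hypothetical result flags; the return value is a bitmask of these. */
enum { READY_CAPTURE = 1, READY_OTHER = 2 };

/*
 * Wait up to timeout_ms for either descriptor to become readable.
 * Returns a bitmask of READY_* flags, 0 if the timeout expired with
 * nothing readable, or -1 if poll() itself failed.
 */
int wait_for_either(int capture_fd, int other_fd, int timeout_ms)
{
    struct pollfd fds[2] = {
        { .fd = capture_fd, .events = POLLIN },
        { .fd = other_fd,   .events = POLLIN },
    };
    int mask = 0;

    if (poll(fds, 2, timeout_ms) < 0)
        return -1;

    if (fds[0].revents & POLLIN)
        mask |= READY_CAPTURE;
    if (fds[1].revents & POLLIN)
        mask |= READY_OTHER;
    return mask;
}
```

The caller would read packets only when READY_CAPTURE is set, and do its other work when READY_OTHER is set or when the call returns 0.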
It would be possible, if people *really* insist on using the timeout for
multiplexing rather than just batching, to make
1) the platforms with no timeout in the OS (Linux, Irix, HP-UX,
etc.)
and
2) the platforms where the timeout can't be used for
multiplexing as the timer doesn't expire unless you have at
least one packet (SunOS 5.x)
do a combination of non-blocking I/O and a "select()" or "poll()" with a
timeout. However
1) that's overkill for applications that *don't* use the timeout
for multiplexing
and
2) it means that people will start relying on it and having all
sorts of weird problems when their application is run with an
older version of libpcap
so I'm not strongly inclined to implement that (which is why the Linux
version doesn't do it) and, if we do implement that, I'd want to do it
only if
1) we do it on *ALL* the platforms where the native OS timeout
can't be used for multiplexing (not just for Linux)
and
2) we add a new API to enable that mode, so that applications
that require that mode have to use the new API and thus won't
build with older versions of libpcap (rather than merely
hanging forever, on platforms such as Linux where there is no
timeout and platforms such as SunOS 5.x where the timeout
doesn't expire if no packets arrive, with older versions of
libpcap).
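For illustration, the combination of non-blocking I/O and a "poll()" with a timeout described above might look roughly like this. It is a sketch, not libpcap code: timed_read is a hypothetical name, and a plain descriptor stands in for whatever the platform's capture mechanism provides.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/*
 * Read from fd, giving up after timeout_ms even if nothing arrived.
 * Returns the number of bytes read, 0 if the timeout expired (or the
 * data vanished before the read; note 0 is also what EOF looks like),
 * or -1 on error.
 */
ssize_t timed_read(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    ssize_t n;
    int rc;

    /* Non-blocking mode, so the read() below can never hang. */
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;               /* timeout expired, no data */

    n = read(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;               /* readiness was stale; treat as timeout */
    return n;
}
```

Unlike the native SunOS 5.x timeout, this returns when the timer expires whether or not a packet has arrived, which is the property an application needs if it uses the timeout for multiplexing.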
I would also advocate adding a new API to get "immediate mode", that
being the mode where a read completes as soon as a packet arrives, with
no batching; that'd let us hide all the details of how to request
immediate mode.
--
Phil Wood, cpw () lanl gov
-
This is the TCPDUMP workers list. It is archived at
http://www.tcpdump.org/lists/workers/index.html
To unsubscribe use mailto:tcpdump-workers-request () tcpdump org?body=unsubscribe
