tcpdump mailing list archives

Re: bpf/pcap performance


From: Darren Reed <darrenr () reed wattle id au>
Date: Wed, 14 Apr 2004 08:18:56 +1000 (EST)

In some email I received from Guy Harris, sie wrote:

On Apr 12, 2004, at 4:43 PM, Darren Reed wrote:

The problem with pcap_next_ex() is the man page description:
"reads the next packet and returns a success/failure indication"

Well, it says more than that - it indicates what the values of the 
success/failure indication mean, and says what is supplied through 
"*pkt_header" and "*pkt_data" on success.

Sorry, this is my fault.  I was reading the roff directly and only
thought there was one line to it.  Reading it with nroff made a big
difference :-)

Ok, now that I read it, I find it tries to do too much (in my
opinion.) The part that makes it go off and read more data is
of questionable value ?

If it wasn't for that part of its behaviour, it does what I want
out of a pcap_nextrecord().  That said, I think pcap_next_ex()
is complex enough now and shouldn't be furher extended.  So this
kind of makes the application design able to go in two paths:

foo() { ... }

while (pcap_dispatch(..., foo, ..))
        ;

or

while (pcap_next_ex(...) >= 0)
      foo(...)

Well, there are a few others, too, but they're all centered around
looping on some pcap function call.

As for non-blocking more...

How exactly is pcap_next_ex() meant to work in non-blocking mode ?
Or should it be added to the list of functions that don't work with
it in the man page description of pcap_setnonblock() ?  If I've
called pcap_read() as a result of the return of select/poll, then
it is inappropriate for pcap_next_ex() to decide to call pcap_read()
again, when it has run out of data.

Looking more closely at the non-blocking side of things, there seems
to be no use of either p->nonblock or p->timeout apart from in
pcap_setnonblock() ?  Maybe pcap_next_ex() should also return -2
if "p->cc == 0" prior to calling p->read_op if nonblock is true ?

In an application using non-blocking I/O, I might want to do
something like this:

pcap_setnonblock(p)

while (1)
    select()
    if (FD_ISSET(pcap_fileno(p))
        pcap_readdata()
        ...process data...

I suppose this part after FD_ISSET() could be just this using the
current libpcap interface:
        pcap_dispatch(...-1...)

...except that now I'm back to the problem of passing a function
pointer in with C++.

Another thought that came to me on "how it might work", going back
to the quick API that I wrote up, if I could open a pcap file and
mmap it in, I might then be able to say I have a single large
buffer here; iterate over all of the records for me.

I think the current design (pcap_dispatch(), etc) is what's led
to strange API hooks like pcap_breakloop() appearing, when I think
a better design could have side stepped the need for them O:-)

btw, on examining the code for pcap more, I find a disturbing
name problem: pcap.h (the external interface) documents pcap_pkthdr
as the structure used in the dump files,

OK, I'll fix the header to note that it's what's supplied to or by 
API's and say nothing about it being what's actually stored in dump 
files.

except if you look in savefile.c, it uses pcap_sf_pkthdr instead.

I.e., it stores a "pcap_sf_pkthdr", which is what it *should* do.

 To me, this is around
the wrong way, at best.  I understand what is trying to be done
here and that is to make sure the saved file always has the same
format regardless of what is coming in to the pcap_dump() function.

....or what is supplied by "pcap_{dispatch,loop,next,next_ex}()", i.e. 
regardless of what the sizes of the members of a "struct timeval" are.

But in my opinion, pcap_dump() should be using pcap_pkthdr to store
things going out

....which would require that "pcap_pkthdr" be changed to begin with a 
"struct pcap_timeval".  If it's OK to, on platforms where, for example, 
"ts_sec" is 64 bits, break binary compatibility with applications 
dynamically linked with libpcap, we could do that - but, if not, we 
need both structures.

Sorry, maybe my point came across wrong.

There's pcap.h with pcap_pkthdr and pcap-int.h with pcap_sf_pkthdr.
At present, applications are meant to include pcap.h and use pcap_pkthdr.
What should happen is that internally, pcap_int_pkthdr should be used
by libpcap and pcap_pkthdr should be used in pcap_dump() and when reading
pcap dump files.

But that would only make sense if the pcap file format was actually
considered to be a public interface.  Is it?  If not, I'd really like
you to consider making it a public interface.- which I think is the
other, large, ongoing email conversation here is about.

It just dawned on me that you might want a universal file format that
uses particular sizes for things and have pcap_pkthdr use what ever
happens to be the native size and that this fits the current design
and implementation.

Do we have any feeling for whether or not the likes of IBM are open
to taking changes that enhance BPF whilst maintainng backward
compatibility ?

The mail archives with the mail in question are at home, but I think 
I've exchanged mail with some IBM person and not seen anything come of 
it.  If they *are* open, the first enhancement I'd like to see would be 
bug fixes for the problems people have reported, e.g. random EFAULTs 
from reads on BPF devices and the failure of timeouts to work as you 
reported ages ago.

So we're stuck having to build an API that has to be able to work on
top of a broken kernel interface implementation.  Fun.  NOT.

Darren
-
This is the tcpdump-workers list.
Visit https://lists.sandelman.ca/ to unsubscribe.


Current thread: