tcpdump mailing list archives

Re: Accurate ECN support in tcpdump/libpcap


From: "Scheffenegger, Richard via tcpdump-workers" <tcpdump-workers () lists tcpdump org>
Date: Thu, 13 Nov 2025 08:17:00 +0000

--- Begin Message --- From: "Scheffenegger, Richard" <Richard.Scheffenegger () netapp com>
Date: Thu, 13 Nov 2025 08:17:00 +0000
Hi Denis,

Before discussing the various alternate ways - what exactly are you referring to with "breaking compatibility" when 
tcp[tcpflags] maps to 12:2 instead of 13:1?

Is there an expectation that people use "tcpflags" as a "shorthand" for 13:1? Or that it is being used in contexts
outside of the tcp header flags?

Because within the context of the tcp header, the flags were always defined to be 12 bits wide, since 1981, or 2003 at 
the latest. 

But back to the issue - what compatibility issue is expected to arise?  As a user who only infrequently uses tcpdump
with all its functionality, I'd expect there to be only one way to access *all* tcp flags by name; and if I need
something special, it will be possible to convert scripts that somehow (I don't know how, my imagination fails me)
require tcp[tcpflags] to return only an 8-bit value, by redefining it as tcp[13:1] in those scripts as a one-off change
after updating to a tcpdump which has proper tcp header flags support...

(If this email doesn't make it to the list, please forward)


Richard Scheffenegger


-----Original Message-----
From: Denis Ovsienko <denis () ovsienko info> 
Sent: Mittwoch, 12. November 2025 19:52
To: tcpdump-workers () lists tcpdump org
Cc: Scheffenegger, Richard <Richard.Scheffenegger () netapp com>
Subject: Re: [tcpdump-workers] Accurate ECN support in tcpdump/libpcap

On Tue, 29 Aug 2023 14:33:28 +0000
"Scheffenegger, Richard via tcpdump-workers"
<tcpdump-workers () lists tcpdump org> wrote:

This change to the parser in libpcap allows access to all 12 bits when 
using the sample from the man page like this

tcpdump 'tcp[tcpflags] & (tcp-rst|tcp-ack) == (tcp-rst|tcp-ack)'

to also include the ‘tcp-ae’ flag:

https://github.com/the-tcpdump-group/libpcap/pull/1210

Hello Richard and all.

Thank you for waiting.  I am posting this response to the mailing list rather than the pull request because syntax 
choices tend to have very long-term effect on the difficulty of maintenance, thus it seems appropriate to make a record 
of these considerations in the archives.

I have been thinking about the proposed changes whilst adding tests and documentation for existing syntax features and 
making various code clean-ups, and this allowed me to understand the proposed solution much better and to see it has 
issues that come from TCP header layout and early libpcap design.

Given how much time this matter has taken already, an acceptable better alternative would be implementing the "tcphf" 
arithmetic expression below.  It looks good enough to unblock your work and to become a part of libpcap 1.11.0 when the 
latter becomes available.  It would be nice to study whether the other potential solutions discussed below actually
work as well as they seem to on paper, but if nobody gets this done in the next few months, then let's say perfect is
the enemy of good and "tcphf" is good enough.  In any case, let me try preparing the next revision.  The detailed reasoning
for this is as follows.

Making a change to the filter expression syntax is a matter of finding a good balance between convenience of use, 
compatibility (forward and backward), lack of surprises (what a thing looks like and what it does should be the same) and
cost of maintenance (source code upkeep, testing and documentation).  The problem that needs to be solved in this case 
is that the long-established "tcp[tcpflags]" packet data accessor does not provide forward compatibility for the 
proposed TCP header AE flag.

The proposed solution is "tcp[tcpflags] & tcp-ae".  Seemingly, this has an advantage of not introducing a new syntax 
and being backward compatible; but if implemented as proposed, it would have the disadvantage of introducing a surprise 
behaviour: bare "tcpflags" would still mean 13, but the meaning of "tcp[tcpflags]" would quietly change from "tcp[13:1]"
to "tcp[12:2]", and this would introduce the only case of such an inconsistency, both in the user-visible behaviour and in
the source code.
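
To make the byte-level difference concrete, here is a minimal Python sketch (an illustration only, not libpcap code). It builds a 20-byte TCP header by hand and reads it the way "tcp[13:1]" and "tcp[12:2]" would. The header contents and the 0x100 value used for the AE bit are assumptions chosen for the example.

```python
import struct

# 20-byte TCP header: data offset 5, with the AE, ACK and SYN flags set.
# Bytes 12-13 hold a 4-bit data offset, 3 reserved bits, then 9 flag bits.
# Treating AE as 0x100 in this 16-bit word is an assumption of the sketch.
header = struct.pack(
    "!HHIIHHHH",
    12345, 80,            # source and destination ports
    0, 0,                 # sequence and acknowledgement numbers
    (5 << 12) | 0x112,    # data offset, reserved bits, flags
    65535, 0, 0,          # window, checksum, urgent pointer
)

byte13 = header[13]                               # what tcp[13:1] reads
word12 = struct.unpack_from("!H", header, 12)[0]  # what tcp[12:2] reads

print(hex(byte13))          # 0x12   (ACK and SYN only, AE is not visible)
print(hex(word12))          # 0x5112 (data offset bits are included)
print(hex(word12 & 0x1FF))  # 0x112  (just the nine flag bits)
```

The one-byte read cannot see AE at all, while the two-byte read also returns the data offset, so any comparison against the raw 16-bit value would need an explicit mask.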

Arguably, the above would still be a solution worth considering in this specific solution space: hypothetically,
redefining "tcpflags" to 12 instead and making "tcp[tcpflags:2] & tcp-ae" the recommended syntax would formally work,
but it would have the obvious disadvantage of a surprise change of an existing behaviour and of breaking backward
compatibility ("tcp[tcpflags] & tcp-syn" would no longer mean the same), so this alternative (in the same solution
space) would be much worse.

Likewise, hypothetically, defining a new named offset to mean 12 and requiring the users to spell something such as 
"tcp[tcpflags12:2] & tcp-ae" would avoid a surprise and would keep the syntax formally consistent and backward 
compatible, but it would be obviously unwieldy, especially if the expression needs to refer to both byte 12 flag(s) and 
byte 13 flags.  Also it would return 16 bits rather than 12.  So this would be a worse alternative (in the same 
solution space) as well.

This way, considering the problem space, I agreed there should be /something/ new instead of the old "tcp[tcpflags]" 
that would mean just the TCP header flags and would not look exactly identical to the old solution.  I pondered what 
other existing syntax could provide a solution space that would align with the problem space better than the existing 
packet data accessor.  Also, since the currently reserved bits of the TCP header in future could potentially mean 
anything else other than new flags (a version number? an overflow space for port numbers?), I tried to see what would 
keep the reserved bits out of the solution space for now, but would allow adding these in future if necessary.

With this in mind, one potential solution could be a new arithmetic expression, something that would work similarly to
the existing "length" and would be recognisable as TCP header flags.  Let's call it "tcphf" for the sake of comparison.
Then the following would be valid regular arithmetic expressions that evaluate to an integer in the range [0x000,
0x1FF] ([0b000000000, 0b111111111]):

* "tcphf" -- same as "tcp[12:2] & 0x1FF"
* "tcphf & tcp-fin" -- same as "tcp[13] & tcp-fin"
* "tcphf & tcp-syn" -- same as "tcp[13] & tcp-syn"
* "tcphf & tcp-rst" -- same as "tcp[13] & tcp-rst"
* "tcphf & tcp-push" -- same as "tcp[13] & tcp-push"
* "tcphf & tcp-ack" -- same as "tcp[13] & tcp-ack"
* "tcphf & tcp-urg" -- same as "tcp[13] & tcp-urg"
* "tcphf & tcp-ece" -- same as "tcp[13] & tcp-ece"
* "tcphf & tcp-cwr" -- same as "tcp[13] & tcp-cwr"
* "tcphf & tcp-ae" -- same as "tcp[12] & tcp-ae"
* "tcphf & (tcp-syn | tcp-ack) != 0" -- true iff either SYN or ACK is
  set
* "tcphf & (tcp-fin | tcp-rst) == 0" -- true iff neither FIN nor RST is
  set
* "tcphf & (tcp-ece | tcp-cwr) == (tcp-ece | tcp-cwr)" -- true iff both
  ECE and CWR are set

This would not be perfect, but it would certainly be as convenient (or not) as the established bitwise syntax for
"tcp[tcpflags]".
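
The semantics listed above can be sketched in a few lines of Python. This is only a model of the proposed expression, not libpcap code: the function name tcphf() mirrors the working name used here, and the value 0x100 for "tcp-ae" is an assumption that follows from the 0x1FF mask.

```python
# Flag values as seen in the 16-bit word at TCP header offset 12.
# The eight low values follow the TCP header layout; 0x100 for AE is
# an assumption of this sketch.
TCP_FIN, TCP_SYN, TCP_RST, TCP_PUSH = 0x001, 0x002, 0x004, 0x008
TCP_ACK, TCP_URG, TCP_ECE, TCP_CWR = 0x010, 0x020, 0x040, 0x080
TCP_AE = 0x100


def tcphf(tcp_header: bytes) -> int:
    """Hypothetical "tcphf": the same value as tcp[12:2] & 0x1FF."""
    return int.from_bytes(tcp_header[12:14], "big") & 0x1FF


# A SYN+ACK segment with data offset 5 (only bytes 12 and 13 matter here).
syn_ack = bytes(12) + bytes([0x50, 0x12]) + bytes(6)

either_syn_or_ack = (tcphf(syn_ack) & (TCP_SYN | TCP_ACK)) != 0
neither_fin_nor_rst = (tcphf(syn_ack) & (TCP_FIN | TCP_RST)) == 0
both_ece_and_cwr = (tcphf(syn_ack) & (TCP_ECE | TCP_CWR)) == (TCP_ECE | TCP_CWR)

print(either_syn_or_ack, neither_fin_nor_rst, both_ece_and_cwr)  # True True False
```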

To manage the forward compatibility of this, it would be necessary to declare that "tcphf" is the flags field masked
with the bitwise OR of all named TCP flags; that is, if some hypothetical future "tcp-abc" does not resolve to a number
in a particular version of libpcap, there is no point in ANDing the raw binary flag value with "tcphf" because that
would quietly fail to match.  In other words, "tcphf", if used with named flags, would always either work as expected
or fail to compile.

Since TCP header flags are often tested as a set, a slightly more generic potential solution would be using the less
known, but pre-existing "value list" syntax (which means the primitive is true if any of the given values matches):

* "tcphf tcp-fin" -- true iff the flag is set
* "tcphf tcp-syn" -- true iff the flag is set
* "tcphf tcp-rst" -- true iff the flag is set
* "tcphf tcp-push" -- true iff the flag is set
* "tcphf tcp-ack" -- true iff the flag is set
* "tcphf tcp-urg" -- true iff the flag is set
* "tcphf tcp-ece" -- true iff the flag is set
* "tcphf tcp-cwr" -- true iff the flag is set
* "tcphf tcp-ae" -- true iff the flag is set
* "tcphf (tcp-syn or tcp-ack)" -- true iff at least one of SYN or ACK is
  set
* "not tcphf (tcp-fin or tcp-rst)" -- true iff neither FIN nor RST is
  set
* "tcphf tcp-ece and tcphf tcp-cwr" -- true iff both ECE and CWR are set

An advantage of this is that the syntax does not allow mixing the "not" with the list values, which eliminates a space
for confusion.  A disadvantage could be the possibility of specifying ORed flag bits as list values:

* "tcphf (0x0f or 0xf0)" -- ?

Would it mean that a multiple-bit value is an illegal argument, or that all set bits in a list value must match, or
that at least one set bit in a list value must match?

A more generic potential solution could be introducing a new /type/ qualifier, making it valid for certain values of
/proto/ qualifiers including "tcp", but not for any explicit /dir/ qualifiers.  The identifier for this regular
primitive would be an integer, that is, a bitmask:

* "tcp flags tcp-fin" -- true iff the flag is set
* "tcp flags tcp-syn" -- true iff the flag is set
* "tcp flags tcp-rst" -- true iff the flag is set
* "tcp flags tcp-push" -- true iff the flag is set
* "tcp flags tcp-ack" -- true iff the flag is set
* "tcp flags tcp-urg" -- true iff the flag is set
* "tcp flags tcp-ece" -- true iff the flag is set
* "tcp flags tcp-cwr" -- true iff the flag is set
* "tcp flags tcp-ae" -- true iff the flag is set
* "tcp flags tcp-syn or tcp-ack" -- true iff at least one of SYN and
  ACK is set
* "tcp flags tcp-syn | tcp-ack" -- ?
* "not tcp flags tcp-fin | tcp-rst" -- ?
* "tcp flags tcp-ece and tcp-cwr" -- true iff both ECE and CWR are set
* "tcp flags tcp-ece & tcp-cwr" -- formally true iff no flags set, but
  in practice most likely a user error

In this case, if the bitmask comprises more than one TCP header flag, the meaning would not be immediately obvious,
because it would depend on whether "tcp flags NUM" tests for any bit set ("tcp[12:2] & 0x1ff & NUM != 0") or all bits
set ("tcp[12:2] & 0x1ff & NUM == NUM").
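
The two candidate interpretations differ as soon as NUM has more than one bit set. A small Python sketch of both (the function names are made up for this illustration):

```python
TCP_SYN, TCP_ACK = 0x002, 0x010
FLAG_MASK = 0x1FF  # the nine flag bits of the 16-bit word at offset 12


def any_bit_set(word12: int, num: int) -> bool:
    # Models "tcp[12:2] & 0x1ff & NUM != 0".
    return (word12 & FLAG_MASK & num) != 0


def all_bits_set(word12: int, num: int) -> bool:
    # Models "tcp[12:2] & 0x1ff & NUM == NUM".
    return (word12 & FLAG_MASK & num) == num


word12 = 0x5002            # data offset 5, only SYN set
num = TCP_SYN | TCP_ACK    # a bitmask with two flags

print(any_bit_set(word12, num))   # True
print(all_bits_set(word12, num))  # False
```

The same packet and the same NUM give opposite answers, which is the ambiguity a "tcp flags NUM" primitive would have to resolve by definition.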

Another potential syntax of the above could be using a string for the identifier, which in this case would mean the 
flag names would be scoped and would not need to keep the "tcp-" prefix:

* "tcp flag fin" -- true iff the flag is set
* "tcp flag syn" -- true iff the flag is set
* "tcp flag rst" -- true iff the flag is set
* "tcp flag push" -- true iff the flag is set
* "tcp flag ack" -- true iff the flag is set
* "tcp flag urg" -- true iff the flag is set
* "tcp flag ece" -- true iff the flag is set
* "tcp flag cwr" -- true iff the flag is set
* "tcp flag ae" -- true iff the flag is set
* "tcp flag syn or tcp flag ack" -- true iff at least one of SYN and
  ACK is set, equivalent to "tcp flag syn or ack"
* "not (tcp flag fin or rst)" -- true iff neither FIN nor
  RST is set; unfortunately, in the established grammar this would be
  equivalent to "not tcp flag fin and not tcp flag rst", but not to
  "not tcp flag fin or rst", which is a known and documented peculiarity
* "tcp flag ece and tcp flag cwr" -- true iff both ECE and CWR are set,
  equivalent to "tcp flag ece and cwr"

Using this approach, managing the forward compatibility would be as simple as recognising (or not) specific strings as
the flag names (i.e. "tcp flag abc" would be invalid syntax and there would be no syntax to specify a numeric value to
try working around that, whether successfully or not).
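
As a rough model of that property, the Python sketch below resolves flag names at "compile" time and rejects unknown ones, so a filter either works as intended or fails up front. The table, the function name compile_tcp_flag() and the 0x100 value for "ae" are hypothetical and are not part of libpcap.

```python
# Hypothetical name table; 0x100 for "ae" is an assumption of this sketch.
KNOWN_FLAGS = {
    "fin": 0x001, "syn": 0x002, "rst": 0x004, "push": 0x008, "ack": 0x010,
    "urg": 0x020, "ece": 0x040, "cwr": 0x080, "ae": 0x100,
}


def compile_tcp_flag(name: str):
    """Return a predicate for "tcp flag NAME", or fail for unknown names."""
    try:
        mask = KNOWN_FLAGS[name]
    except KeyError:
        raise ValueError(f"unknown TCP flag name: {name!r}") from None
    # The predicate takes the nine flag bits and tests the named one.
    return lambda flags: (flags & mask) != 0


is_syn = compile_tcp_flag("syn")
print(is_syn(0x012))  # True: SYN and ACK are set
print(is_syn(0x010))  # False: only ACK is set

try:
    compile_tcp_flag("abc")
except ValueError as err:
    print(err)  # unknown TCP flag name: 'abc'
```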

Speaking of "tcp flag ID" or "tcp flags NUM" with regard to other existing protocol names and index operations, "ip"
and "igrp" potentially could also be a part of the same solution space, but I do not immediately see any other
protocols that could use it.

--
    Denis Ovsienko

--- End Message ---