tcpdump mailing list archives
Re: Accurate ECN support in tcpdump/libpcap
From: Denis Ovsienko <denis () ovsienko info>
Date: Fri, 6 Feb 2026 16:40:35 +0000
Hello all. So, I have spent a few weeks researching this direction in more detail; you can find the current results below.

On Wed, 12 Nov 2025 18:52:20 +0000
Denis Ovsienko <denis () ovsienko info> wrote:

[...]
With this in mind, one potential solution could be a new arithmetic expression, something that would work similarly to the existing "length" and would be recognisable as TCP header flags. Let's call it "tcphf" for the sake of comparison. Then the following would be valid regular arithmetic expressions that evaluate to an integer in the range [0x000, 0x1FF] ([0b000000000, 0b111111111]):
* "tcphf" -- same as "tcp[12:2] & 0x1FF"
* "tcphf & tcp-fin" -- same as "tcp[13] & tcp-fin"
* "tcphf & tcp-syn" -- same as "tcp[13] & tcp-syn"
* "tcphf & tcp-rst" -- same as "tcp[13] & tcp-rst"
* "tcphf & tcp-push" -- same as "tcp[13] & tcp-push"
* "tcphf & tcp-ack" -- same as "tcp[13] & tcp-ack"
* "tcphf & tcp-urg" -- same as "tcp[13] & tcp-urg"
* "tcphf & tcp-ece" -- same as "tcp[13] & tcp-ece"
* "tcphf & tcp-cwr" -- same as "tcp[13] & tcp-cwr"
* "tcphf & tcp-ae" -- same as "tcp[12] & tcp-ae"
* "tcphf & (tcp-syn | tcp-ack) != 0" -- true iff either SYN or ACK is set
* "tcphf & (tcp-fin | tcp-rst) == 0" -- true iff neither FIN nor RST is set
* "tcphf & (tcp-ece | tcp-cwr) == (tcp-ece | tcp-cwr)" -- true iff both ECE and CWR are set

This would not be perfect, but certainly as convenient (or not) as the established bitwise syntax for "tcp[tcpflags]". To manage the forward compatibility of this, it would be necessary to declare that "tcphf" is the flags field bitwise ANDed with the mask of all named TCP flags; that is, if some hypothetical future "tcp-abc" does not resolve to a number in a particular version of libpcap, there is no point in ANDing the raw binary flag value with "tcphf" because that would quietly fail to match. In other words, "tcphf", if used with named flags, would always either work as expected or fail to compile.
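To make the arithmetic above concrete, here is a minimal Python sketch of what "tcphf" would evaluate to for a raw TCP header. It is a model only, not libpcap code: the function name tcphf() and the value 0x100 for the AE flag are assumptions for illustration (the other eight values mirror the existing tcp-fin ... tcp-cwr constants).

```python
# Named flag values. The first eight mirror the existing libpcap constants;
# TCP_AE = 0x100 is an assumption made for this sketch.
TCP_FIN, TCP_SYN, TCP_RST, TCP_PUSH = 0x001, 0x002, 0x004, 0x008
TCP_ACK, TCP_URG, TCP_ECE, TCP_CWR = 0x010, 0x020, 0x040, 0x080
TCP_AE = 0x100


def tcphf(tcp_header: bytes) -> int:
    """Model of "tcp[12:2] & 0x1FF": the low 9 bits of header bytes 12..13."""
    return int.from_bytes(tcp_header[12:14], "big") & 0x1FF


# A 20-byte header with data offset 5 and SYN, ACK and ECE set
# (bytes 12..13 are 0x50 0x52).
header = bytes(12) + bytes([0x50, 0x52]) + bytes(6)
flags = tcphf(header)

# The three compound tests from the list above.
either_syn_or_ack = (flags & (TCP_SYN | TCP_ACK)) != 0
neither_fin_nor_rst = (flags & (TCP_FIN | TCP_RST)) == 0
both_ece_and_cwr = (flags & (TCP_ECE | TCP_CWR)) == (TCP_ECE | TCP_CWR)
```

For this header the first two tests hold and the third does not, because CWR is clear.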
This is a work in progress. I tried modelling the arithmetic "tcphf" after tcp[] and realised that tcp[] as it is now is not a good model because it has always been IPv4-only (for the avoidance of doubt, the current tcp[tcpflags] is a case of the same problem). Then I prototyped a draft of IPv6 support in tcp[] -- it works in principle, but would need to be done right before the arithmetic "tcphf" could reproduce the method. This does not look a good fit, but before deciding on the final syntax for TCP flags it seems worthwhile to research this direction a little bit more.
Since TCP header flags are often tested as a set, a slightly more generic potential solution would be using the less known, but pre-existing "value list" syntax (which means the primitive is true if any of the given values matches):
* "tcphf tcp-fin" -- true iff the flag is set
* "tcphf tcp-syn" -- true iff the flag is set
* "tcphf tcp-rst" -- true iff the flag is set
* "tcphf tcp-push" -- true iff the flag is set
* "tcphf tcp-ack" -- true iff the flag is set
* "tcphf tcp-urg" -- true iff the flag is set
* "tcphf tcp-ece" -- true iff the flag is set
* "tcphf tcp-cwr" -- true iff the flag is set
* "tcphf tcp-ae" -- true iff the flag is set
* "tcphf (tcp-syn or tcp-ack)" -- true iff at least one of SYN or ACK is set
* "not tcphf (tcp-fin or tcp-rst)" -- true iff neither FIN nor RST is set
* "tcphf tcp-ece and tcphf tcp-cwr" -- true iff both ECE and CWR are set

An advantage of this is that the syntax does not allow mixing the "not" with the list values, which eliminates a space for confusion. A disadvantage of this could be a possibility to specify ORed flag bits as list values:
* "tcphf (0x0f or 0xf0)" -- ?

Would it mean a multiple-bit value is an illegal argument, or all set bits in a list value must match, or at least one set bit in a list value must match?
I am not going to prototype this syntax because it does not look good even on paper. Besides the above ambiguities, this would also be a case of the "not" caveat:
tcphf not tcp-fin
would seem to mean "IPv4/IPv6 TCP packets with FIN flag cleared". However, it would actually mean:
not tcphf tcp-fin
which is the same (assuming IPv4+IPv6 implementation) as "(not IPv4 or (is an IPv4 fragment and is not the first fragment) or is not TCP or TCP flag FIN is cleared) or (not IPv6 or is not TCP or TCP flag FIN is cleared)", which is obviously too different from the expected behaviour.
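The "not" caveat can be illustrated with a small Python model. Everything here is hypothetical scaffolding (the Packet class and the three function names are not libpcap identifiers); the point is only to show how negating the whole primitive differs from testing for a cleared flag.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    ip_version: int = 0           # 4, 6, or 0 for anything else (e.g. ARP)
    is_tcp: bool = False
    later_fragment: bool = False  # IPv4 fragment other than the first
    fin: bool = False


def tcphf_fin(p: Packet) -> bool:
    """Model of "tcphf tcp-fin": protocol checks AND the flag test."""
    if p.ip_version == 4:
        return p.is_tcp and not p.later_fragment and p.fin
    if p.ip_version == 6:
        return p.is_tcp and p.fin
    return False


def not_tcphf_fin(p: Packet) -> bool:
    """What "tcphf not tcp-fin" would actually mean: negate everything."""
    return not tcphf_fin(p)


def fin_cleared(p: Packet) -> bool:
    """What a user would likely expect: a TCP packet with FIN cleared."""
    if p.ip_version == 4:
        return p.is_tcp and not p.later_fragment and not p.fin
    if p.ip_version == 6:
        return p.is_tcp and not p.fin
    return False


arp = Packet()                                # not IP at all
udp6 = Packet(ip_version=6)                   # IPv6, not TCP
tcp4 = Packet(ip_version=4, is_tcp=True)      # IPv4 TCP, FIN cleared
```

In this model not_tcphf_fin() accepts arp and udp6 as well as tcp4, whereas fin_cleared() accepts only tcp4.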
A more generic potential solution could be introducing a new /type/
qualifier, making it valid for certain values of /proto/ qualifiers
including "tcp", but not for any explicit /dir/ qualifiers. The
identifier for this regular primitive would be an integer, that is, a
bitmask:
* "tcp flags tcp-fin" -- true iff the flag is set
* "tcp flags tcp-syn" -- true iff the flag is set
* "tcp flags tcp-rst" -- true iff the flag is set
* "tcp flags tcp-push" -- true iff the flag is set
* "tcp flags tcp-ack" -- true iff the flag is set
* "tcp flags tcp-urg" -- true iff the flag is set
* "tcp flags tcp-ece" -- true iff the flag is set
* "tcp flags tcp-cwr" -- true iff the flag is set
* "tcp flags tcp-ae" -- true iff the flag is set
* "tcp flags tcp-syn or tcp-ack" -- true iff at least one of SYN and
ACK is set
* "tcp flags tcp-syn | tcp-ack" -- ?
* "not tcp flags tcp-fin | tcp-rst" -- ?
* "tcp flags tcp-ece and tcp-cwr" -- true iff both ECE and CWR are set
* "tcp flags tcp-ece & tcp-cwr" -- formally true iff no flags set, but
in practice most likely a user error
In this case, if the bitmask comprises more than one TCP header flag,
the meaning would not be immediately obvious: it would depend on
whether "tcp flags NUM" tests for any bit set ("tcp[12:2] & 0x1ff &
NUM != 0") or all bits set ("tcp[12:2] & 0x1ff & NUM == NUM").
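The difference between the two readings can be seen in a short Python sketch. The function names are made up for this illustration; the flag values mirror the existing tcp-syn, tcp-ack, tcp-ece and tcp-cwr constants.

```python
TCP_SYN, TCP_ACK, TCP_ECE, TCP_CWR = 0x002, 0x010, 0x040, 0x080


def any_bit_set(flags: int, num: int) -> bool:
    """Model of "tcp[12:2] & 0x1ff & NUM != 0"."""
    return (flags & 0x1FF & num) != 0


def all_bits_set(flags: int, num: int) -> bool:
    """Model of "tcp[12:2] & 0x1ff & NUM == NUM"."""
    return (flags & 0x1FF & num) == num


syn_only = TCP_SYN            # flags of an initial SYN segment
mask = TCP_SYN | TCP_ACK      # "tcp-syn | tcp-ack"

# A single-flag mask gives the same answer under both readings.
same_for_one_flag = any_bit_set(syn_only, TCP_SYN) == all_bits_set(syn_only, TCP_SYN)

# A two-flag mask does not: "any" matches a bare SYN, "all" does not.
any_result = any_bit_set(syn_only, mask)
all_result = all_bits_set(syn_only, mask)

# "tcp-ece & tcp-cwr" collapses to an empty mask.
empty_mask = TCP_ECE & TCP_CWR
```

With an empty mask the "any" reading can never match and the "all" reading always matches, which is one more reason a multi-flag NUM is easy to get wrong.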
I am not going to prototype this syntax because it comes with the same problem space as the above.
Another potential syntax of the above could be using a string for the identifier, which in this case would mean the flag names would be scoped and would not need to keep the "tcp-" prefix:
* "tcp flag fin" -- true iff the flag is set
* "tcp flag syn" -- true iff the flag is set
* "tcp flag rst" -- true iff the flag is set
* "tcp flag push" -- true iff the flag is set
* "tcp flag ack" -- true iff the flag is set
* "tcp flag urg" -- true iff the flag is set
* "tcp flag ece" -- true iff the flag is set
* "tcp flag cwr" -- true iff the flag is set
* "tcp flag ae" -- true iff the flag is set
* "tcp flag syn or tcp flag ack" -- true iff at least one of SYN and ACK is set, equivalent to "tcp flag syn or ack"
* "not (tcp flag fin or rst)" -- true iff neither FIN nor RST is set; unfortunately, in the established grammar this would be equivalent to "not tcp flag fin and not tcp flag rst", but not to "not tcp flag fin or rst", which is a known and documented peculiarity
* "tcp flag ece and tcp flag cwr" -- true iff both ECE and CWR are set, equivalent to "tcp flag ece and cwr"

Using this approach, managing the forward compatibility would be as simple as recognising (or not) specific strings as the flag names (i.e. "tcp flag abc" would be invalid syntax and there would be no syntax to specify a numeric value to try working around that, whether successfully or not).

Speaking of "tcp flag ID" or "tcp flags NUM" with regard to other existing protocol names and index operations, "ip" and "igrp" potentially could also be a part of the same solution space, but I do not immediately see any other protocols that could use it.
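The forward compatibility argument for string identifiers can be sketched in Python as a plain name lookup. This is an illustration only: the table and the function resolve_tcp_flag() are hypothetical, and the bit value shown for "ae" is an assumption.

```python
# Hypothetical table of the flag names a given version would recognise.
TCP_FLAG_NAMES = {
    "fin": 0x001, "syn": 0x002, "rst": 0x004, "push": 0x008,
    "ack": 0x010, "urg": 0x020, "ece": 0x040, "cwr": 0x080,
    "ae": 0x100,  # assumed bit value for the AE flag
}


def resolve_tcp_flag(name: str) -> int:
    """Return the bit for a known flag name, or fail like a compile error."""
    try:
        return TCP_FLAG_NAMES[name]
    except KeyError:
        raise ValueError(f"unknown TCP flag name: {name!r}") from None


syn_bit = resolve_tcp_flag("syn")

# An unknown name is rejected outright; there is no numeric fallback
# that could quietly fail to match.
try:
    resolve_tcp_flag("abc")
    rejected = False
except ValueError:
    rejected = True
```

Since the only input is a name, an expression either resolves in a given version or is rejected; it cannot compile into a filter that silently never matches.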
I have studied and prototyped this syntax as much as is practicable
without external input; the prototype can be seen in libpcap pull
request 1621.
The implementation is straightforward; originally it was not: the
[first since 2005] new type qualifier exposed some technical debt in
the interface between the grammar and the generator, but this has been
addressed in the master branch already. It trivially extends to IPv4
header flags (which the implementation includes as well) and
potentially EIGRP or PGM (if required in future). The flag names are
specific to the protocol and opaque to the grammar.
However, the syntax required more work. Because in this primitive the
ID is a string, this syntax is exempt from the problem space of bitwise
arithmetic expressions and integers (whether explicit or named).
However, it is not exempt from the problem space of "not", for example,
tcp flag not fin
would mean "(not IPv4 or (is an IPv4 fragment...", as noted above. Thus
to keep this syntax as surprise-free as possible, there needs to be a
way to specify that the negation applies to the flag state only rather
than the entire primitive. It would be nice to be able to use
tcp flag !fin
except in the lexer "not" is an alias for "!", so in the grammar it
would be exactly the same as the above. I considered a few other syntax
possibilities and found the most sense in joining the flag and its state
into a single string. To that end, examples of the alternatives I have
considered are:
* "not-fin" and "no-fin" (easy to confuse with "not fin"),
* "fin-0" (is 0 the bit value or the bit position?),
* "fin-unset" ("unset" is a verb, not a state, and could be confused
with an assignment), and
* "~fin" ("~" is not a valid bitwise unary operator in the current
implementation, but it may be introduced later; also, it is
ambiguous whether, besides FIN flag cleared, it means all other
flags set).
After quite a few drafts the prototype eventually became this:
ip flag flagstate
       True if the packet is an IPv4 packet and the IPv4 header flag
       (MF or DF) is set (if flagstate is one of {mf-set, df-set}) or
       cleared (if flagstate is one of {mf-cleared, df-cleared}). The
       correct way to test for a cleared flag is by using the -cleared
       suffix; for example,
              ip flag df-cleared
       correctly does not match packets that are not IPv4 packets, but
              ip flag not df-set
       does (correctly from the grammar perspective, but usually
       incorrectly from the use case perspective) match non-IPv4
       packets because it means the same as
              not (ip and ip flag df-set)
tcp flag flagstate
       True if the packet is an IPv4/IPv6 TCP packet and the TCP
       header flag (FIN, SYN, RST, PSH, ACK, URG, ECE, CWR or AE) is
       set (if flagstate is one of {fin-set, syn-set, rst-set,
       psh-set, ack-set, urg-set, ece-set, cwr-set, ae-set}) or
       cleared (if flagstate is one of {fin-cleared, syn-cleared,
       rst-cleared, psh-cleared, ack-cleared, urg-cleared,
       ece-cleared, cwr-cleared, ae-cleared}). For IPv4 this also
       verifies that the datagram is the first fragment or is not
       fragmented. For the same reasons as the above, the correct way
       to test for a cleared flag is by using the -cleared suffix,
       for example:
              tcp flag syn-cleared and ack-cleared
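The semantics described in the man page draft can be modelled in a few lines of Python. This is a simplified sketch, not the code from the pull request: parse_flagstate() and tcp_flag() are hypothetical names, the bit value for "ae" is an assumption, and a single is_tcp boolean stands in for the IPv4/IPv6, fragment and protocol checks.

```python
from typing import Tuple

TCP_FLAGS = {
    "fin": 0x001, "syn": 0x002, "rst": 0x004, "psh": 0x008,
    "ack": 0x010, "urg": 0x020, "ece": 0x040, "cwr": 0x080,
    "ae": 0x100,  # assumed bit value for the AE flag
}


def parse_flagstate(flagstate: str) -> Tuple[int, bool]:
    """Split e.g. "syn-cleared" into (bit, want_set); reject anything else."""
    name, sep, state = flagstate.rpartition("-")
    if not sep or name not in TCP_FLAGS or state not in ("set", "cleared"):
        raise ValueError(f"invalid flagstate: {flagstate!r}")
    return TCP_FLAGS[name], state == "set"


def tcp_flag(is_tcp: bool, flags: int, flagstate: str) -> bool:
    """Model of "tcp flag flagstate": must be TCP AND the flag state must match."""
    bit, want_set = parse_flagstate(flagstate)
    return is_tcp and ((flags & bit) != 0) == want_set


# "tcp flag syn-cleared and ack-cleared" on a TCP segment with only FIN set.
fin_only = 0x001
example = tcp_flag(True, fin_only, "syn-cleared") and tcp_flag(True, fin_only, "ack-cleared")

# A non-TCP packet: the -cleared form does not match, the negated form does.
cleared_on_non_tcp = tcp_flag(False, 0, "fin-cleared")
negated_on_non_tcp = not tcp_flag(False, 0, "fin-set")
```

Because the state is part of the string, the protocol check stays inside the primitive, which is why the -cleared form rejects the non-TCP packet and the negated form accepts it.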
If anybody is interested in experimenting with this and providing
feedback, the pull request is going to stay open for a couple of weeks.
If everything goes well, before long the "tcphf" research above will be
done and the final syntax will settle before libpcap 1.11.0.
--
Denis Ovsienko