[NANOG] Re: JunOS and MX trojan and malware


From: Saku Ytti via NANOG <nanog () lists nanog org>
Date: Sun, 16 Mar 2025 10:48:18 +0200

So that I don't appear too partial (I am not impartial, to be sure).

In the PTX there is a VoQ for every interface, including a VoQ to reach
the linecard CPU for punted packets.

This VoQ has 8 queues, each with some guaranteed rate and some excess
rate (excess can be used when guaranteed rate is left unused).

This was configured exceedingly poorly: packets were statically
classified to the queues in such a way that unimportant packets anyone
on the Internet can generate shared queues with critical packets that
are only generated internally. So anyone with 10Mbps of capacity could
break any PTX by knowing what kind of packets to send (this has always
been true since we've had HW-based routers: if you know what to
inject, you can break the control plane with trivial PPS).

I told Juniper how they should spread the protocols across the queues,
or ideally let us define it manually under ddos-protection as well, and
how the queues should be dimensioned. Some may have noticed that,
before these changes, HTTP/SFTP copy of images was slow; that was
because Queue2 was 10Mbps non-bursting (shared with the most critical
protocols). Juniper fixed this partially, making the situation worse:
they opened up the queues, letting them burst, without moving the
protocol-to-queue distribution to a more hygienic one. So that's on me,
I shouldn't have said anything. We tried to get this addressed, but
it's just too complex for the vendor to understand.

So while the PTX could work very well from an architecture POV, due to
design choices in VoQ dimensioning and classification it's very fragile.
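
For reference, the knob JunOS does expose today is the per-protocol punt
policers under ddos-protection; a minimal sketch, with purely illustrative
numbers:

  set system ddos-protection protocols arp aggregate bandwidth 1000
  set system ddos-protection protocols arp aggregate burst 1000
  set system ddos-protection protocols ttl aggregate bandwidth 2000
  set system ddos-protection protocols resolve aggregate bandwidth 2000
  set system ddos-protection protocols unclassified aggregate bandwidth 500

Bandwidth and burst here are packets per second and packets, and they only
bound how much of each protocol gets admitted; which of the 8 punt-VoQ
queues a protocol lands in, and how that queue is dimensioned, is exactly
the part you cannot touch here.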

Of course, even if Juniper did allow users to dimension the VoQ and
classify the packets, it wouldn't help anything, as even lo0 is beyond
both Juniper and user skill. Ref: the Juniper MX book has a hilariously
broken lo0 filter, which any attacker can bypass by swapping
DPORT/SPORT, plus various other massive gaps.
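
To make the DPORT/SPORT point concrete, a minimal sketch of the failure
mode (not the book's actual filter; the bgp-peers prefix-list name is
assumed):

  /* The bypassable pattern: accepts any TCP packet that merely claims
     source port 179, so an attacker just swaps DPORT/SPORT and walks in. */
  term bgp-loose {
      from {
          protocol tcp;
          source-port bgp;
      }
      then accept;
  }

  /* A tighter variant: only the configured peer addresses, with port 179
     matched in either direction. */
  term bgp {
      from {
          source-prefix-list {
              bgp-peers;
          }
          protocol tcp;
          port bgp;
      }
      then accept;
  }

Applied as the input filter on lo0 unit 0, with an explicit discard term at
the end. Even the tighter term still trusts anything spoofed from a peer
address, which is part of why getting lo0 right is beyond just copying a
book example.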







On Sun, 16 Mar 2025 at 10:34, Saku Ytti <saku () ytti fi> wrote:

Oh. And this is not getting better, this is getting worse.

In Juniper you can do admission control at flow -> logical -> physical
-> NPU level. LPTS is NPU-level only, so collateral damage is very
expensive.

There was 'lpts punt excessive-flow-trap', which was retired, and we
couldn't get Cisco to understand why a replacement is needed.

E.g. a customer on interface1 has an L2 loop and offers us an excessive
amount of ARP. Other interfaces on the same NPU are dead too; you used
to be able to address this with excessive-flow-trap.


Further, it is impossible to expect customers to understand LPTS when
Cisco does not. We had PE-CE BGP flaps in 690279616, where TAC was
focused on fixing our MQC config, despite LPTS not being subject to
MQC at all. It took escalation to Xander, who initially thought an
ingress ACL could be used to discriminate here, until I reminded him
how LPTS works; he luckily didn't try to gaslight like TAC, but
immediately agreed that LPTS is not subject to ingress ACLs either
(apparently it at some point was, which is why Xander was confused for
a while).

So when LPTS does have gaps or collateral damage, you can't even add an
ACL or ingress MQC to tactically address the offending interface.

So a lot more complexity would be needed to make LPTS functional, but
already the complexity is higher than what the vendor can support. And
the complexity is being reduced (removal of excessive-flow-trap)
without understanding why it was actually needed.



On Sun, 16 Mar 2025 at 10:11, Saku Ytti <saku () ytti fi> wrote:

LPTS is not really competitive with the Juniper offering. But because
Juniper needs configuration and LPTS does not, in practice LPTS ends
up having a better outcome. Granted, the outcome is terrible and easy
to bypass, but it is still better than the typical Juniper outcome.

I could explain many gaps in it, both absolute gaps and gaps relative
to Juniper. But one particular thing is that the dimensioning is all
wrong: the device has no idea if it can handle what LPTS admits. For
example, we regularly had 1/8th of our BGP peers go down because some
XIPC worker was congested, because LPTS admitted too many packets to
it, and it ended up doing software drops. LPTS does a poor job of
deciding what should and should not be admitted, at what rate it
should be admitted, and of ensuring that the rate of session 1 does
not overpower session 2.

The above problem is particularly hilarious, because the CPU was being
consumed by BGP, which meant XIPC had fewer CPU cycles to handle what
LPTS admitted. Because XIPC does not have higher priority than BGP,
this of course meant that XIPC couldn't hand the packets to BGP,
causing more pressure and CPU-cycle demand on BGP. If XIPC had had
priority over BGP, then BGP processing would have been slowed down,
but XIPC could have offered it the work it was going to need to do
anyway, reducing overall CPU time.

eXR works better, but that's mostly out of luck, not out of design.
Cisco marketed cXR as a real-time OS, and stressed that real time was
crucial for a mission-critical system. Yet Cisco ran everything at
flat priority. Cisco did try to introduce priorities in cXR
internally, but it just made things worse, due to an incomplete
understanding of what customers are doing and how. Losing 1/8th of BGP
sessions regularly was a known problem to Cisco, and Cisco explicitly
decided not to try to address it, other than 'maybe it'll work better
on eXR'.



On Sun, 16 Mar 2025 at 09:01, Jakob Heitz (jheitz) via NANOG
<nanog () lists nanog org> wrote:

Hi Saku,

Search the Internet for “IOS-XR LPTS” for one way to protect the control plane.

Regards,
Jakob.

most others don't even have a way to protect control-plane.
_______________________________________________
NANOG mailing list
https://lists.nanog.org/archives/list/nanog () lists nanog org/message/TFPR5TJHNL7O5LF7PBRIQPCMQMXJNLZ4/



--
  ++ytti



--
  ++ytti



-- 
  ++ytti
_______________________________________________
NANOG mailing list 
https://lists.nanog.org/archives/list/nanog () lists nanog org/message/QDV5THCW6ALF2GTDV5BOADX747QKHV6G/
