nanog mailing list archives

Re: Link-state EGP


From: nanog--- via NANOG <nanog () lists nanog org>
Date: Mon, 25 Aug 2025 02:44:22 +0200



On 24 August 2025 16:40:20 CEST, Saku Ytti <saku () ytti fi> wrote:
On Sun, 24 Aug 2025 at 13:09, <nanog () immibis com> wrote:

No, you can't, because your upstream's shortest route leads back to you and that's a loop. Any difference in route 
calculation between two nodes in a link-state protocol is likely to create a loop.

The sender will know if it loops or not, if they can choose a
non-shortest path that will not loop. I.e. LFA, loop free alternative.

To give a specific example.

I am AS10
I have upstream transit AS2[123]
I have downstream stubby customer AS3[123]

For every other AS than AS10, AS3[123] I can freely choose any
permutation of AS2[123] to send traffic to, _per-prefix_.

Let's say I see /some/ AS42 path through each of AS2[123]; now I can
have a local egress policy for each AS42 prefix to send it through
any permutation of AS2[123], ECMP or not.
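
The "will it loop" test behind LFA can be sketched as below. This is only an
illustration: the topology and the link costs in GRAPH are made up for the
example (inter-AS links carry no such metric in BGP), and the function names
are hypothetical. The check is the basic loop-free inequality from RFC 5286:
neighbor N is safe for destination D if dist(N, D) < dist(N, S) + dist(S, D).

```python
import heapq

# Hypothetical topology with invented symmetric link costs.
GRAPH = {
    "AS10": {"AS21": 1, "AS22": 1, "AS23": 1},
    "AS21": {"AS10": 1, "AS42": 1},
    "AS22": {"AS10": 1, "AS42": 2},
    "AS23": {"AS10": 1, "AS42": 10},
    "AS42": {"AS21": 1, "AS22": 2, "AS23": 10},
}


def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist


def loop_free_neighbors(graph, source, dest):
    """Neighbors of source whose own shortest path to dest avoids source."""
    dist_s = dijkstra(graph, source)
    safe = []
    for n in graph[source]:
        dist_n = dijkstra(graph, n)
        # Basic loop-free condition (RFC 5286).
        if dist_n[dest] < dist_n[source] + dist_s[dest]:
            safe.append(n)
    return sorted(safe)


print(loop_free_neighbors(GRAPH, "AS10", "AS42"))
```

With these invented costs, AS21 and AS22 pass the check, while AS23 does not,
because AS23's cheapest path to AS42 runs back through AS10.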


It has to be a shortest path or at least you have to know their shortest path doesn't go back through you. Perhaps 
AS21's shortest path to AS23 is through you. In a link-state protocol you can't do shit to stop them using you as 
transit, besides outright blocking their traffic (breaking the internet) or splitting your AS in 3.

How many times do I have to say it, maybe with big enough letters? ***A LINK STATE ROUTING PROTOCOL IS A DISTRIBUTED 
CONSENSUS ALGORITHM. ALL NODES MUST RUN THE IDENTICAL ALGORITHM ON IDENTICAL INPUT DATA OR THE NETWORK BREAKS.***
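
A toy illustration of that failure mode, assuming plain hop-by-hop forwarding
where every node looks up its own next hop. The topology, the costs, and the
function names are all invented for the example. Every node computes
shortest-path next hops toward AS42, and then AS10 alone overrides its
choice.

```python
import heapq

# Hypothetical topology with invented symmetric link costs.
GRAPH = {
    "AS10": {"AS21": 1, "AS22": 1, "AS23": 1},
    "AS21": {"AS10": 1, "AS42": 1},
    "AS22": {"AS10": 1, "AS42": 2},
    "AS23": {"AS10": 1, "AS42": 10},
    "AS42": {"AS21": 1, "AS22": 2, "AS23": 10},
}


def next_hops(graph, dest):
    """Shortest-path next hop toward dest for every other node.

    Runs Dijkstra outward from dest, which is valid here only because
    the assumed link costs are symmetric.
    """
    dist = {dest: 0}
    table = {}
    heap = [(0, dest)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                table[neighbor] = node  # neighbor forwards via node
                heapq.heappush(heap, (nd, neighbor))
    return table


def forward(table, source, dest):
    """Follow next hops from source; return (path, looped)."""
    path = [source]
    while path[-1] != dest:
        nxt = table[path[-1]]
        if nxt in path:
            return path + [nxt], True  # revisited a node: forwarding loop
        path.append(nxt)
    return path, False


consistent = next_hops(GRAPH, "AS42")
print(forward(consistent, "AS10", "AS42"))

# AS10 unilaterally picks AS23, whose shortest path leads back to AS10.
deviating = dict(consistent, AS10="AS23")
print(forward(deviating, "AS10", "AS42"))
```

In this made-up topology the consistent tables deliver via AS21, and the
override toward AS23 bounces straight back to AS10. An override toward AS22
would not loop, because AS22's shortest path does not return through AS10.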

Perhaps you've invented a new type of algorithm where that's not the case. In this case I suggest ceasing to call it 
"link state", and writing a detailed paper about it instead of vague hints.


In fact, BGP topology is mostly a tree, it's mostly non-loopy

Not even remotely true. Customer relationships are almost always a DAG, and that's all we can say.

Locally, on any given router, you see a tree, but each router has its own tree and the interconnection of all the trees 
is not a tree.

Loop prevention often happens anyway as a matter of policy, but BGP explicitly prevents loops by using the AS_PATH 
attribute.
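
A minimal sketch of that mechanism, with hypothetical function names and
plain integers standing in for ASNs. It models only the rule that an eBGP
speaker drops a route whose AS_PATH already contains its own ASN and prepends
its ASN when advertising onward; everything else about BGP is left out.

```python
def accept_route(local_asn, as_path):
    """Reject any route whose AS_PATH already contains our own ASN."""
    return local_asn not in as_path


def advertise(local_asn, as_path):
    """Prepend our ASN before sending the route to an eBGP neighbor."""
    return [local_asn] + as_path


# AS42 originates a prefix; the path grows as it propagates.
path_at_as21 = advertise(42, [])            # what AS21 receives from AS42
assert accept_route(21, path_at_as21)

path_at_as10 = advertise(21, path_at_as21)  # what AS10 receives from AS21
assert accept_route(10, path_at_as10)

# If AS10 sends the route back toward AS21, AS21 sees itself and drops it.
path_back_at_as21 = advertise(10, path_at_as10)
print(path_back_at_as21, accept_route(21, path_back_at_as21))
```

The check needs no shared view of the topology: each AS only inspects the
path it was handed.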

 so LFA
would be mostly there already. And this is so, because of inherent
business reasons (upstream/downstream) and because we actually have
pretty poor loop prevention hygiene, we filter RIB with different
policies, some dropping more-specifics some not dropping them. Which
from theory POV is a big no-no

Only in a link-state protocol! Luckily, BGP is not a link-state protocol.

, as now you can't guarantee you don't
loop. But we do it, because we understand how _this_ implementation in
practice looks, and we don't use the solutions that don't work in
_this_ implementation.

In fact, even internally in our AS, we would almost certainly loop if we
didn't do MPLS, because due to specific policy and TE reasons we
filter advertisements differently in _iBGP-IN_, this is also kind of
big no-no, and if we did do IP lookup in core transit I cannot at all
guarantee we wouldn't loop, but because we can guarantee that the edge
decision is honored all the way to the other edge, we can get away
with it.

The ability to use policy to affect egressing traffic wouldn't be that
much affected. The ability to affect ingress traffic would be
radically different and we would risk that we walk towards a future
where we are suddenly looking at a very large number of ASNs, because of
perceived or real needs for disjoint advertisements. So my confidence
remains very low that this would be worthwhile, while certainly we
could make it go.



_______________________________________________
NANOG mailing list 
https://lists.nanog.org/archives/list/nanog () lists nanog org/message/DGHPXIAYTHBBG4YBEKH42LZXO55MO3IU/

