To: Deepak Jain <deepak () ai net>
cc: Matthew Moyle-Croft <mmc () internode com au>,
Arnold Nipper <arnold () nipper de>, Paul Vixie <vixie () isc org>,
"nanog () merit edu" <nanog () merit edu>
Subject: Re: IXP
Date: Sat, 18 Apr 2009 05:30:41 +0000
From: Stephen Stuart <stuart () tech org>
> Not sure how switches handle HOL blocking with QinQ traffic across trunks,
> what's the fun of running an IXP without testing some limits?
Indeed. Those with longer memories will remember that I used to
regularly apologize at NANOG meetings for the DEC Gigaswitch/FDDI
head-of-line blocking that all Gigaswitch-based IXPs experienced when
some critical mass of OC3 backbone circuits was reached and the 100
Mb/s fabric rolled over and died, offered here (again) as a cautionary
tale for those who want to test those particular limits (again).
At PAIX, when we "upgraded" to the Gigaswitch/FDDI (from a DELNI; we
loved the DELNI), I actually used a feature of the switch that let you
"black out" certain sections of the crossbar, preventing packets
arriving on one port from exiting certain others. Some networks had
requested this to align L2 connectivity with their peering
agreements. It was fortunate that the scaling meltdown occurred when
it did; otherwise I would have spent more software development
resources trying to turn that capability into something operationally
sustainable that let networks limit the visibility of
their port to only those networks with which they had peering
agreements. That software would probably have been thrown away with
the Gigaswitches had it actually been developed, and rewritten to use
something horrendous like MAC-based filtering. If I recall
correctly, those options didn't look feasible at the time. And who wants
to have to talk to a portal to change registered MAC addresses when
doing a 2am emergency replacement of a linecard, anyway? The port-based
stuff had a chance of being operationally feasible.
The notion of a partial pseudo-wire mesh, with a self-service portal
to request/accept connections like the MAEs had for their ATM-based
fabrics, follows pretty well from that and everything that's been
learned by anyone about advancing the state of the art, and extends
well to allow an IXP with a distributed fabric to benefit from
scalable L2.5/L3 traffic management features while looking as much
like wires as possible to the networks using the IXP.
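To make the request/accept idea concrete, here is a rough sketch of the
bookkeeping such a portal might keep. Everything in it is an assumption
for illustration: the PeeringPortal class, its method names, and the AS
numbers (taken from the range reserved for documentation) are
hypothetical, and it does not describe how the MAEs or PAIX actually
did it. The only point is that a pseudo-wire exists between two ports
when, and only when, both sides have agreed.

```python
class PeeringPortal:
    """Hypothetical portal: a pseudo-wire is active only if both ends agree."""

    def __init__(self):
        self.pending = set()  # ordered (requester, target) pairs awaiting accept
        self.active = set()   # frozenset({a, b}) for each provisioned pseudo-wire

    def request(self, requester, target):
        """Ask for a pseudo-wire; returns "pending" or "active"."""
        if requester == target:
            raise ValueError("a network cannot peer with itself")
        pair = frozenset((requester, target))
        if pair in self.active:
            return "active"
        if (target, requester) in self.pending:
            # The other side already asked, so a matching request is an accept.
            self.pending.discard((target, requester))
            self.active.add(pair)
            return "active"
        self.pending.add((requester, target))
        return "pending"

    def accept(self, accepter, requester):
        """Accept an outstanding request made by `requester`."""
        if (requester, accepter) not in self.pending:
            raise KeyError("no pending request from %s to %s" % (requester, accepter))
        self.pending.discard((requester, accepter))
        self.active.add(frozenset((requester, accepter)))

    def visible_from(self, port):
        """Networks reachable from `port`: the partial mesh as seen by one member."""
        return sorted(
            other
            for pair in self.active
            if port in pair
            for other in pair
            if other != port
        )


portal = PeeringPortal()
portal.request("AS64500", "AS64501")   # one-sided: nothing is provisioned yet
portal.accept("AS64501", "AS64500")    # both sides agree: pseudo-wire goes active
print(portal.visible_from("AS64500"))
```

Keying the state on the pair of networks rather than on MAC addresses
is deliberate in this sketch: it is the port-based flavor of the idea,
so a 2am linecard swap would not require touching the portal.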
If the gear currently deployed in IXP interconnection fabrics actually
supports the necessary features, maybe someone will be brave enough to
commit the software development resources necessary to try to make it
an operational reality. If it requires capital investment, though, I
suspect it'll be a while.
The real lesson from the last fifteen or so years, though, is that
bear skins and stone knives clearly have a long operational lifetime.