nanog mailing list archives

Re: Carrier Circus (was RE: Intermedia (ICIX) brokenness...)
From: "Richard A. Steenbergen" <ras () e-gerbil net>
Date: Fri, 4 May 2001 15:47:09 -0400 (EDT)


On Fri, May 04, 2001 at 12:18:18PM -0700, Jonathan Disher wrote:

> Personally, I'm still trying to figure out why Exodus, in all their
> apparent wisdom (or lack thereof), has stopped using the GBLX OC-48's
> in the former GlobalCenter facilities (or at least SNV3), and is now
> shuttling all its traffic out a single Exodus OC-12.  Prior to
> yesterday these traces would've shown gblx.net routers (on different
> IPs), and would never have touched an exodus backbone...

Hrm, let's think about that for a moment, shall we. Could it be, perhaps,
that Exodus purchased GlobalCenter and is integrating those facilities
into their network? Could it also be that Exodus has a well designed
network where most of the traffic is quickly sent to peers and an OC48
backbone is not required? I don't see any congestion on that OC12, so
perhaps that is the case? I also don't see a damn thing wrong with the
traceroute you provided, and an OC12 peer to UU is pretty good. Was there
some other complaint, or do you just not like it when your traceroute
changes?

> Of course, this is probably a move I should've expected from Exodus,
> after the mongolian flustercluck that was the AS change in SNV3.
> You'd think they would do something like that carefully, as you can
> -seriously- bone customers.  But noooooo.  One of our junior admins
> made the change (since I was out of town, but hey, it's cut and
> paste!).  He, and all of the other affected customers in SNV3 on the
> conference call, were left on hold for about half an hour (plus the
> call started half an hour late), whereupon the exodus engineering team
> popped back in and said "We're done with our side, you guys go
> ahead!".

Actually, I was awake for that. I guess your junior engineer wasn't able
to figure out that if he simply put in an additional neighbor statement
with the new AS, your downtime would have been less than 30 seconds as
BGP came back up. A 30-second outage is pretty light in the history of
GCTR and GBLX outages. If you can't handle maintenance, then you should
have set up static routes out or multihomed, but you shouldn't blame your
stupidity or lack of forethought on other networks.
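For the record, the sort of thing I mean looks roughly like this in
IOS-style configuration (a sketch only -- the addresses, AS numbers, and
the assumption that the provider exposed separate peering addresses for
the old and new AS are all made up for illustration):

```
! Hypothetical customer-side config; all addresses/ASNs are examples.
router bgp 65001
 ! Existing session toward the provider's old AS.
 neighbor 192.0.2.1 remote-as 64512
 ! Pre-provisioned session toward the provider's new AS. The moment
 ! their side is reconfigured, this session establishes, so the only
 ! outage is the few seconds BGP needs to come up and converge --
 ! not however long it takes someone to type in new config by hand.
 neighbor 192.0.2.5 remote-as 64513
```

Once the cutover is confirmed, the stale neighbor statement gets removed
at leisure.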

> Now.  Does it seem logical to kill connectivity over BOTH of your
> hosting routers at once, thus killing every single BGP-running
> customer you have that isn't physically in their cage at the time?
> Or would it seem better to do what I assumed they'd do, which is do
> one router, wait for everyone to make changes, then do the other?

ASN changes are not exactly easy or frequent, but I seem to recall that
one going over rather smoothly. Customers were given ample warning, and a
conference call was set up to handle any outstanding issues, of which
there were none.

> I guess this is what happens when I assume intelligence at a
> hosting/backbone provider.

Or when we assume intelligent posts to nanog...

-- 
Richard A Steenbergen <ras () e-gerbil net>       http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)

