
nanog mailing list archives

Re: Cascading Failures Could Crash the Global Internet
From: Marshall Eubanks <tme () multicasttech com>
Date: Sun, 9 Feb 2003 12:35:27 -0500


A packet switched network can be engineered against cascading failures
in a way that's hard for a circuit switched network. Every time you see a
random wait in a protocol, it's a good bet that the protocol writers were trying to
protect against the tight coupling that leads to cascading failures.
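As a rough illustration of the "random wait" idea, here is a minimal Python sketch of randomized exponential backoff. The names backoff_delay and retry_times, and the base and cap values, are made up for this example and are not taken from any particular protocol. The point is only that clients which would otherwise all retry at the same instant get spread out in time.

```python
import random

def backoff_delay(attempt, base=1.0, cap=60.0, rng=random):
    """Return a randomized wait in seconds before retry number `attempt` (0-based).

    Hypothetical helper: the ceiling doubles with each attempt up to `cap`,
    and the actual wait is drawn uniformly from [0, ceiling].
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)

def retry_times(n_clients, attempt, jitter, seed=0):
    """Waits chosen by n_clients that all failed at the same moment."""
    rng = random.Random(seed)
    if jitter:
        # Each client draws its own wait, so the retries are decoupled.
        return [backoff_delay(attempt, rng=rng) for _ in range(n_clients)]
    # Without jitter every client waits exactly the same time and they
    # all hit the recovering resource together.
    return [min(60.0, 1.0 * (2 ** attempt))] * n_clients

if __name__ == "__main__":
    print("distinct waits, no jitter:", len(set(retry_times(50, 2, jitter=False))))
    print("distinct waits, jitter:   ", len(set(retry_times(50, 2, jitter=True))))
```

Without the random draw, the synchronized retry is itself a load spike, which is the kind of tight coupling described above.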

Marshall Eubanks

On Sunday, February 9, 2003, at 10:07  AM, Jack Bates wrote:

From: "Stewart, William C (Bill), SALES"

I think the key is that the failures described in the paper
are caused by overload rather than other things -
too much demand for power blows out the generator,
and without it, the grid tries to get the power from the next
nearest generators, which overload and fail, and try to pull an
even larger amount from the _next_ nearest, etc.
So the bit about heterogeneity is probably referring to
the fact that some nodes are bigger or better-connected than others,
and are more likely to blow out a bunch of their neighbors when
they fail and shed a big load.
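The overload mechanism described above can be sketched as a toy simulation. This is not the model from the paper; the ring topology, the even split of shed load between the nearest surviving neighbors, and the name simulate_cascade are all assumptions made for illustration.

```python
def simulate_cascade(loads, capacities, first_failure):
    """Toy load-redistribution cascade on a ring of nodes.

    When a node fails, its load is split evenly between the nearest
    surviving node on each side; any node pushed above its capacity
    fails in turn. Returns node indices in the order they failed.
    """
    n = len(loads)
    loads = list(loads)          # work on a copy
    alive = [True] * n
    failed_order = []
    pending = [first_failure]    # nodes waiting to fail
    while pending:
        i = pending.pop(0)
        if not alive[i]:
            continue
        alive[i] = False
        failed_order.append(i)
        shed, loads[i] = loads[i], 0.0
        # Find the nearest surviving node in each direction around the ring.
        neighbors = []
        for step in (-1, 1):
            j = (i + step) % n
            while not alive[j] and j != i:
                j = (j + step) % n
            if alive[j] and j not in neighbors:
                neighbors.append(j)
        if not neighbors:
            break                # nothing left to take the load
        for j in neighbors:
            loads[j] += shed / len(neighbors)
            if loads[j] > capacities[j] and j not in pending:
                pending.append(j)
    return failed_order

if __name__ == "__main__":
    # Little headroom: one failure takes down the whole ring.
    print(simulate_cascade([8.0] * 5, [10.0] * 5, first_failure=0))
    # Ample headroom: the neighbors absorb the load and the failure stays local.
    print(simulate_cascade([8.0] * 5, [20.0] * 5, first_failure=0))
```

In this sketch the only thing separating a local failure from a total one is how much spare capacity the neighbors have, and a node that fails later sheds a larger load than the first one did.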

That's not really how Internet systems usually fail.

A prime example of this theory was the large network I was using back when IE5 first came out. They had one bad circuit, which overloaded an ATM circuit at another NAP and caused it to generate bit errors. Shutting down the second circuit overloaded both MAE circuits, effectively shutting down the network. However, it required manual intervention to create the full failure; otherwise TCP would have backed off until it was useless, effectively killing all connections going that path, but not causing an issue with other paths until the manual intervention of shutting down the circuit.

While in theory it was still a cascade failure, it was also poor
planning/policy on the part of the network to not be able to compensate in case of failure. The information provided may be partially inaccurate and is
only hearsay concerning actual outages and effects when various
interventions were tried; no hard fact. Thus it could be taken as solely my
conjecture and not actual fact.

