nanog mailing list archives

Re: Resilient Internet


From: Jay Acuna via NANOG <nanog () lists nanog org>
Date: Mon, 15 Sep 2025 12:28:57 -0500

On Mon, Sep 15, 2025 at 7:07 AM Mike Hammett via NANOG
<nanog () lists nanog org> wrote:
> *nods* Well, and that's the rub. Their expectations don't match any
> Internet SLA I've ever seen

The implied expectation is completely infeasible for the provider of
a basic internet line. Security updates alone, which are necessary for
CPEs and similar equipment, would typically interrupt connections for
more than a second every 6 months.

To reasonably offer a service level anywhere near "1 second maximum
every 6 months", let alone less, you are looking at something more like
point-to-point path-protected circuits, or a dual-connection
disparate-path dark fiber build directly between two locations, bought
from a telecoms provider, and not an internet connection.
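
To put a number on that target, here is a minimal Python sketch that
converts "1 second per 6 months" into an availability fraction and
compares it with the commonly quoted "five nines". It assumes "6 months"
means half of a 365-day year; the function and variable names are
illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def availability(downtime_seconds: float, period_days: float) -> float:
    """Fraction of the period during which the service is up."""
    period_seconds = period_days * SECONDS_PER_DAY
    return 1.0 - downtime_seconds / period_seconds


# Assumption: "6 months" is taken as half of a 365-day year.
six_months_days = 365 / 2

# The expectation from the thread: at most 1 second down per 6 months.
target = availability(1, six_months_days)

# For comparison: downtime that 99.999% availability permits over the
# same 6-month period.
five_nines = 0.99999
allowed_at_five_nines = (1 - five_nines) * six_months_days * SECONDS_PER_DAY

print(f"target availability: {target:.7%}")
print(f"downtime allowed at 99.999%: {allowed_at_five_nines:.0f} s")
```

Under that assumption the target works out to roughly 99.999994%
availability, while five nines would still permit a bit over two and a
half minutes of downtime in the same period.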

An infrequent one-off 1 second interruption typically doesn't count
as an outage.  It may also be something that cannot be diagnosed without
major maintenance; that is part of the environment IP exists in, and
the internet itself is a large network with multiple
unstable or changing paths through other providers' networks.
Most peers offer best-effort packet delivery, not a promise of 0% loss.

As described, the expectation is nearly equivalent
to guaranteeing a lossless connection between point A and B.
But point B is across the internet, which means
part of the path to B is outside the control of the provider of that line
in the first place, and parts of that path at different points are
on infrastructure which is shared and overcommitted by both point
A's network providers and point B's network providers.
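
The effect of a path that crosses several networks can be sketched with
simple series-availability arithmetic: if every segment has to be up for
A to reach B, the end-to-end availability is the product of the segment
availabilities. The sketch below assumes segments fail independently,
and the per-segment figures are hypothetical placeholders, not
measurements of any real provider.

```python
from math import prod


def path_availability(segment_availabilities):
    """Availability of a path that needs every segment to be up.

    Assumes the segments fail independently of each other.
    """
    return prod(segment_availabilities)


# Hypothetical figures for illustration only.
segments = {
    "Provider A access line": 0.9999,
    "networks in the middle": 0.9995,
    "Provider B network": 0.9999,
    "host at point B": 0.999,
}

end_to_end = path_availability(segments.values())
weakest = min(segments.values())

print(f"weakest segment: {weakest:.4%}")
print(f"end to end:      {end_to_end:.4%}")
```

The end-to-end figure always comes out below the weakest segment, so
Provider A cannot deliver a better result to point B than the rest of
the path allows, however good its own line is.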

Also, point A and point B are both host devices which
are subject to the chance of a local software or hardware issue.

That means when there is a "disruption", the rational thing for
Provider A to do is to assume the issue is with point B, the
network in the middle, or end devices, until proven otherwise.
Providers do not assume an issue they have not detected is with
their own service or network; that has to be proven, and the
proof may be too difficult to accomplish.

Provider B or the networks in the middle have no reason to
adjust their service to accommodate the special requirements of
Provider A or Provider A's end users, unless someone first purchases
additional dedicated infrastructure and contracts with each
provider between point A and B.

For a network line provider to ensure this level of service,
that provider realistically has to have failover circuits between
point A and B, with dedicated infrastructure on the whole path
that is not dependent on 3rd parties.

For example, ring path-protected point-to-point Ethernet or
SONET-based circuits with established bandwidth reservations
between A and B.
Even with those, you can still expect a few hours of outage per year
for maintenance.
And there is always the chance of that one-off double fibre cut every
few years, and similar.

The risk level depends on many variables.

There are SDN solutions that overlay private networks
with redundant forwarding on top of several internet connections.

For those, _both_ point A and point B require multiple internet
connections.

The multiple internet connections still have simultaneous failure scenarios
during major internet events.

You can expect more protection against the different possible causes
of outages, but the risk of them does not go to 0.0%.
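
Why the risk never reaches zero can be sketched with a simple
common-cause model: redundant links reduce the chance that all of them
fail independently at the same moment, but an event that takes every
link down together puts a floor under the result. The model and all
probabilities below are hypothetical assumptions chosen for
illustration.

```python
from math import prod


def redundant_unavailability(link_unavailabilities, common_cause):
    """Probability that every redundant link is down at the same time.

    common_cause is the probability of an event that takes all links
    down together (for example a major internet event). Outside of that
    event the links are assumed to fail independently.
    """
    independent = prod(link_unavailabilities)
    return common_cause + (1 - common_cause) * independent


# Hypothetical numbers: each link is down 0.1% of the time, and a
# shared event affects all links 0.001% of the time.
two_links = redundant_unavailability([1e-3, 1e-3], common_cause=1e-5)
three_links = redundant_unavailability([1e-3] * 3, common_cause=1e-5)

print(f"two links:   {two_links:.3e}")
print(f"three links: {three_links:.3e}")
```

Adding a third link helps a little, but in this model the result can
never drop below the common-cause probability, no matter how many
connections are stacked up.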


Mike Hammett
--
-JA
_______________________________________________
NANOG mailing list 
https://lists.nanog.org/archives/list/nanog () lists nanog org/message/W6RBDQLPBDUAFV7AOBVJDSV6Z6OFK35N/
