nanog mailing list archives

Re: Recommended DNS server for a medium 20-30k users isp


From: Matthew Petach via NANOG <nanog () lists nanog org>
Date: Mon, 11 Aug 2025 18:16:41 -0700

On Mon, Aug 11, 2025 at 3:40 PM William Herrin <bill () herrin us> wrote:

On Mon, Aug 11, 2025 at 3:08 PM Matthew Petach via NANOG
<nanog () lists nanog org> wrote:
often to the point where the final site is so overwhelmed by all the
traffic slamming it that it can't perform healthcheck/depreferencing
anymore.

Hi Matthew,

The unix "nice" command helps in this situation. It's counterintuitive
to run the critical Internet-facing service at a below-normal
priority, but it works. Under normal load there's no difference in
performance but when the server is overloaded administrative access
and health checks have priority access to the CPU.
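
For anyone who wants to try the "nice" approach from a wrapper script rather than the shell, here is a minimal Python sketch. It assumes a Unix host; run_niced is a hypothetical helper name, the increment of 10 is an arbitrary choice, and the child command is a stand-in for whatever DNS daemon you actually run.

```python
import os
import subprocess
import sys


def run_niced(cmd, increment=10):
    """Run cmd with its niceness raised by `increment` (lower CPU priority).

    Hypothetical helper for illustration only. Raising niceness needs no
    special privileges on Unix; lowering it again generally does.
    """
    def lower_priority():
        # Runs in the child between fork and exec, so only the child
        # (the Internet-facing service) is deprioritized.
        os.nice(increment)

    return subprocess.run(cmd, preexec_fn=lower_priority,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in for the real daemon: a child that reports its own niceness.
    child = [sys.executable, "-c", "import os; print(os.nice(0))"]
    result = run_niced(child)
    print("child niceness:", result.stdout.strip())
```

Management shells and health-check agents started normally keep their default priority, so they get the CPU ahead of the niced service when the box is saturated.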


Oh--I wasn't talking about the CPU having issues.  I was talking about
DDoSing your own site, with all the inbound worldwide traffic focusing
in on the last remaining site, hammering the network links to the point
of absolute congestion.  At that point, trying to send update messages
to depref the anycast routes for the site generally fails, leading to
an extended outage as all the traffic gets stuck trying to reach that
last site.

It's helpful to set a minimum number of anycast sites in your topology
automation systems, such that sites will no longer remove themselves
from rotation/distribution if doing so would reduce the count of active
sites below the minimum required site count.

Dynamic systems are great things, but as with most things in the world,
"all things in moderation" is a good motto to keep in mind.  Allow sites
to dynamically adjust, but only within reasonably set bounds.  Don't let
too many sites decide they need to shed load at once; the first several,
sure, but if the conditions continue, have a floor below which the
system stops trying to react, and instead holds steady while paging a
human to look at the bigger-picture problem, before the entire system
goes offline due to the lemmings of automation all chasing one another
off the proverbial cliff.
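
As a rough illustration of the floor described above, here is a small Python sketch. It is not any particular automation system's API: Decision, decide_withdrawal, and min_active_sites are made-up names, and it assumes each site can get a trustworthy count of currently active sites from somewhere out of band.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    withdraw: bool    # stop announcing the anycast route from this site
    page_human: bool  # escalate instead of reacting automatically
    reason: str


def decide_withdrawal(site_unhealthy, active_sites, min_active_sites):
    """Decide whether a site may pull itself out of rotation.

    Hypothetical logic: a site may withdraw only if the number of
    remaining active sites stays at or above the configured floor.
    """
    if not site_unhealthy:
        return Decision(False, False, "site healthy")
    if active_sites - 1 < min_active_sites:
        # At the floor: hold steady and ask a human to look.
        return Decision(False, True, "at floor; holding steady")
    return Decision(True, False, "withdrawing")


if __name__ == "__main__":
    # Five sites all decide they are overloaded, one after another,
    # with a floor of two active sites.
    active = 5
    for site in range(1, 6):
        d = decide_withdrawal(True, active, min_active_sites=2)
        if d.withdraw:
            active -= 1
        print(f"site {site}: {d.reason} (active sites: {active})")
```

In that run the first three sites withdraw, and the last two stay in rotation and page someone, so the system never drops below two active sites no matter how bad the health checks look.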

Fortunately for me, the search engine caches have long since purged out
the evidence of how some of these lessons were learned.   ^_^;;

Matt
_______________________________________________
NANOG mailing list 
https://lists.nanog.org/archives/list/nanog () lists nanog org/message/5YVQATBGZAKV7IPFFJFTT5CIRO3RWZWL/
