nanog mailing list archives

Re: Correctly dealing with bots and scrapers.


From: Tom Beecher via NANOG <nanog () lists nanog org>
Date: Wed, 16 Jul 2025 15:43:51 -0400

As Chris states, broad IP-based blocking is unlikely to be very effective,
and is likely to cause more problems down the line anyway.

The slightly more 'honorable' crawlers will respect robots.txt,
so you can block their UAs there.
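
As a rough sketch of what a per-UA robots.txt block looks like to a
well-behaved crawler, the snippet below feeds a small ruleset to Python's
standard urllib.robotparser. The "ExampleBot" user agent and the /private/
path are placeholders invented for illustration, not names from this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "ExampleBot" is a placeholder UA, not a real crawler.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, path: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a crawler that honors robots.txt may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for ua in ("ExampleBot", "SomeOtherBot"):
        print(ua, is_allowed(ua, "/index.html"))
```

This only models crawlers that choose to obey the file; anything that
ignores robots.txt is unaffected.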

Fail2ban is a very good option right now. It will be even better if
nepenthes eventually integrates with it. Then you can have some real fun.


On Wed, Jul 16, 2025 at 3:39 PM Andrew Latham via NANOG <nanog () lists nanog org> wrote:

Chris

Spot on. I am getting the feeling this is where a geo-IP service that
offers defined "eyeball networks" to allow shows its value.
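
A minimal sketch of the allowlist idea, assuming the "eyeball network"
prefixes arrive as a plain list of CIDR strings. The prefixes below are
documentation ranges standing in for a real feed, and is_eyeball is a
hypothetical helper name.

```python
import ipaddress

# Placeholder prefixes (documentation ranges), standing in for a real
# "eyeball network" feed from a geo-IP provider.
ALLOWED_NETWORKS = [
    ipaddress.ip_network(cidr) for cidr in ("192.0.2.0/24", "2001:db8::/32")
]


def is_eyeball(ip: str, networks=ALLOWED_NETWORKS) -> bool:
    """Return True if `ip` falls inside any allowlisted prefix."""
    addr = ipaddress.ip_address(ip)
    # Compare only against prefixes of the same IP version.
    return any(addr.version == net.version and addr in net for net in networks)


if __name__ == "__main__":
    for candidate in ("192.0.2.10", "198.51.100.7", "2001:db8::1"):
        print(candidate, is_eyeball(candidate))
```

A linear scan is fine for a short list; a large feed would likely want a
prefix tree or the web server's or firewall's native set support instead.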

On Wed, Jul 16, 2025 at 12:57 PM Chris Adams via NANOG <nanog () lists nanog org> wrote:

Once upon a time, Marco Moock <mm () dorfdsl de> said:
Place a link to a file that is hidden to normal people. Exclude the
directory via robots.txt.

Then use fail2ban to block all IP addresses that poll the file.
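
The detection half of that honeypot setup can be sketched as below: scan
access-log lines for requests to the hidden path and collect the client
addresses. The /trap/ prefix is a made-up example, and the regex assumes
the common/combined log format. In practice fail2ban would do this
matching with its own filter; this only illustrates the logic.

```python
import re

# Hypothetical honeypot directory, also listed under Disallow in robots.txt.
HONEYPOT_PREFIX = "/trap/"

# Assumes common/combined log format: ip ident user [time] "METHOD path ..."
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>[A-Z]+) (?P<path>\S+)'
)


def honeypot_hits(lines, prefix=HONEYPOT_PREFIX):
    """Return the set of client IPs that requested anything under `prefix`."""
    hits = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("path").startswith(prefix):
            hits.add(match.group("ip"))
    return hits


if __name__ == "__main__":
    sample = [
        '203.0.113.5 - - [16/Jul/2025:15:00:00 -0400] "GET /trap/x.html HTTP/1.1" 200 12',
        '198.51.100.9 - - [16/Jul/2025:15:00:01 -0400] "GET /index.html HTTP/1.1" 200 512',
    ]
    print(honeypot_hits(sample))
```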

The problem with a lot of the "AI" scrapers is that they're apparently
using botnets and will often only make a single request from a given IP
address, so reactive blocking doesn't work (and can cause other issues,
like trying to block 100,000 IPs, which fail2ban for example doesn't
really handle well).
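
To illustrate why rate-based reactive blocking misses this pattern, here is
a small sketch of a per-IP request threshold. The threshold value and the
synthetic addresses are assumptions chosen for the example; they are not
measurements from real traffic.

```python
from collections import Counter


def ips_over_threshold(request_ips, max_requests=5):
    """Return the IPs that made more than `max_requests` requests."""
    counts = Counter(request_ips)
    return {ip for ip, n in counts.items() if n > max_requests}


if __name__ == "__main__":
    # Synthetic botnet: 1000 distinct addresses, one request each.
    botnet = [f"10.{i // 256}.{i % 256}.1" for i in range(1000)]
    # One conventional noisy client making 10 requests.
    noisy = ["203.0.113.5"] * 10

    flagged = ips_over_threshold(botnet + noisy)
    print(len(flagged), "of", len(set(botnet + noisy)), "addresses flagged")
```

Only the single noisy client crosses the threshold; every one-request
address stays below it, and banning them individually would still mean
carrying one rule per address.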
--
Chris Adams <cma () cmadams net>
_______________________________________________
NANOG mailing list

https://lists.nanog.org/archives/list/nanog () lists nanog org/message/AFJF4UQJZW6ALTY6SA7OHBN2AZC72SZQ/



--
- Andrew "lathama" Latham -
