nanog mailing list archives
Re: Captchas on Cloudflare-Proxied Sites
From: "Constantine A. Murenin via NANOG" <nanog () lists nanog org>
Date: Wed, 2 Jul 2025 17:45:17 -0500
On Wed, 2 Jul 2025 at 14:38, William Kern via NANOG <nanog () lists nanog org> wrote:
> On 7/1/25 8:22 PM, Constantine A. Murenin via NANOG wrote:
>> But the bots are not a problem if you're doing proper caching and throttling.
>
> Not all site traffic is cacheable or can be farmed out to a CDN.
That's just an excuse for inadequate planning and misplaced priorities. If you start with the requirement that it'd all be cacheable, then EVERYTHING can be cached, especially for the ecommerce and the catalogue stuff. OSS nginx is free and relatively easy to use, with excellent documentation, and it offers superb caching functionality. You don't need an external CDN to do the caching. You can even cache search results, especially for users who aren't logged in. Why would you NOT? If, to quote arstechnica, "a GitLab link is shared in a chat room", why would you want ANYONE to wait an extra millisecond, let alone "having to wait around two minutes" for Anubis proof-of-work, to access the result, if the result was already computed and known, because it was already assembled for the person who posted the link in the first place? These things could even be cached in the app itself, and even shared between all users, logged in or not, if performance and web scale are paramount. Else, it can be architected to be cacheable with nginx.
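To make the "cached in the app itself" option concrete, here is a minimal sketch of a shared page cache for anonymous requests. It is an illustration under stated assumptions, not how nginx or any particular application implements caching: the `PageCache` class, the 60-second TTL and the `render` callback are all hypothetical names chosen for this example.

```python
import time
from urllib.parse import urlsplit, parse_qsl, urlencode


class PageCache:
    """Tiny in-memory TTL cache for rendered pages, shared by all anonymous users."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds      # assumption: a short fixed TTL is acceptable
        self.clock = clock          # injectable so expiry can be tested
        self.store = {}             # cache key -> (expiry time, body)

    @staticmethod
    def key(url):
        # Sort query parameters so equivalent URLs share one cache entry.
        parts = urlsplit(url)
        query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
        return f"{parts.path}?{query}"

    def get_or_render(self, url, render):
        k = self.key(url)
        now = self.clock()
        hit = self.store.get(k)
        if hit is not None and hit[0] > now:
            return hit[1]           # already computed for an earlier visitor
        body = render(url)          # the expensive path runs once per TTL window
        self.store[k] = (now + self.ttl, body)
        return body
```

With this shape, the second person to open a shared link within the TTL gets the stored body without the page being rendered again.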
> Dynamic (especially per-session) requests (think ecommerce) can't be cached. Putting an item into the shopping cart is typically one of the more resource-driven events. We have seen bots that will select the buy button and put items into the cart, possibly to see any discounts given. You end up with hundreds of active 'junk' cart sessions on a small site that was not designed for that much traffic.
Why is the simple act of placing an item in a shopping cart a resource-driven event? This can literally be done on the front-end without any server requests at all, let alone resource-driven ones. If you DO store an expensive session on the server for this, instead of in the browser, then you also likely expire said carts even for users who intended to return and complete the purchase. Does the owner know? Yes, it's more work to have a separate cookie cart for anonymous users, but if that's a business requirement, why not? This way, even if someone comes back many months later, if they've never cleared the cookies, their cart will still be there, waiting for them, at zero cost to your shopping cart database. Isn't that how it should be? Stores that empty your cart in 3 days, or which require captchas for basic product viewing, are the best examples of misplaced priorities. I usually click the X button before they can complete their captcha. And I won't bother adding anything to the shopping cart again if the store is known for data loss.
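As a rough sketch of the cookie-cart idea for anonymous users: the cart is just a small mapping of SKU to quantity, serialised into a cookie value, so nothing is stored server-side until checkout. In a real store this logic would more likely live in browser-side JavaScript; Python is used here only to show the data handling. The function names and the 4000-byte limit are assumptions made for this example.

```python
import base64
import json

MAX_COOKIE_BYTES = 4000  # assumption: stay under the common ~4 KB per-cookie limit


def add_item(cart, sku, qty=1):
    """Return a new cart dict with qty of sku added; no server state is touched."""
    updated = dict(cart)
    updated[sku] = updated.get(sku, 0) + qty
    return updated


def encode_cart(cart):
    """Serialise the cart into a cookie-safe ASCII string."""
    raw = json.dumps(cart, separators=(",", ":"), sort_keys=True).encode("utf-8")
    value = base64.urlsafe_b64encode(raw).decode("ascii")
    if len(value) > MAX_COOKIE_BYTES:
        raise ValueError("cart too large for a single cookie")
    return value


def decode_cart(value):
    """Parse a cookie value back into a cart; treat anything malformed as empty."""
    try:
        data = json.loads(base64.urlsafe_b64decode(value.encode("ascii")))
    except (ValueError, UnicodeError):
        return {}
    if not isinstance(data, dict):
        return {}
    # Keep only positive integer quantities; the cookie is untrusted client input.
    return {str(k): int(v) for k, v in data.items() if isinstance(v, int) and v > 0}
```

Since the cookie is client-controlled, it should carry only SKUs and quantities; prices and availability would still be looked up server-side at checkout.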
> Forcing the bot (or a legit customer) to create yet another login to create a cart can help, but that generates pushback from the store owner. The owners don't want that until the payment details phase, or they want purchasers to be able to do a guest checkout. They will point out that on amazon.com you don't have to log in to put an item in the cart.
>
> Rate limiting is not effective when they come from different IP ranges. The old days of using a Class C (/24) as a rate limit key are over. The bots come from all over the provider space (often Azure), but can be from any of the larger providers and often from different regions. If you throttle EVERYONE, then legit customers can get locked out with 429s or even 503s.
>
> And, as has been pointed out, relying on the browser string is no longer effective. They use common strings and change them dynamically.
>
> Sincerely,
> William Kern
> PixelGate Networks.

Rate limiting would make sense for expensive things like search (and `git blame`), which should also be combined with caching, especially if the results aren't even personalised with AI or past purchases/views. Things like adding an item to a cart should be a local event for anonymous users, so it should be impossible to rate-limit that. Product listings and categories should 100% be cached, absolutely no exceptions. Search pages also absolutely have to be cached; I dunno whoever thought of the brilliant idea that search somehow isn't cacheable, especially on all those sites where it's 100% deterministic and identical for all users.

If someone wants to get the entire site of all the products, I don't see a good reason to preclude that. In the old days, any vendor would be happy to send you the entire catalogue of their offerings, all at once, in print form in the US for major brands, and in Microsoft Excel for the more local vendors, but now suddenly we want to prevent people from viewing several products at a time, or being able to shop the way they want to, or see the prices for more than a handful of products at a time?! Misplaced priorities 100%.

Best regards,
Constantine.
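One way to read the suggestion above that rate limiting for expensive endpoints such as search be combined with caching: cached results are served to everyone without any throttling, and only cache misses draw from a shared token bucket. This is a sketch of that interpretation, not a method described in the thread. `TokenBucket`, `handle_search` and the plain-dict cache are hypothetical, and a single global bucket (rather than a per-IP key) is an assumption chosen because per-/24 keys are described above as ineffective.

```python
import time


class TokenBucket:
    """Global token bucket: refills at rate_per_sec, holds at most burst tokens."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = float(burst)
        self.clock = clock          # injectable so refill can be tested
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def handle_search(query, cache, bucket, run_search):
    """Return (status, body). Cache hits are free; only misses spend a token."""
    key = " ".join(query.lower().split())   # normalise case and whitespace
    if key in cache:
        return 200, cache[key]
    if not bucket.allow():
        return 429, "search is busy, retry shortly"
    cache[key] = run_search(key)
    return 200, cache[key]
```

Under this arrangement a burst of distinct uncached queries can still hit 429, but anyone requesting a result that has already been computed is never throttled.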
_______________________________________________ NANOG mailing list https://lists.nanog.org/archives/list/nanog () lists nanog org/message/KA2KKQUKLYTXC3KR2JHVKZIZSSUGHY2C/
