nanog mailing list archives

Re: Monitoring other people's sites (Was: Website for ipv6.level3.com returns "HTTP/1.1 500 Internal Server Error")


From: Nick Hilliard <nick () foobar org>
Date: Tue, 20 Mar 2012 15:53:42 +0000

On 20/03/2012 14:54, Jeroen Massar wrote:
> For everybody who is "monitoring" other people's websites, please please
> please, monitor something static like /robots.txt as that can be
> statically served and is kinda appropriate as it is intended for robots.

Depends on what you are monitoring.  If you're looking for layer 4 ipv6
connectivity then robots.txt is fine.  If you're trying to determine
whether a site is serving active content on ipv6 and not serving http
errors, then it's pretty pointless to monitor robots.txt - you need to
monitor /.

> Oh, and of course do set the User-Agent to something logical and, to be
> super nice, include a contact address, so that people who do check their
> logs once in a while for fishy things at least know what is happening
> there and that it is not a process run amok or something.

Good policy, yes.  Some robots do this but others don't.
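For what it's worth, both points (forcing the request over IPv6 so there is no silent v4 fallback, and identifying the monitor with a contact address in the User-Agent) can be sketched in a few lines of Python. The host, monitor name and contact address below are placeholders, not anything from this thread:

```python
import http.client
import socket

def probe_v6(host, path="/", port=80, contact="noc@example.net"):
    """Fetch `path` from `host` over IPv6 only; return the HTTP status."""
    # Resolve an AAAA record explicitly so the probe cannot silently
    # fall back to IPv4 and report a false positive.
    addr = socket.getaddrinfo(host, port, socket.AF_INET6,
                              socket.SOCK_STREAM)[0][4][0]
    conn = http.client.HTTPConnection(addr, port, timeout=10)
    try:
        # HEAD keeps the load on the monitored site low; the User-Agent
        # names the monitor and carries a contact address so operators
        # reading their logs know who is polling and why.
        conn.request("HEAD", path, headers={
            "Host": host,
            "User-Agent": f"v6-monitor/1.0 (+mailto:{contact})",
        })
        return conn.getresponse().status
    finally:
        conn.close()
```

Polling `/` with this rather than /robots.txt is what catches a site that has layer-4 IPv6 connectivity but serves 500s on its actual content.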

> Of course, asking before doing tends to be a good idea too.

Depends on the scale.  I'm not going to ask permission to poll someone
else's site every 5 minutes, and I would be surprised if they asked me the
same.  OTOH, if they were polling to the point that it was causing issues,
that might be different.

> The IPv6 Internet already consists far too much of monitoring by
> pulling pages and doing pings...

"way too much" for what?  IPv6 is not widely adopted.

> Fortunately that should heavily change in a few months.

We've been saying this for years.  World IPv6 day 2012 will come and go,
and things are unlikely to change a whole lot.  The only thing that World
IPv6 day 2012 will ensure is that people whose ipv6 configuration actively
interferes with their daily Internet usage will be self-flagged and their
configuration issues can be dealt with.

>  (who noticed a certain s....h company performing latency checks against
> one of his sites, which was no problem, but the fact that they were
> causing almost more hits/traffic/load than normal clients was a bit on
> the much side)

If that page really is that top-heavy, then I'd suggest putting a cache in
front of it; nginx is good for this sort of thing.
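Something along these lines is usually enough to keep monitors and pollers off the backend entirely; paths, names and timers below are illustrative, not a tested config:

```nginx
# Hypothetical nginx caching proxy in front of a heavy backend page.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=site:10m
                 max_size=100m inactive=10m;

server {
    listen [::]:80;                 # answer on IPv6
    server_name www.example.com;

    location / {
        proxy_pass http://127.0.0.1:8080;   # the top-heavy backend
        proxy_cache site;
        proxy_cache_valid 200 5m;   # repeated polls hit the cache
        proxy_cache_use_stale error timeout updating;
    }
}
```

With a 5-minute validity, a monitor polling every 5 minutes generates at most one backend hit per cycle, regardless of how many probes it sends.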

Nick
