
Full Disclosure mailing list archives
Re: Google's robots.txt handling
From: Patrick Webster <patrick () aushack com>
Date: Thu, 13 Dec 2012 14:18:03 +1100
I wouldn't consider this an issue. If Google didn't do this, someone else would have (e.g. my rather old http://www.aushack.com/robanukah/ does it but I never bothered to index the web at large). I believe it was suggested to Shodan and others, so it was only a matter of time. If anything, Google is raising awareness by including it in their results (which I noticed cropped up about 6 months (?) ago). It is also worth noting that some organisations (and some security appliances) use it for bait. E.g. robots.txt = Disallow: /database.bak and as soon as a request is seen the IP is blacklisted permanently, because their behaviour either means that a spider is disobeying robots, or more than likely it is a human poking around where they shouldn't be. Should Google index it? Probably not - but then you're back to point #1, if they didn't someone else would have - and Google does a better job at it, so by all means... Interestingly, Google indexes their own sites https://www.google.com/search?q=inurl:robots.txt+filetype%3Atxt+site%3Agoogle.com. At least they're not playing double standards. My only questions is *why* did they suddenly decide to include this? I'd hazard a guess that they released new & improved indexing code, and this was a by-product of their improvement (perhaps related to the TXT file-type?). -Patrick _______________________________________________ Full-Disclosure - We believe in it. Charter: http://lists.grok.org.uk/full-disclosure-charter.html Hosted and sponsored by Secunia - http://secunia.com/
Current thread:
- Re: Google's robots.txt handling, (continued)
- Re: Google's robots.txt handling James Lay (Dec 10)
- Re: Google's robots.txt handling Gynvael Coldwind (Dec 10)
- Re: Google's robots.txt handling Benji (Dec 11)
- Re: Google's robots.txt handling Swair Mehta (Dec 11)
- Re: Google's robots.txt handling Stefan Edwards (Dec 11)
- Re: Google's robots.txt handling Gildseth, Tommy (Dec 11)
- Re: Google's robots.txt handling Gynvael Coldwind (Dec 10)
- Re: Google's robots.txt handling Philip Whitehouse (Dec 11)
- Re: Google's robots.txt handling Denis McMahon (Dec 11)
- Re: Google's robots.txt handling Lehman, Jim (Dec 12)
- Re: Google's robots.txt handling Christoph Gruber (Dec 12)
- Re: Google's robots.txt handling Patrick Webster (Dec 12)
- Re: Google's robots.txt handling Mario Vilas (Dec 13)
- Re: Google's robots.txt handling Philip Whitehouse (Dec 13)
- Re: Google's robots.txt handling Jeffrey Walton (Dec 13)
- Re: Google's robots.txt handling Julius Kivimäki (Dec 14)
- Re: Google's robots.txt handling Christoph Gruber (Dec 12)
- Re: Google's robots.txt handling James Lay (Dec 10)
- Re: Google's robots.txt handling Lehman, Jim (Dec 13)
- Re: Google's robots.txt handling Ulisses Montenegro (Dec 11)
- Re: Google's robots.txt handling Philip Whitehouse (Dec 11)