Full Disclosure mailing list archives

Google Sitemap Directory and File Enumeration 0day
From: Adam Muntner <adam.muntner () quietmove com>
Date: Thu, 12 Oct 2006 08:12:13 -0700

While playing with the Google Webmaster tools, I came across the
“Sitemap” XML protocol which is used to inform search engines about
pages on your website that are available for crawling.

The protocol spec is at
https://www.google.com/webmasters/sitemaps/docs/en/protocol.html

Think of this as the anti-robots.txt - instead of URLs listed under
Disallow: directives, you have URLs for which the web administrator is
saying “Index me.”

Sitemap makes anti-forensics Google hacking more productive. It’s only a
matter of time before enumeration tools like Wikto use it the same way
that they use robots.txt to locate files.
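
To make that concrete, here is a rough Python sketch of how little work
it takes to pull every listed URL out of a Sitemap file. The function
name sitemap_urls and the SAMPLE document are made up for illustration,
and the namespace in the sample is only an example value; the code
ignores namespaces and simply collects every <loc> element.

```python
import xml.etree.ElementTree as ET


def sitemap_urls(xml_text):
    """Return the text of every <loc> element, whatever the XML namespace."""
    root = ET.fromstring(xml_text)
    urls = []
    for elem in root.iter():
        # Namespaced tags look like "{namespace}loc"; keep only the local name.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and elem.text:
            urls.append(elem.text.strip())
    return urls


# Illustrative sample only; these URLs are invented.
SAMPLE = """<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/admin/backup.zip</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in sitemap_urls(SAMPLE):
        print(url)
```

Run it against your own site's Sitemap file to see exactly what you are
advertising.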

There are two interesting security-related issues with Sitemap, one
significantly more interesting than the other.

First, you can find pages with it that aren’t indexed by Google. The
Sitemap protocol spec says “Using this protocol does not guarantee that
your webpages will be included in search indexes. (Note that using this
protocol will not influence the way your pages are ranked by Google.)”
This is the lesser of the interesting points.

Far more interesting - you can find pages in the sitemap.xml which would
not be indexed if it weren’t for the Sitemap protocol…

You can find some interesting stuff by querying for Sitemap files.

".htaccess" inurl:sitemap filetype:xml
"global.asa" inurl:sitemap filetype:xml

Whew.

There are a LOT of automagic-generation Sitemap scripts out there which
create Sitemap.xml files not by spidering a site, as they should… but by
reading the contents of directories inside the web root from the local
filesystem and creating the Sitemap.xml file from that.

Ouch.
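
If you run one of these generators, a quick audit of your own
Sitemap.xml is cheap. The sketch below is only an illustration: the
deny-list names and suffixes (SENSITIVE_NAMES, SENSITIVE_SUFFIXES) and
the function name suspicious_entries are my own assumptions, not part
of any tool or of the Sitemap spec, and the sample URLs are invented.
It flags <loc> entries whose file name looks like something that was
never meant to be published.

```python
import posixpath
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical deny-list; extend it for your own environment.
SENSITIVE_NAMES = {".htaccess", ".htpasswd", "global.asa", "web.config"}
SENSITIVE_SUFFIXES = (".bak", ".inc", ".sql", ".old")


def suspicious_entries(xml_text):
    """Return the sitemap URLs whose file name matches the deny-list."""
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        # Only look at <loc> elements, ignoring any XML namespace prefix.
        if elem.tag.rsplit("}", 1)[-1] != "loc" or not elem.text:
            continue
        url = elem.text.strip()
        # Compare the last path component, case-insensitively.
        name = posixpath.basename(urlparse(url).path).lower()
        if name in SENSITIVE_NAMES or name.endswith(SENSITIVE_SUFFIXES):
            hits.append(url)
    return hits


# Illustrative sample of what a filesystem-walking generator might emit.
AUDIT_SAMPLE = """<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url><loc>http://www.example.com/index.html</loc></url>
  <url><loc>http://www.example.com/.htaccess</loc></url>
  <url><loc>http://www.example.com/Global.asa</loc></url>
  <url><loc>http://www.example.com/db/dump.sql</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in suspicious_entries(AUDIT_SAMPLE):
        print("review:", url)
```

A deny-list like this only catches the obvious cases. The real fix is a
generator that spiders the site instead of walking the filesystem.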

I don’t blame Google - I think the Sitemap protocol is a pretty good
idea, even though the guys at the Black Hat SEO forum think it doesn't
help you get better rankings. It tells the Google search engine where to
find pages which otherwise might not get indexed. 

Due to a plethora of rotten Sitemap.xml generation scripts, this is a
directory and file enumeration issue that is going to be with us for a
long, long time to come.


     Adam Muntner, CISSP | 
                 Partner | 
         QuietMove, Inc. | w: http://www.quietmove.com
Securing the Nexus Between People, Technology, and Information.
                       ((Q))


_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.grok.org.uk/full-disclosure-charter.html
Hosted and sponsored by Secunia - http://secunia.com/
