Nmap Development mailing list archives

Re: Web crawling library proposal
From: Patrik Karlsson <patrik () labb1 com>
Date: Wed, 19 Oct 2011 22:02:57 -0400

Hey Paulino,

Nice work. I spotted a few things while running http-email-harvest.
First, it didn't finish even though I let it run for a long time; I'm not
sure why yet, but I'll let you know once I find out more.
I also noticed that it downloaded a lot of zip files from the web site;
these should probably be blacklisted for this particular script.
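The blacklist Patrik suggests could be a simple extension check applied before a URL is fetched. This is only a sketch; the helper and the extension table are hypothetical, not part of the posted library:

```lua
-- Hypothetical helper: skip URLs whose file extension is blacklisted
-- (binary downloads that an email harvester has no use for).
local BLACKLIST = { zip = true, gz = true, tar = true, exe = true, pdf = true }

local function is_blacklisted(url)
  -- Grab a trailing ".ext" if present; URLs with query strings or no
  -- extension simply fail the match and are not skipped.
  local ext = url:match("%.(%a+)$")
  return ext ~= nil and (BLACKLIST[ext:lower()] or false)
end
```

A script-specific blacklist (rather than a global one) would let http-email-harvest skip archives while a different script could still fetch them.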
Also, judging from the debug messages, the script didn't seem to chop off
anchor links, treating the following as two different URLs:
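One way to fix the anchor issue is to normalize each URL by dropping the fragment before it is queued or compared for duplicates. A sketch only; the crawler's real internals may differ:

```lua
-- Strip the fragment ("#...") so that /page.html and /page.html#top
-- are treated as the same URL when checking for duplicates.
local function strip_fragment(url)
  -- Parentheses drop gsub's second return value (the match count).
  return (url:gsub("#.*$", ""))
end
```

Since fragments are never sent to the server, two URLs differing only in their fragment always reference the same resource.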


On Wed, Oct 19, 2011 at 3:25 AM, Paulino Calderon
<paulino () calderonpale com> wrote:

Hi list,

I'm attaching my working copies of the web crawling library and a few
scripts that use it. It would be great if I could get some feedback.

All the documentation is here:

I'm including 3 scripts using the library:
* http-sitemap - Returns a list of URIs found. (Useful for target enumeration.)
* http-phpselfxss-scan - Returns a list of PHP files vulnerable to Cross
Site Scripting via injection into the variable $_SERVER["PHP_SELF"].
* http-email-harvest - Returns a list of the email addresses found on the
web server.

NSE scripts would start a crawling process and then get back a list of URIs
to be processed however the programmer wishes. For example, if we wanted to
write a script that looks for backup files, we could simply do:

 httpspider.crawl(host, port)
 local uris = httpspider.get_sitemap()
 local results = {}
 for _, uri in pairs(uris) do
   local obj = http.get(uri .. ".bak")
   if page_exists(obj) then  -- page_exists takes further parameters in practice
     results[#results + 1] = uri
   end
 end
There is still work to be done since spidering can be as complex as we want
but I wanted to get an idea of what are the most important things to add to
my TODO list for the following days.

I've also set up a vulnerable application that you are free to scan:

nmap -p80 --script http-sitemap,http-email-harvest,http-phpselfxss-scan
--script-args httpspider.path=/sillyapp/ calder0n.com

Starting Nmap 5.59BETA1 ( http://nmap.org ) at 2011-10-19 00:13 PDT
Nmap scan report for calder0n.com (
Host is up (0.14s latency).
80/tcp open  http
| http-email-harvest: info () domain com
|_nmap-dev () insecure org
| http-sitemap: URIs found:
| http://calder0n.com/sillyapp/secret/2.php
| http://calder0n.com/sillyapp/index.php
| http://calder0n.com/sillyapp/
| http://calder0n.com/sillyapp/secret/1.php?hola=1
| http://calder0n.com/sillyapp/one.php
| http://calder0n.com/sillyapp/1.php
| http://calder0n.com/sillyapp/two.php
| http-phpselfxss-scan: Vulnerable files:
| http://calder0n.com/sillyapp/secret/2.php/%27%22/%3E%
| http://calder0n.com/sillyapp/1.php/%27%22/%3E%3Cscript%


Paulino Calderón Pale
Web: http://calderonpale.com
Twitter: http://www.twitter.com/paulinocaIderon

Sent through the nmap-dev mailing list
Archived at http://seclists.org/nmap-dev/

