
Nmap Development mailing list archives
[NSE] http-archive.nse
From: George Chatzisofroniou <sophron () latthi com>
Date: Mon, 23 Sep 2013 11:25:58 +0300
The attached script crawls through previous versions of the target website (fetched from archive.org) and extracts the links from them. It then checks whether those links still exist today and outputs the results. It is useful for discovering hidden pages that were used in the past but still exist on the target. It also gives an overview of the target website over time.

Unfortunately, the script is not working properly right now: the archive.org web application keeps changing its page contents, so the patterns the script matches against need to be updated. I am posting it today because it is the last day of GSoC. Consider this mail a checkpoint. I will make it stable in the next few days.

When it was working properly, part of the output against insecure.org looked like this:

| web.archive.org/web/19981205075750/http://www.insecure.org/
|
| Alive links:
| insecure.org/myworld.html
| insecure.org/reading.html
| insecure.org/sploits.html
| insecure.org/credits.html
|
|
| web.archive.org/web/19990125100235/http://www.insecure.org/
|
| Alive links:
| insecure.org/nmap/index.html
|
|
| web.archive.org/web/20000301165730/http://www.insecure.org/
|
| Alive links:
| insecure.org/sploits_solaris.html
| insecure.org/nmap/nmap-fingerprinting-article.html
| insecure.org/nmap/index.html#download
|
|
| web.archive.org/web/20020124070013/http://www.insecure.org/
|
| Alive links:
| insecure.org/sploits_linux.html
| insecure.org/nmap/

Note that this script is pretty intrusive for both archive.org and the target website, which is why there are maxyears and singleyears options to limit the crawling.

Patrick and I think that we can split the logic of this script into at least three smaller scripts:

* http-archive, which fetches all archives in an interval.
* http-archive-liveness, which reports 'alive' and 'dead' links from archives.
* http-archive-hidden, which reports 'hidden' links from archives.

--
George Chatzisofroniou
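For readers who want to see the extract-then-check idea in isolation, here is a minimal Python sketch. It is not the attached NSE script (which is written in Lua) and makes several assumptions: that archived links carry a /web/<14-digit timestamp>/ prefix, that a link counts as 'alive' when the target answers with HTTP 200, and that the caller supplies the function that actually performs the request. The names archived_links, alive_links, status_of and max_checks are hypothetical; max_checks in particular is only an illustrative cap and does not model the maxyears or singleyears options.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumed shape of an archived link: /web/<14-digit timestamp>/<original URL>,
# optionally preceded by the web.archive.org host.
_WAYBACK_PREFIX = re.compile(r"^(?:https?://web\.archive\.org)?/web/\d{14}[a-z_]*/")


class _HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def archived_links(snapshot_html, target_host):
    """Return the links in an archived page that point at target_host.

    Links are returned as host + path, without the archive prefix.
    Fragments and query strings are dropped in this sketch.
    """
    parser = _HrefCollector()
    parser.feed(snapshot_html)
    links = []
    for href in parser.hrefs:
        original = _WAYBACK_PREFIX.sub("", href)
        if original == href:
            continue  # no archive prefix, e.g. a link to archive.org itself
        parts = urlsplit(original)
        host = (parts.hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host != target_host:
            continue  # archived link to some other site
        link = host + (parts.path or "/")
        if link not in links:
            links.append(link)
    return links


def alive_links(links, status_of, max_checks=None):
    """Return the links for which status_of(link) reports HTTP 200.

    status_of is a caller-supplied function mapping a link to a status code.
    max_checks (hypothetical) caps how many requests are made.
    """
    return [link for link in links[:max_checks] if status_of(link) == 200]
```

In practice status_of would wrap a real HTTP request against the target; keeping it as a parameter lets the extraction logic be exercised offline, which matters when the archive pages themselves keep changing.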
Attachment: http-archive.nse
_______________________________________________
Sent through the dev mailing list
http://nmap.org/mailman/listinfo/dev
Archived at http://seclists.org/nmap-dev/