[patch] Bug in httpspider.LinkExtractor
From: Daniel Miller <bonsaiviking () gmail com>
Date: Tue, 22 May 2012 11:03:47 -0500
Two bugs and a code structure improvement in this patch to the
httpspider library, found while working with the http-chrono script.
First bug: the LinkExtractor portion of httpspider doesn't check for a
negative maxdepth (which indicates no limit), so it rejects all links.
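To illustrate the intended behavior, here is a rough sketch of the depth
check. This is not the code from the patch: the helper name depth_allowed
and the use of <= for the in-range comparison are my assumptions.

```lua
-- Hypothetical helper, not part of httpspider: decides whether a link found
-- at `depth` may be followed given the configured `maxdepth`.
local function depth_allowed(depth, maxdepth)
  -- A negative maxdepth means "no limit", so every depth is acceptable.
  if maxdepth < 0 then
    return true
  end
  -- Otherwise the link must not exceed the limit (assumed inclusive here).
  return depth <= maxdepth
end
```

Without the first branch, any non-negative depth compares greater than a
negative maxdepth and every link is rejected, which is the bug described.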
Second bug: the withinhost and withindomain matching functions throw an
error when given a URL that has no host portion.
Example: <a href="http://">link</a>. I added a test for parsed_u.host
== nil, on the assumption that a URL with no host should fail both checks.
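A minimal sketch of the guard, assuming parsed_u is a table with an
optional host field as described above. The function within_host and its
case-insensitive comparison are stand-ins for illustration, not the
library's actual matching logic.

```lua
-- Hypothetical stand-in for the withinhost check. `parsed_u` mimics a parsed
-- URL table whose `host` field may be missing (e.g. for href="http://").
local function within_host(parsed_u, base_host)
  -- Guard first: calling string methods on a nil host would raise an error.
  if parsed_u.host == nil then
    return false
  end
  return parsed_u.host:lower() == base_host:lower()
end
```

With the guard in place, a link such as the example above is simply
rejected instead of aborting the crawl with an error.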
Lastly, the attached patch moves the function definition for
validate_link out of the innermost loop of the LinkExtractor.parse
function. It had been declared as a closure over url, then called on the
very next line. I chose to move it to a method of the LinkExtractor
class, in case it should ever need to be overridden, but it could have
just as easily been inlined.
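The shape of that restructuring looks roughly like the sketch below. The
class name Extractor, the constructor, and the trivial validation rule are
all made up for illustration; only the name validate_link and the idea of
defining it once as a method come from the patch.

```lua
-- Hypothetical miniature of the pattern, not the real LinkExtractor.
local Extractor = {}
Extractor.__index = Extractor

function Extractor.new()
  return setmetatable({}, Extractor)
end

-- Defined once as a method instead of being re-created as a closure on
-- every loop iteration; callers can override it if they need to.
function Extractor:validate_link(link)
  return link ~= ""
end

function Extractor:parse(candidates)
  local links = {}
  for _, link in ipairs(candidates) do
    if self:validate_link(link) then
      links[#links + 1] = link
    end
  end
  return links
end
```

Because the lookup goes through self, assigning a different validate_link
on a single instance (or on a derived class) changes the filtering without
touching parse.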
Sent through the nmap-dev mailing list
Archived at http://seclists.org/nmap-dev/