
Nmap Development mailing list archives

Re: [NSE] Robots rethink
From: "Eddie Bell" <ejlbell () gmail com>
Date: Thu, 5 Jun 2008 12:50:10 +0100

2008/6/4 Fyodor <fyodor () insecure org>:
On Wed, Jun 04, 2008 at 09:25:53PM +0100, jah wrote:
On 04/06/2008 19:06, Eddie Bell wrote:
Good idea, the amended version is attached. I've also increased the
verbose output line length (from 40 to 50) so that less vertical space
is taken up.

Aye, I think it's great to have the option not to print all the disallow
entries, I like the amendment.
How about a count of disallow entries for non-verbose results?  This
would give an idea as to how interesting the robots.txt file might be
and whether it's worth running the scan with more verbosity.

Yeah, the improvements look great.  And too-long output is an
important issue.  But as a general note, I think the solution of "dump
the long stuff in -v and print just a short summary without -v, and
maybe print all sorts of crap with -vv" may be overused.  The first
goal should be to find a way to format the results compactly and
readably which works well regardless of verbosity.  And if there are a
few control variables (such as max number of entries printed), maybe
you could just tweak those a bit based on verbosity.  There may be
some good cases for having completely different output formats based
on verbosity level, but they are few and far between.

Also, I think it is important that output size be limited even with
-v.  Because users don't want to be bombarded with 200 lines of
robots.txt in their output.

So maybe a good way to handle a script like robots.nse is:

o Check for the somewhat common case of an empty robots.txt (for
 example, that's what you'll find at http://insecure.org/robots.txt)
 and either print nothing, or print that it is empty in that case.

o Print the summary line (that robots.txt exists and has XX entries)
 in all cases.

o Maybe print up to 2-4 lines worth of entries in normal mode, and a
 higher number like up to 10 lines in verbose mode.  That way people
 see a small sampling even in normal mode.
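
The three points above can be sketched roughly as follows. This is only
an illustration in Python, not the code in robots.nse: the function name
summarize_robots, the caps of 3 and 10 lines, and the 50-column width
are all assumptions chosen to match the numbers mentioned in this thread.

```python
import textwrap

def summarize_robots(entries, verbosity=0, width=50):
    """Hypothetical helper: build output lines for a list of Disallow entries."""
    # Common case: robots.txt exists but has no Disallow entries.
    if not entries:
        return ["robots.txt: is empty"]

    # Assumed caps: a few lines normally, up to 10 lines with -v.
    max_lines = 3 if verbosity == 0 else 10

    # Pack entries into lines without splitting any single entry.
    wrapped = textwrap.wrap(" ".join(entries), width=width,
                            break_long_words=False, break_on_hyphens=False)
    shown = wrapped[:max_lines]
    shown_count = sum(len(line.split()) for line in shown)

    # The summary line is printed in all cases.
    summary = "robots.txt: has %d disallowed entries (%d shown)" % (
        len(entries), shown_count)
    return [summary] + shown
```

The same format is used at every verbosity level; only the max_lines
control variable changes.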

Note that I haven't had time to even look at your changes very
closely, so you may be doing some or much of this stuff already.  And
don't take this as any criticism of robots.nse.  I just thought it
made a good example to launch into a discussion of how we handle
output verbosity.

I believe that one of the reasons Nmap is so successful is that we put
a whole lot of work into presenting information to users in a clean,
orderly, useful fashion.  Certain other port scanners, for example,
just print open ports as they are found and leave you with a mess of
debug-looking output mixed with open port information which is not
even sorted to keep ports from the same host together, much less
sorted numerically.  Also, many tools simply flood you with data just
because they have it available, even when there are few if any
practical uses for that data.  This causes the important information
to get lost in the flood.  Whenever new output is added to Nmap (from
an NSE script or whatever), try to think of how that information could
actually be useful to someone.  If you come up blank, it is generally
best to leave it out.


It would be hard to do it by number of lines, as the code stores all the
entries in a big table which does not account for lines, but it can be
done by number of entries. For example,

No robots file:

    Interesting ports on scanme.nmap.org (
    80/tcp open  http

Empty robots file:

     Interesting ports on insecure.org (
     80/tcp open  http
     |_ robots.txt: is empty

Normal mode:

     Interesting ports on py-in-f99.google.com (
     80/tcp open  http
     |_ robots.txt: has 136 disallowed entries

Single verbose (15 entries):

    Interesting ports on py-in-f99.google.com (
    80/tcp open  http
    |  robots.txt: has 136 disallowed entries (15 shown)
    |  /news?output=xhtml& /search /groups /images /catalogs
    |  /catalogues /news /nwshp /? /addurl/image? /pagead/ /relpage/
    |_ /relcontent /sorry/

Double verbose or debug (50 entries):

    Interesting ports on eh-in-f99.google.com (
    80/tcp open  http
    |  robots.txt: has 136 disallowed entries (50 shown)
    |  /news?output=xhtml& /search /groups /images /catalogs
    |  /catalogues /news /nwshp /? /addurl/image? /pagead/ /relpage/
    |  /relcontent /sorry/ /imgres /keyword/ /u/ /univ/ /cobrand /custom
    |  /advanced_group_search /advanced_search /googlesite /preferences /setprefs
    |  /swr /url /default /m? /m/? /m/lcb /m/search? /wml? /wml/?
    |  /wml/search? /xhtml? /xhtml/? /xhtml/search? /xml? /imode? /imode/?
    |_ /imode/search? /jsky? /jsky/? /jsky/search? /pda? /pda/?

Double debug or debug + triple verbose (all entries):

   Interesting ports on jc-in-f99.google.com (
   80/tcp open  http    syn-ack
   |  robots.txt: has 136 disallowed entries (136 shown)
   |  /news?output=xhtml& /search /groups /images /catalogs
   |  /catalogues /news /nwshp /? /addurl/image? /pagead/ /relpage/
   |  /relcontent /sorry/ /imgres /keyword/ /u/ /univ/ /cobrand /custom
   |  /advanced_group_search /advanced_search /googlesite /preferences
   |  /swr /url /default /m? /m/? /m/lcb /m/search? /wml? /wml/?
   |  /wml/search? /xhtml? /xhtml/? /xhtml/search? /xml? /imode? /imode/?
   |  /imode/search? /jsky? /jsky/? /jsky/search? /pda? /pda/? /pda/search?
   |  /sprint_xhtml /sprint_wml /pqa /palm /gwt/ /purchases /hws /bsd?
   |  /linux? /mac? /microsoft? /unclesam? /answers/search?q=
   |  /local? /local_url /froogle? /products? /froogle_ /product_
   |  /products_ /print /books /patents? /scholar? /complete
   |  /sponsoredlinks /videosearch? /videopreview? /videoprograminfo?
   |  /maps? /mapstt? /mapslt? /maps/stk/ /mapabcpoi? /translate?
   |  /ie? /sms/demo? /katrina? /blogsearch? /blogsearch/
   |  /blogsearch_feeds /advanced_blog_search /reader/ /uds/ /chart? /transit?
   |  /mbd? /extern_js/ /calendar/feeds/ /calendar/ical/
   |  /cl2/feeds/ /cl2/ical/ /coop/directory /coop/manage /trends?
   |  /trends/music? /notebook/search? /music /browsersync /call
   |  /archivesearch? /archivesearch/url /archivesearch/advanced_search
   |  /base/search? /base/reportbadoffer /base/s2 /urchin_test/ /movies?
   |  /codesearch? /codesearch/feeds/search? /wapsearch? /safebrowsing
   |_ /reviews/search? /orkut/albums /jsapi /views? /c/ /cbk
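
For reference, the level-to-entry-count mapping used in the examples
above could be expressed something like this. It is a rough Python
sketch rather than the attached Lua script: entry_limit and
format_robots are hypothetical names, and the 50-column wrap width is an
assumption taken from the earlier message about line length.

```python
import textwrap

def entry_limit(verbosity, debugging, total):
    """Hypothetical mapping from verbosity/debug level to entries shown."""
    if debugging >= 2 or (debugging >= 1 and verbosity >= 3):
        return total               # double debug, or debug + triple verbose
    if debugging >= 1 or verbosity >= 2:
        return min(50, total)      # double verbose or debug
    if verbosity >= 1:
        return min(15, total)      # single verbose
    return 0                       # normal mode: summary only

def format_robots(entries, verbosity=0, debugging=0, width=50):
    """Return script output lines with the '|  ' and '|_ ' prefixes."""
    total = len(entries)
    if total == 0:
        return ["|_ robots.txt: is empty"]
    limit = entry_limit(verbosity, debugging, total)
    if limit == 0:
        return ["|_ robots.txt: has %d disallowed entries" % total]
    body = ["robots.txt: has %d disallowed entries (%d shown)" % (total, limit)]
    # Wrap whole entries only; never split one entry across lines.
    body += textwrap.wrap(" ".join(entries[:limit]), width=width,
                          break_long_words=False, break_on_hyphens=False)
    # Every line gets '|  ' except the last, which gets '|_ '.
    return ["|  " + line for line in body[:-1]] + ["|_ " + body[-1]]
```

Because the cap is on entries rather than lines, the number of output
lines still depends on how long the individual entries are.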

Attachment: robots.nse

Sent through the nmap-dev mailing list
Archived at http://SecLists.Org
