Nmap Development mailing list archives

A Comment on the Versatility of Nmap+V

From: "Jay Freeman (saurik)" <saurik () saurik com>
Date: Tue, 2 Sep 2003 17:35:56 -0700
Multiple people have now used the term "banner grabbing" to describe the
functionality that nmap+V provides, which indicates to me that I haven't
spent enough time describing the functionality of nmap+V, how it works,
what's implemented, and where I want it to go. "Banner grabbing" is an
extremely simplistic technique that, while adequate for a large
percentage of the "protocol detection" tasks, doesn't live up to the
"network mapping" vision that Fyodor puts forward for nmap in his recent
e-mail.

My goal with Nmap+V, from when I started working on it over three years
ago all the way up until today, has been to have a versatile platform for
doing extensive service scanning. Using my scan engine, I want it to be easy
to develop scans that not only determine the protocol/application, but can
also determine a variety of other things that are associated with the
services running on given ports. I will get more into this later in this
e-mail.

Possibly more importantly, however, I don't want to stop short with a 90%
solution for the protocol / application version detection case. If there's
some reasonably widely used protocol out there that the system is unable to
get good information back from, then I don't consider it to be a "quality
implementation" (as laid out in Fyodor's e-mail of the goals for what he
would like to see implemented in nmap).

To illustrate this problem I tend to use the example of IRC. If you connect
to irc.saurik.com on port 6667 you will receive this immediately:

:irc.saurik.net NOTICE AUTH :*** Looking up your hostname...
:irc.saurik.net NOTICE AUTH :*** Found your hostname (cached)
:irc.saurik.net NOTICE AUTH :*** Checking ident...
:irc.saurik.net NOTICE AUTH :*** Checking for open socks server...
:irc.saurik.net NOTICE AUTH :*** No socks server found (good!)...

This isn't always true of IRC servers; some don't reply at all. Regardless,
this alone doesn't tell you much about the remote implementation. Most
modern IRC servers, each with very different capabilities once you've
actually logged in, send back the exact same strings (down to the
capitalization).

The next step is to attempt to log in:

USER bob bob bob bob
NICK bob

The server responds as follows:

:irc.saurik.net NOTICE AUTH :*** No ident response; username prefixed with ~
:irc.saurik.net NOTICE bob :*** If you are having problems connecting due to ping timeouts, please type /quote pong 67AE11 or /raw pong 67AE11 now.
PING :67AE11

Now, at this point you may be able to determine that this is UnrealIRCd (I
don't know much about the timeouts string), but that string has nothing to
do with the protocol and is sometimes customized to make it more obvious for
the end users of a particular server. The important point here is the
"PING" that is sent. This is done to verify that we aren't spoofing our
return address. To this we must reply:

PONG :67AE11

At which point we actually get some useful information:

:irc.saurik.net 001 bob :Welcome to the IRC IRC Network bob!~bob@ip68-6-65-181.sb.sd.cox.net
:irc.saurik.net 002 bob :Your host is irc.saurik.net, running version Unreal3.1.3-Komara
:irc.saurik.net 003 bob :This server was created Fri Jun 28 2002 at 06:25:56 CDT
:irc.saurik.net 004 bob irc.saurik.net Unreal3.1.3-Komara oOiwghskSaHANTcCfrxeWqBFIzdvtGj lvhopsmntikrRcaqOALQbSeKVfHGCuzN
...

With nmap+V, I actually perform these steps (with random user information)
in order to obtain the version from the server. I do it rather quickly as
well, since I have a strong implementation of parallel port requests and
short-circuiting once I have enough information to determine the match. Most
importantly, however, defining this query wasn't that complicated using my
new file format from 2.99:

<probe name="IRC" ports="6665-6669">
    <label name="IRC"/>

    <send data="USER \R-----+ \R-----+ \R-----+ \R-----+\nNICK \R-----+\n"/>

    <switch timeout="5000000">
        <match regex="PING ([^\r\n]+)">
            <send data="PONG $1\n"/>
        </match>
    </switch>

    <switch timeout="500000">
        <match regex=":[^ ]+ 004 [^ ]+ [^ ]+ ([^ ]+)">
            <set-value name="Service" value="IRC"/>
            <set-value name="Version" value="$1"/>

            <match regex=":[^ ]+ 001 [^ ]+ :Welcome to the ([^\n]+) Network">
                <set-value name="Network" value="$1"/>
            </match>
        </match>
    </switch>
</probe>
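
For readers who prefer a general-purpose language, the matching logic of the probe above can be sketched in Python. This is only an illustration of the two <switch> stages using the same regular expressions; the function names are hypothetical, nothing here is part of nmap+V, and the network I/O, timeouts, and random user information are deliberately left out.

```python
import re

# The three regular expressions are copied from the probe definition above.
PING_RE = re.compile(r"PING ([^\r\n]+)")
VERSION_RE = re.compile(r":[^ ]+ 004 [^ ]+ [^ ]+ ([^ ]+)")
NETWORK_RE = re.compile(r":[^ ]+ 001 [^ ]+ :Welcome to the ([^\n]+) Network")


def pong_reply(buffer):
    """First <switch>: return the PONG line for a PING in buffer, else None."""
    m = PING_RE.search(buffer)
    return "PONG " + m.group(1) + "\n" if m else None


def extract_values(buffer):
    """Second <switch>: values are only set once the 004 line has matched."""
    values = {}
    version = VERSION_RE.search(buffer)
    if version is None:
        return values
    values["Service"] = "IRC"
    values["Version"] = version.group(1)
    # The nested <match> is optional: a missing 001 line leaves Network unset.
    network = NETWORK_RE.search(buffer)
    if network:
        values["Network"] = network.group(1)
    return values
```

Feeding pong_reply() the text received after USER/NICK yields the line to send back, and feeding extract_values() the text received after the PONG yields the Service, Version, and Network values.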

This definition is arguably more complex than the equivalent TCL/Expect
script (largely because it was written in an XML format and has some
information duplication due to that), but TCL/Expect does horribly with
binary protocols, even in the simpler cases. nmap+V can currently handle
most of the simple cases, and I have plans for how to make it handle the
more complex ones. Also, TCL/Expect isn't easily run in parallel against
numerous ports at the same time without resorting to running multiple
instances or trying to use a threaded version of TCL and running each port
in a separate thread. The requirements of a fast "scan an entire series of
computers" system are therefore much different. I have also mentioned the
usage of external languages for this purpose to people at various times on
various lists (including this one a few years back), and their first
reaction is usually one of "bloat" (which I disagree with, but it _is_ a
concern if users see it that way). I am definitely still open to discussions
about possible paths, including "try to integrate either some -isms from
language X or just use language X with this cool option that lets you
control its execution slicing".

Now that I've gone over that, looking at the language you can probably see
how a lot of different things could rather easily be determined, and still
determined efficiently, from the myriad of protocols available on a given
computer:

o statistics about the computer (obtaining the MAC address, the number of
open MySQL connections, what time the computer responds to NTP requests
with)
o specific configuration of applications (what Apache modules are installed,
what extensions are supported by the telnet daemon, what the administrative
policies of the SMTP server are)
o general recon on the state of the machine (finding out what DNS zones it
is authoritative for, custom responses to identd)
o information about the content served by the computer (what IRC network the
IRC server is connected to, what the title of the default website is)

A lot of these are already possible with Nmap+V's current implementation
(although not actually implemented in the service probes file, which is
rather sparse and mainly concentrates on some particular examples involving
HTTP). Some of these still need more time, but I have plans for how to add
the required functionality to the language in a general enough way that it
could be easily applied to numerous other situations. A lot of it comes down
to making the language feel more like XSL/T and bringing it closer to Turing
completeness, as well as providing some helper functions that you can call
on string match arguments in order to do some binary mangling.

An example of what I'm talking about with the latter case is allowing you to
do:

<set-value name="# of Connections" value="{int($1)}"/>

I recently came to the realization that a rather good option on each of
these fronts is to design a new language that draws as much from an existing
language as possible and to use XML wherever it makes sense to make things
easier to parse and generate from other tools. As you can see my current
plan revolves around mimicking XSL/T's paradigms (as indicated in the above
with $ variables and {} execution context escaping).
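
To make the "$ variables and {} escaping" idea concrete, here is a rough Python sketch of how such a value template might be expanded against a regex match. It is not how nmap+V implements it: the expand() function and the HELPERS table are hypothetical, and reading int() as "interpret the captured bytes as a big-endian unsigned integer" is purely an assumption about what the binary mangling helper would do.

```python
import re

# Hypothetical helper table. Only int() is named in the text; treating it as
# "big-endian unsigned integer from raw bytes" is an assumption.
HELPERS = {
    "int": lambda raw: str(int.from_bytes(raw, "big")),
}

CALL_RE = re.compile(rb"\{(\w+)\(\$(\d+)\)\}")  # e.g. {int($1)}
VAR_RE = re.compile(rb"\$(\d+)")                # e.g. $1


def expand(template, match):
    """Expand {helper($N)} calls first, then plain $N references."""
    def call(m):
        helper = HELPERS[m.group(1).decode("ascii")]
        return helper(match.group(int(m.group(2)))).encode("ascii")

    def var(m):
        return match.group(int(m.group(1)))

    return VAR_RE.sub(var, CALL_RE.sub(call, template))
```

Working on bytes rather than text keeps the sketch usable for binary protocols, which is the case the helper functions are meant to address.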

Given all of this, I implore people to think twice before just labeling this
as "banner grabbing". It is so much more than that :). _However_, even so,
nmap+V isn't "overpowered" for the task of service/version scanning. Even if
you feel that simplistic banner grabbing is the best way to approach the
problem, nmap+V is still extremely lightweight and, I will argue, capable of
executing just as fast as something that is specialized for "send probe,
receive one of a number of responses" (possibly eating up a small amount
more CPU in the process, but A) it is negligible and B) this is an I/O bound
problem anyway). nmap+V simply offers the flexibility to occasionally add
extra information gathering to the pipeline (as I do with HTTP for Apache
modules and <title/> tags), or the ability to get the version from more
complicated protocols (such as those that might require a handshaking step
at the outset). This more simplistic usage case is how I've been using
nmap+V's configuration language thus far in my distributions.

======================================================================

Here is an example of a more complex use case:

Maybe you would like to find SMTP servers that seem to have open relays;
maybe you administer a corporate/university network and want to make sure
none of the computers on it are accidentally configured such that a spammer
might utilize them. You could load the following script and a -sVVV scan
would do that for you (note that the <clear/> command here is really
important and is not present in 2.99; you will need to get nmap+V from CVS
or wait until 3.00 tomorrow):

----------------------------------------------------------------------
<?xml version="1.0"?>
<switch timeout="5000000" xmlns="http://saurik.com/nmap+V/3.00">

<match regex="^220 ([^ ]*) .*SMTP">
    <set-value name="Service" value="SMTP"/>
    <send data="EHLO $1\r\nMAIL From: &lt;spam@test.unknown&gt;\r\nRCPT To: &lt;spam@test.unknown&gt;\r\nDATA\r\n"/>
    <set-value name="Open Relay?" value="No :)"/>
    <switch timeout="5000000">
        <match regex="\n354 ">
            <clear/>
            <send data=".\r\n"/>
            <switch timeout="1000000">
                <match regex="^250 ">
                   <set-value name="Open Relay?" value="Yes! :("/>
                </match>
            </switch>
        </match>
        <match regex="\n550"/>
    </switch>
</match>

</switch>
----------------------------------------------------------------------
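
The decision tree in the script above can be paraphrased in Python as follows. This is a sketch under stated assumptions rather than a description of the scan engine: the function name and its three "stage" arguments are hypothetical, and passing the reply to the final "." as its own string models <clear/> on the assumption that it empties the receive buffer, so that "^250 " is tested against that reply alone.

```python
import re


def relay_verdict(greeting, after_commands, after_dot):
    """Return the values the script would set for one connection.

    Each argument is the text received during one stage of the exchange;
    pass "" for a stage that timed out without data.
    """
    values = {}
    if not re.search(r"^220 ([^ ]*) .*SMTP", greeting):
        return values                    # outer match failed: not SMTP
    values["Service"] = "SMTP"
    values["Open Relay?"] = "No :)"      # pessimistic default, as in the script
    # A "\n550" reply only ends the inner switch early, so it needs no branch.
    if re.search(r"\n354 ", after_commands):
        # DATA was accepted; only a 250 for the final "." flips the verdict.
        if re.search(r"^250 ", after_dot):
            values["Open Relay?"] = "Yes! :("
    return values
```

Setting "No :)" before looking at any reply means that a timeout at any later stage is reported as "not a relay", which matches the order of the <set-value> elements in the script.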

Here is some example output from that (localhost comes up as a relay here
because I have 127.0.0.1 in my /etc/mail/access file, so it was a useful
example):

----------------------------------------------------------------------
[root(2)@ironclad nmap]# ./nmap -sVVV -p25 localhost mx1.mail.yahoo.com maila.microsoft.com

Starting nmap 3.30+V ( http://www.insecure.org/nmap/ ) at 2003-09-01 05:18 CDT
Interesting ports on localhost.localdomain (127.0.0.1):
Port       State       Service             Protocol     Version
25/tcp     open        smtp                SMTP
  Open Relay?: Yes! :(

Interesting ports on mta-v20.level3.mail.yahoo.com (64.156.215.5):
Port       State       Service             Protocol     Version
25/tcp     open        smtp                SMTP
  Open Relay?: No :)

Interesting ports on mail1.microsoft.com (131.107.3.125):
Port       State       Service             Protocol     Version
25/tcp     open        smtp                SMTP
  Open Relay?: No :)

Nmap run completed -- 3 IP addresses (3 hosts up) scanned in 1.924 seconds
[root(2)@ironclad nmap]#
----------------------------------------------------------------------

Now, this scan chose to directly target port 25, but this wasn't a
requirement. As the script detected SMTP before attempting to send the
commands for relaying, it wouldn't be an issue to run this on multiple
ports. However, it _will_ take irritatingly long because it will wait for a
"220 something SMTP" for up to 5 seconds before giving up on that port (and
even with 8 ports at a time or whatever the default is, this could take a
while on each computer). So, to fix this, we add a short circuit to our
switch. Now, unfortunately I'm not currently using libpcre or any kind of
advanced regex engine, so the best I can do is to look for a string that
starts with something other than a '2' (at least, with my knowledge of
regex):

<match regex="^[^2]"/>

Adding this to the outermost switch should make processing immediately stop
on ports that at least respond. From here it would be a matter of tweaking
the timeout for your needs.
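
As a rough illustration of what the short circuit buys, the outermost switch with the extra match added can be modelled in Python like this. The function name and its three return labels are hypothetical; only the two regular expressions come from the text.

```python
import re

SMTP_GREETING = re.compile(r"^220 ([^ ]*) .*SMTP")
NOT_SMTP = re.compile(r"^[^2]")  # the short-circuit match


def outer_switch(buffer):
    """Classify the data received so far as 'smtp', 'stop', or 'wait'."""
    if SMTP_GREETING.search(buffer):
        return "smtp"   # go on to the relay test
    if NOT_SMTP.search(buffer):
        return "stop"   # first byte is not '2': give up on this port now
    return "wait"       # nothing yet, or a partial '2...' line: keep waiting
```

A port that sends nothing, or one whose reply starts with '2' without being an SMTP greeting, still falls into the "wait" case and runs out the full timeout, which is why the timeout is worth tuning as well.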

Sincerely,
Jay Freeman (saurik)
saurik () saurik com

