Nmap Development: Re: [BUG] NSE/Nsock filehandle exhaustion


From: Stoiko Ivanov <stoiko_at_xover.htu.tuwien.ac.at>
Date: Thu, 30 Aug 2007 20:35:10 +0200

Hi,

On Tue, Aug 28, 2007 at 12:55:04AM +0000, Brandon Enright wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Developers,
>
> I hate to be reporting a bug without a patch but I haven't been able to
> fully track this one down and I'm sure someone here is going to have more
> insight into the problem than me.
>
> With the latest NSE implementation compiled from SVN, Nmap runs my machine
> out of filehandles when I scan large block of machines at a time.
>
> Here is sample output:
>
> ...
> SCRIPT ENGINE: Will run ././nmap/current/scripts//ripeQuery.nse against
> 132.239.74.211
> SCRIPT ENGINE: Running scripts.
> SCRIPT ENGINE: Runlevel: 1.000000
> Initiating SCRIPT ENGINE at 23:49
> SCRIPT ENGINE Timing: About 0.00% done
> Socket troubles: Too many open files
> Socket troubles: Too many open files
> ...lots of socket trouble errors...
> Socket troubles: Too many open files
> Segmentation fault
>
>
> Over 1024 copies of the ripeQuery.nse script were being executed in that
> hostgroup. I went ahead and increased the max filehandle count with ulimit
> - -n and /etc/security/limits.conf from 1024 to 10240. Unfortunately instead
> of solving the problem, it hits another:
>
> SCRIPT ENGINE: Will run ././nmap/current/scripts//ripeQuery.nse against
> 132.239.75.10
> SCRIPT ENGINE: Running scripts.
> SCRIPT ENGINE: Runlevel: 1.000000
> Initiating SCRIPT ENGINE at 23:58
> SCRIPT ENGINE Timing: About 0.00% done
> nmap: gh_list.c:346: gh_list_remove_elem: Assertion `list->count != 0 ||
> (list->first == ((void *)0) && list->last == ((void *)0))' failed. Aborted
>
For both cases a backtrace would be a great help in chasing down the bug.
I personally use gdb for debugging, so I can only provide you with
instructions for gdb:

You can create one by allowing your shell to dump core:
$ ulimit -c unlimited

and afterwards run gdb on the executable and the core file:
$ gdb ./nmap ./core

Once gdb gives you a prompt, just type
bt full

and you should get the function where the segfault/assertion failure
occurred (and the functions it was called by).

> I've done some digging and the issue seems to be the number of concurrent
> sockets that are being opened.
>
>..snip..
> Is it possible that Nmap/NSE is calling socket:connect() more than 1024
> times in parallel *before* the parallel scripts get to the socket:close()
> call? This doesn't sound very likely to me.
This is exactly what is happening. NSE scripts are scheduled by a
round-robin style algorithm. All script-host combinations that will run are
stored in a list. Each time a script yields (i.e. pauses to wait for the
completion of an nsock event) the next one starts running. Once the
nsock event has completed, the script is put at the *end* of the list
containing all scripts. So with more than 1024 scripts, every one of them
can connect before the first one gets a chance to close its socket.

Maybe a solution to this problem would be to put the scripts whose
network event has completed at the beginning of the list.
That way scripts which have already started running would be preferred over
those which have had no chance to run at all, and would thus finish
execution (and close their sockets) sooner.
I've tried this and it seems to work better (although I couldn't reproduce
the assertion failure from your second try). I'll commit the patch in a
second.
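
To illustrate the difference between the two orderings, here is a toy
model in Python. It is only a sketch and not the NSE code: the names
(simulate, requeue_at_front) are made up for illustration, each script is
assumed to open exactly one socket on its first step, and every pending
network event is assumed to complete by the next pass of the loop.

```python
from collections import deque

def simulate(num_scripts, steps_per_script, requeue_at_front):
    """Return the peak number of sockets open at the same time.

    Toy model: one script runs per loop pass until it yields; a script
    opens its socket on its first step and closes it after its last.
    """
    # Each entry is [script_id, steps_remaining].
    running = deque([[i, steps_per_script] for i in range(num_scripts)])
    waiting = []
    started = set()
    open_sockets = 0
    peak = 0

    while running or waiting:
        # Assumption: all pending network events have completed by now.
        for script in waiting:
            if requeue_at_front:
                running.appendleft(script)   # prefer already-started scripts
            else:
                running.append(script)       # old behaviour: back of the list
        waiting.clear()

        script = running.popleft()
        script_id = script[0]
        if script_id not in started:
            started.add(script_id)
            open_sockets += 1                # first step: connect
            peak = max(peak, open_sockets)

        script[1] -= 1
        if script[1] == 0:
            open_sockets -= 1                # last step: close the socket
        else:
            waiting.append(script)           # yield until the next event

    return peak

print(simulate(2000, 3, requeue_at_front=False))
print(simulate(2000, 3, requeue_at_front=True))
```

In this model, requeueing at the end lets every script connect before any
of them closes, while requeueing at the front lets each started script
finish first. With real network latency more than one script would be in
flight at once, so the real numbers will sit somewhere in between.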

Another solution would be to change the scheduling algorithm to run at most
1024 script-host combinations in one batch (which wouldn't solve the
problem if a script opens more than one socket).
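
A rough Python sketch of the batching idea and its limitation follows.
The helper names and the worst-case assumption (every script in a batch
holds all of its sockets at the same moment) are mine, for illustration
only.

```python
def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def worst_case_peak(num_scripts, batch_size, sockets_per_script):
    """Worst-case number of open sockets when scripts run in batches."""
    script_host_combinations = list(range(num_scripts))
    peak = 0
    for batch in batches(script_host_combinations, batch_size):
        # Assumption: all scripts of a batch hold all their sockets at once.
        peak = max(peak, len(batch) * sockets_per_script)
    return peak

print(worst_case_peak(3000, 1024, 1))  # stays at the batch size
print(worst_case_peak(3000, 1024, 2))  # exceeds a 1024 filehandle limit
```

A cap on scripts only bounds the socket count if each script opens a
single socket, which is why a cap on sockets would be the safer limit.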

>
> How do I go about troubleshooting this? I'd like some way of seeing the
> number of simultanious scripts being blocked waiting for the connect call
> to see if it gets over 1024.
Scripts which wait for network I/O are pushed onto a list of waiting scripts
(currently this is in nse_main.cc, line 280, function: process_mainloop()),
and once the network I/O request is handled they get pushed onto the end of
the running_scripts list (nse_main.cc, line 346,
function: process_waiting2running()).

>
> Maybe there needs to be some (tunable) cap on the number parallel NSE
> scripts or number of sockets allowed open by NSE at a time. I'm thinking
> that nmap.new_socket() could block if the number of connected/open sockets
> goes above some threshold.
>
> Please let me know if there is anything I can do to help troubleshoot this
> or if I need to clarify anything stated above.
The backtraces would be a great help. Maybe you could pass --script-trace
as an option too.

And of course feedback on whether the bug fix solved the issue would be
welcome.
>
> Brandon

cheers
stoiko

>
> - --
> Brandon Enright
> Network Security Analyst
> UCSD Network Operations
> bmenrigh_at_ucsd.edu
>
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.7 (GNU/Linux)
>
> iD8DBQFG03JpqaGPzAsl94IRAvtbAKCV0Q0ejdWuZgcaSB+jfbSkQ8JyigCfb1z0
> M2G/h4YS2YlG5tqyymKE6SI=
> =XOfS
> -----END PGP SIGNATURE-----
>
> _______________________________________________
> Sent through the nmap-dev mailing list
> http://cgi.insecure.org/mailman/listinfo/nmap-dev
> Archived at http://SecLists.Org

Received on Aug 30 2007
