Nmap Development mailing list archives
Re: [BUG] NSE/Nsock filehandle exhaustion
From: Stoiko Ivanov <stoiko () xover htu tuwien ac at>
Date: Thu, 30 Aug 2007 20:35:10 +0200
Hi,

On Tue, Aug 28, 2007 at 12:55:04AM +0000, Brandon Enright wrote:
Developers,

I hate to be reporting a bug without a patch, but I haven't been able to fully track this one down and I'm sure someone here is going to have more insight into the problem than me.

With the latest NSE implementation compiled from SVN, Nmap runs my machine out of filehandles when I scan a large block of machines at a time. Here is sample output:

    ...
    SCRIPT ENGINE: Will run ././nmap/current/scripts//ripeQuery.nse against 132.239.74.211
    SCRIPT ENGINE: Running scripts.
    SCRIPT ENGINE: Runlevel: 1.000000
    Initiating SCRIPT ENGINE at 23:49
    SCRIPT ENGINE Timing: About 0.00% done
    Socket troubles: Too many open files
    Socket troubles: Too many open files
    ...lots of socket trouble errors...
    Socket troubles: Too many open files
    Segmentation fault

Over 1024 copies of the ripeQuery.nse script were being executed in that hostgroup. I went ahead and increased the max filehandle count with ulimit -n and /etc/security/limits.conf from 1024 to 10240. Unfortunately, instead of solving the problem, it hits another:

    SCRIPT ENGINE: Will run ././nmap/current/scripts//ripeQuery.nse against 132.239.75.10
    SCRIPT ENGINE: Running scripts.
    SCRIPT ENGINE: Runlevel: 1.000000
    Initiating SCRIPT ENGINE at 23:58
    SCRIPT ENGINE Timing: About 0.00% done
    nmap: gh_list.c:346: gh_list_remove_elem: Assertion `list->count != 0 || (list->first == ((void *)0) && list->last == ((void *)0))' failed.
    Aborted
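For reference, the per-process descriptor limit mentioned in the report can also be inspected from a program. The following is a minimal Python sketch and not part of Nmap: the helper name fits_within_limit, the sockets_per_script parameter, and the reserved headroom of 16 descriptors are assumptions made purely for illustration.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fits_within_limit(parallel_scripts, sockets_per_script=1,
                      reserved=16, soft_limit=None):
    """Rough check whether a number of parallel scripts stays under the limit.

    `reserved` is an arbitrary allowance (an assumption) for descriptors the
    process needs anyway, such as stdin/stdout/stderr and output files.
    """
    if soft_limit is None:
        soft_limit, _hard = nofile_limits()
    if soft_limit == resource.RLIM_INFINITY:
        return True
    needed = parallel_scripts * sockets_per_script + reserved
    return needed <= soft_limit


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft limit:", soft, "hard limit:", hard)
    print("1024 scripts fit:", fits_within_limit(1024))
```

As the second run in the report shows, a check like this only says whether descriptors will run out. Raising the limit did not make the scan succeed.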
For both cases a backtrace would be a great help for chasing down the bug. I personally use gdb for debugging, so I can only provide you with instructions for gdb. You can create one by allowing your shell to dump core:

    $ ulimit -c unlimited

then reproducing the crash, and afterwards running gdb on the executable and the core file:

    $ gdb ./nmap ./core

Once gdb provides you with a prompt, just type

    bt full

and you should get the function where the segfault/assertion failure occurred (and the functions by which it was called).
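The gdb steps can also be run non-interactively, which makes it easier to attach the output to a mail. This is a small Python sketch under a few assumptions: the function names are made up for illustration, gdb is on the PATH, and the core file is really called ./core (some systems name it differently, e.g. with the PID appended). The -batch and -ex options are standard gdb command-line options.

```python
import subprocess


def gdb_backtrace_command(executable, core_file):
    """Build a gdb command line that prints a full backtrace and exits."""
    return ["gdb", "-batch", "-ex", "bt full", executable, core_file]


def collect_backtrace(executable="./nmap", core_file="./core"):
    """Run gdb in batch mode and return whatever it printed on stdout."""
    result = subprocess.run(
        gdb_backtrace_command(executable, core_file),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero; the caller inspects the text
    )
    return result.stdout
```

Calling collect_backtrace() from the directory containing the binary and the core file should give the same text as typing bt full at the gdb prompt.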
I've done some digging and the issue seems to be the number of concurrent sockets that are being opened. ..snip.. Is it possible that Nmap/NSE is calling socket:connect() more than 1024 times in parallel *before* the parallel scripts get to the socket:close() call? This doesn't sound very likely to me.
This is exactly what is happening. NSE scripts get scheduled through a round-robin style algorithm. All script-host combinations which will run are stored in a list. Each time a script yields (i.e. pauses to wait for the completion of an nsock event) the next one starts running. Once the nsock event has completed, the script is put at the *end* of the list containing all scripts.

Maybe a solution to this problem would be to put the scripts which already got their network event done at the beginning of the list. This way, scripts which have already started running would be preferred over those which had no chance to run at all, and would thus finish execution (and close their sockets) sooner. I've tried this and it seems to work better (although I couldn't reproduce the assertion failure from your second try). I'll commit the patch in a second.

Another solution would be to change the scheduling algorithm to run at most 1024 script-host combinations in one batch (which wouldn't solve the problem if a script opens more than one socket).
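To illustrate why the position at which a resumed script is re-queued matters, here is a toy simulation in Python. It is not the code from nse_main.cc. The model is an assumption: every script opens one socket on its first turn, waits io_steps times for a network event that takes latency scheduler turns to complete, and closes the socket on its last turn. All names are hypothetical.

```python
from collections import deque


def peak_open_sockets(num_scripts, io_steps, latency, requeue_front):
    """Return the peak number of simultaneously open sockets in the toy model."""
    running = deque(range(num_scripts))  # script ids that are ready to run
    waiting = deque()                    # (ready_at_tick, script_id), oldest first
    remaining = [io_steps] * num_scripts # network waits each script still needs
    started = [False] * num_scripts
    open_now = peak = tick = 0

    while running or waiting:
        # Move scripts whose network event has completed back to the run list.
        while waiting and waiting[0][0] <= tick:
            _, sid = waiting.popleft()
            if requeue_front:
                running.appendleft(sid)  # proposed change: resume these first
            else:
                running.append(sid)      # behaviour described above: go to the end
        if not running:
            tick = waiting[0][0]         # nothing runnable, skip ahead in time
            continue

        sid = running.popleft()
        if not started[sid]:
            started[sid] = True
            open_now += 1                # first turn: the script opens its socket
            peak = max(peak, open_now)
        if remaining[sid] == 0:
            open_now -= 1                # last turn: the script closes its socket
        else:
            remaining[sid] -= 1
            waiting.append((tick + latency, sid))  # yield until the event is done
        tick += 1
    return peak


if __name__ == "__main__":
    for front in (False, True):
        peak = peak_open_sockets(2000, io_steps=2, latency=50, requeue_front=front)
        print("requeue at front:" if front else "requeue at end:  ", peak)
```

In this model, re-queueing at the end lets every script open its socket before any script gets a second turn, so the peak equals the number of scripts. Re-queueing at the front keeps the peak down to about the number of scripts that can start during one event latency.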
How do I go about troubleshooting this? I'd like some way of seeing the number of simultaneous scripts being blocked waiting for the connect call to see if it gets over 1024.
Scripts which wait for network I/O are pushed in a list of waiting scripts (currently this is in nse_main.cc, line 280, function: process_mainloop()) and once the network I/O request is handled they get pushed at the end of the running_scripts list (nse_main.cc line 346, function: process_waiting2running() ).
Maybe there needs to be some (tunable) cap on the number of parallel NSE scripts or the number of sockets allowed open by NSE at a time. I'm thinking that nmap.new_socket() could block if the number of connected/open sockets goes above some threshold.

Please let me know if there is anything I can do to help troubleshoot this or if I need to clarify anything stated above.
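The idea of a socket request that blocks above a threshold can be sketched with a small cooperative scheduler. This is a Python toy model, not NSE code: the generator stands in for a script, the "open" request stands in for a capped nmap.new_socket(), and the names script, run_capped and max_open are hypothetical. It assumes each script uses exactly one socket. Scripts that hold one socket while waiting for another could deadlock under a cap and are not covered here.

```python
from collections import deque


def script(io_steps):
    """A stand-in for one script-host combination using a single socket."""
    yield "open"                 # request a socket (may be parked by the cap)
    for _ in range(io_steps):
        yield "io"               # wait for a network event
    yield "close"                # release the socket


def run_capped(num_scripts, io_steps, max_open):
    """Run all scripts with at most max_open sockets; return (peak, finished)."""
    if max_open < 1:
        raise ValueError("max_open must be at least 1")
    ready = deque((script(io_steps), None) for _ in range(num_scripts))
    parked = deque()             # scripts blocked on their "open" request
    open_now = peak = finished = 0

    while ready:
        gen, pending = ready.popleft()
        request = pending if pending is not None else next(gen, None)
        if request is None:      # generator exhausted: the script is done
            finished += 1
            continue
        if request == "open":
            if open_now >= max_open:
                parked.append((gen, "open"))  # block until a socket is freed
                continue
            open_now += 1
            peak = max(peak, open_now)
        elif request == "close":
            open_now -= 1
            if parked:
                ready.append(parked.popleft())  # wake one blocked script
        ready.append((gen, None))
    return peak, finished


if __name__ == "__main__":
    print(run_capped(2000, io_steps=2, max_open=1024))
```

With a cap, the number of open sockets never exceeds the threshold no matter how many scripts are queued, at the price of some scripts starting later.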
The backtraces would be a great help; maybe you could pass --script-trace as an option too. And of course feedback on whether the bug fix solved the issue would be welcome.
Brandon
cheers stoiko
--
Brandon Enright
Network Security Analyst
UCSD Network Operations
bmenrigh () ucsd edu

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org
