Nmap Development mailing list archives
Lua bugfixes and a new buffering feature
From: doug () hcsw org
Date: Sat, 23 Jun 2007 05:19:12 -0700
Hi nmap-dev!
I just found 2 showstopper bugs in the PCRE-Lua interface, fixed them
and committed the fixes to SVN. It seems to work fine now although the
documentation is still hopelessly insufficient for anybody that
doesn't know how to read the C source code. :)
The REAL required interface is:
my_regex = pcre.new("my PCRE pattern", 0, "C")
my_regex:exec(string_to_match_against, 0, 0)
I am caching the compiled PCRE regexps into the NSE registry using
a fairly straightforward scheme:
init = function()
  -- Start of MOTD, we'll take the server name from here
  nmap.registry.ircserverinfo_375 = nmap.registry.ircserverinfo_375
    or pcre.new("^:([\\w-_.]+) 375", 0, "C")

  -- NICK already in use
  nmap.registry.ircserverinfo_433 = nmap.registry.ircserverinfo_433
    or pcre.new("^:[\\w-_.]+ 433", 0, "C")
  ...
Then I'm having the action() function (NOT the portrule function) call
init() so that these regexps are compiled at most once per Nmap
invocation and only then if the action() function for the script is
actually called.
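
To make the compile-once behaviour concrete, here is a minimal sketch in plain Lua that runs outside Nmap. It is an illustration only: `registry` is a plain table standing in for nmap.registry, and `compile` is a hypothetical stand-in for pcre.new that simply counts how often it is called.

```lua
-- Assumption: a plain table standing in for nmap.registry.
local registry = {}

-- Hypothetical stand-in for pcre.new; it only counts compilations.
local compile_count = 0
local function compile(pattern)
  compile_count = compile_count + 1
  return { pattern = pattern }
end

-- Compile each pattern only if the registry does not hold it yet.
local function init()
  registry.ircserverinfo_375 = registry.ircserverinfo_375
    or compile("^:([\\w-_.]+) 375")
  registry.ircserverinfo_433 = registry.ircserverinfo_433
    or compile("^:[\\w-_.]+ 433")
end

-- action() calls init() itself, so nothing is compiled unless the
-- script actually runs.
local function action()
  init()
  return registry.ircserverinfo_375
end

action()
action()
print("compilations:", compile_count)
```

However many times action() runs, each pattern is compiled only on the first call.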
Perhaps it would be useful to look for an init function which is called
only once per script per nmap invocation and only right before action()
is called? Another solution we should consider is passing a table to the
action function that scripts can use for cross-invocation persistent data
structures. This would avoid any possible registry conflict problems
(every script would have its own table if it wanted it). I don't know if
better registry naming is required or not.
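
A rough sketch of what the per-script table idea could look like. Nothing here exists in NSE: `run_action`, `script_state` and the third `state` argument to action() are all hypothetical names made up for illustration.

```lua
-- Hypothetical: the engine keeps one private table per script.
local script_state = {}

-- Hypothetical engine hook: hands each script its own table every time
-- its action() is called.
local function run_action(script, host, port)
  script_state[script] = script_state[script] or {}
  return script.action(host, port, script_state[script])
end

-- Two toy scripts that both use the key "calls" without clashing.
local irc_script = {
  action = function(host, port, state)
    state.calls = (state.calls or 0) + 1
    return state.calls
  end
}
local http_script = {
  action = function(host, port, state)
    state.calls = (state.calls or 0) + 1
    return state.calls
  end
}

local r1 = run_action(irc_script, "host-a", 6667)
local r2 = run_action(irc_script, "host-b", 6667)
local r3 = run_action(http_script, "host-a", 80)
print(r1, r2, r3)
```

Because each script gets its own table, both scripts can use the same key name with no risk of a registry conflict.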
IMPORTANT NOTE FOR NSE SCRIPT WRITERS: Don't use the function
receive_lines() unless you plan on doing your own line parsing.
This function WILL RETURN MORE THAN JUST THE FIRST LINE OF DATA
IF MORE IS AVAILABLE.
This can be a problem in many scenarios. Most often with NSE you will
just miss pieces of data that you don't care about anyways. But sometimes
you will miss important lines or you will actually PROCESS AN INCOMPLETE
LINE that just happened to be delivered with another line and/or crossed
a read() boundary.
Consider an application that executes this code to send data to
your NSE script:
write(sd, "hello\nworld\n", 12);
Since write knows nothing about newlines, this will be bundled up in one
packet and both lines will probably be delivered in the same read() call
(which also knows nothing about newlines) by your OS. This means if you
are looping for the output in, say, a while loop...
while true do
  status, my_line = sd:receive_lines(1)
... my_line will probably be "hello\nworld\n" NOT "hello\n".
... If we process just hello we would miss world!
Even more insidiously, the packet could be split in the middle so that only
"hello\nwo" is delivered at first. Unless you store that "wo" for the next
call, you will be working with incomplete or wrong data.
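
The split-packet case can be reproduced without a network. The sketch below uses a hypothetical `mock_socket` (not part of NSE) whose receive_lines() just hands back pre-cut chunks, the way read() might deliver them.

```lua
-- Hypothetical mock socket: every receive_lines() call returns the next
-- chunk, ignoring newlines, much like a raw read() would.
local function mock_socket(chunks)
  local n = 0
  return {
    receive_lines = function(self, count)
      n = n + 1
      if chunks[n] then return true, chunks[n] end
      return false, "EOF"
    end
  }
end

local sd = mock_socket({ "hello\nwo", "rld\n" })

-- Naive loop: treats whatever arrives as exactly one line.
local naive = {}
while true do
  local status, data = sd:receive_lines(1)
  if not status then break end
  naive[#naive + 1] = data
end

for i, chunk in ipairs(naive) do
  print(i, string.format("%q", chunk))
end
```

Neither element of `naive` is a real line: the first holds one and a half lines and the second holds the tail of "world".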
The way some NSE scripts deal with this (see showHTTPVersion.nse) is
by keeping a string "response" and appending all data to the end of
that and then running regexps on the response at every step to see if
any match. This method will work fine for some tasks.
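
A minimal sketch of that accumulate-and-match approach, for illustration. This is not the code from showHTTPVersion.nse; the `mock_socket` helper, the sample response and the Lua pattern are all assumptions.

```lua
-- Hypothetical mock socket that returns pre-cut chunks.
local function mock_socket(chunks)
  local n = 0
  return {
    receive_lines = function(self, count)
      n = n + 1
      if chunks[n] then return true, chunks[n] end
      return false, "EOF"
    end
  }
end

-- The Server header is deliberately split across three chunks.
local sd = mock_socket({
  "HTTP/1.1 200 OK\r\nSer", "ver: Exam", "ple/1.0\r\n\r\n"
})

local response = ""
local server
while not server do
  local status, data = sd:receive_lines(1)
  if not status then break end
  response = response .. data
  -- Re-run the pattern on everything received so far; it only matches
  -- once the complete header line has arrived.
  server = string.match(response, "Server: ([^\r\n]+)\r?\n")
end

print(server)
```

Requiring the trailing newline in the pattern is what stops it from matching a half-delivered header.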
But if you want to reliably process data line-by-line as it arrives you
need to use something called a "buffer". The most straightforward way to
implement this in modern languages is by using a closure. Although I
personally find Lua syntax very cumbersome and verbose, Lua does offer a
powerful set of primitives that are, in my opinion, vital to and sufficient
for productive programming: lexical closures, tail-call optimisation, and
dynamic typing.
If the concept of closures frightens you, you can probably get away with
thinking about them like objects: a closure is sort of an object with exactly
one method: "apply". ;)
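
For anyone who has not met closures before, a tiny sketch: `make_counter` (a name made up for this example) returns a function that carries its own private `count` variable around with it.

```lua
-- Each call to make_counter creates a fresh `count` that only the
-- returned function can see.
local function make_counter()
  local count = 0
  return function()
    count = count + 1
    return count
  end
end

local a = make_counter()
local b = make_counter()

local a1 = a()
local a2 = a()
local b1 = b()   -- b has its own count, untouched by the calls to a
print(a1, a2, b1)
```

The buffer works the same way, except the private state is the leftover data rather than a number.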
I'm including a fairly general closure-based buffer implementation that I
am using in my IRC script to process data on a line-by-line basis. Assuming
you have a socket sd you use it like so:
my_buffer = make_buffer(sd, "[\r\n]+")
and then
status, value = my_buffer()
status and value are the same as for receive_lines(1) (except see the comments).
As you can see it is useful for much more than just lines (anything separated
by something you can write a lua pattern for). Barring any so-far unnoticed bugs
this should be a very safe, reliable way to parse line-based protocols and
I suggest we put it (or something like it) into the NSE standard library.
Empty lines currently aren't returned which could be a problem for some protocols
(like HTTP) but this is a tiny tweak.
Best,
Doug
PS. It has just come to me that maybe the best pattern to use for regular
newlines might be "\r?\n" instead of "[\r\n]+"! Oh well. :)
-- Generic buffer implementation using lexical closures
--
-- Pass make_buffer a socket and a separator lua pattern [1]
--
-- Returns a function bound to your provided socket with behaviour identical
-- to receive_lines() except it will return AT LEAST ONE [2] and AT MOST ONE "line".
-- The data is returned WITHOUT the pattern/newline on the end.
-- Empty "lines" ARE NOT RETURNED.
--
-- [1] Use the pattern "[\r\n]+" for regular newlines
-- [2] Except where there is trailing "left over" data not terminated by a pattern
-- (in which case you get the data anyways)
--
-- -Doug, June, 2007
make_buffer = function(sd, sep)
  local self, result
  local buf = ""

  self = function()
    local i, j, status, value

    i, j = string.find(buf, sep)

    if i then
      if i == 1 then  -- empty line
        buf = string.sub(buf, j+1, -1)
        return self() -- tail
      else
        value = string.sub(buf, 1, i-1)
        buf = string.sub(buf, j+1, -1)
        return true, value
      end
    end

    if result then
      if string.len(buf) > 0 then -- left over data with no terminating pattern
        value = buf
        buf = ""
        return true, value
      end
      return nil, result
    end

    status, value = sd:receive()

    if status then
      buf = buf .. value
    else
      result = value
    end

    return self() -- tail
  end

  return self
end
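
For protocols like HTTP where empty lines matter, here is one possible version of the "tiny tweak", combined with the "\r?\n" pattern. It is a sketch, not a tested replacement: `make_buffer_keep_empty` and `mock_socket` are names made up for this example, and the mock only imitates a socket whose receive() returns status and data.

```lua
-- Variant of make_buffer that DOES return empty "lines" (as "").
local function make_buffer_keep_empty(sd, sep)
  local result      -- error value once the socket stops delivering data
  local buf = ""
  local function self()
    local i, j = string.find(buf, sep)
    if i then
      local value = string.sub(buf, 1, i - 1)  -- "" when i == 1
      buf = string.sub(buf, j + 1)
      return true, value
    end
    if result then
      if #buf > 0 then  -- left over data with no terminating pattern
        local value = buf
        buf = ""
        return true, value
      end
      return nil, result
    end
    local status, value = sd:receive()
    if status then
      buf = buf .. value
    else
      result = value
    end
    return self()  -- tail call
  end
  return self
end

-- Hypothetical mock socket that returns pre-cut chunks.
local function mock_socket(chunks)
  local n = 0
  return {
    receive = function(self)
      n = n + 1
      if chunks[n] then return true, chunks[n] end
      return false, "EOF"
    end
  }
end

-- Note the "\r" and "\n" of one line ending arriving in separate chunks.
local sd = mock_socket({
  "GET / HTTP/1.0\r\nHost: exam", "ple\r", "\n\r\nbody"
})
local next_line = make_buffer_keep_empty(sd, "\r?\n")

local lines = {}
while true do
  local status, line = next_line()
  if not status then break end
  lines[#lines + 1] = line
end

for i, line in ipairs(lines) do
  print(i, string.format("%q", line))
end
```

With "[\r\n]+" this variant could not report blank lines reliably, since a run of newlines matches as a single separator. "\r?\n" matches exactly one line ending at a time.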