Re: [RFC] NSE pack/unpack library
From: Brandon Enright <bmenrigh () ucsd edu>
Date: Thu, 19 Jun 2008 23:14:17 +0000
On Fri, 20 Jun 2008 00:29:54 +0200
"Philip Pickering" <pgpickering () gmail com> wrote:
I've started working on an NSE library for handling binary data,
comparable to the perl pack/unpack functions. It's based on the
lpack library and therefore it differs from perl's pack/unpack.
Excellent. string.sub and string.byte were getting old :-)
Basically, there will be two functions, bin.pack and bin.unpack:
bin.pack(template, arg1, arg2, ...)
... template is the format string (see below)
... argN are the data values, which should be packed according to
the template
--> returns a string with the packed data
... bindata is a string with the packed binary data
... template is, again, the format string
--> returns the position where it stopped as first value and
the unpacked data values as the following return values
(the position can be used to subsequently fetch more data
by using it as a third parameter)
If I understand this correctly, we'll have to unpack in a loop rather than
doing something like:
local a, b, c, d
a, b, c, d = bin.unpack("C C C C", somestring)
It sure would be nice to be able to get a list back rather than having
to do it in a loop.
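
As a rough illustration of the position-first return convention described above, here is a minimal sketch in Python, using the standard struct module as a stand-in for the proposed bin library. The helper name unpack_at is hypothetical and not part of the proposal; it only shows how several values could come back from one call, with the stop position reusable for a follow-up call.

```python
import struct

def unpack_at(template, data, pos=0):
    """Hypothetical helper: return (next_pos, value1, value2, ...).

    Mimics the convention described above: the stop position comes
    back first and can be passed in again to fetch more data.
    Python offsets are zero-based.
    """
    values = struct.unpack_from(template, data, pos)
    return (pos + struct.calcsize(template),) + values

data = bytes([1, 2, 3, 4, 5, 6])

# Four unsigned bytes in one call, no loop needed.
pos, a, b, c, d = unpack_at("4B", data)

# Continue from where the first call stopped.
pos2, e, f = unpack_at("2B", data, pos)
```

Under this convention the multiple-assignment form works as long as the first return value is assigned to a position variable.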
The format string which lpack uses is a bit different to perl's,
some operator characters stand for completely different things,
but I modified as many as possible to match perl a bit more. Right
now they are:
'Z' ... zero-terminated string
'p' ... string preceded by length byte
'P' ... string preceded by length word
'a' ... string preceded by length size_t
'A' ... string
'f' ... float
'd' ... double
'n' ... Lua number
'c' ... char
'C' ... byte = unsigned char
's' ... short
'S' ... unsigned short
'i' ... int
'I' ... unsigned int
'l' ... long
'L' ... unsigned long
'<' ... little endian
'>' ... big endian
'=' ... native endian
(note that the last three are modifiers)
Being different than perl should be fine as long as we have solid
documentation (which I suspect we will have).
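
The effect of the byte-order modifiers can be illustrated with Python's struct module, which happens to use the same '<', '>' and '=' prefixes. This is only an analogy, not the proposed library; note that Python's 'H' is an unsigned short, which corresponds to 'S' in the list above.

```python
import struct

value = 0x1234

little = struct.pack("<H", value)  # least significant byte first
big = struct.pack(">H", value)     # most significant byte first

# Unpacking with the wrong byte order silently yields a different number.
(swapped,) = struct.unpack(">H", little)
```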
Numerical modifiers following the operators stand for
repetitions (or to tell unpack how many characters to
read if using 'A').
How about the * operator?
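
For comparison, a small sketch of repetition counts in Python's struct module, used here purely as an analogy. In Python the count precedes the format code, whereas the proposal places it after the operator, and Python has no '*'; computing the count from the data length is one assumed way to get the same effect.

```python
import struct

# A count before a numeric code repeats it: three unsigned bytes.
packed = struct.pack("3B", 10, 20, 30)

# A count before "s" is a length instead: read exactly 5 characters.
(word,) = struct.unpack_from("5s", b"hello world")

# Python's struct has no "*"; one way to emulate "all remaining bytes"
# is to compute the count from the data length.
data = b"\x01\x02\x03\x04\x05"
rest = struct.unpack("%dB" % len(data), data)
```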
What's missing is the B/b (bit string starting with MSB/LSB) and
H (hex string). Operators like n, N, v and V for big/little endian
shorts and longs seem unnecessary because of '<' and '>'.
Personally I find H the most useful of this bunch. I could do without
B/b. < and > operators probably make more sense than separate
operators for each byte order.
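
To make the usefulness of a hex-string operator concrete, here is a minimal Python sketch of the conversion an 'H' operator would perform. It uses Python's built-in bytes methods as a stand-in; how the proposed library would expose this is not specified.

```python
# Hex text to raw bytes, as packing with a hex-string operator would do.
raw = bytes.fromhex("deadbeef")

# Raw bytes back to hex text, as unpacking would do.
text = raw.hex()
```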
I'm unsure if certain features from the perl-version are
needed, for example 'u' to uuencode strings
(since another task I'll work on addresses base64 they
should probably get their own mechanism, if uuencoding
is really needed).
Yeah, it would be nice to have stand-alone base64 routines rather than
to use pack and unpack.
Operators I want to add are the aforementioned 'b'/'B' and 'H',
but also 'x' (for a null byte) and maybe 'X' to back up a byte.
Using 'X' significantly obscures what the format string is doing.
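
A short sketch of what an 'x' null-byte operator does, again using Python's struct module as an analogy. Python's 'x' is a pad byte that consumes no argument when packing and yields no value when unpacking; Python has no counterpart to 'X', so it is not shown.

```python
import struct

# "x" inserts a zero byte between the two values without taking an argument.
packed = struct.pack("BxB", 1, 2)

# When unpacking, the pad byte is skipped and produces no value.
first, second = struct.unpack("BxB", packed)
```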
Are there any other features which might be useful
or important? I'd also appreciate any other comments.
I like all your ideas here and I think this will be very useful.
Sent through the nmap-dev mailing list
Archived at http://SecLists.Org