Nmap Development mailing list archives

[NSE] Unicode library
From: Daniel Miller <bonsaiviking () gmail com>
Date: Tue, 11 Mar 2014 06:38:30 -0500

Hello again, devs!

On the 19th of February, I committed another NSE library, unicode.lua
(http://nmap.org/nsedoc/lib/unicode.html). It is intended for
general-purpose lightweight encoding/decoding/transcoding of Unicode
and other character encodings. The original purpose was to replace all
the trivial null-filling and -skipping that scripts were using to
"decode" Windows Unicode strings (UTF-16 LE).
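To illustrate why the null-filling and -skipping shortcut is not real
decoding, here is a minimal sketch in Python (not NSE Lua, and not the
unicode.lua API). The function names null_fill, null_skip and
utf16le_to_utf8 are hypothetical and exist only for this example. The
shortcut round-trips ASCII but corrupts anything outside Latin-1, while
a real UTF-16 LE decode followed by a UTF-8 encode preserves it.

```python
def null_fill(text: bytes) -> bytes:
    """The shortcut "encoder": append a null byte after every input byte."""
    return b"".join(bytes([b, 0]) for b in text)


def null_skip(raw: bytes) -> bytes:
    """The shortcut "decoder": throw away every null byte."""
    return raw.replace(b"\x00", b"")


def utf16le_to_utf8(raw: bytes) -> bytes:
    """Real transcoding: decode UTF-16 LE code units, re-encode as UTF-8."""
    return raw.decode("utf-16-le").encode("utf-8")


if __name__ == "__main__":
    ascii_name = "ADMIN$".encode("utf-16-le")
    intl_name = "Vi\u1ec7t Nam".encode("utf-16-le")  # "Việt Nam"

    # ASCII survives the shortcut in both directions.
    print(null_skip(ascii_name), null_fill(b"ADMIN$") == ascii_name)

    # U+1EC7 is the byte pair C7 1E in UTF-16 LE; there is no null to skip,
    # so the shortcut leaves two meaningless bytes in the middle of the name.
    print(null_skip(intl_name))

    # Proper transcoding yields the three-byte UTF-8 sequence E1 BB 87.
    print(utf16le_to_utf8(intl_name))
```

The shortcut only works when every code unit has a zero high byte, which
holds for ASCII and Latin-1 text and for nothing else.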

With this library, our SMB scripts should be able to preserve
internationalized Windows share names, user names, etc., and to
authenticate with non-ASCII passwords. Displaying them to the user is
a separate problem: converting from UTF-16 to UTF-8 removes the nulls,
but the non-ASCII bytes still come out escaped, so the user sees
"Vi\xe1\xbb\x87t Nam" instead of "Việt Nam". In light of that, future
improvements could be:

* Console/terminal encoding detection for Nmap generally, with full
UTF-8 support throughout. ICANN's new Unicode TLDs may prove difficult
for Nmap to scan otherwise.

* Better error checking and recovery for decoding errors. Currently a
decoding error makes the whole decode fail, yet the library also
silently accepts many malformed inputs.

* Converting scripts that currently negotiate Windows OEM strings to
negotiate Unicode, since OEM code pages vary and cannot be negotiated.

* Normalization. This is unlikely to be complete, since Unicode
normalization is an enormous topic. Much better to find a good C
library that does this and incorporate it instead.
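Three of the items above can be sketched in Python using only the
standard library. This is an illustration under stated assumptions, not
a proposal for how Nmap or unicode.lua should implement them: every
function name here is hypothetical, the terminal encoding is passed in
as a plain string rather than detected, and the escaping rule (printable
ASCII kept, everything else as \xNN) is only an approximation of the
escaped output shown earlier.

```python
import unicodedata


def decode_utf16le_strict(raw: bytes) -> str:
    """Decode UTF-16 LE, reporting malformed input instead of accepting it."""
    try:
        return raw.decode("utf-16-le", errors="strict")
    except UnicodeDecodeError as err:
        raise ValueError(
            f"bad UTF-16 LE at byte {err.start}: {err.reason}"
        ) from err


def same_text(a: str, b: str) -> bool:
    """Compare two strings after canonical composition (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def escape_non_ascii(data: bytes) -> str:
    """Keep printable ASCII, show every other byte as \\xNN.

    Simplification: a literal backslash is not escaped.
    """
    return "".join(
        chr(b) if 0x20 <= b < 0x7F else f"\\x{b:02x}" for b in data
    )


def display(data: bytes, terminal_encoding: str) -> str:
    """Show UTF-8 as text only on a UTF-8 terminal and only if it is valid."""
    name = terminal_encoding.lower().replace("-", "").replace("_", "")
    if name == "utf8":
        try:
            return data.decode("utf-8")
        except UnicodeDecodeError:
            pass  # fall through to the escaped form
    return escape_non_ascii(data)


if __name__ == "__main__":
    name = "Vi\u1ec7t Nam".encode("utf-8")
    print(display(name, "UTF-8"))
    print(display(name, "ascii"))
    # Precomposed U+1EC7 versus e + combining dot below + combining circumflex.
    print(same_text("Vi\u1ec7t", "Vie\u0323\u0302t"))
```

Python's unicodedata module is used here only because it ships
normalization tables; it stands in for the kind of existing library the
last item suggests borrowing rather than reimplementing.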

Some references:

"The Absolute Minimum Every Software Developer Absolutely, Positively
Must Know About Unicode and Character Sets (No Excuses!)" -
http://joelonsoftware.com/articles/Unicode.html

"Unicode Security Guide" - http://websec.github.io/unicode-security-guide/

Happy coding!
Dan
_______________________________________________
Sent through the dev mailing list
http://nmap.org/mailman/listinfo/dev
Archived at http://seclists.org/nmap-dev/