
Nmap Development mailing list archives
[NSE] Unicode library
From: Daniel Miller <bonsaiviking () gmail com>
Date: Tue, 11 Mar 2014 06:38:30 -0500
Hello again, devs!

On the 19th of February, I committed another NSE library, unicode.lua (http://nmap.org/nsedoc/lib/unicode.html). It is intended for general-purpose lightweight encoding/decoding/transcoding of Unicode and other character encodings. The original purpose was to replace all the trivial null-filling and -skipping that scripts were using to "decode" Windows Unicode strings (UTF-16 LE).

As a result of this support, our SMB scripts should be able to preserve internationalized Windows share names, user names, etc. as well as authenticate with non-ASCII passwords. Displaying them to the user is a separate problem, since the conversion from UTF-16 to UTF-8 will remove the nulls, but will result in output like this: "Vi\xe1\xbb\x87t Nam" instead of "Việt Nam."

In light of that, future improvements could be:

* Console/terminal encoding detection for Nmap generally, with full UTF-8 support throughout. ICANN's new Unicode TLDs may prove difficult for Nmap to scan otherwise.
* Better error checking and recovery for decoding errors. Currently errors result in a failure to decode, but the library also accepts many things that are incorrect without warning.
* Converting scripts that currently negotiate Windows OEM strings to negotiate Unicode, since OEM code pages vary and cannot be negotiated.
* Normalization. This is unlikely to be complete, since Unicode normalization is an enormous topic. Much better to find a good C library that does this and incorporate it instead.

Some references:

"The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" - http://joelonsoftware.com/articles/Unicode.html

"Unicode Security Guide" - http://websec.github.io/unicode-security-guide/

Happy coding!
Dan

_______________________________________________
Sent through the dev mailing list
http://nmap.org/mailman/listinfo/dev
Archived at http://seclists.org/nmap-dev/
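
To make the null-skipping problem concrete, here is a rough sketch in Python (not NSE Lua, and not the unicode.lua API). It compares the old "drop every second byte" trick with a real UTF-16 LE to UTF-8 conversion. The helper names are hypothetical and chosen only for illustration, and the escaping function is an assumption about how non-ASCII bytes end up displayed, not Nmap's actual output code.

```python
# Illustration only: hypothetical helpers, not functions from unicode.lua.

def naive_decode(utf16le: bytes) -> bytes:
    """The old trick: keep every other byte, assuming the rest are nulls."""
    return utf16le[::2]

def utf16le_to_utf8(utf16le: bytes) -> bytes:
    """Proper transcoding: decode UTF-16 LE, then re-encode as UTF-8."""
    return utf16le.decode("utf-16-le").encode("utf-8")

def escape_non_ascii(data: bytes) -> str:
    """Assumed display step: show non-printable-ASCII bytes as \\xNN."""
    return "".join(
        chr(b) if 0x20 <= b < 0x7F else "\\x%02x" % b for b in data
    )

ascii_name = "ADMIN$".encode("utf-16-le")
viet_name = "Vi\u1ec7t Nam".encode("utf-16-le")  # U+1EC7 is the accented e

# For pure ASCII the trick happens to work, because every high byte is null.
print(naive_decode(ascii_name))

# For U+1EC7 the high byte (0x1e) is silently thrown away.
print(naive_decode(viet_name))

# Transcoding keeps the character, but the escaped UTF-8 bytes are what
# the user sees unless the output path understands UTF-8.
print(escape_non_ascii(utf16le_to_utf8(viet_name)))
```

The second print shows why null-skipping loses data for anything outside Latin-1, and the third reproduces the escaped form quoted above.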