Nmap Development
mailing list archives
[Proof of Concept] Efficient, ASCII-safe port compression
From: doug () hcsw org
Date: Fri, 6 Jul 2007 03:06:30 -0700
Hi nmap-dev!
I was thinking this evening about the problem of encoding long
lists of port strings efficiently and reliably. When I have done
large-scale scans in the past, I have run up against all sorts of
scalability problems, many of them based on not having an efficient,
transportable encoding for sets of ports.
Also, when you want to keep complete information on all the ports
in large scans you often end up listing out massive ASCII lists in
XML/greppable output. Consider when there are 20 open ports, but
30000-some closed and 30000-some filtered; Nmap can only collapse
one of those lists without throwing out information. Insignificant,
you say? Well, perhaps, but remember that for a large-scale distributed
scanning effort you want to be able to make use of tiny 1 MB shell
accounts for scanning as well as your comfy terabyte servers.
Finally, I was considering the problem of not being able to know
what ports were scanned in the XML output of any given scan
because we don't encode the contents of the nmap-services file in
the scan itself.
Let me introduce you to portcompress:
http://hcsw.org/downloads/portcompress.c
This simple, portable C file is a program that runs in 2 modes:
Usage: portcompress [-e|-d]
In encode mode (-e) it reads whitespace-separated decimal
port numbers until EOF and prints out a compressed port list.
In decode mode (-d) it reads in a compressed port list and
prints out the corresponding ports separated by newlines.
Given a list of integers, it encodes them in an efficient run-length
encoded, ASCII-armoured format:
$ echo "1 2 3 4 5 6 7 8 9 10" | ./portcompress -e
JZ**xA
$ echo "1 2 3 4 5 6 7 8 9 10" | ./portcompress -e | ./portcompress -d
1
2
3
4
5
6
7
8
9
10
$ echo "1 2 3 4 5 6 7 8 9 10 65533 65534 65535" | ./portcompress -e
JZ**u*A
$ echo "1 2 3 4 5 6 7 8 9 10 9876 65533 65534 65535" | ./portcompress -e
JZyaF32WT8A
$ cat ~/nmap/svn/nmap/nmap-services |perl -ne 'print "$1\n" if m/^[\w-_]*\s*(\d+)/;'|sort|uniq|./portcompress -e|wc -c
696
That's right, all the TCP and UDP port numbers in the services file
can be enumerated in 696 bytes, ASCII-safe! It would be something like
4 times longer (and not ASCII-safe) if we just listed the 2-byte integers
back-to-back.
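The sample outputs above are consistent with a simple pipeline: a
65536-entry bitmap of ports, run-length encoded with the bit-stream
protocol quoted from the source later in this message, then packed six
bits per character. Here is a minimal Python sketch of that reading. It
is a reconstruction from the three samples, not code from
portcompress.c. The alphabet (A-Z, a-z, 0-9, with '*' as value 63), the
trailing 'A', and the names compress_ports, rle and ALPHABET are all
assumptions, and '+' for value 62 is a pure placeholder because no
sample exercises that value.

```python
import string
from itertools import groupby

# Assumption: inferred from the sample outputs, not taken from portcompress.c.
# Values 0-61 -> A-Z, a-z, 0-9; value 63 -> '*'. Value 62 never appears in the
# samples, so '+' here is only a placeholder.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+*"

# Width selectors from the quoted protocol: payload width in bits -> 2-bit code.
SELECTOR = {2: "00", 4: "01", 8: "10", 16: "11"}


def rle(bits):
    """Run-length encode a string of '0'/'1' characters per the protocol."""
    out = []
    for bit, group in groupby(bits):
        run = len(list(group))
        if run < 4:
            # Short runs are written bit by bit: 0 -> "00", 1 -> "11".
            out.append(bit * 2 * run)
        else:
            # A 65536-entry bitmap cannot produce a run too long for 16 bits.
            value = run - 4
            # Assumption: use the narrowest width that can hold the value.
            width = next(w for w in (2, 4, 8, 16) if value < (1 << w))
            tag = "01" if bit == "0" else "10"
            out.append(tag + SELECTOR[width] + format(value, "0{}b".format(width)))
    return "".join(out)


def compress_ports(ports):
    """Hypothetical end-to-end encoder: ports -> bitmap -> RLE -> ASCII."""
    present = set(ports)
    bitmap = "".join("1" if p in present else "0" for p in range(65536))
    stream = rle(bitmap)
    stream += "0" * (-len(stream) % 6)  # zero-pad to a whole 6-bit group
    body = "".join(ALPHABET[int(stream[i:i + 6], 2)]
                   for i in range(0, len(stream), 6))
    # Assumption: every sample ends in 'A', treated here as a fixed terminator.
    return body + "A"


if __name__ == "__main__":
    print(compress_ports(range(1, 11)))
```

Under these assumptions the sketch gives JZ**xA for ports 1 to 10,
JZ**u*A once 65533 to 65535 are added, and JZyaF32WT8A once 9876 is
added as well, matching the three transcripts above. Whether it agrees
with portcompress on other inputs (anything that produces value 62, for
instance) is untested.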
The secret is an efficient run-length encoding algorithm I developed.
It encodes each run of length 4 or more as a count of the length of
the run; shorter runs are written out bit by bit. For maximum
efficiency, the run length is itself stored in a variable-width field.
(This is very similar to an algorithm a professor of mine, Dr. Paeth,
invented. Dr. Paeth's algorithms are also used in PNG, JPEG, etc.)
Here is the bit-stream protocol, from the source:
Protocol:
  Either
    00 = 0
    11 = 1
  or
    01 = RLE string of 0s
    10 = RLE string of 1s
  followed by one of
    00 = 2 bits
    01 = 4 bits
    10 = 8 bits
    11 = 16 bits
  followed by (run length - 4),
  encoded as a binary number of the previously specified width
Examples:
  "101" => "110011"
  "111" => "111111"
  "1111" => "100000"
  "11111" => "100001"
Anyways, this was a very quick hack-job so please let me know if you
find any bugs or have other suggestions! I took code from at least 2
other bits of Hardcore Software: ASCII armour and nuff. :)
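The bit-stream protocol above can be sketched in a few lines of Python.
This is an illustration, not the code from portcompress.c. The names
rle_encode and rle_decode are made up here, and two choices are
assumptions: picking the narrowest width that fits (it matches the
examples, but the protocol text does not require it) and splitting any
run too long for a 16-bit count into several runs.

```python
# Width selectors from the quoted protocol: 2-bit code -> payload width in bits.
WIDTHS = [("00", 2), ("01", 4), ("10", 8), ("11", 16)]
MIN_RUN = 4  # runs shorter than this are written bit by bit


def rle_encode(bits):
    """Encode a string of '0'/'1' characters into the protocol's bit stream."""
    out = []
    i = 0
    while i < len(bits):
        bit = bits[i]
        j = i
        while j < len(bits) and bits[j] == bit:
            j += 1
        run = j - i
        if run < MIN_RUN:
            # Literal bits: 0 -> "00", 1 -> "11".
            out.append(bit * 2 * run)
        else:
            # Assumption: cap the run at what 16 bits can hold; the loop
            # picks up any remainder as a new run.
            run = min(run, MIN_RUN + 0xFFFF)
            value = run - MIN_RUN
            # Assumption: choose the narrowest width that can hold the value.
            selector, width = next((s, w) for s, w in WIDTHS if value < (1 << w))
            tag = "01" if bit == "0" else "10"
            out.append(tag + selector + format(value, "0{}b".format(width)))
        i += run
    return "".join(out)


def rle_decode(stream):
    """Inverse of rle_encode: expand a protocol bit stream back into bits."""
    widths = dict(WIDTHS)
    out = []
    i = 0
    while i + 2 <= len(stream):
        tag = stream[i:i + 2]
        i += 2
        if tag == "00":
            out.append("0")
        elif tag == "11":
            out.append("1")
        else:
            width = widths[stream[i:i + 2]]
            i += 2
            value = int(stream[i:i + width], 2)
            i += width
            out.append(("0" if tag == "01" else "1") * (value + MIN_RUN))
    return "".join(out)


if __name__ == "__main__":
    for sample in ("101", "111", "1111", "11111"):
        print(sample, "=>", rle_encode(sample))
```

Run as a script, this prints the same four pairs listed under Examples.
The decoder has no way to tell padding from data, so a real stream would
also need a length or terminator convention, which the protocol excerpt
does not specify.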
Enjoy,
Doug
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org