Nmap Development: Re: [NSE] http.lua and delimiters

From: jah <jah_at_zadkiel.plus.com>
Date: Wed, 01 Oct 2008 16:20:37 +0100

On 01/10/2008 01:32, David Fifield wrote:
> On Wed, Sep 24, 2008 at 03:43:21AM +0100, jah wrote:
>
>> I noticed a few issues with showHTMLTitle.nse and whilst I was working
>> through these I found that http.request() was not always returning an
>> HTTP response correctly.
>>
>> Specifically, the call to stdnse.make_buffer() uses "\r\n" as its
>> pattern to delimit lines in the response. This pattern was changed from
>> "\r?\n" when the ability to dechunk chunked encoding was added [1], in
>> tandem with a change to the second argument to table.concat() when
>> putting the body of the response back together again (from "\n" to
>> "\r\n"), to avoid modifying the body and messing up the dechunking process.
>>
>> I decided to knock up a quick script which sends an HTTP request, uses
>> socket.receive() in a loop to collect the response as an unmolested
>> string and then detects the characters used to delimit the header and
>> body and the characters used to delimit lines in both the header and the
>> body.
>>
>> I then ran this script against a few hundred thousand random hosts and
>> extracted the following info from the results.
>>
>> 3902 hosts had port 80 open, but only 2770 hosts responded to the GET
>> request.
>>
>> 2451 ~88.5% used \r\n\r\n to separate header and body
>> Of these, 2374 delimited header values with \r\n, 5 used \n and 72 were
>> single value headers containing no delimiters.
>> Of the same 2451 hosts, 335 were header only responses, 937 delimited
>> lines in the body of the response with \r\n and 1179 with \n.
>>
>> 165 ~ 6% used \n\n to separate header and body
>> Of these, 7 delimited header values with \r\n, 17 used \n and 141 were
>> single value headers containing no delimiters.
>> Of the same 165 hosts, 3 were header only responses, not one delimited
>> lines in the body of the response with \r\n and the remaining 162 used \n.
>>
>> 154 ~5.5% responded with a header and a body not separated by a double
>> newline.
>> These were all headerless responses which were dealt with in a previous
>> patch [2].
>>
>
> Thanks for doing this great research! I was able to reproduce your
> results using the nl.nse script. I ran
>
> nmap -iR 10000 -PN -p 80 -sC --script=nl.nse -n -T4 -v
>
> 125 hosts had port 80 open, but 30 of them returned no data. Of the
> remaining 95,
>
> 82 86.3% used a \r\n\r\n delimiter
> 5 5.3% used a \n\n delimiter
> 8 8.4% used neither of the above delimiters
>
> I think using a heuristic to get the header delimiter is fine. Wget does
> it: it splits the header from the body by looking for \n\n or \n\r\n,
> and splits header lines on either \n or \r\n. cURL does it: it ends
> header lines on \n and the header on a line beginning with \r or \n.
> The only thing I would do differently is this bit of code:
>
>     -- try and separate the head from the body
>     if response:match( "\r\n\r\n" ) then
>         header, body = response:match( "^(.-)\r\n\r\n(.*)$" )
>     elseif response:match( "\n\n" ) then
>         header, body = response:match( "^(.-)\n\n(.*)$" )
>
> This would fail if the header uses \n delimiters but there is an
> \r\n\r\n somewhere in the body; the first match would succeed and grab
> part of the body with the header. What you want is whichever of those
> two matches gives you a shorter header.
>
Yes, it would fail if the header and body were separated by \n\n and
there was a \r\n\r\n somewhere in the body. As you can see in my results,
though, this never happened, and it is probably a very unlikely event.
You are right, though: we should add a check for this and return the
shorter header.
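
As a rough illustration of that check, something like the following might
do. It is only a sketch, not the code from the patch: split_response is a
hypothetical name that does not exist in http.lua, and it assumes the whole
response is already held in one string. It looks for both separators and
splits at whichever one occurs first, which is the same as taking the
shorter header. It does not handle the mixed \n\r\n separator that Wget
accepts.

```lua
-- Hypothetical helper (not part of http.lua): split a raw HTTP response
-- into header and body at whichever blank-line separator occurs first.
local function split_response(response)
  -- Plain-text searches (fourth argument true), so no pattern magic applies.
  local crlf_start, crlf_end = response:find("\r\n\r\n", 1, true)
  local lf_start, lf_end = response:find("\n\n", 1, true)

  local head_end, body_start
  if crlf_start and (not lf_start or crlf_start < lf_start) then
    -- \r\n\r\n comes first (or is the only separator present).
    head_end, body_start = crlf_start - 1, crlf_end + 1
  elseif lf_start then
    -- \n\n comes first (or is the only separator present).
    head_end, body_start = lf_start - 1, lf_end + 1
  else
    -- No recognisable header/body separator.
    return nil, nil
  end

  -- The body is returned exactly as received, with no line splitting.
  return response:sub(1, head_end), response:sub(body_start)
end
```

With a header delimited by \n and a \r\n\r\n inside the body, this returns
the short header rather than swallowing part of the body.
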
> Guessing the line ending for the header is fine, but we shouldn't do
> anything like that for the body. We don't even know that the body is
> made up of "lines"; it should be treated as a block of data, in other
> words not split up and rejoined. In the case of chunked encoding, cURL
> is strict about requiring \r\n, so we should be safe doing the same.
> Wget doesn't do chunked.
>
I agree and this is exactly what happens with my modification. The body
is separated from the header and returned in one unmodified string
unless the server reports it as chunked and we managed to match a
delimiter for use in dechunking.
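
To make the "unless the server reports it as chunked" step concrete, here
is a minimal sketch of how the header might be read with either line
ending and then checked for chunked encoding. parse_header and is_chunked
are hypothetical names used for illustration only; they are not the
functions in the patch, and folded (multi-line) header values are ignored.

```lua
-- Hypothetical helper: parse a raw header block into a table keyed by
-- lower-cased field name, accepting either \r\n or \n as the line ending.
local function parse_header(header)
  local fields = {}
  for line in header:gmatch("[^\n]+") do
    -- Drop a trailing \r so that both line endings are handled alike.
    -- (gsub returns two values; the assignment keeps only the string.)
    line = line:gsub("\r$", "")
    -- "Name: value"; the status line has no such form and is skipped.
    local name, value = line:match("^([^:%s]+)%s*:%s*(.-)%s*$")
    if name then
      fields[name:lower()] = value
    end
  end
  return fields
end

-- Hypothetical helper: true if the parsed header reports chunked encoding.
local function is_chunked(fields)
  local te = fields["transfer-encoding"]
  return te ~= nil and te:lower():find("chunked", 1, true) ~= nil
end
```

Only the header is split into lines here; the body is never touched by
this step.
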

The code prefers \r\n as a chunk delimiter and, because of the method I'm
using to dechunk, doesn't need to worry about line breaks inside chunks:
it merely does string.sub() for as many characters as the chunk size
defines and concatenates the chunks without delimiters. Thus there is no
modification to the body of the response.
So far in my tests, the sum of the chunk sizes has equalled the length of
the body returned by http.lua every time.
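
The idea can be sketched roughly as below. This is an illustration under
stated assumptions, not the code from the patch: dechunk is a hypothetical
name, the sketch is strict about \r\n after each chunk-size line (as cURL
is) and has no \n fallback, and trailers after the last chunk are ignored.
It returns the reassembled body together with the sum of the chunk sizes,
so the two can be compared as described above.

```lua
-- Hypothetical dechunker (a sketch, not the http.lua implementation).
-- Returns the decoded body and the sum of the chunk sizes, or nil and an
-- error message if the chunked body is malformed or truncated.
local function dechunk(body)
  local chunks, total = {}, 0
  local pos = 1
  while true do
    -- Chunk-size line: hex digits, optional chunk extension, then \r\n.
    -- The leading ^ anchors the match at pos.
    local _, line_end, hex = body:find("^(%x+)[^\r\n]*\r\n", pos)
    if not hex then
      return nil, "malformed chunk size at offset " .. pos
    end
    local size = tonumber(hex, 16)
    if size == 0 then break end -- last chunk

    -- Take exactly `size` bytes; line breaks inside the chunk are data.
    local data_end = line_end + size
    if data_end > #body then
      return nil, "truncated chunk"
    end
    chunks[#chunks + 1] = body:sub(line_end + 1, data_end)
    total = total + size

    -- Skip the \r\n that follows the chunk data, if present.
    pos = data_end + 1
    if body:sub(pos, pos + 1) == "\r\n" then
      pos = pos + 2
    end
  end
  -- Concatenate without a delimiter so the body is not modified.
  return table.concat(chunks), total
end
```

Because the chunks are cut out by length and joined with no separator, the
length of the result always equals the sum of the chunk sizes.
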

I shall concentrate on finding servers that return chunked data and
continue to try and break this modification, but I believe it achieves
the goal which was to handle any combination of delimiters and return
the body unmolested.

I still haven't found any instance where the header and the body are not
delimited by a double newline (except when the response is header-only),
so the part of the code which deals with these cases (kind of) is
looking ever more unnecessary.

Regards,

jah

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org
Received on Oct 01 2008
