Security Incidents mailing list archives

very interesting 0day tool... http honeypot in action


From: Michal Zalewski <lcamtuf () coredump cx>
Date: Tue, 12 Mar 2002 11:17:26 -0500 (EST)


Hello list[s],

My small, home-brew honeypot was hit by something pretty interesting today
- an automated, unpublished, not widely used web reconnaissance tool. I
do not have a better name for it - it appears to gather information about
the structure of your webserver by recursively downloading the data,
querying external web crawlers (probably google.com) to include data not
directly referenced on your webpages, and later trying to brute-force
certain locations on the server (such as administration scripts, logs,
misc files, pr0n). It is safe to assume that further client-side
processing is done to classify the contents, aggregate them and extract
possibly sensitive information from the noise.

Its "behavioral patterns" are distinctive and uncommon - this is
not yet another "common cgi scripts" scanner. It seems to be designed to
perform targeted attacks. I couldn't find any references to this tool, or
any logs showing this kind of activity in the past. I guess many readers
may find it interesting to examine their logs or analyze it further.

Such tools are relatively difficult to write (and this one is far from
perfect, as you will see later), but are also very valuable for
potential attackers or pen-testers. As far as I know, there are no
comprehensive tools of this kind available publicly. I know that many
people (including myself) have private code of this kind. This is
also good evidence that sufficiently challenging, customized honeypots
can be used to capture targeted, smart attacks. I never thought that this
really trivial installation would provide such results.

The tool is apparently launched by hand against a specific host. This can
be guessed by analyzing the initial behavior - the attacker first made a
few regular, slow connections to the server, one of them with a typo. Two
minutes later, he/she launched the tool, which kept firing roughly 5-10
HEAD and GET requests per second, approximately 1000 requests in total.
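
For readers who want to look for a similar burst in their own logs, here
is a minimal sketch that counts requests per client per second. It is not
part of the original analysis; the name peak_rate and the regular
expression are assumptions based on the Apache common log format used in
the excerpts in this post.

```python
import re
from collections import Counter

# Assumed layout: Apache common log format, as in the excerpts in this post.
LOG_RE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) (\S+)" (\d{3}) (\S+)'
)

def peak_rate(lines):
    """Return {host: highest number of requests logged in any one second}."""
    per_second = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if m:
            host, stamp = m.group(1), m.group(2)
            # Log timestamps have one-second resolution, so the raw
            # timestamp string is a usable per-second bucket key.
            per_second[(host, stamp)] += 1
    peaks = {}
    for (host, _stamp), count in per_second.items():
        peaks[host] = max(peaks.get(host, 0), count)
    return peaks
```

Calling peak_rate(open("access_log")) and sorting the result by value
should put a client like this one near the top. The file name is only an
example.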

The attack was apparently triggered by curiosity - the server I am
referring to is running a minimal HTTP honeypot, providing bogus
"secret" data to visitors. The "secret" URL was "leaked" to certain
communities (e.g., IRC channels). This is the initial activity (to protect
my honeypot, I've changed the "secret" URL slightly):

node-d-2425.a2000.nl - - [12/Mar/2002:14:15:59 +0100] "GET /privare HTTP/1.1" 404 788
node-d-2425.a2000.nl - - [12/Mar/2002:14:16:13 +0100] "GET /private%20stuff/ HTTP/1.1" 200 183
node-d-2425.a2000.nl - - [12/Mar/2002:14:16:44 +0100] "GET /privare HTTP/1.1" 404 788
node-d-2425.a2000.nl - - [12/Mar/2002:14:16:48 +0100] "GET /privare%20stuff/ HTTP/1.1" 404 788
node-d-2425.a2000.nl - - [12/Mar/2002:14:17:34 +0100] "GET /private%20stuff/pass.shtml?pass=blaat HTTP/1.1" 200 399
node-d-2425.a2000.nl - - [12/Mar/2002:14:17:43 +0100] "GET /private%20stuff/passwd.dat HTTP/1.1" 200 48938
node-d-2425.a2000.nl - - [12/Mar/2002:14:19:14 +0100] "GET /private%20stuff/index2.shtml HTTP/1.1" 200 23
node-d-2425.a2000.nl - - [12/Mar/2002:14:19:20 +0100] "GET /private%20stuff/index1.shtml HTTP/1.1" 200 23
node-d-2425.a2000.nl - - [12/Mar/2002:14:19:24 +0100] "GET /private%20stuff/index3.shtml HTTP/1.1" 404 788

As you can see, there's a gap between 14:17 and 14:19, the time the
attacker used to examine the passwd.dat file he/she obtained from the
system.
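
A pause like this can also be found mechanically. The following is a rough
sketch, assuming the log lines come from a single client and are in
chronological order; idle_gaps and the 60-second default threshold are
illustrative choices, not something taken from the original analysis.

```python
import re
from datetime import datetime

STAMP_RE = re.compile(r'\[([^\]]+)\]')

def idle_gaps(lines, min_seconds=60):
    """Return (previous, next, seconds) for pauses of at least min_seconds."""
    stamps = []
    for line in lines:
        m = STAMP_RE.search(line)
        if m:
            # %b expects English month abbreviations, which holds in the
            # default C locale.
            stamps.append(
                datetime.strptime(m.group(1), "%d/%b/%Y:%H:%M:%S %z")
            )
    gaps = []
    for prev, cur in zip(stamps, stamps[1:]):
        delta = (cur - prev).total_seconds()
        if delta >= min_seconds:
            gaps.append((prev, cur, delta))
    return gaps
```

Fed the excerpt above, this reports the 91-second pause between the
passwd.dat download at 14:17:43 and the next request at 14:19:14.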

Then, the scan started. Phase 1 was a rapid, recursive download of the
contents:

node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "GET / HTTP/1.0" 200 17421
node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "HEAD /head.jpg HTTP/1.1" 200 0
node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "HEAD /lcam.jpg HTTP/1.1" 200 0
node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "GET /prof.html HTTP/1.0" 200 20479
node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "GET /soft/ HTTP/1.0" 200 7966
node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "HEAD /mobp.jpg HTTP/1.1" 200 0
node-d-2425.a2000.nl - - [12/Mar/2002:14:20:49 +0100] "GET /mobp/ HTTP/1.0" 200 15305

Note that the fingerprint of this tool is quite distinctive - HTTP/1.0 GET
on HTML files and directories, and HTTP/1.1 (different version!) HEAD on
other file types. Interesting... All requests have the 'Referer' field set
to the server name (http://myhost/), and 'User-Agent' set to 'Mozilla/4.0
(compatible; MSIE 5.0; Windows 98; DigExt)', which is, quite obviously,
bogus. The remote system appears to run Windows right now, but I am not
the administrator of this box, so I couldn't run p0f, tcpdump or such.
Interestingly, this crawler attempts to index every directory even if
it is not explicitly referenced in HTML code. For example, if I have a
link to catspace/BIGLOG.txt on my webpage, the crawler will attempt to
index the catspace/ directory too:

node-d-2425.a2000.nl - - [12/Mar/2002:14:20:59 +0100] "GET /catspace/ HTTP/1.0" 403 720
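
The version mismatch described above (HTTP/1.0 for GET, HTTP/1.1 for HEAD)
is easy to search for. The following is a hedged sketch: the function name
mixed_version_hosts and the min_requests threshold are assumptions for
illustration, and a match only says that a client mixed the two versions
this way, not that it is this particular tool.

```python
import re
from collections import defaultdict

# Assumed layout: Apache common log format, as in the excerpts in this post.
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) \S+ (HTTP/\d\.\d)"')

def mixed_version_hosts(lines, min_requests=5):
    """Hosts that sent both HTTP/1.0 GETs and HTTP/1.1 HEADs."""
    seen = defaultdict(set)    # host -> {(method, version), ...}
    counts = defaultdict(int)  # host -> number of parsed requests
    for line in lines:
        m = LOG_RE.match(line)
        if m:
            host, method, version = m.groups()
            seen[host].add((method, version))
            counts[host] += 1
    wanted = {("GET", "HTTP/1.0"), ("HEAD", "HTTP/1.1")}
    # Subset test: the same client may also have browsed by hand
    # (HTTP/1.1 GETs), so other combinations are allowed alongside.
    return sorted(
        host for host, pairs in seen.items()
        if wanted <= pairs and counts[host] >= min_requests
    )
```

The subset test is deliberate: in this incident the same host made manual
HTTP/1.1 GET requests before the tool was started, so requiring an exact
match would miss it.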

The crawler is rather poorly written - one of the URLs on my webpage
refers to http://myhost:54123/. The crawler incorrectly parses this URL
into this request:

node-d-2425.a2000.nl - - [12/Mar/2002:14:21:05 +0100] "GET /:54123/ HTTP/1.0" 404 748
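
How the crawler arrives at this request is not known. One hypothetical way
to produce the same path is to strip "http://" plus the host name from the
link and treat whatever is left as a path, ignoring the port. The sketch
below contrasts that guess (naive_path, an invented name) with a proper
split using Python's urllib.parse.

```python
from urllib.parse import urlsplit

def naive_path(url, host):
    """Hypothetical buggy approach: drop 'http://<host>', keep the rest."""
    prefix = "http://" + host
    rest = url[len(prefix):] if url.startswith(prefix) else url
    return rest if rest.startswith("/") else "/" + rest

def proper_target(url):
    """Split the URL into (host, port, path) with the standard parser."""
    parts = urlsplit(url)
    return parts.hostname, parts.port, parts.path or "/"

if __name__ == "__main__":
    url = "http://myhost:54123/"
    print(naive_path(url, "myhost"))   # the path seen in the log line above
    print(proper_target(url))          # host, port and path kept apart
```

With a proper split, the crawler would connect to the other port and ask
for "/" there instead of requesting a nonexistent path on port 80.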

Another bug - URLs taken from certain directory indexes have an extra '/'
appended at the end:

node-d-2425.a2000.nl - - [12/Mar/2002:14:20:52 +0100] "GET /soft/uc.c/ HTTP/1.0" 404 748

This will keep certain files from being indexed, at least with Apache.
Note that this happens only for certain directories (probably because I
have different FancyIndexing settings for different directories). This
suggests that the code is not based on an existing crawler and is custom
work.

Then, phase 2 is brute-forcing - this phase is partially interleaved with
phase 1, which suggests a multithreaded application:

node-d-2425.a2000.nl - - [12/Mar/2002:14:21:36 +0100] "GET /2/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:36 +0100] "GET /8/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:36 +0100] "GET /5/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:36 +0100] "GET /4/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:36 +0100] "GET /123/ HTTP/1.0" 404 7
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:37 +0100] "GET /a/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:38 +0100] "HEAD /about HTTP/1.1" 404 0
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:38 +0100] "HEAD /account HTTP/1.1" 404 0
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:39 +0100] "GET /accounts/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:39 +0100] "GET /admin/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:39 +0100] "GET /adm/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:39 +0100] "GET /action.asp HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:39 +0100] "GET /ad/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:39 +0100] "HEAD /accounts HTTP/1.1" 404 0
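
Probes like these are easiest to review once the 404 responses are
collapsed into one list of guessed names per client. The following is a
minimal sketch under the same log-format assumption as the excerpts here;
probed_names is an illustrative name, and folding /admin and /admin/ into
one entry is a choice made for readability, not something the tool does.

```python
import re
from collections import defaultdict

# Assumed layout: Apache common log format, as in the excerpts in this post.
LOG_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "\S+ (\S+) \S+" (\d{3}) ')

def probed_names(lines):
    """Map each host to the sorted list of paths that returned 404 for it."""
    misses = defaultdict(set)
    for line in lines:
        m = LOG_RE.match(line)
        if m and m.group(3) == "404":
            host, path = m.group(1), m.group(2)
            # Treat /admin and /admin/ as the same guess.
            misses[host].add(path.rstrip("/") or "/")
    return {host: sorted(paths) for host, paths in misses.items()}
```

A client with hundreds of distinct 404 paths in alphabetical order within
a minute or two is a good candidate for a closer look.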

Our first guess was that this tool might be looking for PHP scripts to
exploit the recent mod_php vulnerability. However, many requests are not
likely to target scripts - it tries to find certificates, mails, source
code, default html files, administrative services, or... pr0n.

node-d-2425.a2000.nl - - [12/Mar/2002:14:21:40 +0100] "GET /amateurs/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:40 +0100] "GET /amateur/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:40 +0100] "GET /apps/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:40 +0100] "GET /app/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:41 +0100] "GET /archives/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:41 +0100] "GET /arc/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:41 +0100] "GET /archive/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:41 +0100] "GET /asp/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:42 +0100] "GET /bank.asp HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:42 +0100] "GET /bin/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:42 +0100] "GET /binaries/ HTTP/1.0" 404 748

[...]

node-d-2425.a2000.nl - - [12/Mar/2002:14:21:44 +0100] "GET /book/ HTTP/1.0" 404 748

[...]

node-d-2425.a2000.nl - - [12/Mar/2002:14:21:45 +0100] "GET /certificates/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:46 +0100] "GET /code/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:47 +0100] "GET /controlpanel/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:47 +0100] "HEAD /codes HTTP/1.1" 404 0

[...]

node-d-2425.a2000.nl - - [12/Mar/2002:14:21:49 +0100] "HEAD /data HTTP/1.1" 404 0
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:49 +0100] "HEAD /database HTTP/1.1" 404 0
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:49 +0100] "HEAD /debug HTTP/1.1" 404 0
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:49 +0100] "GET /Default.htm HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:50 +0100] "GET /dmr/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:50 +0100] "GET /doc/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:50 +0100] "GET /dhtml/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:50 +0100] "GET /door/ HTTP/1.0" 404 748

[...]

node-d-2425.a2000.nl - - [12/Mar/2002:14:21:53 +0100] "GET /email/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:53 +0100] "GET /download/ HTTP/1.0" 404 748
node-d-2425.a2000.nl - - [12/Mar/2002:14:21:53 +0100] "GET /emails/ HTTP/1.0" 404 748

[...]

The overall list of checked resources that returned a 404 code:

/1 /123 /2 /3 /4 /5 /6 /7 /8 /9 /a /abc /about /account /accounts /ad /adm
/admin /ads /al /amateur /amateurs /ani /ani1 /anime /app /apps /appz /arc
/archive /archives /asian /asians /asp /b /bin /binaries /binary /bizarre
/black /book /books /c /cat /catalog /catalogs /certif /certificate
/certificates /certified /certify /cgi /cgi- /cgibin /cgi-bin /cgi-win
/code /codes /coding /content /contents /controlpanel /crack /cracks /ctc
/d /data /database /debug /dhtml /dir /dirs /dmr /dmr1 /doc /docs /door
/double /download /downloads /downloadz /driver /drivers /e /email /emails
/entry /en_US /f /file /filez /final /food /forum /free /freepic /freepics
/front /ftp /fuck /fucks /g /gal /galleries /gallery /galls /game /games
/gamez /girl /girls /girlz /graph /graphic /graphics /graphs /h /hardcore
/help /hidden /hide /home /htaccess /htdata /htdoc /htdocs /html /htpasswd
/htpasswrd /i /id /ids /image /images /images_dir /imagez /index /info /j
/k /l /lancelot /les /lesb /lesbian /lesbians /lesbo /lez /link /links
/linkz /list /log /logs /m /mail /mails

...for some reason, the scan ended around letter 'm', so I can't determine
what else it would look for, or if there are any later phases. And because
the scan probably didn't provide the attacker with any useful data in this
case, I can't tell how he/she would attempt to use any information found.
One last thing I noticed:

node-d-2425.a2000.nl - - [12/Mar/2002:14:20:51 +0100] "HEAD /soft/unicorns.tgz HTTP/1.1" 404 0

This file used to be on my server, but is no longer available there. This
suggests that this tool crawls not only pages found directly, but also
pages previously indexed and cached by other systems (such as google.com).

Well, ok, enough from me, I could probably write a few more pages, but I
don't want to insult your intelligence or make blind guesses. Have fun!
Check your logs, post your hypotheses!

-- 
_____________________________________________________
Michal Zalewski [lcamtuf () bos bindview com] [security]
[http://lcamtuf.coredump.cx] <=-=> bash$ :(){ :|:&};:
=-=> Did you know that clones never use mirrors? <=-=
          http://lcamtuf.coredump.cx/photo/

