Home page logo
/

pen-test logo Penetration Testing mailing list archives

Re: Optimal wildcard search algorithm
From: "Andres Riancho" <andres.riancho () gmail com>
Date: Wed, 29 Nov 2006 18:28:32 -0300

I have been thinking about this for some time now, and this is what I
came up with:

<pseudo-algorithm>
1) bruteforce N rounds ( where N is a number between 2 and 4, should
do some tests to confirm what is the best choice ). By bruteforce I
mean, search for a* , b* ... z* and keep only the ones that match (
this would be the first round ). Then, for the second round, aa*, ab*
....yz* and keep only the ones that match.

2) "elimination". Lets suppose that we only got one matching group
after the two rounds of bruteforcing, the group is 'ac*' . Now what i
propose is to search for ac*a* , ac*b*... ac*z*. This step will tell
you if there is a string matching ac* that has a letter a somewhere,
if it hasnt, you can safely eliminate that char in the next
iterations.

3) Using the result of step 2, create a subset of the letters and use
more rounds of bruteforcing and "elimination".

</pseudo-algorithm>

The key number here as you can see is N.
- Big N creates less groups -> elimination step is more successfull,
but bruteforcing step is slow.
- Small N creates more groups -> elimination step is less successfull,
but bruteforcing step is faster.

Note: My proposal of N between 2 and 4 is completely arbitrary :)

On 11/28/06, Tim <tim-pentest () sentinelchicken org> wrote:
> Do I understand correctly, that the wildcard test you describe only
> gives an exists/doesn't exist answer? How many tests can you afford
> per second?

Yes, the attack would be somewhat blind, resulting in yes/no or
yes/no/error responses.  For instance, if this were a login form using
LDAP, it might return success for a proper username/password
combination, failure for a nonexistent username or bad password, and an
error for a wildcarded username that does exist but can't be bound to,
since there's no injection in the LDAP bind request.

> What you can try is to analyse some sample username lists for
> ways to equalise their distribution. For example, you wouldn't want to
> start with a*, better jump directly to aa*, ab*, ..., az*

You think starting with 2-character prefixes is actually faster?  It
doesn't seem that way to me.  In the worst case for a single character
test, you need to make 36+36*m, where m is the number of successful
results from the first round.

If you start with 2-character tuples, you'll always need to perform
36*36 tests to get to the same point in the search, which is certainly
slower on average.

> You might try taking samples from
> http://www.galbithink.org/names/agnames.htm
> http://blogs.parc.com/playon/archives/2006/06/naming_patterns.html
> or similar, some username/password combo lists, etc.
>
> You can also test for suffixes (*a, *b, ..) hoping for a better
> distribution.

Indeed, I think using common names, or even just starting with the most
probable characters would speed up the initial findings.  This is very
useful if your goal is simply to find one or a few usernames.  However,
if you want to find them all, it probably doesn't help much.

Besides, the general algorithm I seek would be useable on other fields.
Take for example, the following ldap injections:

*)(userPassword=A*
*)(userPassword=B*
...

If an LDAP server isn't properly secured, this could allow one to
brute-force all password hashes using just the blind responses.
Password hashes wouldn't have predictable distribution.  Similar
situations may exist in SQL injections as well, though there's probably
easier attacks in that situation.


> If you're allowed single-char wildcards, you could do more interesting
> searches - tests for certain username lengths being the most
> important. You can also walk the search space based on the *second*
> letter of the username _a%, _b% etc, which will (I guess) be more
> equally distributed than the first letter.

Indeed, with single-character wildcards, a lot of additional information
can be divulged.  In fact, one could determine the length range of all
strings right off, which would make it easy to determine when we're
"done".  I didn't want to complicate the question by introducing these
though.

> I don't know if searching for substrings in the middle hoping they
> would prune the search tree will be helpful - but you can analyze
> those sample lists and see if particular patterns come up. If there is
> a good set of 2 or 3 character strings that have near-zero frequencies
> in the sample lists, that's the way to go.

Yes, one of my burning questions is to figure out whether mid-string
checks can be beneficial.  Multi-character mid-string searches are
likely a bad idea, for the same reasons I described above.

While I appreciate the ideas you've suggested, I'm really looking for a
more formal answer to the optimality question.

thanks,
tim

------------------------------------------------------------------------
This List Sponsored by: Cenzic

Need to secure your web apps?
Cenzic Hailstorm finds vulnerabilities fast.
Click the link to buy it, try it or download Hailstorm for FREE.
http://www.cenzic.com/products_services/download_hailstorm.php?camp=701600000008bOW
------------------------------------------------------------------------




--
Andres Riancho
http://w3af.sourceforge.net/ Web App Attack and Audit Framework
http://www.securearg.net/ Secure from the source

------------------------------------------------------------------------
This List Sponsored by: Cenzic

Need to secure your web apps?
Cenzic Hailstorm finds vulnerabilities fast.
Click the link to buy it, try it or download Hailstorm for FREE.
http://www.cenzic.com/products_services/download_hailstorm.php?camp=701600000008bOW
------------------------------------------------------------------------


  By Date           By Thread  

Current thread:
[ Nmap | Sec Tools | Mailing Lists | Site News | About/Contact | Advertising | Privacy ]