Re: FW: Introducing a new generic approach to detecting SQL injection
From: "Paul J. Morris" <mole () morris net>
Date: Fri, 22 Apr 2005 16:39:05 -0400
On Fri, 22 Apr 2005 15:26:41 -0400
Mohit Muthanna <mohit.muthanna () gmail com> wrote:
Once the allowed character set gets beyond $sanitized =
preg_replace("/[^a-zA-Z0-9]/", "", $untrusted) especially into the
Don't use simple regexp matching.
Why not? I am not matching known attacks, I am stripping everything but
a small set of known good characters. How are you going to construct a
sql injection attack using the character set [A-Za-z]? Yes, you can
try to overflow preg_replace (or the dbms if I don't truncate your
input), but the set [A-Za-z] isn't going to enable a sql injection
attack. If I have a single field being submitted from a form where the
characters in a legitimate query will only be in the set [A-Za-z], I
know with certainty that $sanitized will not contain a sql injection
attack if I filter it with $sanitized = preg_replace("/[^a-zA-Z]/", "",
$untrusted), regardless of any other dependencies (e.g. with php, I am
not dependent on the settings of safe_mode or of magic_quotes_gpc). If
the set of legitimate characters includes quote characters or slashes or
the like, then I entirely agree with you that escaping and encoding
libraries are an important element.
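
A minimal sketch of the known-good filter described above, written in Python rather than PHP so it can be run standalone. The function name sanitize and the max_len truncation limit are assumptions made for illustration; only the character class [^a-zA-Z] comes from the message.

```python
import re

# Anything outside A-Z / a-z is dropped. This mirrors the PHP call
# preg_replace("/[^a-zA-Z]/", "", $untrusted) from the message.
_NOT_ALLOWED = re.compile(r"[^a-zA-Z]")


def sanitize(untrusted: str, max_len: int = 64) -> str:
    """Keep only ASCII letters, then truncate (max_len is a hypothetical limit)."""
    return _NOT_ALLOWED.sub("", untrusted)[:max_len]


print(sanitize("Robert'); DROP TABLE students;--"))  # RobertDROPTABLEstudents
print(sanitize("' OR 1=1 --"))                       # OR
```

Quotes, semicolons, comment markers and whitespace never reach the query, so there is nothing left with which to terminate a string literal or a statement. The cost is that any field needing those characters cannot use this filter.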
This technique, though novel, is really
Agreed, most of the time there are better and more efficient ways to
handle the problem. I find it interesting because it appears (and I'm
not sure that this is true) to rely on passing the known good rather
than filtering out a set of known attacks.
I'll reiterate; unless your regexp is robustly tested don't use it.
There are many libraries out there for URL/Base64/Unicode/etc. etc.
encoding, decoding and escaping. Use them to clean up your input.
I have seen too many discussions of ways to get around escaping of
attack characters by interesting twists on encoding to be sure that the
library I choose has thought of all of the possible ways around the
decoding and escaping. Encoding/decoding/escaping relies on the
library recognizing known attack characters, something it may be very
good at, but something experience has taught us is hard to do.
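
The encoding concern above can be illustrated with a contrived Python sketch. It assumes a pipeline in which a known-bad filter runs before a URL-decoding step; the helper names strip_known_bad and keep_known_good are hypothetical, and this is not a claim about how any particular escaping library behaves.

```python
import re
from urllib.parse import unquote

# Characters a naive blacklist might consider dangerous (an assumed set).
KNOWN_BAD = set("'\";\\")


def strip_known_bad(s: str) -> str:
    """Blacklist: remove only the characters we recognise as attacks."""
    return "".join(ch for ch in s if ch not in KNOWN_BAD)


def keep_known_good(s: str) -> str:
    """Whitelist: remove everything except ASCII letters."""
    return re.sub(r"[^a-zA-Z]", "", s)


encoded = "x%27%20OR%20%271%27%3D%271"

# The blacklist sees no literal quote, so the input passes unchanged,
# and a later decoding step restores the quotes.
print(unquote(strip_known_bad(encoded)))  # x' OR '1'='1

# The whitelist drops '%' and the digits, so there is nothing to decode.
print(unquote(keep_known_good(encoded)))  # xORD
```

The known-good filter does not need to know which encodings exist; it only needs to know which characters a legitimate value may contain.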
If your database API supports it, use prepared statements and
Agreed. Prepared statements are a very powerful tool, when
Don't use simple string interpolation (without quote handling).
I don't see the rationale for this. The rationale for never filtering
out known bad characters is clear, but filtering out all but a small set
of known good characters seems the simplest and surest way of sanitizing
It's really that easy.
In the realm of multibyte characters with multiple kinds of clients
I'm not at all convinced it is. I don't know that an attacker isn't
going to encode a query terminating character in a way that is going to
get through the decoding and escaping. The fundamental principle of
escaping is that of recognizing known bad characters - something that
experience teaches us is inferior to allowing only known good characters
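
The two points on which this thread agrees, prepared statements and passing only known-good characters, can be combined. The sketch below is one way to do that in Python with the standard sqlite3 module; the users table, the find_user helper, and the choice to reject rather than strip bad input are all assumptions made for illustration.

```python
import re
import sqlite3

# Known-good set: one or more ASCII letters, nothing else.
_ALLOWED = re.compile(r"[A-Za-z]+")


def find_user(conn: sqlite3.Connection, untrusted: str) -> list:
    """Reject input outside the known-good set, then bind it as a parameter."""
    if not _ALLOWED.fullmatch(untrusted):
        return []  # reject instead of trying to repair the input
    # The "?" placeholder keeps the value out of the SQL text entirely.
    cur = conn.execute("SELECT name FROM users WHERE name = ?", (untrusted,))
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

print(find_user(conn, "alice"))             # [('alice',)]
print(find_user(conn, "alice' OR '1'='1"))  # []
```

Either layer alone stops the injection in this example; using both means a mistake in one is not immediately exploitable.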
Mohit Muthanna [mohit (at) muthanna (uhuh) com]
Paul J. Morris
Biodiversity Information Manager, The Academy of Natural Sciences
1900 Ben Franklin Parkway, Philadelphia PA, 19103, USA
mole () morris net AA3SD PGP public key available
Full-Disclosure - We believe in it.
Hosted and sponsored by Secunia - http://secunia.com/