Re: The Static Analysis Market and You
From: "Andy Steingruebl" <steingra () gmail com>
Date: Tue, 14 Oct 2008 14:18:26 -0700
On Tue, Oct 14, 2008 at 7:53 AM, Dave Aitel <dave () immunityinc com> wrote:
> Also annoyingly the false positive rate is enormous even when run
> against the tiny test programs they are using to demo the tools with.
> So you end up with a ten page list of "bugs" that you may or may not
> be able to understand enough to fix. All the tools provide nice code
> browsers and a graph of data flow to help you with this process, but
> in practice it's not enough.
>
> 1. Running the tools against 6 programs (3 Java, 3 C) generated 47000
> distinct "vulns" - in some cases (tinyhttpd) generating about 80%
> false positive rate. Imho anything over 5% makes the process unusable
> by developers. They estimated 1 man year to do false/true positive
> determination on the entire set - which is a LOT of time.
> 2. In many cases skilled engineers were unable to determine the false
> positive/true positive of a particular vulnerability warning (or were
> "wrong" about it). Imagine how well your developer team will do!

I hear both of these points a lot but never get any insight into whether the
items found are bugs that aren't exploitable security vulnerabilities, or
whether they aren't bugs at all. Most of the stats I've seen, and my own
testing, tell me that for at least some of the analysis rules the false
positive rate for a bug is actually quite low. It's just that the bug isn't
necessarily on a vulnerable path, doesn't have untrusted inputs, etc.
While this is an indictment of the promise of the vendors to help you fix
security problems, it isn't necessarily an indictment of the technology's
ability to find bugs. Whether they are the ones you'd want to prioritize
is another question, and it obviously plays into the value of the tool. At
the same time, an attitude that says you're only going to fix the bugs known
to be currently vulnerable, rather than most bugs found (especially those
that do potentially unsafe things), won't result in very high assurance
software.
Take for example a checker that finds "banned" API calls. It will
potentially find a lot of uses of strcpy() in your code. Are all of them
unsafe? Probably not. Some are copying statically sized buffers, things
from config files that you completely control, etc. Most of the findings of
uses of strcpy() are in the strict sense going to be false positives. At
the same time you might not want them in your code as they are hard to get
right. strcpy() is an easy example of course because grep will find it.
There are other cases though of more complicated checkers that will find
issues in an automated fashion that are a lot harder to find with manual
review.

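To illustrate the strcpy() case, here is a rough sketch of a grep-style "banned API" checker. It is hypothetical and deliberately naive, not how any of the commercial tools work: the function name `find_strcpy`, the regular expression, and the triage labels are all my own assumptions. It flags every strcpy() call on a line and then guesses that a string-literal source is probably benign, which is only a guess, since a literal can still overflow a buffer that is too small.

```python
import re

# Hypothetical, naive pattern: one strcpy(dest, src); call per match.
# It does not handle multi-line calls, macros, or commas inside dest.
STRCPY_CALL = re.compile(r'\bstrcpy\s*\(\s*([^,]+?)\s*,\s*(.+?)\s*\)\s*;')

def find_strcpy(source):
    """Return one finding per strcpy() call, with a rough triage guess."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in STRCPY_CALL.finditer(line):
            src = match.group(2)
            # Assumption: a string-literal source is under our control.
            is_literal = src.startswith('"') and src.endswith('"')
            findings.append({
                "line": lineno,
                "dest": match.group(1),
                "src": src,
                "triage": ("constant source, likely benign"
                           if is_literal else "needs review"),
            })
    return findings

# Made-up C fragment for illustration only.
SAMPLE_C = '''
char name[16];
strcpy(name, "default");
strcpy(name, argv[1]);
'''

for finding in find_strcpy(SAMPLE_C):
    print(finding)
```

Both calls get reported as "banned API" findings, but only the second one copies data the program does not control. Whether the first counts as a false positive depends entirely on how you define the term.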
I'd like to see some better underlying data about these and how people are
classifying their false positives.
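As a sketch of the kind of data I mean, the same triage results give very different "false positive rates" depending on whether a finding that is a real bug but not exploitable counts as a true or a false positive. The three labels and the counts below are made up for illustration; they are not from any study or vendor.

```python
from collections import Counter

# Hypothetical triage labels; there is no standard set that I know of.
NOT_A_BUG = "not_a_bug"
BUG_NOT_EXPLOITABLE = "bug_not_exploitable"
EXPLOITABLE = "exploitable"

def false_positive_rates(labels):
    """Compute the false positive rate under two different definitions."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no findings to classify")
    return {
        # Judged as a bug finder: only non-bugs are false positives.
        "as_bug_finder": counts[NOT_A_BUG] / total,
        # Judged as a vulnerability finder: anything not exploitable
        # is a false positive.
        "as_vuln_finder":
            (counts[NOT_A_BUG] + counts[BUG_NOT_EXPLOITABLE]) / total,
    }

# Invented counts, purely illustrative.
labels = ([NOT_A_BUG] * 10
          + [BUG_NOT_EXPLOITABLE] * 70
          + [EXPLOITABLE] * 20)

rates = false_positive_rates(labels)
print(rates)
```

With these invented counts the tool looks poor as a vulnerability finder and quite good as a bug finder, which is why a single reported "false positive rate" tells you little without the classification behind it.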
Dailydave mailing list
Dailydave () lists immunitysec com