Dailydave mailing list archives

Re: (the root of the root and the bud of the bud)


From: Sean Heelan via Dailydave <dailydave () lists aitelfoundation org>
Date: Tue, 14 Jan 2025 00:58:28 +0000

As it happens, I’ve found the most effective way to use LLMs is to de-anthropomorphise them entirely and treat them much like fuzzers (large-scale generation of results, lots of false positives/nonsense, filtered by some oracle).

The “conversation with an AI” approach, where you imagine yourself as having a single artificial brain to interact with, is (currently, at least) far less useful in practice than one in which you are content with 1 in 1,000,000 responses being correct and architect the rest of the system around the LLM to leverage this.
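
A minimal sketch of that shape (Python; llm_generate and oracle are hypothetical placeholders - the point is only that the oracle, not the model, decides what survives):

    def generate_and_filter(prompt, llm_generate, oracle, attempts=1_000_000):
        # Treat the model like a fuzzer: mass-generate candidates and
        # keep only the rare ones a ground-truth check confirms.
        hits = []
        for _ in range(attempts):
            candidate = llm_generate(prompt)  # e.g. one sampled completion
            if oracle(candidate):             # e.g. "does this input actually crash the target?"
                hits.append(candidate)
        return hits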

- Sean

On Mon, Jan 13, 2025 at 03:34, Thomas Dullien via Dailydave <dailydave () lists aitelfoundation org> wrote:

Hey,

I have one quibble: We are using "reasoning" in a qualitative, not descriptive, form here -- "fuzzing is or is not 
reasoning", "LLMs reason or do not reason". I am not sure this is helpful. Fuzzing is empirically successful at 
finding crashes. Somebody who needs to light a fire and smashes two stones together until they throw sparks does
not, once the fire burns, need to justify that 'stones perform reasoning'. The stones were the tools that got the job 
done, and that's what counts.

Similarly, does it matter whether LLMs reason, or whether LLMs are good translators from human language to code (and 
possibly vice versa)?

My big regret with my (unreasonably durable) early rejection of fuzzing was that I had absorbed a value system in which somehow "thinking very hard about a complex thing abstractly" had value in itself, somewhat detached from empirical results. Something that didn't involve "reasoning" but rather "naive high-speed experimentation" couldn't possibly have the same value.

So LLMs are clearly powerful tools for many things. Rapid fuzzer creation, judging the intuitiveness of an API, perhaps even analyzing some pieces of code sometimes (although the models that seem to be able to do this are "scratchpad models", which are somewhat different from straight-up LLMs...). They are also great in that the shift to semi-supervised learning on all of humanity's written records unlocked access to a lot of data and a lot of implicit knowledge that humans have but haven't codified.

Do we need to know what reasoning is, or whether a fuzzer or a chess engine or an LLM "reasons"? Only if we attach 
special value to it, beyond the empirical results -- and given how that misled me in the past, I'd rather not do that.

Cheers,
Thomas

On Sat, Jan 11, 2025 at 23:17, Dave Aitel via Dailydave <dailydave () lists aitelfoundation org> wrote:

Memories and thoughts are the same thing, someone tried to explain to me recently. You have to think to remember, in 
other words. This is hard to grasp for a lot of people because they think they have memories. They wrongly think 
memory is a noun instead of a verb, which is okay in philosophy and psychology, but in cutting-edge computer science we
have to be precise about these sorts of things.

Twenty-five years ago, a full quarter century, when I first started writing fuzzers, people thought it was an
absolutely stupid thing to do. The smart people were using their giant brains to do static analysis. They were 
tainting and sinking. They were reading the code and finding flaws. They did threat models. They did not write 
glorified for loops that made different amounts of A's go into different RPC functions. But I had the hubris of a 
teenage hacker, and I thought it was fun. More fun, perhaps, than reading code.
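
For the record, the loop really was about that simple - a sketch, with rpc_call as a hypothetical stand-in for whatever RPC stub was being poked:

    def naive_fuzz(rpc_call, max_len=65536):
        # The glorified for loop: ever-longer runs of A's into the
        # target function, watching for the service to fall over.
        for n in range(1, max_len):
            rpc_call(b"A" * n)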

In 2025, fuzzing is part of the software development lifecycle for any organization rich enough to call a hyperscale 
datacenter home. It is a sine qua non for secure software. Fuzzing, we now understand, is reasoning. And if you 
can't reason over your code, you can't secure it.

Part of the value is that fuzzing echoes machine learning in that it scales nicely with the amount of CPU you can throw at it. And there are no false positives when you measure whether an input crashes a program - it either does or it does not.
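
That crash oracle is cheap to build - a minimal sketch, assuming a target binary that reads its input from stdin:

    import subprocess

    def crashes(target, data, timeout=5):
        # A negative return code means the process died from a signal
        # (e.g. SIGSEGV) - the unambiguous "it crashed" answer.
        try:
            result = subprocess.run([target], input=data, timeout=timeout,
                                    stdout=subprocess.DEVNULL,
                                    stderr=subprocess.DEVNULL)
            return result.returncode < 0
        except subprocess.TimeoutExpired:
            return False  # a hang is interesting, but it is not a crash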

There are downsides, of course - many inputs may cause the same crash. Fuzzing identifies that a flaw exists, but it doesn't tell you what the flaw actually is. And fuzzing often finds enough flaws that development teams become overwhelmed with triage. And of course, fuzzing can often be too dumb to reach the important bugs, since it explores the space of possible inputs semi-randomly, even with coverage-guided analysis.
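
The usual mitigation for the triage flood is to bucket crashing inputs by the top of the stack trace, so that thousands of inputs collapse into one work item - a sketch, assuming you can recover a backtrace as a list of frame names:

    import hashlib

    def crash_bucket(frames, depth=3):
        # Hash the top few stack frames; inputs that land in the same
        # bucket almost certainly exercise the same underlying bug.
        key = "|".join(frames[:depth])
        return hashlib.sha1(key.encode()).hexdigest()[:12]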

We (as a community) tried to correct these things with SMT solvers, or smarter fuzzers. But now we have a new tool: 
LLMs, which reason in a very different way. But still they reason.

Admittedly, there are many disbelievers. "LLMs just repeat what they were trained on" - taken to an extreme that's true, but it's also true of any of us. In practice, they reason perfectly well. And not too long from now, maybe a couple of years at most, any organization that is not using them widely for security engineering will be left behind the curve - the same way teams not using fuzzers are today.

Memories and thoughts are, in essence, the same thing because both require the act of reasoning. In computer 
science, fuzzing and LLMs are tools that embody this principle. They don't passively store knowledge - they actively 
explore, test, and refine it.

When I first started fuzzing, it was dismissed as a foolish endeavor because it didn’t look like traditional 
reasoning. Now, it’s indispensable. LLMs are on a similar path: misunderstood by some, but already reshaping how we 
approach security.

Just as fuzzing forced us to rethink what reasoning over code looks like, LLMs are forcing us to rethink reasoning 
itself. In both cases, the act - not the object - is what matters. They are the root of the root and the bud of the 
bud - the foundation of what comes next. And if you don’t carry this forward, you risk being left behind in a world 
that’s growing beyond you.

-dave

_______________________________________________
Dailydave mailing list -- dailydave () lists aitelfoundation org
To unsubscribe send an email to dailydave-leave () lists aitelfoundation org
