Dailydave mailing list archives

Re: sboms and LLMs


From: Isaac Dawson via Dailydave <dailydave () lists aitelfoundation org>
Date: Thu, 12 Sep 2024 11:42:53 +0900

Well this is rather timely! Although I'm not sure using an LLM for the
behavioral aspect is entirely necessary. I've been working on an
experimental system that does just what you talk about for dependencies (
https://docs.gitlab.com/ee/user/application_security/dependency_scanning/experiment_libbehave_dependency.html,
pre-alpha!). My solution uses static analysis because I'm a fan of
determinism.
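
To make the static-analysis idea concrete, here is a minimal sketch of one way to flag dependency behaviors deterministically. It is not how the linked GitLab experiment works; it only inspects the imports of a Python source file, and the BEHAVIORS table and the detect_behaviors name are hypothetical, chosen for illustration.

```python
import ast

# Hypothetical mapping from a top-level module name to a behavior label.
# A real tool would need a far richer model than an import list.
BEHAVIORS = {
    "socket": "network",
    "urllib": "network",
    "http": "network",
    "subprocess": "process-exec",
    "ctypes": "native-code",
}


def detect_behaviors(source):
    """Return the sorted behavior labels implied by the imports in `source`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as "from . import x"
            names = [node.module or ""]
        else:
            continue
        for name in names:
            label = BEHAVIORS.get(name.split(".")[0])
            if label:
                found.add(label)
    return sorted(found)


if __name__ == "__main__":
    sample = "import socket\nfrom subprocess import run\nimport json\n"
    print(detect_behaviors(sample))
```

The same input always yields the same labels, which is the property the determinism argument is about. The obvious limit is that imports only hint at behavior; they do not show what the code actually does with the module.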

Snark aside, looking at behaviors of what our dependencies are doing is
definitely another signal we should be using when we determine whether we
want to add a dependency or whether something fishy is going on. I have
lots of ideas on where to take this but I actually never thought about
adding it to an SBOM. An interesting idea for sure.

-Isaac


On Thu, Sep 12, 2024 at 11:21 AM Dave Aitel via Dailydave <
dailydave () lists aitelfoundation org> wrote:

People doing software security often use LLMs more as orchestrators than
anything else. But there are so many more complicated ways to use them in our
space coming down the pipe. Obviously the next evolution of SBOMs
<https://www.cisa.gov/resources-tools/resources/cisa-sbom-rama> is that
they represent not just what is contained in the code as some static tree
of library dependencies, but also what that code does in a summary fashion
that you can check once you get the final binaries. In a certain sense, you
can think of this as a behavioral attestation between the software
publisher and the consumer who is actually running the product.

In other words, if my product is meant to connect to WWW.SPYWARE.RU, then
it should say so in the SBOM behavioral manifest. But of course in practice
these things get quite complicated, and hence you need to summarize
semi-structured data (i.e., the behavioral manifest is rarely exact), and
then compare it to what is seen when the software itself is run (which, if
you've ever run strace, is voluminous). That smells like a job for an
LLM, or at the very least, a vector comparison. Likewise, automatically
building harnesses to run and capture security-sensitive information (or
performance information, as we learned from XZ) is also rapidly becoming a
job <https://google.github.io/oss-fuzz/research/llms/target_generation/>
for an LLM.
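
As a rough illustration of the manifest-versus-runtime check described above, the sketch below pulls IPv4 connect() calls out of strace output and reports any endpoint the manifest did not declare. It uses plain exact matching, not the LLM or vector comparison suggested in the post. The manifest shape (a set of (ip, port) pairs) and the function names are assumptions, and the regex assumes the usual strace rendering of AF_INET connect() calls, which may differ across strace versions and options.

```python
import re

# Matches lines such as:
# connect(3, {sa_family=AF_INET, sin_port=htons(443), sin_addr=inet_addr("203.0.113.7")}, 16) = 0
CONNECT_RE = re.compile(
    r'connect\(\d+, \{sa_family=AF_INET, sin_port=htons\((\d+)\), '
    r'sin_addr=inet_addr\("([^"]+)"\)\}'
)


def observed_endpoints(strace_text):
    """Return the set of (ip, port) pairs seen in IPv4 connect() calls."""
    return {(ip, int(port)) for port, ip in CONNECT_RE.findall(strace_text)}


def undeclared(manifest_endpoints, strace_text):
    """Return observed endpoints that are missing from the manifest, sorted."""
    return sorted(observed_endpoints(strace_text) - set(manifest_endpoints))


if __name__ == "__main__":
    # Hypothetical manifest: the publisher declares a single DNS endpoint.
    manifest = {("192.0.2.1", 53)}
    trace = (
        'connect(3, {sa_family=AF_INET, sin_port=htons(443), '
        'sin_addr=inet_addr("203.0.113.7")}, 16) = 0\n'
        'connect(4, {sa_family=AF_INET, sin_port=htons(53), '
        'sin_addr=inet_addr("192.0.2.1")}, 16) = 0\n'
    )
    for ip, port in undeclared(manifest, trace):
        print(f"undeclared connection: {ip}:{port}")
```

Exact matching only works when the manifest is precise. Since the manifest is rarely exact, the fuzzy comparison is the harder part and is not attempted here.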

I perhaps am channeling everyone else's
<https://www.cisa.gov/speaker/allan-friedman> worry that too much of the
SBOM community is arguing about which XML fields belong in a VEX addendum,
rather than pushing the concepts forward to actually solve problems. Or
perhaps not! At some level, the software vendors are getting dragged
through this process by their hair, which is very fun to watch.

-dave

_______________________________________________
Dailydave mailing list -- dailydave () lists aitelfoundation org
To unsubscribe send an email to dailydave-leave () lists aitelfoundation org
