oss-sec mailing list archives

Re: AI Cyber Challenge (AIxCC) semi-final results from DEF CON 32 (2024)


From: "David A. Wheeler" <dwheeler () dwheeler com>
Date: Mon, 19 Aug 2024 17:02:29 -0400


On Aug 17, 2024, at 4:32 PM, Alfredo Ortega <ortegaalfredo () gmail com> wrote:

> I found a real bug (OpenBSD IPv6 Multicast Forwarding Cache sysctl
> kernel heap overflow) using Mistral-Medium almost 6 months ago:
> https://github.com/ortegaalfredo/vulns-ai/blob/main/openbsd_mfc6_sysctl_overflow.txt
>
> The simple tool that did it is also released as open-source here:
>
> https://github.com/ortegaalfredo/autokaker
>
> About to release the second version, and a vscode plugin, next week.

That's even more evidence that LLMs can find at least some vulnerabilities.

Also - here's a visualization that tries to show how AIxCC competitors
did against the challenge problems:
https://dashboard.aicyberchallenge.com/collectivesolvehealth

You can see that the tools found & fixed many of the seeded vulnerabilities in
nginx and a few in all but one of the other projects, but they struggled with the
Linux kernel. The Linux kernel is *huge* compared to most projects, so that
isn't too surprising. The final competition is in about a year, so there's hope that
the tools will improve in that time as part of the challenge.

To be honest, even finding and fixing *some* problems automatically is a big
win, especially if false reports are rare. Still, the better the tools are at finding and
fixing vulnerabilities, the better off we are.

--- David A. Wheeler

