oss-sec mailing list archives
Re: AI Cyber Challenge (AIxCC) semi-final results from DEF CON 32 (2024)
From: "David A. Wheeler" <dwheeler () dwheeler com>
Date: Mon, 19 Aug 2024 17:02:29 -0400
On Aug 17, 2024, at 4:32 PM, Alfredo Ortega <ortegaalfredo () gmail com> wrote:
> I found a real bug (OpenBSD IPv6 Multicast Forwarding Cache sysctl kernel heap overflow) using Mistral-Medium almost 6 months ago:
> https://github.com/ortegaalfredo/vulns-ai/blob/main/openbsd_mfc6_sysctl_overflow.txt
>
> The simple tool that did it is also released as open-source here:
> https://github.com/ortegaalfredo/autokaker
>
> About to release the second version, and a vscode plugin, next week.

That's even more evidence that LLMs can find at least some vulnerabilities.

Also - here's a visualization that tries to show how AIxCC competitors did against the challenge problems:
https://dashboard.aicyberchallenge.com/collectivesolvehealth

You can see that the tools found & fixed many of the seeded vulnerabilities in nginx, a few in all but one of the others, and they struggled with the Linux kernel. The Linux kernel is *huge* compared to most projects, so that isn't too surprising. The final competition is in about a year, so there's hope that the tools will make improvements in that time as part of the challenge.

To be honest, even finding and fixing *some* problems automatically is a big win, especially if false reports are rare. Still, the better the tools are at finding and fixing vulnerabilities, the better off we are.

--- David A. Wheeler
