Useless security reports generated by AI are frustrating open-source maintainers

Alfonso Maruccia

Facepalm: Generative AI services are neither intelligent nor capable of providing a meaningful addition to open-source development efforts. A security expert who has had enough of "spammy," hallucinated bug listings is venting his frustration, asking the FOSS community to sidestep AI-generated reports.

Generative AI models have already proven to be powerful tools in the hands of cyber-criminals and fraudsters. However, hucksters can also use them to spam open-source projects with useless bug reports. According to Seth Larson, the number of "extremely" low-quality, spammy, and LLM-hallucinated security reports has recently increased, forcing maintainers to waste their time on submissions with no real substance.

Larson is a security developer at the Python Software Foundation who also volunteers on "triage teams" tasked with vetting security reports for popular open-source projects such as CPython, pip, urllib3, Requests, and others. In a recent blog post, the developer denounces a new and troublesome trend of sloppy security reports created with generative AI systems.

These AI reports are insidious because they appear potentially legitimate and worth checking out. As Curl and other projects have already pointed out, they are better-sounding crap, but crap nonetheless. Thousands of open-source projects are affected by this issue, while maintainers aren't encouraged to share their findings because of the sensitive nature of security-related development.

"If this is happening to a handful of projects that I have visibility for, then I suspect that this is happening on a large scale to open source projects," Larson said.

Hallucinated reports waste volunteer maintainers' time and result in confusion, stress, and much frustration. Larson said that the community should treat low-quality AI reports as malicious, even if this is not the original intent of the senders.

He had valuable advice for platforms, reporters, and maintainers currently dealing with an uptick in AI-hallucinated reports. The community should employ CAPTCHA and other anti-spam services to prevent the automated creation of security reports. Meanwhile, bug reporters should not use AI models to detect security vulnerabilities in open-source projects.

Large language models don't understand anything about code. Finding legitimate security flaws requires dealing with "human-level concepts" such as intent, common usage, and context. Maintainers can save themselves from a lot of trouble by responding to apparent AI reports with the same effort put forth by the original senders, which is "near zero."

Larson acknowledges that many vulnerability reporters act in good faith and usually provide high-quality reports. However, an "increasing majority" of low-effort, low-quality reports ruin it for everyone involved in development.

"What's our budget report look like?"
"Well, the HR staff we hired said we could save money by replacing some of the developer team managers with AI"
"I love it! Hire more HR staff with the money we saved and see what other ideas they come up with!"
 
A while ago I read about this exact issue coming up among the Linux kernel developers -- there was a discussion on whether to add the bcachefs filesystem to the kernel, with those arguing "no" doing so partly because of developer burnout. First, they pointed out that they had only about 1/3rd the number of developers across ext4, btrfs, the VFS filesystem layer, etc. that they had in the past, and that these developers were now spending like 3/4ths of their time triaging mostly complete garbage security flaw reports generated by automated systems. So by adding bcachefs, a new piece of code, they knew they'd get absolutely firehosed by additional reports whether or not those reports represented an actual problem in the code.

This is a serious issue! Should AI be used to find bugs? I won't go so far as to say "no it shouldn't", but it should be used as a tool by security researchers: it can point out where to look, and the researcher can then review that piece of code and file a report if there's a problem. I would go so far as to say devs should have a tool to automatically filter out AI-generated reports and bin them (well, the dev could decide whether to throw them out or look at them... but honestly I'd assume most would throw them out).
 