Mozilla says Claude AI uncovered over 100 Firefox bugs in just two weeks, including 14 high-severity flaws

Alfonso Maruccia

The takeaway: While some companies are struggling with a flood of unreliable or hallucinated AI-generated bug reports, Mozilla is finding real value in bug-seeking bots. The foundation has begun working with Anthropic to strengthen Firefox's security, and several AI-assisted bug fixes have already landed in the browser's codebase.

Mozilla is now working with Anthropic's Frontier Red Team to identify and patch potentially dangerous security vulnerabilities in Firefox. According to Mozilla, the AI company approached them a few weeks ago with results from a newly developed, AI-assisted bug-hunting method. The approach appears to work, Mozilla said, and could ultimately lead to a safer Firefox experience for everyone.

Anthropic's team focused on Firefox's JavaScript engine, in part because the Red Panda browser offers a widely used and "deeply scrutinized" open-source codebase that makes it ideal for testing new analysis techniques. The AI system uncovered several security flaws in the JS engine and also produced minimal test cases, allowing Firefox developers to quickly verify and reproduce the issues.

In total, developers confirmed 14 high-severity security bugs, which resulted in 22 separate CVE tracking IDs. Mozilla said all of these issues have already been fixed in the latest Firefox release (version 148.0). The process also uncovered 90 additional low-priority bugs, which have since been addressed.

Mozilla emphasized that Anthropic's approach to bug reporting differs significantly from other AI-driven efforts. Some major open-source projects, including curl, have been forced to discourage or outright ban AI-generated contributions after being flooded with low-quality submissions from users attempting to earn bug bounty rewards without proper vetting.

Many of the vulnerabilities uncovered through Anthropic's technique are typically discovered through fuzzing, an automated testing method that feeds unexpected inputs into software to trigger crashes. However, Mozilla said the AI model also identified several classes of logic bugs that traditional fuzzing techniques often miss.
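To make the fuzzing idea concrete, here is a minimal sketch of a mutation-based fuzzer in Python. It is not Mozilla's or Anthropic's tooling: the target `parse_length_prefixed` is a hypothetical toy parser with a deliberately planted bug, and the function names and parameters are assumptions chosen for illustration. The loop mutates a valid seed input at random and records any input that triggers an unexpected exception.

```python
import random


def parse_length_prefixed(data: bytes) -> bytes:
    """Hypothetical toy target: the first byte declares the payload length."""
    if not data:
        raise ValueError("empty input")  # expected, well-handled rejection
    length = data[0]
    payload = data[1:1 + length]
    # Planted bug: assumes the payload really is as long as declared.
    _last = payload[length - 1]  # IndexError when the payload is too short
    return payload


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply one to three random byte-level edits to a copy of the seed."""
    data = bytearray(seed)
    for _ in range(rng.randint(1, 3)):
        choice = rng.random()
        if choice < 0.5 and data:
            data[rng.randrange(len(data))] = rng.randrange(256)  # overwrite
        elif choice < 0.75 and data:
            del data[rng.randrange(len(data))]  # delete a byte
        else:
            data.insert(rng.randint(0, len(data)), rng.randrange(256))  # insert
    return bytes(data)


def fuzz(target, seed: bytes, iterations: int = 2000, rng_seed: int = 0):
    """Feed mutated inputs to the target and collect unexpected exceptions."""
    rng = random.Random(rng_seed)  # fixed seed keeps runs reproducible
    crashes = []
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except ValueError:
            pass  # the parser rejected the input cleanly; not a bug
        except Exception as exc:  # anything else counts as a crash
            crashes.append((candidate, exc))
    return crashes


if __name__ == "__main__":
    found = fuzz(parse_length_prefixed, b"\x03abc")
    print(f"{len(found)} crashing inputs found")
    if found:
        print("example:", found[0][0], "->", repr(found[0][1]))
```

A fuzzer like this only notices failures that surface as crashes. A logic bug that returns a wrong but well-formed result would pass through silently, which is the gap the article says the AI model helped cover.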

After seeing the results, Mozilla plans to incorporate the new AI-assisted method into its broader security and development workflow. The organization expects Anthropic's Claude models and other advanced AI systems to help uncover additional issues in the future.

If the approach proves scalable, it could also help identify large numbers of previously "undiscoverable" bugs in other popular open-source projects where fuzzing and other traditional techniques have reached their limits.


 
Yea, but can you patch them without creating new bugs? If not, this highly profitable wild goose chase continues unabated. What would security vendors do if there were no more bugs? Do we dare to imagine?
 
I believe bug fixing is a general problem, independent of the AI-assisted bug-hunting method mentioned. The purpose of the tool is to detect bugs. But it would be nice to know more details about this method.
 
Mozilla is a company that is 100% dependent on its sponsors (Google's owners).
Do Google and Anthropic have the same owners?
It's all a lie.
 
These are bugs in human generated code found by AI trying to "break" the browser in the same way that human testers do it. Then said reported bugs were verified by humans.

AI is a useful tool even with its flaws if used correctly.
The only use I've found for LLM's (LLM's are not AI) is not using them.
 
Are these bugs or hallucinations and part of the error rate of these LLM's?
They’re not hallucinations. The LLM helped surface potential issues through analysis, but every finding still requires human verification, just like results from static analyzers or fuzzers. Some reports are false positives, but many were confirmed bugs. The model accelerates discovery ... it doesn’t replace validation.
 
The only use I've found for LLM's (LLM's are not AI) is not using them.
Whether you like them or not, LLMs are categorized as AI in modern ML research. Like any tool, they have limitations, but they can still be useful for tasks like code review, analysis, and triaging bugs.

Saying LLMs aren’t AI is mostly a philosophical argument. In ML research they’re literally categorized as AI models.
 
You can call LLM's whatever you want, it doesn't change the fact that they're not AI, they're LLM's, only capable of regurgitating what has already been created.
 
Is that all you got?
You are arguing a definition debate, not the actual capability of the technology. Researchers in machine learning and computer science classify LLMs as AI systems. You can call it or classify it however you like; that does not change reality!
 
The only use I've found for LLM's (LLM's are not AI) is not using them.
At the very least they are a smarter (pun intended) google search.

I now get the same work done in less time and often leave work an hour earlier. Yes you have to double check things and assume they are making mistakes - same with new hires. But even doing all that, I have 5 more hours a week for fun for $40/month.

I actually am glad people don't want to use them because if everyone does then I lose my competitive advantage.
 
You can call LLM's whatever you want, it doesn't change the fact that they're not AI, they're LLM's, only capable of regurgitating what has already been created.
Yes, they can ONLY regurgitate from the sum total of human knowledge in books, online, in code repositories, in academic articles...

I'm going to hold out until they can pull from 110% of human knowledge too.
 
At the very least they are a smarter (pun intended) google search.

I now get the same work done in less time and often leave work an hour earlier. Yes you have to double check things and assume they are making mistakes - same with new hires. But even doing all that, I have 5 more hours a week for fun for $40/month.

I actually am glad people don't want to use them because if everyone does then I lose my competitive advantage.
In my field of work these LLM's are as frustrating as it gets as the information they provide is not only inconsistent, with data changing randomly, but also entirely wrong.
 
This seems like a good use of A.I. by FF. I don't want it myself at all on my PC, but in simple terms, Anthropic helped uncover security flaws for FF. Surely that's good.

I would not have posted except for the fact that it's Anthropic. Currently, at least, they seem to be the "good guys," when it comes to A.I. Keep it up and be the exception!

They actually have some ethical values as shown by their standing up to USA Defence/War Ministry.
Good for them. This kind of thing is becoming rare in the era of A.I. Mostly it's 100% focused on profit and market share.

At least Anthropic A.I still retains human values.
 
Firefox has been very buggy lately, slow page loading, etc., even with the latest version!🤬
Buggy and unstable. It sometimes crashes when loading pictures. All these bug fixes certainly are paying off. AI slop certainly yields results.
 
In my field of work these LLM's are as frustrating as it gets as the information they provide is not only inconsistent, with data changing randomly, but also entirely wrong.
So is much of the information online. People put entirely too much faith in a random poster on reddit or a random blog post.

Do you still have to use your brain with the info LLMs give you? Yes.
But it can give me, in seconds, info as good as I'd get from searching online for an hour, and that time savings alone is worth more than $20 a month to me.

Also the free versions are much worse than paid.

Claude's Opus is much better than the rest for writing and code.

One trick is to give a second AI the output of the first and tell it to find what is wrong with it.
 