AI-assisted code can be inherently insecure, study finds

Programmers must be educated about strong coding practices

By Alfonso Maruccia December 29, 2022, 9:34

Forward-looking: Machine learning algorithms are all the rage now, as they are used to generate any kind of "original" content after being trained on enormous pre-existing datasets. Code-generating AIs, however, could pose a real issue for software security in the future.

AI systems like GitHub Copilot promise to make programmers' lives easier by creating entire chunks of "new" code based on natural-language textual inputs and pre-existing context. But code-generating algorithms can also bring an insecurity factor to the table, as a new study involving several developers has recently found.

The study looked specifically at Codex, the AI platform developed by OpenAI that also serves as the code-making engine of the aforementioned GitHub Copilot, and recruited 47 developers. Ranging from undergraduate students to experienced professionals, said developers were tasked with using Codex to solve security-related problems in Python, JavaScript, C, and other high-level programming languages.

The researchers said that when the programmers had access to the Codex AI, the resulting code was more likely to be incorrect or insecure than the "hand-made" solutions conceived by the control group. Furthermore, the programmers with AI-assisted solutions were more likely than the control group to say that their insecure code was secure.

Neil Perry, a PhD candidate at Stanford and the study's lead co-author, said that "code-generating systems are currently not a replacement for human developers." Said developers could be using AI-assisted tools to complete tasks outside their own areas of expertise, or to speed up a programming task they are already skilled in. In both cases they should be concerned, the study author said, and they should always double-check the generated code.

According to Megha Srivastava, a postgraduate Stanford student and the second co-author of the study, Codex is anything but useless: despite the shortcomings of the "stupid" AI, code-generating systems can be useful when employed for low-risk tasks. Furthermore, the programmers involved in the study didn't have particular expertise in security matters, expertise that could have helped them spot vulnerable or insecure code, Srivastava said.

AI algorithms could also be fine-tuned to improve their coding suggestions, and companies that develop their own systems can get better solutions with a model that generates code more in line with their own security practices. Code-generating technology is an "exciting" development with many people eager to use it, the study authors said. It's just that there is still a lot of work to be done on finding proper solutions to AI shortcomings.
