Reliable detection of AI-generated text is impossible, a new study says

Please welcome our new LLM-based, artificial text overlords

By Alfonso Maruccia March 22, 2023, 9:05


What just happened? The suffocating hype around generative algorithms and their unhinged proliferation have pushed many people to try and find a reliable solution to the AI-text identification problem. According to a recently published study, said problem is destined to be left unsolved.

While Silicon Valley corporations are tweaking business models around new, ubiquitous buzzwords such as machine learning, ChatGPT, generative AI and large language models (LLMs), someone is trying to avoid a future where no one will be able to distinguish statistically composed texts from those put together by actual human intelligence.

According to a study by five computer scientists from the University of Maryland, however, the future could already be here. The scientists asked themselves: "Can AI-Generated Text be Reliably Detected?" The answer they landed on is that text generated by LLMs cannot be reliably detected, from either a theoretical or a practical standpoint.

The unregulated use of LLMs can lead to "malicious consequences" such as plagiarism, fake news and spamming, the scientists warn. Reliable detection of AI-based text would therefore be a critical element in ensuring the responsible use of services like ChatGPT and Google's Bard.

The study looked at state-of-the-art LLM detection methods already on the market, showing that a simple "paraphrasing attack" is enough to fool them all. By employing a light word rearrangement of the originally generated text, a smart (or even a malicious) LLM service can "break a whole range of detectors."

Even with watermarking schemes or neural network-based scanners, it's "empirically" impossible to reliably detect LLM-based text. In the worst-case scenario, paraphrasing can bring the accuracy of LLM detection down from a baseline of 97 percent to 57 percent. At that level, a detector would do little better than a "random classifier" or a coin toss, the scientists noted.

Watermarking algorithms put an undetectable signature over the AI-generated text, but paraphrasing erases that signature completely, and the approach even comes with an additional security risk. A malicious (human) actor could "infer hidden watermarking signatures and add them to their generated text," the researchers say, so that the malicious, spam or fake text would be detected as text generated by the LLM.
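
To make the idea concrete, here is a minimal Python sketch of how a statistical watermark can fade once words are rearranged. It is a toy illustration, not the scheme or the detectors examined in the study: the vocabulary, the hash-based "green list" rule and every function name below are assumptions made up for this example, and swapping adjacent words is only a crude stand-in for a real paraphraser.

```python
import hashlib
import random

# Hypothetical toy vocabulary; a real LLM works with tens of thousands of tokens.
VOCAB = ("the a model text writes reads human machine quickly slowly "
         "clear vague report story about with new old").split()

def is_green(prev_word: str, word: str) -> bool:
    """Toy rule (an assumption): hashing the previous word together with the
    current one puts roughly half of all word pairs on a 'green list'."""
    digest = hashlib.sha256(f"{prev_word}|{word}".encode()).digest()
    return digest[0] % 2 == 0

def green_fraction(text: str) -> float:
    """Share of adjacent word pairs that are green. Unmarked text should land
    near 0.5, while watermarked text should score well above that."""
    words = text.split()
    pairs = list(zip(words, words[1:]))
    return sum(is_green(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0

def generate_watermarked(length: int, seed: int = 0) -> str:
    """Stand-in for an LLM: pick each next word at random, but only from the
    green candidates whenever at least one exists."""
    rng = random.Random(seed)
    words = ["the"]
    for _ in range(length):
        green = [w for w in VOCAB if is_green(words[-1], w)]
        words.append(rng.choice(green or VOCAB))
    return " ".join(words)

def rearrange(text: str) -> str:
    """Crude stand-in for a paraphraser: swap each adjacent pair of words."""
    words = text.split()
    for i in range(0, len(words) - 1, 2):
        words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)

marked = generate_watermarked(80)
paraphrased = rearrange(marked)
print(f"watermarked green fraction: {green_fraction(marked):.2f}")
print(f"rearranged green fraction:  {green_fraction(paraphrased):.2f}")
```

In this sketch the detector only counts green word pairs, so any edit that changes which words sit next to each other pulls the score back toward what ordinary text would get. The same logic hints at the spoofing risk: anyone who worked out the green-list rule could deliberately write text that scores high.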

According to Soheil Feizi, one of the study's authors, we just need to learn to live with the fact that "we may never be able to reliably say if a text is written by a human or an AI."

A possible solution to this fake text-generation mess would be an increased effort in verifying the source of text information. The scientist mentions how social platforms have started to widely verify accounts, which could make spreading AI-based misinformation more difficult.
