AI-generated books have overrun public libraries, with no easy solution in sight

Alfonso Maruccia

Editor's take: Generative AI models are powerful tools that can be exploited by scammers, criminals, and those looking to fabricate an entire writing career. These models can, in theory, generate an endless stream of seemingly coherent text, and they have already been used to flood platforms that provide digital services to public libraries.

The internet is becoming a wasteland, devoid of human interaction, as bots consume global bandwidth with malicious and worthless traffic. According to those in the ebook lending industry, AI-generated text has already become a major issue for publicly funded libraries. Low-quality "books" are flooding the market, overwhelming both automated filters and human reviewers with an almost impossible challenge.

A recent report by 404 Media highlights that the problem primarily affects OverDrive and Hoopla, the two leading companies that public libraries rely on for ebook management and lending. OverDrive allows libraries to curate their collections, selecting which books to offer, while Hoopla provides unrestricted access to its entire catalog. Although Hoopla-powered libraries can cap book prices, they have no control over which titles become available to their users.

The core issue with Hoopla's model is the rising presence of fake content, referred to in the publishing industry as "vendor slurry." Even before generative AI became widespread, publishers and libraries were already struggling against an influx of low-quality, often self-published ebooks. For years, individuals have churned out "summaries" of popular books with little to no original content. Now, with tools like ChatGPT, the mass production of meaningless, automated content has reached an entirely new level.

Luca Bartlomiejczyk, a librarian at Edith Wheeler Memorial Library in Monroe, Connecticut, stated that the "gargantuan" number of books available on Hoopla is largely composed of low-quality material with little to no appeal for human readers. "If you're going to say, 'we have 15,000 ebooks on our platform,' and 5,000 of those are low quality, AI generated or stuff that's just put on there without any kind of like oversight or selection criteria being followed, what are you actually offering to us?" Bartlomiejczyk asked.

A growing number of publishers and so-called "authors" specialize in the vendor slurry business. One example is IRB Media, which has hundreds of books on Hoopla – all AI-generated summaries of pre-existing titles. As Bartlomiejczyk explained, a customer searching for a specific book could easily end up with an AI-generated summary instead. Lending such worthless content costs libraries money while delivering a disappointing, AI-powered reading experience.

Two years ago, Library Futures and the Library Freedom Project urged Hoopla and OverDrive to address the issue of low-quality books, particularly those denying the Holocaust or promoting hate against minorities. Hoopla removed the offending titles, explaining that both human and algorithmic reviewers had failed to prevent them from entering its catalog.

Now, librarians like Bartlomiejczyk are calling for greater accountability from digital lending platforms, as AI-driven content degradation is a problem unlikely to disappear anytime soon. No one is advocating for an outright ban on AI-generated books, but such content should be clearly labeled in catalogs so readers know exactly what they are downloading to their e-readers.

Steam and the Nintendo E-Shop have the same problem with games. Nintendo, for all their harping on having the best games, mostly sells recycled mobile games from the Android app store in between releases.
 
I assume Google must have gotten quite good at sifting out spam emails, though the odd one still gets through.

The above ****show is going to be endlessly repeated through all digital markets, and with actual products that may or may not ever exist, or are made on demand.

How do you stop a T-shirt ripoff printing company, especially if seller and buyer are complicit?

Most books, even those written by humans, as mentioned (self-published), are so-so. The problem here is that when authors run these books through AI as an editor, that may trigger a false positive.

flipside - absolute gems go dark.

Suppose we will see the rise of more curators

The apple tax to publish might be a disincentive

However, a curation service funded by a $10 fee will deter most AI bulk generators.
Even if a self-published author knows that, with a $10 fee, they are now more likely to take a loss of $9 than make a profit of 5 cents, many will probably pay it, i.e. just to know someone might read their book and to leave a lasting legacy.

Fun fact: just making a best-seller list in no way means a liveable ongoing income.
Only a small % of authors really make any money.
So people aspiring to be like their favourite author may be disappointed to learn that author is not well off.
 
"If you're going to say, 'we have 15,000 ebooks on our platform,' and 5,000 of those are low quality, AI generated or stuff that's just put on there without any kind of like oversight or selection criteria being followed, what are you actually offering to us?"

Well... simple math says, they're offering you 10,000 decent ebooks...
 
Maybe, while they're clearing out all the AI generated books, they could remove all the vanity press (self published) books at the same time. My wife's brother produced a book and asked me to review it. I couldn't. The story line was mildly interesting but there were spelling mistakes on every page. Even the first word in the book had a grammatical error. What do you say?
 
On the plus side, you could at least prove it was written by a human
 