Back in February, OpenAI revealed an artificial intelligence dubbed GPT-2 that's capable of writing convincing fake news from a short snippet of text. The organization saw a high potential for misuse, so it opted for a staged release that gave researchers limited access to smaller versions of the model.
OpenAI says it has seen no strong evidence of those scaled-back versions being misused in the wild, so it has now released the full GPT-2 model, which contains 1.5 billion parameters, hoping it will help researchers improve tools for detecting and generating text. With more fine-tuning, it could soon help experts determine whether a malicious actor is generating spam and misinformation or impersonating someone online. It may also help chatbots do more than just troll telemarketers.
GPT-2 has been trained to predict the next word in text taken from 8 million web pages. But while the quality of its output is impressive, human-perceived credibility is still relatively low. The text it generates isn't always coherent and can include repetitive structures, sudden topic switches, and world-modeling failures. For example, GPT-2 might write about fires happening under water or about someone being their father's father.
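To make "predicting the next word" a bit more concrete, here is a toy sketch in Python. It is not GPT-2 and not OpenAI's code: it simply counts which word most often follows another in a tiny made-up corpus, whereas GPT-2 uses a large neural network trained on millions of pages. The function names and the sample text are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def predict_next(following, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


def generate(following, start, length=5):
    """Extend `start` by repeatedly appending the predicted next word."""
    out = [start.lower()]
    for _ in range(length):
        nxt = predict_next(following, out[-1])
        if nxt is None:
            break  # the model has nothing to say after an unseen word
        out.append(nxt)
    return " ".join(out)


# Hypothetical miniature "training set" (an assumption, not real GPT-2 data).
corpus = "the fire spread fast . the fire crew arrived . the crew put out the fire"
model = train_bigram_counts(corpus)

print(predict_next(model, "the"))         # most common word after "the"
print(predict_next(model, "underwater"))  # a word the model has never seen
print(generate(model, "the", length=4))
```

Even this crude version hints at the behavior described above: it does reasonably on words it has seen often and has nothing useful to offer on words it has never encountered.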
Simply put, the more training data GPT-2 has seen on a topic, the higher the quality of its output on that topic. Conversely, GPT-2 fails spectacularly when it comes to very specific or esoteric content. Then again, OpenAI's technology faces stiff competition from Google's BERT, which Google runs on its cloud infrastructure to better understand what people are searching for online.
Microsoft recently poured $1 billion into OpenAI to fuel its ambitions to create an "artificial general intelligence." In the meantime, you can play with the web version of the much more limited GPT-2 here.