I recently put up a new piece at the newsletter for my Prophet of Causation book: some in-depth background research on the relationship between Ayn Rand and the Stoics. There’s a lot of interesting stuff there, but you have to subscribe and support the book project to read the whole thing.
I’ve been posting a bit lightly to this newsletter in the last week or so because I’ve been writing a few articles for other publications that haven’t quite been published yet. I’ll let you know when they go up.
The Real AI Apocalypse
A while back, I speculated about an AI doom loop, but perhaps not the one you were expecting.
In order to train programs on good-quality text and images, AI platforms rely on an abundance of human-generated material from existing media, and not just amateur bloggers or social media, but well-funded, carefully edited publications. If AI-generated text and images were to completely flood the internet—as they may soon do—then an AI that goes out to “scrape” the material it is designed to emulate would be working off of text and images generated by other AI. This raises the prospect of a doom loop where AI is copying AI and getting farther and farther away from anything that is useful or meaningful to humans.
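For readers who want to see this feedback loop in miniature, here is a minimal sketch in Python. The setup and parameters are my own illustration, not taken from any of the studies discussed below: the "model" is nothing but a fitted mean and standard deviation, and each new generation is trained only on samples produced by the previous one.

```python
import random
import statistics

# Toy illustration of a "doom loop": a model retrained, generation after
# generation, only on synthetic data sampled from its previous self, with
# no fresh human-made data ever added. The "model" here is just a Gaussian
# (a mean and a standard deviation), and "training" is re-estimating those
# two numbers from a finite synthetic sample.
# Note: SAMPLE_SIZE and GENERATIONS are arbitrary illustrative choices.

random.seed(42)

SAMPLE_SIZE = 20    # each generation "trains" on this many synthetic points
GENERATIONS = 100   # how many times the model trains on its own output

# Generation 0: fit to the "human" data distribution, N(0, 1).
mean, std = 0.0, 1.0

for gen in range(GENERATIONS + 1):
    if gen % 10 == 0:
        print(f"generation {gen:3d}: mean={mean:+.3f}  std={std:.3f}")
    # Sample synthetic "training data" from the current model...
    synthetic = [random.gauss(mean, std) for _ in range(SAMPLE_SIZE)]
    # ...and fit the next model to it. Estimation error compounds over
    # generations, and the fitted spread tends to drift downward, so the
    # tails of the distribution (the rare, distinctive data) vanish first.
    mean = statistics.fmean(synthetic)
    std = statistics.stdev(synthetic)
```

Run it and the printed standard deviation shrinks as the generations go by: the rare, tail-end data disappears first, which is roughly the statistical skeleton of what the studies quoted below report happening to language and image models.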
Well, that speculation has now been bolstered by several recent studies.
AI is eating its own tail. In what can best be described as a terrible game of telephone, AI could begin training on error-filled, synthetic data until the very thing it was trying to create becomes absolute gibberish. This is what AI researchers call “model collapse.”
One recent study, published on the pre-print arXiv server, used a language model called OPT-125m to generate text about English architecture. After training the AI on that synthetic text over and over again, the 10th model’s response was completely nonsensical and full of a strange obsession with jackrabbits.
Another recent study, similarly posted to the pre-print arXiv server, studied AI image generators trained on other AI art. By the AI’s third attempt to create a bird or flower with only a steady diet of AI data, the results came back blurry and unrecognizable.
So, in order to train new AI models effectively, companies need data that’s uncorrupted by synthetically created information….
For now, engineers must sift through data to make sure AI isn’t being trained on synthetic data it created itself. For all the hand-wringing regarding AI’s ability to replace humans, it turns out these world-changing language models still need a human touch.
This report seems to be based in part on a previous article with a more suggestive title: “AI Is an Existential Threat to Itself.” The most intriguing detail is that researchers initially described the phenomenon of “model collapse” as “model dementia.”
This led me to speculate on Threads that our science fiction has gotten the AI apocalypse all wrong: