Why ‘artificial intelligence’ is getting dumber
Did you know that cats have been on the moon? Is it safe to look at the sun for 15 minutes, or even longer, if you have dark skin? Or should you eat one small stone a day to stay healthy?
These are some of the latest pearls of wisdom Google is serving up to its US users (we’re not so lucky here in the UK yet). “Let Google do the searching for you,” the search giant promised when it unveiled a feature called AI Overviews earlier this month. It integrates Google’s Gemini generative-AI model into its search engine, and the answers it generates appear above the traditional list of ranked results. And you can’t get rid of them.
AI Overviews didn’t have the impact Google had hoped for, to say the least. It certainly achieved instant internet virality, with people sharing their favorite answers, not because they are helpful but because they are so funny. Ask AI Overviews for a list of fruits ending in ‘um’, for example, and it returns: ‘Applum, Strawberrum and Coconut.’ This is what, in AI jargon, is called a ‘hallucination’.
Despite a market cap of $2 trillion and the ability to hire the greatest brains on the planet, Google keeps stumbling over AI. Its first attempt to join the generative-AI gold rush, last February, was the ill-fated chatbot Bard, which had similar problems spouting factual inaccuracies. In its first live demo, Bard falsely claimed that the James Webb Space Telescope, which did not launch until 2021, had taken the ‘first pictures’ of a planet outside the solar system. The mistake wiped $100 billion off Google’s market value.
This February, Google had another go at generative AI, this time with Gemini, an image and text generator. The problem this time was over-zealous guardrails. When asked to produce historically accurate images, it would instead generate black Nazi soldiers, Native American founding fathers and a South Asian pope.
It was a ‘good-faith mistake’, said The Economist. But the problems inherent in generative AI cannot have caught Google unprepared. It will have known all about its opportunities and pitfalls.
Before the current AI mania truly began, analysts had already concluded that generative AI was unlikely to improve the user experience, and might even degrade it. That caution was abandoned when investors started piling in.
So why does Google’s artificial intelligence produce such broken results? In truth, it is working exactly as you would expect. Don’t be fooled by the ‘artificial intelligence’ branding. Fundamentally, AI Overviews simply tries to guess the next word according to statistical probability, without any grounding in reality. The algorithm cannot say ‘I don’t know’ when asked a difficult question, because it doesn’t ‘know’ anything. It cannot even do simple math, as users have shown, because it has no underlying concept of numbers or of valid arithmetic operations. Hence the hallucinations and failures.
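To see the gist of how that works, here is my own toy sketch in Python, with an invented miniature ‘training corpus’; it bears no resemblance to Google’s actual system, but it illustrates the principle of picking the next word purely by how often words follow one another in training text, with no notion of what is true and no way of saying ‘I don’t know’:

import random
from collections import Counter, defaultdict

# A tiny invented "training corpus"; a real model ingests much of the web.
corpus = ("fruits that end in um include applum strawberrum and coconut "
          "eat one small stone a day to stay healthy").split()

# Count which word follows which in the training text.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def next_word(prev):
    # Pick a continuation weighted by how often it followed `prev` in training.
    counts = follows.get(prev)
    if not counts:
        return random.choice(corpus)  # it guesses; it cannot say "I don't know"
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

word = "fruits"
output = [word]
for _ in range(8):
    word = next_word(word)
    output.append(word)
print(" ".join(output))  # fluent-looking text, with no regard for whether it is true

Scale that idea up by billions of parameters and you get something far more fluent, but the underlying trick is the same: plausible continuation, not knowledge.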
This matters less when the result is not so important, for example when an AI processes a photo and introduces a minor glitch. Our phones use machine learning to process our photos every day, and we don’t notice or care about most of the mistakes. But Google advising us to start eating rocks is not a minor glitch.
Such mistakes are more or less inevitable because of the way the AI is trained. Instead of learning from a curated dataset of accurate information, AI models are trained on huge, largely unfiltered datasets. Google’s AI and ChatGPT have already mined as much of the web as they can and, needless to say, a lot of what’s on the web isn’t true. Forums like Reddit are full of sarcasm and jokes, but the AI treats them as credible, honest and accurate explanations of problems. Developers have long used the phrase ‘GIGO’ to describe what’s going on here: garbage in, garbage out.
The problem of AI hallucination crops up in every field. It largely prevents generative AI from being practically useful in commercial and business settings, where you might expect it to save a great deal of time. A new study of generative AI in the legal profession finds that the extra verification steps now required to make sure the AI isn’t hallucinating cancel out the time saved by deploying it.
‘[The models] still make the same gross mistakes as before. No one has really solved hallucinations with large language models, and I don’t think we can,’ noted cognitive scientist and veteran artificial-intelligence skeptic Professor Gary Marcus last week.
Another problem is now coming to light. AI makes an already bad situation worse by generating false information that then pollutes the rest of the web. ‘Google learns whatever rubbish it sees on the internet and nothing creates rubbish better than artificial intelligence’, as one X user put it.
Last year, leading AI companies admitted that after they ran out of content to pull from the web, they started using synthetic training data – that is, data generated by generative AI itself. A year ago, OpenAI’s Sam Altman said he was ‘pretty confident that soon all data will be synthetic data’, made by other AIs.
This is a huge problem, because it can cause a model to ‘collapse’ and stop producing useful results. ‘Model collapse is when generative artificial intelligence becomes unstable, unreliable or stops working. It can happen when generative AI models are trained on content generated by AI rather than humans,’ Professor Nigel Shadbolt of the Open Data Institute warned last December. One researcher, Jathan Sadowski, has dubbed the phenomenon ‘Habsburg AI’, after the Spanish Habsburg dynasty, which died out in 1700 after generations of inbreeding.
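You can see the drift in a back-of-the-envelope simulation, again my own toy sketch rather than the code behind any of the studies cited here: a simple statistical model is fitted to real data once, and every later ‘generation’ is fitted only to samples produced by its predecessor.

import random
import statistics

random.seed(0)

# Generation 0: a model fitted to real, human-made data.
mu, sigma = 0.0, 1.0

for generation in range(1, 16):
    # Each new model only ever sees synthetic output from the previous one...
    synthetic = [random.gauss(mu, sigma) for _ in range(50)]
    # ...and is refitted to it, sampling noise and all.
    mu = statistics.fmean(synthetic)
    sigma = statistics.pstdev(synthetic)
    print(f"generation {generation:2d}: spread of outputs = {sigma:.3f}")

# The fitted spread drifts away from the original and, on average, shrinks:
# each generation captures a little less of the variety in the real data.

Run it and the spread of the model’s outputs wanders and tends to narrow with each generation, a crude analogue of the loss of quality and diversity the researchers describe.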
You can argue that something like this already happens without the help of artificial intelligence, for example when a false claim is inserted into Wikipedia, gets quoted in the media, and the media citations then become the justification for keeping the claim in Wikipedia.
AI simply automates and accelerates this process of generating falsehoods. This week, the Telegraph gave the following example: ‘When Google claimed that there was no African country beginning with the letter K, its response appeared to be based on a web discussion where ChatGPT had incorrectly answered the same question. In other words, AI is now treating other AIs’ fabrications as gospel.’
The most apt description of this phenomenon comes from a group of American researchers, who last year coined the term ‘Model Autophagy Disorder’, or MAD. They chose it to evoke the practice of feeding cattle the remains of other cattle, which put bovine prions into the feed supply and caused bovine spongiform encephalopathy, or mad cow disease. ‘Our primary conclusion in all scenarios is that, without sufficient fresh real data in each generation of the autophagic loop, future generative models are doomed to progressively decrease in quality (precision) or diversity (recall),’ they wrote.
Very few people warned about the shortcomings of generative AI when OpenAI released its ChatGPT tool to the public in November 2022. Now ChatGPT has polluted the web, poisoning itself and other AI tools in the process. Cleaning this up will be a big challenge. While the promised gains of artificial intelligence remain elusive, the costs are clearly starting to mount.
Andrew Orlowski is a weekly columnist at the Telegraph. Follow him on X: @AndrewOrlowski.