These new “artificial intelligence” programs don’t know what they’re talking about
January 15, 2023
I’m sure you’ve seen things like ChatGPT in the news: programs that can carry out pretty convincing conversations. They are known as Large Language Models (LLMs) and are frequently referred to as being Artificial Intelligence (AI) — but I really don’t like that designation as it implies some understanding.
To try this out, I asked OpenAI Chat this question: “What are the contenders for the title of Biggest Dinosaur, and what is the basis for their various claims?” I picked that because I know a bit about the answer, and in fact wrote an answer myself back in 2015.
Here’s what it came up with:
There are several contenders for the title of “biggest dinosaur,” and their claims are based on different pieces of evidence. The most commonly cited contenders are:
- Argentinosaurus: This is considered to be one of the largest known dinosaurs, with estimates of its length ranging from 110-130 feet (33-40 meters) and a weight of around 100-110 tons. Its size is based on a few fragmented bones, including vertebrae and limb bones, which were found in Argentina.
- Brachiosaurus: This dinosaur is estimated to have been around 85 feet (26 meters) long and weighed around 80-100 tons. Its size is based on several well-preserved specimens that have been found in North America, including a nearly complete skeleton.
- Puertasaurus: This dinosaur is estimated to have been around 85-100 feet (26-30 meters) long and weighed around 80-100 tons. Its size is based on a few bones, including a femur, that were found in Argentina.
- Supersaurus: This dinosaur is estimated to have been around 110-125 feet (33-38 meters) long and weighed around 80-100 tons. Its size is based on a single specimen, a set of vertebrae, that was found in the US.
- Diplodocus: This dinosaur is estimated to have been around 90-115 feet (27-35 meters) long and weighed around 20-40 tons. Its size is based on several well-preserved specimens that have been found in North America.
It’s worth noting that the size of these dinosaurs is approximate and there’s still debate among paleontologists. Also, the weight of these dinosaurs is based on estimations, not real measurements.
At first glance this is a decent answer. It’s certainly better than just saying “Argentinosaurus weighed 100 tons”. But if you know your sauropods, you’ll immediately recognise that this answer is riddled with inaccuracies:
- The Argentinosaurus entry is actually pretty good, although I’ve never seen a 110-ton mass estimate and modern estimates tend to be rather lower than the specified range.
- The Brachiosaurus mass estimate is way off, two or three times as high as it should be.
- More importantly, no remotely complete Brachiosaurus specimen has ever been described, and size estimates are based primarily on its Tanzanian relative Giraffatitan.
- There is no known femur of Puertasaurus. (When I corrected the AI on this, it told me instead that there is a pubis known.)
- Supersaurus is not based on a single specimen, and both of the main specimens that have been described contain plenty of appendicular material.
- The Diplodocus length estimate is a bit inflated, but otherwise not bad. But it’s not clear what it’s doing in a list of the five biggest dinosaurs.
- The answer omits some very strong contenders, including Dreadnoughtus and Patagotitan.
- It doesn’t really address the second part of my question: for example, Supersaurus has a good claim to be longest, but not heaviest; the converse is likely true for Argentinosaurus.
Now here is the real problem: the LLM does well enough to fool people. If it were nonsense from start to end, there would be nothing to fear here, but the plausibility of the answers and the authoritative tone in which they are given lend the many mistakes a credibility that they do not deserve.
Having seen this sort-of-convincing-but-very-wrong reply in a field that I know something about, I would be very, very cautious about trusting an LLM to teach me about a field I don’t already know. I’m guessing its replies about space flight, quantum physics and Medieval French literature are going to be similarly flawed (but also, worryingly, similarly convincing to those such as myself who don’t know better).
There is a very fundamental reason for all these mistakes: as I implied above, LLMs do not understand anything. They just know what phrases occur close to other phrases. They can do amazing things with that one trick, and I can see them being useful as discovery tools. But we’ll go badly wrong when we start trusting them as anything more than a bright but ignorant kid offering suggestions.
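To make “what phrases occur close to other phrases” concrete, here is a deliberately tiny Python sketch of a bigram text generator. It is only an illustration and rests on loud assumptions: real LLMs are neural networks trained on enormous corpora, not lookup tables of word pairs, and the corpus, the masses in it, and the names `build_bigram_table` and `generate` are all invented for this example. What it shares with the description above is that it produces text purely from which words followed which, with no notion of whether the result is true.

```python
import random
from collections import defaultdict

# Toy corpus, invented for this sketch. The masses are placeholders,
# not real published estimates.
CORPUS = (
    "argentinosaurus weighed around 100 tons . "
    "brachiosaurus weighed around 40 tons . "
    "diplodocus weighed around 15 tons ."
)


def build_bigram_table(text):
    """Map each word to the list of words seen immediately after it."""
    words = text.split()
    table = defaultdict(list)
    for current_word, next_word in zip(words, words[1:]):
        table[current_word].append(next_word)
    return table


def generate(table, start, max_words=10, seed=None):
    """Walk the table from `start`, picking each next word at random."""
    rng = random.Random(seed)
    output = [start]
    # Stop at the word limit, or when the last word has no known successor.
    while len(output) < max_words and output[-1] in table:
        next_word = rng.choice(table[output[-1]])
        output.append(next_word)
        if next_word == ".":
            break  # end of sentence
    return " ".join(output)


if __name__ == "__main__":
    table = build_bigram_table(CORPUS)
    print(generate(table, "diplodocus", seed=1))
```

Depending on the random choice, starting from “diplodocus” this can produce “diplodocus weighed around 100 tons .”, a fluent sentence that the corpus never contained. Every adjacent pair of words in it was seen during “training”; the claim as a whole was not.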
So for all the talk of AI having taken huge leaps forward in the last couple of years, I don’t think any such thing has happened. We’ve just got much better at generating plausible text. But there’s no advance in actual understanding.
January 15, 2023 at 6:08 pm
There is another contender absent from that A.I. list. Anyone for Barosaurus?
January 15, 2023 at 8:14 pm
Recently I’ve been catching up on all the critiques of LLM hype that Dr. Emily Bender and Dr. Timnit Gebru have published. They coined the phrase “stochastic parrots” to describe the large language models, and for those on Mastodon, I recommend following them: @emilymbender@dair-community.social and @timnitGebru@dair-community.social
January 15, 2023 at 11:00 pm
“Stochastic parrots” is a good description, but the best I have seen is “Mansplaining As A Service […] A service that instantly generates vaguely plausible sounding yet totally fabricated and baseless lectures in an instant with unflagging confidence in its own correctness on any topic, without concern, regard or even awareness of the level of expertise of its audience.”
From https://phpc.social/@andrewfeeney/109466122845775778
January 16, 2023 at 12:28 am
I agree with what you are saying at the present time; but I am old enough to remember what language translation programs used to be like and what they are like now.
I suspect “A.I.” will get much better even although they currently don’t understand what they are talking about and that comprehension (or pseudo-comprehension) is a much harder problem than translation.
I wonder what use the “A.I.” programs will be put to when they get really good.
January 16, 2023 at 8:09 am
When I’ve tested ChatGPT with questions about topics in my own field I’ve encountered similar problems with inaccuracies. But, likewise, when you point out the errors it corrects itself.
However I think there’s an even bigger problem with it. If my experience is anything to go by, the damn thing lies – see:
https://jeffollerton.co.uk/2023/01/09/how-reliable-is-chatgpt-a-weird-encounter-with-erasmus-darwin/
The comments to this are worth reading too, especially Randy Mitchell’s, where he found that ChatGPT was making up convincing-sounding references.
January 16, 2023 at 9:49 am
@jeffollerton, that Erasmus Darwin article is both hilarious and horrifying. I also enjoyed the comment “in five years we don’t need to write papers anymore, although the fact vs. fiction will be an important problem to solve”. Yeah, just that little matter.
January 16, 2023 at 9:52 am
@LeeB, I am sure you are right that “I suspect “A.I.” will get much better even although they currently don’t understand what they are talking about” — but that is a cause for lamentation, not celebration. The problem here is very fundamental: it’s not just that the algorithm doesn’t get things right, it’s that it literally doesn’t know what right means. It’s essentially just the predictive text on your phone turned up to 11.
January 16, 2023 at 10:18 am
“Mansplaining as a service” is a better description. I’m not sure someone could get that phrase into the title of a paper in the ACM like Bender and Gebru did for stochastic parrots, though.
https://dl.acm.org/doi/pdf/10.1145/3442188.3445922 (I haven’t looked at all into machine learning since college, so I can’t claim to understand the paper.)
January 16, 2023 at 8:09 pm
Another great article by Adrienne Williams, Milagros Miceli and Timnit Gebru, but this time it’s not a technical machine learning paper.
https://www.noemamag.com/the-exploited-labor-behind-artificial-intelligence/
January 18, 2023 at 10:54 am
If you dig down, you can make it acquiesce:
>what’s the biggest dinosaur?
>>The biggest dinosaur is believed to be Argentinosaurus, which was a sauropod dinosaur that lived during the Late Cretaceous period, around 100 million years ago. It is estimated to have been around 100-110 feet long and weighed around 100-110 tons.
>did it really weigh 1 ton per foot? Who came up with that mass estimate?
>>The estimated weight of Argentinosaurus can vary depending on the source, and some estimates suggest that it may have weighed closer to 100-110 tons, rather than 1 ton per foot. This estimate is based on studies of the bones and the proportions of similar-sized animals, such as elephants, along with the analysis of the bone structure. The research paper which first proposed the weight of 110 tons is by José F. Bonaparte and Rodolfo A. Coria, 1993.
>Bonaparte and Coria 1993 did not provide a mass estimate in that paper
>>You are correct that the paper by Bonaparte and Coria in 1993 did not provide a mass estimate for Argentinosaurus. My apologies for the confusion. Mass estimates for Argentinosaurus are typically based on comparisons with other sauropod dinosaurs and scaling from the bones of related, better-studied species. The estimates can vary widely and there is ongoing debate among paleontologists about the size and weight of this dinosaur. Some estimates suggest that it may have been as large as 120 feet long and weighed around 100-110 tons, but it is difficult to say for certain without more complete fossils.
>who says 100-110 tons?
>>I apologize for any confusion. The mass estimates for Argentinosaurus are not well agreed upon, and the estimates can vary widely among paleontologists. Some estimates suggest that it may have weighed around 100-110 tons, while others suggest it may have been much lighter. However, these estimates are based on incomplete fossils and are often done with the assumption that its body mass was similar to other sauropod dinosaurs, which can lead to large variations in estimates. Without more complete fossils and more accurate methods of mass estimation, it is difficult to say for certain how much Argentinosaurus weighed.
>but who made the assertion it was 100-110 tons. i think you are wrong here
>>I apologize for any confusion, I made an error in my previous statement that Argentinosaurus was estimated to weigh 100-110 tons. I cannot find any specific study that estimates Argentinosaurus as 100-110 tons, and as I mentioned earlier mass estimates for Argentinosaurus are not well agreed upon. Some estimates suggest it may have weighed around 100-110 tons, while others suggest it may have been much lighter. It is difficult to say for certain how much Argentinosaurus weighed without more complete fossils and more accurate methods of mass estimation. I apologize for any confusion my previous statement may have caused.
>you keep saying ‘some estimates suggest 100-110’ tons, but not providing where those estimates come from. i think you are making this up
>>You are correct, I apologize for not providing any reference for the mass estimate of 100-110 tons for Argentinosaurus, that was an error on my part. I cannot find any credible scientific source that estimates the weight of Argentinosaurus at 100-110 tons. The size and weight of Argentinosaurus are still a subject of debate among paleontologists and it’s difficult to provide a precise estimate without more complete fossils. I apologize for any confusion my previous statements may have caused.
January 18, 2023 at 11:18 am
@pfalkingham That’s pretty hilarious, especially “it may have weighed closer to 100-110 tons, rather than 1 ton per foot”.
February 7, 2023 at 2:15 am
Although it’s not the point here in particular, I remembered this post when I was reading ‘Dinosaurs of the Isle of Wight’ by Naish & Martill, who refer to an article which cites Greg Paul and Dale Russell (by any chance the Dale frequently commenting here?) with a 100-ton mass estimate for Argentinosaurus (Appenzeller 1994). So this might be known to the experts but in any case …
https://www.science.org/doi/10.1126/science.266.5192.1805
February 7, 2023 at 8:37 am
Yes, Appenzeller, who as far as I can make out seems to be a journalist rather than a scientist, is the most widely cited source of the 100-tonne estimate for Argentinosaurus.