ChatGPT sounds convincing, but gets it all wrong
In the experiment that I set up last week, I gave ChatGPT 8 statements and asked it to write a referenced sentence on each. These were then combined into two paragraphs in a way that you may consider writing for your PhD thesis.
See the experiment here.
During this week, I checked the statements given and the citations used.
The results of my experiment are in, and it’s bad news for ChatGPT or anyone who might want to try to rely on it for writing referenced statements for their PhD.
Most statements sounded good, and were certainly on message, but the citations were sometimes vague and when they could be pinned down turned out to be about other subjects or did not contain the information claimed. This is really bad news, because at first sight the information seemed credible. In the case of a PhD student, the examiners are likely to know their specialist field and will hopefully recognise the mis-citations. But for undergraduate essays, this is a far more difficult situation as those marking them will not have the time or subject area expertise to check for mis-cites. If it sounds credible, it will most likely fly.
As a recap
I gave ChatGPT a series of short statements and asked for a sentence with citations. In the following, you will see this broken down statement by statement:
Paragraph 1
Given: "Invasive species are a growing global problem"
Chat GPT Statement 1: According to the International Union for Conservation of Nature (IUCN), invasive species are a growing global problem that pose a significant threat to biodiversity and ecosystem health.
My assessment of this statement is that it is very vague. While I can easily imagine that IUCN has made (many) statements such as this, it would be very difficult to pin down the source – which is primarily what we are interested in when we use a citation. Thus, while I feel that the statement is credible, it is unhelpful. Many might be tempted to use this text, and retrospectively assign another citation that would fit it. I would caution against this because (a) you’ll waste your time trying to find a citation that fits this text, and (b) you’d be better off reading pertinent literature and citing from what you’ve read. If you are unsure where to start with pertinent literature, try citations within citations from literature that you do know. The first paragraph of their text should give relevant literature for what you are trying to cite in your first paragraph – but always find and read it.
Chat GPT Statement 2: The National Wildlife Federation (NWF) reports that invasive species cost the United States government an estimated $120 billion annually in damages and control measures (Masterson, 2018).
My assessment: I got very stuck with this statement, and had a lot of difficulty trying to track down the document attributed to Masterson (2018). A search of the National Wildlife Federation (NWF) website (https://www.nwf.org/) for “Masterson” came up with only two documents (Pathak et al., 2022; Small-Lorenz et al., 2017). Neither document is authored by someone called “Masterson”. I have written to NWF to ask them whether they have any insight into where this statement might have come from.
Next is the figure of $120 billion annually. This likely comes from the Pimentel et al. (2005) paper, which has now been updated by Fantle-Lepczyk et al. (2022), who have provided the most recent assessment of costs to the USA. In their abstract, they claim that a conservative estimate would be ~$20 billion per year, with ~$73 billion as an upper bound on this figure. Hence, if Masterson really did write a report for NWF then it seems that this was a very high estimate, and not in line with other (more recent) literature. Given the wide-ranging access to the internet, it is surprising that ChatGPT decided to use the NWF source.
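To put that gap in numbers, here is a minimal Python sketch. The function name `compare_claim` is hypothetical, and the inputs are simply the rounded figures quoted above (the "~" values from the Fantle-Lepczyk et al. abstract), so the ratios should be read as rough guides rather than precise results.

```python
# Figures in billions of US dollars per year, as quoted in the text above.
# The two published values are approximate ("~") in the source.
CHATGPT_CLAIM = 120.0          # figure given in ChatGPT's Statement 2
CONSERVATIVE_ESTIMATE = 20.0   # ~$20 billion (Fantle-Lepczyk et al., 2022)
UPPER_BOUND = 73.0             # ~$73 billion (Fantle-Lepczyk et al., 2022)


def compare_claim(claim, low, high):
    """Describe where a claimed figure sits relative to a published range."""
    return {
        "within_range": low <= claim <= high,
        "times_conservative": claim / low,
        "times_upper_bound": claim / high,
    }


result = compare_claim(CHATGPT_CLAIM, CONSERVATIVE_ESTIMATE, UPPER_BOUND)
for key, value in result.items():
    print(f"{key}: {value}")
```

On these rounded inputs, the claimed figure falls outside the published range: it is six times the conservative estimate and well above the upper bound.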
Chat GPT Statement 3: The Convention on Biological Diversity (CBD) states that invasive species are one of the main threats to global biodiversity, causing species extinctions, altering ecosystems, and reducing ecological resilience (CBD, 2021).
My assessment: This is another very vague statement, this time attributed to the Convention on Biological Diversity (CBD). They produced a large number of documents in 2021, and trying to find the one that made this statement, and on what it would be based would require plenty of time. I don’t doubt that it is correct, but it would be better to use a source that others can find. Thus, rather like statement 1, I would junk this statement as unhelpful for including in your PhD.
Chat GPT Statement 4: A study conducted by the University of California, Davis, found that invasive species can have negative impacts on human health, recreation, and property values, affecting people’s quality of life in areas where they are present (Koch et al., 2011).
My assessment: Here I started by searching the UCD website for “Koch invasive”. This came up with a number of hits, but none looked like anything that could be linked to the statement given. Next, I tried searching Scopus with “invasive species” in the Title, Abstract and Keywords, and Koch as first author. In 2011, only one paper by Frank H. Koch (of the Department of Forestry and Environmental Resources and USDA Forest Service, both in North Carolina – none of the other authors were from UCD) was written on invasive forest insect species (Koch et al., 2011). The article does not mention house prices, human health or quality of life. Had I been looking for a citation for a statement like this I might have started with the Blackburn et al. (2014) EICAT framework and looked for examples of people citing this with their specified categories.
At the end of the first paragraph, I have to say that I’m not impressed with what ChatGPT actually came up with. It sounded good, but the statements were vague, and no sources could be verified. Hence, in terms of how useful this would be for writing your PhD, I would say that it gets zero points and was probably a waste of time. I would hope that you are familiar enough with the broad context of your work to write a better first paragraph from scratch.
Paragraph 2
Chat GPT Statement 5: A global analysis of freshwater ecosystems found that they are among the most vulnerable to the impacts of climate change, pollution, and invasive species, with declines in biodiversity and ecosystem services being reported worldwide (Vörösmarty et al., 2010).
My assessment: Paragraph 2 looks like it is starting on a much firmer footing, and this suggests to me that ChatGPT needs more specific input in order to get it to be more useful. Firstly, there is a paper by Vörösmarty et al. (2010), and it is about fresh water. But it doesn’t mention invasive species. What it actually says is: “non-native species threaten biodiversity but are typically inconsequential to human water security.” This appears to be the opposite of what ChatGPT thought, and while we might have been tempted to keep this sentence as the citation (1) existed and (2) is about fresh water, again we find that this statement doesn’t match what we needed. There are papers that would be a much better fit for our purposes, like Gallardo et al. (2016) and Gallardo et al. (2018).
Chat GPT Statement 6: Studies have shown that the introduction of invasive fish species can cause significant ecological and economic changes in freshwater systems, including alterations in food webs, declines in native fish populations, and reduced recreational and commercial fishing opportunities (Leprieur et al., 2008; Ricciardi & MacIsaac, 2011).
My assessment: Both citations here are readily findable and are on topic. Leprieur et al. (2008a) does indeed discuss economic activities, but finds evidence that they correlate with increased likelihood of non-native fish introductions – not that non-native species cause economic changes as claimed by ChatGPT. Similarly, as Leprieur et al. (2008b) study the patterns of invasive fish occurrence, they do not have any findings about ecological impact. Ricciardi & MacIsaac (2010) is a book chapter that focuses on the impacts of invasions in freshwater. There is a brief mention of economic impacts in their historical perspective, otherwise they do not discuss this issue. There is a good review of ecological impacts, and they do give examples of alterations in food webs, declines in native fish populations and a mention of impact on commercial fisheries. There is no assessment of impacts on recreational fishing.
Thus, for statement 6, ChatGPT failed with one citation, but got some success with another albeit less than 100%. It’s not a complete fail, but it is mostly wrong.
Chat GPT Statement 7: Invasive fish species have been intentionally or unintentionally introduced into both rivers and lakes, causing significant ecological impacts and economic costs globally (Savini et al., 2010; Phelps et al., 2016).
My assessment: This statement is rather like a reiteration of the previous statement. As I asked for these separately, we cannot blame ChatGPT for this instance of repetition, but let’s see how it does on fish introductions to rivers and lakes – which is what was asked for.
Savini et al. (2010) is a paper about the top 27 alien animals introduced into Europe for aquaculture, so it is on topic with lots of information about invasive freshwater fish. However, it does not mention anything about rivers or lakes, does not mention intentional or unintentional introductions, but does give extensive examples of ecological impact. No specific mention of economic impacts is made, although they could be inferred from some of the mentions of aquacultural activities damaged by invasions. Not a complete fail then, but certainly this citation alone cannot be used to substantiate the statement made by ChatGPT.
The Phelps et al. citation could have been one of two papers published that year with Phelps as a lead author about fish. Phelps et al. (2016a) is about Shovelnose Sturgeon in North American rivers and does not mention invasive fish, but does talk about re-introductions. Phelps et al. (2016b) is about the illegal wildlife trade and does mention fish, but only in relation to trade. Neither of these publications could be cited to affirm the statement made by ChatGPT.
Chat GPT Statement 8: A study by Jeschke et al. (2014) found that over 12,000 introductions of invasive fish species have already occurred globally, with new introductions continuing to happen at an increasing rate, highlighting the urgent need for effective prevention and management strategies.
My assessment: There is a paper by Jeschke et al. (2014). Rather than being about invasive fish, this paper covers the impact of all invasive species. Non-native fish are given as an example of how opinions differ between quantifiable change and the value placed on that change, but are otherwise not mentioned. If this was the publication that ChatGPT was referring to, then it was a complete fail as none of Statement 8 could be attributed to this paper.
Was ChatGPT a complete fail?
Yes, I would say that having checked the references, most of what was said sounded good, but could not be backed up by the citations given, in the cases where they could be found. For this reason, I am confident in calling this experiment a complete fail, not because there was no truth, but because it was mostly misleading (incorrect citations and unhelpful misdirection to correct citations).
For most of you who are trying to write your PhD thesis, I would suggest that you are completely wasting your time if you try this approach with ChatGPT. This doesn’t mean that you can’t use it as an aid to writing, but that you can’t rely on it to produce anything that is factually correct.
Perhaps ChatGPT is getting all these citation uses wrong because it is looking to see how other people use citations? This is not unreasonable, as mis-citing is a real problem. But having this platform reinforcing false citation habits is the last thing you need.
Given that it’s taken me several hours to check two paragraphs, I feel no hesitation in suggesting that you avoid using ChatGPT in this way – and if you do, beware – it probably got it all wrong.
If ChatGPT were a student, I'd have some choice words to say after spending all this time and finding out that almost nothing was correct. If I was feeling charitable, then I'd say "cocky" and "unafraid of making up rubbish in order to have something sound nice".
Literature Cited (by me - who knows what ChatGPT was citing)
Blackburn, Tim M., Franz Essl, Thomas Evans, Philip E. Hulme, Jonathan M. Jeschke, Ingolf Kühn, Sabrina Kumschick, et al. “A Unified Classification of Alien Species Based on the Magnitude of Their Environmental Impacts.” PLOS Biology 12, no. 5 (May 6, 2014): e1001850. https://doi.org/10.1371/journal.pbio.1001850.
Fantle-Lepczyk, J.E., P.J. Haubrock, A.M. Kramer, R.N. Cuthbert, A.J. Turbelin, R. Crystal-Ornelas, C. Diagne, and F. Courchamp. “Economic Costs of Biological Invasions in the United States.” Science of the Total Environment 806 (2022). https://doi.org/10.1016/j.scitotenv.2021.151318.
Gallardo, B., A.E. Bogan, S. Harun, L. Jainih, M. Lopes-Lima, M. Pizarro, K.A. Rahim, R. Sousa, S.G.P. Virdis, and A. Zieritz. “Current and Future Effects of Global Change on a Hotspot’s Freshwater Diversity.” Science of the Total Environment 635 (2018): 750–60. https://doi.org/10.1016/j.scitotenv.2018.04.056.
Gallardo, B., M. Clavero, M.I. Sánchez, and M. Vilà. “Global Ecological Impacts of Invasive Species in Aquatic Ecosystems.” Global Change Biology 22, no. 1 (2016): 151–63. https://doi.org/10.1111/gcb.13004.
Jeschke, Jonathan M., Sven Bacher, Tim M. Blackburn, Jaimie T. A. Dick, Franz Essl, Thomas Evans, Mirijam Gaertner, et al. “Defining the Impact of Non-Native Species.” Conservation Biology 28, no. 5 (2014): 1188–94. https://doi.org/10.1111/cobi.12299.
Koch, Frank H., Denys Yemshanov, Manuel Colunga-Garcia, Roger D. Magarey, and William D. Smith. “Potential Establishment of Alien-Invasive Forest Insect Species in the United States: Where and How Many?” Biological Invasions 13, no. 4 (April 1, 2011): 969–85. https://doi.org/10.1007/s10530-010-9883-8.
Leprieur, Fabien, Olivier Beauchard, Simon Blanchet, Thierry Oberdorff, and Sébastien Brosse. “Fish Invasions in the World’s River Systems: When Natural Processes Are Blurred by Human Activities.” PLOS Biology 6, no. 2 (February 5, 2008): e28. https://doi.org/10.1371/journal.pbio.0060028.
Phelps, Jacob, Duan Biggs, and Edward L Webb. “Tools and Terms for Understanding Illegal Wildlife Trade.” Frontiers in Ecology and the Environment 14, no. 9 (2016): 479–89. https://doi.org/10.1002/fee.1325.
Phelps, Q. E., S. J. Tripp, M. J. Hamel, J. Koch, E. J. Heist, J. E. Garvey, K. M. Kappenman, and M. A. H. Webb. “Status of Knowledge of the Shovelnose Sturgeon (Scaphirhynchus platorynchus, Rafinesque, 1820).” Journal of Applied Ichthyology 32, no. S1 (2016): 249–60. https://doi.org/10.1111/jai.13241.
Pimentel, David, Rodolfo Zuniga, and Doug Morrison. “Update on the Environmental and Economic Costs Associated with Alien-Invasive Species in the United States.” Ecological Economics, Integrating Ecology and Economics in Control Bioinvasions, 52, no. 3 (February 15, 2005): 273–88. https://doi.org/10.1016/j.ecolecon.2004.10.002.
Ricciardi, Anthony, and Hugh J. MacIsaac. “Impacts of Biological Invasions on Freshwater Ecosystems.” In Fifty Years of Invasion Ecology, 211–24. John Wiley & Sons, Ltd, 2010. https://doi.org/10.1002/9781444329988.ch16.
Savini, D., A. Occhipinti-Ambrogi, A. Marchini, E. Tricarico, F. Gherardi, S. Olenin, and S. Gollasch. “The Top 27 Animal Alien Species Introduced into Europe for Aquaculture and Related Activities.” Journal of Applied Ichthyology 26, no. SUPPL. 2 (2010): 1–7. https://doi.org/10.1111/j.1439-0426.2010.01503.x.
Vimercati, Giovanni, Anna F. Probert, Lara Volery, Ruben Bernardo-Madrid, Sandro Bertolino, Vanessa Céspedes, Franz Essl, et al. “The EICAT+ Framework Enables Classification of Positive Impacts of Alien Taxa on Native Biodiversity.” PLOS Biology 20, no. 8 (August 16, 2022): e3001729. https://doi.org/10.1371/journal.pbio.3001729.
Vörösmarty, C.J., P.B. McIntyre, M.O. Gessner, D. Dudgeon, A. Prusevich, P. Green, S. Glidden, et al. “Global Threats to Human Water Security and River Biodiversity.” Nature 467, no. 7315 (2010): 555–61. https://doi.org/10.1038/nature09440.