Recycling text - new guidelines clarify a thorny issue
I have written elsewhere on this blog about plagiarism. Plagiarism is copying text from a source document that somebody else has written and pasting it into your own document. You will be aware that plagiarism is not acceptable, either for documents that you hand in at the University or for anything that you want to publish (see the blog entry on plagiarism).
But what if the source document that you want to copy from is something that you've written yourself?
Does this still count as plagiarism?
Or is it text recycling?
In recently published guidelines, Hall et al. (2021) help demystify text recycling in its different forms and explain what is permissible, when, and why.
Types of text recycling, with examples:
Developmental: Reusing text that you have written but not published, for example in your proposal or thesis.
Generative: Reusing already published text that becomes more obscure when you attempt to reword it, such as technical settings in your methods.
Adaptive: Using text published in one format on the same subject but for a different audience, for example using text published in a paper for a popular article.
Duplicate: Repeating published text wholesale with the intention to publish again for the same audience.
Developmental recycling is when you reuse text that you have written yourself, for example between your proposal and something you intend for publication, or in an ethics application that you also want to use in your thesis. All of this sort of developmental recycling is permitted and actually encouraged. I would further encourage you to use the opportunity of recycling this text to develop and refine it, condensing and improving where you can.
Generative recycling is where you take pieces of already published text, for example from the methods, when it does not make sense to change the text or when rewording it to avoid plagiarism actually makes it more obscure. In my experience this doesn't amount to more than a few sentences describing technical settings on equipment, but this will depend strongly on your subject area and may amount to larger chunks of text. In my previous advice on generative recycling I suggested that it is usually possible to reword most of the methods sections of papers. I reiterate here that rewording is the preferable outcome, and that ideally you avoid any text recycling at all. You should only recycle material generatively if you cannot avoid it: that is, in situations where the text becomes more obscure through your attempts to reword it. There are some extra guidelines for generative recycling: you should have been an author (preferably the lead author) on the original text; you should make it transparent to readers that the text has been recycled (via a citation); if there are no journal guidelines, you may also want to declare it in your cover letter; and make sure that any co-authors are aware.
Adaptive recycling is where you use published text, for example from a paper, as the basis for content in a popular article online, in a magazine, or in an op-ed. I think that this kind of text recycling is quite unnecessary, because you will almost certainly need to reword your text for a different audience. There may be times, such as figure legends, when you need to reuse text that was already published. If you do find yourself in such a position, then check with the copyright owner of the material that you are able to reuse the text without legal issues.
Duplicate recycling is where large tracts of text are essentially the same, carrying the same message to the same audience. This is never likely to be sanctioned, as it suggests that you are attempting to publish the same work twice. It is neither legal nor ethical.
Read More:
Hall, S., Moskovitz, C., and Pemberton, M. (2021) "Understanding Text Recycling: A Guide for Researchers". Text Recycling Research Project: https://textrecycling.org/resources/
Nick has already published two papers from his thesis (see here), but this latest paper, out today in BMC Ecology & Evolution, looks at the relationship between bite force, morphology, and diet across a number of agamids in southern Africa. He found that although head morphology and bite force relate to each other, they don't have strong or expected relationships with the ecology of the species. For example, rock agamas, which have particularly flat heads for fitting under rocks, actually bite very hard. In general, species with greater in-levers for jaw closing have a greater bite force, which is associated with an increase of hard prey in the diet.
If you've never conducted any performance work, then it's worth watching this video and seeing how much work goes into getting every datapoint. Especially when the animals bite hard...
The bigger they are - the harder they bite!
This is a video by Nick that shows what happens when Giovanni got bitten.
Nick's work built on fieldwork done by Anthony Herrel, Bieke Vanhooydonck and the reptile team back in 2008.
Great to see this work being published!
Read Nick's work here:
Tan, W.C., Measey, J., Vanhooydonck, B. & Herrel, A. (2021) The relationship between bite force, morphology, and diet in southern African agamids. BMC Ecology and Evolution 21, 126. https://doi.org/10.1186/s12862-021-01859-w
Tan, W.C., Herrel, A. & Measey, J. (2020) Dietary observations of four southern African lizards (Agamidae). Herpetological Conservation and Biology 15(1), 69-78 pdf
Solving the attractiveness of incredible results with prediction markets
The idea that science is in crisis has been building for at least two decades. Evidence for this crisis revolves around studies that demonstrate publication bias and, especially, the lack of repeatability of high-profile studies. The findings appear puzzling because all studies are subjected to peer review before being published, and therefore go through some kind of quality check. This would suggest that it should not be easy to pick which studies are replicable and which are not. Yet this does not appear to be the case.
A study that gave subjects the opportunity to predict which studies were replicable found that it was possible to predict this in advance of any replication studies being made. Moreover, once the replication studies were conducted, their results followed the prediction market (Dreber et al. 2015). This suggests that individuals in peer review are not particularly good at determining whether a study is replicable, but that a prediction market is.
Incredible results
Humans have a bias toward wanting to believe significant results (Trivers 2011), even when the potential for these to be the result of a Type I error is quite high. Perhaps the positive feedback gained from incredible results in the media (traditional and social) provides the impetus that drives selection. But a new study suggests that, all else being equal (including gender bias, author seniority, etc.), these incredible results also generate more citations (Serra-Garcia & Gneezy 2021).
As we are aware, citations are a form of currency in present-day science. Increased citations to journals (within 2 years of publication) give them higher Impact Factors, which in turn allow them to leverage better manuscripts and higher APCs. Increased citations allow authors to compete in a competitive job market, opening the door to tenure, grants and awards. We should be aware of the increasing number of retractions associated with fraud, which has placed the perpetrators in advantageous jobs.
The research by Serra-Garcia & Gneezy (2021) is of particular note as they only selected publications from two journals, Nature and Science, meaning that the journal playing field was very similar. They used the dataset from Dreber et al. (2015), allowing them to see which of the studies were actually repeatable, with the ones that weren't being literally incredible. They found that incredible studies received more citations, even after the replication studies (see Dreber et al. 2015) showed that they lacked credibility. Moreover, after the failure to replicate, only 12% of these additional citations reported their incredible nature; hence the increased citations are not generated by those that report on the failure to replicate.
The recognition that incredible results are attractive to high-ranking journals, and to those who cite research in their own fields, helps to lift the veil on the way in which today's science has a positive feedback for chancers and crooks. We are susceptible to scientific fraud because we appear to be drawn by the incredible, presumably because the credible simply doesn't seem exciting enough. Given that as individuals we perform poorly, can we use prediction markets to give us the edge on our inbuilt biases?
Finding a use for prediction markets
Prediction markets are simply crowdsourcing to determine the outcome of a particular event, in this case whether or not a study is replicable. However, there is a gambling twist that borrows from the stock market. For example, you might think that there is an 85% chance that the study is replicable, and this is how you enter the market. Once all participants have placed their predictions, a consensus prediction is reached, and the trading begins. If the consensus prediction is 0.62 and you really believe that the chance is 0.85, you should buy stock valued at 0.62, because if you are right you will make money. However, if the consensus price is 0.95, then you would be better off selling your stock, since you believe there is a 0.15 chance the replication will fail. Like a real market, there is no reason for this market to be static. For example, one of the authors of the original study could give a talk, during which participants start buying or selling their stock as extra confidence or skepticism is gained. Likewise, during the questions an astute member of the audience may rattle the author, resulting in a fall in the 'price', the consensus outcome.
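The buy-or-sell logic above can be sketched in a few lines of Python. This is a minimal illustration, not any real market platform's API: the contract is assumed to pay 1 if the study replicates and 0 otherwise, and the function names are my own.

```python
def expected_profit(belief, price):
    """Expected profit per share of buying a contract that pays 1 if the
    study replicates, given your probability `belief` and the market `price`."""
    return belief - price

def trade(belief, price):
    """Buy when the market underprices your belief, sell when it overprices
    it, otherwise hold."""
    ev = expected_profit(belief, price)
    if ev > 0:
        return "buy"
    if ev < 0:
        return "sell"
    return "hold"

# You believe the study has an 85% chance of replicating:
print(trade(0.85, 0.62))  # consensus of 0.62 looks cheap -> buy
print(trade(0.85, 0.95))  # consensus of 0.95 looks dear  -> sell
```

The same comparison drives the dynamics described above: every new piece of information (a talk, a tough question) shifts participants' beliefs, and the resulting trades move the consensus price.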
Potential uses of prediction markets
If individual reviewers aren't good at spotting incredible results, perhaps this task should be passed to a crowdsourcing platform to determine whether high-profile studies should be published in high-profile journals. As all editorial board members have expertise in the journal's content, perhaps they could make up the panel of experts that judges the replicability of each issue's content. Over time, each board member's track record would be established, and their 'usefulness' might be quantitatively valued.
Prediction markets are likely to work well when complex decisions that would otherwise be taken by a small committee with limited information can instead be put to a much larger group with better collective experience.
Choosing student projects to fund
Each year a limited number of people apply for bursaries to conduct projects in invasion biology at the CIB. The number of bursaries available is between a tenth and a quarter of the number of applicants. The committee (of 5 people) examines each application on a number of criteria, including qualities of the student, the project, the focus, and the past performance of the advisor. The panel's knowledge is imperfect, as they don't know all of the information behind each application. Occasionally, phone calls are made during decision meetings to fill in blanks, but decisions are made by scoring each project, with the top-scoring projects getting funded.
Enter the prediction market. Now a larger number of people can get involved: this could be the entire Core Team of the CIB, or the core team and all existing students. Some will have much better information than the original panel, and will have some impetus to trade with greater confidence. Those with less information are less likely to participate, or will buy less stock. Advisors who have several student applications are similarly forced either to split their stakes evenly, or to back a preferred application over one they consider less likely to succeed. Once the projects are funded (based on the outcome of the prediction market), their success or otherwise will be gauged by whether the student gains the degree within the time allotted. Students that fail to produce in time (or to meet any set of milestones) will be regarded as having failed, and the payout will go to those holding stock that predicted this.
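One way the payout described above might be settled is a simple parimutuel split, in which losing stakes are shared among those who backed the realised outcome, in proportion to their stakes. This is only an assumed mechanism for illustration; the scheme above does not specify one.

```python
def settle(stakes, outcome):
    """Parimutuel settlement sketch (assumed mechanism, not from the post).

    stakes: dict mapping participant -> (side, amount), where side is
            'success' or 'failure'.
    outcome: the realised result, 'success' or 'failure'.
    Returns a dict mapping participant -> payout: winners get their own
    stake back plus a proportional share of the losing pool.
    """
    winners = {n: a for n, (s, a) in stakes.items() if s == outcome}
    if not winners:
        # Nobody backed the outcome: refund all stakes.
        return {n: float(a) for n, (s, a) in stakes.items()}
    losing_pool = sum(a for s, a in stakes.values() if s != outcome)
    winning_pool = sum(winners.values())
    payouts = {n: 0.0 for n in stakes}
    for n, a in winners.items():
        payouts[n] = a + losing_pool * a / winning_pool
    return payouts

bets = {"Alice": ("success", 10), "Bob": ("success", 30), "Carol": ("failure", 20)}
print(settle(bets, "success"))  # {'Alice': 15.0, 'Bob': 45.0, 'Carol': 0.0}
```

Backing a project that then fails pays nothing, so, as noted above, advisors with several student applications have a real incentive to stake honestly rather than boost every application equally.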
Other potential uses
A similar position is faced by anyone looking at applications from stakeholders with imperfect knowledge, where a community of experts exists with better collective knowledge. Outcomes of proposed projects need to have clear milestones, on a timescale whereby the participants are still likely to be around to see the return on their knowledge investment. There are many potential uses within the academic environment, including: grant applications; hiring committees; etc.
Recognition of weakness
Once we have recognised where we are likely to perform badly at decision making, we should be willing to look to other solutions to improve our performance. There is clearly some distaste in the idea of a money market on making decisions, so could this instead form part of the reputation of those that take part? Would you be willing to wager your scientific reputation on the outcome of a hire, a student bursary or a grant application?
References:
Dreber, A., Pfeiffer, T., Almenberg, J., Isaksson, S., Wilson, B., Chen, Y., Nosek, B.A., and Johannesson, M. (2015) Using prediction markets to estimate the reproducibility of scientific research. Proceedings of the National Academy of Sciences 112(50): 15343-15347. https://doi.org/10.1073/pnas.1516179112
Measey, J. (2021) How to Write a PhD in Biological Sciences: A Guide for the Uninitiated. http://www.howtowriteaphd.org/
Serra-Garcia, M., and Gneezy, U. (2021) Nonreplicable publications are cited more than replicable ones. Science Advances 7(21): eabd1705. https://doi.org/10.1126/sciadv.abd1705
Trivers, R. (2011) The Folly of Fools: The Logic of Deceit and Self-Deception in Human Life. Basic Books, New York.
Want to read more?
This blog is the basis of two books currently in press, but you can read them free online now:
What if we did have good, editorially coordinated peer review of preprints?
What if, instead of these manuscripts effectively leaving the preprint system, they were updated together with the reviews that prompted the updates, each with their own linked DOI?
What if the journals themselves were simply pointing to collections of papers that had been curated in this way?
Put simply, this could be a website that throws the veneer of a journal over a set of waypoints to peer-reviewed papers.
This world has already been imagined and is functioning in mathematics, where Overlay Journals have begun to prosper.
According to Brown (2010), the idea of overlaying has been with us for some time, existing as websites that offer a series of links to other papers. It makes me think of the early days of the internet, before search engines, when there were websites that consisted of lists of other websites. Thinking about it in this way, a review article could be considered an 'overlay paper', and the contents of Web of Science an 'overlay database'. But, for me at least, this is not where the real potential lies. Instead, imagine the overlay journal as a way in which academics entirely remove the need for publishers. The need for this is increasingly evident as we become more familiar with the ways in which traditional publishing models permeate our scientific project with confirmation bias. Overlay journals no longer require a publisher to store the publication: this is done at the preprint server. The reviews are housed at the same arXiv site (or would be, in an ideal and transparent version) (Rittman 2020), as is the manuscript in its final form after being accepted by the overlay journal editor. The authors themselves are responsible for the final layout. The overlay journal co-ordinates the reviews and conducts the editorial work, and then simply acts as a pointer to the finished product: no papers, no publishers, no editorial management software, no costs, and all papers are Diamond OA!
The math journal Discrete Analysis (indexed in both Web of Science and Scopus) was the first of these new 'arXiv overlay journals' (launched in 2015, and indexed since 2017), and following this link will allow you to quickly appreciate what an overlay journal is. Each 'published' paper still sits on its original preprint server. The overlay journal itself offers a brief editorial summary of what you'll find if you click through to the paper. This is a fantastic idea in that it pitches editors back into being responsible content curators. As an editor, I'd be motivated to publish a paper that I liked in order to write an editorial summary about it.
Because only the accepted version is provided with an 'article number' and the journal's style file, the author then produces the final version of record (VoR) of the accepted manuscript by running the style file with LaTeX. All of this is possible with free software, for example using R Markdown (Xie, Allaire, and Grolemund 2018).
Using preprint servers also allows the entire process to be transparent, very quickly becoming associated with other great initiatives like the Centre for Open Science - OSF.
What do traditional publishers think of ‘Overlay Journals’?
Surely the onset of 'Overlay Journals' should have publishers quaking in their boots? Strangely, not. But their response should really be enough to wake us up.
But they do see that there’s a possibility of disruption:
I think that the real threat to our traditional … if Overlay Journals have Impact Factors and can provide the same services, and they are free… then I think that that does pose a threat.
As this has already happened, it would be interesting to know how traditional publishers are going to prevent an Overlay Journal take-over.
What is happening in biological sciences?
At the time of writing, there are no 'arXiv overlay journals' in the biological sciences. However, there is a 'nearly' model: Peer Community in Evolutionary Biology comes very close to the 'arXiv overlay journal' model. Preprints are submitted to PCI Evol Biol and are reviewed, and (if they aren't rejected) a recommendation is given. The site then publishes the recommendation from peers as well as pointing to the preprint. However, unlike Discrete Analysis, the preprint remains 'unpublished' despite the peer review, and can then be taken on to a traditional journal.
There’s an excellent tie-in here with transparency. Because preprints are Diamond OA, and reviews are OA, the process is all transparent.
The blog post above is written for my new book: How to publish in Biological Sciences: A guide for the uninitiated. You can read the book as I write it here. If you have ideas about items that aren't available in the book yet, please contact me!