Reviewed Preprints and the Emerging Genre of Editorial Assessment

Almost a year has passed since eLife announced its new publishing model: instead of accepting or rejecting papers after peer review, the journal now publishes all peer-reviewed papers. This is called the “reviewed preprint” system.

This model has met a lot of resistance. I was positive about the idea, with one caveat: it puts a lot of emphasis on the “editorial assessment”. Traditionally, the reject/accept decision would be the culmination of the peer review process. A reviewed preprint, on the other hand, gets published regardless of the outcome of peer review. Great papers, completely convincing: published in eLife. Unsupported claims, poor methodology and other nonsense: published in eLife. The editorial assessment (called the eLife assessment in this case), which is found right under the paper’s abstract, is now the fastest way to distinguish a good from a bad paper. Of course any paper also speaks for itself, and editorial assessments should not be read uncritically. In extreme cases, the paper itself together with the open peer review published alongside it gives the most comprehensive picture of a paper’s quality. However, having a quicker way to judge a paper or specific claims is extremely important under many circumstances.

Journal prestige is still the quickest way most scientists judge a paper, especially if they lack time, expertise, or both. One of the major criticisms of eLife’s reviewed preprint model has been that, by ceasing to reject papers, the journal has essentially given up its own prestige and thereby made it impossible to easily judge eLife papers. I strongly disagree that it is now impossible to quickly judge eLife papers, because a well-written editorial assessment can be a much better way to judge a paper than journal prestige. But I do think that the editorial assessment is now extremely important. I also think it will over time become its own genre of scientific writing and will develop a coded language, similar to reference letters. That’s why I have been looking at the editorial assessments of papers that showed up at the top of the eLife content feed, to see what these assessments actually look like in practice. How do they highlight good papers, and how do they deal with problems identified during peer review?

How are Editorial Assessments Made?

Some of the process is explained here. However, the document does not explain the process in detail. For example, how do different reviewers and the editor resolve disagreements? Here it says: “[…] the editor and the reviewers write this assessment with the help of a common vocabulary to ensure consistency.” There is similar language in this document: “The eLife assessments have been designed to provide a clear summary of what the editors and reviewers thought about the preprint.” Under most circumstances it is probably relatively easy for reviewers and editors to agree, but a bit more transparency about the process and about who has the final word might be useful.

On the use of the “common vocabulary” I have two thoughts. On the one hand, a very restrictive common vocabulary might prevent writers from finding the most appropriate words. On the other hand, English is not a native language for most scientists (myself included), and a smaller common vocabulary can make assessments easier to write and to read. My opinion is that being clear and restrictive about a relatively small vocabulary is a good idea for this kind of assessment. But it is not clear which vocabulary is best or how big it should be; I assume it will change over time, and as other journals adopt similar systems there will be some productive experimentation.

Homo Naledi be Turning in Their Grave

Can the editorial assessment actually distinguish a good from a bad paper? While there are no objectively bad or good papers, we can start with one that attracted a lot of controversy on Twitter. This paper about deliberate burial in Homo naledi got a lot of people angry. People said it shouldn’t be published by eLife, or anyone for that matter, because it does not scientifically support its main claim. But alas, the authors published it as a preprint, an eLife editor decided to organize peer review for it, and a version of record was then published by eLife. But does the editorial assessment reflect the controversial quality of the paper? Did the reviewers agree with Twitter? I think they did, and the editorial assessment reads pretty clearly to me. Take for example this sentence: “The four reviewers were in strong consensus that the methods, data, and analyses do not support the primary conclusions.” I’m not an archaeologist and I don’t have a strong opinion on the paper. If I strongly felt the paper was bad, I’d like to think that I would be happy to have this sentence permanently associated with the paper.

Maybe the paper is so horrid that in an ideal world it wouldn’t exist at all? Maybe the four reviewers and the editor could have spent their time better. But this is not an ideal world. This is the world where people want to find interesting things and sometimes their data does not back up the thing.

Editorial Assessments Seem Overall Positive but Rarely Uncritical

The editorial assessments of other papers that I randomly clicked on in the eLife content feed are mostly positive, but I was surprised that most of them bring up some kind of criticism. I expected them to be much more flowery. Press release-like. For example, for this paper, the editorial assessment states that “The authors present convincing data […]” but the data is also “[…] as the authors acknowledge, currently incomplete with respect to establishing a functional role […]”.

For this paper the editorial assessment uses four words from the common vocabulary, three of them positive: valuable, solid and convincing. But also: “[…] evidence to support the argument that chirps are mostly used for navigation rather than communication is incomplete.” This seems like a serious limitation for a paper that is mainly about chirps and makes me believe that maybe the editorial assessment is a bit too positive?

Here is one that is entirely positive, about genetic control of mosquito populations. I tried to read between the lines to see if there is any hidden criticism but could not find any.

“Incomplete” seems to be the most frequent word from the common vocabulary, probably because it is one of the mildest criticisms? Isn’t any scientific paper in some way incomplete? Here is another one with only one negative word, that word again being incomplete.

Here is one that seems very negative, mentioning again “incomplete” but also strongly criticizing the paper in language that does not fit the common vocabulary.

To finish on a positive one, “The mathematical analyses in the paper are convincing and possible limitations, including the abstraction from biological details, are well discussed.” You love to see it.

The Future of the Reviewed Preprint – Summary

Is eLife going to be flooded with low-quality papers that are unpublishable elsewhere? Or will scientists be so afraid of unfavorable public peer review that no one will submit interesting papers? I do not think people will knowingly upload low-quality work; the prospect of an extremely negative editorial assessment seems like too strong a deterrent. Of course, that requires the scientists who make career decisions to actually read the editorial assessments, which they kind of have to, since they can no longer simply rely on eLife’s impact factor and the fact that the paper was published there.

I also do not think people will be scared to submit papers to eLife. While editorial assessments do not shy away from criticism, they also acknowledge the scientific value of imperfect papers. That middle ground is ideal for reviewed preprints. Overall, I believe the current state of editorial assessments at eLife is good and makes me excited for the future of the genre.
