Propagation of mistakes in papers (databasearchitects.blogspot.com)
124 points by greghn on July 26, 2022 | 40 comments



Reminds me of https://en.wikiquote.org/wiki/Oil_drop_experiment, famously described by Feynman,

> Millikan measured the charge on an electron by an experiment with falling oil drops, and got an answer which we now know not to be quite right. It's a little bit off because he had the incorrect value for the viscosity of air. It's interesting to look at the history of measurements of the charge of an electron, after Millikan. If you plot them as a function of time, you find that one is a little bit bigger than Millikan's, and the next one's a little bit bigger than that, and the next one's a little bit bigger than that, until finally they settle down to a number which is higher.


That experiment is also used in a lot of experimental classes to teach about selective data exclusion (potential scientific fraud), as well as about resistance to challenging an already published value (later experiments sought to verify Millikan's value rather than to show it was off).

https://en.wikipedia.org/wiki/Oil_drop_experiment


Another example I recently came across was the early measurement of the AU using radar. The first two experiments that tried to bounce radar off of Venus had very noisy data, but they seemed to show a detection implying a distance pretty close to the earlier parallax measurements. After the equipment was upgraded, though, the detections went away; it turned out they had just been noise. Later, an even more powerful radar system successfully bounced a signal off of Venus, and the AU turned out to be quite a bit different from its earlier value.


Is that part of the genesis for these conversations about how perhaps the physical constants of the universe are slowly changing over time? If you look at the 'right' experiments, the speed of light slowly crept up over time too, IIRC. When the movement is all in one direction it's easy to speculate that maybe that's because the target keeps moving.


No, any possible variation in fundamental "constants" is going to be far slower. The fine-structure constant (which is generally what's actually being discussed when people talk about a variable speed of light), for instance, changes by no more than a factor of about 10^-17 per year.


Did we know that 70 years ago? I’m talking about history, not today.


That particular bound wasn't. Others were. For example: Millikan's value for the electron charge was about 0.995 of the current best measurements, and that was about a century ago. A 0.5% shift over a century would be an average rate of change of roughly 5×10^-5 per year, many orders of magnitude faster than the modern bound mentioned above. So either someone was wrong, the average rate of change has dramatically increased recently (in which case the universe is conspiring against physicists and all bets are off), or human biochemistry is significantly younger than human civilization.


I don't think this sort of thing is all that unusual.

I once did a Web of Science search for citations to a foundational paper in my field. It was published in volume 13 of a particular journal, and that volume was listed in a little over 90% of the citations, but the rest all listed the volume as 113. My assumption is that somebody cited it in error, and that others were basically copying the citation from a bibliography rather than going back to the original paper to get the correct metadata.

Does this mean that about 10% of writers were basically lying about having read the original paper? Well, maybe. But I fear that the number might be higher than 10%, because the correct citations might also have resulted from just copying from a bibliography.

I tell this story to my students, in hopes that they will actually read the original papers. Quite a few take my advice to heart. Alas, not all do.


There's a semi-famous line of research by Simkin which uses citation-copying errors as 'radioactive tracers' to estimate the rate of copying and non-reading, under the logic that (in a pre-digital age) you could not possibly have repeated the '113' error if you got an interlibrary-loan copy or physically consulted volume 13 (if only because you would be pissed at wasting your time either checking volume 113 first or verifying there's no such thing as volume 113):

https://www.gwern.net/Leprechauns#citogenesis-how-often-do-r...

Your 10% isn't far off from the 10-30% estimates people get, so not bad.


This is somewhat a criticism of how contemporary citation practice works, though.

Primitive science (or even pre-publishing science) doesn't get cited because humanity figured it out before our current system was in place.

It may sound silly, but no one feels the need to cite Eratosthenes when implying the world is round.

But many people do feel the need to cite the colorimetric determination for phosphorus (an SCI top 100 paper) even though it was published 100 years ago and is generally considered “base-level science.”

It is certainly an interesting paper to read, but I’m not sure I need every scientist to read it in order to believe they know how to do colorimetric analysis.


Papers get cited for very different reasons, from "historically significant paper cited in the introduction" over "related work we're building upon" to "this proves a statement/theorem we're using". It might make sense to start categorizing citations in some way.

Only for some categories do I expect the authors to have read the cited paper. Some categories don't necessarily mean the cited paper is high quality. Some citations are recommended reading to understand the original paper, and some aren't.


I wonder what fraction of citations are there just for the sake of having citations in a paper. That is something we were trained to do as undergraduates: include enough citations simply for the sake of it.


Or maybe that's because somebody published a bibtex entry for that paper that got that volume number wrong and those people just copied and pasted the entry without reading.


Indeed. I don't ever really go to the actual original journal for an old (pre-digital) paper; instead I search the internet for a PDF that someone already scanned in and then, ideally, also download the accompanying BibTeX.


I'd be curious to know whether the percentage of incorrect citations varies over time. I would guess more recent authors would be more likely to search by title in Google Scholar or SciHub (or use the DOI link, if available) rather than actually use the volume and page number, which could result in more authors who did read the article nonetheless getting the volume number wrong.


I wonder if there's a trick we're missing, a limitation inherited from the dead-tree history of papers that we could now address.

Namely, paper references always reach back in time. Papers don't reference papers that were written after they were written. And if that sounds stupid, bear with me a second.

We've talked a lot about the reproducibility problem, and that's part of how errors propagate in papers (I didn't prove this value, I just cribbed it from [5]). If we had a habit of peer reviewing papers and then adding the reviews retroactively to the original paper, for both positive and negative results, would we slow this merry-go-round down a little bit and reduce the head-rush? Would that help prevent people from citing papers that have been debunked?


Solid point. A paper is a delta-mapper: it provides a prior-to-posterior change, a p --> ∆p. It does not, however, tell you anything about p or p' = p + ∆p itself. To get the current best value of p after n papers, we have to combine all the ∆p's in some way (and the result is affected by the path we take through the papers).

You're modifying things so that future updates ∆p^{i+k} are attached back to the original delta-mapper, so its ∆p is adjusted to account for them. It's like path compression in a union-find structure.
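
For anyone who hasn't met it, here's a minimal union-find with path compression (a Python sketch, purely to illustrate the analogy; it has nothing to do with the papers being discussed):

    class UnionFind:
        def __init__(self, n: int):
            self.parent = list(range(n))

        def find(self, x: int) -> int:
            # Walk up to the representative, then point every node on the
            # path directly at it ("path compression"), so later lookups
            # skip the intermediate hops -- much like attaching later
            # corrections directly to the original paper.
            root = x
            while self.parent[root] != root:
                root = self.parent[root]
            while self.parent[x] != root:
                self.parent[x], x = root, self.parent[x]
            return root

        def union(self, a: int, b: int) -> None:
            # Merge the two sets containing a and b.
            self.parent[self.find(a)] = self.find(b)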

It is interesting as a helpful approach, but it does suffer from the pingback-spam problem, right? And I have a sneaking suspicion that it is not an accidental oversight in science that leads to these problems.


You can always look at all the papers that cite the paper of interest. That gives you the deltas in the other direction, and it's something I do all the time, especially for older papers, when looking for work that might be relevant to me.


Ah, he was suggesting a "reverse bibliography" list on every paper, but we do, in fact, already have that: it's what you're talking about, the "cited by" section that all the science search engines offer.


Not a full reverse list, just for the ones that tried to reproduce the results. The websites can handle the rest.


I see. Everything that is "same experimental design" with the same starting hypothesis. I like it.


> Namely, paper references always reach back in time. Papers don't reference papers that were written after they were written.

Actually, the opposite happens quite regularly: author X gets a pre-print of a paper by author Y. From Y, X knows that the paper will be published in an upcoming volume of a journal, so X already writes this future reference into his own paper.


A dream of mine is a database, or at least a standard, for attaching the main results of a paper in a machine-readable way, so they can be easily imported for meta-analysis or comparison. Summary statistics for the population and variables (by group), plus some effect measure (regression coefficients, etc.) with its variation (SD/CI/p-value), would cover 90%+ of cases. Then we could have "living" meta-analyses that use filters to include new studies as they appear. Please take up the challenge :)
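
Purely as a hypothetical sketch of what one record in such a standard might look like (every field name and value below is invented for illustration, not an existing standard):

    from dataclasses import dataclass

    @dataclass
    class StudyResult:
        study_id: str        # e.g. a DOI
        population_n: int    # sample size for this estimate
        group: str           # subgroup the estimate applies to
        outcome: str         # outcome variable
        effect_measure: str  # "mean_difference", "odds_ratio", ...
        effect: float        # point estimate
        ci_low: float        # 95% CI lower bound
        ci_high: float       # 95% CI upper bound
        p_value: float

    # A "living" meta-analysis is then just a filter over a growing list:
    def eligible(r: StudyResult, outcome: str, min_n: int = 50) -> bool:
        return r.outcome == outcome and r.population_n >= min_n

    records = [
        StudyResult("10.1000/example.1", 120, "adults", "blood_pressure",
                    "mean_difference", -4.2, -7.1, -1.3, 0.004),
    ]
    included = [r for r in records if eligible(r, "blood_pressure")]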


Researchers as a whole need to do more checking. While I agree that errors like the one identified in the link are rare, they are not so rare that one can skip looking for them, or safely assume that everything was done properly.

I've speculated before that peer review gives researchers false confidence in published results [0]. A lot of academics seem to believe that peer review is much better at finding errors than it actually is. (Here's one example of a conversation I had on HN that unfortunately was not productive: [1].) To be clear, I think getting through peer review is evidence that a paper is good, albeit weak evidence. I would give the fact that a paper is peer reviewed little weight compared with my own evaluation of the paper.

[0] https://news.ycombinator.com/item?id=22290907

[1] https://news.ycombinator.com/item?id=31485701


> To be clear, I think getting through peer review is evidence that a paper is good

I think this depends on how you define good. I'm sure there's some variation across fields, but peer review generally seeks to establish that what is presented in the paper is plausible, logically consistent, well-presented, meaningful, and novel. That list is non-exhaustive, but correctness is very hard to establish in a peer-review process. In my experience, it would be rare for a reviewer to repeat the calculations in a paper unless something seems fairly obviously off.

Speaking as a computer scientist: it would be even rarer for a peer reviewer to examine the code written for a paper (if it is even available) to check for bugs. The point being, there are a lot of reasons a paper that appears good may be completely incorrect, although typically for reasons that I, as a casual reader, would be even less likely to catch than a reviewer who is particularly knowledgeable about that field.


Peer review can help improve a paper (and it has improved some of mine); however, contrary to some popular notions, it doesn't lend "truth" to a paper.

Peer reviewers are not monitoring how experiments were conducted; they only have access to a data set that is, by necessity, already highly selected from all the work that went into producing the final manuscript. The authors thus bear ultimate responsibility.

When considering published work close to mine, I use my own judgement of the work, regardless of peer review or which journal it is published in (for example it may be in a PhD thesis). For work where I am not so familiar with the methodologies, I prefer to wait for independent verification/replication (direct or indirect) from a different research group, which ideally used different methods.


Well, to be fair, there is the journal Organic Syntheses:

https://en.m.wikipedia.org/wiki/Organic_Syntheses


Somewhere near 100% of my shipped bugs have been peer reviewed, so that makes a lot of sense to me.


I just completed a paper review as a reviewer. After I think 4 rounds, the author finally ran the calculation I had asked for in the initial review and admitted I was right. We got there in the end, but I had to sit on my hands.


I've got a close relative who reviews papers all the time in their field (not CS). Based on that, my take is that a paper passing peer review is a good indicator that there's nothing egregiously wrong with what's written.


> Judging by publication date the source seems to be this paper (also it did not cite any other papers with the incorrect value, as far as I know). And everybody else just copied the constant from somewhere else, propagating it from paper to paper.

And the Scheuermann and Mauve paper mentions that they picked the value (0.775351) from the Philippe Flajolet paper, which only gives it without the extra 5. It's not that it was recalculated, reviewed, or anything like that. It was simply picked up and typed wrong.


Have you thought about what would be needed, and what it would imply, to have a kind of CI/CD pipeline for unit-testing the assertions in papers?
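
As a rough sketch of what a single "assertion" might look like in the simplest case, a numerical constant that can be recomputed from its definition (the constant and tolerance below are illustrative, not taken from the linked article):

    import math

    # Value as reported in the paper under test (here: the Euler-Mascheroni
    # constant, chosen only as an example of a recomputable constant).
    REPORTED_CONSTANT = 0.5772157

    def recompute_constant(n_terms: int = 1_000_000) -> float:
        # Independent recomputation from the definition:
        # gamma = lim (H_n - ln n), approximated with a large n.
        harmonic = sum(1.0 / k for k in range(1, n_terms + 1))
        return harmonic - math.log(n_terms)

    def test_reported_constant_matches_recomputation():
        assert abs(recompute_constant() - REPORTED_CONSTANT) < 1e-5

    if __name__ == "__main__":
        test_reported_constant_matches_recomputation()
        print("paper assertion holds")

Of course, this only covers claims a machine can recheck cheaply.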


How do you CI/CD assertions in papers using animal models in experiments that take months to years?


The slow async process of scientific progress.

The assertions I've imagined would live in the innovative papers, asserting on the values established by earlier papers.

One outcome is that this would strongly encourage fitting in and discourage innovation and disruption.

It could discourage publishers even more strongly from correcting a past error?

So I'm not sure if I like the idea at all to be honest.

Let's keep it uninvented as it is.


How would that work? You can't automate testing of papers. However flawed the process is, this is what peer review is intended to do.


A billion dollars' worth of AI and formal-methods research.


A different kind of replication crisis.


Well, this is a bit like what happens in software when some library everyone uses turns out to have an awful bug that propagates to everything else. I think it's par for the course for abstraction generally.


The same thing happens in the news: we assume due diligence has been satisfactorily (and honestly) conducted by publishers we hold in high esteem, and we happily propagate without scrutiny, so long as it fits our preferred narrative.


xkcd citogenesis comic



